wechat_big_data_challenge's Introduction

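2021 College Computer Contest - WeChat Big Data Challenge Baseline

The competition uses desensitized and sampled data. For a given set of users who have visited the "Hot Recommendations" feed of WeChat Channels, participants use those users' behavior data in Channels over the past n days to predict, on the test set, the probability that each user will interact with different videos (like, click avatar, favorite, forward, and so on).

Submissions are scored by the weighted uAUC over the predicted behaviors. Official competition website: https://algo.weixin.qq.com/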
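
To make the metric more concrete, here is a minimal sketch of a uAUC computation for a single behavior. It assumes that uAUC means the plain average of per-user AUCs, and that users whose labels are all positive or all negative are skipped because AUC is undefined for them. That definition and the function names `auc` and `uauc` are assumptions for illustration; they are not taken from evaluation.py.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties count as half a win. Quadratic in the group size, which is fine
    for a sketch but not for large inputs.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def uauc(user_ids, labels, scores):
    """Average of per-user AUCs, skipping users with only one label class."""
    groups = defaultdict(lambda: ([], []))
    for u, l, s in zip(user_ids, labels, scores):
        groups[u][0].append(l)
        groups[u][1].append(s)
    per_user = []
    for user_labels, user_scores in groups.values():
        if len(set(user_labels)) < 2:
            continue  # assumed rule: AUC is undefined for this user
        per_user.append(auc(user_labels, user_scores))
    return sum(per_user) / len(per_user) if per_user else 0.0


# Toy data: user 3 has no positive sample and is skipped.
users = [1, 1, 1, 2, 2, 3, 3]
labels = [1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.1, 0.95, 0.2, 0.8, 0.5, 0.4]
result = uauc(users, labels, scores)
```

In this toy example user 1 contributes an AUC of 0.5 and user 2 contributes 0.0, so `result` is their mean.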

1. Environment

  • pandas>=1.0.5
  • tensorflow>=1.14.0
  • python3

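2. Runtime configuration

  • Runs on either CPU or GPU

  • Minimum memory

    • Feature/sample generation: 3 GB
    • Model training and evaluation: 6 GB
  • Running time

    • Test environment: 8 GB memory, 2.3 GHz dual-core Intel Core i5 CPU
    • Feature/sample generation: 226 s
    • Model training and evaluation: 740 s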

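3. Directory structure

  • comm.py: dataset generation
  • baseline.py: model training, evaluation, and submission
  • evaluation.py: uAUC evaluation
  • data/: data, features, and models
    • wechat_algo_data1/: preliminary-round dataset
    • feature/: features
    • offline_train/: offline training dataset
    • online_train/: online training dataset
    • evaluate/: evaluation dataset
    • submit/: online prediction results for submission
    • model/: model files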
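
The layout under data/ can be sketched as follows. This is only an illustration of the directory tree listed above, not the code in comm.py; the helper name `make_data_dirs` is hypothetical, and wechat_algo_data1/ is left out because it comes from unpacking the downloaded dataset.

```python
from pathlib import Path

# Sub-directories listed in the directory structure above.
SUBDIRS = ["feature", "offline_train", "online_train", "evaluate", "submit", "model"]


def make_data_dirs(root="data"):
    """Create the working directories under `root` and return their paths.

    `exist_ok=True` makes the call safe to repeat.
    """
    root = Path(root)
    created = []
    for name in SUBDIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Because `pathlib` joins path components with the separator of the current platform, the same call works on Linux, macOS, and Windows.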

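4. Workflow

  • Create a data directory, download the competition dataset into it, and unzip it to obtain the wechat_algo_data1 directory
  • Generate features/samples: python comm.py (automatically creates the directories under data that store features, samples, and models)
  • Train the offline model: python baseline.py offline_train
  • Evaluate the offline model: python baseline.py evaluate (generates data/evaluate/submit_${timestamp}.csv)
  • Train the online model: python baseline.py online_train
  • Generate the submission file: python baseline.py submit (generates data/submit/submit_${timestamp}.csv)
  • Evaluation code: evaluation.py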
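
The steps above can also be chained from a small driver script. The sketch below is an assumption about how one might do that; `build_commands` and `run_pipeline` are hypothetical helpers, and only the script names and stage arguments come from the list above. It should be started from the repository root, after the dataset has been unpacked into data/.

```python
import subprocess
import sys

# Script and stage arguments, in the order given in the workflow.
STAGES = [
    ["comm.py"],
    ["baseline.py", "offline_train"],
    ["baseline.py", "evaluate"],
    ["baseline.py", "online_train"],
    ["baseline.py", "submit"],
]


def build_commands(python=sys.executable):
    """Return one full command line (as a list) per stage."""
    return [[python] + stage for stage in STAGES]


def run_pipeline(runner=subprocess.run):
    """Run every stage in order; check=True aborts on the first failure."""
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        runner(cmd, check=True)
```

Calling `run_pipeline()` executes all five stages. Passing a different `runner` makes it possible to log or dry-run the commands without starting any training.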

5. Model and features

  • Model: Wide & Deep
  • Parameters:
    • batch_size: 128
    • embed_dim: 10
    • num_epochs: 1
    • learning_rate: 0.1
  • Features:
    • dnn features: userid, feedid, authorid, bgm_singer_id, bgm_song_id
    • linear features: videoplayseconds, device, historical action counts per user/feed
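
To illustrate how the two feature groups combine in a Wide & Deep model, here is a dependency-free forward-pass sketch. It is not the TensorFlow code in baseline.py. The hidden-layer width, the bucket count, the random weights, and the treatment of every linear feature as a plain number are simplifying assumptions. The only parts taken from the list above are the feature names and the embedding size of 10.

```python
import math
import random

EMBED_DIM = 10        # embedding size from the parameter list above
ID_FEATURES = ["userid", "feedid", "authorid", "bgm_singer_id", "bgm_song_id"]
HIDDEN_UNITS = 8      # hypothetical hidden-layer width
HASH_BUCKETS = 100    # hypothetical number of buckets per ID feature

rng = random.Random(0)


def rand_vec(n):
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]


# Deep part: one embedding table per ID feature, then one ReLU layer.
embeddings = {f: [rand_vec(EMBED_DIM) for _ in range(HASH_BUCKETS)]
              for f in ID_FEATURES}
w_hidden = [rand_vec(EMBED_DIM * len(ID_FEATURES)) for _ in range(HIDDEN_UNITS)]
w_out = rand_vec(HIDDEN_UNITS)


def deep_logit(ids):
    """Look up and concatenate the embeddings, then apply the small MLP."""
    x = []
    for f in ID_FEATURES:
        x.extend(embeddings[f][ids[f] % HASH_BUCKETS])
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


def wide_logit(dense, weights):
    """Wide part: a linear model over the dense features."""
    return sum(weights[name] * value for name, value in dense.items())


def predict(ids, dense, wide_weights):
    """Sum the wide and deep logits and squash them with a sigmoid."""
    logit = wide_logit(dense, wide_weights) + deep_logit(ids)
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up sample values, for illustration only.
ids = {"userid": 8, "feedid": 71474, "authorid": 1528,
       "bgm_singer_id": 3, "bgm_song_id": 13}
dense = {"videoplayseconds": 0.3, "device": 1.0}
wide_weights = {"videoplayseconds": 0.05, "device": -0.02}
p = predict(ids, dense, wide_weights)
```

In a multi-behavior setup such as this competition, one such model would produce one probability per behavior. Training is omitted here.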

6. Results

stage    weight_uauc  read_comment  like      click_avatar  forward
offline  0.657003     0.626822      0.633864  0.735366      0.690416
online   0.607908     0.577496      0.588645  0.682383      0.638398
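
The weight_uauc column can be checked against the per-behavior columns. The sketch below assumes weights of 4, 3, 2, and 1 for read_comment, like, click_avatar, and forward. These weights are not stated in this README; they are an assumption that happens to reproduce both rows of the table.

```python
# Assumed behavior weights (not given in this README).
WEIGHTS = {"read_comment": 4, "like": 3, "click_avatar": 2, "forward": 1}


def weighted_uauc(per_action):
    """Weighted average of per-behavior uAUC values."""
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[a] * per_action[a] for a in WEIGHTS) / total


offline = {"read_comment": 0.626822, "like": 0.633864,
           "click_avatar": 0.735366, "forward": 0.690416}
online = {"read_comment": 0.577496, "like": 0.588645,
          "click_avatar": 0.682383, "forward": 0.638398}

print(round(weighted_uauc(offline), 6))  # 0.657003
print(round(weighted_uauc(online), 6))   # 0.607908
```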

7. References

  • Cheng, Heng-Tze, et al. "Wide & deep learning for recommender systems." Proceedings of the 1st workshop on deep learning for recommender systems. 2016.

wechat_big_data_challenge's Issues

Error when running python baseline.py online_train #6

I0623 21:19:41.091585 6000 basic_session_run_hooks.py:618] Saving checkpoints for 0 into C:/Users/12162/Desktop/model\online_train\read_comment\model.ckpt.
2021-06-23 21:19:41.397079: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at save_restore_v2_ops.cc:109 : Not found: Failed to create a NewWriteableFile: C:/Users/12162/Desktop/model\online_train\read_comment\model.ckpt-0_temp\part-00000-of-00001.data-00000-of-00001.tempstate1255047267071366445 : 系统找不到指定的路径。
; No such process
Traceback (most recent call last):
File "D:/wechat/data/wechart_tensorflow.py", line 276, in <module>
tf.app.run(main)
File "D:\vippython\venv\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "D:\vippython\venv\lib\site-packages\absl\app.py", line 303, in run
_run_main(main, args)
File "D:\vippython\venv\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "D:/wechat/data/wechart_tensorflow.py", line 228, in main
model.train()
File "D:/wechat/data/wechart_tensorflow.py", line 111, in train
self.estimator.train(
File "D:\vippython\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 349, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "D:\vippython\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1175, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "D:\vippython\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1206, in _train_model_default
return self._train_with_estimator_spec(estimator_spec, worker_hooks,
File "D:\vippython\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1497, in _train_with_estimator_spec
with training.MonitoredTrainingSession(
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 602, in MonitoredTrainingSession
return MonitoredSession(
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1035, in __init__
super(MonitoredSession, self).__init__(
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 750, in __init__
self._sess = _RecoverableSession(self._coordinated_creator)
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1232, in __init__
_WrappedSession.__init__(self, self._create_session())
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1237, in _create_session
return self._sess_creator.create_session()
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 910, in create_session
hook.after_create_session(self.tf_sess, self.coord)
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\basic_session_run_hooks.py", line 587, in after_create_session
self._save(session, global_step)
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\basic_session_run_hooks.py", line 619, in _save
self._get_saver().save(session, self._save_path, global_step=step,
File "D:\vippython\venv\lib\site-packages\tensorflow\python\training\saver.py", line 1188, in save
model_checkpoint_path = sess.run(
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 1190, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 1368, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 1375, in _do_call
return fn(*args)
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 1359, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "D:\vippython\venv\lib\site-packages\tensorflow\python\client\session.py", line 1451, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 185: invalid continuation byte

Program gets stuck after "Done running local_init_op."

I am running

python baseline.py offline_train

It gets stuck, and the printed output is:

I0529 02:03:15.461638 4552134080 estimator.py:1147] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0529 02:03:15.462327 4552134080 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0529 02:03:17.626281 4552134080 monitored_session.py:240] Graph was finalized.
INFO:tensorflow:Running local_init_op.
I0529 02:03:18.371701 4552134080 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0529 02:03:18.524997 4552134080 session_manager.py:502] Done running local_init_op.

After a long wait, training does proceed; the total time is 10664.87 s.

My system information:

  • system: macOS Catalina 10.15.7
  • memory: 16GB
  • processor: 3.3GHz Dual-Core Intel Core i7

Package information:

  • tensorflow 1.14.0
  • pandas 1.1.3

What is the problem?

Error when running python baseline.py online_train

INFO:tensorflow:Restoring parameters from ./data/model/offline_train/read_comment/model.ckpt-4843
I0605 08:41:00.326473 139973162354496 saver.py:1292] Restoring parameters from ./data/model/offline_train/read_comment/model.ckpt-4843
INFO:tensorflow:Running local_init_op.
I0605 08:41:00.489946 139973162354496 session_manager.py:505] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0605 08:41:00.499381 139973162354496 session_manager.py:508] Done running local_init_op.
Fatal Python error: Segmentation fault

Thread 0x00007f4de2ef5700 (most recent call first):
File "/opt/anaconda3/lib/python3.7/threading.py", line 296 in wait
File "/home/liujialin/.local/lib/python3.7/site-packages/tensorflow/python/summary/writer/event_file_writer.py", line 266 in get
File "/home/liujialin/.local/lib/python3.7/site-packages/tensorflow/python/summary/writer/event_file_writer.py", line 209 in run
File "/opt/anaconda3/lib/python3.7/threading.py", line 926 in _bootstrap_inner
File "/opt/anaconda3/lib/python3.7/threading.py", line 890 in _bootstrap

Current thread 0x00007f4e0a9ec740 (most recent call first):
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 829 in compile_internal
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/arrayobj.py", line 2546 in init_specific
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/arrayobj.py", line 3125 in make_array_nditer
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 1079 in call
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 747 in lower_call
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 781 in lower_expr
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 449 in lower_assign
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 303 in lower_inst
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 254 in lower_block
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 239 in lower_function_body
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 214 in lower_normal_function
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 173 in lower
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 992 in native_lowering_stage
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 615 in backend_nopython_mode
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 628 in _backend
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 678 in stage_nopython_backend
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 245 in run
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 791 in _compile_core
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 804 in _compile_bytecode
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 367 in compile_extra
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 897 in compile_internal
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 799 in compile_subroutine_no_cache
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 819 in compile_subroutine
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 829 in compile_internal
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/arraymath.py", line 166 in array_sum
File "/opt/anaconda3/lib/python3.7/site-packages/numba/targets/base.py", line 1079 in call
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 747 in lower_call
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 781 in lower_expr
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 449 in lower_assign
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 303 in lower_inst
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 254 in lower_block
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 239 in lower_function_body
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 214 in lower_normal_function
File "/opt/anaconda3/lib/python3.7/site-packages/numba/lowering.py", line 173 in lower
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 992 in native_lowering_stage
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 615 in backend_nopython_mode
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 628 in _backend
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 678 in stage_nopython_backend
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 245 in run
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 791 in _compile_core
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 804 in _compile_bytecode
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 367 in compile_extra
File "/opt/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 873 in compile_extra
File "/opt/anaconda3/lib/python3.7/site-packages/numba/dispatcher.py", line 83 in compile
File "/opt/anaconda3/lib/python3.7/site-packages/numba/dispatcher.py", line 653 in compile
File "/opt/anaconda3/lib/python3.7/site-packages/numba/dispatcher.py", line 325 in _compile_for_args
File "/home/liujialin/\u5fae\u4fe1\u5927\u6570\u636e\u6bd4\u8d5b/evaluation.py", line 25 in fast_auc
File "/home/liujialin/\u5fae\u4fe1\u5927\u6570\u636e\u6bd4\u8d5b/evaluation.py", line 54 in uAUC
File "baseline.py", line 134 in evaluate
File "baseline.py", line 225 in main
File "/home/liujialin/.local/lib/python3.7/site-packages/absl/app.py", line 251 in _run_main
File "/home/liujialin/.local/lib/python3.7/site-packages/absl/app.py", line 303 in run
File "/home/liujialin/.local/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40 in run
File "baseline.py", line 270 in <module>
Segmentation fault (core dumped)

Error when running python baseline.py offline_train

Resolved.
2022-04-08 22:26:59.617224: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at save_restore_v2_ops.cc:110 : Not found: Failed to create a NewWriteableFile: ./data/model\offline_train\read_comment\model.ckpt-0_temp\part-00000-of-00001.data-00000-of-00001.tempstate8017970075933777709 : 系统找不到 指定的路径。
; No such process
Traceback (most recent call last):
File "C:\Users\nolly\Desktop\WeChat_Big_Data_Challenge-master\baseline.py", line 272, in <module>
tf.app.run(main)
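......
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 170: invalid continuation byte
2022.4.10: Resolved. The "/" in the file-path strings in the code needs to be changed to "\", that is, to the Windows file path format. After that the script runs normally.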
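
As an alternative to editing separators by hand, paths can be built with `os.path` so that the separator matches the platform. The sketch below is a hedged illustration: `checkpoint_dir` is a hypothetical helper, not a function in baseline.py, and this thread does not confirm that it fixes the error above on its own. It shows how to avoid mixed "/" and "\" separators and how to make sure the target directory exists before a checkpoint is written.

```python
import os
import tempfile


def checkpoint_dir(model_root, stage, action):
    """Build a platform-correct checkpoint directory and create it.

    normpath collapses mixed or doubled separators; on Windows it also
    converts "/" to "\\".
    """
    path = os.path.normpath(os.path.join(model_root, stage, action))
    os.makedirs(path, exist_ok=True)
    return path


# Usage with a temporary root so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = checkpoint_dir(os.path.join(tmp, "data", "model"),
                          "offline_train", "read_comment")
    created = os.path.isdir(ckpt)
```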
