
tianchi-bigdata's People

Contributors

pnyuan

tianchi-bigdata's Issues

Hello, a problem running the code

Hello, I ran into the following problem when running Tianchi-BigData/Mobile_Recommendation/model_based/gbdt_on_subsample.py:
Traceback (most recent call last):
  File "gbdt_on_subsample.py", line 726, in <module>
    GBDT_clf.fit(train_X, train_y)
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 991, in fit
    self._check_params()
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 829, in _check_params
    self.loss_ = loss_class(self.n_classes_)
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 472, in __init__
    self.__class__.__name__))
ValueError: BinomialDeviance requires 2 classes.
Could you tell me why this happens?
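
The error message says the binomial loss was given a label set that does not contain exactly two classes. One possible cause, which is an assumption since the issue does not show the data, is that the subsample passed as train_y holds only one class (for example, no positive examples). A minimal sketch of a check that could run before calling fit; the helper name check_binary_labels and the sample labels are hypothetical:

```python
from collections import Counter


def check_binary_labels(labels):
    """Return per-class counts, raising early if the labels are not binary."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError(
            "expected exactly 2 classes in the training labels, found %d: %s"
            % (len(counts), sorted(counts))
        )
    return counts


# Hypothetical labels standing in for train_y.
sample_labels = [0, 0, 1, 0, 1]
print(check_binary_labels(sample_labels))
```

If the check fails, inspecting how the subsample was drawn is a reasonable next step, since the traceback alone does not show which class is missing.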

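Hello, a small question about the feature processing part

Hello, I ran into a few small questions while reading your code:

In step 1.1 of feature_construct_part_1.py, you compute the user-behavior features over 1-, 3-, and 6-day windows. On line 93 of the code you wrote: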

df_part_1_u_b_count_in_6 = df_part_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]

On line 117 you wrote:

df_part_1_u_b_count_in_3 = df_part_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]

On line 141 you wrote:

df_part_1_u_b_count_in_1 = df_part_1_in_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]

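My first question: for df_part_1_u_b_count_in_6 you did not restrict the time range, since the earliest date in the data is itself already later than the 22nd. Is this a potential minor bug, or is it intentional?

My second question: in the three snippets above, the first two deduplicate the full data df_part_1, while the third deduplicates the windowed subset df_part_1_in_1. My understanding is that the windowed subset should be used for deduplication in each case, otherwise the selected results would be inaccurate. Is my understanding correct? Thank you.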
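
The concern raised above can be illustrated with a small dependency-free sketch: counts for a given window are only correct if the records are restricted to that window before counting. This uses plain Python rather than the pandas calls in the quoted code; the function name count_user_behavior, the record layout, and the dates are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_user_behavior(records, end_time, days):
    """Count (user_id, behavior_type) pairs within `days` days before end_time."""
    start_time = end_time - timedelta(days=days)
    counts = Counter()
    for user_id, behavior_type, time in records:
        # Only records inside the window [start_time, end_time) contribute.
        if start_time <= time < end_time:
            counts[(user_id, behavior_type)] += 1
    return counts


# Hypothetical records: (user_id, behavior_type, time).
records = [
    (1, 1, datetime(2014, 11, 23, 10)),
    (1, 1, datetime(2014, 11, 26, 9)),
    (1, 1, datetime(2014, 11, 27, 20)),
    (2, 3, datetime(2014, 11, 27, 8)),
]
end_time = datetime(2014, 11, 28)

counts_in_6 = count_user_behavior(records, end_time, 6)
counts_in_3 = count_user_behavior(records, end_time, 3)
counts_in_1 = count_user_behavior(records, end_time, 1)
```

In this sketch the 6-day window happens to cover every record, which mirrors the situation described in the first question: filtering by time would change nothing there, whereas for the 3-day window it does.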

Cannot find the referenced file

Hello, I have a question about line 81 of Mobile_Recommendation\data_preanalysis\data_analysis.py:
row_dict2csv(count_day, "../data/count_day.csv" )

Where can I get the count_day.csv file used here?
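
The issue does not show how row_dict2csv is defined. The call passes a dict-like count_day together with a path, which would be consistent with a helper that writes the file rather than reads it; that is only an assumption. Under that assumption, a hypothetical stand-in (not the repository's implementation) might look like this:

```python
import csv
import os
import tempfile


def write_row_dict_to_csv(row_dict, csv_path):
    """Write a dict as a two-row CSV: sorted keys as header, values as one data row."""
    keys = sorted(row_dict)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(keys)
        writer.writerow([row_dict[k] for k in keys])


# Hypothetical per-day counts standing in for count_day.
count_day = {"2014-11-18": 3, "2014-11-19": 5}
csv_path = os.path.join(tempfile.mkdtemp(), "count_day.csv")
write_row_dict_to_csv(count_day, csv_path)
```

If the real helper behaves this way, count_day.csv would be produced by running the script, provided the target directory exists, rather than obtained from elsewhere.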
