pnyuan / tianchi-bigdata
A code repository for my Tianchi big data competition.
Hello, when I run Tianchi-BigData/Mobile_Recommendation/model_based/gbdt_on_subsample.py I get the following error:
Traceback (most recent call last):
  File "gbdt_on_subsample.py", line 726, in <module>
    GBDT_clf.fit(train_X, train_y)
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 991, in fit
    self._check_params()
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 829, in _check_params
    self.loss_ = loss_class(self.n_classes_)
  File "/Library/Python/2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 472, in __init__
    self.__class__.__name__))
ValueError: BinomialDeviance requires 2 classes.
Could you tell me why this happens?
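As far as I can tell, scikit-learn raises this error when the labels passed to fit do not contain exactly two distinct classes. One possible cause here, which is an assumption and not something the traceback confirms, is that the subsampling step left train_y with only one class (for example, all negatives). A minimal sketch of a check to run before fitting; check_binary_labels is a hypothetical helper, not part of the repository or of scikit-learn:

```python
from collections import Counter


def check_binary_labels(labels):
    """Return per-class counts; raise if labels do not contain exactly two classes."""
    counts = Counter(labels)
    if len(counts) != 2:
        raise ValueError(
            "expected 2 classes, found %d: %r" % (len(counts), dict(counts))
        )
    return counts


# Hypothetical labels: a subsample that happened to keep only negatives.
train_y = [0, 0, 0, 0]
try:
    check_binary_labels(train_y)
except ValueError as err:
    print(err)
```

If the check fails on the real train_y, the subsample itself would be the thing to inspect, not the classifier settings.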
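Hello, I ran into a few small questions while reading your code.
In step 1.1 of feature_construct_part_1.py, you compute the user-behavior features separately over 1-day, 3-day, and 6-day windows. On line 93 of the code you wrote: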
df_part_1_u_b_count_in_6 = df_part_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]
On line 117 you wrote:
df_part_1_u_b_count_in_3 = df_part_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]
On line 141 you wrote:
df_part_1_u_b_count_in_1 = df_part_1_in_1.drop_duplicates(['user_id','behavior_type'], 'last')[['user_id','behavior_type','cumcount']]
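My first question: for df_part_1_u_b_count_in_6 you did not restrict the time range, because the earliest date in the data itself is already later than the 22nd. Is this a potential minor bug, or is it intentional?
My second question: of the three snippets above, the first two deduplicate the full data df_part_1, while the third deduplicates the windowed subset df_part_1_in_1. My understanding is that the deduplication should be done on the windowed subset, since otherwise the filtered result would be inaccurate. Is my understanding correct? Thanks.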
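To make the second question concrete, here is a minimal pandas sketch of counting per window, assuming the count comes from a running cumcount followed by keeping the last row per (user_id, behavior_type). The function name, the time column, and the sample dates are illustrative assumptions and are not taken from the repository. Note that pandas 2.0 and later no longer accept keep as a positional argument, so it is passed by keyword here.

```python
import pandas as pd


def count_behaviors_in_window(df, end, days):
    """Count actions per (user_id, behavior_type) in the `days` days before `end`."""
    start = end - pd.Timedelta(days=days)
    # Restrict to the window first, so the running count only sees rows inside it.
    window = df[(df["time"] >= start) & (df["time"] < end)].copy()
    window["cumcount"] = window.groupby(["user_id", "behavior_type"]).cumcount() + 1
    # The last row of each group carries that group's total count.
    return window.drop_duplicates(["user_id", "behavior_type"], keep="last")[
        ["user_id", "behavior_type", "cumcount"]
    ]


# Made-up sample events for illustration only.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2],
    "behavior_type": [1, 1, 1, 1],
    "time": pd.to_datetime(
        ["2014-11-22", "2014-11-25", "2014-11-27", "2014-11-27"]
    ),
})
end = pd.Timestamp("2014-11-28")
counts_6 = count_behaviors_in_window(events, end, 6)
counts_1 = count_behaviors_in_window(events, end, 1)
```

With this sample, user 1 has three actions in the 6-day window but only one in the 1-day window, which is the difference that would be lost if the 1-day count were deduplicated from the full data.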
Hello, I have a question about line 81 of Mobile_Recommendation\data_preanalysis\data_analysis.py:
row_dict2csv(count_day, "../data/count_day.csv" )
Where can I get the count_day.csv file used here?