wepe / o2o-coupon-usage-forecast
1st Place Solution for O2O Coupon Usage Forecast
According to your video (https://tianchi.aliyun.com/video.htm?spm=5176.100066.0.0.8a30bc8AtOyVq), coupon features should be extracted from the feature window (feature3, feature2, feature1 in the code), but the code extracts them from the label window (dataset3, dataset2, dataset1). Why are the coupon features extracted from the label window? Thanks!
t7 = dataset3[['user_id','coupon_id','date_received']]
t7 = pd.merge(t7,t6,on=['user_id','coupon_id'],how='left')
t7['date_received_date'] = t7.date_received.astype('str') + '-' + t7.dates
t7['day_gap_before'] = t7.date_received_date.apply(get_day_gap_before)
t7['day_gap_after'] = t7.date_received_date.apply(get_day_gap_after)
t7 = t7[['user_id','coupon_id','date_received','day_gap_before','day_gap_after']]
When execution reaches this point (around line 155),
the following message appears:
File "D:/Code/ali/O2O_data/readcode.py", line 89, in
t7 = pd.merge(t7,t6,on=['user_id','coupon_id'],how='left')
File "D:\anaconda\lib\site-packages\pandas\core\reshape\merge.py", line 60, in merge
validate=validate)
File "D:\anaconda\lib\site-packages\pandas\core\reshape\merge.py", line 554, in init
self._maybe_coerce_merge_keys()
File "D:\anaconda\lib\site-packages\pandas\core\reshape\merge.py", line 980, in _maybe_coerce_merge_keys
raise ValueError(msg)
ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat
Please help, thanks.
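The `ValueError` above says the merge keys have different dtypes on the two sides (object on one, int64 on the other). A minimal sketch of one way around it, assuming the mismatch is in `user_id` or `coupon_id`: cast both key columns to the same type before merging. The toy frames below are hypothetical and only reproduce the mismatch.

```python
import pandas as pd

# Hypothetical toy data: coupon_id is a string in t7 but an integer in t6.
t7 = pd.DataFrame({
    "user_id": [1, 1, 2],
    "coupon_id": ["11002", "11002", "8591"],
    "date_received": ["20160701", "20160705", "20160702"],
})
t6 = pd.DataFrame({
    "user_id": [1, 2],
    "coupon_id": [11002, 8591],
    "dates": ["20160701:20160705", "20160702"],
})

# Cast every merge key to str on both sides so the dtypes agree.
merge_keys = ["user_id", "coupon_id"]
for key in merge_keys:
    t7[key] = t7[key].astype(str)
    t6[key] = t6[key].astype(str)

merged = pd.merge(t7, t6, on=merge_keys, how="left")
print(merged)
```

Casting to `str` rather than `int` avoids failures when a key column contains placeholder text such as `null`.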
t7['day_gap_before'] = t7.date_received_date.apply(get_day_gap_before)
Traceback (most recent call last):
File "C:\Users\TangX\AppData\Local\Temp\ipykernel_10460\2415590082.py", line 1, in <cell line: 1>
t7['day_gap_before'] = t7.date_received_date.apply(get_day_gap_before)
File "F:\Anaconda_app\lib\site-packages\pandas\core\series.py", line 4433, in apply
return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File "F:\Anaconda_app\lib\site-packages\pandas\core\apply.py", line 1082, in apply
return self.apply_standard()
File "F:\Anaconda_app\lib\site-packages\pandas\core\apply.py", line 1137, in apply_standard
mapped = lib.map_infer(
File "pandas_libs\lib.pyx", line 2870, in pandas._libs.lib.map_infer
File "C:\Users\TangX\AppData\Local\Temp\ipykernel_10460\2229747806.py", line 6, in get_day_gap_before
this_gap = (date(int(date_received[0:4]),int(date_received[4:6]),int(date_received[6:8]))-date(int(d[0:4]),int(d[4:6]),int(d[6:8]))).days
ValueError: invalid literal for int() with base 10: 'Date'
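The literal `'Date'` in the error suggests that a non-date string reached `get_day_gap_before`. One possible cause, which is an assumption because the issue does not show how the CSV was loaded, is that the file's header row (for example `Date_received`) was read as a data row. A sketch that keeps only rows whose `date_received` is an 8-digit string; the frame is hypothetical:

```python
import pandas as pd

# Hypothetical frame in which the CSV header row was read as data.
df = pd.DataFrame({
    "user_id": ["User_id", "1", "2"],
    "date_received": ["Date_received", "20160701", "20160702"],
})

# Keep only rows whose date_received looks like yyyymmdd.
is_date = df["date_received"].astype(str).str.match(r"^\d{8}$")
clean = df[is_date].reset_index(drop=True)
print(clean)
```

If the header row is indeed the cause, reading the file with the header recognized as a header fixes it at the source.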
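I see that the xgb in your season-one code uses rank:pairwise as the objective. When I use a classification objective instead, it overfits severely. Is there a reason for this?
What is causing this problem?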
Traceback (most recent call last):
File "xgb.py", line 58, in
dataset3_preds.label = MinMaxScaler().fit_transform(dataset3_preds.label)#区间缩放到[0,1]
File "/home/cxy/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 518, in fit_transform
return self.fit(X, **fit_params).transform(X)
File "/home/cxy/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 308, in fit
return self.partial_fit(X, y)
File "/home/cxy/anaconda3/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 334, in partial_fit
estimator=self, dtype=FLOAT_DTYPES)
File "/home/cxy/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py", line 410, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[-0.21156955 0.33369869 -0.30134243 ..., 0.08833629 -0.15614033
0.31751922].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
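As the error message says, `MinMaxScaler` expects a 2D array, while a single column such as `dataset3_preds.label` is 1D. A minimal sketch of the reshape the message recommends, with made-up scores standing in for the predictions:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up prediction scores standing in for dataset3_preds.label.
scores = np.array([-0.2, 0.3, -0.3, 0.1])

# reshape(-1, 1) turns the 1D vector into one feature column;
# ravel() flattens the scaled result back to 1D.
scaled = MinMaxScaler().fit_transform(scores.reshape(-1, 1)).ravel()
print(scaled)
```

With a pandas column, the same idea would be applied to `dataset3_preds.label.values.reshape(-1, 1)`.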
Running the code as-is fails:
Traceback (most recent call last):
File "/home/pzy/program/tianchi/test/extract_feature.py", line 99, in
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
File "/usr/local/lib/python2.7/dist-packages/pandas/core/ops.py", line 715, in wrapper
result = wrap_results(safe_na_op(lvalues, rvalues))
File "/usr/local/lib/python2.7/dist-packages/pandas/core/ops.py", line 676, in safe_na_op
return na_op(lvalues, rvalues)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/ops.py", line 658, in na_op
result[mask] = op(x[mask], _values_from_object(y[mask]))
TypeError: unsupported operand type(s) for -: 'float' and 'str'
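Is this code not the final version?
In wepon's example, the coupon features are built only from the test set. As I understand it, coupon features such as the discount rate are context features and should not be built from the test set alone. What was the reasoning here?
Hello, I'm a newcomer and would like to learn from your code. Running extract_feature.py fails with the following error: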
Traceback (most recent call last):
File "extract_feature.py", line 60, in
feature3 = off_train[((off_train.date>='20160315')&(off_train.date<='20160630'))|((off_train.date=='null')&(off_train.date_received>='20160315')&(off_train.date_received<='20160630'))]
File "C:\Users\u\AppData\Roaming\Python\Python36\site-packages\pandas\core\ops.py", line 879, in wrapper
res = na_op(values, other)
File "C:\Users\u\AppData\Roaming\Python\Python36\site-packages\pandas\core\ops.py", line 818, in na_op
raise TypeError("invalid type comparison")
TypeError: invalid type comparison
How should I handle this?
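The `invalid type comparison` error is what pandas raises when a non-string column is compared with a string such as `'20160315'`. A likely cause, stated here as an assumption, is that `read_csv` parsed `null` as NaN and the date columns became numeric. A sketch that reads every column as text and keeps `null` as a literal string; the inline CSV and the column names are hypothetical:

```python
import io
import pandas as pd

# Hypothetical rows in the offline training layout.
csv_text = (
    "1,100,200,150:20,1,20160528,null\n"
    "2,101,null,null,0,null,20160516\n"
    "3,102,201,0.9,2,20160120,20160201\n"
)
columns = ["user_id", "merchant_id", "coupon_id", "discount_rate",
           "distance", "date_received", "date"]

# dtype=str keeps dates as text; keep_default_na=False keeps 'null' as a string.
off_train = pd.read_csv(io.StringIO(csv_text), header=None, names=columns,
                        dtype=str, keep_default_na=False)

# The filter from the traceback now compares strings with strings.
feature3 = off_train[
    ((off_train.date >= "20160315") & (off_train.date <= "20160630"))
    | ((off_train.date == "null")
       & (off_train.date_received >= "20160315")
       & (off_train.date_received <= "20160630"))
]
print(feature3)
```

String comparison works here only because every date is a fixed-width `yyyymmdd` value.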
Traceback (most recent call last):
File "extract_feature.py", line 99, in
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
File "/usr/lib64/python2.7/site-packages/pandas/core/ops.py", line 715, in wrapper
result = wrap_results(safe_na_op(lvalues, rvalues))
File "/usr/lib64/python2.7/site-packages/pandas/core/ops.py", line 676, in safe_na_op
return na_op(lvalues, rvalues)
File "/usr/lib64/python2.7/site-packages/pandas/core/ops.py", line 658, in na_op
result[mask] = op(x[mask], _values_from_object(y[mask]))
TypeError: unsupported operand type(s) for -: 'float' and 'str'
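Is it just a matter of testing over and over? Are there any tricks you could share?
While adding features, I found some that can be extracted for the training set but not for the test set. The task asks for predictions on the test set, namely whether a coupon is used within 15 days of being received. During feature extraction on the training set I added redemption-related features, such as the number of times a merchant's coupons were redeemed after being received. The test set cannot produce the same feature, because it contains only coupon-receipt records. I suspect the training and test features fail to match because they cannot be extracted consistently. How should I solve this problem?
The dataset provided on Baidu Netdisk cannot be extracted; decompression fails. Is there a fix, or could you upload it again? Thanks.
I'm a beginner. Running the code gives: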
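One common way to keep training and test features consistent is a sliding window: redemption statistics are computed only from a history period (the feature window) that precedes the period being predicted (the label window), so the test period can get the same features from its own preceding history. The sketch below illustrates that idea under assumed date boundaries and a hypothetical feature name; it is not the repository's code.

```python
import pandas as pd

# Hypothetical records; date == 'null' means the coupon was never redeemed.
records = pd.DataFrame({
    "merchant_id": ["m1", "m1", "m2", "m1", "m2"],
    "date_received": ["20160320", "20160410", "20160501", "20160705", "20160710"],
    "date": ["20160325", "20160415", "null", "null", "null"],
})

# Feature window: history with known outcomes. Label window: rows to predict.
feature_window = records[(records.date_received >= "20160315")
                         & (records.date_received <= "20160630")]
label_window = records[records.date_received >= "20160701"]

# Redemption counts come from the feature window only.
redeemed = feature_window[feature_window.date != "null"]
merchant_redeem_count = (redeemed.groupby("merchant_id").size()
                         .rename("merchant_coupon_redeemed").reset_index())

# Attach history-based features to the label-window rows.
label_features = label_window.merge(merchant_redeem_count,
                                    on="merchant_id", how="left")
label_features["merchant_coupon_redeemed"] = (
    label_features["merchant_coupon_redeemed"].fillna(0).astype(int))
print(label_features)
```

Because the label window contributes only receipt information, the same procedure applies whether that window is a training period or the test period.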
Traceback (most recent call last):
File "D:/pythonCode/PycharmProjects/O2O-Coupon-Usage-Forecast/code/wepon/season one/extract_feature.py", line 97, in
t3['this_month_user_receive_same_coupon_lastone'] = t3.max_date_received - t3.date_received
File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 721, in wrapper
result = wrap_results(safe_na_op(lvalues, rvalues))
File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 682, in safe_na_op
return na_op(lvalues, rvalues)
File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py", line 664, in na_op
result[mask] = op(x[mask], _values_from_object(y[mask]))
TypeError: unsupported operand type(s) for -: 'float' and 'str'
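Several other issues report the same problem, but I could not find a solution in any of them. I would appreciate an answer. Many thanks.
Some of the missing values are in distance-related features, where the original field itself is missing. Others are caused by the sliding-window split of the data. Are missing values of the second kind simply left unhandled? For the remaining features, such as purchase counts and the number of merchants purchased from, can they all be filled with 0? Missing values may not matter for tree-based methods, but shouldn't they still be handled for other methods?
I ran into the following problem when running extract_feature.py: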
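The `TypeError: unsupported operand type(s) for -: 'float' and 'str'` above means one side of the subtraction is numeric and the other is text. A minimal sketch, assuming `max_date_received` is numeric and `date_received` is a string: convert both columns to numbers before subtracting. The values are made up.

```python
import pandas as pd

# Made-up rows with the mixed dtypes named in the traceback.
t3 = pd.DataFrame({
    "max_date_received": [20160705.0, 20160710.0],
    "date_received": ["20160701", "20160710"],
})

# Convert both sides to numeric; unparseable values become NaN.
t3["date_received"] = pd.to_numeric(t3["date_received"], errors="coerce")
t3["max_date_received"] = pd.to_numeric(t3["max_date_received"], errors="coerce")

t3["this_month_user_receive_same_coupon_lastone"] = (
    t3.max_date_received - t3.date_received)
print(t3)
```

This only fixes the dtype mismatch. The subtraction is kept as in the quoted line, so the result is a difference of `yyyymmdd` numbers, not a calendar day count.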
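To make the question concrete, here is a sketch of one possible treatment, not the authors' answer: count-style features are filled with 0, and distance gets a missing-indicator column plus median imputation. The column names and the choice of median are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical feature table with both kinds of missing values.
features = pd.DataFrame({
    "user_buy_count": [3, np.nan, 1],
    "user_merchant_count": [np.nan, 2, np.nan],
    "distance": [0.0, np.nan, 5.0],
})

# For counts, a missing value means no history in the window, so 0 is natural.
count_columns = ["user_buy_count", "user_merchant_count"]
features[count_columns] = features[count_columns].fillna(0)

# For distance, 0 is a real value, so record missingness before imputing.
features["distance_missing"] = features["distance"].isna().astype(int)
features["distance"] = features["distance"].fillna(features["distance"].median())
print(features)
```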
Traceback (most recent call last):
File "extract_feature.py", line 60, in
feature3 = off_train[((off_train.date>='20160315')&(off_train.date<='20160630'))|((off_train.date=='null')&(off_train.date_received>='20160315')&(off_train.date_received<='20160630'))]
File "/Library/Python/2.7/site-packages/pandas/core/ops.py", line 879, in wrapper
res = na_op(values, other)
File "/Library/Python/2.7/site-packages/pandas/core/ops.py", line 818, in na_op
raise TypeError("invalid type comparison")
TypeError: invalid type comparison
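How can I fix this?
Your teammates don't seem to work in Python; everything is in SQL files. What do they use to run them? Are both the processing scripts and the generated features SQL files? Just a curious beginner asking.
Hello, the season1 extract_feature code doesn't seem to use the online data at all...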