Comments (3)
As you can see, convergence has slowed down; the model is overfitting under this set of parameters. Try these:
#1: enlarge hidden_factor (e.g. [32,256]), since it improves the capacity of the model;
#2: train AFM from the pretrained parameters of FM.
Here are my results:
#1: python AFM.py --dataset ml-tag --epoch 10 --pretrain 0 --batch_size 4096 --hidden_factor '[32,256]' --keep '[1.0,0.5]' --lamda_attention 100.0 --lr 0.1
Init: train=1.0000, validation=1.0000 [9.7 s]
Epoch 1 [17.9 s] train=0.3919, validation=0.5203 [10.1 s]
Epoch 2 [17.0 s] train=0.2870, validation=0.4870 [9.8 s]
Epoch 3 [16.8 s] train=0.2481, validation=0.4813 [10.0 s]
Epoch 4 [18.4 s] train=0.1992, validation=0.4663 [10.3 s]
Epoch 5 [19.4 s] train=0.1705, validation=0.4588 [10.3 s]
Epoch 6 [18.1 s] train=0.1565, validation=0.4552 [10.1 s]
Epoch 7 [16.9 s] train=0.1403, validation=0.4509 [9.1 s]
Epoch 8 [17.2 s] train=0.1405, validation=0.4514 [9.8 s]
Epoch 9 [18.1 s] train=0.1257, validation=0.4480 [10.0 s]
Epoch 10 [17.3 s] train=0.1238, validation=0.4474 [8.9 s]
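As an aside, the overfitting trend is easiest to see from the per-epoch train/validation gap. Here is a minimal sketch (not part of the repo) that parses epoch lines like those above from a saved log; the log file name is a placeholder, assuming you redirected the script's output to a file:

```python
import re

# Matches lines like: "Epoch 3 [16.8 s] train=0.2481, validation=0.4813 [10.0 s]"
pattern = re.compile(r'Epoch\s+(\d+).*?train=([\d.]+), validation=([\d.]+)')

with open('afm_ml-tag.log') as f:  # placeholder: redirect AFM.py's stdout here
    for line in f:
        m = pattern.search(line)
        if m:
            epoch = int(m.group(1))
            train, valid = float(m.group(2)), float(m.group(3))
            print(f'epoch {epoch:2d}: train={train:.4f}  '
                  f'valid={valid:.4f}  gap={valid - train:.4f}')
```

Run on the #1 log above, the gap widens from about 0.13 at epoch 1 to about 0.32 at epoch 10, even as validation RMSE keeps falling.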
#2: python AFM.py --dataset ml-tag --epoch 10 --pretrain 1 --batch_size 4096 --hidden_factor '[16,16]' --keep '[1.0,0.5]' --lamda_attention 100.0 --lr 0.1
Init: train=0.7103, validation=0.7238 [7.3 s]
Epoch 1 [9.0 s] train=0.4867, validation=0.5594 [8.4 s]
Epoch 2 [9.9 s] train=0.4363, validation=0.5403 [7.8 s]
Epoch 3 [8.1 s] train=0.4031, validation=0.5307 [8.3 s]
Epoch 4 [9.8 s] train=0.3796, validation=0.5238 [7.6 s]
Epoch 5 [9.6 s] train=0.3622, validation=0.5192 [8.7 s]
Epoch 6 [9.0 s] train=0.3476, validation=0.5150 [8.2 s]
Epoch 7 [9.6 s] train=0.3366, validation=0.5126 [7.2 s]
Epoch 8 [8.8 s] train=0.3263, validation=0.5108 [8.1 s]
Epoch 9 [9.8 s] train=0.3186, validation=0.5089 [8.2 s]
Epoch 10 [9.1 s] train=0.3104, validation=0.5072 [8.9 s]
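A note on #2: with --pretrain 1, AFM starts from the parameters of a previously trained FM, which is presumably why hidden_factor is '[16,16]' here; the embedding size has to match the FM checkpoint. Below is a minimal sketch of that hand-off in the repo's TensorFlow 1.x style; the checkpoint path, feature count, and variable names are assumptions for illustration, not necessarily the repo's exact code:

```python
import tensorflow as tf  # TensorFlow 1.x, as used by FM.py / AFM.py

n_features, k = 90445, 16                         # placeholder feature count; k matches the FM factor
save_file = '../pretrain/fm_ml-tag_16/ml-tag_16'  # placeholder FM checkpoint path

# Parameters assumed to be saved by FM and shared with AFM under the same names.
fm_params = {
    'feature_embeddings': tf.Variable(
        tf.random_normal([n_features, k], 0.0, 0.01), name='feature_embeddings'),
    'feature_bias': tf.Variable(tf.zeros([n_features, 1]), name='feature_bias'),
    'bias': tf.Variable(tf.constant(0.0), name='bias'),
}

# Restore only the shared FM parameters; the attention network keeps its
# fresh random initialization and is learned from scratch.
saver = tf.train.Saver(fm_params)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, save_file)
```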
OK, but I think the gap between training loss and validation loss is still huge.
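Beyond stronger dropout (--keep) or a larger --lamda_attention, one guard against such a gap is to stop on validation RMSE rather than training loss; the training script has an early-termination check in this spirit, and the sketch below is an illustrative stand-in (the function name and signature are hypothetical), not the repo's exact code:

```python
def should_stop(valid_rmse, patience=5):
    """Return True once validation RMSE has failed to improve for
    `patience` consecutive epochs (lower is better). Illustrative only.
    """
    if len(valid_rmse) <= patience:
        return False
    recent = valid_rmse[-(patience + 1):]
    # Stop only if no epoch in the recent window beat its predecessor.
    return all(later >= earlier for earlier, later in zip(recent, recent[1:]))
```

For run #2 above, validation RMSE is still falling at epoch 10, so this rule would keep training.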
Yes.
Related Issues (20)
- Glorot initialization HOT 5
- Can you share the code? HOT 4
- Could you please share the code? HOT 1
- Could you share your code?
- How much memory is needed to pretrain FM with factor 256?
- Unable to reproduce the paper's results with AFM HOT 5
- No module named LoadData
- What is the meaning of valid_dimension?
- Why is there little improvement from the attention model in my experiment? HOT 1
- Is there a bug in reduce_sum and softmax? HOT 2
- Why are the labels in the ml-tag and frappe data 1 and -1 rather than a rating between 0 and 1?
- Hello Prof. He, is this RMSE the loss for rating prediction?
- Question about attention_score and interaction_score
- The batch_norm_layer function does not seem to be used in the source code?
- Support for real-valued features HOT 1
- Difference between FM and AFM is not as pronounced as in the original paper HOT 3
- Data-processing code HOT 1
- Hello, what are the optimal parameters for the best result on frappe? HOT 8
- dropout in validation/evaluation HOT 5
- NaN appears during AFM training HOT 3