Comments (3)
Hi, thanks for the kind words. I trained the model three times for 25 epochs each and got accuracies of 97.7%, 98.08%, and 97.53%. I used the config and architecture below, which should match what is in the Git repository. Maybe you can train a second and third time and see whether your results differ?
Otherwise, the hardware used might also have a minor influence on the final results.
Additionally, you can increase the number of reasoning steps (see the config file), which will make training a bit slower but should ideally increase your final accuracy a little.
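To judge whether a single result falls within normal run-to-run variation, it can help to summarize the three reported accuracies. The snippet below is only a rough sketch using Python's standard library; with three runs the spread estimate is coarse and should be read as an indication, not a statistical test.

```python
from statistics import mean, stdev

# Accuracies (%) of the three 25-epoch runs reported above.
accuracies = [97.7, 98.08, 97.53]

avg = mean(accuracies)
spread = stdev(accuracies)  # sample standard deviation (divides by n - 1)

print(f"mean = {avg:.2f}%, std = {spread:.2f}%")
# mean = 97.77%, std = 0.28%
```

A new run that lands a few tenths of a percent away from the mean is therefore hard to distinguish from seed or hardware noise.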
Using config:
{'CUDA': True,
 'DATASET': {'DATA_DIR': '/data/CLEVR/CLEVR_v1.0'},
 'GPU_ID': '0',
 'TRAIN': {'BATCH_SIZE': 64,
           'CLIP': 8,
           'CLIP_GRADS': True,
           'FLAG': True,
           'LEARNING_RATE': 0.0001,
           'MAX_EPOCHS': 25,
           'MAX_STEPS': 4,
           'PATIENCE': 5,
           'SNAPSHOT_INTERVAL': 5,
           'WEIGHT_INIT': 'xavier_uniform'},
 'WORKERS': 4}
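As an illustration of what the `CLIP_GRADS` and `CLIP` entries typically control, here is a minimal sketch of a training step with gradient clipping. It assumes `CLIP` is used as a maximum gradient norm passed to `torch.nn.utils.clip_grad_norm_`; whether the repository clips by norm or by value is not stated here, and `training_step` is a hypothetical helper, not a function from the repository.

```python
import torch
import torch.nn as nn

CLIP_GRADS = True  # mirrors 'CLIP_GRADS' in the config
CLIP = 8.0         # mirrors 'CLIP'; ASSUMED here to be a max gradient norm


def training_step(model, optimizer, loss_fn, inputs, targets):
    """Hypothetical single optimization step with optional norm clipping."""
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    if CLIP_GRADS:
        # Rescales all gradients in place so their combined L2 norm is <= CLIP.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=CLIP)
    optimizer.step()
    return loss.item()
```

If clipping is enabled without a threshold, the behaviour depends on how the training script handles the missing value, so it is worth setting both entries explicitly.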
This was my network architecture for the training runs:
MACNetwork(
  (input_unit): InputUnit(
    (stem): Sequential(
      (0): Dropout(p=0.18)
      (1): Conv2d(1024, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (2): ELU(alpha=1.0)
      (3): Dropout(p=0.18)
      (4): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (5): ELU(alpha=1.0)
    )
    (encoder_embed): Embedding(90, 300)
    (encoder): LSTM(300, 256, batch_first=True, bidirectional=True)
    (embedding_dropout): Dropout(p=0.15)
    (question_dropout): Dropout(p=0.08)
  )
  (output_unit): OutputUnit(
    (question_proj): Linear(in_features=512, out_features=512, bias=True)
    (classifier): Sequential(
      (0): Dropout(p=0.15)
      (1): Linear(in_features=1024, out_features=512, bias=True)
      (2): ELU(alpha=1.0)
      (3): Dropout(p=0.15)
      (4): Linear(in_features=512, out_features=28, bias=True)
    )
  )
  (mac): MACUnit(
    (control): ControlUnit(
      (attn): Linear(in_features=512, out_features=1, bias=True)
      (control_input): Sequential(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): Tanh()
      )
      (control_input_u): ModuleList(
        (0): Linear(in_features=512, out_features=512, bias=True)
        (1): Linear(in_features=512, out_features=512, bias=True)
        (2): Linear(in_features=512, out_features=512, bias=True)
        (3): Linear(in_features=512, out_features=512, bias=True)
      )
    )
    (read): ReadUnit(
      (concat): Linear(in_features=1024, out_features=512, bias=True)
      (concat_2): Linear(in_features=512, out_features=512, bias=True)
      (attn): Linear(in_features=512, out_features=1, bias=True)
      (dropout): Dropout(p=0.15)
      (kproj): Linear(in_features=512, out_features=512, bias=True)
      (mproj): Linear(in_features=512, out_features=512, bias=True)
      (activation): ELU(alpha=1.0)
    )
    (write): WriteUnit(
      (linear): Linear(in_features=1024, out_features=512, bias=True)
    )
  )
)
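One quick sanity check on a printout like this is that the layer sizes line up. The sketch below rebuilds only the embedding and the bidirectional LSTM with the sizes shown above and confirms that the encoder emits 512 features per token (2 directions x 256), which matches the 512-dimensional inputs of the later `Linear` layers. The batch of random token ids is made up, and concatenating the two final hidden states into a question vector is an assumption about how such a model is usually wired, not a claim about the repository's code.

```python
import torch
import torch.nn as nn

# Same sizes as (encoder_embed) and (encoder) in the printout.
embed = nn.Embedding(90, 300)
encoder = nn.LSTM(300, 256, batch_first=True, bidirectional=True)

# Hypothetical batch: 2 questions of 7 token ids each.
tokens = torch.randint(0, 90, (2, 7))
outputs, (h_n, _) = encoder(embed(tokens))

print(outputs.shape)  # torch.Size([2, 7, 512])

# ASSUMPTION: question vector = final forward and backward hidden states, concatenated.
question = torch.cat([h_n[0], h_n[1]], dim=-1)
print(question.shape)  # torch.Size([2, 512])
```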
from pytorch-mac-network.
Thanks for the reply. It seems the only differences so far are that gradient clipping was set to 'True' but with no value, and the number of workers, which was 8 in my cfg. I'm retraining and will let you know.
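Differences like these are easy to miss when comparing nested configs by eye. A small helper along the following lines can list them; `flatten` and `diff_configs` are hypothetical names, and the two dictionaries are illustrative excerpts built from the values mentioned in this thread, not complete config files.

```python
MISSING = "<missing>"


def flatten(cfg, prefix=""):
    """Flatten a nested dict into {'A.B': value} form."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    fa, fb = flatten(a), flatten(b)
    keys = sorted(set(fa) | set(fb))
    return {k: (fa.get(k, MISSING), fb.get(k, MISSING))
            for k in keys if fa.get(k, MISSING) != fb.get(k, MISSING)}


# Illustrative excerpts only.
reference = {"TRAIN": {"CLIP": 8, "CLIP_GRADS": True}, "WORKERS": 4}
mine = {"TRAIN": {"CLIP_GRADS": True}, "WORKERS": 8}

print(diff_configs(reference, mine))
# {'TRAIN.CLIP': (8, '<missing>'), 'WORKERS': (4, 8)}
```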
Okay, I ran it again with your configuration and got 97.05% on val, so I guess it was that. Closing!