Thanks for your great work! I have been trying to reproduce the results, but the training loss does not decrease (it rises from about 6.2 to about 7.5) and the validation accuracy stays at essentially 0 (at most 0.00205). I followed the instructions and did not change the code, except to report the ETA in seconds. Do you have any idea what is happening?
Here is part of the training log:
Start Training, Data Length: 488766
epoch=0,train_iter=0,eta=36370.56232s,CE V=6.21437,lr=0.004800,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=34.82164
epoch=0,train_iter=1,eta=1969.14507s,CE V=6.45720,lr=0.004800,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=36.19475
epoch=1,train_iter=3819,eta=2056.03918s,CE V=7.47496,lr=0.004799,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.12496
epoch=2,train_iter=7638,eta=1983.88589s,CE V=7.74652,lr=0.004797,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=34.23733
epoch=3,train_iter=11457,eta=1980.25018s,CE V=7.42752,lr=0.004793,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=34.97991
epoch=4,train_iter=15276,eta=2137.25215s,CE V=7.26190,lr=0.004787,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.26590
epoch=5,train_iter=19095,eta=2146.40197s,CE V=7.49472,lr=0.004779,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=33.89755
epoch=6,train_iter=22914,eta=2150.36547s,CE V=8.05367,lr=0.004770,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.19277
epoch=7,train_iter=26733,eta=2126.86129s,CE V=7.35550,lr=0.004760,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=34.21873
epoch=8,train_iter=30552,eta=2104.79027s,CE V=7.92131,lr=0.004748,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=36.37280
epoch=9,train_iter=34371,eta=2036.25903s,CE V=7.50125,lr=0.004734,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=35.59670
epoch=10,train_iter=38190,eta=2054.36018s,CE V=7.56499,lr=0.004718,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00205,eta=37.45618
epoch=12,train_iter=45828,eta=2067.50536s,CE V=7.65708,lr=0.004683,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.46259
epoch=13,train_iter=49647,eta=2110.89804s,CE V=7.37481,lr=0.004662,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.27319
epoch=14,train_iter=53466,eta=2129.09115s,CE V=8.00831,lr=0.004641,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00205,eta=37.12486
epoch=15,train_iter=57285,eta=2052.47540s,CE V=7.74086,lr=0.004618,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.35628
epoch=16,train_iter=61104,eta=2103.29428s,CE V=7.08775,lr=0.004593,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=38.62439
epoch=17,train_iter=64923,eta=2127.42126s,CE V=7.76207,lr=0.004566,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=39.42123
epoch=18,train_iter=68742,eta=2032.93654s,CE V=7.45536,lr=0.004539,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=35.68857
epoch=19,train_iter=72561,eta=2047.84631s,CE V=7.57583,lr=0.004509,best_acc=0.000000
Start Testing, Data Length: 25000
start testing
v_acc=0.00000,eta=37.30936
epoch=20,train_iter=76380,eta=2126.75749s,CE V=7.46031,lr=0.004479,best_acc=0.000000
Terminated
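
In case it helps with diagnosing this, here is a minimal sketch of how the log above can be parsed to look at the loss and accuracy trend. It assumes only the line format shown in the log; the names `parse_log`, `TRAIN_RE`, and `ACC_RE` are my own and are not part of the repository.

```python
import re

# Patterns match the line formats seen in the log above (an assumption:
# other runs or code versions may print different fields).
TRAIN_RE = re.compile(
    r"epoch=(\d+),train_iter=(\d+),eta=([\d.]+)s,"
    r"CE V=([\d.]+),lr=([\d.]+),best_acc=([\d.]+)"
)
ACC_RE = re.compile(r"v_acc=([\d.]+),eta=([\d.]+)")


def parse_log(text):
    """Return (losses, accs) extracted from the raw log text.

    losses: list of (epoch, train_iter, loss) tuples from training lines
    accs:   list of validation accuracies from 'v_acc=' lines
    """
    losses, accs = [], []
    for line in text.splitlines():
        m = TRAIN_RE.search(line)
        if m:
            losses.append((int(m.group(1)), int(m.group(2)), float(m.group(4))))
            continue
        m = ACC_RE.search(line)
        if m:
            accs.append(float(m.group(1)))
    return losses, accs


if __name__ == "__main__":
    # Two training lines and one test line copied from the log above.
    sample = (
        "epoch=0,train_iter=0,eta=36370.56232s,CE V=6.21437,lr=0.004800,best_acc=0.000000\n"
        "start testing\n"
        "v_acc=0.00000,eta=34.82164\n"
        "epoch=20,train_iter=76380,eta=2126.75749s,CE V=7.46031,lr=0.004479,best_acc=0.000000\n"
    )
    losses, accs = parse_log(sample)
    print("first loss:", losses[0][2], "last loss:", losses[-1][2])
    print("max v_acc:", max(accs))
```

Running this over the full log makes it easy to confirm that the loss never drops below its initial value and that the accuracy never moves meaningfully above zero.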