Comments (4)
Hi Shuikehuo,
We used a ResNet-56 without batch norm from [1], which explains the accuracy difference (and the weaker baseline). It was trained with the SGD optimizer for 50k steps with a batch size of 128.
The experiment shows the effect of noisy labels on test accuracy when training with the logistic loss versus the bi-tempered logistic loss. We expect the accuracy delta to remain similar when training a ResNet-50 (with batch norm) or a model of similar capacity. We will soon release the code for the ResNet-56 model without batch norm from [1] so that the results can be reproduced.
Thanks,
[1] Identity Matters in Deep Learning, Moritz Hardt, Tengyu Ma, https://arxiv.org/pdf/1611.04231.pdf
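For readers who want to experiment before the code release, below is a minimal, self-contained NumPy sketch of the bi-tempered logistic loss following the formulas in the paper. It is an illustrative re-implementation, not the repository's implementation, and the fixed-point normalization shown assumes t2 > 1; the temperatures in the example call are placeholders. At t1 = t2 = 1 it reduces to ordinary softmax cross-entropy, i.e. the logistic-loss baseline in the comparison above.

```python
# Illustrative NumPy re-implementation of the bi-tempered logistic loss
# (Amid et al., "Robust Bi-Tempered Logistic Loss Based on Bregman Divergences").
# Not the repository's code; the fixed-point normalization below assumes t2 > 1.
import numpy as np


def log_t(x, t):
    """Tempered logarithm: (x^(1-t) - 1) / (1 - t); ordinary log at t = 1."""
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def exp_t(x, t):
    """Tempered exponential: [1 + (1-t) x]_+^(1/(1-t)); ordinary exp at t = 1."""
    if t == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))


def tempered_softmax(activations, t, num_iters=5):
    """Tempered softmax; the normalization is found by fixed-point iteration (t > 1)."""
    mu = activations.max(axis=-1, keepdims=True)
    a0 = activations - mu
    a = a0
    for _ in range(num_iters):
        z = exp_t(a, t).sum(axis=-1, keepdims=True)
        a = a0 * z ** (1.0 - t)
    z = exp_t(a, t).sum(axis=-1, keepdims=True)
    normalization = -log_t(1.0 / z, t) + mu  # lambda(a) so the probabilities sum to 1
    return exp_t(activations - normalization, t)


def bi_tempered_logistic_loss(activations, labels, t1, t2, num_iters=5, eps=1e-10):
    """Per-example loss:
    sum_i y_i*(log_t1(y_i) - log_t1(p_i)) - (y_i^(2-t1) - p_i^(2-t1)) / (2 - t1)."""
    probs = tempered_softmax(activations, t2, num_iters)
    loss = (labels * (log_t(labels + eps, t1) - log_t(probs, t1))
            - (labels ** (2.0 - t1) - probs ** (2.0 - t1)) / (2.0 - t1))
    return loss.sum(axis=-1)


# Example: one 4-class logit vector with a one-hot label. The temperatures below
# are placeholder values, not the tuned settings from the paper; at t1 = t2 = 1
# the result equals ordinary softmax cross-entropy.
logits = np.array([[2.0, 0.5, -1.0, 0.0]])
onehot = np.array([[1.0, 0.0, 0.0, 0.0]])
print(bi_tempered_logistic_loss(logits, onehot, t1=0.8, t2=1.2))
print(bi_tempered_logistic_loss(logits, onehot, t1=1.0, t2=1.0))
```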
@rohan-anil Is there a particular reason to use ResNet-56 without batch normalization? This network does not seem to be used much in experiments.
When I use ResNet-110 with BN (as introduced in the ResNet v1 paper), the accuracy delta (improvement) does not seem very noticeable, for either clean or noisy labels.
Hi Chi,
Thank you for your interest in our method.
We used the ResNet-56 model because we had the baseline readily available (Moritz was at Google, and we used his codebase). I noticed that the bi-tempered loss still gives some improvement in your case. You might achieve even more by tuning t1 and t2 (I would suggest trying a larger t2 value).
Ehsan
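A quick way to see why t1 and t2 matter (illustrative numbers only, not a tuning recommendation), using the tempered log/exp definitions from the paper: t1 < 1 bounds the per-example loss, so a few badly mislabeled points cannot dominate training, while t2 > 1 makes the tempered softmax tail decay polynomially instead of exponentially.

```python
# Effect of t1 and t2, using the tempered log/exp from the paper (illustrative values).
import numpy as np

def log_t(x, t):
    # Tempered logarithm: (x^(1-t) - 1) / (1 - t); ordinary log at t = 1.
    return np.log(x) if t == 1.0 else (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    # Tempered exponential: [1 + (1-t) x]_+^(1/(1-t)); ordinary exp at t = 1.
    return np.exp(x) if t == 1.0 else np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

p = 1e-6  # probability the model assigns to a (possibly mislabeled) target class
print(-log_t(p, 1.0))    # ~13.82: standard log loss grows without bound as p -> 0
print(-log_t(p, 0.8))    # ~4.68, capped at 1/(1 - 0.8) = 5: bounded loss for t1 < 1
print(exp_t(-4.0, 1.0))  # ~0.018: softmax tail decays exponentially
print(exp_t(-4.0, 1.2))  # ~0.053: heavier (polynomial) tail for t2 > 1
```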
@eamid
Thanks a lot!
Related Issues (15)
- training is too slow
- How to calculate "simple integration" in Chapter 3 HOT 1
- Why did you use Bregman divergence instead of KL divergence? HOT 7
- Use sigmoid or tempered_sigmoid for prediction? HOT 4
- Nan loss during training HOT 10
- noisy instances HOT 2
- How do I implement Tempered_softmax in C? HOT 1
- loss_test.py fails in test_gradient_error HOT 1
- Accuracy results on MNIST HOT 3
- How are the labels corrupted? HOT 2
- Output activation and bi-tempered loss HOT 1
- TF 2.0 Version HOT 2
- Why is 5 the default num_iters? HOT 3
- ValueError: Rank mismatch: Rank of labels (received 2) should equal rank of logits minus 1 (received 2) HOT 2