Comments (3)
Top-1 accuracy on imagenet2012 should be 85.082% for that model. We use `im = tf.image.resize(im, [384, 384])` and then normalize the image with `im = (im - 127.5) / 127.5` - see https://github.com/google-research/vision_transformer/blob/master/vit_jax/input_pipeline.py
from vision_transformer.
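For reference, here is a minimal NumPy sketch of that normalization step (the resize itself is done with `tf.image.resize` in the repo; TensorFlow is not assumed here):

```python
import numpy as np

def normalize(im: np.ndarray) -> np.ndarray:
    """Map uint8 pixel values [0, 255] to [-1, 1],
    matching (im - 127.5) / 127.5 from vit_jax/input_pipeline.py."""
    return (im.astype(np.float32) - 127.5) / 127.5

# a black and a white pixel map to the range endpoints
im = np.array([[0, 255]], dtype=np.uint8)
out = normalize(im)
assert out.min() == -1.0 and out.max() == 1.0
```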
My current B_16 top-1 accuracy is also a bit lower, at 83.6%. One difference I found looking at this more closely is that the image normalization I was using was based on the vit_jax.ipynb sample notebook, which is subtly different. I've replaced it in my own code with the `im = (im - 127.5) / 127.5` version; I'm seeing slight differences and will report back if this closes the gap on the reported score.
[Edit: ok - the normalization difference in score is very small (0.01%), so I'm also checking the effect of the resize routine on the ViT-B_16 top-1 score and will report back.]
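One common source of resize mismatch is the sampling convention. As an illustration only (this is not the repo's code, and the convention names are assumptions), a 1-D NumPy sketch showing that corner-aligned and half-pixel-center resampling of the same row disagree at interior samples:

```python
import numpy as np

def resample_corner_aligned(x: np.ndarray, n: int) -> np.ndarray:
    # sample positions align the first and last input/output samples
    src = np.linspace(0, len(x) - 1, n)
    return np.interp(src, np.arange(len(x)), x)

def resample_half_pixel(x: np.ndarray, n: int) -> np.ndarray:
    # sample positions use half-pixel centers (believed to be the
    # default convention of tf.image.resize in TF2)
    scale = len(x) / n
    src = np.clip((np.arange(n) + 0.5) * scale - 0.5, 0, len(x) - 1)
    return np.interp(src, np.arange(len(x)), x)

row = np.array([0.0, 10.0, 20.0, 30.0])
a = resample_corner_aligned(row, 3)  # [0, 15, 30]
b = resample_half_pixel(row, 3)
# the two conventions disagree at the edges, so pixel values
# (and hence model inputs) differ even for the same target size
assert not np.allclose(a, b)
```

A mismatch like this between two evaluation pipelines would shift every input slightly, which is consistent with an accuracy gap of a few tenths of a percent.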
So the normalization difference cited above changes very little (0.01%), and matching the resize code exactly also only moved my score by about 0.4% - the best top-1 accuracy I could get matching both, for the ViT-B_16 model, was 84.01%. So I'd be keen to do a deeper dive to diagnose this if anyone else has gotten to 85% and would be willing to compare exact results. I'd suggest a good starting point would be to look at the accuracy of a particular class (50 images each) and then potentially dive down into any differences for particular files, etc.
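The per-class comparison suggested above can be sketched as follows (a minimal NumPy helper, not from the repo; the toy labels are illustrative):

```python
import numpy as np

def per_class_accuracy(y_true: np.ndarray, y_pred: np.ndarray,
                       num_classes: int) -> np.ndarray:
    """Top-1 accuracy broken down by class label.

    Running this on two evaluation pipelines (e.g. over the 50
    validation images per ImageNet class) and diffing the results
    points at the classes, and then files, where predictions diverge.
    """
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = (y_pred[mask] == c).mean()
    return acc

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
acc = per_class_accuracy(y_true, y_pred, 3)
# class 0: 1/2 correct, class 1: 2/2, class 2: 1/2
assert np.allclose(acc, [0.5, 1.0, 0.5])
```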