
Comments (3)

Haochen-Wang409 commented on July 19, 2024

This is a very interesting question, and I am also very curious about it.

Before answering this question, I would like to clarify that they did not perform exactly the same: if two decimal places are retained, their performance differs.

I think the similar performance might be a coincidence, but the underlying message is that both HPM and DropPos provide an inherent inductive bias to the masked autoencoding paradigm. If we take positions as the reconstruction target, DropPos can, to some extent, be considered a generalized masked autoencoding method, although it never uses a reconstruction loss.
This inherent inductive bias, I guess, might be some global prior. Specifically, HPM knows the discriminative parts of an image, and DropPos knows the precise position of each patch within the whole image. Both features reflect that the two methods have learned some kind of global information.

The above statement is purely conjecture. I am looking forward to any comments that can verify the statement. Again, this is a very interesting question and I am curious about the answer.
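
To make the "positions as the reconstruction target" reading concrete, here is a minimal PyTorch sketch of a DropPos-style objective. The module names, shapes, and the 75% drop ratio are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# DropPos-style objective (illustrative): remove the positional embedding
# for a random subset of patches, then train the network to classify each
# patch's true position. Shapes and the 75% drop ratio are assumptions.
B, N, D = 8, 196, 768                        # batch, patches (14x14), dim
layer = nn.TransformerEncoderLayer(d_model=D, nhead=12, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
pos_head = nn.Linear(D, N)                   # logits over the N positions
pos_embed = nn.Parameter(torch.zeros(1, N, D))

tokens = torch.randn(B, N, D)                # patch embeddings
dropped = torch.rand(B, N) < 0.75            # drop positions of 75% of patches
pos = torch.where(dropped.unsqueeze(-1), torch.zeros_like(pos_embed), pos_embed)

logits = pos_head(encoder(tokens + pos))     # (B, N, N)
target = torch.arange(N).expand(B, N)        # ground-truth position indices
loss = F.cross_entropy(logits[dropped], target[dropped])  # dropped patches only
```

The point of the sketch is only that the supervision signal is positional classification rather than pixel reconstruction, which matches the "generalized masked autoencoding" view above.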


rayleizhu commented on July 19, 2024

Thanks for your reply.

I noticed that both DropPos and HPM are built on the UM-MAE codebase. Recently, I reproduced the MAE ViT-B 200-epoch pretraining results with the official UM-MAE release.

Now, I plan to reproduce the 800-epoch pretraining and finetuning results of DropPos and HPM using your released code. In both repositories, are the provided scripts the ones you used to produce the results in your papers (i.e., 84.2% top-1 accuracy for ViT-B with 800-epoch pretraining and 100-epoch finetuning)? Since the 800-epoch experiments are expensive and carbon-intensive, I do not want to waste resources on invalid runs caused by inconsistent hyperparameters or similar accidents.


Haochen-Wang409 commented on July 19, 2024

The only modification is to set --epochs 800. However, I strongly recommend reproducing the 200-epoch experiments first, to ensure that the environment (PyTorch version, CUDA version, etc.) does not significantly influence the result.
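
As a cheap sanity check before committing to a long run, one might log the environment details mentioned above; this is a generic PyTorch snippet, not part of the DropPos or HPM repositories:

```python
import torch

# Record the environment before a long run; mismatches in these versions
# are a common source of small reproduction gaps.
print("PyTorch :", torch.__version__)
print("CUDA    :", torch.version.cuda)
print("cuDNN   :", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("GPU     :", torch.cuda.get_device_name(0))
```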

