Comments (3)
This is a very interesting question, and I am also very curious about it.
But before answering this question, I would like to clarify that they did not perform *exactly* the same: when reported to two decimal places, their results differ.
I think the similar performance might be a coincidence, but the underlying message is that both HPM and DropPos inject some inherent inductive bias into the masked autoencoding paradigm. (If we take positions as the reconstruction target, DropPos can be viewed as a generalized masked autoencoding method to some extent, even though DropPos never uses a reconstruction loss.)
This inherent inductive bias, I guess, might be some kind of global prior. Specifically, HPM learns the discriminative parts of an image, and DropPos learns the precise position of each patch within the whole image. Both properties suggest that the two methods have learned some kind of global information.
The above statement is purely conjecture. I am looking forward to any comments that can verify the statement. Again, this is a very interesting question and I am curious about the answer.
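To make the "positions as the reconstruction target" view concrete, here is a toy NumPy sketch (my own illustration, not code from either repository; all shapes and values are made up): both paradigms reduce to predicting a per-patch target from a corrupted input, pixel values for MAE/HPM-style reconstruction and a position index for DropPos-style prediction.

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, dim = 16, 8

# MAE/HPM-style target: reconstruct patch pixels -> regression (MSE loss).
patch_pixels = rng.normal(size=(num_patches, dim))
pred_pixels = patch_pixels + 0.1 * rng.normal(size=(num_patches, dim))
mse_loss = np.mean((pred_pixels - patch_pixels) ** 2)

# DropPos-style target: predict each patch's position -> classification
# (cross-entropy over the num_patches possible positions).
true_pos = rng.permutation(num_patches)            # ground-truth position of each patch
logits = rng.normal(size=(num_patches, num_patches))
logits[np.arange(num_patches), true_pos] += 3.0    # bias toward the correct position
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
ce_loss = -np.mean(log_probs[np.arange(num_patches), true_pos])

print(f"MSE (pixel reconstruction): {mse_loss:.3f}")
print(f"CE  (position prediction):  {ce_loss:.3f}")
```

In both cases the target is defined over the whole image grid, which is one way to read the "global prior" conjecture above.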
from droppos.
Thanks for your reply.
I noticed that both DropPos and HPM are based on the code of UM-MAE. Recently, I reproduced the MAE ViT-B 200-epoch pretraining results with the official UM-MAE release.
Now I plan to reproduce the 800-epoch pretraining-finetuning results of DropPos and HPM using your released code. In both the DropPos and HPM repositories, are the provided scripts the ones you used to produce the results in your papers (i.e., 84.2% top-1 accuracy with 800-epoch pretraining and 100-epoch finetuning for ViT-B)? Since the 800-epoch experiments are expensive and carbon-intensive, I do not want to waste resources running invalid experiments due to inconsistent hyperparameters or similar accidents.
The only modification is to set `--epochs 800`. However, it is strongly recommended to reproduce the experiments with 200 epochs first, to ensure that the environment (PyTorch version, CUDA version, etc.) does not influence the results significantly.
Related Issues (3)
- Final value of the loss in the pretraining stage HOT 2
- pretrain model HOT 2