
Comments (3)

Haochen-Wang409 commented on July 19, 2024

This is a very interesting question, and I am also very curious about it.

Before answering this question, I would like to clarify that they did not perform exactly the same: if two decimal places are retained, their performance differs.

I think the similar performance might be a coincidence, but the underlying message is that both HPM and DropPos provide an inherent inductive bias to the masked autoencoding paradigm. If we take positions as the reconstruction target, DropPos can, to some extent, be considered a generalized masked autoencoding method, although it never uses a reconstruction loss.
This inherent inductive bias, I guess, might be some global prior. Specifically, HPM knows the discriminative parts of an image, and DropPos knows the precise position of each patch within the whole image. Both features reflect that the two methods have learned some kind of global information.

The above statement is purely conjecture. I am looking forward to any comments that can verify the statement. Again, this is a very interesting question and I am curious about the answer.
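
To make the "positions as the reconstruction target" reading concrete, here is a minimal PyTorch sketch of a DropPos-style objective. The module names, shapes, and the 75% drop ratio are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# DropPos-style objective (illustrative): remove the positional embedding
# for a random subset of patches, then train the network to classify each
# patch's true position. Shapes and the 75% drop ratio are assumptions.
B, N, D = 8, 196, 768                        # batch, patches (14x14), dim
layer = nn.TransformerEncoderLayer(d_model=D, nhead=12, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
pos_head = nn.Linear(D, N)                   # logits over the N positions
pos_embed = nn.Parameter(torch.zeros(1, N, D))

tokens = torch.randn(B, N, D)                # patch embeddings
dropped = torch.rand(B, N) < 0.75            # drop positions of 75% of patches
pos = torch.where(dropped.unsqueeze(-1), torch.zeros_like(pos_embed), pos_embed)

logits = pos_head(encoder(tokens + pos))     # (B, N, N)
target = torch.arange(N).expand(B, N)        # ground-truth position indices
loss = F.cross_entropy(logits[dropped], target[dropped])  # dropped patches only
```

The point of the sketch is only that the supervision signal is positional classification rather than pixel reconstruction, which matches the "generalized masked autoencoding" view above.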


rayleizhu commented on July 19, 2024

Thanks for your reply.

I noticed that both DropPos and HPM are built on the UM-MAE codebase. Recently, I reproduced the MAE ViT-B 200-epoch pretraining results with the official UM-MAE release.

Now, I plan to reproduce the 800-epoch pretraining and finetuning results of DropPos and HPM using your released code. In both repositories, are the provided scripts the ones you used to produce the results in your papers (i.e., 84.2% top-1 accuracy for ViT-B with 800-epoch pretraining and 100-epoch finetuning)? Since the 800-epoch experiments are expensive and carbon-intensive, I do not want to waste resources on invalid runs caused by inconsistent hyperparameters or similar accidents.


Haochen-Wang409 commented on July 19, 2024

The only modification is to set --epochs 800. However, I strongly recommend reproducing the 200-epoch experiments first, to ensure that the environment (PyTorch version, CUDA version, etc.) does not significantly influence the result.
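
As a cheap sanity check before committing to a long run, one might log the environment details mentioned above; this is a generic PyTorch snippet, not part of the DropPos or HPM repositories:

```python
import torch

# Record the environment before a long run; mismatches in these versions
# are a common source of small reproduction gaps.
print("PyTorch :", torch.__version__)
print("CUDA    :", torch.version.cuda)
print("cuDNN   :", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("GPU     :", torch.cuda.get_device_name(0))
```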

