Comments (6)

joe-siyuan-qiao commented on August 31, 2024

Hi, thanks for the question.

AWSConv keeps the mean and std of the weights loaded from the ImageNet pre-trained checkpoints. So no matter what gradients the weights receive when fine-tuning on COCO, they cannot drift too far from the pre-trained checkpoints. It is like adding anchors to the weights.

We used it because when playing with pre-trained backbones, it was very easy to get a NaN loss. Fixing the weight statistics expands the space of architecture designs we can explore, since a NaN loss becomes very unlikely as long as we don't push it too hard. As for the performance impact, it depends. It will probably hurt the final performance when training large models for a very long time, since fixing the statistics reduces the effective solution space. But without it, we sometimes got a NaN in the middle of training and wasted plenty of time. So it's highly recommended to keep it.
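As a rough illustration of the anchoring idea described above, here is a framework-free sketch for a single filter. It is not the DetectoRS implementation: the function names, the `EPS` constant, the use of population variance, and the toy weight values are all assumptions made for the example. The statistics are captured once from the pre-trained weights and then reapplied to whatever the raw parameters become.

```python
import math

EPS = 1e-5  # assumed stabilizer; the value used in the repository may differ


def weight_stats(weights):
    """Return (mean, std) of one filter's flattened weights (population variance)."""
    mean = sum(weights) / len(weights)
    var = sum((w - mean) ** 2 for w in weights) / len(weights)
    return mean, math.sqrt(var + EPS)


def anchored_weights(raw, gamma, beta):
    """Standardize the raw weights, then rescale them to the frozen (gamma, beta)."""
    mean, std = weight_stats(raw)
    return [gamma * (w - mean) / std + beta for w in raw]


# Hypothetical pre-trained filter; its statistics are recorded once at load time.
pretrained = [0.2, -0.1, 0.4, 0.3]
beta, gamma = weight_stats(pretrained)

# Simulate raw parameters that drifted far away during fine-tuning.
drifted = [w * 50.0 + 7.0 for w in pretrained]

# The weights actually used by the convolution keep the pre-trained statistics.
effective = anchored_weights(drifted, gamma, beta)
eff_mean, eff_std = weight_stats(effective)
```

Under these assumptions, `eff_mean` and `eff_std` stay close to `beta` and `gamma` regardless of how large the raw parameters grow, which is the sense in which the statistics act as anchors.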

from detectors.

Alien1007 commented on August 31, 2024

Thanks for your reply. I have replaced nn.Conv2d with SAConv2d in the BasicBlock of my model. Then I use nn.Module.load_state_dict(model_weights) to load a pre-trained model. However, I get many missing keys and unexpected keys. Do you have any idea why?

joe-siyuan-qiao commented on August 31, 2024

I've never tried BasicBlock myself, as many mmdetection features are not supported for it. But if you are seeing unexpected keys other than those in the final fully-connected layer, it's likely that you are loading a pre-trained checkpoint that does not match the model. As for the missing keys, if they are something like weight_gamma or weight_beta, you can ignore them. They'll be computed during loading. Any other missing keys imply an error somewhere.
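The triage described above can be sketched as a small helper. This is only an illustration: the function names, the `fc.` prefix (the torchvision ResNet naming convention is assumed here), and the example key names are hypothetical. In PyTorch, `model.load_state_dict(state, strict=False)` returns an object with `missing_keys` and `unexpected_keys` lists, which could be passed to these helpers.

```python
def split_missing_keys(missing_keys, derived_suffixes=("weight_gamma", "weight_beta")):
    """Split missing keys into those recomputed at load time and those that look like real gaps."""
    ignorable = [k for k in missing_keys if k.endswith(derived_suffixes)]
    suspicious = [k for k in missing_keys if not k.endswith(derived_suffixes)]
    return ignorable, suspicious


def unexpected_outside_fc(unexpected_keys, fc_prefix="fc."):
    """Return unexpected keys that do not belong to the final fully-connected layer."""
    return [k for k in unexpected_keys if not k.startswith(fc_prefix)]


# Hypothetical key names, only to show how the helpers behave.
missing = [
    "layer1.0.conv1.weight_gamma",
    "layer1.0.conv1.weight_beta",
    "layer1.0.conv2.switch.weight",
]
unexpected = ["fc.weight", "fc.bias", "layer1.0.conv2.weight"]

ignorable, suspicious = split_missing_keys(missing)
mismatched = unexpected_outside_fc(unexpected)
```

A non-empty `suspicious` or `mismatched` list would suggest that the checkpoint and the model definition do not line up.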

Alien1007 commented on August 31, 2024

Yep, without replacing nn.Conv2d with SAConv2d, the model can be loaded correctly. And it seems that you have used SAC in Bottleneck rather than BasicBlock?

joe-siyuan-qiao commented on August 31, 2024

That's correct. We haven't done any experiments with BasicBlock.

Alien1007 commented on August 31, 2024

Now I understand. Thanks a lot!
