
Comments (14)

rwightman commented on July 2, 2024

FYI, I am working towards a more consistent model creation interface / adapters / layer/BN freeze interface in timm to make all the models suitable as backbones for various detection/segmentation applications. This codebase was one of my intended targets, along with my EfficientDet impl, and possibly detectron2/mmdetection. Right now I'm getting lost down the rabbit hole of trying to get some other models to be JIT script compatible...
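As a rough sketch of what that interface looks like through timm's features_only path (the model name and out_indices here are illustrative, not a prescription):

import timm

# features_only drops the classifier and returns intermediate feature maps,
# which is the shape a detection/segmentation backbone needs
backbone = timm.create_model('resnest50d', pretrained=True,
                             features_only=True, out_indices=(1, 2, 3, 4))

# channel counts of each returned map, handy for wiring up a neck/head
print(backbone.feature_info.channels())  # e.g. [256, 512, 1024, 2048]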


rwightman commented on July 2, 2024

@lessw2020 FBNetV3 is another 'MBConvNet' (the model can be implemented in 10 extra lines in timm); it'll behave much like an EfficientNet, but perhaps a little faster thanks to hard-swish and a few other differences. Don't pay much attention to the wide gap vs the ResNet family in terms of accuracy per FLOP. Heavy depthwise conv use means it'll be memory-throughput bound on GPUs just like EfficientNets, and you'll see lower practical throughput and higher GPU memory usage per FLOP. A 5x FLOP reduction thus will not translate into anything close to a 5x img/sec increase or memory consumption reduction.

It'll be interesting to try applying their searched hparams to other models in timm... I've been applying a similar recipe of EMA weight averaging + RMSProp (modified for stability), stochastic depth with AutoAugment/RandAugment, and mixup for some time now after exploring ideas from the EfficientNet papers. I can basically just plug their numbers in, which are a bit different from the EfficientNet and MobileNetV3 defaults...

Also, quite curious whether they actually used the PyTorch native RMSProp; it tends to blow up with training schemes like this... or perhaps they modified it to be closer to the TF impl like I did.
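For a rough idea of how that recipe maps onto timm's building blocks (names are from timm's current API; the specific hyperparameter values below are illustrative, not the searched ones):

import timm
from timm.data import Mixup
from timm.optim import RMSpropTF
from timm.utils import ModelEmaV2

# stochastic depth is exposed as drop_path_rate at model creation
model = timm.create_model('efficientnet_b0', drop_path_rate=0.2)

# the TF-style RMSProp mentioned above, modified for stability vs the
# PyTorch native implementation
optimizer = RMSpropTF(model.parameters(), lr=0.048, alpha=0.9,
                      eps=1e-3, momentum=0.9, weight_decay=1e-5)

# EMA copy of the weights, refreshed after each optimizer step via
# ema.update(model) and typically used for evaluation
ema = ModelEmaV2(model, decay=0.9999)

# mixup applied to each (images, targets) batch before the forward pass
mixup_fn = Mixup(mixup_alpha=0.2, num_classes=1000)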


lessw2020 commented on July 2, 2024

Awesome, thanks for the initial feedback on putting ResNeSt to use with DETR!

I should also note that the official ResNeSt currently won't export via JIT by default. Apparently the split architecture makes JIT think each split should have its own BN.
Fortunately @rwightman fixed this, and I've verified that you can JIT script export ResNeSt nicely with his version.

Here's a link to his .py with the JIT fix:
https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/resnest.py

and you can get the pretrained weights etc. from his timm model zoo (just use the create_model call):
https://github.com/rwightman/pytorch-image-models
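For example (the 50-layer variant is registered as 'resnest50d' in timm):

import timm

# pretrained=True pulls the ResNeSt-50 weights from timm's model zoo
model = timm.create_model('resnest50d', pretrained=True)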


raijinspecial commented on July 2, 2024

No problemo!

You are right about the tracing issue. I followed the example in pytorch-image-models and adapted @rwightman's fixes to the original splat.py from zhanghang1989's repo, as I'd found it a bit easier to install resnest that way, but either should work just fine.

You can have detr use resnest with a slight modification to the Backbone class, for example, using the zhanghang repo:

import resnest.torch  # pip install resnest (zhanghang1989's package)
# drop this class into detr's models/backbone.py, where FrozenBatchNorm2d
# and BackboneBase are already defined

class Backbone(BackboneBase):
    """ResNeSt backbone with frozen BatchNorm."""
    def __init__(self, name: str,
                 train_backbone: bool,
                 return_interm_layers: bool,
                 dilation: bool):
        # look up the constructor by string name, e.g. name == 'resnest50',
        # and swap every BN for FrozenBatchNorm2d at construction time
        backbone = getattr(resnest.torch, name)(
            pretrained=True, norm_layer=FrozenBatchNorm2d)
        # kept from detr's original code: only resnet18/34 end in 512
        # channels; all ResNeSt variants end in 2048
        num_channels = 512 if name in ('resnet18', 'resnet34') else 2048
        super().__init__(backbone, train_backbone, num_channels, return_interm_layers)

It should be similar for the pytorch-image-models version as well.
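A sketch of that timm flavour, assuming create_model forwards the norm_layer override to the ResNeSt constructor (it does for timm's ResNet family, but treat the details as illustrative; this also assumes it lives in detr's models/backbone.py alongside BackboneBase and FrozenBatchNorm2d):

import timm

class Backbone(BackboneBase):
    """ResNeSt backbone from timm with frozen BatchNorm."""
    def __init__(self, name: str,
                 train_backbone: bool,
                 return_interm_layers: bool,
                 dilation: bool):
        # name e.g. 'resnest50d'; norm_layer swaps every BN for detr's
        # FrozenBatchNorm2d at construction time
        backbone = timm.create_model(
            name, pretrained=True, norm_layer=FrozenBatchNorm2d)
        num_channels = 2048  # all ResNeSt variants end in 2048 channels
        super().__init__(backbone, train_backbone, num_channels, return_interm_layers)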


raijinspecial commented on July 2, 2024

I have tried this and it works well, at least for one epoch on a machine with GPUs.

I don't have any numbers yet, as my main goal is to run this with torch xla, which also kind of works, at least in the sense of being able to complete a forward pass.
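For context, the torch_xla part amounts to placing the model and a batch on the XLA device (a minimal sketch; assumes torch_xla is installed and a detr model has already been built as model):

import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()  # e.g. one TPU core
model = model.to(device)

# a dummy batch, only to exercise the forward pass
images = torch.randn(2, 3, 800, 800, device=device)
outputs = model(images)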


m-klasen commented on July 2, 2024

Hi,
I've tried both Hang Zhang's and rwightman's resnest versions and modified the detr build function to load them. But as of now I could not get either to outperform the default R50; I am a few mAP short when converged. So far I couldn't figure out why; maybe some non-FrozenBatchNorm training is required.


lessw2020 commented on July 2, 2024

Looks like we should just go directly to the new FBNetV3 architecture - it matches ResNeSt-50 but with 5x fewer FLOPs, so it should be much more performant:
[figure: ResNeSt vs FBNetV3 comparison]
https://arxiv.org/abs/2006.02049v1


lessw2020 commented on July 2, 2024

@rwightman - thanks much for weighing in on this! I did see the MBConv architecture but didn't realize the memory ramifications you've pointed out, so thanks much for the feedback. I'll stick with ResNeSt for now then.
EMA weight averaging does seem really promising and great to hear that you are exploring that space.
One topic regarding augmentation - (now I can't find the paper...) there was a new augmentation where they did CutMix, but guided by feature extraction with adaptive background subtraction (a classic CV technique) to ensure the cut patch was relevant and not, say, background. That set a new accuracy record for ResNet models, so it's worth exploring.

And regarding optimizers - coolMomentum looks promising: https://arxiv.org/abs/2005.14605v1


m-klasen commented on July 2, 2024

Hi, did you manage to improve your results with a resnest backbone?
I've been struggling to achieve any meaningful results compared to the default R50 backbone (yellow line). Maybe it's just an issue of training with the different bottleneck convs...
[figure: training curves, ResNeSt backbones vs default R50 (yellow line)]


lessw2020 commented on July 2, 2024

Hi @mlk1337 - planning to test that this week. Which resnest version did you use, and could I see your mods to the backbone function for hooking it in?
I hooked it in from Hang Zhang's torchhub for a resnest50, but I see the BN is not frozen, and I'm wondering whether you froze it for training, since they freeze the default torch resnet50?


lessw2020 commented on July 2, 2024

note - I'm training now and went ahead and froze the BN and its weights, which should be equivalent to the FrozenBatchNorm in the resnet backbone.
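A minimal sketch of that kind of freeze (my own illustration, not detr's code; note a later model.train() call re-enables the running-stat updates unless the BN modules are put back in eval mode):

import torch

def freeze_bn(model: torch.nn.Module):
    """Freeze running stats and affine params of every BatchNorm2d."""
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.eval()  # stop updating running mean/var
            for p in m.parameters():
                p.requires_grad_(False)  # freeze weight and bias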


lessw2020 commented on July 2, 2024

Hi @mlk1337 - thanks for sharing your results!
I did test, and the results were a bit inconclusive. The mAP shot high at the start but the ending was not very good, and when I ran some images it turned out basically every query had found the object with just a slightly different bbox... so they all had a confidence of .0001 when rounded.
This is in contrast to the same dataset with resnet, where I would get the normal 1 or 2 high-confidence detections and the other queries as no-object.
The one improvement was that on some 'hard images' the resnest version detected objects the regular one did not, so that was the main gain.
Anyway, for now, like you, I am reverting to the default resnet. I would like to try unfrozen BN in the future, but I'm under time pressure, so I'm just going to run with the default resnet atm.
Note, I should say this dataset was tiny, so I can't draw huge conclusions yet, and the fact that it did hit on ones the resnet version missed still implies it has some advantages, but it seems it will take more work than a simple drop-in into detr.


ririya commented on July 2, 2024

I tried ResNeSt-101 and EfficientNet and neither outperformed ResNet-101 on my dataset. ResNet-101-DC5 still outperforms all the other backbones.


munirfarzeen commented on July 2, 2024

Hi,
I would like to use my own pre-trained weights for the backbone. Where can I define that, and how would it affect the rest of the network?
How can I initialize the rest of the network?
backbone = getattr(resnest.torch, name)(pretrained=True, norm_layer=FrozenBatchNorm2d)
What does getattr do?
@raijinspecial how did you add the backbone?
Can you provide the code?
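(On the getattr question: getattr(resnest.torch, name) simply looks up an attribute on the module by its string name, so for name == 'resnest50' the line above is equivalent to the direct call below.)

import resnest.torch
from models.backbone import FrozenBatchNorm2d  # detr's models/backbone.py

# direct call, no string lookup; identical to the getattr version
backbone = resnest.torch.resnest50(
    pretrained=True, norm_layer=FrozenBatchNorm2d)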

