MTMamba

This repository contains the code and models for the following paper:

Baijiong Lin, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, and Ying-Cong Chen. MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders. In European Conference on Computer Vision, 2024.

Requirements

  • PyTorch 2.0.0

  • timm 0.9.16

  • mmsegmentation 1.2.2

  • mamba-ssm 1.1.2

  • CUDA 11.8
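To confirm that an environment matches the versions listed above, a small check along the following lines may help. This is a minimal sketch, not part of the repository: the pip distribution names used as keys (`torch`, `timm`, `mmsegmentation`, `mamba-ssm`) and the helper `check_requirements` are assumptions made for illustration. The CUDA version is not covered here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the Requirements list above.
# The distribution names used as keys are assumptions.
REQUIRED = {
    "torch": "2.0.0",
    "timm": "0.9.16",
    "mmsegmentation": "1.2.2",
    "mamba-ssm": "1.1.2",
}


def check_requirements(required):
    """Return {name: (expected, installed, matches)} for each package."""
    report = {}
    for name, expected in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed
        # Some wheels carry a local suffix such as "+cu118"; ignore it.
        matches = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, matches)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_requirements(REQUIRED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A mismatch does not necessarily mean the code will fail; the list above only states the versions the authors report.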

Usage

  1. Prepare the pretrained Swin-Large checkpoint by running the following commands:

    cd pretrained_ckpts
    bash run.sh
    cd ../
  2. Download the data from PASCALContext.tar.gz and NYUDv2.tar.gz, then extract both archives. Set the db_root variable in configs/mypath.py to the directory that contains the extracted datasets.

  3. Train the model. For example, to train on NYUDv2, run the following command:

    python -m torch.distributed.launch --nproc_per_node 8 main.py --run_mode train --config_exp ./configs/mtmamba_nyud.yml 

     You can also download the pretrained models: mtmamba_nyud.pth.tar and mtmamba_pascal.pth.tar.

  4. Evaluate the model. Run the following command:

    python -m torch.distributed.launch --nproc_per_node 1 main.py --run_mode infer --config_exp ./configs/mtmamba_nyud.yml --trained_model ./ckpts/mtmamba_nyud.pth.tar
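The extraction part of step 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the repository's own tooling: the helper `extract_archives` is hypothetical, and it assumes the two archives have already been downloaded to local paths.

```python
import tarfile
from pathlib import Path


def extract_archives(archives, db_root):
    """Extract each .tar.gz archive into db_root.

    Returns the top-level paths created under db_root.
    """
    db_root = Path(db_root)
    db_root.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            # Collect the top-level entry names stored in the archive.
            top_level = set()
            for member in tar.getmembers():
                parts = Path(member.name).parts
                if parts:
                    top_level.add(parts[0])
            # Use the safer "data" extraction filter where Python provides it.
            if hasattr(tarfile, "data_filter"):
                tar.extractall(db_root, filter="data")
            else:
                tar.extractall(db_root)
        extracted.extend(db_root / name for name in sorted(top_level))
    return extracted


if __name__ == "__main__":
    # Hypothetical local paths; adjust to where the archives were downloaded.
    archives = [p for p in ("PASCALContext.tar.gz", "NYUDv2.tar.gz") if Path(p).exists()]
    for path in extract_archives(archives, "./data"):
        print("extracted:", path)
```

The directory passed as `db_root` here is the value that would then go into the db_root variable in configs/mypath.py, assuming the configuration expects the dataset folders directly beneath it.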

Acknowledgement

We would like to thank the authors who released the following public repositories: Multi-Task-Transformer, mamba, and VMamba.

Citation

If you find this code or work useful in your own research, please cite the following:

@inproceedings{lin2024mtmamba,
  title={{MTM}amba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders},
  author={Lin, Baijiong and Jiang, Weisen and Chen, Pengguang and Zhang, Yu and Liu, Shu and Chen, Ying-Cong},
  booktitle={European Conference on Computer Vision},
  year={2024}
}
