dk-liang / awesome-visual-transformer

3.3K 106.0 391.0 171 KB

A collection of papers about Transformers for vision. Awesome Transformer with Computer Vision (CV)

transformer-cv transformer detr transformer-with-cv transformer-awesome visual-transformer

awesome-visual-transformer's Introduction

Website: http://dk-liang.github.io/

Google Scholar: https://scholar.google.com/dk-liang

awesome-visual-transformer's People

Contributors

Stargazers

Watchers

Forkers

andyzhang59 songkq zymale klonggan chaoso tcwltcwl yangsenwxy ilovepopcorn witdsl pingpingzhang wuterry chaoshengt dreamhua82 gedamua maliangzhibi violet998 zhangzheng0131 fredhuang16 slam-box wintersurvival tszhang97 michealray bismex leviosaaaa liujianzhao6328057 defiler24 zengyh1900 software8899 dh0000000001 xingchenzhang july-zh yangyangkiki mczhuge 1104662797 duxiangcheng senwang98 godgang4885 huiqinwu yutinyang andylau-bit georgeggggg cvmi-uestc pumifen zaczgao jawaechan zongyang-li flamato zhangsdly w64228013 godofpdog hzhang57 zzzzlalala lzu-cvpr zerinhwang03 zhanggongjie dumpmemory guoleisun rayguan97 wuseguang wangyingquan jlqzzz hengxyz zn-qiao z7zuqer sailfish009 tjufan haoyev5 huayuuu alcinos youngbaby123 zhangkai2017 niluanwudidadi alphaarking andrew-ng-s-number-one-fan haofeng98 hwguo11 tyroneli fenglixue futureprecd overbestfitting janetwise zishanqin niupeng177 laureateen btgws just-started rodrigoieh chinayi finspire13 jiangbo-shi hucaofighting refrain-lhc suixiaodan sailor-z cvlinks qtjiebin zhendongwang6 qyou snehilsanyal mbyase

awesome-visual-transformer's Issues

some new papers to add

Hi. I came across these papers, so it's a good idea to add them so we can refer to them when we decide to read them later (I hope I can finally start reading my never-ending list :)

Three things everyone should know about Vision Transformers: https://arxiv.org/pdf/2203.09795.pdf
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection: https://arxiv.org/pdf/2203.03605.pdf
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection: https://arxiv.org/pdf/2112.01526.pdf
A-ViT: Adaptive Tokens for Efficient Vision Transformer: https://arxiv.org/pdf/2112.07658.pdf
Shunted Self-Attention via Multi-Scale Token Aggregation: https://arxiv.org/pdf/2111.15193.pdf

Paper Status of P2T: Pyramid Pooling Transformer

Dear Dingkang,

Thanks a lot for your project. Our paper P2T has been accepted by IEEE TPAMI 2022 recently.
Could you please update the status of P2T?
BTW, the full code of P2T has also been released here: https://github.com/yuhuan-wu/P2T
The IEEE online version is here: https://ieeexplore.ieee.org/document/9870559

Best,
Yu-Huan

add Styleformer

Styleformer: Transformer based Generative Adversarial Networks with Style Vector
PDF: https://arxiv.org/abs/2106.07023
Code: https://github.com/Jeeseung-Park/Styleformer

add UniFormer

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning

Accepted by ICLR 2022
arxiv: https://arxiv.org/abs/2201.04676
code: https://github.com/Sense-X/UniFormer

add some papers

Hi, here are some recent papers I read that are missing from this list:

TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
https://arxiv.org/pdf/2106.11297v2.pdf

Sliced Recursive Transformer
https://arxiv.org/pdf/2111.05297.pdf

Add arxiv paper

Transformer in Transformer https://arxiv.org/abs/2103.00112

add arxiv paper

Hi, thanks for your awesome repo!

please consider adding the new arxiv paper:

Uformer: A General U-Shaped Transformer for Image Restoration
arxiv: https://arxiv.org/abs/2106.03106

Add Augvit on NeurIPS 2021

Thanks for your awesome paper list! Our paper 'Augmented Shortcuts for Vision Transformers' has been accepted by NeurIPS 2021. Could you add it to the paper list? Thanks.

paper link: https://arxiv.org/abs/2106.15941

Add arxiv paper

TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection https://arxiv.org/abs/2104.07419

add Contextual Transformer

Contextual Transformer Networks for Visual Recognition
PDF: https://arxiv.org/pdf/2107.12292.pdf
Code: https://github.com/JDAI-CV/CoTNet

A kind reminder about the status of CrossFormer

Thank you for your great project, and we are glad that our paper CrossFormer is also listed.

While our paper is listed as an arXiv pre-print, it has been accepted by ICLR 2022: "CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention". You may wish to move it to the ICLR section.

Add Focal Transformer

Hi, thanks for such a great collection of awesome vision transformer works! Could you please add our Focal Transformers:

Paper: https://arxiv.org/pdf/2107.00641.pdf
Code: https://github.com/microsoft/Focal-Transformer

thanks!

Awesome work!

Thank you for sharing this collection of papers

I also made a paper collection list about vision attention and transformer:
https://github.com/cmhungsteve/Awesome-Transformer-Attention

Feel free to check and share it!

I would also appreciate it if you could add a link to my repo.
Thank you

add paper KVT: k-NN Attention for Boosting Vision Transformers

https://www.researchgate.net/publication/351905425_KVT_k-NN_Attention_for_Boosting_Vision_Transformers

code for Transformers Solve the Limited Receptive Field for Monocular Depth Prediction

[TransDepth] Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [paper] [code]

add CoAtNet

CoAtNet: Marrying Convolution and Attention for All Data Sizes
https://arxiv.org/pdf/2106.04803.pdf

Add arxiv paper

Thank you for great repo.

Please consider adding:
Unsupervised MRI Reconstruction via Zero-Shot Learned Adversarial Transformers (SLATER)

https://arxiv.org/pdf/2105.08059.pdf

add DearKD(CVPR2022)

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
paper: https://arxiv.org/abs/2204.12997

Thanks~

some papers to add

Escaping the Big Data Paradigm with Compact Transformers: https://arxiv.org/pdf/2104.05704.pdf
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers: https://arxiv.org/pdf/2106.10270.pdf
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks: https://arxiv.org/pdf/2105.02358.pdf

Please consider adding: MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens (https://arxiv.org/abs/2105.15168)

Thanks for your awesome repo.

Please consider adding: MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens (https://arxiv.org/abs/2105.15168).

add XCiT

XCiT: Cross-Covariance Image Transformers
PDF: https://arxiv.org/pdf/2106.09681.pdf
Code: https://github.com/facebookresearch/xcit

Add ICT

Hi, @dk-liang, please help add the papers below:

[ICT] High-Fidelity Pluralistic Image Completion with Transformers [paper], [code], ICCV 2021

[BEVT] BEVT: BERT Pretraining of Video Transformers [paper], [code], CVPR 2022

[PeCo] PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers [paper]

[MobileFormer] Mobile-Former: Bridging MobileNet and Transformer [paper], CVPR 2022

Update status of Container

Please update the status of the following paper:

[Container] Container: Context Aggregation Network

[Container] Container: Context Aggregation Network [paper][code] [NeurIPS 2021]
code: https://github.com/gaopengcuhk/Container

paper names

Hello! new paper!

Hello! Great work!

I would like to introduce a CVPR 2023 paper called QD-DETR (Query-Dependent Detection Transformer).
Paper: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
arXiv link: https://arxiv.org/abs/2303.13874
GitHub link: https://github.com/wjun0830/QD-DETR

Thank you.

Add HGOnet [WACV 2022]

Hi, @dk-liang, thanks for this great repository. Could you please consider adding HGOnet, which has been accepted in WACV 2022? Thanks in advance!

Image-Adaptive Hint Generation via Vision Transformer for Outpainting
paper: https://openaccess.thecvf.com/content/WACV2022/papers/Kong_Image-Adaptive_Hint_Generation_via_Vision_Transformer_for_Outpainting_WACV_2022_paper.pdf
code: https://github.com/kdh4672/hgonet

add code for shuffle transformer

Please add code for Shuffle Transformer. Thanks~
[paper] [code]

Add arxiv paper

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation https://arxiv.org/abs/2103.16024

add CATs

paper: https://arxiv.org/abs/2106.02520
github & code: https://github.com/SunghwanHong/CATs

add CCT

Add CoFormer

Hi, @dk-liang. Thanks for this great repository. Please add CoFormer.

Collaborative Transformers for Grounded Situation Recognition

Paper: https://arxiv.org/abs/2203.16518
Code: https://github.com/jhcho99/CoFormer

This paper has been accepted to CVPR 2022.

add SignBERT

Please add SignBERT: https://arxiv.org/abs/2110.05382
Thanks for your support~

Add BatchFormer

Hi @dk-liang, thanks for your awesome repository.
Could you add BatchFormer, which has been accepted to CVPR 2022?

arxiv: https://arxiv.org/abs/2203.01522
code: https://github.com/zhihou7/BatchFormer

In addition, a more general version, BatchFormerV2, has also been released at https://arxiv.org/abs/2204.01254, in which we design a new module and demonstrate consistent effectiveness on object detection, panoptic segmentation, and image classification.

Regards,

Dual-stream Network for Visual Recognition [paper][code] [NeurIPS 2021]
https://github.com/gaopengcuhk/DSNet
