TACJu / TransFG

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

License: MIT License

Python 100.00%

fine-grained-recognition


transfg's Issues

About Stanford dogs accuracy

Hi, could you release your training settings for the Stanford Dogs dataset? I set the lr to 3e-3 and left the other settings unchanged, but the model underfits: I only get 1.7% accuracy after 200k steps.

How to visualize the attention map in TransFG?

visualization code

Thanks for your wonderful work! I ran into some problems when trying to visualize the part attention patches as shown in your paper. Could you provide the visualization code? Thanks so much!

Failed to run on multi GPUs

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 train.py xxx

The command above does not run on multiple GPUs.

Strangely, all of the training runs on the first GPU. If the batch size is then increased, an OOM error is reported.

Does anyone know what's wrong?

About visualization

Hello, I appreciate your visualization work very much. Can you open source this part of the code?

dataset help

Can you provide the car dataset? How are the training and test sets divided? The official link is no longer working.

About train.py

I found that you call scheduler.step() before optimizer.step() on lines 267-268 of train.py. When I run it, I get this warning: "UserWarning: Seems like optimizer.step() has been overridden after learning rate scheduler initialization. Please, make sure to call optimizer.step() before lr_scheduler.step()." Is the order in your code correct?
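PyTorch's documentation notes that calling the scheduler before the optimizer skips the first value of the learning-rate schedule. The sketch below is a framework-free toy simulation of that effect, not the code in train.py. The helpers `warmup_cosine_lr` and `lrs_used`, and the values of `base_lr`, `warmup_steps`, and `total_steps`, are assumptions chosen for illustration.

```python
import math


def warmup_cosine_lr(step, base_lr=3e-2, warmup_steps=2, total_steps=6):
    """Hypothetical warmup + cosine schedule: the LR assigned to `step`."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def lrs_used(num_updates, scheduler_first):
    """Return the LR that each weight update sees under the given call order."""
    schedule_step = 0
    used = []
    for _ in range(num_updates):
        if scheduler_first:
            schedule_step += 1                             # scheduler.step() first
            used.append(warmup_cosine_lr(schedule_step))   # then optimizer.step()
        else:
            used.append(warmup_cosine_lr(schedule_step))   # optimizer.step() first
            schedule_step += 1                             # then scheduler.step()
    return used


recommended = lrs_used(4, scheduler_first=False)
reversed_order = lrs_used(4, scheduler_first=True)
print("optimizer first:", recommended)
print("scheduler first:", reversed_order)
```

In this toy model the reversed order shifts the whole schedule by one step, so the value assigned to step 0 is never used. Whether that matters in practice depends on the schedule; over thousands of steps the difference is probably small.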

On the problem of test.py

Can you provide a test.py file for prediction? I'm a novice and would really appreciate it.

memory error

Dear author, why do I always get a memory error when I start training? My machine has 16 GB of memory. Is there a problem in the dataset pipeline?

NAbirds dataset

Hello, can you provide the NABirds dataset? I can't download it from the official website. Thank you very much.

About running on one GPU

I have only one GPU. I set local_rank=-1 and os.environ['CUDA_VISIBLE_DEVICES']='0', but the code still fails to run. What do I need to change to run on a single GPU?

About apex

Hello, thanks for your nice work!

When I tried to reproduce your work, I encountered the error below (the message is repeated three times in the output):

ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)

Can you give me some ideas?

How can I train my dataset?

Hello, thanks for your nice work!
My dataset has 10 classes, and each class is in a separate folder. How can I train on my dataset?
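For a folder-per-class layout like the one described, a first step is to turn the directory tree into (image path, label) pairs. The sketch below uses only the standard library; `list_image_folder` is a hypothetical helper, not part of the repository, and the accepted file extensions are an assumption. Wiring the result into train.py would still require a dataset class and a new `--dataset` choice, which are not shown here.

```python
import os
import tempfile


def list_image_folder(root, extensions=(".jpg", ".jpeg", ".png")):
    """Map each sub-folder of `root` to an integer label and list its images."""
    classes = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith(extensions):  # skip non-image files
                samples.append((os.path.join(class_dir, fname), class_to_idx[name]))
    return samples, class_to_idx


# Small demo on a throwaway directory with two made-up classes.
with tempfile.TemporaryDirectory() as root:
    for cls, files in {"cat": ["a.jpg"], "dog": ["b.png", "notes.txt"]}.items():
        os.makedirs(os.path.join(root, cls))
        for f in files:
            open(os.path.join(root, cls, f), "w").close()
    samples, class_to_idx = list_image_folder(root)
    labels = [label for _, label in samples]

print(class_to_idx, labels)
```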

[minor error] The linear layer self.out of Attention in the modeling file

There may be something wrong with the first argument of the linear layer self.out (line 78: self.out = Linear(config.hidden_size, config.hidden_size)) in the Attention class of the modeling file. It should probably be changed to self.out = Linear(self.all_head_size, config.hidden_size), because in some cases config.hidden_size and self.all_head_size might not be equal.
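The mismatch described above can be checked with plain arithmetic. This sketch assumes the head size is obtained by truncating hidden_size / num_heads to an integer, as many ViT implementations do; `head_sizes` is a hypothetical helper and the numbers are illustrative.

```python
def head_sizes(hidden_size, num_heads):
    """Return (attention_head_size, all_head_size) under integer truncation."""
    attention_head_size = int(hidden_size / num_heads)
    all_head_size = num_heads * attention_head_size
    return attention_head_size, all_head_size


# Divisible case: all_head_size equals hidden_size, so either argument works.
print(head_sizes(768, 12))
# Non-divisible case: all_head_size (768) no longer equals hidden_size (770),
# so a Linear layer expecting hidden_size inputs would get the wrong width.
print(head_sizes(770, 12))
```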

Normalization parameters

Thank you for your excellent work! I am just wondering why all the datasets use the same normalization parameters (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) and they are the same as ImageNet?
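As a reminder of what these parameters do, the sketch below applies per-channel normalization, (value - mean) / std, to a single RGB pixel in the [0, 1] range. Reusing ImageNet statistics is common when fine-tuning from ImageNet-pretrained weights, since it keeps inputs on the scale the backbone saw during pretraining; whether that is the authors' reason is an assumption. `normalize_pixel` is a hypothetical helper.

```python
MEAN = (0.485, 0.456, 0.406)  # values quoted in the question above
STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Normalize one RGB pixel (channel values in [0, 1]) channel by channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


mean_pixel = normalize_pixel(MEAN)              # a pixel equal to the mean maps to zero
white_pixel = normalize_pixel((1.0, 1.0, 1.0))  # a white pixel maps to positive values
print(mean_pixel, white_pixel)
```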

Pip won't find requirements

I'm trying to set up the environment, but pip can't find the requirements.
In a virtual environment with python 3.7:
$ pip install -r requirements.txt
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement torch==1.5.1 (from versions: 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0)
ERROR: No matching distribution found for torch==1.5.1
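The "Defaulting to user installation" line may indicate that pip is running outside the virtual environment, and the list of available versions suggests the interpreter pip sees may be newer than 3.7. A quick way to check which interpreter is actually active is sketched below; `interpreter_report` is a hypothetical helper, and the `base_prefix` comparison only covers environments created with the standard `venv` mechanism.

```python
import sys


def interpreter_report():
    """Summarize the running interpreter: path, version, and venv status."""
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    return {
        "executable": sys.executable,
        "python": "%d.%d" % sys.version_info[:2],
        "in_virtualenv": in_venv,
    }


report = interpreter_report()
print(report)
```

If the reported version is not 3.7 or `in_virtualenv` is False, running pip through that interpreter (`python -m pip install -r requirements.txt`) from inside the activated environment is worth trying.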

About iNaturalist 2017 accuracy

I want to know how to set the batch size and lr for iNaturalist 2017. Does anyone know?

About CUB ACC

I cannot reproduce the reported results: with the overlap split, the accuracy on the CUB dataset is only 91.1%. Did I miss something important?

Trained Weights

Could you upload the weights for trained models? I'm personally looking for the weights of the trained nabirds model.
Thanks!

How to pretrain TransFG on my own dataset

I'm very interested in your outstanding work, and I also have a question:

I want to pretrain TransFG on my own dataset, so could you please provide code about pre-training? I'm looking forward to your reply.

Accuracy on the CAR dataset

Despite my best efforts, I can only reach up to 90% accuracy on the car dataset. Am I doing something wrong? Are the training parameters for the car dataset the same as for the CUB dataset?

patch embeddings always 0?

I was reading the paper and checking the code, and I can't see where you assign values to the patch embeddings. While debugging, I only see that you create a zero tensor and then simply add this tensor in forward.
At what point do you give the patch embeddings a value?

line 157 https://github.com/TACJu/TransFG/blob/master/models/modeling.py#L157
self.position_embeddings = nn.Parameter(torch.zeros(1, n_patches+1, config.hidden_size))

Line 173 embeddings = x + self.position_embeddings

About PSM module

I attempted to verify the functionality of the PSM module and noticed that incorporating only the PSM module into ViT didn't seem to enhance performance. Could you please let me know whether TransFG model weights trained on CUB are available?

Batch_size is 16 or 64?

Hi @TACJu, I notice that you use DDP with 4 GPUs in train.py. Therefore, if batch_size in args is set to 16, the overall batch size will be 16x4=64.
However, your paper says that the batch size is 16. I also tried a batch size of 16x4 on a Tesla V100, but it raises an OOM error. So does a batch size of 16 mean 16 or 64? Thanks!
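The arithmetic in the question can be written down explicitly. This sketch assumes the batch size passed on the command line is the micro-batch that each process sees in one forward/backward pass; how train.py actually interprets `--train_batch_size` should be checked in the code. `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(micro_batch_per_process, num_processes, grad_accum_steps=1):
    """Samples contributing to one optimizer update across all processes."""
    return micro_batch_per_process * num_processes * grad_accum_steps


# 16 per GPU on 4 GPUs, no accumulation: 64 samples per update.
print(effective_batch_size(16, 4))
# 4 per GPU on 4 GPUs with 4 accumulation steps: also 64, with less memory per GPU.
print(effective_batch_size(4, 4, grad_accum_steps=4))
```

Under this assumption, gradient accumulation is one way to keep the same effective batch size while lowering per-GPU memory use.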

About valid accuracy

I tried different datasets, but the accuracy was always a little over 0.2. Does anyone know how to fix that?

About the training details

First of all, thank you for your work, which has benefited me a lot.

After several attempts, I can only reach 91% accuracy on CUB. Can you provide the model parameters and training details that give 91.7% accuracy? Thank you very much in advance.

ImportError: cannot import name 'Di1stributedDataParallel' from 'apex.parallel'

The platform is Windows 10, and there is only one GPU (GeForce RTX 3080).
PyTorch version: 1.7.1
TensorBoard version: 1.15.0
apex version: 0.1
Every time I try to train with the command "train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run", I get the error "ImportError: cannot import name 'Di1stributedDataParallel' from 'apex.parallel'".
Does anyone know what causes this error? If you know a likely reason, please tell me. Thank you so much. God bless you!

About part_inx in part select module

TransFG/models/modeling.py

Line 266 in ff28b58

part_inx = part_inx + 1

Why does the part_inx variable need to be incremented by 1?
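One plausible reading, assuming the attention scores used for selection cover only the patch tokens (with the class token at position 0 sliced off), is that the +1 converts a patch index back into an index in the full token sequence. The toy example below illustrates that offset with plain lists; `select_part_token`, the token names, and the scores are all made up for illustration.

```python
def select_part_token(tokens, cls_to_patch_attention):
    """tokens[0] is the class token; the scores cover tokens[1:] only."""
    # Index of the highest score among the patch tokens.
    patch_idx = max(
        range(len(cls_to_patch_attention)),
        key=cls_to_patch_attention.__getitem__,
    )
    # Shift by one to skip the class token in the full sequence.
    token_idx = patch_idx + 1
    return patch_idx, token_idx, tokens[token_idx]


tokens = ["CLS", "patch0", "patch1", "patch2"]
scores = [0.1, 0.7, 0.2]  # attention from the class token to each patch
patch_idx, token_idx, selected = select_part_token(tokens, scores)
print(patch_idx, token_idx, selected)
```

Without the offset, `tokens[patch_idx]` would pick "patch0" instead of the highest-scoring "patch1".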

About train.py

Every time I start training, I get an error like this:

usage: train.py [-h] --name NAME [--dataset {CUB_200_2011,car,dog,nabirds}] [--data_root DATA_ROOT] [--model_type {ViT-B_16,ViT-B_32,ViT-L_16,ViT-L_32,ViT-H_14}] [--pretrained_dir PRETRAINED_DIR] [--pretrained_model PRETRAINED_MODEL] [--output_dir OUTPUT_DIR] [--img_size IMG_SIZE] [--train_batch_size TRAIN_BATCH_SIZE] [--eval_batch_size EVAL_BATCH_SIZE] [--eval_every EVAL_EVERY] [--learning_rate LEARNING_RATE] [--weight_decay WEIGHT_DECAY] [--num_steps NUM_STEPS] [--decay_type {cosine,linear}] [--warmup_steps WARMUP_STEPS] [--max_grad_norm MAX_GRAD_NORM] [--local_rank LOCAL_RANK] [--seed SEED] [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS] [--fp16] [--fp16_opt_level FP16_OPT_LEVEL] [--loss_scale LOSS_SCALE] [--smoothing_value SMOOTHING_VALUE] [--split SPLIT] [--slide_step SLIDE_STEP] train.py: error: the following arguments are required: --name

I don't know how to solve it. If you know, please tell me. Thank you!
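The last line of the usage message says that `--name` is a required argument, so the script exits before training starts when it is omitted. The sketch below reproduces that behavior with a reduced, hypothetical parser; only `--name` and `--dataset` are modeled, and the default for `--dataset` is an assumption.

```python
import argparse


def build_parser():
    """Reduced stand-in for the train.py parser (assumed, not the real one)."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--name", required=True, help="Name of this run.")
    parser.add_argument(
        "--dataset",
        default="CUB_200_2011",
        choices=["CUB_200_2011", "car", "dog", "nabirds"],
    )
    return parser


# Supplying --name parses cleanly.
args = build_parser().parse_args(["--name", "sample_run"])
print(args.name, args.dataset)

# Omitting --name makes argparse print the usage text and exit.
try:
    build_parser().parse_args([])
    missing_name_rejected = False
except SystemExit:
    missing_name_rejected = True
print(missing_name_rejected)
```

Other commands on this page pass `--name sample_run`; adding a run name in the same way should get past this error.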

About training details

Hi, thanks for your great work. I want to know how many epochs/steps you trained for on those benchmarks. Thanks again!

about Part Selection Module

Thanks for your great work!
I have a question about selecting tokens with maximum activation in Part Selection Module.
In Eq.6, is a_l^i the attention-score calculated separately for the class token and other N tokens? So the dimension of a_l^i is N right?

Would you like to open source the implementation based on [DeiT] pretrained on ImageNet-1K with distillation fine-tuning?

The project page used to say, "Implementation based on DeiT pretrained on ImageNet-1K with distillation fine-tuning will be released soon". It would be great if you still plan to open source the implementation based on [DeiT] pretrained on ImageNet-1K. Thank you!
I am looking forward to your reply.

About CUB-200-2011's accuracy

Thanks for your work and for sharing your code! However, when I ran your code on 4 Tesla V100 GPUs, following the instructions exactly with the non-overlap split, I only got 90.8% accuracy. Could you help me find the cause?

How to fix the RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx`

Thanks for your work and sharing your codes!

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --master_port 89898 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run

When I train on two GPUs (2 x 1080 Ti), the error below occurs.
The configuration is CUDA 11.1, PyTorch 1.8.1, torchvision 0.9.1, Python 3.8.3.

Warning:  multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback.  Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")
Warning:  apex was installed without --cpp_ext.  Falling back to Python flatten and unflatten.
Training (X / X Steps) (loss=X.X):   0%|| 0/749 [00:00<?, ?it/s]Warning:  apex was installed without --cpp_ext.  Falling back to Python flatten and unflatten.
Training (X / X Steps) (loss=X.X):   0%|| 0/749 [00:42<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 400, in <module>
    main()
  File "train.py", line 397, in main
    train(args, model)
  File "train.py", line 226, in train
    loss, logits = model(x, y)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/apex-0.1-py3.8.egg/apex/parallel/distributed.py", line 560, in forward
    result = self.module(*inputs, **kwargs)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/apex-0.1-py3.8.egg/apex/amp/_initialize.py", line 196, in new_fwd
    output = old_fwd(*applier(args, input_caster),
  File "/home/lirunze/xh/project/git/trans-fg_-i2-t/models/modeling.py", line 305, in forward
    part_logits = self.part_head(part_tokens[:, 0])
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/lirunze/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DFALT_TENSOR_OP)`

Could you help me analyze this problem? Thank you!

train from scratch

Has anyone tried to train from scratch, without any pretrained weights?
I want to adapt the model to my project, with 224 * 224 input, 8 * 8 patches, and a sliding step of 6, which means there are no pretrained weights for me. I found it very hard to converge: after 10000 steps the train acc is still around 0.6.
Then I tried the original training configuration, except for loading the pretrained weights, and saw the same issue.
Did I miss anything?
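For context on the configuration described, the number of tokens can be estimated with the usual output-size formula for an unpadded strided window. This is a sketch under that assumption; `num_patches` is a hypothetical helper, and the exact formula used in the repository should be checked in the modeling file.

```python
def num_patches(img_size, patch_size, slide_step):
    """Patch count for a square image with an unpadded sliding window."""
    per_side = (img_size - patch_size) // slide_step + 1
    return per_side * per_side


proposed = num_patches(224, 8, 6)       # 8x8 patches, sliding step 6
standard = num_patches(224, 16, 16)     # non-overlapping 16x16 patches
print(proposed, standard)
```

Under this assumption the proposed setup yields roughly seven times as many tokens as non-overlapping 16 x 16 patches at the same resolution. Since self-attention cost grows quadratically with sequence length, that is a much heavier model to train, in addition to the lack of pretrained weights.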