peiqinzhuang / api-net

Learning Attentive Pairwise Interaction for Fine-Grained Classification, AAAI-2020

Python 100.00%

fine-grained image-classification pairwise-learning

api-net's Introduction

Learning Attentive Pairwise Interaction for Fine-Grained Classification (API-Net)

Peiqin Zhuang, Yali Wang, Yu Qiao

Introduction:

In order to effectively identify contrastive clues among highly-confused categories, we propose a simple but effective Attentive Pairwise Interaction Network (API-Net), which can progressively recognize a pair of fine-grained images by interaction. We aim at learning a mutual vector first to capture semantic differences in the input pair, and then comparing this mutual vector with individual vectors to highlight their semantic differences respectively. Besides, we also introduce a score-ranking regularization to promote the priorities of these features. For more details, please refer to our paper.
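The interaction described above can be sketched roughly as follows. This is only one plausible reading of the paragraph, not the repository's implementation: the class name `PairwiseInteractionSketch`, the helper `score_ranking_loss`, the layer sizes, the sigmoid gating, and the margin value are all assumptions made for illustration, and the code assumes Python 3 with a recent PyTorch rather than the versions listed under Dependencies.

```python
import torch
import torch.nn as nn


class PairwiseInteractionSketch(nn.Module):
    """Hypothetical sketch of mutual-vector gating for a pair of feature vectors."""

    def __init__(self, feat_dim=16, n_classes=5):
        super().__init__()
        # Assumed: a small MLP maps the concatenated pair to a mutual vector.
        self.mutual = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x1, x2):
        # Mutual vector summarising the pair.
        x_m = self.mutual(torch.cat([x1, x2], dim=1))
        # Compare the mutual vector with each individual vector to get gates.
        g1 = torch.sigmoid(x_m * x1)
        g2 = torch.sigmoid(x_m * x2)
        # Each image is attended by its own gate and by its partner's gate.
        x1_self, x1_other = x1 + x1 * g1, x1 + x1 * g2
        x2_self, x2_other = x2 + x2 * g2, x2 + x2 * g1
        feats = torch.stack([x1_self, x1_other, x2_self, x2_other], dim=0)
        return self.classifier(feats)  # shape: (4, batch, n_classes)


def score_ranking_loss(logits_self, logits_other, labels, margin=0.05):
    """Hinge penalty when the 'other'-gated score is not below the 'self' score."""
    idx = labels.view(-1, 1)
    p_self = torch.softmax(logits_self, dim=1).gather(1, idx).squeeze(1)
    p_other = torch.softmax(logits_other, dim=1).gather(1, idx).squeeze(1)
    return torch.clamp(p_other - p_self + margin, min=0).mean()


torch.manual_seed(0)
model = PairwiseInteractionSketch(feat_dim=16, n_classes=5)
x1, x2 = torch.randn(3, 16), torch.randn(3, 16)  # stand-ins for pooled features
logits = model(x1, x2)
labels = torch.tensor([0, 1, 2])
loss = score_ranking_loss(logits[0], logits[1], labels)
print(logits.shape, loss.item())
```

In this sketch the ranking term only prefers that the self-gated feature scores the true class higher than the partner-gated one; how the actual code weights it against the classification loss is not stated here.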

Framework:

Dependencies:

Python 2.7
PyTorch 0.4.1
torchvision 0.2.0

How to use:

# python train.py

Citing:

Please kindly cite the following paper, if you find this code helpful in your work.

@inproceedings{zhuang2020learning,
  title={Learning Attentive Pairwise Interaction for Fine-Grained Classification},
  author={Zhuang, Peiqin and Wang, Yali and Qiao, Yu},
  booktitle={AAAI},
  pages={13130--13137},
  year={2020}
}

Contact:

Please feel free to contact [email protected] or {yl.wang, yu.qiao}@siat.ac.cn, if you have any questions.

Acknowledgement:

Some of the code is borrowed from siamese-triplet and triplet-reid-pytorch. Many thanks to their authors.

api-net's Issues

Error: RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 4D

Hello,

Due to computational limitations, I reduced n_classes to 5 and n_samples to 2. During training, the function pdist() in models.py then raises the error above: RuntimeError: t() expects a tensor with <= 2 dimensions, but self is 4D. The cause seems to be this line:
pool_out = self.avg(conv_out).squeeze()
It returns pool_out as a 4D tensor.

Can you please check this?

Thanks,
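The shape problem reported above can be reproduced in isolation. The snippet below is a minimal sketch, not the repository's code: the feature-map size, the pooling kernel size of 14, and the body of `pdist` are assumptions chosen only to show how `squeeze()` can leave a 4D tensor, and it assumes a recent PyTorch. `squeeze()` removes only axes of size 1, so if the pooled spatial map is larger than 1x1 (for example because the input resolution differs from what the pooling window expects), the tensor stays 4D and a later `t()` call fails.

```python
import torch
import torch.nn as nn


def pdist(vectors):
    # Assumed form: squared Euclidean distances between rows of a 2D tensor.
    sq = vectors.pow(2).sum(dim=1)
    return sq.view(-1, 1) + sq.view(1, -1) - 2 * vectors.mm(vectors.t())


# Hypothetical backbone output: batch of 10, 8 channels, 16x16 spatial map.
conv_out = torch.randn(10, 8, 16, 16)

# A fixed pooling window only collapses the map when the sizes match exactly.
fixed_pool = nn.AvgPool2d(kernel_size=14, stride=1)
bad = fixed_pool(conv_out).squeeze()
print(bad.dim())  # 4: the 3x3 spatial map has no size-1 axes to squeeze

# Adaptive pooling plus an explicit reshape always yields (batch, channels).
adaptive_pool = nn.AdaptiveAvgPool2d(1)
pool_out = adaptive_pool(conv_out).view(conv_out.size(0), -1)
dist = pdist(pool_out)
print(pool_out.shape, dist.shape)
```

Reshaping with an explicit batch dimension also avoids a second pitfall of a bare `squeeze()`: with a batch of one image it would drop the batch axis as well.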

Reproducing results on CUB-200-2011

Hello, I have been trying to reproduce the benchmark results reported in your paper, specifically those where you train a ResNet-101 backbone with API-Net on top on the CUB-200-2011 dataset.

My first attempt was on a 16 GB GPU, so I could only set --n_classes 9 and --n_samples 3. I reached 87.0% validation accuracy, whereas the paper reports 88.6%. The obvious explanation was that reducing the number of classes and samples per batch reduces API-Net's performance, which makes sense.
So I ran further experiments on a larger GPU, where I could use more classes and samples, but this did not give the expected improvement.

With --n_classes 25 and --n_samples 3, I only got 85.9% validation accuracy. (Note that this is even lower than my first 9x3 experiment!) I also tried Adam instead of SGD, but saw no improvement.
I then tried --n_classes 19 and --n_samples 4, and got similar results.
I noticed that the training curves flatten early: by around epoch 30 the training accuracy is already 100%, and the loss and validation accuracy hardly change afterwards.

Any ideas about what could go wrong or how to improve my results would be very welcome, thanks!

About the convergence of the softmax loss

Hi, I have tried your code on my own dataset, but the softmax loss converges slowly compared to other methods. Do you know the reason? Thank you.

Two-class classification does not work

I am training a two-class classifier. Training works, but validation does not.

Can you share the details of your device?

What GPU do you use, and how much memory does this project need?

Test configs and parameter settings

Congratulations on the strong results in your paper. Could you release a version that includes test code and settings, so that the reported improvement can be reproduced?

About val.txt

Hi,

I have recently been doing some research on fine-grained classification, and I noticed that there is a 'val.txt' in your code. Is this 'val.txt' actually the test set of datasets like CUB? Should I split the training set into train and validation sets when running experiments on datasets like CUB_200_2011? I could not find any description of a validation set in other existing papers either.

Thank you.

Kind regards.
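If a separate validation set is wanted, one common option is a per-class hold-out from the training list. The following is a minimal sketch under stated assumptions: the `(path, label)` sample format, the function name `stratified_split`, and the 20% fraction are hypothetical and do not describe the format of 'val.txt' or the protocol used in the paper.

```python
import random
from collections import defaultdict


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Hold out a fraction of each class for validation.

    samples: list of (path, label) pairs (a hypothetical format).
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val = [], []
    for label in sorted(by_label):
        paths = list(by_label[label])
        rng.shuffle(paths)
        # Keep at least one validation image per class.
        n_val = max(1, int(round(len(paths) * val_fraction)))
        val.extend((p, label) for p in paths[:n_val])
        train.extend((p, label) for p in paths[n_val:])
    return train, val


# Toy data: 3 classes with 10 images each (made-up file names).
samples = [("img_%d_%d.jpg" % (c, i), c) for c in range(3) for i in range(10)]
train_split, val_split = stratified_split(samples, val_fraction=0.2)
print(len(train_split), len(val_split))
```

Splitting per class keeps every category represented in both subsets, which matters when each class has only a few dozen images.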

Can your model be used for image matching?

Hello, I have read your paper and your code. Thank you for the great work.
I want to use your model for an image matching task. My ideal model takes 2 images as input and outputs their match score, just as in your training phase.
After reviewing your code and paper, I found that at test time the model takes a single image, and the output score comes from <Pooling feat + FC>.
My question is: if I feed an image pair (2 images) to your model to get the final score, would the model work well on a fine-grained dataset?
Thank you.

How much GPU memory do you need?

With a class size of 30 and 4 sample images per class, the batch size is 120. That is very large.
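The batch size in the question is simply the number of classes per batch times the number of images per class. A minimal sketch of such class-balanced batching, using only the standard library, is shown below; the function name `balanced_batch` and the toy label list are hypothetical and are not taken from the repository.

```python
import random
from collections import defaultdict


def balanced_batch(labels, n_classes, n_samples, rng):
    """Pick n_classes classes, then n_samples image indices from each."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough images can contribute a full group.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= n_samples]
    chosen = rng.sample(eligible, n_classes)

    batch = []
    for c in chosen:
        batch.extend(rng.sample(by_class[c], n_samples))
    return batch


# Toy label list: 200 classes with 10 images each (made-up data).
labels = [i // 10 for i in range(2000)]
batch = balanced_batch(labels, n_classes=30, n_samples=4, rng=random.Random(0))
print(len(batch))  # 120 = 30 classes x 4 images
```

Because activation memory grows roughly linearly with the number of images in a batch, lowering either value shrinks the footprint proportionally: 9 classes with 3 images each gives a batch of 27.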