
deep-image-retrieval's Introduction

Deep Image Retrieval

This repository contains the models and the evaluation scripts (in Python 3 and PyTorch 1.0+) of the papers:

[1] End-to-end Learning of Deep Visual Representations for Image Retrieval Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus, IJCV 2017 [PDF]

[2] Learning with Average Precision: Training Image Retrieval with a Listwise Loss Jerome Revaud, Jon Almazan, Rafael S. Rezende, Cesar de Souza, ICCV 2019 [PDF]

Both papers tackle the problem of image retrieval and explore different ways to learn deep visual representations for this task. In both cases, a CNN extracts a feature map that a global-aggregation layer* condenses into a compact, fixed-length representation. This representation is then projected with a fully-connected (FC) layer and L2-normalized, so that images can be compared efficiently with a dot product.

[Figure: dir_network (overview of the retrieval architecture)]
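As a rough illustration of this pipeline, here is a minimal PyTorch sketch (for illustration only, not the repository's actual network class; the ResNet-50 backbone, plain average pooling, and 2048-d projection are placeholder choices):

import torch
import torch.nn.functional as F
import torchvision.models as models

# Minimal sketch: CNN backbone -> global aggregation -> FC projection -> L2 norm.
# (Illustration only; not the network class used in this repository.)
backbone = torch.nn.Sequential(*list(models.resnet50().children())[:-2])
proj = torch.nn.Linear(2048, 2048)  # placeholder output dimension

def describe(images):                       # images: (B, 3, H, W)
    fmap = backbone(images)                 # (B, 2048, h, w) feature map
    desc = fmap.mean(dim=(2, 3))            # global aggregation (plain average pooling here)
    desc = proj(desc)                       # FC projection
    return F.normalize(desc, p=2, dim=1)    # L2 normalization

# With L2-normalized descriptors, comparing images reduces to a dot product:
query_desc = describe(torch.randn(1, 3, 224, 224))
db_descs = describe(torch.randn(4, 3, 224, 224))
scores = query_desc @ db_descs.t()          # (1, 4) similarity scores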

All components in this network, including the aggregation layer, are differentiable, which makes it end-to-end trainable for the end task. In [1], a Siamese architecture that combines three streams with a triplet loss was proposed to train this network. In [2], this work was extended by replacing the triplet loss with a new loss that directly optimizes for Average Precision.

[Figure: losses]
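As a rough sketch of the triplet setup of [1] (not the actual training code; the margin and batch size below are arbitrary example values), PyTorch's built-in triplet margin loss can be applied to the three descriptor streams:

import torch
import torch.nn.functional as F

# Three streams of L2-normalized descriptors: query, relevant, non-relevant.
# (Arbitrary example values; not the hyper-parameters used in [1].)
q   = F.normalize(torch.randn(8, 2048, requires_grad=True), dim=1)
pos = F.normalize(torch.randn(8, 2048), dim=1)
neg = F.normalize(torch.randn(8, 2048), dim=1)

criterion = torch.nn.TripletMarginLoss(margin=0.1)  # example margin
loss = criterion(q, pos, neg)
loss.backward()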

* Originally, [1] used R-MAC pooling [3] as the global-aggregation layer. However, due to its efficiency and better performance, we have replaced the R-MAC pooling layer with the Generalized-mean (GeM) pooling layer proposed in [4]. You can find the original Caffe implementation of [1] by following this link.
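For reference, GeM pooling itself only takes a few lines; the sketch below follows the formulation of [4] (the implementation in this repository may differ in details):

import torch
import torch.nn.functional as F

class GeM(torch.nn.Module):
    """Generalized-mean pooling [4]: a learnable exponent p interpolates
    between average pooling (p = 1) and max pooling (p -> infinity)."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = torch.nn.Parameter(torch.tensor(p))
        self.eps = eps

    def forward(self, x):                         # x: (B, C, H, W)
        x = x.clamp(min=self.eps).pow(self.p)     # clamp avoids 0**p issues
        x = F.adaptive_avg_pool2d(x, 1).pow(1.0 / self.p)
        return x.flatten(1)                       # (B, C) descriptor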

News

  • (6/9/2019) AP loss, Tie-aware AP loss, Triplet Margin loss, and Triplet LogExp loss added for reference
  • (5/9/2019) Update evaluation and AP numbers for all the benchmarks
  • (22/7/2019) Paper Learning with Average Precision: Training Image Retrieval with a Listwise Loss accepted at ICCV 2019

Pre-requisites

In order to run this toolbox you will need:

  • Python3 (tested with Python 3.7.3)
  • PyTorch (tested with version 1.4)
  • The following packages: numpy, matplotlib, tqdm, scikit-learn

With conda you can run the following commands:

conda install numpy matplotlib tqdm scikit-learn
conda install pytorch torchvision -c pytorch

Installation

# Download the code
git clone https://github.com/naver/deep-image-retrieval.git

# Create env variables
cd deep-image-retrieval
export DIR_ROOT=$PWD
export DB_ROOT=/PATH/TO/YOUR/DATASETS
# for example: export DB_ROOT=$PWD/dirtorch/data/datasets

Evaluation

Pre-trained models

The table below contains the pre-trained models that we provide with this library, together with their mAP performance on some of the most well-known image retrieval benchmarks: Oxford5K, Paris6K, and their Revisited versions (ROxford5K and RParis6K).

Model Oxford5K Paris6K ROxford5K (med/hard) RParis6K (med/hard)
Resnet101-TL-MAC 85.6 90.1 63.3 / 35.7 76.6 / 55.5
Resnet101-TL-GeM 85.7 93.4 64.5 / 40.9 78.8 / 59.2
Resnet50-AP-GeM 87.7 91.9 65.5 / 41.0 77.6 / 57.1
Resnet101-AP-GeM 89.1 93.0 67.1 / 42.3 80.3 / 60.9
Resnet101-AP-GeM-LM18** 88.1 93.1 66.3 / 42.5 80.2 / 60.8

The name of each model encodes the backbone architecture and the loss used to train it (TL for triplet loss, AP for Average Precision loss). All models use Generalized-mean (GeM) pooling [4] as the global pooling mechanism, except the model in the first row, which uses MAC [3] (i.e. max pooling), and all have been trained on the Landmarks-clean dataset [1] (the clean version of the Landmarks dataset) by fine-tuning directly from ImageNet. These numbers were obtained using a single resolution and applying whitening to the output features (the whitening itself is also learned on Landmarks-clean). For a detailed explanation of all the hyper-parameters, see [1] and [2] for the triplet-loss and AP-loss models, respectively.

** For the sake of completeness, we have added an extra model, Resnet101-AP-GeM-LM18, which has been trained on the Google-Landmarks Dataset, a large dataset consisting of more than 1M images and 15K classes.
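To picture what the whitening step does, here is a generic PCA-whitening sketch with a power parameter. This is an illustration under assumptions: the whitening learned in this repository may use a different exact formula, and train_descriptors / descriptors are hypothetical arrays of L2-normalized descriptors.

import numpy as np
from sklearn.decomposition import PCA

def fit_whitening(train_descriptors):
    # Learn a PCA rotation on descriptors from a training set (e.g. Landmarks-clean).
    pca = PCA(n_components=train_descriptors.shape[1])
    pca.fit(train_descriptors)
    return pca

def apply_whitening(pca, descriptors, whitenp=0.25):
    # Rotate, scale each component by its variance raised to -whitenp,
    # then re-L2-normalize so that dot products remain meaningful.
    x = (descriptors - pca.mean_) @ pca.components_.T
    x = x / (pca.explained_variance_ ** whitenp)
    return x / np.linalg.norm(x, axis=1, keepdims=True)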

Reproducing the results

The script test_dir.py can be used to evaluate the pre-trained models provided and to reproduce the results above:

python -m dirtorch.test_dir --dataset DATASET --checkpoint PATH_TO_MODEL \
		[--whiten DATASET] [--whitenp POWER] [--aqe ALPHA-QEXP] \
		[--trfs TRANSFORMS] [--gpu ID] [...]
  • --dataset: selects the dataset (e.g. Oxford5K, Paris6K, ROxford5K, RParis6K) [required]
  • --checkpoint: path to the model weights [required]
  • --whiten: applies whitening to the output features [default 'Landmarks_clean']
  • --whitenp: whitening power [default: 0.25]
  • --aqe: alpha-query expansion parameters [default: None]
  • --trfs: input image transformations (can be used to apply multi-scale) [default: None]
  • --gpu: selects the GPU ID (-1 selects the CPU)

For example, to reproduce the results of the Resnet101-AP-GeM model on the RParis6K dataset, download the model Resnet-101-AP-GeM.pt from here and run:

cd $DIR_ROOT
export DB_ROOT=/PATH/TO/YOUR/DATASETS

python -m dirtorch.test_dir --dataset RParis6K \
		--checkpoint dirtorch/data/Resnet-101-AP-GeM.pt \
		--whiten Landmarks_clean --whitenp 0.25 --gpu 0

And you should see the following output:

>> Evaluation...
 * mAP-easy = 0.907568
 * mAP-medium = 0.803098
 * mAP-hard = 0.608556

Note: this script integrates an automatic downloader for the Oxford5K, Paris6K, ROxford5K, and RParis6K datasets (kudos to Filip Radenovic ;)). The datasets will be saved in $DB_ROOT.

Feature extractor

You can also use the pre-trained models to extract features from your own dataset or collection of images. For that, we provide the script extract_features.py:

python -m dirtorch.extract_features --dataset DATASET --checkpoint PATH_TO_MODEL \
		--output PATH_TO_FILE [--whiten DATASET] [--whitenp POWER] \
		[--trfs TRANSFORMS] [--gpu ID] [...]

where --output is used to specify the destination where the features will be saved. The rest of the parameters are the same as seen above.

For example, this is how the script can be used to extract a feature representation for each of the images in the RParis6K dataset using the Resnet-101-AP-GeM.pt model and store them in rparis6k_features.npy:

cd $DIR_ROOT
export DB_ROOT=/PATH/TO/YOUR/DATASETS

python -m dirtorch.extract_features --dataset RParis6K \
		--checkpoint dirtorch/data/Resnet-101-AP-GeM.pt \
		--output rparis6k_features.npy \
		--whiten Landmarks_clean --whitenp 0.25 --gpu 0
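
Once extracted, the descriptors can be used directly for retrieval. A minimal sketch is shown below; it assumes the output file holds a 2-D array of L2-normalized descriptors (one row per image), so check the actual structure saved by your version of the script:

import numpy as np

feats = np.load('rparis6k_features.npy', allow_pickle=True)
feats = np.asarray(feats, dtype=np.float32)   # (num_images, dim), assumed L2-normalized

query = feats[0]                              # use the first image as a query
scores = feats @ query                        # dot products = cosine similarities
ranking = np.argsort(-scores)                 # most similar images first
print(ranking[:10])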

The library also provides a generic dataset class (ImageList) that allows you to specify the list of images with a simple text file.

--dataset 'ImageList("PATH_TO_TEXTFILE" [, "IMAGES_ROOT"])'

Each row of the text file should contain a single path to a given image:

/PATH/TO/YOUR/DATASET/images/image1.jpg
/PATH/TO/YOUR/DATASET/images/image2.jpg
/PATH/TO/YOUR/DATASET/images/image3.jpg
/PATH/TO/YOUR/DATASET/images/image4.jpg
/PATH/TO/YOUR/DATASET/images/image5.jpg

Alternatively, you can use relative paths and set IMAGES_ROOT to specify the root folder.
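
Such a list file can be generated with a few lines of Python, for example (the directory below is a placeholder):

import glob

# Write one image path per line.
paths = sorted(glob.glob('/PATH/TO/YOUR/DATASET/images/*.jpg'))
with open('my_images.txt', 'w') as f:
    f.write('\n'.join(paths) + '\n')

The resulting file can then be passed as --dataset 'ImageList("my_images.txt")'.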

Feature extraction with kapture datasets

Kapture is a pivot file format, based on text and binary files, used to describe SfM (Structure-from-Motion) data and, more generally, sensor-acquired data.

It is available at https://github.com/naver/kapture, together with conversion tools for popular formats. Several popular datasets are also directly available in kapture format.

It can be installed with:

pip install kapture

Datasets can be downloaded with:

kapture_download_dataset.py update
kapture_download_dataset.py list
# e.g.: install mapping and query of Extended-CMU-Seasons_slice22
kapture_download_dataset.py install "Extended-CMU-Seasons_slice22_*"

If you want to convert your own dataset into kapture, please find some examples here.

Once installed, you can extract global features for your kapture dataset with:

cd $DIR_ROOT
python -m dirtorch.extract_kapture --kapture-root pathto/yourkapturedataset --checkpoint dirtorch/data/Resnet101-AP-GeM-LM18.pt --gpu 0

Run python -m dirtorch.extract_kapture --help for more information on the extraction parameters.

Citations

Please consider citing the following papers in your publications if this helps your research.

@article{GARL17,
 title = {End-to-end Learning of Deep Visual Representations for Image Retrieval},
 author = {Gordo, A. and Almazan, J. and Revaud, J. and Larlus, D.},
 journal = {IJCV},
 year = {2017}
}

@inproceedings{RARS19,
 title = {Learning with Average Precision: Training Image Retrieval with a Listwise Loss},
 author = {Revaud, J. and Almazan, J. and Rezende, R.S. and de Souza, C.R.},
 booktitle = {ICCV},
 year = {2019}
}

Contributors

This library has been developed by Jerome Revaud, Rafael de Rezende, Cesar de Souza, Diane Larlus, and Jon Almazan at Naver Labs Europe.

Special thanks to Filip Radenovic. In this library, we have used the ROxford5K and RParis6K downloader from his awesome CNN-imageretrieval repository. Consider checking it out if you want to train your own models for image retrieval!

References

[1] Gordo, A., Almazan, J., Revaud, J., Larlus, D., End-to-end Learning of Deep Visual Representations for Image Retrieval. IJCV 2017

[2] Revaud, J., Almazan, J., Rezende, R.S., de Souza, C., Learning with Average Precision: Training Image Retrieval with a Listwise Loss. ICCV 2019

[3] Tolias, G., Sicre, R., Jegou, H., Particular object retrieval with integral max-pooling of CNN activations. ICLR 2016

[4] Radenovic, F., Tolias, G., Chum, O., Fine-tuning CNN Image Retrieval with No Human Annotation. TPAMI 2018


deep-image-retrieval's Issues

Would you like to provide the Landmarks-clean dataset?

@almazan Hi, Almazan,

I'm very interested in this series of deep image retrieval work. I tried to reimplement it and downloaded the Landmarks dataset from the link, but some of the image links are broken, so I failed to reproduce the results last year because I did not have Landmarks-clean. Would you be willing to provide the Landmarks-clean dataset? I would like to train on Landmarks-clean again so that I can check the mAP performance on the public image retrieval benchmarks. Looking forward to your reply. Thanks.

Could you explain the flip in extract_image_features

Hello. Thank you very much for releasing the scripts and sharing your great work. The performance of your work is very impressive.

While following the usage of your work, I stumbled upon the flip setting in extract_image_features (Line 47)...

Could you please explain what the flip setting in extract_image_features does? Does it flip the image left-right? Also, it would be nice if you could explain how to set this variable.
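
(For illustration: a common meaning of such a flag in retrieval pipelines, although not necessarily what this repository does, is to also describe the horizontally flipped image and average the two descriptors, as in the sketch below.)

import torch
import torch.nn.functional as F

# Generic sketch of test-time flip augmentation (an assumption about what the
# flag might do, not code from this repository): describe both the image and
# its horizontal flip, then average and re-normalize.
def describe_with_flip(net, image):            # image: (1, 3, H, W)
    d1 = net(image)
    d2 = net(torch.flip(image, dims=[3]))      # flip along the width axis
    return F.normalize(d1 + d2, p=2, dim=1)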

How to select the image triplets during training?

Hi, I would like to know how the image triplets are selected while training the network, i.e., the query image, a relevant image, and a non-relevant image. Are they randomly sampled from the dataset?

Thank you in advance!

my own dataset

Hello! I am a novice and I want to ask: can I use my own dataset for feature extraction and querying? Thanks!

Error: unknown dataset datasets/test.txt

Error: unknown dataset datasets/test.txt
Available datasets: Dataset, ImageClusters, ImageList, ImageListLabels, ImageListLabelsQ, ImageListROIs, ImageListRelevants, ImagesAndLabels, LabelledDataset, Landmarks18, Landmarks18_5K, Landmarks18_index, Landmarks18_lite, Landmarks18_mid, Landmarks18_missing_index, Landmarks18_new_index, Landmarks18_pca, Landmarks18_test, Landmarks18_train, Landmarks18_val, Landmarks18_valdstr, Landmarks_clean, Landmarks_clean_val, Landmarks_lite, NullCluster, Oxford5K, Paris6K, ROxford5K, RParis6K

executive command:
python -m dirtorch.extract_features --dataset 'ImageList("test.txt", ["/data01/gtr/studyPytorch/ytst/deep-image-retrieval/datasets"])' --checkpoint /data01/gtr/studyPytorch/ytst/deep-image-retrieval/Resnet101-AP-GeM-LM18.pt --output rparis6k_features.npy --whiten Landmarks_clean --whitenp 0.25 --gpu -1

I used my own images to generate the feature file. Why is the data invalid?

Change the input type

The current input for dataset is an ImageList, which contains an image_path for each image. How can I change it so that I can pass an image that is already loaded in memory?
Example: img = cv2.imread(img_path), then set dataset = img.
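
(For illustration: a generic way to describe an image already loaded with OpenCV, bypassing ImageList. The normalization constants below are the usual ImageNet values and are assumptions, not the repository's own preprocessing.)

import cv2
import torch

# Generic sketch (not the repository's preprocessing): convert an OpenCV BGR
# image to a normalized tensor and run the network on it directly.
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

def describe_cv2_image(net, img_bgr):
    img = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    x = torch.from_numpy(img).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    x = (x - MEAN) / STD
    with torch.no_grad():
        return net(x)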

GeM pooling parameter

Hi @almazan ,

For your experiment trained on the Google Landmarks dataset 2018 (codenamed Resnet101-AP-GeM-LM18): could you share the value to which the GeM pooling parameter p converged?

If you could additionally share a learning curve showing the evolution of p over the training run, that would be even better :)

Thanks!

When implementing multi-staged backpropagation, backpropagating through the model's output does not work.

In dirtorch/nets, the following files are problematic when implementing multi-staged backpropagation:

  1. rmac_resnet_fpn.py
  2. rmac_resnet.py
  3. rmac_resnext.py

For example, at line 64 of 1) rmac_resnet.py, the original code is:

        x.squeeze_()

But this has a problem: the in-place operation breaks the computation of the descriptor's gradient w.r.t. the model's parameters.

So, I suggest the following code instead:

        x = x.squeeze()  # This is not an in-place operation.

As in the example above, files 2) and 3) have the same problem.

How to assign weights to AP?

As described in the paper, a weight is introduced into the calculation of mAP in order to counter-balance the dataset imbalance issue.
I guess the weight is simply inversely proportional to the number of samples of each class in a single batch, so I wrote code like this:

uni_labels, inv_ids, counts = torch.unique(img_labels, return_inverse=True, 
                                           return_counts=True)
weights = 1 / counts.float()
weights = weights[inv_ids]  # query weights
...
loss = APLoss(scores, labels, weights)

But training with this code only makes the loss explode, while everything works fine without the weights.
Could you give an example of how to assign the weights to the AP loss?
Thank you very much.

How to assign the ground truth for training?

Hello, thank you for releasing the scripts and sharing this awesome work.
Here I would like to ask about the training labels. I have looked at the multi-staged backpropagation algorithm in the supplementary material.

I wonder how you assign the binary ground truth for training. I know each entry of the binary ground-truth matrix indicates whether two images in a batch are similar.
But since there are a large number of training images in Landmarks-clean, I wonder how you generate this binary ground truth.

Your information will greatly help me verify my implementation with yours.
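
(For illustration: one straightforward way to build such a binary ground-truth matrix from per-image class labels within a batch. This is an assumption about the construction, not necessarily how the authors generate it.)

import torch

# labels: one integer class id (e.g. landmark id) per image in the batch.
labels = torch.tensor([3, 7, 3, 1, 7, 7])

# gt[i, j] = 1 if images i and j share the same label, else 0.
gt = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
print(gt)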

Pre-trained model

Hello, dear authors:
The download link for the pre-trained models you provided is invalid. Where can I find them?
I am looking forward to your reply. Thanks.

paper please?

Hi Mr. Almazán,

Today I watched your DEVIEW 2018 video talk about "image retrieval with fashion aesthetics". I'm very interested in this method, but after searching for a while I couldn't find any paper with that title. Could you provide the exact paper name or a link to what you mentioned in the talk?

Best regards,
Justin

AttributeError: 'DataParallel' object has no attribute 'pca'

When I run the command:
python -m dirtorch.test_dir --dataset Oxford5K --checkpoint Resnet-101-AP-GeM.pt/Resnet-101-AP-GeM.pt --whiten Landmarks_clean --whitenp 0.25 --gpu 0

=> loading checkpoint 'Resnet-101-AP-GeM.pt/Resnet-101-AP-GeM.pt' (current_iter 296)
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/image_retrieval/Deep_Image_Retrieval/deep-image-retrieval/dirtorch/test_dir.py", line 299, in <module>
    net = load_model(args.checkpoint, args.iscuda)
  File "/home/image_retrieval/Deep_Image_Retrieval/deep-image-retrieval/dirtorch/test_dir.py", line 241, in load_model
    net.pca = checkpoint.get('pca', net.pca)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 537, in __getattr__
    type(self).__name__, name))
AttributeError: 'DataParallel' object has no attribute 'pca'

ResNext Code is incorrect.

Why is the ResNeXt code identical to the ResNet code?

You need to look at the files rmac_resnet.py and rmac_resnext.py in nets.

rmac_resnext.py is just a copy of rmac_resnet.py.

Please check and fix the code.

Have you trained resnet50 on the Google-Landmarks dataset?

Hello. Thank you very much for releasing the scripts and sharing your great work. The performance of your work is very impressive.

I have two questions:

1. We evaluated Resnet101-AP-GeM-LM18; it performed well on Pitts250k and Tokyo 24/7, better than the models trained on Landmarks-clean. Our embedded devices have limited compute and memory, so we would prefer a ResNet-50 backbone, but we cannot find a Resnet50-AP-GeM-LM18 model.

Have you trained a resnet50-AP model on the Google-Landmarks dataset? If so, could you release it? This would help us a lot.

2. Besides, we found that AP-GeM does not perform so well on indoor datasets such as InLoc. Do you have any suggestions for that? We're looking forward to your reply.

senet missing

Hi Mr. Almazán,
It seems that senet.py is missing when I try to run the code.
==================================== ERRORS ====================================
_________________________ ERROR collecting test_dir.py _________________________
ImportError while importing test module '/home/sherry/Documents/github_code/deep-image-retrieval-master/dirtorch/test_dir.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../../anaconda3/lib/python3.7/site-packages/_pytest/python.py:498: in _importtestmodule
    mod = self.fspath.pyimport(ensuresyspath=importmode)
../../../../anaconda3/lib/python3.7/site-packages/py/_path/local.py:701: in pyimport
    __import__(modname)
../../../../anaconda3/lib/python3.7/site-packages/_pytest/assertion/rewrite.py:149: in exec_module
    exec(co, module.__dict__)
test_dir.py:16: in <module>
    import dirtorch.nets as nets
nets/__init__.py:94: in <module>
    from .rmac_senet import senet154_rmac, se_resnet50_rmac, se_resnet101_rmac, se_resnet152_rmac, se_resnext50_32x4d_rmac, se_resnext101_32x4d_rmac
nets/rmac_senet.py:2: in <module>
    from .backbones.senet import *

E   ModuleNotFoundError: No module named 'dirtorch.nets.backbones.senet'

!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!
=========================== 1 error in 0.42 seconds ============================

Input image resolution

Hi there, thanks for releasing the code.

For an ongoing place recognition project, we are using datasets where image resolution ranges from 224x224 to 1920x1080.

As mentioned in the ICCV 2019 paper, "during test, we feed the original images", do you recommend any particular image resolution (800x800?) for the method to perform well on datasets other than those used in the paper?

Triplet Training procedure?

Hi,

Really interesting approach! I've worked a bit with triplet losses recently, and this approach seems great. However, it is not clear to me which hard-negative mining procedure you use for the triplet mining. In the paper you refer to "End-to-end Learning of Deep Visual Representations for Image Retrieval" for the HNM procedure, but I can't find a description of the procedure there either. Could you elaborate a bit on the HNM?

Thank you in advance!

checkpoint name wrong and error with latest version of scikit-learn

In the "Reproducing the results" section of README.md, you have:

python -m dirtorch.extract_features --dataset RParis6K \
		--checkpoint dirtorch/data/Resnet101-AP-GeM.pt \
		--output rparis6k_features.npy \
		--whiten Landmarks_clean --whitenp 0.25 --gpu 0

The name of the checkpoint (dirtorch/data/Resnet101-AP-GeM.pt) is wrong and needs to be replaced with dirtorch/data/Resnet-101-AP-GeM.pt.

Besides, $ conda install -c anaconda scikit-learn installs the latest version, v0.24.1 as of March 2021, which leads to the following error:

Traceback (most recent call last):
  File "/home/alijani/.conda/envs/py37/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/alijani/.conda/envs/py37/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/alijani/WS_Farid/OxfordRobotCar/AP-GeM/dirtorch/test_dir.py", line 214, in <module>
    net = load_model(args.checkpoint, args.iscuda)
  File "/home/alijani/WS_Farid/OxfordRobotCar/AP-GeM/dirtorch/test_dir.py", line 168, in load_model
    checkpoint = common.load_checkpoint(path, iscuda)
  File "/home/alijani/WS_Farid/OxfordRobotCar/AP-GeM/dirtorch/utils/common.py", line 121, in load_checkpoint
    checkpoint = torch.load(filename, map_location=lambda storage, loc: storage)
  File "/home/alijani/.conda/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 595, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/alijani/.conda/envs/py37/lib/python3.7/site-packages/torch/serialization.py", line 774, in _legacy_load
    result = unpickler.load()
ModuleNotFoundError: No module named 'sklearn.decomposition.pca'
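
(One possible workaround, not an official fix: the checkpoint pickles a PCA object under the old module path sklearn.decomposition.pca, which newer scikit-learn versions removed, so an alias can be registered before loading the checkpoint.)

import sys
import sklearn.decomposition

# Older pickles reference 'sklearn.decomposition.pca'; newer scikit-learn moved
# PCA elsewhere. Registering an alias lets torch.load unpickle the object.
sys.modules.setdefault('sklearn.decomposition.pca', sklearn.decomposition)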

about RPN

Dear sir, I did not find the RPN module or the ROI pooling module during network inference, either in rmac_resnet.py or in rmac_resnet_fpn.py. Could you please let me know whether you added these modules as the paper proposed?
Many thanks.

Liu

Implementation of ap loss

Hello, I am very interested in the AP loss you proposed, but I don't quite understand its implementation. Could you briefly explain it?

AP computation for Revisited Oxford/Paris datasets

Hi, congratulations on the great work and thanks for providing this reference implementation!

I have a question regarding AP computation. The convention for the Oxford/Paris datasets (and their revisited extensions) is to use an interpolation method, by averaging two adjacent precision points then multiplying by the recall step (see implementation by @filipradenovic here, and my reimplementation here). This is different from the "finite sum" method (Wikipedia reference), which I believe is the one used by sklearn.metrics.average_precision_score -- which is used by your code. Please correct me if I am wrong here :)

So, if what I state above about your code is correct, I am wondering what the mAP figures would be if you use the different mAP convention.

To illustrate the differences in implementation, here is a toy example, where the AP computed by the sklearn implementation is much higher than the one computed by the convention of Oxf/Par datasets (0.5 versus 0.3333):

# similarities and labels.
s = [3, 4, 1, 2]
y = [1, 0, 1, 0]

# computed from the library used in your code, produces AP=0.5.
from sklearn.metrics import average_precision_score
sklearn_ap = average_precision_score(y, s)
print("sklearn AP: %f" % sklearn_ap)

# computed from Revisited dataset convention, using my code, produces AP=0.333333.
# Note: see installation instructions at: https://github.com/tensorflow/models/blob/master/research/delf/INSTALL_INSTRUCTIONS.md
from delf.python.detect_to_retrieve import dataset
import numpy as np
ranks = []
for rank, i in enumerate(np.argsort(-np.array(s))):
  if y[i]:
    ranks.append(rank)
revisited_ap = dataset.ComputeAveragePrecision(ranks)
print("revisited AP: %f" % revisited_ap)

Overall, my guess would be that the results would not differ by that much since these datasets are large, but it would be good if can be sure of that. Also, as I said above, I may have missed something, so please feel free to correct me :)

Could you provide evaluation code for oxford/paris+distractors?

Happy new year!
Your work is very inspiring and the code is also well organized.
I noticed you have code to import distractors, but the source file has not been released yet.
I wish you could release the evaluation code for ROxf+1M and RPar+1M so we can reproduce all the results in your paper.
Looking forward to your reply :)

AP loss Backpropagation

@almazan
How do you calculate the derivative of the similarity matrix S and the matrix D? I computed it automatically through PyTorch, but the parameter update seems to be a bit problematic.
'''
desc_db = Variable(torch.cuda.FloatTensor(desc_db), requires_grad=True)
scores = torch.matmul(desc_db, desc_db.t())
valid_index = np.arange(batch_size*index, batch_size*(index+1), 1)
Y = np.array(Y_all)[valid_index][:, valid_index]
Y = torch.cuda.FloatTensor(np.array(Y))
rank_loss = criterion(scores, Y)
rank_loss.backward()
loss += rank_loss.item()
net.train()
for i, img in enumerate(imgs):
    img = Variable(img.cuda(), requires_grad=True)
    desc = net(img.unsqueeze(dim=0))
    one_grad = desc_db.grad[i].unsqueeze(0)
    desc.unsqueeze(0).backward(one_grad)
optimizer.step()
scheduler_mul.step()
optimizer.zero_grad()
lr = scheduler_mul.get_lr()[0]
'''

Supplementary material from the paper

Hi,
I was trying to use your implementation of the AP loss and to replicate your training procedure. In your paper, some supplementary material (with the pseudocode of the multi-staged training) is referenced, but I can't find it anywhere. Can you help me with this?

Best Regards,
Federico

Feature extractor error

Hi, great work! But I have a problem when running the feature extraction script.
Looking forward to your reply, thank you.
$ python -m dirtorch.extract_features --dataset 'ImageList("dirtorch/image.txt")' --checkpoint dirtorch/data/Resnet-101-AP-GeM.pt --output ditorch/data/results --whiten Lankmarks_clean --whitenp 0.25 --gpu 0
Launching on GPUs 0
Dataset: Dataset: ImageList
3 images
root: ...
/home/zhoul/anaconda3/lib/python3.6/site-packages/sklearn/base.py:311: UserWarning: Trying to unpickle estimator PCA from version 0.20.2 when using version 0.19.1. This might lead to breaking code or invalid results. Use at your own risk.
UserWarning)
=> loading checkpoint 'dirtorch/data/Resnet-101-AP-GeM.pt' (current_iter 296)
Traceback (most recent call last):
  File "/home/zhoul/anaconda3/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/zhoul/anaconda3/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/zhoul/deep-image-retrieval/dirtorch/extract_features.py", line 116, in <module>
    net.pca = net.pca[args.whiten]
KeyError: 'Lankmarks_clean'

mAP comparison on other datasets

First, great work!
Could you please post a comparison on other datasets, such as CUB, between "Learning with Average Precision: Training Image Retrieval with a Listwise Loss" and "Sampling Matters in Deep Embedding Learning"?

'str' object has no attribute 'get_query_db'

I am trying to extract features from my own dataset, but at line 112 (extract_features.py) the dataset variable is a string, so query_db = db.get_query_db() raises:
AttributeError: 'str' object has no attribute 'get_query_db'

Syntax Error on Feature Extraction

Hi, thank you for this awesome implementation. I have used the following command:

python dirtorch/extract_features.py --dataset /mnt/829A20D99A20CB8B/projects/Datasets/Products_Digi/images/shoe --checkpoint /mnt/829A20D99A20CB8B/projects/deep-image-retrieval/Resnet101-TL-GeM.pt --output /mnt/829A20D99A20CB8B/projects/deep-image-retrieval/features.txt --gpu 0 --whiten Landmarks_clean --whitenp 0.25

This command is exactly based on your instructions. However, I got the following error:

Traceback (most recent call last):
  File "dirtorch/extract_features.py", line 111, in <module>
    dataset = datasets.create(args.dataset)
  File "/mnt/829A20D99A20CB8B/projects/deep-image-retrieval/dirtorch/datasets/create.py", line 24, in __call__
    return eval(dataset_cmd)
  File "<string>", line 1
    /mnt/829A20D99A20CB8B/projects/Datasets/Products_Digi/images/shoe()
    ^
SyntaxError: invalid syntax
