
SuperGlobal


Official repository for the ICCV 2023 paper "Global Features are All You Need for Image Retrieval and Reranking" 🚀🚀🚀

Image retrieval systems conventionally use a two-stage paradigm, leveraging global features for initial retrieval and local features for reranking. However, the scalability of this method is often limited due to the significant storage and computation cost incurred by local feature matching in the reranking stage. In this paper, we present SuperGlobal, a novel approach that exclusively employs global features for both stages, improving efficiency without sacrificing accuracy. SuperGlobal introduces key enhancements to the retrieval system, specifically focusing on the global feature extraction and reranking processes. For extraction, we identify sub-optimal performance when the widely-used ArcFace loss and Generalized Mean (GeM) pooling methods are combined and propose several new modules to improve GeM pooling. In the reranking stage, we introduce a novel method to update the global features of the query and top-ranked images by only considering feature refinement with a small set of images, thus being very compute and memory efficient. Our experiments demonstrate substantial improvements compared to the state of the art in standard benchmarks. Notably, on the Revisited Oxford+1M Hard dataset, our single-stage results improve by 7.1%, while our two-stage gain reaches 3.7% with a strong 64,865x speedup. Our two-stage system surpasses the current single-stage state-of-the-art by 16.3%, offering a scalable, accurate alternative for high-performing image retrieval systems with minimal time overhead.

Leveraging global features only, our methods achieve state-of-the-art performance on ROxford (+1M), RParis (+1M), and the GLD test set, with orders-of-magnitude speedup.

News

9/14/2023: Evaluation code on the 1M distractor set is released!

Demo

[demo image]

Reproducing the Results

  1. Download Revisited Oxford & Paris from https://github.com/filipradenovic/revisitop and save it to ./revisitop.

  2. Download the CVNet pretrained weights from https://github.com/sungonce/CVNet and save them to ./weights.

  3. Run the following (choose depth 50 or 101; note that every SupG flag takes an explicit boolean value, since the yacs config parser expects key/value pairs):

python test.py MODEL.DEPTH [50, 101] TEST.WEIGHTS ./weights TEST.DATA_DIR ./revisitop SupG.gemp True SupG.rgem True SupG.sgem True SupG.relup True SupG.rerank True SupG.onemeval False

You will then find the exact reported results in log.txt.

Evaluation on 1M distractors

  1. Run python ./extract_rop1m.py --weight [path-to-weight] --depth [depth], which produces a .pth file in the current directory.

  2. Run python test.py MODEL.DEPTH [50, 101] TEST.WEIGHTS ./weights TEST.DATA_DIR ./revisitop SupG.gemp True SupG.rgem True SupG.sgem True SupG.relup True SupG.rerank True SupG.onemeval True

  3. See results in log.txt.

Application

If you would like to try our methods on other benchmarks or tasks, we recommend going over ./modules in this repository and plugging in the modules you need. They are easy to use and can be attached directly to your trained model; a sketch follows.
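For instance, a minimal sketch (our assumptions: a PyTorch backbone returning a (B, C, H, W) feature map, and the rgem class from rgem.py quoted in the issues below; adjust class names and import paths to the actual files in ./modules):

import torch
import torch.nn as nn
import torch.nn.functional as F
from rgem import rgem  # plug-in module; adjust the import path to ./modules

class RetrievalModel(nn.Module):
    # Hypothetical wrapper: backbone -> regional-GeM refinement -> GeM pooling -> L2 norm.
    def __init__(self, backbone, p=3.0):
        super().__init__()
        self.backbone = backbone                  # any CNN returning (B, C, H, W) features
        self.rgem = rgem()                        # refine the local feature map
        self.p = nn.Parameter(torch.ones(1) * p)  # GeM exponent
    def forward(self, x):
        f = self.rgem(self.backbone(x))
        f = f.clamp(min=1e-6).pow(self.p).mean(dim=(-2, -1)).pow(1.0 / self.p)  # GeM pool
        return F.normalize(f, p=2, dim=-1)        # unit-norm global descriptor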

Acknowledgement

Many thanks to CVNet and DELG-pytorch, whose resources helped us build this repository and inspired us to publish this work!

Contact us

Feel free to reach out to our co-corresponding authors at [email protected] and [email protected].


Issues

Understanding the reranking network

Hello, thank you for your excellent work, but I don't quite understand the reranking network. Why choose max pooling and average pooling to refine the features? What is the theoretical basis? Could you give a non-experimental, interpretability-oriented explanation?

Re-ranking network

Thank you for your great work! I am very interested in your re-ranking network. It seems to weight the retrieval features using the preliminary ranking results, without involving a more complex network optimization process. Is my understanding correct? Is the β in it learnable?
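For readers with the same question, a generic sketch of reranking by descriptor refinement. This is an illustration under our own assumptions, not the paper's exact method (the paper aggregates top-ranked descriptors with max and average pooling); here a softmax-weighted sum with a fixed temperature beta stands in for the aggregation:

import torch

def rerank_by_refinement(q, db, k=10, beta=0.15):
    # q: (D,) query descriptor; db: (N, D) database descriptors, all L2-normalized.
    # Refine the query with a similarity-weighted sum of its top-k first-stage
    # neighbours, then re-score. beta is a fixed temperature here, not learned.
    sims = db @ q                                      # first-stage similarities
    topk = sims.topk(k).indices
    weights = torch.softmax(sims[topk] / beta, dim=0)
    q_refined = q + weights @ db[topk]                 # weighted aggregation
    q_refined = q_refined / q_refined.norm()
    return db @ q_refined                              # similarities used for reranking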


Training CVNet Backbone With GeM+, Regional-GeM, Scale-GeM

Hey @ShihaoShao-GH ,

First of all, thanks for the paper, for releasing your code on GitHub, and for following up on issues. My knowledge of the image retrieval world is brand new, and my stats knowledge is basic, so forgive me if this is naive.

My question relates to training a CVNet backbone with your modifications in place (GeM+, Rgem, ScaleGem). Is it as simple as plug and play? I have ported your existing code in forward and forward_singlescale to here, with the same losses proposed in CVNet (CurricularFace, ContrastiveLoss); would this work?

Edit: After deeper consideration, there is nothing wrong with doing this.

Code question

Line 264 of resnet.py contains an F.normalize(x6, p=2, dim=-1); the original CVNet does not have this operation.

How to set GeM when training?

When training the model, I know to use the ArcFace loss, but how should GeM be used in the model?
Is p trainable during training? If yes, how should it be initialized? If not, how should p be set?
If GeM is not used during training, where does the feature come from?

Thank you very much!
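For context, the standard GeM layer from Radenovic et al., which is the usual answer to this question (a generic sketch, not necessarily this repository's exact module): p is typically a trainable scalar initialized to 3.0, and values around 3 are the common choice if it is kept fixed.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GeM(nn.Module):
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.ones(1) * p)  # trainable exponent, init 3.0
        self.eps = eps
    def forward(self, x):                          # x: (B, C, H, W) feature map
        x = x.clamp(min=self.eps).pow(self.p)      # clamp keeps pow differentiable at 0
        x = F.avg_pool2d(x, (x.size(-2), x.size(-1)))
        return x.pow(1.0 / self.p).flatten(1)      # (B, C) descriptor, fed to the ArcFace head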

Custom dataset

Hello,
I'm a newbie in the field of image retrieval. I'd like to run your code using my own dataset. I've downloaded the pre-trained CVNet weights as you've described and would like to use the CVNet model. My dataset consists of query images and a bunch of surrounding images for each query. Unlike the roxford5k dataset, it's not annotated. The query images are simply in a query folder, and the images that should go into the database are in a separate database folder. I'm not sure how to apply my custom dataset to your code in this situation. Additionally, you mentioned grid search in your paper, which I don't fully understand. If I use my custom dataset, do I need to perform grid search separately? Is there separate code available for this?
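For readers in the same situation, one hypothetical way to handle an unannotated query/database layout (the folder names below are made up): without ground-truth labels you can produce rankings but not mAP, so the evaluation step is skipped. As for the grid search, the hyperparameters reported in the paper were tuned on labeled benchmarks, so on an unlabeled set you can only keep the defaults or validate alternatives by inspecting rankings qualitatively.

import os

def list_images(folder):
    # Collect image paths from a flat folder (hypothetical ./mydata layout).
    exts = {".jpg", ".jpeg", ".png"}
    return sorted(
        os.path.join(folder, f)
        for f in os.listdir(folder)
        if os.path.splitext(f)[1].lower() in exts
    )

query_paths = list_images("./mydata/query")
db_paths = list_images("./mydata/database")
# Extract descriptors with the model, score each query against the database,
# and save the similarity-sorted filenames per query instead of computing mAP.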

GLDv2 test performance reproduction

Hi,
Thank you for your great work and sharing your code.

I was able to reproduce mAP scores for ROxford and RParis datasets, also with +1M distractors, both with and without re-ranking. Unfortunately, I cannot reproduce mAP@100 scores for GLDv2 retrieval test set. I am around 0.8 below your scores even without re-ranking. The same applies when I try to reproduce original CVNet - I can reproduce ROP but not GLDv2.

I cannot figure out what I am doing wrong. Is there any special pre-processing of the GLDv2 images that differs from the ROP images, for example some image resizing? I noticed that if I keep the GLDv2 images at their original size, some of them have a smaller side that is too small, and extraction with your code crashes. This does not happen with ROP, as those images are generally larger and have less extreme aspect ratios.

Thank you for your assistance.
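One hypothetical guard against the small-image crash described in this issue (the 256-pixel threshold is a guess, not a value from the paper; the point is to keep the backbone's feature map larger than the pooling kernels):

from PIL import Image

def load_with_min_side(path, min_side=256):
    # Upsample images whose smaller side is below min_side before extraction.
    im = Image.open(path).convert("RGB")
    w, h = im.size
    scale = min_side / min(w, h)
    if scale > 1.0:
        im = im.resize((round(w * scale), round(h * scale)), Image.BICUBIC)
    return im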

Code for CVNet training

Thank you for your excellent work. I would like to ask whether you have the code for CVNet training? The official CVNet project does not provide it.

about DELG-pytorch

Hi, I'm really sorry to bother you. After reading your paper and your open-source code, I want to use the main framework of DELG-pytorch together with the pooling methods from your paper, plus some ideas of my own, to improve it. However, after downloading the DELG project and configuring the environment according to its README, I cannot run it following the described steps. Since you have run that code, I would like to ask whether the version you got working is still available, or whether you ever ran into the problem of val_list.txt not being found, or any other issue.

Issue Replicating SuperGlobal Results: "Override list has odd length" Error

Dear authors,

I recently came across your paper on SuperGlobal and found it particularly insightful. I have also familiarized myself with the contents of CVNet.

For CVNet, I was able to execute the following command without any issues:
python test.py MODEL.DEPTH 50 TEST.WEIGHTS CVPR2022_CVNet_R50.pyth TEST.DATA_DIR ./revisitop/data/datasets
However, when attempting to replicate the results for SuperGlobal using the following command, I encountered an error suggesting an odd length for the parameter list. I would greatly appreciate your guidance in resolving this matter.
python test.py MODEL.DEPTH 50 TEST.WEIGHTS ./weights/CVPR2022_CVNet_R50.pyth TEST.DATA_DIR ./revisitop/data/datasets SupG.gemp SupG.rgem SupG.sgem SupG.relup SupG.rerank SupG.onemeval False
Below is the traceback for your reference:
Traceback (most recent call last):
  File "D:\test\github\SuperGlobal\test.py", line 17, in <module>
    main()
  File "D:\test\github\SuperGlobal\test.py", line 10, in main
    config.load_cfg_fom_args("test a CVNet model.")
  File "D:\test\github\SuperGlobal\config.py", line 145, in load_cfg_fom_args
    _C.merge_from_list(args.opts)
  File "C:\Users\u\.conda\envs\test\Lib\site-packages\yacs\config.py", line 223, in merge_from_list
    _assert_with_logging(
  File "C:\Users\u\.conda\envs\test\Lib\site-packages\yacs\config.py", line 545, in _assert_with_logging
    assert cond, msg
AssertionError: Override list has odd length: ['MODEL.DEPTH', '50', 'TEST.WEIGHTS', './weights/CVPR2022_CVNet_R50.pyth', 'TEST.DATA_DIR', './revisitop/data/datasets', 'SupG.gemp', 'SupG.rgem', 'SupG.sgem', 'SupG.relup', 'SupG.rerank', 'SupG.onemeval', 'False']; it must be a list of pairs
Thank you in advance for your assistance.
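For later readers, the traceback itself points at the fix: yacs's merge_from_list requires key/value pairs, so each SupG flag must be followed by an explicit boolean, for example:

python test.py MODEL.DEPTH 50 TEST.WEIGHTS ./weights/CVPR2022_CVNet_R50.pyth TEST.DATA_DIR ./revisitop/data/datasets SupG.gemp True SupG.rgem True SupG.sgem True SupG.relup True SupG.rerank True SupG.onemeval False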

Code

Hello again!
Thanks for your helpful response to my previous question. Before building my own pipeline, I'm studying your code, but some parts are unclear.

  1. Regional-GeM
    In the rgem.py file, the code looks like this:
class rgem(nn.Module):
    """ Reranking with maximum descriptors aggregation """
    def __init__(self, pr=2.5, size=5):
        super(rgem, self).__init__()
        self.pr = pr
        self.size = size
        self.lppool = nn.LPPool2d(self.pr, int(self.size), stride=1)
        self.pad = nn.ReflectionPad2d(int((self.size-1)//2))
    def forward(self, x):
        nominator = (self.size ** 2) ** (1./self.pr)
        x = 0.5*self.lppool(self.pad(x/nominator)) + 0.5*x
        return x

I thought nominator was for normalization, but I don't understand the role of **(1./self.pr) (a numeric check of this appears after this issue).

  2. Paper and Code Alignment
    In your RerankwMDA.py file, both X1 and X2 appear to be refined. According to the paper, I thought res_rerank would correspond to S2 and be computed with the original global descriptors (g_d), not the refined ones (g_dr).

I'm sure I'm missing some details, so I'd appreciate your explanation of these aspects.

Thank you!
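Regarding the first question above: nn.LPPool2d returns (sum of x^p)^(1/p) over each window, so dividing the input by nominator = (size^2)^(1/p) turns that sum into a mean, i.e. GeM pooling over the local region. A standalone numeric check (not repository code):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)            # non-negative, as after a ReLU
pr, size = 2.5, 5
pad = nn.ReflectionPad2d((size - 1) // 2)
lppool = nn.LPPool2d(pr, size, stride=1)

nominator = (size ** 2) ** (1.0 / pr)
gem_via_lppool = lppool(pad(x) / nominator)
# Direct GeM over each 5x5 window: (mean of x^p)^(1/p)
gem_direct = F.avg_pool2d(pad(x).pow(pr), size, stride=1).pow(1.0 / pr)
assert torch.allclose(gem_via_lppool, gem_direct, atol=1e-5)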

Dataset issue

May I ask: the Oxford+1M Hard dataset is generated with https://github.com/filipradenovic/revisitop, right? It seems the original datasets are only Oxford 5k and Paris 6k, and the other datasets are derived from them. But neither of those two seems to be downloadable.

One more question: if I understand correctly, this dataset comes with supervision labels, and everyone's models are trained on it. But in real-world scenarios, image collections are usually unlabeled and only serve a similarity recall system. Have you considered this situation? In that case, many model structures no longer hold, because they cannot be trained.

Score gap for +1m datasets

I get the +1M scores based on ResNet-50 with a small gap from the paper.

ROxford
Without rerank: Retrieval results: mAP E: 87.98, M: 73.13, H: 50.75
With rerank: Retrieval results: mAP E: 92.11, M: 78.7, H: 62.36

RParis
Without rerank: Retrieval results: mAP E: 90.6, M: 79.3, H: 62.62
With rerank: Retrieval results: mAP E: 92.71, M: 82.24, H: 68.0

Reranking network

Thank you for the great work you do. It seems that your reranking network does not require training, and there is no corresponding loss. I used the same processing logic, but the reranking results on my extracted global features do not seem ideal.
