pyretri / pyretri
Open source deep learning based unsupervised image retrieval toolbox built on PyTorch🔥
License: Apache License 2.0
Hi,
Thanks for the great work on this library!
I have a small suggestion to improve the pairwise distance calculation
https://github.com/PyRetri/PyRetri/blob/master/pyretri/index/re_ranker/re_ranker_impl/k_reciprocal.py#L35-L51
I see this function being used in 3 places - kreciprocal, knn and query expansion.
This function can simply be replaced by one line: torch.cdist(query_fea, gallery_fea)
PyTorch's official docs don't list this function for 1.2 but it is supported (I've used it).
This change will reduce duplication and make the code more readable.
Thanks!
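For reference, here is a dependency-free sketch of what a pairwise Euclidean distance routine computes. The function name `pairwise_euclidean` and the toy vectors are illustrative assumptions, not PyRetri code. On tensors, `torch.cdist(query_fea, gallery_fea)` with its default `p=2` should yield the same matrix.

```python
import math

def pairwise_euclidean(query_fea, gallery_fea):
    """Return a len(query_fea) x len(gallery_fea) matrix of Euclidean distances."""
    return [[math.dist(q, g) for g in gallery_fea] for q in query_fea]

# Toy features: 2 queries and 3 gallery items, each 2-dimensional.
query_fea = [[0.0, 0.0], [1.0, 1.0]]
gallery_fea = [[3.0, 4.0], [1.0, 1.0], [0.0, 1.0]]

dist = pairwise_euclidean(query_fea, gallery_fea)
for row in dist:
    print([round(d, 4) for d in row])
```

Row i of the result holds the distances from query i to every gallery item, which is the layout a re-ranker would typically consume.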
In this implementation, the similarity measure is based on Euclidean distances. How can I add other similarity measures, such as cosine?
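One possible route, sketched here without relying on any PyRetri API: if features are L2-normalized first, ranking by Euclidean distance is equivalent to ranking by cosine similarity, because for unit vectors ||a - b||^2 = 2 - 2·cos(a, b). The helper names below are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale v to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine_similarity(a, b):
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def euclidean_on_unit_vectors(a, b):
    return math.dist(l2_normalize(a), l2_normalize(b))

a, b = [1.0, 2.0, 2.0], [2.0, 0.0, 1.0]
cos = cosine_similarity(a, b)
d = euclidean_on_unit_vectors(a, b)

# For unit vectors the two quantities are tied together:
print(round(d ** 2, 6), round(2 - 2 * cos, 6))
```

Since the squared distance is a decreasing function of the cosine, sorting by ascending distance and sorting by descending cosine give the same order.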
Thanks a lot for your great work putting so many nice things together.
This question is not related to your work, but maybe you can shed some light on it.
All aggregation methods except max pooling give a very high similarity score for two very different images. The cosine similarity is typically larger than 0.7 or 0.8 with avg pooling, spoc, crow, etc. What do you do about this in image retrieval? I am thinking about mapping the similarity score back to the range 0 to 1, with a min-max norm or a hard mapping, but I am not sure it is the right thing to do.
Thanks a lot.
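The min-max idea mentioned above can be sketched as follows. The scores are made-up values and the function names are hypothetical. The point of the sketch is that a min-max mapping spreads the scores over [0, 1] but keeps their order, so the retrieved ranking itself does not change.

```python
def min_max_rescale(scores):
    """Map scores linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def ranking(scores):
    """Indices sorted from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Made-up cosine similarities that are all "high", as described in the question.
raw = [0.91, 0.78, 0.84, 0.95, 0.80]
rescaled = min_max_rescale(raw)

print(ranking(raw))
print(ranking(rescaled))
```

Because the order is preserved, such a mapping would mainly matter when the score is shown to a user or compared against a fixed threshold.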
I added my own model following https://github.com/PyRetri/PyRetri/blob/master/docs/GETTING_STARTED.md. But it needs more modification for my own module to work. I have made other changes and got the features by running extract_feature.py. But I encountered this problem when running python3 main/index.py -cfg configs/my_own_local.yaml. Can you help me?
Will you continue to update this project?
While attempting to extract features from a collection of images, I get empty results. The specified directory doesn't appear, nor does any data populate.
in anaconda console:
python main/extract_feature.py -dj data_jsons/gallery.json -sp /data/features/nft/gallery -cfg configs/nft.yaml
0it [00:00, ?it/s]
time: 12.621525526046753
This may be related to an earlier error regarding CUDA:
OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "C:\Users\user\Anaconda3\lib\site-packages\torch\lib\cudnn_cnn_train64_8.dll" or one of its dependencies.
Hi there,
I get this error message:
File "/pyretri-0.1.0+unknown-py3.7.egg/pyretri/evaluate/evaluator/evaluators_impl/overall.py", line 78, in __call__
gt.pop(0)
AttributeError: 'numpy.ndarray' object has no attribute 'pop'
What is the reason for this?
As far as I know, 'pop' can normally be used on lists, not on ndarrays. Is this a bug in the code or something else?
Thanks!
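A possible workaround, assuming `gt` is a 1-D NumPy array at that point: convert it to a plain list before removing the first element. NumPy arrays expose `tolist()`, while `pop` exists only on lists. The helper name `drop_first` is hypothetical, and whether this matches the intent of line 78 in overall.py is not confirmed here.

```python
def drop_first(gt):
    """Return gt without its first element, as a plain Python list."""
    # ndarray-like objects provide tolist(); other sequences are copied with list().
    items = gt.tolist() if hasattr(gt, "tolist") else list(gt)
    return items[1:]

# Works for any sequence; a tuple stands in for the array in this toy call.
print(drop_first((7, 3, 9)))
```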
Can the output features be used for retrieval directly, or is further training needed?
Hi
When I downloaded the vgg16_hybrid1365.pt model and ran feature extraction on the Oxford dataset, I got a shape mismatch error. Any idea why? Thank you.
python3 main/extract_feature.py -dj oxford_query.json -sp ~/Downloads/oxford/features/query/ -cfg configs/oxford.yaml
/Users/tslaqq/miniconda3/lib/python3.9/site-packages/torchvision-0.9.1-py3.9-macosx-10.9-x86_64.egg/torchvision/transforms/transforms.py:257: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
warnings.warn(
[LoadStateDict]: shape mismatch in parameter classifier.6.bias, torch.Size([1000]) vs torch.Size([1365])
[LoadStateDict]: shape mismatch in parameter classifier.6.weight, torch.Size([1000, 4096]) vs torch.Size([1365, 4096])
[LoadStateDict]: missing keys or mismatch param in state_dict: {'classifier.6.weight', 'classifier.6.bias'}
0%|
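Reading the log above: the model appears to be built with a 1000-class classifier while the checkpoint carries a 1365-class one, and the loader reports those two parameters as mismatched. If features are taken from a layer before the classifier (for example pool5), the skipped classifier weights would not enter the computation. Below is a minimal sketch of shape-based filtering, with plain tuples standing in for tensor shapes; `split_compatible` is a hypothetical name, not the PyRetri loader.

```python
def split_compatible(model_shapes, ckpt_shapes):
    """Separate checkpoint entries that fit the model from those that do not."""
    loadable, skipped = {}, {}
    for name, shape in ckpt_shapes.items():
        if model_shapes.get(name) == shape:
            loadable[name] = shape
        else:
            # Keep both shapes so the mismatch can be reported.
            skipped[name] = (model_shapes.get(name), shape)
    return loadable, skipped

# Shapes taken from the log above, plus the first VGG-16 conv layer for contrast.
model_shapes = {
    "features.0.weight": (64, 3, 3, 3),
    "classifier.6.weight": (1000, 4096),
    "classifier.6.bias": (1000,),
}
ckpt_shapes = {
    "features.0.weight": (64, 3, 3, 3),
    "classifier.6.weight": (1365, 4096),
    "classifier.6.bias": (1365,),
}

loadable, skipped = split_compatible(model_shapes, ckpt_shapes)
for name, (expected, found) in skipped.items():
    print(f"shape mismatch in {name}: {expected} vs {found}")
```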
Nice job! How can I extract features from a specific layer, e.g., conv3-1 in VGG-16, using PyRetri?
How can I get the same performance as the SCDA paper on the CUB200-2011 dataset?
In the config file, I want to use all features as the image descriptors:
extractor:
  name: "MoCoResSeries"  # name of the extractor.
  MoCoResSeries:
    extract_features: ["all"]  # name of the output feature map. If it is ["all"], then all available features will be output.
But what if the intermediate features have shapes [N1,C1,H1,W1] and [N2,C2,H2,W2] while the last feature has shape [N, L]? How should this be handled? And in the indexing phase, what should feature_names be? Is it ["all_GEM"] if the aggregator is GeM?
Thank you very much for your code and your reply.
I want to see recall@5, so I changed the default_hyper_params entry "recall_k": [1, 2, 4, 8] to "recall_k": [1, 2, 5, 8] on line 22 of pyretri/evaluate/evaluator/evaluators_impl/overall.py. When I print the result, I find it did not take effect; it still prints "R@1: 46.4 R@2: 59.3 R@4: 71.6 R@8: 81.2". I then read pyretri/evaluate/helper/helper.py, but I don't know what's wrong.
In addition, I want to reproduce the top-1 mAP and top-5 mAP reported in the SCDA paper, but I can't work out how they are calculated in detail. I tried to modify the mAP calculation in pyretri/evaluate/evaluator/evaluators_impl/overall.py. Could you give some advice or details? To understand mAP and recall, I referred to this article: https://yongyuan.name/blog/evaluation-of-information-retrieval.html.
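As a reference point for the metrics discussed above, here is a small sketch of recall@k and average precision for one query, given the labels of the ranked gallery results. The function names and the toy ranking are assumptions. The optional `top_k` truncation is one common reading of "top-k mAP", and whether the SCDA paper uses exactly this definition is not confirmed here.

```python
def recall_at_k(ranked_labels, query_label, k):
    """1.0 if any of the top-k results shares the query label, else 0.0."""
    return 1.0 if query_label in ranked_labels[:k] else 0.0

def average_precision(ranked_labels, query_label, top_k=None):
    """Mean of precision values taken at each rank where a relevant item appears."""
    if top_k is not None:
        ranked_labels = ranked_labels[:top_k]
    hits, precision_sum = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

# Toy ranking for a query whose label is "dog".
ranked = ["cat", "dog", "dog", "cat", "dog"]
r1 = recall_at_k(ranked, "dog", 1)
r5 = recall_at_k(ranked, "dog", 5)
ap = average_precision(ranked, "dog")
ap_top1 = average_precision(ranked, "dog", top_k=1)
print(r1, r5, round(ap, 4), ap_top1)
```

Averaging `recall_at_k` over all queries gives R@k, and averaging `average_precision` over all queries gives mAP.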
It's a great job! But I encountered some problems.
After running python main/split_dataset.py -d ./data/caltech101/ -sf main/split_file/caltech_split.txt
data/caltech101/gallery/accordion/
contains links rather than image files.
And when I run
python main/extract_feature.py -dj data_jsons/caltech_gallery.json -sp ./data/features/caltech/gallery/ -cfg configs/caltech.yaml
an error occurs.
While working with the Caltech101 dataset, I found a few small things that may help improve this great project.
First, you need to mkdir the data_jsons folder before making the data json, and mkdir retrieved_images before indexing a single image, or an error will happen. Maybe we can add the mkdir step to the related code.
Second, single_index.py line 54 should add .cpu():
stacked_feature.append(img_fea_info[0][name].cpu())  # should add cpu()
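The mkdir suggestion could look like the sketch below. It uses a temporary base directory so it can run anywhere. The folder names come from the report above, while the helper name `ensure_dir` is hypothetical.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path

# A temporary base directory stands in for the project root.
base = tempfile.mkdtemp()
for name in ("data_jsons", "retrieved_images"):
    ensure_dir(os.path.join(base, name))
    ensure_dir(os.path.join(base, name))  # calling it again is harmless

print(sorted(os.listdir(base)))
```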
Hi, I am using this repo for my own data. I can now successfully run single_index.py to visualize retrieval results.
However, the retrieved images seem odd, and no matter how I change the input image in single_index.py, the result is always the same set of retrieved images.
I think it is all about the config settings, such as the normalization parameters, but I have no idea how to set those values.
Besides the normalization parameters, are there other ways to improve my results?
Thanks
train_fea_dir: "/data/features/best_features/paris" # path of the features for training SVD.
I found this in oxford.yaml. Strangely, when I run main/index.py, the shell reports an error:
[LoadFeature]: loading feature from /mnt/ceph/home/richardzuo/wx_yszm/ft_local/PyRetri-master/features/oxford/query/part_0.json...
[LoadFeature] Success, total 55 images, feature names: dict_keys(['pool5_GAP'])
[LoadFeature]: loading feature from /mnt/ceph/home/richardzuo/wx_yszm/ft_local/PyRetri-master/features/oxford/gallery/part_1.json...
[LoadFeature]: loading feature from /mnt/ceph/home/richardzuo/wx_yszm/ft_local/PyRetri-master/features/oxford/gallery/part_0.json...
[LoadFeature] Success, total 5063 images, feature names: dict_keys(['pool5_GAP'])
Traceback (most recent call last):
  File "main/index.py", line 48, in <module>
    main()
  File "main/index.py", line 36, in main
    index_helper = build_index_helper(cfg.index)
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/index/builder.py", line 89, in build_index_helper
    dim_processors = build_processors(cfg["feature_names"], cfg.dim_processors)
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/index/builder.py", line 60, in build_processors
    processors.append(simple_build(name, cfg, DIMPROCESSORS, feature_names=feature_names))
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/utils/builder.py", line 50, in simple_build
    return module(hps=hps, **kwargs)
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/index/dim_processor/dim_processors_impl/svd.py", line 42, in __init__
    self._train(self._hyper_params["train_fea_dir"])
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/index/dim_processor/dim_processors_impl/svd.py", line 51, in _train
    train_fea, _, _ = feature_loader.load(fea_dir, self.feature_names)
  File "/data1/anaconda3/envs/PyRetri/lib/python3.7/site-packages/pyretri-0.1.0+unknown-py3.7.egg/pyretri/index/utils/feature_loader.py", line 67, in load
    assert os.path.exists(fea_dir), "non-exist feature path: {}".format(fea_dir)
Can you explain it? Thanks~
The command line is:
python extract_feature.py -dj ../data_jsons/indoor_gallery.json -sp ../data/features/indoor/gallery/ -cfg ../configs/indoor.yaml
The model at /data/places365_model/res50_places365.pt was downloaded from https://drive.google.com/open?id=1lp_nNw7hh1MQO_kBW86GG8y3_CyugdS2
and part of the model does not match the state_dict:
[LoadStateDict]: shape mismatch in parameter fc.weight, torch.Size([1000, 2048]) vs torch.Size([365, 2048])
[LoadStateDict]: shape mismatch in parameter fc.bias, torch.Size([1000]) vs torch.Size([365])
[LoadStateDict]: missing keys or mismatch param in state_dict: {'layer2.0.downsample.1.num_batches_tracked', 'layer2.3.bn1.num_batches_tracked', 'layer2.1.bn3.num_batches_tracked', 'layer3.0.bn1.num_batches_tracked', 'layer4.0.bn1.num_batches_tracked', 'layer3.4.bn2.num_batches_tracked', 'layer4.1.bn1.num_batches_tracked', 'layer3.4.bn1.num_batches_tracked', 'layer4.0.bn3.num_batches_tracked', 'layer2.2.bn3.num_batches_tracked', 'layer1.2.bn2.num_batches_tracked', 'layer2.0.bn2.num_batches_tracked', 'layer1.1.bn1.num_batches_tracked', 'layer4.0.bn2.num_batches_tracked', 'layer3.1.bn3.num_batches_tracked', 'layer3.1.bn1.num_batches_tracked', 'layer1.2.bn1.num_batches_tracked', 'layer2.2.bn2.num_batches_tracked', 'layer4.2.bn2.num_batches_tracked', 'layer1.0.downsample.1.num_batches_tracked', 'layer3.0.bn3.num_batches_tracked', 'layer3.3.bn2.num_batches_tracked', 'layer3.2.bn2.num_batches_tracked', 'layer2.2.bn1.num_batches_tracked', 'layer1.1.bn2.num_batches_tracked', 'layer3.5.bn1.num_batches_tracked', 'layer3.4.bn3.num_batches_tracked', 'layer4.2.bn3.num_batches_tracked', 'layer4.2.bn1.num_batches_tracked', 'layer1.0.bn2.num_batches_tracked', 'layer2.1.bn1.num_batches_tracked', 'layer2.3.bn3.num_batches_tracked', 'layer2.0.bn1.num_batches_tracked', 'layer1.0.bn3.num_batches_tracked', 'layer1.0.bn1.num_batches_tracked', 'layer3.0.bn2.num_batches_tracked', 'layer3.5.bn2.num_batches_tracked', 'layer4.0.downsample.1.num_batches_tracked', 'layer4.1.bn3.num_batches_tracked', 'bn1.num_batches_tracked', 'layer3.2.bn1.num_batches_tracked', 'layer1.1.bn3.num_batches_tracked', 'fc.bias', 'layer3.3.bn1.num_batches_tracked', 'layer1.2.bn3.num_batches_tracked', 'layer2.3.bn2.num_batches_tracked', 'layer3.0.downsample.1.num_batches_tracked', 'layer3.2.bn3.num_batches_tracked', 'layer3.5.bn3.num_batches_tracked', 'layer2.1.bn2.num_batches_tracked', 'layer3.3.bn3.num_batches_tracked', 'layer3.1.bn2.num_batches_tracked', 'fc.weight', 'layer2.0.bn3.num_batches_tracked', 'layer4.1.bn2.num_batches_tracked'}
@hby96
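Hello, I don't quite understand the AQE implementation in the code. When you have time, could you reply with a rough outline of the computation?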
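For orientation, the general idea of average query expansion (AQE) can be sketched as follows: retrieve once, average the query feature with the features of its top-k results, then retrieve again with the averaged feature. This is a generic illustration under that assumption. The function names and toy features are hypothetical, and PyRetri's own implementation may differ in details such as normalization or weighting.

```python
import math

def rank_by_distance(query, gallery):
    """Gallery indices sorted from nearest to farthest."""
    return sorted(range(len(gallery)), key=lambda i: math.dist(query, gallery[i]))

def average_query_expansion(query, gallery, top_k):
    """Average the query with its top_k nearest gallery features."""
    order = rank_by_distance(query, gallery)
    neighbours = [gallery[i] for i in order[:top_k]]
    count = len(neighbours) + 1
    return [
        (query[d] + sum(n[d] for n in neighbours)) / count
        for d in range(len(query))
    ]

query = [1.0, 0.0]
gallery = [[0.9, 0.1], [0.8, 0.3], [0.0, 1.0]]

expanded = average_query_expansion(query, gallery, top_k=2)
final_order = rank_by_distance(expanded, gallery)
print([round(x, 4) for x in expanded], final_order)
```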
I encountered this error when running main/extract_feature.py. How can I fix it?
I traced the error to this code:
with open(data_json_path, "rb") as f:
    self.data_info = pickle.load(f)
I am pretty sure that the "wb" mode was used when generating this json file.
Please help me.
Thanks in advance.
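The error message itself is not shown above, so the following is only a diagnostic sketch: it writes a file with "wb" and `pickle.dump`, reads it back with "rb" and `pickle.load`, and turns a failed load into a clearer message. The helper name `load_data_info` and the toy payload are assumptions.

```python
import os
import pickle
import tempfile

def load_data_info(path):
    """Load a pickled data file, raising a readable error if it is not valid."""
    with open(path, "rb") as f:
        try:
            return pickle.load(f)
        except (pickle.UnpicklingError, EOFError) as exc:
            raise ValueError(f"{path} is not a readable pickle file: {exc}") from exc

# Round trip: "wb" when writing, "rb" when reading.
tmp_dir = tempfile.mkdtemp()
data_json_path = os.path.join(tmp_dir, "gallery.json")
with open(data_json_path, "wb") as f:
    pickle.dump([{"path": "a.jpg", "label": "a"}], f)  # toy payload

data_info = load_data_info(data_json_path)
print(data_info)
```

An empty or truncated file (for example from an interrupted write) would fail at `pickle.load`, so checking the file size is a cheap first step.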
I have a local dataset containing only .jpg or .png images; I call it DataOri.
I want to use it for image retrieval by making two folders: gallery and query. The gallery folder has all original images of DataOri, and the query folder has some patches randomly cropped from DataOri, i.e. at 0.5 scale of the original image. Target: use a single query patch to search for the top-k original images.
Given the above two folders, gallery and query, what should I do to quickly make this local dataset valid in this project, and how can I calculate metrics like mAP?
I'd appreciate your advice.
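One way to make such a setup measurable, sketched under stated assumptions: give every gallery image its own filename stem as label, and give every query patch the label of the image it was cropped from, so a retrieval counts as correct when the labels match. The record layout (`path`, `label`) and the `<source>_patch<k>` naming are hypothetical. The actual data json format that PyRetri expects is not shown here.

```python
import os
import tempfile

IMAGE_EXTS = (".jpg", ".png")

def collect_records(folder, label_of):
    """Walk folder and return one {"path", "label"} record per image file."""
    records = []
    for root, _, files in os.walk(folder):
        for name in sorted(files):
            if name.lower().endswith(IMAGE_EXTS):
                records.append({"path": os.path.join(root, name), "label": label_of(name)})
    return records

def gallery_label(name):
    # "img_001.jpg" -> "img_001"
    return os.path.splitext(name)[0]

def query_label(name):
    # Hypothetical patch naming: "img_001_patch0.png" -> "img_001"
    return os.path.splitext(name)[0].rsplit("_patch", 1)[0]

# Tiny demo with empty placeholder files in a temporary directory.
root = tempfile.mkdtemp()
for sub, fname in (("gallery", "img_001.jpg"), ("query", "img_001_patch0.png")):
    os.makedirs(os.path.join(root, sub), exist_ok=True)
    open(os.path.join(root, sub, fname), "wb").close()

gallery = collect_records(os.path.join(root, "gallery"), gallery_label)
query = collect_records(os.path.join(root, "query"), query_label)
print(gallery[0]["label"], query[0]["label"])
```

With labels defined this way, each query has exactly one relevant gallery image, so mAP reduces to the mean reciprocal rank of that image.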
Firstly, thank you for sharing your work.
When I run "python3 main/single_index.py -cfg configs/caltech.yaml" according to GETTING_STARTED.md, some errors occur.
FileNotFoundError: [Errno 2] No such file or directory: '/data/caltech101/query/airplanes/image_0004.jpg'
I found that the variable path is hard-coded in single_index.py on line 5:
path = '/data/caltech101/query/airplanes/image_0004.jpg'
To run inference with the pretrained model on the Oxford dataset, features need to be extracted from the Paris dataset.
So splitting the Paris dataset is also required, right?
When I'm using main/single_index.py, how can I output the confidence?
Hi,
I think I may have found a bug in the k reciprocal computation.
Based on the implementation from the original author (https://github.com/zhunzhong07/person-re-ranking/blob/c11b3514114cbffc70588decda48c958fc965f5a/python-version/re_ranking_feature#L49), the re-ranking function expects the actual distance, which it then squares to get the squared distance.
In your implementation, the _cal_dis function returns the squared distance, not the actual distance, so the distance ends up squared twice.
In my own implementation, I removed this line, and it improved my performance by 1 percentage point across all my metrics.
Thank you for the great work!
P.S. Implementing my suggestion in #14 can fix this issue without having to remove the squaring operation
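The difference described above can be illustrated with a small sketch. It computes the squared distance through the expansion ||q||^2 + ||g||^2 - 2 q·g, takes the square root to get the actual distance, and shows what happens when an already squared value is squared again. The function names are hypothetical and do not reproduce `_cal_dis`.

```python
import math

def squared_distance(q, g):
    """||q||^2 + ||g||^2 - 2 q.g, an expansion often used for batched distances."""
    qq = sum(x * x for x in q)
    gg = sum(x * x for x in g)
    qg = sum(x * y for x, y in zip(q, g))
    return max(qq + gg - 2.0 * qg, 0.0)  # clamp tiny negatives caused by rounding

def actual_distance(q, g):
    return math.sqrt(squared_distance(q, g))

q, g = [0.0, 0.0], [3.0, 4.0]
sq = squared_distance(q, g)   # what a squared-distance routine returns
d = actual_distance(q, g)     # what the re-ranker is reported to expect
double_squared = sq ** 2      # result of squaring the squared distance again

print(sq, d, double_squared)
```

Squaring twice keeps the order of neighbours but changes the relative magnitudes, which can matter wherever the distances are used as weights rather than only for sorting.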
When I run python main/extract_feature.py -dj data_jsons/oxford_gallery.json -sp data/features/oxford/gallery.pickle -cfg configs/oxford.yaml
I get the output below. It does not dump any pickle file. After a little debugging, I found that it never enters the dataloader loop at https://github.com/PyRetri/PyRetri/blob/master/pyretri/extract/helper/helper.py/#L109.
0it [00:00, ?it/s]
time: 0.47732090950012207
Why is data augmentation used in inference?
Error:
Traceback (most recent call last):
  File "setup.py", line 179, in <module>
    long_description=readme(),
  File "setup.py", line 69, in readme
    content = fid.read()
  File "C:\Users\user\Anaconda3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 273: character maps to <undefined>
Fix:
I edited the readme() function in setup.py to open README.md with UTF-8 encoding:
def readme():
    """Get readme.

    Returns:
        str, readme content string.
    """
    # An explicit encoding avoids the platform default (cp1252 on this Windows setup).
    with open('README.md', encoding="utf8") as fid:
        content = fid.read()
    return content
Thank you very much for your code. I have a question about it: why can't I reach the mAP reported in the SCDA paper?
Looking forward to your reply.
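The mAP I get with the provided pretrained model and dataset is extremely low, even though I followed the guide step by step. Also, for the evaluation on the Oxford dataset, it keeps reporting that the gt files cannot be found, even after switching to an absolute path. What could be going wrong?
[LoadFeature]: loading feature from /home/ubuntu/CBIR/PyRetri/data/features/caltech/query01/part_0.json...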
[LoadFeature] Success, total 4310 images,
feature names: dict_keys(['pool5_GeM'])
[LoadFeature]: loading feature from /home/ubuntu/CBIR/PyRetri/data/features/caltech/gallery01/part_0.json...
[LoadFeature] Success, total 4367 images,
feature names: dict_keys(['pool5_GeM'])
[LoadFeature] Success, total 4367 images,
feature names: dict_keys(['pool5_GeM'])
--------------- Retrieval Evaluation ------------
mAP: 1.1
R@1: 0.0 R@2: 0.1 R@4: 0.1 R@8: 0.3
How do I prepare the custom dataset?
I use the correct model, vgg16_hybrid1365.pt, but the mAP is very low.
Why is that?
$python3 main/index.py -cfg configs/oxford.yaml
[LoadFeature]: loading feature from data/features/oxford/query/part_0.json...
[LoadFeature] Success, total 55 images,
feature names: dict_keys(['pool5_GAP'])
[LoadFeature]: loading feature from data/features/oxford/gallery/part_1.json...
[LoadFeature]: loading feature from data/features/oxford/gallery/part_0.json...
[LoadFeature] Success, total 5063 images,
feature names: dict_keys(['pool5_GAP'])
[LoadFeature]: loading feature from data/features/paris/part_1.json...
[LoadFeature]: loading feature from data/features/paris/part_0.json...
[LoadFeature] Success, total 6417 images,
feature names: dict_keys(['pool5_GAP'])
--------------- Retrieval Evaluation ------------
mAP: 1.1
R@1: 5.5 R@2: 5.5 R@4: 5.5 R@8: 12.7
If I want to use the original feature output for image retrieval evaluation as a baseline, that is, without the Feature Representation and Post-processing steps, how can I do that?
Thank you.
Hello, there are two questions about constructing Oxford5k dataset:
├── cbir
│   ├── oxford
│   │   ├── gt
│   │   │   ├── all_souls_1_good.txt
│   │   │   └── ···
│   │   └── images
│   │       ├── all_souls_000000.jpg
│   │       └── ···
│   └── pairs
│       ├── gt
│       │   ├── defense_1_good.txt
│       │   └── ···
│       └── images
│           ├── defense
│           │   ├── paris_defense_000000.jpg
│           │   └── ···
│           └── ···
Looking forward to your answer, thanks!
While attempting extraction through Windows Anaconda console:
python C:\Users\user\Documents\Projects\PyRetri\main\extract_feature.py -dj C:\Users\user\Documents\Projects\PyRetri\data_jsons\gallery.json -sp C:\Users\user\Documents\Projects\PyRetri\data\features -cfg configs/nft.yaml
Traceback (most recent call last):
  File "C:\Users\user\Documents\Projects\PyRetri\main\extract_feature.py", line 52, in <module>
    main()
  File "C:\Users\user\Documents\Projects\PyRetri\main\extract_feature.py", line 40, in main
    dataset = build_folder(args.data_json, cfg.datasets)
  File "C:\Users\user\Anaconda3\lib\site-packages\pyretri-0.1.0+unknown-py3.8.egg\pyretri\datasets\builder.py", line 61, in build_folder
    trans = build_transformers(cfg.transformers)
  File "C:\Users\user\Anaconda3\lib\site-packages\pyretri-0.1.0+unknown-py3.8.egg\pyretri\datasets\builder.py", line 45, in build_transformers
    transformers.append(simple_build(name, cfg, TRANSFORMERS))
  File "C:\Users\user\Anaconda3\lib\site-packages\pyretri-0.1.0+unknown-py3.8.egg\pyretri\utils\builder.py", line 42, in simple_build
    assert name in registry
AssertionError
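The traceback ends in a bare `assert name in registry`, which suggests that a name listed in the config is not registered. A generic illustration of a registry lookup that reports the offending name and the available ones is sketched below. The class `Registry`, the function `build_checked`, and the transformer `ToyResize` are hypothetical and not PyRetri code.

```python
class Registry(dict):
    """Maps class names to classes; used here as a minimal stand-in."""

    def register(self, cls):
        self[cls.__name__] = cls
        return cls

TRANSFORMERS = Registry()

@TRANSFORMERS.register
class ToyResize:
    """Placeholder transformer for the demo."""

def build_checked(name, registry):
    """Instantiate registry[name], or fail with a message listing valid names."""
    if name not in registry:
        raise KeyError(f"'{name}' is not registered; available: {sorted(registry)}")
    return registry[name]()

obj = build_checked("ToyResize", TRANSFORMERS)
print(type(obj).__name__)
```

With a message like this, a misspelled or unsupported name in the yaml file would be visible directly in the error output.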
Hi,
Thanks for the code. Very cool work. I was trying to reimplement for the oxford 5k data.
I followed the following steps :
a) Set up the data structures/folders using the install.md file.
b) Generate the split files.
c) The next thing I am trying to do is generate the features using extract_features.py, and somehow my dataloader is empty. Can you help me here?
Things I tried:
folder; name on lines 12-13 in the oxford.yaml file. But that throws the AssertionError with respect to the registry.
Traceback (most recent call last):
KeyError: 'radcliffe_camera_000519'
For a given retrieval task, how can I train the model?