cluster-contrast-reid's Introduction

Python >= 3.5, PyTorch >= 1.6

Cluster Contrast for Unsupervised Person Re-Identification

The official repository for Cluster Contrast for Unsupervised Person Re-Identification. We achieve state-of-the-art performance on unsupervised learning tasks for object re-ID, including person re-ID and vehicle re-ID.

Our unified framework (see the framework figure in the original repository).

Updates

11/19/2021

  1. Memory dictionary update changed from batch-hard update to momentum update. The batch-hard update is sensitive to its hyperparameters, so achieving good results requires tuning many parameters, which is not robust enough. (A minimal sketch of the momentum update follows this list.)

  2. Added results for the InfoMap clustering algorithm. Compared with the DBSCAN clustering algorithm, it achieves better results, and our experiments show it is also more robust across datasets.
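
For intuition, here is a minimal sketch of the momentum update of a cluster centroid (simplified from the CM_Hard code quoted in the issues below; the function name is ours, not the repository's):

import torch.nn.functional as F

def momentum_update(centroid, query_feature, momentum=0.1):
    # Blend the stored centroid with the new feature; the convention here
    # (as in the quoted code) weights the old centroid by `momentum`.
    centroid = momentum * centroid + (1 - momentum) * query_feature
    # Re-normalize so the centroid stays on the unit hypersphere.
    return F.normalize(centroid, dim=0)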

Notes

While running experiments, we found that some settings have a large impact on the results. We share them here to save others from common pitfalls when applying our method.

  1. The dataloader uses the RandomMultipleGallerySampler; see the code for the implementation details. We also provide a RandomMultipleGallerySamplerNoCam sampler, which can be used outside the ReID domain.

  2. Add batch normalization to the final output layer of the network; see the code for details (sketched after this list).

  3. We draw a total of P × Z images per mini-batch, where P is the number of categories (identities) and Z is the number of instances per category, so mini-batch size = P × Z. P is fixed at 16, and Z varies with the mini-batch size (see the sketch below).
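
A minimal sketch of notes 2 and 3 above (illustrative only; the variable names are ours, not the repository's):

import torch
import torch.nn as nn

# Note 2: batch normalization on the final output embedding.
embedding_dim = 2048                    # ResNet-50 global feature dimension
bn_neck = nn.BatchNorm1d(embedding_dim)

# Note 3: mini-batch composition P x Z.
P = 16                                  # identities (categories) per mini-batch, fixed
batch_size = 256                        # -b 256 in the training commands below
Z = batch_size // P                     # instances per identity; 16 here
assert P * Z == batch_size

features = bn_neck(torch.randn(batch_size, embedding_dim))  # shape (256, 2048)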

Requirements

Installation

git clone https://github.com/alibaba/cluster-contrast-reid.git
cd ClusterContrast
python setup.py develop

Prepare Datasets

cd examples && mkdir data

Download the person datasets Market-1501, MSMT17, PersonX, and DukeMTMC-reID, and the vehicle dataset VeRi-776 from aliyun. Then unzip them under a directory tree like:

ClusterContrast/examples/data
├── market1501
│   └── Market-1501-v15.09.15
├── msmt17
│   └── MSMT17_V1
├── personx
│   └── PersonX
├── dukemtmcreid
│   └── DukeMTMC-reID
└── veri
    └── VeRi

Prepare ImageNet Pre-trained Models for IBN-Net

When training with the IBN-ResNet backbone, you need to download the ImageNet-pretrained model from this link and save it under examples/pretrained/.

ImageNet-pretrained weights for ResNet-50 are downloaded automatically by the Python script.
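
For reference, the download happens through a standard torchvision call roughly like the one below (a sketch; the repository's model code wraps this internally):

import torchvision.models as models

# Downloads and caches the ImageNet-pretrained weights on first use.
backbone = models.resnet50(pretrained=True)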

Training

We use 4 GTX 2080 Ti GPUs for training. For more parameter configurations, please check run_code.sh.

Examples:

Market-1501:

  1. Using DBSCAN:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d market1501 --iters 200 --momentum 0.1 --eps 0.6 --num-instances 16
  2. Using InfoMap:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl_infomap.py -b 256 -a resnet50 -d market1501 --iters 200 --momentum 0.1 --eps 0.5 --k1 15 --k2 4 --num-instances 16

MSMT17:

  1. Using DBSCAN:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d msmt17 --iters 400 --momentum 0.1 --eps 0.6 --num-instances 16
  2. Using InfoMap:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl_infomap.py -b 256 -a resnet50 -d msmt17 --iters 400 --momentum 0.1 --eps 0.5 --k1 15 --k2 4 --num-instances 16

DukeMTMC-reID:

  1. Using DBSCAN:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d dukemtmcreid --iters 200 --momentum 0.1 --eps 0.6 --num-instances 16
  2. Using InfoMap:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl_infomap.py -b 256 -a resnet50 -d dukemtmcreid --iters 200 --momentum 0.1 --eps 0.5 --k1 15 --k2 4 --num-instances 16

VeRi-776:

  1. Using DBSCAN:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d veri --iters 400 --momentum 0.1 --eps 0.6 --num-instances 16 --height 224 --width 224
  2. Using InfoMap:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl_infomap.py -b 256 -a resnet50 -d veri --iters 400 --momentum 0.1 --eps 0.5 --k1 15 --k2 4 --num-instances 16 --height 224 --width 224

Evaluation

We use 1 GTX 2080 Ti GPU for testing. Note that:

  • use --width 128 --height 256 (default) for person datasets, and --height 224 --width 224 for vehicle datasets;

  • use -a resnet50 (default) for the backbone of ResNet-50, and -a resnet_ibn50a for the backbone of IBN-ResNet.

To evaluate the model, run:

CUDA_VISIBLE_DEVICES=0 \
python examples/test.py \
  -d $DATASET --resume $PATH

Some examples:

### Market-1501 ###
CUDA_VISIBLE_DEVICES=0 \
python examples/test.py \
  -d market1501 --resume logs/spcl_usl/market_resnet50/model_best.pth.tar
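
A hypothetical vehicle-dataset example combining the flags above (the checkpoint path placeholder is illustrative, not an actual file from this repository):

### VeRi-776 ###
CUDA_VISIBLE_DEVICES=0 \
python examples/test.py \
  -d veri --height 224 --width 224 --resume $PATH_TO_VERI_CHECKPOINT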

Results

(Results figure omitted; see the paper for the full result tables.)

You can download the models reported in the paper from aliyun.

Citation

If you find this code useful for your research, please cite our paper:

@article{dai2021cluster,
  title={Cluster Contrast for Unsupervised Person Re-Identification},
  author={Dai, Zuozhuo and Wang, Guangyuan and Zhu, Siyu and Yuan, Weihao and Tan, Ping},
  journal={arXiv preprint arXiv:2103.11568},
  year={2021}
}

Acknowledgements

Thanks to Yixiao Ge for open-sourcing the excellent work SpCL.


cluster-contrast-reid's Issues

Dataset

Can I use my own dataset to train and test the model rather than the provided datasets?

The backward code in the cm module does not seem to take effect

cluster-contrast-reid-main\clustercontrast\models\cm.py
Lines 21-25 appear to implement the paper's operation of updating the cluster-centroid feature vectors, but when I set a breakpoint at line 22 while debugging, execution never seems to reach this code. What might be the problem?

@staticmethod
def backward(ctx, grad_outputs):
    inputs, targets = ctx.saved_tensors
    grad_inputs = None
    if ctx.needs_input_grad[0]:
        grad_inputs = grad_outputs.mm(ctx.features)

The relevant debugging parameters are as follows:

    parser.add_argument('-d', '--dataset', type=str, default='market1501',
                        choices=datasets.names())
    parser.add_argument('-b', '--batch-size', type=int, default=32)
    parser.add_argument('-j', '--workers', type=int, default=4)
    parser.add_argument('--height', type=int, default=256, help="input height")
    parser.add_argument('--width', type=int, default=128, help="input width")
    parser.add_argument('--num-instances', type=int, default=4,
                        help="each minibatch consist of "
                             "(batch_size // num_instances) identities, and "
                             "each identity has num_instances instances, "
                             "default: 0 (NOT USE)")
    # cluster
    parser.add_argument('--eps', type=float, default=0.6,
                        help="max neighbor distance for DBSCAN")
    parser.add_argument('--eps-gap', type=float, default=0.02,
                        help="multi-scale criterion for measuring cluster reliability")
    parser.add_argument('--k1', type=int, default=30,
                        help="hyperparameter for jaccard distance")
    parser.add_argument('--k2', type=int, default=6,
                        help="hyperparameter for jaccard distance")

    # model
    parser.add_argument('-a', '--arch', type=str, default='resnet50',
                        choices=models.names())
    parser.add_argument('--features', type=int, default=0)
    parser.add_argument('--dropout', type=float, default=0)
    parser.add_argument('--momentum', type=float, default=0.2,
                        help="update momentum for the hybrid memory")
    # optimizer
    parser.add_argument('--lr', type=float, default=0.00035,
                        help="learning rate")
    parser.add_argument('--weight-decay', type=float, default=5e-4)
    parser.add_argument('--epochs', type=int, default=50)
    parser.add_argument('--iters', type=int, default=400)
    parser.add_argument('--step-size', type=int, default=20)
    # training configs
    parser.add_argument('--seed', type=int, default=1)
    parser.add_argument('--print-freq', type=int, default=10)
    parser.add_argument('--eval-step', type=int, default=10)
    parser.add_argument('--temp', type=float, default=0.05,
                        help="temperature for scaling contrastive loss")
    # path
    working_dir = osp.dirname(osp.abspath(__file__))
    parser.add_argument('--data-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'data'))
    parser.add_argument('--logs-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'logs'))
    parser.add_argument('--pooling-type', type=str, default='gem')
    parser.add_argument('--use-hard', action="store_true")
    parser.add_argument('--no-cam',  action="store_true")

Question about memory updating in q_hard

I followed your code without the hard-instance update and got 82.7% mAP, as in your paper. However, when I choose the use-hard option, the performance decreases to 79.4%. The CM_Hard code in cm.py is as follows:
import collections

import numpy as np
import torch
from torch import autograd


class CM_Hard(autograd.Function):

    @staticmethod
    def forward(ctx, inputs, targets, features, momentum):
        ctx.features = features
        ctx.momentum = momentum
        ctx.save_for_backward(inputs, targets)
        outputs = inputs.mm(ctx.features.t())
        return outputs

    @staticmethod
    def backward(ctx, grad_outputs):
        inputs, targets = ctx.saved_tensors
        grad_inputs = None
        if ctx.needs_input_grad[0]:
            grad_inputs = grad_outputs.mm(ctx.features)  # [32, 2048]

        # Group the batch features by their cluster index.
        batch_centers = collections.defaultdict(list)
        for instance_feature, index in zip(inputs, targets.tolist()):
            batch_centers[index].append(instance_feature)

        for index, features in batch_centers.items():
            distances = []
            for feature in features:
                distance = feature.unsqueeze(0).mm(ctx.features[index].unsqueeze(0).t())[0][0]
                distances.append(distance.cpu().numpy())
            # argmin over the similarities picks the least-similar (hardest) instance.
            median = np.argmin(np.array(distances))
            ctx.features[index] = ctx.features[index] * ctx.momentum + (1 - ctx.momentum) * features[median]
            ctx.features[index] /= ctx.features[index].norm()

        return grad_inputs, None, None, None


def cm_hard(inputs, indexes, features, momentum=0.5):
    return CM_Hard.apply(inputs, indexes, features, torch.Tensor([momentum]).to(inputs.device))

Could you explain how to demonstrate the advantage of CM_Hard over the baseline CM from SpCL?

Why is the performance of the model trained with multiple GPUs much better than that of the model trained with a single GPU?

Thank you for your contribution to USL. I have some questions about your paper. We find a huge performance gap between the model trained with a single GPU (one Tesla V100) and the one trained with multiple GPUs (four Tesla V100s). Specifically, Cluster Contrast trained with a single GPU achieves only 78.1% Rank-1 and 59.1% mAP on Market-1501, while with multiple GPUs it achieves 93.0% Rank-1 and 82.6% mAP. I would appreciate it if you could suggest some possible reasons.

Why can't the result be reproduced?

The paper's result is produced with average pooling, but the default hyperparameter in the code is GEM pooling. When I replaced it with average pooling, the mAP of ResNet-50 on Market-1501 was only 80.9%, lower than the 82.1% in the paper. I would appreciate it if you could provide the command to reproduce the paper's result.

faiss-gpu out of memory

Hello, because the amount of data I use is large, faiss-gpu always runs out of memory during clustering. I don't know how to deal with it; could you offer some guidance?

mAP on PersonX is only about 36.0

Hi, what training parameters/command did you use for PersonX? Could you share them? Does anything need extra configuration? The mAP I measure is only about 36.0, far from the 84.8 reported in the paper.

About the batch size and iteration number settings

For the Market-1501 dataset, I tried setting the batch size to 128 or 64, with the iteration number and other settings the same as in run.sh, and the results are 68% mAP and 86% Rank-1, which does not match the results in your paper. If I need to lower the batch size, are there other hyperparameters that should be adjusted accordingly? I also wonder why you set the iteration number to 200 or 400 manually instead of directly iterating with for i, inputs in enumerate(data_loader).

Different methods in memory update

Hi. I read the two memory-update methods in the two versions of your paper. It looks like updating with the hardest sample gives better performance, but the selection of hyperparameters is tricky. The latest version of the paper proposes the CM method, which considers all query images and may be more robust. Do you think it is possible to mix these two methods, e.g. applying them randomly during training, or using one method for the first n epochs and the other for the remaining epochs? Thanks.

Can't reproduce MSMT17 results

Thanks for your work!
I reproduced the results on Market-1501 and Duke, but I have some problems with MSMT17.
I trained MSMT17 on 4 GPUs as follows:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d msmt17 --iters 400 --momentum 0.1 --eps 0.7 --num-instances 16
My best mAP on MSMT17 was about 20, which is lower than the 33.3 mentioned in the paper. What's the problem?
Could you tell me which specific hyperparameters to adjust to improve the performance?

Duke on 1 GTX 2080 Ti: the mAP cannot reach 72.8%

I ran the Duke dataset on one GTX 2080 Ti and set the parameters as below:
batch_size=64
num_instances=4
iters=200
pooling-type=gem
momentum=0.1
use-hard
lr=0.00035
The other parameters are the same as before; I finally got 70.3% mAP (< the 72.8% mAP in the paper).
Is there any chance of reaching 72.8% with only one GTX 2080 Ti?

A question about use-hard

Hello, and thanks for your work!
On Market-1501 the code indeed performs as described in your paper. However, on VeRi-776, training with IBN using the VeRi command from run_code.sh gives me 43.0 mAP, slightly below the 43.6 in your paper, and adding the --use_hard flag actually makes the result worse than 43.0. Do other hyperparameters need to change when this flag is added? Were the numbers in your paper obtained with --use_hard? Thanks for taking the time to answer.

Parameter configuration for the Duke and MSMT17 datasets, and memory initialization

Hi, thanks for releasing the implementation code of your method. I have two questions about the code:

  1. From the other issues, I can see that the hyperparameter configuration is given for Market-1501. Following it, I was able to obtain the same result as your paper (on Market-1501 with ResNet-50 and average pooling: Rank-1 = 92.8%, mAP = 83.1%). However, when I run the code on DukeMTMC-reID and MSMT17 following the guidance of the README, the results are low and cannot match the performance of the paper. Could you provide the specific parameter values (such as eps) needed to achieve the reported performance on those two datasets?

  2. The paper introduces random sample initialization and hard sample update for the memory bank. But from line 175 in examples/cluster_contrast_train_usl.py, it seems the mean feature is adopted for initialization instead of a random feature. So I want to know whether the paper's full model uses mean-feature or random-feature initialization. Thanks!

Warning: Leaking Caffe2 thread-pool after fork.

Am I the only one who sees the warning "Leaking Caffe2 thread-pool after fork" when running this project?
How can I solve it?

[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [6][10/200]      Time 0.399 (0.484)      Data 0.000 (0.075)      Loss 0.296 (0.452)
Epoch: [6][20/200]      Time 0.405 (0.447)      Data 0.000 (0.038)      Loss 0.445 (0.449)
Epoch: [6][30/200]      Time 0.400 (0.432)      Data 0.000 (0.025)      Loss 0.228 (0.433)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [6][40/200]      Time 0.403 (0.457)      Data 0.001 (0.049)      Loss 2.377 (0.696)
Epoch: [6][50/200]      Time 0.403 (0.446)      Data 0.000 (0.039)      Loss 1.873 (0.908)
Epoch: [6][60/200]      Time 0.411 (0.439)      Data 0.000 (0.033)      Loss 1.986 (1.066)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Epoch: [6][70/200]      Time 0.401 (0.450)      Data 0.000 (0.043)      Loss 2.278 (1.202)
Epoch: [6][80/200]      Time 0.402 (0.444)      Data 0.001 (0.038)      Loss 1.703 (1.279)
Epoch: [6][90/200]      Time 0.401 (0.440)      Data 0.000 (0.034)      Loss 2.213 (1.362)
...

A small question about memory initialization

Thanks for your work on USL. I have a small question about the paper. In Section 3.2 (Memory Initialization), you write that the memory is initialized with a randomly selected instance from each cluster. Since clustering contains noise, that random instance may itself be a noisy sample that should not represent its cluster. Doesn't this affect the initialization? I only found an analysis of noisy samples for the hardest-sample memory update, not for initialization. Apologies if I read carelessly; I hope you can find time to answer.
Thanks!

Question about the ablation studies

Hi, I just read the Cluster Contrast paper on arXiv and found it to be creative work. However, I am still confused by the ablation studies in Table 4: how do you fix the cluster size to a constant number, given that it cannot be pre-defined in the DBSCAN clustering method?

Can't reproduce the MSMT17 result of 33.3 with average pooling

The code uses GEM as the pooling layer, and I only got 29.2 with the CM setting. Is that normal?
More importantly, I can't reach the reported result when I use average pooling, where I get about 20.
Besides, using GEM with CM_Hard also gives a worse result.

Why wasn't this paper accepted?

Hi, thanks for your excellent work and codes.

I like this work very much.
It has a clear and simple motivation, and superior performance.
So I'm a little confused about why this paper hasn't appeared at a conference or in a journal.
Was it submitted to some conference but rejected? If so, why?

I'm sorry if I offended you.

Learning with a source domain?

Did you try using your method with a source domain, like SpCL? For example, using DukeMTMC as the source domain to train on Market?

Different results across repeated runs of the experiment

Hello!
With the same configuration and device, and with the random seed fixed, why do two runs still produce different results, sometimes with a large gap (the mAP can fluctuate by 1.5 points)? Is this normal? Is there a way to make it deterministic?
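
For context, the commonly used PyTorch seeding and cuDNN determinism settings are sketched below. These are generic settings rather than code from this repository, and faiss-based clustering and multi-worker data loading can still introduce run-to-run variation:

import random

import numpy as np
import torch

def set_seed(seed=1):
    # Seed every random number generator the training pipeline touches.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade cuDNN speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False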

Cannot reproduce the reported MSMT17 metrics

Great work! But when I recently tried to reproduce it on MSMT17, I only got about 28% mAP, short of the 33% reported in the paper. The experimental settings follow the defaults, as below, and the dataset was downloaded from the link given in this project. What could be the reason?
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet50 -d msmt17 --iters 400 --momentum 0.1 --eps 0.6 --num-instances 16

Runtime error

The following error occurs at runtime (faiss-gpu==1.6.4); how can I solve it?
AttributeError: module 'faiss' has no attribute 'cast_integer_to_idx_t_ptr'

Sorry that we didn't provide the CM_Hard command in the README. I will update it.

For this command:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet_ibn50a -d market1501 --iters 400 --momentum 0.1 --eps 0.4 --num-instances 16 --pooling-type gem --use-hard --logs-dir /data0/developer/cluster-contrast/examples/logs/gem-hard
You will get this result:
Mean AP: 87.0%
CMC Scores:
top-1 94.6%
top-5 98.2%
top-10 98.8%
which is much higher than the paper's result since we use GEM pooling.

If you use average pooling like:
CUDA_VISIBLE_DEVICES=0,1,2,3 python examples/cluster_contrast_train_usl.py -b 256 -a resnet_ibn50a -d market1501 --iters 400 --momentum 0.1 --eps 0.4 --num-instances 16 --pooling-type avg --use-hard
You will get a result similar to or a little higher than our paper's:
Mean AP: 84.5%
CMC Scores:
top-1 93.6%
top-5 97.5%
top-10 98.4%

Thanks!

Originally posted by @daizuozhuo in #2 (comment)
