
spcl's Introduction

Python >=3.5 PyTorch >=1.0

Self-paced Contrastive Learning (SpCL)

The official repository for Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID, accepted at NeurIPS 2020. SpCL achieves state-of-the-art performance on both unsupervised domain adaptation and unsupervised learning tasks for object re-ID, including person re-ID and vehicle re-ID.

[figure: SpCL framework]

Updates

[2020-10-13] All trained models for the camera-ready version have been updated, see Trained Models for details.

[2020-09-25] SpCL was accepted by NeurIPS on the condition that experiments on the DukeMTMC-reID dataset be removed, since the dataset has been taken down and should no longer be used.

[2020-07-01] We refactored the code to support distributed training, stronger performance, and more features. Please see OpenUnReID.

Requirements

Installation

git clone https://github.com/yxgeee/SpCL.git
cd SpCL
python setup.py develop
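
Note that python setup.py develop installs the package in editable mode; on modern pip the equivalent command (assuming the standard setuptools layout used here) is:

pip install -e .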

Prepare Datasets

cd examples && mkdir data

Download the person datasets Market-1501, MSMT17, and PersonX, and the vehicle datasets VehicleID, VeRi-776, and VehicleX, then unzip them under the directory tree below:

SpCL/examples/data
├── market1501
│   └── Market-1501-v15.09.15
├── msmt17
│   └── MSMT17_V1
├── personx
│   └── PersonX
├── vehicleid
│   └── VehicleID -> VehicleID_V1.0
├── vehiclex
│   └── AIC20_ReID_Simulation -> AIC20_track2/AIC20_ReID_Simulation
└── veri
    └── VeRi -> VeRi_with_plate

Prepare ImageNet Pre-trained Models for IBN-Net

When training with the IBN-ResNet backbone, you need to download the ImageNet-pretrained model from this link and save it under logs/pretrained/.

mkdir logs && cd logs
mkdir pretrained

The file tree should be

SpCL/logs
└── pretrained
    └── resnet50_ibn_a.pth.tar

ImageNet-pretrained models for ResNet-50 will be downloaded automatically by the Python script.
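
Under the hood this is just torchvision's pretrained-weight loading; a minimal sketch of what happens (the exact wrapper in this repo may differ) is:

import torchvision.models as tv_models

# downloads the ImageNet weights into the torch cache (~/.cache/torch) on first use
backbone = tv_models.resnet50(pretrained=True)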

Training

We utilize 4 GTX-1080TI GPUs for training. Note that

  • The training for SpCL is end-to-end, which means that no source-domain pre-training is required.
  • use --iters 400 (default) for Market-1501 and PersonX datasets, and --iters 800 for MSMT17, VeRi-776, VehicleID and VehicleX datasets;
  • use --width 128 --height 256 (default) for person datasets, and --height 224 --width 224 for vehicle datasets;
  • use -a resnet50 (default) for the backbone of ResNet-50, and -a resnet_ibn50a for the backbone of IBN-ResNet (see the combined example below).
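
For instance, combining these flags for a vehicle UDA task with the IBN backbone (a sketch assembled from the flags above, not an official example):

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_uda.py -a resnet_ibn50a --iters 800 --height 224 --width 224 \
  -ds vehicleid -dt veri --logs-dir logs/spcl_uda/vehicleid2veri_resnet_ibn50a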

Unsupervised Domain Adaptation

To train the model(s) in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_uda.py \
  -ds $SOURCE_DATASET -dt $TARGET_DATASET --logs-dir $PATH_OF_LOGS

Some examples:

### PersonX -> Market-1501 ###
# all default settings are OK
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_uda.py \
  -ds personx -dt market1501 --logs-dir logs/spcl_uda/personx2market_resnet50

### Market-1501 -> MSMT17 ###
# use all default settings except for iters=800
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_uda.py --iters 800 \
  -ds market1501 -dt msmt17 --logs-dir logs/spcl_uda/market2msmt_resnet50

### VehicleID -> VeRi-776 ###
# use all default settings except for iters=800, height=224 and width=224
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_uda.py --iters 800 --height 224 --width 224 \
  -ds vehicleid -dt veri --logs-dir logs/spcl_uda/vehicleid2veri_resnet50

Unsupervised Learning

To train the model(s) in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_usl.py \
  -d $DATASET --logs-dir $PATH_OF_LOGS

Some examples:

### Market-1501 ###
# all default settings are OK
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_usl.py \
  -d market1501 --logs-dir logs/spcl_usl/market_resnet50

### MSMT17 ###
# use all default settings except for iters=800
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_usl.py --iters 800 \
  -d msmt17 --logs-dir logs/spcl_usl/msmt_resnet50

### VeRi-776 ###
# use all default settings except for iters=800, height=224 and width=224
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python examples/spcl_train_usl.py --iters 800 --height 224 --width 224 \
  -d veri --logs-dir logs/spcl_usl/veri_resnet50

Evaluation

We utilize 1 GTX-1080TI GPU for testing. Note that

  • use --width 128 --height 256 (default) for person datasets, and --height 224 --width 224 for vehicle datasets;
  • use --dsbn for domain adaptive models, and add --test-source if you want to test on the source domain;
  • use -a resnet50 (default) for the backbone of ResNet-50, and -a resnet_ibn50a for the backbone of IBN-ResNet.

Unsupervised Domain Adaptation

To evaluate the domain adaptive model on the target-domain dataset, run:

CUDA_VISIBLE_DEVICES=0 \
python examples/test.py --dsbn \
  -d $DATASET --resume $PATH_OF_MODEL

To evaluate the domain adaptive model on the source-domain dataset, run:

CUDA_VISIBLE_DEVICES=0 \
python examples/test.py --dsbn --test-source \
  -d $DATASET --resume $PATH_OF_MODEL

Some examples:

### Market-1501 -> MSMT17 ###
# test on the target domain
CUDA_VISIBLE_DEVICES=0 \
python examples/test.py --dsbn \
  -d msmt17 --resume logs/spcl_uda/market2msmt_resnet50/model_best.pth.tar
# test on the source domain
CUDA_VISIBLE_DEVICES=0 \
python examples/test.py --dsbn --test-source \
  -d market1501 --resume logs/spcl_uda/market2msmt_resnet50/model_best.pth.tar

Unsupervised Learning

To evaluate the model, run:

CUDA_VISIBLE_DEVICES=0 \
python examples/test.py \
  -d $DATASET --resume $PATH

Some examples:

### Market-1501 ###
CUDA_VISIBLE_DEVICES=0 \
python examples/test.py \
  -d market1501 --resume logs/spcl_usl/market_resnet50/model_best.pth.tar

Trained Models

[image: trained models and results]

You can download the above models in the paper from [Google Drive] or [Baidu Yun](password: w3l9).

Citation

If you find this code useful for your research, please cite our paper:

@inproceedings{ge2020selfpaced,
    title={Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID},
    author={Yixiao Ge and Feng Zhu and Dapeng Chen and Rui Zhao and Hongsheng Li},
    booktitle={Advances in Neural Information Processing Systems},
    year={2020}
}

spcl's People

Contributors

ckmessi, yxgeee


spcl's Issues

low result with only 1 GPU

Hi Yixiao, I tried to run the code with only one 2080 Ti but got a very low result (e.g., a 20% mAP drop on Market-1501). What causes the performance drop? I can't understand it.
Could you please explain?

The pseudo_labels might be discontinuous

Thanks for your great sharing. When I'm debugging the code, I found that the pseudo_labels might be discontinuous.
If I let:
len_ = len(set(pseudo_labels.numpy().tolist()))
min_ = min(set(pseudo_labels.numpy().tolist()))
max_ = max(set(pseudo_labels.numpy().tolist()))
sometimes max_ is not equal to (min_ + len_ - 1).

Thanks again for sharing. It would be perfect if you could solve this.
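
If contiguous ids are needed, a minimal remapping sketch (illustrative only, not part of the repo) could look like:

import torch

def relabel_contiguous(pseudo_labels):
    # map possibly gapped cluster ids onto 0..K-1, preserving order
    uniques = sorted(set(pseudo_labels.tolist()))
    mapping = {old: new for new, old in enumerate(uniques)}
    return torch.tensor([mapping[l] for l in pseudo_labels.tolist()], dtype=torch.long)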

Problem with the MSMT dataset

Hello, I think there is something wrong with my MSMT17 dataset, with which I get mAP 6.5% and rank-1 15.3%.
I believe there must be something different. Would you provide your MSMT17 dataset? Thank you very much.

number of instances in each mini batch

Hi Yixiao, I have run several experiments with your code and found that when setting the number of instances to 0, the accuracy drops a lot. Have you observed this before?
Theoretically, the sampler shouldn't be so crucial to the re-ID task.
Could you please explain?

Question on the baseline model constructed from your code

Hi, I constructed a clustering-based unsupervised baseline from your code which obtains much higher performance (Market: rank-1=91.1%, mAP=79.5%) than the pure unsupervised version in your paper (w/o source data, Market: rank-1=88.1%, mAP=73.1%), so I'm a little confused and curious: have you ever experimented with the baseline described below?

Using the same network and hyper-parameter setting, the baseline differs from your method in the following:

  1. Before each epoch, the image features $f_i$ are extracted by the current model and used for DBSCAN clustering (without the self-paced independence and compactness refinement);
  2. A cluster-level memory bank of shape (num_clusters, num_features) is re-constructed and initialized with the averaged image feature $f_i$ of each cluster (see the sketch below);
  3. The model is trained with outliers removed from the training set.

Since such a memory-based clustering-and-fine-tuning baseline seems very naive, I'm confused why it performs so well. Any advice or insight would be much appreciated~
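
For concreteness, a minimal sketch of step 2 above, the cluster-level memory initialization (names are illustrative, not the repo's API):

import torch
import torch.nn.functional as F

def init_cluster_memory(features, pseudo_labels):
    # features: (num_imgs, num_features); pseudo_labels: (num_imgs,), -1 marks outliers
    cluster_ids = sorted(set(pseudo_labels.tolist()) - {-1})
    centers = torch.stack([features[pseudo_labels == cid].mean(0) for cid in cluster_ids])
    return F.normalize(centers, dim=1)  # (num_clusters, num_features)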

Performance improvements compared to the original version

Hi, I noticed that the USL performance on the Market-1501 dataset in your original arXiv version (mAP: 72.6%) is slightly lower than in your NeurIPS-2020 camera-ready version (mAP: 73.1%). Did you use some extra tricks? And what contributes to the striking performance improvements of the OpenUnReID SpCL+ version over your original SpCL version? Thank you!

Impact of Domain Specific BN and RandomMultipleGallerySampler

Hi, I am wondering how much Domain-Specific BN and RandomMultipleGallerySampler contribute to the final result. You mentioned that DSBN is crucial in OpenUnReID, but I didn't find a fair comparison in the papers.

On the other hand, I noticed you now rank 1st in the VisDA-2020 challenge, which proves your method is superior in domain adaptation. Is SpCL the baseline method you used in the VisDA challenge?

Thank you!

An error occurred when creating a pseudo label

==> Create pseudo labels for unlabeled target domain with self-paced policy
Computing jaccard distance...
Traceback (most recent call last):
  File "examples/spcl_train_uda.py", line 341, in <module>
    main()
  File "examples/spcl_train_uda.py", line 109, in main
    main_worker(args)
  File "examples/spcl_train_uda.py", line 175, in main_worker
    rerank_dist = compute_jaccard_distance(target_features, k1=args.k1, k2=args.k2)
  File "/home/amax/SpCL/spcl/utils/faiss_rerank.py", line 41, in compute_jaccard_distance
    _, initial_rank = search_raw_array_pytorch(res, target_features, target_features, k1)
  File "/home/amax/SpCL/spcl/utils/faiss_utils.py", line 82, in search_raw_array_pytorch
    I_ptr = swig_ptr_from_LongTensor(I)
  File "/home/amax/SpCL/spcl/utils/faiss_utils.py", line 15, in swig_ptr_from_LongTensor
    return faiss.cast_integer_to_long_ptr(
AttributeError: module 'faiss' has no attribute 'cast_integer_to_long_ptr'

Has anybody else met this problem?

bruteForceKnn is deprecated; call bfKnn instead
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::runL2Norm(faiss::gpu::Tensor<T, 2, true, IndexType>&, bool, faiss::gpu::Tensor<float, 1, true, IndexType>&, bool, cudaStream_t) [with T = float; TVec = float4; IndexType = int; cudaStream_t = CUstream_st*] at gpu/impl/L2Norm.cu:292; details: CUDA error 11 invalid argument

the update of the hybrid memory

How can I debug the hybrid memory's features? I just changed the momentum update function to the trivial ctx.features = ctx.features + 1, but when I debug hm.py, self.features never changes.

[CUDA 11.2] Experiment on RTX 3090 with faiss-gpu 1.6.5

After installing all dependencies according to the README, I have encountered several errors.

By now most of them have been solved.

If you meet the same errors, I hope you can find some useful reference here.

My experiment env-info:

GPU : RTX 3090
CUDA : 11.2
Python : 3.8
Pytorch : 1.8.0+cu111
Scikit-learn : 0.24.1

Error 1

When I used faiss-gpu 1.6.3 under CUDA 10.2, the process would sometimes be killed while computing the Jaccard distance (abnormal memory usage).

Solution

  • Upgrade scikit-learn to 0.20.2+.

  • Change n_jobs=-1 to 2 or 4.

#184

cluster = DBSCAN(eps=eps, min_samples=4, metric='precomputed', n_jobs=4)
cluster_tight = DBSCAN(eps=eps_tight, min_samples=4, metric='precomputed', n_jobs=4)
cluster_loose = DBSCAN(eps=eps_loose, min_samples=4, metric='precomputed', n_jobs=4)

Error 2

Abnormal GPU usage

When I used faiss-gpu 1.6.3 under CUDA 11.2, training could start but soon hit a CUDA error.

That's because faiss-gpu 1.6.3 is not compatible with CUDA 11.2.

Solution

  • Upgrade faiss-gpu to 1.6.5 by using:
conda install -c conda-forge faiss=1.6.5=py38h60a57df_0_cuda
  • Then I got the traceback "module 'faiss' has no attribute 'cast_integer_to_long_ptr'", which I solved by:

#L15

#replacing "cast_integer_to_long_ptr" by "cast_integer_to_idx_t_ptr"
def swig_ptr_from_LongTensor(x):
    assert x.is_contiguous()
    assert x.dtype == torch.int64, 'dtype=%s' % x.dtype
    # error
    return faiss.cast_integer_to_idx_t_ptr(
        x.storage().data_ptr() + x.storage_offset() * 8)
  • In short, thanks for yxgeee's great work. 👍

Hello, has anyone met this error?

Computing jaccard distance...
bruteForceKnn is deprecated; call bfKnn instead
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::runL2Norm(faiss::gpu::Tensor<T, 2, true, IndexType>&, bool, faiss::gpu::Tensor<float, 1, true, IndexType>&, bool, cudaStream_t) [with T = float; TVec = float4; IndexType = int; cudaStream_t = CUstream_st*] at gpu/impl/L2Norm.cu:292; details: CUDA error 11 invalid argument
Aborted (core dumped)

env info:

torch 1.7.0
CUDA 11.1
2080 Ti

About the momentum update

Hello, when I step through the program in a debugger, I can never reach the location where the memory bank is updated (lines 28 to 32 of hm.py). Is there something wrong with my debugging method?

A question about len(cluster_R_indep)

In spcl_train_uda.py line 245:
pseudo_labels[i] = source_classes+len(cluster_R_indep)+outliers
Why len(cluster_R_indep)?
Is the purpose just to distinguish outliers from inliers? (It seems that changing len(cluster_R_indep) to any sufficiently large constant would not affect model performance.)
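
For reference, one reading of that label layout (an assumption inferred from the snippet, not an authoritative answer):

# Assumed pseudo-label layout in spcl_train_uda.py:
#   [0, source_classes)                   -> source-domain classes
#   [source_classes, source_classes + K)  -> target clusters, K = len(cluster_R_indep)
#   [source_classes + K, ...)             -> one unique id per un-clustered outlier
# Any offset >= the number of clusters would keep the outlier ids disjoint,
# which is consistent with the observation that a large constant also works.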

Question about the paper

Hi, in your paper "Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID", there is one particular place that confuses me a little.
[image: Table 5(a)]
In Table 5(a), there is a huge improvement in mAP and CMC when using the self-paced learning strategy. However, there is still a large gap between "Src. class + tgt. cluster (w/o self-paced)" and "Ours w/o R_comp & R_indep". Can you explain more about the self-paced learning strategy? Does it mean using R_comp & R_indep?

Error occurs when the UDA target dataset is too big

During UDA training, my target dataset's training split has 79,188 images and my server has 128 GB of memory, but training failed with the error Pytorch RuntimeError: [enforce fail at CPUAllocator.cpp:56] posix_memalign(&data, gAlignment, nbytes) == 0. 12 vs 0, i.e., the memory is not enough. So I found the most memory-hungry part of the code, rewrote it with numpy operations instead of torch.Tensor operations, and divided the whole computation into several chunks.
In spcl_train_uda.py:

        # select & cluster images as training set of this epochs
        pseudo_labels = cluster.fit_predict(rerank_dist) # shape: num_imgs
        pseudo_labels_tight = cluster_tight.fit_predict(rerank_dist)
        pseudo_labels_loose = cluster_loose.fit_predict(rerank_dist)
        num_ids = len(set(pseudo_labels)) - (1 if -1 in pseudo_labels else 0)
        num_ids_tight = len(set(pseudo_labels_tight)) - (1 if -1 in pseudo_labels_tight else 0)
        num_ids_loose = len(set(pseudo_labels_loose)) - (1 if -1 in pseudo_labels_loose else 0)

        # generate new dataset and calculate cluster centers
        def generate_pseudo_labels(cluster_id, num):
            labels = []
            outliers = 0
            for i, ((fname, _, cid), id) in enumerate(zip(sorted(dataset_target.train), cluster_id)):
                if id!=-1:
                    labels.append(source_classes+id)
                else:
                    labels.append(source_classes+num+outliers)
                    outliers += 1
            return torch.Tensor(labels).long()

        pseudo_labels = generate_pseudo_labels(pseudo_labels, num_ids)
        pseudo_labels_tight = generate_pseudo_labels(pseudo_labels_tight, num_ids_tight)
        pseudo_labels_loose = generate_pseudo_labels(pseudo_labels_loose, num_ids_loose)
# above code is not changed

        # compute_R_old is the old method to compute R_comp and R_indep
        def compute_R_old(pseudo_labels, pseudo_labels_tight, pseudo_labels_loose):
            def convert2tensor(label):
                if(isinstance(label, torch.Tensor)):
                    return label
                else:
                    return torch.from_numpy(label)
            pseudo_labels = convert2tensor(pseudo_labels)
            pseudo_labels_tight = convert2tensor(pseudo_labels_tight)
            pseudo_labels_loose = convert2tensor(pseudo_labels_loose)
            # compute R_indep and R_comp
            N = pseudo_labels.size(0)
            label_sim = pseudo_labels.expand(N, N).eq(pseudo_labels.expand(N, N).t()).float() # shape:[num_imgs, num_imgs]
            label_sim_tight = pseudo_labels_tight.expand(N, N).eq(pseudo_labels_tight.expand(N, N).t()).float()
            label_sim_loose = pseudo_labels_loose.expand(N, N).eq(pseudo_labels_loose.expand(N, N).t()).float()

            R_comp = 1-torch.min(label_sim, label_sim_tight).sum(-1)/torch.max(label_sim, label_sim_tight).sum(-1) # shape: num_imgs
            R_indep = 1-torch.min(label_sim, label_sim_loose).sum(-1)/torch.max(label_sim, label_sim_loose).sum(-1)
            assert((R_comp.min()>=0) and (R_comp.max()<=1))
            assert((R_indep.min()>=0) and (R_indep.max()<=1))
            return R_comp, R_indep

        # compute_R_divide is my chunked method to compute R_comp and R_indep
        # (requires `import math` and `import numpy as np` at the top of the script)
        def compute_R_divide(pseudo_labels, pseudo_labels_tight, pseudo_labels_loose):
            # compute in numpy chunks to avoid the '[enforce fail at CPUAllocator.cpp:56]' error
            def convert2numpy(label):
                if(isinstance(label, np.ndarray)):
                    return label
                else:
                    return label.numpy().astype(np.int32) # if is torch.Tensor
            def get_sub_label_sim(label_np, start, end):
                label_sim = np.expand_dims(label_np, 0).repeat(end - start, axis=0)
                label_sim_T = np.expand_dims(label_np[start:end], 0).repeat(len(label_np), axis=0).T
                return (label_sim == label_sim_T).astype(np.int32)

            pseudo_labels = convert2numpy(pseudo_labels)
            pseudo_labels_tight = convert2numpy(pseudo_labels_tight)
            pseudo_labels_loose = convert2numpy(pseudo_labels_loose)
            N = pseudo_labels.shape[0]
            divide_base = 15000  # this factor is determined intuitively
            divide = max(int((N/divide_base) * (N/divide_base)), 1)
            num_each = math.ceil(N / divide)
            for i in range(divide):
                start = i*num_each
                end = min((i+1)*num_each, N)
                label_sim_np = get_sub_label_sim(pseudo_labels, start, end)
                label_sim_tight_np = get_sub_label_sim(pseudo_labels_tight, start, end)
                label_sim_loose_np = get_sub_label_sim(pseudo_labels_loose, start, end)
                R_comp_np = 1 - (label_sim_np & label_sim_tight_np).sum(-1)/(label_sim_np | label_sim_tight_np).sum(-1)
                R_indep_np = 1 - (label_sim_np & label_sim_loose_np).sum(-1) / (label_sim_np | label_sim_loose_np).sum(-1)
                if(i==0):
                    R_COMP_np = R_comp_np
                    R_INDEP_np = R_indep_np
                else:
                    R_COMP_np = np.concatenate((R_COMP_np, R_comp_np), axis=-1)
                    R_INDEP_np = np.concatenate((R_INDEP_np, R_indep_np), axis=-1)  # fixed: the original post concatenated R_comp_np here

            R_comp = torch.from_numpy(R_COMP_np).float()
            R_indep = torch.from_numpy(R_INDEP_np).float()
            assert ((R_comp.min() >= 0) and (R_comp.max() <= 1))
            assert ((R_indep.min() >= 0) and (R_indep.max() <= 1))
            return R_comp, R_indep

        R_comp, R_indep = compute_R_divide(pseudo_labels, pseudo_labels_tight, pseudo_labels_loose)


This is just a workaround; if there is a better solution, I'd be glad to hear it.

Errors occurred when testing

While running examples/test.py, I hit some errors:
1. It failed at examples/test.py[line:70] model = models.create(args.arch, pretrained=False, num_features=args.features, dropout=args.dropout, num_classes=0); the workaround is to set pretrained=True.

2. It failed at examples/test.py[line:90] evaluator.evaluate(test_loader, dataset.query, dataset.gallery, cmc_flag=True, rerank=args.rerank) when args.rerank is True. The reason is AttributeError: 'tuple' object has no attribute 'numpy' at spcl/evaluators.py[line:121] distmat = re_ranking(distmat.numpy(), distmat_qq.numpy(), distmat_gg.numpy()): distmat_qq and distmat_gg are tuples rather than tensors. I don't know how to solve this problem.

Please help to solve these problems.

cannot reproduce the results in the paper

Thanks for your insightful work. I ran the UDA code for Duke-to-Market. The result is rank-1: 86.9%, mAP: 71.6%, which is about 3.4% lower in rank-1 than the paper (rank-1: 90.3%, mAP: 76.7%). Why?

The update of source feature in hybrid memory

In your paper, "For the source-domain class centroids {w}, the k-th centroid wk is updated by the mean of the encoded features belonging to class k in the mini-batch."
But where is the mean operation in your code?
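
A minimal sketch of such an in-batch mean update of the source class centroids (a reading of the paper's description with illustrative names, not the repo's exact code):

import torch
import torch.nn.functional as F

def update_source_centroids(w, feats, labels, momentum=0.2):
    # w: (num_classes, dim) centroids; feats: (batch, dim) encoded features
    for k in labels.unique():
        mean_k = feats[labels == k].mean(0)          # in-batch mean of class k
        w[k] = momentum * w[k] + (1.0 - momentum) * mean_k
        w[k] = F.normalize(w[k], dim=0)
    return w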

Could data augmentation improve the clustering?

SpCL is an amazing project. You use unified contrastive learning as the loss function, which is different from SimCLR. But when reading the SpCL code I found that the transforms don't use any data augmentation like SimCLR's. I wonder whether data augmentation (as in SimCLR or MoCo v2) could improve the clustering. For example, the centroid of an augmented sample's features could become a cluster centroid instead of an outlier instance feature.

Cluster compactness computation

Hello author, thanks for sharing your excellent work!
While reading the code I have a question: why does the compactness of each cluster take the min value? Aren't all the values within a cluster the same?
cluster_R_comp = [min(cluster_R_comp[i]) for i in sorted(cluster_R_comp.keys())]

About the dataset centroid initialization

[screenshot]
About two hours have passed and 'Initialize source-domain class centroids in the hybrid memory' is still running.
Is this normal?

The nvidia-smi screenshot (it looks normal): [screenshot]
Target-domain question: [screenshots]

about the code

Hi, great work!
Can you explain this line of code?
sim /= (mask*nums+(1-mask)).clone().expand_as(sim)
I did not understand its meaning.
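
One possible reading of that line (an assumption, not verified against hm.py):

# If `nums` holds each cluster's instance count and `mask` is 1 for real
# clusters and 0 for outlier entries, then
#     sim /= (mask * nums + (1 - mask)).clone().expand_as(sim)
# divides each cluster's summed similarity by its size (i.e. averages it),
# while outlier entries are divided by 1 and stay unchanged.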

gpu ram and batch_size

Thank you for releasing the code!
Because I have only one 2080S GPU, I am wondering how much GPU RAM
CUDA_VISIBLE_DEVICES=0 python examples/spcl_train_uda.py -ds dukemtmc -dt market1501 --logs-dir logs/spcl_uda/duke2market_resnet50
needs with the default settings (batch_size=64).

Another question: will the performance drop a lot if I set batch-size=32?

`faiss-gpu` version should be pinned below `v1.6.4`

After installing the dependencies according to the guide, we encountered a runtime error from faiss:

AttributeError: module 'faiss' has no attribute 'cast_integer_to_long_ptr'

Searching for these keywords on Google returns no relevant answers.

By analyzing the Faiss source code, we found that the method cast_integer_to_long_ptr disappeared in version v1.6.5.

So the faiss-gpu lib should be pinned below version v1.6.4, just as the commit in OpenUnReID did.

A very simple PR #29 has been created.
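
For example, a pin like the following (conda channel and package name assumed) keeps the old API:

conda install -c pytorch faiss-gpu=1.6.3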

A detail question about the code

I would like to ask you about this function in the code:
def compute_jaccard_distance(target_features, k1=20, k2=6, print_flag=True, search_option=0, use_float16=False):
What is the physical meaning of k1 and k2 here? In other words, what do these parameters control? I see you set them to 30 and 6; how would changing them affect the final results? Thank you.

Performance on MSMT17

I used the default settings with iters=800 to train spcl_uda, with source dataset msmt17 and target dataset market1501.
I only get mAP=72.0, rank-1=87.5, which is far from the claimed metrics.
Are there any tricks, or some points I am missing?

Memory Error

Hello all,
has a fix been identified for this error?
RuntimeError: Error in void faiss::gpu::allocMemorySpaceV(faiss::gpu::MemorySpace, void**, size_t) at gpu/utils/MemorySpace.cpp:26: Error: 'err == cudaSuccess' failed: failed to cudaMalloc 1610612736 bytes (error 2 out of memory)
Your suggestions are welcome.
I tried changing the k1 and k2 parameters, to no avail.
I hope to hear from you.

problem with rerank

I ran spcl_train_usl.py on Duke with one V100 and got mAP=64.8%, R1=79.6%, R5=88.9%, R10=91.1%. When I set rerank=True, I got mAP=78.2%, R1=83.5%, R5=89.4%, R10=91.8%. I find it a little strange that mAP increases a lot while R1/R5/R10 increase only a little.

About NMI scores

[image]
Hello author, how are the "NMI scores of clusters" in your paper computed?

Why is indep_thres defined by the top 90%?

Hello author, when evaluating cluster reliability you use both cluster independence and cluster compactness. The independence threshold is taken as the top 90% of all cluster independence scores at epoch 0, while compactness is judged dynamically for each cluster. What is the reasoning behind this design? And is the epoch-0 threshold reliable? (I'm new to re-ID and really can't figure out the underlying reason 😫)

gap between single 2080ti GPU and double 2080ti GPU

Hi! Thanks for your work! My question: when I train SpCL with the default config on a single 2080 Ti instead of two 2080 Ti GPUs, the result drops by about 6% mAP. I would like to know the reason. Thank you!

Question on PersonX

PersonX has 6 versions with different backgrounds; which version did you choose?

num of gpus

Hi, I tried to run this program with two GPUs and the result turned out very different from the 4-GPU result; with 2 GPUs the loss drops quickly. Is this program sensitive to the number of GPUs?

VehicleX Datasets

The 2020 AI City Challenge is over; how can one apply for the dataset now? Can you share it? I am a master's student at Huazhong University of Science and Technology. Thanks a lot!

Loss does not converge

During training, loss_s converges quickly, but loss_t stays between 8 and 9. What could be the reason?
