deepcluster's Introduction

Deep Clustering for Unsupervised Learning of Visual Features

News

We release paper and code for SwAV, our new self-supervised method. SwAV pushes self-supervised learning to only 1.2% away from supervised learning on ImageNet with a ResNet-50! It combines online clustering with a multi-crop data augmentation.

We also present DeepCluster-v2, which is an improved version of DeepCluster (ResNet-50, better data augmentation, cosine learning rate schedule, MLP projection head, use of centroids, ...). Check out DeepCluster-v2 code.

DeepCluster

This code implements the unsupervised training of convolutional neural networks, or convnets, as described in the paper Deep Clustering for Unsupervised Learning of Visual Features.

Moreover, we provide the evaluation protocol codes we used in the paper:

  • Pascal VOC classification
  • Linear classification on activations
  • Instance-level image retrieval

Finally, this code also includes a visualisation module that lets you visually assess the quality of the learned features.

Requirements

  • a Python installation version 2.7
  • the SciPy and scikit-learn packages
  • a PyTorch install version 0.1.8 (pytorch.org)
  • CUDA 8.0
  • a Faiss install (Faiss)
  • The ImageNet dataset (which can be automatically downloaded by recent versions of torchvision)

Pre-trained models

We provide pre-trained models with AlexNet and VGG-16 architectures, available for download.

  • The models in Caffe format expect BGR inputs that range in [0, 255]. You do not need to subtract the per-color-channel mean image since the preprocessing of the data is already included in our released models.
  • The models in PyTorch format expect RGB inputs that range in [0, 1]. You should preprocess your data before passing it to the released models by normalizing it with mean_rgb = [0.485, 0.456, 0.406] and std_rgb = [0.229, 0.224, 0.225].

Note that in all our released models, Sobel filters are computed within the models as two convolutional layers (greyscale + Sobel filters).
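The normalisation expected by the PyTorch models can be sketched in plain Python as follows. This only illustrates the arithmetic: the helper name normalize_pixel is hypothetical and not part of the repository, and in practice a tensor-level transform would apply the same per-channel operation.

```python
# Per-channel statistics quoted above for the PyTorch models.
MEAN_RGB = [0.485, 0.456, 0.406]
STD_RGB = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Normalise one RGB pixel whose channels already lie in [0, 1].

    Hypothetical helper, for illustration only.
    """
    return [(c - m) / s for c, m, s in zip(rgb, MEAN_RGB, STD_RGB)]


# An 8-bit pixel must first be rescaled from [0, 255] to [0, 1].
pixel_8bit = (124, 116, 104)
pixel = [v / 255.0 for v in pixel_8bit]
print(normalize_pixel(pixel))
```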

You can download all variants by running

$ ./download_model.sh

This will fetch the models into ${HOME}/deepcluster_models by default. You can change that path in the environment variable. Direct download links are provided here:

We also provide the last epoch cluster assignments for these models. After downloading, open the file with Python 2:

import pickle
with open("./alexnet_cluster_assignment.pickle", "rb") as f:
    b = pickle.load(f)

If you're a Python 3 user, specify encoding='latin1' in the load function. Each file is a list of (image path, cluster_index) tuples.
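For Python 3, the loading step might look like the sketch below. The helper name load_assignments is an assumption; the file name is the one used above, and the only change from the Python 2 snippet is the encoding argument.

```python
import os
import pickle


def load_assignments(path):
    """Load a cluster-assignment pickle written by Python 2 from Python 3."""
    with open(path, "rb") as f:
        # encoding="latin1" lets Python 3 decode Python 2 byte strings.
        return pickle.load(f, encoding="latin1")


if __name__ == "__main__":
    path = "./alexnet_cluster_assignment.pickle"
    if os.path.exists(path):
        assignments = load_assignments(path)
        # Each entry is an (image path, cluster_index) tuple.
        print(assignments[:3])
```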

Finally, we release the features extracted with the DeepCluster model for the ImageNet dataset. These features have dimension 4096 and correspond to a forward pass on the model up to the penultimate convolutional layer (just before the last ReLU). If you plan to cluster the features, don't forget to normalize and reduce/whiten them.

Running the unsupervised training

Unsupervised training can be launched by running:

$ ./main.sh

Please provide the path to the data folder:

DIR=/datasets01/imagenet_full_size/061417/train

To train an AlexNet network, specify ARCH=alexnet whereas to train a VGG-16 convnet use ARCH=vgg16.

You can also specify where you want to save the clustering logs and checkpoints using:

EXP=exp

During training, models are saved every n iterations (set using the --checkpoints flag) and can be found in, for instance, ${EXP}/checkpoints/checkpoint_0.pth.tar. A log of the cluster assignments at each epoch can be found in the pickle file ${EXP}/clusters.

Full documentation of the unsupervised training code main.py:

usage: main.py [-h] [--arch ARCH] [--sobel] [--clustering {Kmeans,PIC}]
               [--nmb_cluster NMB_CLUSTER] [--lr LR] [--wd WD]
               [--reassign REASSIGN] [--workers WORKERS] [--epochs EPOCHS]
               [--start_epoch START_EPOCH] [--batch BATCH]
               [--momentum MOMENTUM] [--resume PATH]
               [--checkpoints CHECKPOINTS] [--seed SEED] [--exp EXP]
               [--verbose]
               DIR

PyTorch Implementation of DeepCluster

positional arguments:
  DIR                   path to dataset

optional arguments:
  -h, --help            show this help message and exit
  --arch ARCH, -a ARCH  CNN architecture (default: alexnet)
  --sobel               Sobel filtering
  --clustering {Kmeans,PIC}
                        clustering algorithm (default: Kmeans)
  --nmb_cluster NMB_CLUSTER, --k NMB_CLUSTER
                        number of cluster for k-means (default: 10000)
  --lr LR               learning rate (default: 0.05)
  --wd WD               weight decay pow (default: -5)
  --reassign REASSIGN   how many epochs of training between two consecutive
                        reassignments of clusters (default: 1)
  --workers WORKERS     number of data loading workers (default: 4)
  --epochs EPOCHS       number of total epochs to run (default: 200)
  --start_epoch START_EPOCH
                        manual epoch number (useful on restarts) (default: 0)
  --batch BATCH         mini-batch size (default: 256)
  --momentum MOMENTUM   momentum (default: 0.9)
  --resume PATH         path to checkpoint (default: None)
  --checkpoints CHECKPOINTS
                        how many iterations between two checkpoints (default:
                        25000)
  --seed SEED           random seed (default: 31)
  --exp EXP             path to exp folder
  --verbose             chatty
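The --wd option is documented as a weight decay "pow". A plausible reading, stated here as an assumption rather than something the help text spells out, is that the value is a base-10 exponent, so the default of -5 would correspond to a weight decay of 10^-5. The helper below is hypothetical and only illustrates that conversion.

```python
def weight_decay_from_pow(wd_pow):
    """Convert a '--wd' exponent into a weight decay value (assumed to be 10**wd)."""
    return 10 ** wd_pow


# Defaults quoted in the help texts: -5 for main.py, -4 for eval_linear.py.
for wd_pow in (-5, -4):
    print(wd_pow, "->", weight_decay_from_pow(wd_pow))
```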

Evaluation protocols

Pascal VOC

To run the classification task with fine-tuning launch:

./eval_voc_classif_all.sh

and with no fine-tuning:

./eval_voc_classif_fc6_8.sh

Both these scripts download this code. You need to download the VOC 2007 dataset. Then, in both the ./eval_voc_classif_all.sh and ./eval_voc_classif_fc6_8.sh scripts, set CAFFE to point to the caffe branch and VOC to point to the Pascal VOC directory. Set PROTO and MODEL to the path to the prototxt file of the model and the path to the weights of the model to evaluate, respectively. The --train-from flag indicates the separation between the frozen and to-train layers.

We implemented VOC classification with PyTorch.

Erratum: When training the MLP only (fc6-8), the scaling parameters of the batch-norm layers in the whole network are trained. When these parameters are frozen, we get 70.4 mAP.

Linear classification on activations

You can run these transfer tasks using:

$ ./eval_linear.sh

You need to specify the path to the supervised data (ImageNet or Places):

DATA=/datasets01/imagenet_full_size/061417/

the path of your model:

MODEL=/private/home/mathilde/deepcluster/checkpoint.pth.tar

and on top of which convolutional layer to train the classifier:

CONV=3

You can specify where you want to save the output of this experiment (checkpoints and best models) with

EXP=exp

Full documentation for this task:

usage: eval_linear.py [-h] [--data DATA] [--model MODEL] [--conv {1,2,3,4,5}]
                      [--tencrops] [--exp EXP] [--workers WORKERS]
                      [--epochs EPOCHS] [--batch_size BATCH_SIZE] [--lr LR]
                      [--momentum MOMENTUM] [--weight_decay WEIGHT_DECAY]
                      [--seed SEED] [--verbose]

Train linear classifier on top of frozen convolutional layers of an AlexNet.

optional arguments:
  -h, --help            show this help message and exit
  --data DATA           path to dataset
  --model MODEL         path to model
  --conv {1,2,3,4,5}    on top of which convolutional layer train logistic
                        regression
  --tencrops            validation accuracy averaged over 10 crops
  --exp EXP             exp folder
  --workers WORKERS     number of data loading workers (default: 4)
  --epochs EPOCHS       number of total epochs to run (default: 90)
  --batch_size BATCH_SIZE
                        mini-batch size (default: 256)
  --lr LR               learning rate
  --momentum MOMENTUM   momentum (default: 0.9)
  --weight_decay WEIGHT_DECAY, --wd WEIGHT_DECAY
                        weight decay pow (default: -4)
  --seed SEED           random seed
  --verbose             chatty

Instance-level image retrieval

You can run the instance-level image retrieval transfer task using:

./eval_retrieval.sh

Visualisation

We provide two standard visualisation methods presented in our paper.

Filter visualisation with gradient ascent

First, it is possible to learn an input image that maximizes the activation of a given filter. We follow the process described by Yosinski et al. with a cross-entropy function between the target filter and the other filters in the same layer. From the visu folder you can run

./gradient_ascent.sh

You will need to specify the model path MODEL, the architecture of your model ARCH, the path of the folder in which you want to save the synthetic images EXP and the convolutional layer to consider CONV.

Full documentation:

usage: gradient_ascent.py [-h] [--model MODEL] [--arch {alexnet,vgg16}]
                          [--conv CONV] [--exp EXP] [--lr LR] [--wd WD]
                          [--sig SIG] [--step STEP] [--niter NITER]
                          [--idim IDIM]

Gradient ascent visualisation

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         Model
  --arch {alexnet,vgg16}
                        arch
  --conv CONV           convolutional layer
  --exp EXP             path to res
  --lr LR               learning rate (default: 3)
  --wd WD               weight decay (default: 10^-5)
  --sig SIG             gaussian blur (default: 0.3)
  --step STEP           number of iter between gaussian blurs (default: 5)
  --niter NITER         total number of iterations (default: 1000)
  --idim IDIM           size of input image (default: 224)

I recommend playing with the hyper-parameters to find a regime where the visualisations are good. For example, with the pre-trained DeepCluster AlexNet, a learning rate of 3 and 30,000 iterations works well for conv1. For conv5, a learning rate of 30 and 3,000 iterations gives nice images with the other parameters set to their default values.

Top 9 maximally activated images in a dataset

Finally, we provide code to retrieve images in a dataset that maximally activate a given filter in the convnet. From the visu folder, after having changed the fields MODEL, EXP, CONV and DATA, run

./activ-retrieval.sh

DeeperCluster

We have proposed another unsupervised feature learning paper at ICCV 2019. We have shown that unsupervised learning can be used to pre-train convnets, leading to a boost in performance on ImageNet classification. We achieve that by scaling DeepCluster to 96M images and mixing it with RotNet self-supervision. Check out the paper and code.

License

You may find out more about the license here.

Reference

If you use this code, please cite the following paper:

Mathilde Caron, Piotr Bojanowski, Armand Joulin, and Matthijs Douze. "Deep Clustering for Unsupervised Learning of Visual Features." Proc. ECCV (2018).

@InProceedings{caron2018deep,
  title={Deep Clustering for Unsupervised Learning of Visual Features},
  author={Caron, Mathilde and Bojanowski, Piotr and Joulin, Armand and Douze, Matthijs},
  booktitle={European Conference on Computer Vision},
  year={2018},
}

deepcluster's People

Contributors

mathildecaron31, piotr-bojanowski, saramsv, sbrugman, tomoyukun

deepcluster's Issues

CUDA error 35 CUDA driver version is insufficient for CUDA runtime version

When I train the model on my own dataset, I come across this error:

Faiss assertion 'err__ == cudaSuccess' failed in int faiss::gpu::getNumDevices() at gpu/utils/DeviceUtils.cu:32; details: CUDA error 35 CUDA driver version is insufficient for CUDA runtime version

But when I try the following code:

index = faiss.GpuIndexFlatL2(res,256)
flat_config = faiss.GpuIndexFlatConfig()
flat_config.useFloat16 = False
flat_config.device = 0
index = faiss.GpuIndexFlatL2(res,256,flat_config)

it works.

The code that raises the error:
res = faiss.StandardGpuResources()
flat_config = faiss.GpuIndexFlatConfig()
flat_config.useFloat16 = False
flat_config.device = 0
index = faiss.GpuIndexFlatL2(res, d, flat_config)

Clustering of unlabeled data

Hi! Great work!

I've trained the AlexNet architecture with DeepCluster on ~1 million unlabeled images. I want to use the trained parameters to cluster another (way bigger) set of unlabeled images. Should I use a linear classifier as in eval_linear.py, or just run one additional epoch of training with frozen layers and then extract the pseudo-labels?
Thanks!

Invalid index of a 0-dim tensor

The training script is failing after completing an epoch:

$ python main.py --verbose --k 1000 ./images
Architecture: alexnet
Load dataset: 0.35 s
Compute features
main.py:296: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  input_var = torch.autograd.Variable(input_tensor.cuda(), volatile=True)
0 / 215	Time: 7.964 (7.964)
200 / 215	Time: 0.067 (1.363)
k-means loss evolution: [74579.91  44611.473 43375.715 42842.293 42551.477 42388.676 42286.797
 42216.695 42169.83  42135.938 42109.06  42090.5   42076.816 42066.973
 42059.99  42054.387 42049.074 42045.258 42041.926 42039.15 ]
k-means time: 8 s
Save checkpoint at: checkpoints/checkpoint_0.pth.tar
Traceback (most recent call last):
  File "main.py", line 320, in <module>
    main()
  File "main.py", line 181, in main
    loss = train(train_dataloader, model, criterion, optimizer, epoch)
  File "main.py", line 265, in train
    losses.update(loss.data[0], input_tensor.size(0))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

It can successfully start from the checkpoint, but still fails after the next epoch.
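The error message itself points to a fix: read the scalar loss with tensor.item() instead of loss.data[0]. A minimal compatibility sketch follows; the helper name to_python_number and the stand-in class used to exercise it are hypothetical, and whether the repository adopted this exact change is not stated here.

```python
def to_python_number(value):
    """Return a Python scalar from a 0-dim tensor-like object.

    Uses .item() when available (as the error message suggests) and
    falls back to indexing for legacy one-element containers.
    """
    if hasattr(value, "item"):
        return value.item()
    return value[0]


class FakeScalarTensor:
    """Stand-in for a 0-dim tensor, used only to exercise the helper."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


print(to_python_number(FakeScalarTensor(2.5)))  # 0-dim tensor path
print(to_python_number([1.5]))                  # legacy indexing path
```

On the line shown in the traceback, this would amount to passing loss.item() to losses.update in place of loss.data[0].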

Curious about relu removal for feature clustering

Hello, thank you for the code release!
I was curious about the removal of the final relu before clustering the features. Is this standard when performing feature clustering, or did you try both and find it to be better than leaving the relu?
Thanks!

Time for validating

Hi there. Could you mention the time taken for the validation step (the one where you train an MLP on ImageNet data) with only the conv1 features frozen, on a Pascal P100?

Applying this Model to CIFAR10 Dataset

Thank you so much for your attention.

I am trying to apply this model to some smaller datasets like CIFAR-10 and MNIST. But during training I didn't observe significant "learning" with this unsupervised method (I printed the NMI between the pseudo-labels and the true labels at each epoch, and the values stayed below 0.1 for all 200 epochs, with no significant improvement).

I trained on the CIFAR-10 dataset with k = 10 clusters. Is this because this value of k is too small?
I also trained several times with learning rates lr = 0.001, 0.005, 0.01 and 0.05, but none of the results were satisfactory.

I am wondering whether there are already experimental results for this model on the CIFAR-10 dataset. I would very much appreciate it if you could share your results with me or help me with the implementation.

I am a university student and I really need your help to finish my assignment.

Question: Can I use the model to cluster unlabelled texts?

Hi, it's not an issue but a question. I want to cluster unlabelled texts on different topics. Say I have news items about sport, economy and politics: would it be possible to train DeepCluster to form these 3 clusters without any labelled text?
Thanks for your work!

ValueError: 'a' cannot be empty unless no samples are taken

The error occurs at the 32nd epoch; training runs normally before it. I guess k-means may have produced an empty cluster, so a list becomes empty?

Package versions:

  • pytorch: 0.4.1
  • python: 3.7.3 (used 2to3 to convert the code)
  • CUDA: 8.0.61
main.py:265: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input_tensor.size(0))
Epoch: [32][0/170]	Time: 0.575 (0.575)	Data: 0.561 (0.561)	Loss: 4.6828 (4.6828)
Epoch: [32][10/170]	Time: 0.360 (0.376)	Data: 0.000 (0.082)	Loss: 3.0669 (3.4403)
Epoch: [32][20/170]	Time: 0.359 (0.368)	Data: 0.000 (0.060)	Loss: 3.7505 (3.2823)
Epoch: [32][30/170]	Time: 0.358 (0.365)	Data: 0.000 (0.052)	Loss: 3.1550 (3.1970)
Epoch: [32][40/170]	Time: 0.358 (0.363)	Data: 0.000 (0.048)	Loss: 3.0475 (3.0846)
Epoch: [32][50/170]	Time: 0.358 (0.362)	Data: 0.000 (0.045)	Loss: 3.0932 (2.9922)
Epoch: [32][60/170]	Time: 0.359 (0.361)	Data: 0.000 (0.044)	Loss: 2.8730 (3.0125)
Epoch: [32][70/170]	Time: 0.359 (0.361)	Data: 0.000 (0.043)	Loss: 3.4860 (2.9648)
Epoch: [32][80/170]	Time: 0.358 (0.361)	Data: 0.000 (0.042)	Loss: 3.1373 (2.9436)
Epoch: [32][90/170]	Time: 0.359 (0.361)	Data: 0.000 (0.041)	Loss: 2.6147 (2.9145)
Epoch: [32][100/170]	Time: 0.359 (0.360)	Data: 0.000 (0.040)	Loss: 2.6068 (2.8890)
Epoch: [32][110/170]	Time: 0.360 (0.360)	Data: 0.000 (0.040)	Loss: 2.4559 (2.8653)
Epoch: [32][120/170]	Time: 0.359 (0.360)	Data: 0.000 (0.040)	Loss: 2.2136 (2.8452)
Epoch: [32][130/170]	Time: 0.358 (0.360)	Data: 0.000 (0.039)	Loss: 2.4173 (2.8292)
Epoch: [32][140/170]	Time: 0.358 (0.360)	Data: 0.000 (0.039)	Loss: 2.7725 (2.8246)
Epoch: [32][150/170]	Time: 0.360 (0.360)	Data: 0.000 (0.039)	Loss: 2.2551 (2.7988)
Epoch: [32][160/170]	Time: 0.358 (0.360)	Data: 0.000 (0.038)	Loss: 2.4294 (2.7789)
###### Epoch [32] ######
Time: 61.166 s
Clustering loss: 1608.053
ConvNet loss: 2.779
/home/yaochu/anaconda3/envs/env1/lib/python3.7/site-packages/sklearn/metrics/cluster/supervised.py:859: FutureWarning: The behavior of NMI will change in version 0.22. To match the behavior of 'v_measure_score', NMI will use average_method='arithmetic' by default.
  FutureWarning)
NMI against previous assignment: 0.723
#######################

Compute features
main.py:296: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  input_var = torch.autograd.Variable(input_tensor.cuda(), volatile=True)
0 / 170	Time: 0.777 (0.777)
10 / 170	Time: 0.282 (0.212)
20 / 170	Time: 0.121 (0.176)
30 / 170	Time: 0.111 (0.168)
40 / 170	Time: 0.396 (0.171)
50 / 170	Time: 0.112 (0.163)
60 / 170	Time: 0.208 (0.161)
70 / 170	Time: 0.111 (0.163)
80 / 170	Time: 0.110 (0.162)
90 / 170	Time: 0.112 (0.161)
100 / 170	Time: 0.111 (0.161)
110 / 170	Time: 0.111 (0.162)
120 / 170	Time: 0.110 (0.161)
130 / 170	Time: 0.109 (0.159)
140 / 170	Time: 0.107 (0.158)
150 / 170	Time: 0.151 (0.157)
160 / 170	Time: 0.110 (0.156)
WARNING clustering 3384 points to 100 centroids: please provide at least 3900 training points
k-means loss evolution: [3554.9004 1898.8888 1713.1171 1647.2383 1626.5762 1618.4319 1610.1742
 1604.1263 1599.3331 1596.9933 1595.8425 1594.5605 1593.4675 1592.6006
 1592.2092 1591.5988 1591.486  1591.486  1591.486  1591.486 ]
k-means time: 5 s
Traceback (most recent call last):
  File "main.py", line 320, in <module>
    main()
  File "main.py", line 160, in main
    deepcluster.images_lists)
  File "/home/yaochu/deepcluster/util.py", line 59, in __init__
    self.indexes = self.generate_indexes_epoch()
  File "/home/yaochu/deepcluster/util.py", line 69, in generate_indexes_epoch
    replace=(len(self.images_lists[i]) <= size_per_pseudolabel)
  File "mtrand.pyx", line 1125, in mtrand.RandomState.choice
ValueError: 'a' cannot be empty unless no samples are taken

Please help, thank you!
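The traceback shows the sampler in util.py drawing from self.images_lists[i], which fails when a cluster is empty. One possible workaround, sketched below in plain Python rather than with the repository's actual code, is to draw only from non-empty clusters. The function name and the use of the random module are assumptions made for illustration.

```python
import random


def uniform_cluster_indexes(images_lists, n_samples, seed=0):
    """Sample n_samples image indexes, spread uniformly over non-empty clusters."""
    rng = random.Random(seed)
    non_empty = [members for members in images_lists if len(members) > 0]
    if not non_empty:
        raise ValueError("all clusters are empty")
    per_cluster = n_samples // len(non_empty) + 1
    picked = []
    for members in non_empty:
        if len(members) <= per_cluster:
            # Small cluster: sample with replacement to reach the quota.
            picked.extend(rng.choice(members) for _ in range(per_cluster))
        else:
            # Large cluster: sample without replacement.
            picked.extend(rng.sample(members, per_cluster))
    rng.shuffle(picked)
    return picked[:n_samples]


# Cluster 1 is empty and is simply skipped.
print(uniform_cluster_indexes([[0, 1, 2], [], [3, 4, 5, 6, 7]], 6))
```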

About the detection/segmentation cluster

I understand that it can cluster ImageNet data into 10000 classes.
But I wonder whether it can also cluster object bounding boxes or segmentation areas.
Does it need a separate object detector or segmentor (like pyfast/er rcnn or mask rcnn) to capture the boxes or segmentation areas that are then fed to DeepCluster?
Maybe, I guess, I should replace the softmax layer in Faster R-CNN with DeepCluster. Then it would cluster the output features (features of the objects in the whole image, like ball, person, bicycle or dog) produced by the last conv layer. Or maybe I should remove the whole detection head after the RPN and attach DeepCluster there. As the RPN tries to find each object's location and its score (object or background), all objects of all mini-batch images could be clustered by DeepCluster.

If so, should I run the detector/segmentor first to get the object boxes or areas and then run DeepCluster on the result?
I'm a little confused, as DeepCluster seems to be only for clustering classes of images, so I think it can't localize the coordinates and size of objects. Maybe this information should be provided by a pretrained Faster R-CNN. But then again, if so, it's meaningless to cluster the resulting bboxes because Faster R-CNN already recognizes both bboxes and classes!
Is pyfaster rcnn just for reference or comparison, or do DeepCluster and pyfaster rcnn work together?

I have read #10's answer. Can you tell me why you provided the 2 repos?

Please give me a hint!

Thank you.

Extracted features

Is it possible to release the features extracted w/ the trained model for Imagenet? I want to do a quick experiment, and this would save me from downloading Imagenet, running it through the model, etc.

Thanks!

CPU or other CUDA versions supported?

Hi. Due to hardware dependency, I cannot really use this library. I tried to set caffe in CPU mode and tried to run it but I face other issues.

Are there any other ways to run on CPU or with other CUDA versions?

Thanks.

cluster labels

Hi, I am running the unsupervised training. The cluster assignments that are saved form a list of lists of lists in which each image is assigned an index. How do I map that index to an image file?
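Assuming the saved log holds one entry per epoch, and each entry is a list of clusters where every cluster is a list of image indexes into the dataset (an assumption based on the description above), the mapping back to file names could look like this. The names clusters_log and image_paths are hypothetical; with a torchvision ImageFolder dataset the paths would typically come from its imgs attribute.

```python
def paths_per_cluster(epoch_assignment, image_paths):
    """Map each cluster (a list of image indexes) to the matching file paths."""
    return [[image_paths[i] for i in cluster] for cluster in epoch_assignment]


# Toy data standing in for the real log and dataset.
image_paths = ["cat_0.jpg", "cat_1.jpg", "dog_0.jpg", "dog_1.jpg"]
clusters_log = [
    [[0, 2], [1, 3]],  # epoch 0: two clusters
    [[0, 1], [2, 3]],  # epoch 1: two clusters
]

last_epoch = clusters_log[-1]
for cluster_id, paths in enumerate(paths_per_cluster(last_epoch, image_paths)):
    print(cluster_id, paths)
```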

Question about the principle of DeepCluster

I tried to use DeepCluster on a non-image, non-linear dataset, e.g. spiral data.

With the pseudo-class method, it's not obvious to me how pseudo-classes (which represent the clustering results of k-means, whose objective has a convex form) could be used to separate nonlinear data.

Could I get a hint on this question?

Some question about NMI.

In my experiments, I'm confused that the NMI value did not converge to a certain value; it kept fluctuating irregularly.
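For readers who want to inspect the quantity being discussed, normalised mutual information between two label assignments can be sketched in plain Python as below. This version normalises by the geometric mean of the two entropies, which is one of several conventions and may differ from the default of a given scikit-learn version; the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalised by the geometric mean of the two entropies."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # Degenerate case: at least one assignment has a single cluster.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, relabelled
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))  # independent partitions
```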

Error loading PyTorch vgg16 model

When loading the downloaded vgg16 model with the following code:

import torch
from models import vgg16

model = vgg16(sobel=True, bn=True, out=10000)

PATH = '/path/to/deepcluster_models/vgg16/checkpoint.pth.tar'
checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['state_dict'])

I receive the following error:

Traceback (most recent call last): File "<input>", line 10, in <module> File "/path/to/lib/python3.5/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict self.__class__.__name__, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for VGG: Missing key(s) in state_dict: "features.0.weight", "features.0.bias", "features.1.weight", "features.1.bias", "features.1.running_var", "features.1.running_mean", "features.3.weight", "features.3.bias", "features.4.weight", "features.4.bias", "features.4.running_var", "features.4.running_mean", "features.7.weight", "features.7.bias", "features.8.weight", "features.8.bias", "features.8.running_var", "features.8.running_mean", "features.10.weight", "features.10.bias", "features.11.weight", "features.11.bias", "features.11.running_var", "features.11.running_mean", "features.14.weight", "features.14.bias", "features.15.weight", "features.15.bias", "features.15.running_var", "features.15.running_mean", "features.17.weight", "features.17.bias", "features.18.weight", "features.18.bias", "features.18.running_var", "features.18.running_mean", "features.20.weight", "features.20.bias", "features.21.weight", "features.21.bias", "features.21.running_var", "features.21.running_mean", "features.24.weight", "features.24.bias", "features.25.weight", "features.25.bias", "features.25.running_var", "features.25.running_mean", "features.27.weight", "features.27.bias", "features.28.weight", "features.28.bias", "features.28.running_var", "features.28.running_mean", "features.30.weight", "features.30.bias", "features.31.weight", "features.31.bias", "features.31.running_var", "features.31.running_mean", "features.34.weight", "features.34.bias", "features.35.weight", "features.35.bias", "features.35.running_var", "features.35.running_mean", "features.37.weight", "features.37.bias", "features.38.weight", "features.38.bias", "features.38.running_var", "features.38.running_mean", "features.40.weight", "features.40.bias", 
"features.41.weight", "features.41.bias", "features.41.running_var", "features.41.running_mean". Unexpected key(s) in state_dict: "features.module.0.weight", "features.module.0.bias", "features.module.1.weight", "features.module.1.bias", "features.module.1.running_mean", "features.module.1.running_var", "features.module.3.weight", "features.module.3.bias", "features.module.4.weight", "features.module.4.bias", "features.module.4.running_mean", "features.module.4.running_var", "features.module.7.weight", "features.module.7.bias", "features.module.8.weight", "features.module.8.bias", "features.module.8.running_mean", "features.module.8.running_var", "features.module.10.weight", "features.module.10.bias", "features.module.11.weight", "features.module.11.bias", "features.module.11.running_mean", "features.module.11.running_var", "features.module.14.weight", "features.module.14.bias", "features.module.15.weight", "features.module.15.bias", "features.module.15.running_mean", "features.module.15.running_var", "features.module.17.weight", "features.module.17.bias", "features.module.18.weight", "features.module.18.bias", "features.module.18.running_mean", "features.module.18.running_var", "features.module.20.weight", "features.module.20.bias", "features.module.21.weight", "features.module.21.bias", "features.module.21.running_mean", "features.module.21.running_var", "features.module.24.weight", "features.module.24.bias", "features.module.25.weight", "features.module.25.bias", "features.module.25.running_mean", "features.module.25.running_var", "features.module.27.weight", "features.module.27.bias", "features.module.28.weight", "features.module.28.bias", "features.module.28.running_mean", "features.module.28.running_var", "features.module.30.weight", "features.module.30.bias", "features.module.31.weight", "features.module.31.bias", "features.module.31.running_mean", "features.module.31.running_var", "features.module.34.weight", "features.module.34.bias", 
"features.module.35.weight", "features.module.35.bias", "features.module.35.running_mean", "features.module.35.running_var", "features.module.37.weight", "features.module.37.bias", "features.module.38.weight", "features.module.38.bias", "features.module.38.running_mean", "features.module.38.running_var", "features.module.40.weight", "features.module.40.bias", "features.module.41.weight", "features.module.41.bias", "features.module.41.running_mean", "features.module.41.running_var".

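The unexpected keys differ from the missing ones only by a "module." component (for example features.module.0.weight versus features.0.weight), which is the pattern left by a wrapper such as torch.nn.DataParallel around the features block. One possible workaround, offered as an assumption rather than as the maintainers' answer, is to rename the keys before calling load_state_dict. The helper name strip_module_prefix is hypothetical.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict):
    """Return a copy of state_dict with every 'module' path component removed."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        parts = [p for p in key.split(".") if p != "module"]
        cleaned[".".join(parts)] = value
    return cleaned


# Toy dictionary standing in for checkpoint['state_dict'].
toy = OrderedDict([
    ("features.module.0.weight", 1),
    ("features.module.1.running_mean", 2),
    ("classifier.1.bias", 3),
])
print(list(strip_module_prefix(toy)))
```

With the real checkpoint, the cleaned dictionary would be passed to model.load_state_dict in place of checkpoint['state_dict'].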
Model weights are not fully set

Hi,

I've been wanting to use the trained models for a clustering task but I've come across a strange behavior.

The downloaded weights do not fully set the model weights. Between two successive runs, the output feature vectors are different for the same input image.

Did I miss something?

Thank you for your time

Confused about training!

Hello there,

The AlexNet model you defined creates a network whose final layer has 1000 classes by default, as can be seen in the following piece of code in models/alexnet.py:95:

def alexnet(sobel=False, bn=True, out=1000):
    dim = 2 + int(not sobel)
    model = AlexNet(make_layers_features(CFG['2012'], dim, bn=bn), out, sobel)
    return model

On the other hand, the default number of clusters is set to 10,000 in main.py:38:

parser.add_argument('--nmb_cluster', '--k', type=int, default=10000,
                    help='number of cluster for k-means (default: 10000)')

How is it possible to train a network with 1000 classes using pseudo-labels from 10,000 clusters?

I have a question about eval-linear.sh

When running eval-linear.sh, each epoch reports an accuracy, but the accuracies differ between epochs, sometimes by a large margin. Which accuracy should I choose? Thanks.

OOM for K=10^5

When I try to run with K=10^5 clusters, I get OOM issues.
The shape of the data in the cluster:
(1281167, 4096)
Traceback (most recent call last):
  File "main_resnet_18_sobel.py", line 392, in <module>
    main()
  File "main_resnet_18_sobel.py", line 198, in main
    clustering_loss = deepcluster.cluster(features, verbose=args.verbose)
  File "deepcluster/clustering.py", line 208, in cluster
    I, loss = run_kmeans(xb, self.k, verbose)
  File "deepcluster/clustering.py", line 168, in run_kmeans
    index = faiss.GpuIndexFlatL2(res, d, flat_config)
  File "lib/python3.7/site-packages/faiss/__init__.py", line 333, in replacement_init
    original_init(self, *args)
  File "lib/python3.7/site-packages/faiss/swigfaiss.py", line 5430, in __init__
    this = _swigfaiss.new_GpuIndexFlatL2(*args)
RuntimeError: Error in void faiss::gpu::allocMemorySpaceV(faiss::gpu::MemorySpace, void**, size_t) at gpu/utils/MemorySpace.cpp:27: Error: 'err == cudaSuccess' failed: failed to cudaMalloc 1610612736 bytes (error 2 out of memory)
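For scale, the numbers in this report can be checked with a little arithmetic. The sketch assumes the features are stored as 32-bit floats, which the report does not state; the helper name is hypothetical.

```python
def array_bytes(rows, cols, bytes_per_value=4):
    """Size in bytes of a dense rows x cols array (float32 assumed by default)."""
    return rows * cols * bytes_per_value


GIB = 1024 ** 3

features = array_bytes(1281167, 4096)  # the feature matrix from the report
failed_alloc = 1610612736              # bytes requested by the failing cudaMalloc

print("features: %.2f GiB" % (features / GIB))
print("failed allocation: %.2f GiB" % (failed_alloc / GIB))
```

Under that assumption the feature matrix alone takes about 19.5 GiB, so the features themselves, rather than only the number of clusters, may dominate memory use.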

How to prevent empty clusters?

Hi~

I use sklearn.cluster.KMeans, PCA, and PyTorch to do the same job on my own dataset.

And I find that when I cluster at each epoch, there are many empty clusters, as mentioned in Section 3.3 of the paper.

But I can't find the solution in this repository. What exactly can I do to prevent empty clusters?

Thanks!

Why do we need at least 39 training points per cluster?

When I train on my own dataset, it raises the warning: clustering 14243 points to 810 centroids: please provide at least 31590 training points. So I guess each cluster needs 39 training points. How can I get around this limit when I lack training data? Thanks.
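Both warnings quoted in these reports are consistent with a minimum of 39 training points per centroid: 39 × 810 = 31590 and 39 × 100 = 3900. The check below only reproduces that arithmetic; the constant and function names are hypothetical, and treating 39 as a fixed threshold is an assumption drawn from these two messages.

```python
MIN_POINTS_PER_CENTROID = 39  # inferred from the warnings, not from documentation


def min_training_points(n_centroids):
    """Smallest training-set size that avoids the warning, under the assumption above."""
    return MIN_POINTS_PER_CENTROID * n_centroids


def max_centroids(n_points):
    """Largest number of centroids that n_points supports without the warning."""
    return n_points // MIN_POINTS_PER_CENTROID


print(min_training_points(810))  # the 31590 quoted in the warning
print(max_centroids(14243))
```

Since the message is a warning rather than an error, reducing the number of clusters is one option and ignoring the warning is another, at the cost of less reliable centroids.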

Time for clustering

Hi, thanks for releasing the code for the paper. I was wondering how much time the clustering takes to produce the pseudo-labels after one epoch using Faiss?

PCA error on own dataset

I'm running into this error with my own dataset:

RuntimeError: Error in void faiss::PCAMatrix::prepare_Ab() at VectorTransform.cpp:482: Error: 'd_out * d_in <= PCAMat.size()' failed: PCA matrix cannot output 256 dimensions from 4096

My environment setup matches your defined dependencies (except for CUDA 10, which may become an issue...?).

Parameters I tested, all resulting in the same error:

  • alexnet vs vgg16
  • kmeans vs PIC
  • images preprocessed manually to 256x256 and 3channel grayscale (original images grayscale) vs preprocessing handled by your code

Thanks for your help!

Data Augmentation for Training

Hi there,

Thanks a lot for the contribution and the code is amazingly well organized!

I noticed that the data augmentation used during clustering is very simple and relies on "centercrop". Is there a specific reason for that? Have you tried other data augmentations?

Resuming from checkpoint error : RuntimeError: OrderedDict mutated during iteration

Hello there,

I am having the following error when resuming from a checkpoint.

It turns out one can't remove elements from a dictionary while iterating over it, according to the following Stack Overflow answer.

This happens in main.py, in the checkpoint resuming block.

for key in checkpoint['state_dict']:           # Line 101
    if 'top_layer' in key:                     # Line 102
        del checkpoint['state_dict'][key]      # Line 103

This can be solved by iterating over a copy of the state_dict rather than the original, as follows:

for key in checkpoint['state_dict'].copy():   # <------------
    if 'top_layer' in key:
        del checkpoint['state_dict'][key]

Please correct me if I am wrong.
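
For what it's worth, the behaviour is easy to check outside the repository. This is a small standalone sketch with made-up key names (`features.0.weight`, `top_layer.weight`, `top_layer.bias`) and a hypothetical helper `drop_top_layer`; it snapshots the keys with `list(...)`, which has the same effect as iterating over `.copy()`.

```python
from collections import OrderedDict

def drop_top_layer(state_dict):
    """Remove every entry whose key contains 'top_layer', in place."""
    # list(...) takes a snapshot of the keys, so deleting is safe.
    for key in list(state_dict):
        if 'top_layer' in key:
            del state_dict[key]
    return state_dict

# Stand-in for a loaded checkpoint; the values would be tensors in practice.
checkpoint = {'state_dict': OrderedDict([
    ('features.0.weight', 1),
    ('top_layer.weight', 2),
    ('top_layer.bias', 3),
])}
drop_top_layer(checkpoint['state_dict'])
print(list(checkpoint['state_dict']))  # ['features.0.weight']
```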

image_list

When I run the unsupervised training on CIFAR-100, I get the following error:
Traceback (most recent call last):
File "main.py", line 323, in <module>
main()
File "main.py", line 161, in main
deepcluster.images_lists)
File "/home/deepcluster-master/util.py", line 59, in __init__
self.indexes = self.generate_indexes_epoch()
File "/home/deepcluster-master/util.py", line 69, in generate_indexes_epoch
replace=(len(self.images_lists[i]) <= size_per_pseudolabel)
File "mtrand.pyx", line 1126, in mtrand.RandomState.choice
ValueError: a must be non-empty
How should I deal with the image_list? Should I create one for CIFAR-100?
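
Judging from the traceback, `images_lists` holds one list of image indexes per cluster, and `np.random.choice` raises "a must be non-empty" when one of those lists is empty, i.e. when a cluster received no images. One possible workaround, sketched below under that assumption, is to sample only from the non-empty lists. This is not the repository's sampler: the function signature and the `rng` argument are hypothetical, and reducing the number of clusters for a small dataset may be the simpler fix.

```python
import numpy as np

def generate_indexes_epoch(images_lists, n_total, rng=None):
    """Sample about the same number of indexes from every non-empty cluster."""
    rng = np.random.default_rng() if rng is None else rng
    # Skip clusters that received no images.
    non_empty = [np.asarray(lst) for lst in images_lists if len(lst) > 0]
    if not non_empty:
        raise ValueError("every pseudo-label list is empty")
    size_per_pseudolabel = n_total // len(non_empty) + 1
    chunks = [
        # Sample with replacement only when a cluster is too small.
        rng.choice(lst, size_per_pseudolabel,
                   replace=len(lst) <= size_per_pseudolabel)
        for lst in non_empty
    ]
    res = np.concatenate(chunks)
    rng.shuffle(res)
    return res[:n_total].astype(int)
```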

Experimental results reproduction

Hello,

I came across your paper and find it simple and useful. I tried to reproduce the experimental results from the paper with your released code, without any changes, but the NMI rises from 0.54 and converges to 0.68 after only a few dozen epochs. The NMI training curve does not look like Fig. 2 (b) in the paper, and the same holds for the linear classification results in Table 1. I understand that the core of your algorithm is not only the pseudo-labels provided by k-means but also the invariance to data augmentation/transformation that comes from your training settings ("We cluster the central cropped images features and perform data augmentation (random horizontal flips and crops of random sizes and aspect ratios) when training the network"). Did I miss some important settings? Can you help me identify the problem? It would help me a lot in my research, and I hope to cite your paper.

Thanks~

The loss of unsupervised training does not decrease

I trained on the MNIST dataset using the unsupervised training, but the loss does not decrease.
This is the training log:

Epoch [4]

Time: 8.809 s
Clustering loss: 953.089
ConvNet loss: 2.084
NMI against previous assignment: 0.227
#######################

Compute features
0 / 32 Time: 0.227 (0.227)
k-means loss evolution: [1795.6045 978.0879 971.03766 959.80444 957.0581 956.3157
955.26294 954.6638 954.14594 953.74384 953.33545 952.984
952.7616 952.4509 952.3556 952.30585 952.2396 952.1643
952.1643 952.1643 ]
k-means time: 1 s
Epoch: [5][0/32] Time: 0.139 (0.139) Data: 0.129 (0.129) Loss: 2.3899 (2.3899)

Epoch [5]

Time: 8.807 s
Clustering loss: 952.164
ConvNet loss: 2.133
NMI against previous assignment: 0.243
#######################

Epoch [264]

Time: 8.869 s
Clustering loss: 950.949
ConvNet loss: 1.832
NMI against previous assignment: 0.336
#######################

Compute features
0 / 32 Time: 0.233 (0.233)
k-means loss evolution: [1795.1869 977.5625 972.47687 967.0636 960.8979 955.13916
952.04254 949.78107 945.9483 944.14136 943.93304 943.7497
943.62286 943.5072 943.5072 943.5072 943.5072 943.5072
943.5072 943.5072 ]
k-means time: 1 s
Epoch: [265][0/32] Time: 0.140 (0.140) Data: 0.129 (0.129) Loss: 2.5108 (2.5108)

Epoch [265]

Time: 8.867 s
Clustering loss: 943.507
ConvNet loss: 1.752
NMI against previous assignment: 0.251
#######################

I tried different learning rates (lr = 0.001, 0.005, 0.01, 0.05), but the loss never decreases.
Can you help me?
Thanks

error 2 out of memory

Hi!
I'm training AlexNet with PIC on an NVIDIA Tesla M60 GPU (AWS g3.4xlarge instance) with 800e3 images. After 2-3 epochs I get the following:

Compute features
0 / 3175        Time: 5.483 (5.483)
200 / 3175      Time: 0.824 (0.681)
400 / 3175      Time: 0.611 (0.680)
600 / 3175      Time: 0.794 (0.681)
800 / 3175      Time: 0.611 (0.673)
1000 / 3175     Time: 0.620 (0.676)
1200 / 3175     Time: 0.609 (0.671)
1400 / 3175     Time: 0.810 (0.674)
1600 / 3175     Time: 0.611 (0.670)
1800 / 3175     Time: 0.724 (0.675)
2000 / 3175     Time: 0.829 (0.672)
2200 / 3175     Time: 0.616 (0.674)
2400 / 3175     Time: 0.806 (0.675)
2600 / 3175     Time: 0.609 (0.670)
2800 / 3175     Time: 0.608 (0.666)
3000 / 3175     Time: 0.613 (0.662)
Traceback (most recent call last):
  File "main.py", line 320, in <module>
    main()
  File "main.py", line 152, in main
    clustering_loss = deepcluster.cluster(features, verbose=args.verbose)
  File "/home/aogorodnikov/deepcluster/clustering.py", line 338, in cluster
    I, D = make_graph(xb, self.nnn)
  File "/home/aogorodnikov/deepcluster/clustering.py", line 117, in make_graph
    index = faiss.GpuIndexFlatL2(res, dim, flat_config)
  File "/home/aogorodnikov/anaconda3/envs/imgSudoku/lib/python3.7/site-packages/faiss/__init__.py", line 333, in replacement_init
    original_init(self, *args)
  File "/home/aogorodnikov/anaconda3/envs/imgSudoku/lib/python3.7/site-packages/faiss/swigfaiss.py", line 5430, in __init__
    this = _swigfaiss.new_GpuIndexFlatL2(*args)
RuntimeError: Error in void faiss::gpu::allocMemorySpaceV(faiss::gpu::MemorySpace, void**, size_t) at gpu/utils/MemorySpace.cpp:27: Error: 'err == cudaSuccess' failed: failed to cudaMalloc 1073741824 bytes (error 2 out of memory)

The issue seems to originate from the Faiss library. Can you advise anything from your side?
Thanks!

RuntimeError: invalid argument 5: k not in range for dimension at /opt/conda/conda-bld/pytorch_1556653000816/work/aten/src/THC/generic/THCTensorTopK.cu:21

(pytorch27) root@6b93e8c68090:~/userfolder/deepcluster# bash eval_linear.sh 
=> loading checkpoint '/root/userfolder/exp/checkpoints/checkpoint.pth.tar'
Loaded
Traceback (most recent call last):
  File "eval_linear.py", line 320, in <module>
    main()
  File "eval_linear.py", line 136, in main
    train(train_loader, model, reglog, criterion, optimizer, epoch)
  File "eval_linear.py", line 246, in train
    prec1, prec5 = accuracy(output.data, target, topk=(1,5))
  File "eval_linear.py", line 208, in accuracy
    _, pred = output.topk(maxk, 1, True, True)
RuntimeError: invalid argument 5: k not in range for dimension at /opt/conda/conda-bld/pytorch_1556653000816/work/aten/src/THC/generic/THCTensorTopK.cu:21

This error occurs when I run eval_linear.sh. I don't know how to solve it. I hope you can help me.
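
One plausible cause, given the `topk=(1,5)` call in the traceback, is that the classifier has fewer than 5 outputs: a top-k over a dimension fails when k is larger than that dimension. The NumPy sketch below illustrates the usual workaround of clamping k to the number of classes. It is a standalone illustration, not a patch for eval_linear.py, and it uses NumPy instead of PyTorch so that it runs on its own.

```python
import numpy as np

def accuracy(output, target, topk=(1, 5)):
    """Top-k accuracy in percent, with k clamped to the number of classes."""
    num_classes = output.shape[1]
    maxk = min(max(topk), num_classes)
    # Indexes of the maxk highest scores per row, best first.
    pred = np.argsort(-output, axis=1)[:, :maxk]
    correct = pred == target[:, None]
    res = []
    for k in topk:
        k = min(k, num_classes)
        res.append(100.0 * correct[:, :k].any(axis=1).mean())
    return res

# Two-class example: asking for top-5 no longer fails.
scores = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([1, 0, 0])
prec1, prec5 = accuracy(scores, labels)
```

In the PyTorch code the equivalent change would be to cap `maxk` at `output.size(1)` before calling `output.topk`.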

prepare own image dataset

Is there any documentation on how to prepare a dataset consisting of a folder of images for use with deepcluster?
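
Assuming the training script reads its data through torchvision's `ImageFolder` (an assumption, please check main.py), the images need to sit in at least one sub-folder of the data root, because `ImageFolder` expects a `root/class_name/image` layout. Since the labels are not used for unsupervised training, a single dummy class should be enough. The helper below is a hypothetical sketch that copies a flat folder into such a layout; the name `wrap_in_dummy_class` and the class name `unlabeled` are made up.

```python
from pathlib import Path
import shutil

IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}

def wrap_in_dummy_class(flat_dir, out_root, class_name='unlabeled'):
    """Copy the images in flat_dir to out_root/class_name/ and count them."""
    target = Path(out_root) / class_name
    target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(Path(flat_dir).iterdir()):
        # Ignore sub-folders and non-image files.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.copy2(path, target / path.name)
            copied += 1
    return copied
```

The resulting `out_root` would then be the directory passed as the data path.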

Visualization of the 5th conv layer is a noise?

Hello!
First, thank you for your amazing work!

Secondly, I am wondering why I get noise when visualizing the 5th conv layer. I am using a pretrained AlexNet and the imageNet-tiny dataset (10k images); I first run main.sh and then gradient_ascent.sh. I tried the default settings and also tried changing the number of training epochs and the batch size, but nothing really changed: I still get noise for conv layer 5. Do you have any suggestions about why it doesn't work?

Also, a question about the parameter --nmb_cluster: why is it 10k by default, and what does it mean?

Confused about the args.fc6_8 flag in eval_voc_classify.py

I thought the fc6_8 flag determines whether to fine-tune the entire model or just the final classifier. However, I don't understand why model.features is randomly initialized when fc6_8 is set to False.

if not args.fc6_8:
    for y, m in enumerate(model.features.modules()):
        if isinstance(m, nn.Conv2d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            for i in range(m.out_channels):
                m.weight.data[i].normal_(0, math.sqrt(2. / n))
            if m.bias is not None:
                m.bias.data.zero_()

Failed to reproduce the detection result using the AlexNet caffemodel provided

Hi,

I am having trouble reproducing the VOC detection result.

I used your pretrained Caffe model (AlexNet-prototxt + AlexNet-caffemodel) and fine-tuned it under the Fast RCNN framework from https://github.com/rbgirshick/py-faster-rcnn.

Specifically, I trained Fast RCNN with the following script. All parameters were left at their defaults.

./tools/train_net.py --gpu ${GPU_ID} \
  --solver models/${PT_DIR}/${NET}/fast_rcnn/solver.prototxt \
  --weights data/imagenet_models/yourmodel.caffemodel \
  --imdb ${TRAIN_IMDB} \
  --iters ${ITERS} \
  ${EXTRA_ARGS}

I get 60.26 mean AP with the default ZF.v2.caffemodel, but training fails to converge when using your pre-trained model. The pre-trained weights behave almost like a random noise initialization, so I cannot obtain any test performance (mAP). The testing code returns the following error, which means the trained detector does not detect anything.

BB = BB[sorted_ind, :]
IndexError: too many indices for array

I tested your caffemodel on VOC classification and got a comparable result. But for detection, it does not seem to work.

I think I must have made a mistake at some point or missed some processing step. Could you please tell me how you fine-tuned Fast RCNN? Did you modify any hyperparameters or the train.prototxt/test.prototxt files? I read that other self-supervised papers apply batch norm absorption or weight rescaling from https://github.com/philkr/magic_init. I tested both scenarios, with and without these post-processing techniques, for detection, but failed in both cases.

The Pascal evaluation has really frustrated me for the past two weeks. I would appreciate it if you could share some insight.

Best,
Chang

Shuffling training data

Hi @mathildecaron31

First of all thank you for making this research code available to the wider community.
I have a couple of questions/issues that I want to raise:

1. Computing features

deepcluster/main.py

Lines 299 to 300 in f5995e9

if i == 0:
    features = np.zeros((N, aux.shape[1])).astype('float32')

For the first batch you only initialize the NumPy array and don't save the computed features. After the initialization you should probably add a line that also inserts the computed features for the first batch.
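
Whether the first batch is really lost depends on the lines that follow the two quoted above, which are not shown here: if the write into `features` is a separate statement rather than an `else` branch, the first batch is stored as well. A minimal sketch of that pattern, with a hypothetical function name and plain arrays standing in for the network outputs:

```python
import numpy as np

def collect_features(batches, n_total):
    """Stack per-batch feature arrays into one (n_total, dim) float32 array."""
    features = None
    start = 0
    for i, aux in enumerate(batches):
        if i == 0:
            # Allocate once the feature dimension is known.
            features = np.zeros((n_total, aux.shape[1]), dtype='float32')
        # Not an else-branch: this also runs for the first batch.
        features[start:start + len(aux)] = aux
        start += len(aux)
    return features
```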

2. Shuffling training data

I see that you are not shuffling the training data; shuffling can make models more general, i.e. less prone to overfitting. For ImageNet and this amount of data it is probably not so important. I noticed that simply adding shuffle=True to the DataLoader will not suffice, as the deepcluster.images_lists indexes are simply the ordered indexes of the computed features:

deepcluster/clustering.py

Lines 207 to 208 in f5995e9

for i in range(len(data)):
    self.images_lists[I[i]].append(i)

When the dataset is created with the new pseudolabels, these indices are used, but they are then incorrect because the order of the computed features no longer corresponds to the actual index in the dataset.

I just wanted to note this, as one might be tempted to simply put shuffle=True in the DataLoader.
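
If someone does want shuffled feature extraction, one option is to record the dataset index of every feature row and use that instead of the row number when filling the lists. The sketch below illustrates the idea with plain Python lists; `build_images_lists` and `dataset_indices` are hypothetical names, and the dataset would have to return its own index alongside each image for this to work.

```python
def build_images_lists(assignments, dataset_indices, k):
    """Group dataset indexes by cluster.

    assignments[row] is the cluster of feature row `row`;
    dataset_indices[row] is the dataset index that row was computed from.
    """
    images_lists = [[] for _ in range(k)]
    for row, cluster in enumerate(assignments):
        # Store the true dataset index, not the position in the feature matrix.
        images_lists[cluster].append(int(dataset_indices[row]))
    return images_lists

# Rows were computed in shuffled order 2, 0, 1.
lists = build_images_lists([0, 1, 0], [2, 0, 1], k=2)
print(lists)  # [[2, 1], [0]]
```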

3. Smaller datasets

Have you tried your method on any smaller datasets, e.g. initializing it with a supervised model trained on ImageNet and then doing unsupervised fine-tuning on a new dataset? Any success with such smaller datasets?

Training on Pascal VOC 2007

When training the model on Pascal VOC 2007 for clustering, is the model randomly initialized, or initialized with parameters pre-trained on ImageNet?

How do you convert a PyTorch model to Caffe model?

A silly question: in order to use the Faster RCNN lib, it seems you need to generate the ".caffemodel" and ".prototxt" files for a given PyTorch model. Why is that part of the script not included in this repo?

Request for cluster assignment list

Hello,

For an experiment I need the cluster assignment resulting from this algorithm. I would compute it myself, but at the moment I do not have sufficient RAM to store the complete feature matrix. Therefore I am kindly asking whether one of the authors, or somebody else who has run the experiment, could upload this file for me.

Specifically, I need the clustering resulting from the provided pre-trained VGG model. I do not need the feature vectors, only the content of deepcluster.images_lists, which already gets saved to the exp/1/clusters file (see line 207 in main.py). So there is a chance that somebody who has already run the experiment for at least one epoch with the pretrained model could upload it for me.

Thanks in advance,
Chris
