
simclrv1-keras-tensorflow's Introduction

SimCLR

A TensorFlow-Keras implementation of SimCLRv1, which improves the feature representation quality of your base_model by means of the Simple Framework for Contrastive Learning of Visual Representations (SimCLR). The provided code should make it possible to apply the framework to any Keras model with only minor changes.

Fig.1 - SimCLR Illustration [1]

The given implementation increased the top-1 accuracy of the linear classifier trained with 5% of the data by 17 percentage points (from 0.63 to 0.80, see Results). Furthermore, the t-SNE plot shows a clear clustering of the features according to their class after training with the SimCLR framework.

Fig.2.1 - t-SNE of VGG16-features before SimCLR
Fig.2.2 - t-SNE of VGG16-features after SimCLR

It is possible to reproduce these results via the following notebook: Open In Colab

References: Paper, GitHub, Blog

How to use?

SimCLR = SimCLR(base_model, input_shape, batch_size, feat_dim, feat_dims_ph, num_of_unfrozen_layers, save_path)

The method SimCLR.train trains the SimCLR_model. It takes the training and validation data, both of type DataGeneratorSimCLR. The attribute SimCLR.base_model keeps track of the changing base_model. The feature representation quality can be evaluated in a number of ways, see below.

Implementation

A SimCLR class has been defined which builds a Keras SimCLR_model around the base_model. The aim is to improve the feature encoding quality of this base_model. The SimCLR_model has 2·batch_size inputs, each of the image size, and one matrix output of shape (batch_size x 4·batch_size).

  1. Each of the batch_size images is transformed twice by a random image distortion (see Fig.1), giving the 2·batch_size input images. See DataGeneratorSimCLR and SimCLR_data_util for the details.
  2. These input images are passed through the base model and an MLP projection head, resulting in a feature encoding.
  3. The SimCLR_model output is obtained from pairwise vector multiplications between all computed feature encodings. These vector multiplications correspond to the cosine similarity, after which the similarity is passed through a softmax. Since the aim is to 'attract' feature representations of the same image and 'repel' representations of different images, the SimCLR output matrix should match [I|O|I|O], with I = identity matrix and O = zero matrix. For this purpose, a custom Keras layer is defined: SoftmaxCosineSim (see the notebook for an intuitive toy example).
  4. A simple Keras cross_entropy-loss can be used to evaluate the difference between the SimCLR-output and [I|O|I|O].
  5. As such, the SimCLR_model can be trained and simultaneously the feature encoding improves.
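
The steps above can be sketched in plain Python to make the shapes and the [I|O|I|O] target concrete. This is not the repository's SoftmaxCosineSim layer: the block order [ab | aa | ba | bb], the masking of self-similarities, and the temperature value are assumptions that follow the usual SimCLR loss layout, and all function names are hypothetical.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_cosine_sim(z_a, z_b, temperature=0.1):
    """Return a (batch_size x 4*batch_size) matrix laid out as [ab | aa | ba | bb].

    z_a, z_b: encodings of the first and second distorted view of each image.
    temperature: assumed scaling factor, not taken from the repository.
    """
    z_a = [l2_normalize(v) for v in z_a]
    z_b = [l2_normalize(v) for v in z_b]
    n = len(z_a)
    rows = []
    for i in range(n):
        # Cross-view similarities; the positive pair sits at column i.
        ab = [dot(z_a[i], z_b[j]) / temperature for j in range(n)]
        ba = [dot(z_b[i], z_a[j]) / temperature for j in range(n)]
        # Same-view similarities; a vector's similarity with itself is masked out.
        aa = [dot(z_a[i], z_a[j]) / temperature if j != i else -math.inf for j in range(n)]
        bb = [dot(z_b[i], z_b[j]) / temperature if j != i else -math.inf for j in range(n)]
        rows.append(softmax(ab + aa) + softmax(ba + bb))
    return rows

def target_matrix(n):
    """Build [I|O|I|O]: ones on the diagonals of the ab and ba blocks."""
    return [[1.0 if j in (i, 2 * n + i) else 0.0 for j in range(4 * n)] for i in range(n)]

def cross_entropy(target, output, eps=1e-12):
    """Mean over rows of -sum(target * log(output))."""
    total = 0.0
    for t_row, o_row in zip(target, output):
        total -= sum(t * math.log(o + eps) for t, o in zip(t_row, o_row))
    return total / len(target)

# Toy batch of two images with 2-D encodings for each view.
z_a = [[1.0, 0.0], [0.0, 1.0]]
z_b = [[0.9, 0.1], [0.2, 0.8]]
out = softmax_cosine_sim(z_a, z_b)
loss = cross_entropy(target_matrix(2), out)
print(len(out), len(out[0]))  # 2 8
```

The loss decreases as the two views of the same image move closer together relative to all other encodings in the batch, which is the 'attract' and 'repel' behaviour described in step 3.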

Difference with official implementation:

  • Swish activation instead of relu in projection head
  • As only 1 device is used, no global batch normalization
  • Only colour distortion used with reduced color_jitter strength of 0.5 instead of 1.0. Possible to activate other distortions in DataGeneratorSimCLR.
  • Adam optimizer instead of Lars, no warmup nor cosine decay on learning rate, reduction on plateau instead.
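
For readers unfamiliar with the first difference, Swish is defined as x · sigmoid(x). The sketch below compares it with ReLU on scalars. It only illustrates the activation itself and says nothing about how the projection head is built in the repository.

```python
import math

def relu(x):
    """ReLU: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

def swish(x):
    """Swish: x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)

for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  swish={swish(x):.4f}")
```

Unlike ReLU, Swish is smooth and lets small negative values through, so its gradient is not exactly zero for negative inputs.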

Experiments

SimCLR has been used as a self-supervised learning approach to improve the feature encoding quality of a pretrained VGG16 network. A SimCLR_model has been built around the base_model and subsequently trained on the SimCLR task. For this, the layers of the base model were unfrozen gradually. A clear improvement of the feature representations could be observed for the downstream classification task.

Data: Trashnet

The trashnet-dataset has been used. The original dataset has been reduced to 5 classes with the following number of instances:

  • Glass: 501
  • Paper: 594
  • Cardboard: 403
  • Plastic: 482
  • Metal: 410

The original images of size (512x384) have been center-cropped and resized to (80x80). The data has been split into train/val/test sets in a 70/15/15 ratio.
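
The preprocessing arithmetic can be sketched as follows. The square crop on the shorter side, the rounding of the split sizes, and the unstratified shuffle are assumptions, since the text does not specify them; the function names are hypothetical.

```python
import random

def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle range(n) and cut it into train/val/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    # The test set takes whatever remains, so the three parts always cover all n items.
    return indices[:n_train], indices[n_train:n_train + n_val], indices[n_train + n_val:]

class_counts = {"glass": 501, "paper": 594, "cardboard": 403, "plastic": 482, "metal": 410}
total = sum(class_counts.values())  # 2390 images over the 5 classes
train, val, test = split_indices(total)
print(center_crop_box(512, 384))  # (64, 0, 448, 384)
print(len(train), len(val), len(test))
```

The crop box can be passed to an image library's crop function before resizing the result to 80x80.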

Note that similar results have been observed on a private dataset, see Project Context below.

Evaluation

The feature quality is evaluated by means of

  • A linear classifier (logistic regression) trained on the extracted features of the encoder
  • A fine-tuned classifier. 5 attempts are performed, the best classifier is kept.
  • A t-SNE visualization is made.

These evaluations are done for 3 fractions of the training data: 100%, 20%, 5%.
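
One way to draw such fractions of the training set is a per-class random subsample, sketched below. Whether the repository subsamples per class or over the whole set is not stated, so the stratification, the seed, and the function name are assumptions.

```python
import random
from collections import defaultdict

def subsample(labels, fraction, seed=0):
    """Return sorted indices of a random subset covering `fraction` of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    chosen = []
    for label in sorted(by_class):
        idxs = by_class[label]
        # Keep at least one example per class, even for very small fractions.
        k = max(1, round(fraction * len(idxs)))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)

# Hypothetical label list, only to show the resulting subset sizes.
labels = ["paper"] * 100 + ["metal"] * 40
for fraction in (1.0, 0.2, 0.05):
    print(fraction, len(subsample(labels, fraction)))
```

Sampling per class keeps the class balance of the smaller training sets close to that of the full set, which matters most at the 5% fraction.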

Results

The table below lists the top-1 accuracy for all cases. It can be seen that SimCLR improves the classification performance for all fractions of the training set on both the linear and fine-tuned classifier.

One can consequently conclude that the feature encoding of the base_model clearly improves thanks to the SimCLR framework.

Fraction of training set   Classifier   VGG16         SimCLR
100%                       Linear       0.79 ± 0.00   0.82 ± 0.01
100%                       Fine-tuned   0.85 ± 0.01   0.87 ± 0.01
20%                        Linear       0.70 ± 0.00   0.81 ± 0.02
20%                        Fine-tuned   0.83 ± 0.01   0.86 ± 0.01
5%                         Linear       0.63 ± 0.00   0.80 ± 0.02
5%                         Fine-tuned   0.80 ± 0.02   0.84 ± 0.03

Since the results change slightly because of the stochastic nature of the optimization procedure of both the SimCLR_model and the fine-tuned classifier, the average and standard deviation over 10 runs are presented in the table above.
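
A table entry of this form can be computed as sketched below. The accuracies are placeholder values and not results from the repository, and the use of the sample standard deviation is an assumption, as the text does not say which variant was used.

```python
import statistics

def summarize(accuracies):
    """Format repeated-run accuracies as 'mean ± std' with two decimals."""
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation (assumed variant)
    return f"{mean:.2f} ± {std:.2f}"

# Placeholder accuracies for illustration only.
placeholder_accuracies = [0.80, 0.82, 0.78, 0.81, 0.79]
print(summarize(placeholder_accuracies))
```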

Project Context

This repository is part of a joint research project of KU Leuven, Sagacify and BESIX on the topic of automatic monitoring of waste containers on construction sites. For this purpose, data has been collected over a period of 5 months. Similar results were achieved on this dataset. See below for an illustration of the type of data. If you are interested in the details of this research, please feel free to reach out.

Fig.3 - Illustration of ContAIner output


simclrv1-keras-tensorflow's Issues

Error in executing training and predictions

While executing the notebook https://github.com/mwdhont/SimCLRv1-keras-tensorflow/blob/master/2_model_SimCLR.ipynb in Google Colab, the line y_predict_test_before = SimCLR.predict(data_test) raises the following error.

ValueError: Failed to find data adapter that can handle input: <class 'DataGeneratorSimCLR.DataGeneratorSimCLR'>, <class 'NoneType'>
Converting data_test, data_train, data_val to numpy arrays does not help.

A similar error occurs when executing this line:
SimCLR.train(data_train, data_val, epochs = 5)

ValueError: could not convert string to float: '[1, 0]'

Hello @mwdhont, I tried to train on a custom dataset consisting of two classes "car" and "car_and_person".
I checked and modified the required areas such as the dataset generator, num_classes, and class_one_hot = [0, 1].

But I'm unable to fine-tune the model after training.

Here is the Colab notebook where you can check the errors. If you want, I can share the dataset for testing.

https://colab.research.google.com/drive/1-en1-r3Rq_bKwbQTl6QrmMM2QaaRhFxt?usp=sharing

    ==== 100.0% of the training data used ==== 

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-34-ffaef4a4cb5d> in <module>()
     12                                 epochs = params_training_classifier[str(fraction)]["epochs"],
     13                                 verbose_epoch = 0,
---> 14                                 verbose_cycle = 0
     15                                 )

9 frames
/content/SimCLRv1-keras-tensorflow/DataGeneratorClass.py in __getitem__(self, index)
     69 
     70             if self.subset == "train":
---> 71                 y[i,] = row[1]["class_one_hot"]
     72 
     73         if self.preprocess is not None:

ValueError: could not convert string to float: '[1, 0]'

Thanks for all the support and a great repository 👍🏼
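
The message shows that row[1]["class_one_hot"] holds the string '[1, 0]' rather than a list of numbers. This typically happens when a table with list-valued cells is written to CSV and read back, though that cause is an assumption here. Under that assumption, a possible workaround is to parse the column before it reaches the generator. The helper name is hypothetical.

```python
import ast

def parse_one_hot(value):
    """Turn a stringified list such as '[1, 0]' back into a list of floats."""
    if isinstance(value, str):
        # literal_eval only evaluates Python literals, so it is safe on untrusted text.
        value = ast.literal_eval(value)
    return [float(v) for v in value]

print(parse_one_hot("[1, 0]"))  # [1.0, 0.0]
```

With a pandas DataFrame, the helper could be applied once after loading, for example df["class_one_hot"] = df["class_one_hot"].apply(parse_one_hot).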

AttributeError: 'NoneType' object has no attribute 'dtype'

I am using tensorflow 2.0. I got following error when running 2_model_SimCLR exactly on the line y_predict_test_before = SimCLR.predict(data_test)

~\AppData\Local\Continuum\anaconda3\envs\segmentation_tf_api\lib\site-packages\tensorflow_core\python\util\nest.py in <listcomp>(.0)
    533 
    534   return pack_sequence_as(
--> 535       structure[0], [func(*x) for x in entries],
    536       expand_composites=expand_composites)
    537 

~\AppData\Local\Continuum\anaconda3\envs\segmentation_tf_api\lib\site-packages\tensorflow_core\python\keras\engine\data_adapter.py in <lambda>(t)
    604 
    605     peek = x[0]
--> 606     nested_dtypes = nest.map_structure(lambda t: t.dtype, peek)
    607     nested_shape = nest.map_structure(dynamic_shape_like, peek)
    608 

AttributeError: 'NoneType' object has no attribute 'dtype'

please advise

thank you

Randomly pause during epoch, neither output nor error

Hello, @mwdhont.
What platform did you run your experiments on, Windows or Linux?
I get random pauses during training, especially during iterations. The process hangs, and the program neither produces output nor reports errors. Increasing the batch size improves the situation, but the problem still exists. Training often gets stuck in the middle of an epoch and cannot proceed to the next steps.
Thanks.

Linear accuracy is better than non linear

I am using the code on my own dataset, but the accuracy of the linear classifier is much better than that of the non-linear one. I also have a problem when running train_NL_and_evaluate: the printed error is "Learning diverged, stopped.".

please advise

thank you

'NoneType' object has no attribute 'shape'

The following error is generated by both: SimCLR.train(data_train, data_val, epochs = 5) and y_predict_test_before = SimCLR.predict(data_test).

I could not trace why this happens, even though I installed the same versions of TF and Keras as listed in the requirements file.

Another question: why does your data generator produce image batches with 5 dimensions, given that VGG only takes (b x w x h x c) as input?

Output for one batch:
batch shape: (64, 1, 80, 80, 3), label_shape: (32, 128)

Thanks.

Unused GPU? Long processed because it used CPU, and not GPU

I tried to run the process following the Colab guide, but it did not use the GPU at all. It only used the CPU, which made the runtime very long (more than 10 hours).
Is it normal or is there something I must do to make the process use GPU and make it faster?

Failing to predict

Getting this error:

ValueError                                Traceback (most recent call last)
<ipython-input-23-d72c61a34340> in <module>
----> 1 y_predict_test_before = SimCLR.predict(data_test)

2 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py in select_data_adapter(x, y)
    986         "Failed to find data adapter that can handle "
    987         "input: {}, {}".format(
--> 988             _type_name(x), _type_name(y)))
    989   elif len(adapter_cls) > 1:
    990     raise RuntimeError(

ValueError: Failed to find data adapter that can handle input: <class 'DataGeneratorSimCLR.DataGeneratorSimCLR'>, <class 'NoneType'>

when running:

y_predict_test_before = SimCLR.predict(data_test)

Loss function returns nan

System used:
Ubuntu 18.04
Tensorflow-gpu 2.1


I used the "2_model_SimCLR.ipynb" notebook to train a model. After three epochs, the loss function returns "nan" values, which breaks the training.
I wonder if you have any solution for this?

Train for 53 steps, validate for 12 steps
Epoch 1/5
52/53 [============================>.] - ETA: 4s - loss: 582.9939     
Epoch 00001: val_loss improved from inf to 497.00585, saving model to models/trashnet/SimCLR/SimCLR_05_05_11h_05.h5
53/53 [==============================] - 395s 7s/step - loss: 581.4498 - val_loss: 497.0059
Epoch 2/5
52/53 [============================>.] - ETA: 0s - loss: 421.8934 
Epoch 00002: val_loss improved from 497.00585 to 342.78980, saving model to models/trashnet/SimCLR/SimCLR_05_05_11h_05.h5
53/53 [==============================] - 36s 675ms/step - loss: 420.4594 - val_loss: 342.7898
Epoch 3/5
52/53 [============================>.] - ETA: 0s - loss: 278.3572 
Epoch 00003: val_loss improved from 342.78980 to 213.78286, saving model to models/trashnet/SimCLR/SimCLR_05_05_11h_05.h5
53/53 [==============================] - 37s 694ms/step - loss: 277.1834 - val_loss: 213.7829
Epoch 4/5
52/53 [============================>.] - ETA: 0s - loss: nan      
Epoch 00004: val_loss did not improve from 213.78286
53/53 [==============================] - 34s 643ms/step - loss: nan - val_loss: nan
Epoch 5/5
52/53 [============================>.] - ETA: 0s - loss: nan 
Epoch 00005: val_loss did not improve from 213.78286
53/53 [==============================] - 34s 639ms/step - loss: nan - val_loss: nan
trainable parameters: 11.86 M.
non-trainable parameters: 4.05 M.
Random guess accuracy: 0.0156
accuracy - test - before: 0.74
accuracy - test - after: nan
y_predict_test_before
0.73 | 0.77 | 0.66 | 0.92 | 0.95 | 0.22 | 0.51 | 0.92 | 0.71 | 0.9 | 0.11 | 0.84 | 0.84 | 0.8 | 0.69 | 
