Denoising Autoencoders - Lab

Introduction

In this lab, we will build a simple denoising autoencoder using a shallow architecture. Following the approach seen earlier in this section, the simple architecture could be replaced by a deep network with multiple layers to learn the intermediate representation. The basic architecture remains the same here; however, the application changes from data compression to data denoising. Let's get on with it.

Objectives

You will be able to:

  • Build a simple denoising autoencoder architecture in Keras
  • Add random Gaussian noise to a given image dataset
  • Predict a clean image from a (previously unseen) noisy image

Load necessary libraries

We first need to load the necessary libraries, including NumPy and Keras, for building our DAE model.

# Import necessary libraries

# Your code here 
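A minimal sketch of the imports this lab relies on, assuming the standalone Keras API (with TensorFlow 2 you would import from `tensorflow.keras` instead):

```python
# Minimal imports for a shallow denoising autoencoder (assumes standalone Keras)
import numpy as np
import matplotlib.pyplot as plt

from keras.datasets import mnist        # or fashion_mnist
from keras.models import Model
from keras.layers import Input, Dense
```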

Load data

This experiment can be performed with any small image dataset, which helps us keep our focus on the architecture and the approach. You can try it with the MNIST, fashion-MNIST, CIFAR10, and CIFAR100 datasets. The CIFAR datasets contain colored images with RGB channels, which can increase training times manyfold. You are encouraged to try these and other larger datasets with this code and give it a few hours (maybe overnight) of training time to run a bigger experiment.

Let's perform the following tasks first, similar to our previous labs (one possible implementation is sketched after the output below):

  • Load the MNIST/fashion-MNIST dataset in Keras (both datasets contain images with the same dimensions). Create train and test datasets
  • Neural networks only accept row vectors as input, so reshape the train and test datasets from 2D arrays to 1D vectors
  • Scale the data to the range [0, 1] so we can use a sigmoid activation function in the output neurons
  • Print the shapes of the resulting datasets
# Code here 
MNIST

60000 training samples
10000 test samples
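A minimal sketch of these steps, assuming the built-in Keras MNIST loader (variable names such as `x_train` and `x_test` are this sketch's own choices):

```python
# Load MNIST; the labels are not needed for an autoencoder
(x_train, _), (x_test, _) = mnist.load_data()

# Flatten each 28x28 image into a 784-element row vector
x_train = x_train.reshape((len(x_train), 28 * 28))
x_test = x_test.reshape((len(x_test), 28 * 28))

# Scale pixel values to [0, 1] so a sigmoid output layer can reproduce them
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

print(x_train.shape[0], 'training samples')
print(x_test.shape[0], 'test samples')
```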

Create a "Noisy" Dataset

Here we will introduce random Gaussian noise to the test and train data. The noisy datasets can be generated using the following general formula, which adds noise with mean 0 and standard deviation 1:

$$NoisyDataset = OriginalDataset + NoiseFactor \times np.random.normal(loc=0.0,\ scale=1.0,\ size=OriginalDataset.shape)$$

  • Use a noise factor of 0.5
  • Create noisy test and train datasets from the original datasets using the formula given above
  • Use np.clip() to restrict the values between 0 and 1

numpy.clip(a, a_min, a_max, out=None) clips (limits) the values in an array.

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

# Code here 
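One possible way to generate the noisy datasets, assuming the `x_train`/`x_test` arrays from the sketch above:

```python
# Add Gaussian noise (mean 0, std 1) scaled by a noise factor of 0.5
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)

# Clip values back into the valid [0, 1] pixel range
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
```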

Build the DAE

  • Build the encoder model that creates a hidden representation of length 32 from an input vector of length 784
  • Use ReLU activation for the encoder model
  • Build the decoder model with a sigmoid activation
  • Show the model summary (a sketch matching the summary appears after it)
# Code here 
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inputs (InputLayer)          (None, 784)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 32)                25120     
_________________________________________________________________
dense_2 (Dense)              (None, 784)               25872     
=================================================================
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
_________________________________________________________________
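A sketch of a shallow DAE matching the summary above (784 → 32 → 784), using the Keras functional API; `autoencoder` is this sketch's own name for the model:

```python
# Shallow denoising autoencoder: 784-dimensional input, 32-unit bottleneck
inputs = Input(shape=(784,), name='inputs')
encoded = Dense(32, activation='relu')(inputs)       # encoder: compress to 32 units
decoded = Dense(784, activation='sigmoid')(encoded)  # decoder: reconstruct 784 pixels

autoencoder = Model(inputs, decoded)
autoencoder.summary()
```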

Compile and Predict

  • Use a batch size of 128 and 30 epochs for training (increase the epochs for better results)
  • Use the adam optimizer and binary cross-entropy as the loss to compile the DAE model
  • Fit the DAE with the noisy dataset as the input and the original dataset as the output. We are teaching the network how the clean version of each image relates to its noisy version
  • Set shuffle=True to shuffle batches of data
  • Make predictions on the noisy test dataset (a sketch of these steps follows the training log below)
# Code here 
Train on 60000 samples, validate on 10000 samples
Epoch 1/30
60000/60000 [==============================] - 3s 49us/step - loss: 0.3941 - val_loss: 0.3420
Epoch 2/30
60000/60000 [==============================] - 2s 37us/step - loss: 0.3322 - val_loss: 0.3284
Epoch 3/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3226 - val_loss: 0.3214
Epoch 4/30
60000/60000 [==============================] - 3s 46us/step - loss: 0.3169 - val_loss: 0.3171
Epoch 5/30
60000/60000 [==============================] - 3s 43us/step - loss: 0.3133 - val_loss: 0.3144
Epoch 6/30
60000/60000 [==============================] - 2s 42us/step - loss: 0.3108 - val_loss: 0.3124
Epoch 7/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3089 - val_loss: 0.3106
Epoch 8/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3075 - val_loss: 0.3095
Epoch 9/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3064 - val_loss: 0.3082
Epoch 10/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3054 - val_loss: 0.3077
Epoch 11/30
60000/60000 [==============================] - 2s 41us/step - loss: 0.3047 - val_loss: 0.3069
Epoch 12/30
60000/60000 [==============================] - 2s 41us/step - loss: 0.3042 - val_loss: 0.3066
Epoch 13/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3039 - val_loss: 0.3063
Epoch 14/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3036 - val_loss: 0.3062
Epoch 15/30
60000/60000 [==============================] - 2s 40us/step - loss: 0.3034 - val_loss: 0.3061
Epoch 16/30
60000/60000 [==============================] - 3s 43us/step - loss: 0.3033 - val_loss: 0.3060
Epoch 17/30
60000/60000 [==============================] - 3s 43us/step - loss: 0.3032 - val_loss: 0.3059
Epoch 18/30
60000/60000 [==============================] - 3s 43us/step - loss: 0.3031 - val_loss: 0.3058
Epoch 19/30
60000/60000 [==============================] - 2s 41us/step - loss: 0.3030 - val_loss: 0.3057
Epoch 20/30
60000/60000 [==============================] - 2s 39us/step - loss: 0.3029 - val_loss: 0.3056
Epoch 21/30
60000/60000 [==============================] - 2s 39us/step - loss: 0.3028 - val_loss: 0.3055
Epoch 22/30
60000/60000 [==============================] - 3s 44us/step - loss: 0.3028 - val_loss: 0.3056
Epoch 23/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3027 - val_loss: 0.3055
Epoch 24/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3026 - val_loss: 0.3056
Epoch 25/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3026 - val_loss: 0.3055
Epoch 26/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3025 - val_loss: 0.3053
Epoch 27/30
60000/60000 [==============================] - 3s 42us/step - loss: 0.3025 - val_loss: 0.3052
Epoch 28/30
60000/60000 [==============================] - 3s 43us/step - loss: 0.3025 - val_loss: 0.3052
Epoch 29/30
60000/60000 [==============================] - 3s 44us/step - loss: 0.3024 - val_loss: 0.3053
Epoch 30/30
60000/60000 [==============================] - 3s 47us/step - loss: 0.3023 - val_loss: 0.3052
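A sketch of the compile, fit, and predict steps, under the same naming assumptions as the earlier sketches:

```python
# Compile and train: noisy images in, clean images out
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

autoencoder.fit(x_train_noisy, x_train,
                epochs=30,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test_noisy, x_test))

# Denoise previously unseen noisy test images
decoded_imgs = autoencoder.predict(x_test_noisy)
```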

View the results

  • Show the first ten images from the clean dataset.
  • Show the images with added noise and images predicted by the DAE.
# display original - Clean dataset

(figure: the first ten clean images)

# Display noisy and  predicted clean images

(figure: noisy test images and the corresponding DAE predictions)
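A minimal matplotlib sketch for the comparison plot, assuming the `x_test_noisy` and `decoded_imgs` arrays from the earlier sketches:

```python
# Display the first ten noisy test images (top row) and the DAE's predictions (bottom row)
n = 10
plt.figure(figsize=(20, 4))
for i in range(n):
    # top row: noisy input
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test_noisy[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
    # bottom row: DAE reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.show()
```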

Here we can see that our model is actually performing quite well. We do see some poor predictions above, due to the heavily reduced dimensionality and the high level of noise we introduced into the dataset. We can further inspect the performance by checking the training and validation loss above. As always, a key takeaway here is that the number of training examples and the training time have a huge impact on the performance of a deep architecture.

Level up - Optional

  • Increase the size of the encoded representation / decrease the amount of noise to see if the performance improves
  • See how the number of training epochs affects the performance
  • Import the faces dataset that we saw in the PCA dimensionality reduction lab from scikit-learn, and repeat the above experiment
  • Look for other interesting datasets / create your own noisy datasets and train the network
  • Create a DEEP denoising autoencoder by modifying the code above

Summary

In this lab we looked at building a simple denoising autoencoder. We created noisy datasets by adding random Gaussian noise to the fashion-MNIST dataset in Keras. Our results show that the network is able to identify the shapes very well, but because of the hugely oversimplified architecture, the reconstruction quality remains limited. Next, we'll see how a convolutional network approach can simplify the task of image reconstruction.
