Image Classification with MLPs - Lab

Introduction

For the final lab in this section, we'll build a more advanced Multi-Layer Perceptron to solve image classification for a classic dataset, MNIST! This dataset consists of 70,000 labeled images of handwritten digits, and it has a special place in the history of Deep Learning.

Objectives

  • Build a multi-layer neural network image classifier using Keras

Packages

First, let's import all the classes and packages you'll need for this lab.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'  # Prevents kernel shutdown caused by a duplicate OpenMP library conflict (e.g. via xgboost)

Data

Before we get into building the model, let's load our data and take a look at a sample image and label.

The MNIST dataset is often used for benchmarking model performance in the world of AI/Deep Learning research. Because it's commonly used, Keras actually includes a helper function to load the data and labels from MNIST -- it even loads the data in a format already split into training and test sets!

Run the cell below to load the MNIST dataset. Note that if this is the first time you are working with MNIST through Keras, this will take a few minutes while Keras downloads the data.

(X_train, y_train), (X_test, y_test) = mnist.load_data()

Great!

Now, let's quickly take a look at an image from the MNIST dataset -- we can visualize it using Matplotlib. Run the cell below to visualize the first image and its corresponding label.

sample_image = X_train[0]
sample_label = y_train[0]
plt.imshow(sample_image)
print('Label: {}'.format(sample_label))

Great! That was easy. Now, we'll see that preprocessing image data has a few extra steps in order to get it into a shape where an MLP can work with it.

Preprocessing Images For Use With MLPs

A grayscale image is really just a matrix -- a grid of pixel values between 0 and 255. We can see this easily enough by looking at a raw image:

sample_image

This is a problem in its current format, because MLPs take their input as vectors, not matrices or tensors. If the images were all different sizes, we would have a more significant problem on our hands, because it would be hard to reshape each image into a vector the exact same size as our input layer. However, this isn't a problem with MNIST, because every image is a grayscale 28x28-pixel image. This means we can just concatenate the rows (or columns) of each image into a single 784-dimensional vector! Since every image is concatenated in exactly the same way, positional information is preserved (e.g. the pixel value for the second pixel in the second row of an image will always land at index 29, counting from 0).

Let's get started. In the cell below, print the .shape of both X_train and X_test
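One way to do this (a minimal sketch):

print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)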

We can interpret these numbers as saying "X_train consists of 60,000 images that are 28x28". We'll need to reshape each image from (28, 28), a 28x28 matrix, to (784,), a 784-element vector. However, we need to make sure that the first number in our reshape call for both X_train and X_test still corresponds to the number of observations in each.

In the cell below:

  • Use the .reshape() method to reshape X_train. The first parameter should be 60000, and the second parameter should be 784
  • Similarly, reshape X_test to 10000 and 784
  • Also, chain both .reshape() calls with an .astype('float32'), so that we convert our data from type uint8 to float32
X_train = None
X_test = None
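One possible way to complete this cell, following the steps above:

# Flatten each 28x28 image into a 784-element vector and cast from uint8 to float32
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')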

Now, let's check the shape of our training and test data again to see if it worked.

Great! Now, we just need to normalize our data!

Normalizing Image Data

Since all pixel values will always be between 0 and 255, we can just scale our data by dividing every element by 255! Run the cell below to do so now.

X_train /= 255.
X_test /= 255.

Great! We've now finished preprocessing our image data. However, we still need to deal with our labels.

Preprocessing our Labels

Let's take a quick look at the first 10 labels in our training data:
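For example (a sketch):

print(y_train[:10])  # the first 10 integer labels -- the first one is 5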

As we can see, the labels for each digit image in the training set are stored as the corresponding integer value -- if the image is of a 5, then its label is 5. Since there are ten possible digits, this is a multiclass classification problem, which means we need to one-hot encode our labels before we can use them for training.

Luckily, Keras provides a really easy utility function to handle this for us.

In the cell below:

  • Use the function to_categorical() to one-hot encode our labels. This function can be found in the keras.utils sub-module. Pass in the following parameters:
    • The object we want to one-hot encode, which will be y_train/y_test
    • The number of classes contained in the labels, 10
y_train = None
y_test = None
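One possible way to complete this cell:

from keras.utils import to_categorical

# One-hot encode the integer labels into 10-dimensional vectors
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)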

Great. Now, let's examine the label for the first data point, which we saw was 5 before.
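For example (a sketch):

print(y_train[0])  # expect a 10-element vector with a 1 at index 5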

Perfect! As we can see, index 5 (the sixth element) is set to 1, while everything else is set to 0. That was easy! Now, let's get to the fun part -- building our model!

Building our Model

For the remainder of this lab, we won't hold your hand as much -- flex your newfound Keras muscles and build an MLP with the following specifications:

  • A Dense hidden layer with 64 neurons, and a 'tanh' activation function. Also, since this is the first hidden layer, be sure to pass in input_shape=(784,) in order to create a correctly-sized input layer!
  • Since this is a multiclass classification problem, our output layer will need to be a Dense layer where the number of neurons is the same as the number of classes in the labels. Also, be sure to set the activation function to 'softmax'
model_1  = None
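One possible way to build this model, following the specifications above:

model_1 = Sequential()
model_1.add(Dense(64, activation='tanh', input_shape=(784,)))  # hidden layer, sized to the 784-element input vectors
model_1.add(Dense(10, activation='softmax'))                   # output layer, one neuron per class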

Now, compile your model with the following parameters:

  • loss='categorical_crossentropy'
  • optimizer='sgd'
  • metrics = ['acc']
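For example (a sketch):

model_1.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])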

Let's quickly inspect the shape of our model before training it and see how many training parameters we have. In the cell below, call the model's .summary() method.
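For example:

model_1.summary()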

50,890 trainable parameters! Note that while this may seem large, deep neural networks in production may have hundreds of layers and many millions of trainable parameters!

Let's get on to training. In the cell below, fit the model. Use the following parameters:

  • Our training data and labels
  • epochs=5
  • batch_size=64
  • validation_data=(X_test, y_test)
results_1 = None
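One possible way to complete this cell:

results_1 = model_1.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))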

Visualizing our Loss and Accuracy Curves

Now, let's inspect the model's performance and see if we detect any overfitting or other issues. In the cell below, create two plots:

  • The loss and val_loss over the training epochs
  • The acc and val_acc over the training epochs

HINT: Consider copying over the visualization function from the previous lab in order to save time!

def visualize_training_results(results):
    pass
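If you don't have the previous lab's function handy, here is a minimal sketch. It assumes results is the History object returned by .fit() and that the metric keys are 'acc'/'val_acc', as set in .compile() above (depending on your Keras version they may appear as 'accuracy'/'val_accuracy'):

def visualize_training_results(results):
    history = results.history

    # Loss curves
    plt.figure()
    plt.plot(history['val_loss'])
    plt.plot(history['loss'])
    plt.legend(['val_loss', 'loss'])
    plt.title('Loss')
    plt.xlabel('Epochs')
    plt.show()

    # Accuracy curves
    plt.figure()
    plt.plot(history['val_acc'])
    plt.plot(history['acc'])
    plt.legend(['val_acc', 'acc'])
    plt.title('Accuracy')
    plt.xlabel('Epochs')
    plt.show()

visualize_training_results(results_1)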

Pretty good! Note that since our validation scores are currently higher than our training scores, it's extremely unlikely that our model is overfitting to the training data. This is a good sign -- it means we can probably trust the result that our model is ~91.7% accurate at classifying handwritten digits!

Building a Bigger Model

Now, let's add another hidden layer and see how this changes things. In the cells below, create a second model. This model should have the following architecture:

  • Input layer and first hidden layer same as model_1
  • Another Dense hidden layer, this time with 32 neurons and a 'tanh' activation function
  • An output layer same as model_1
model_2 = None
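One possible way to build this model:

model_2 = Sequential()
model_2.add(Dense(64, activation='tanh', input_shape=(784,)))  # same first hidden layer as model_1
model_2.add(Dense(32, activation='tanh'))                      # new second hidden layer
model_2.add(Dense(10, activation='softmax'))                   # same output layer as model_1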

Let's quickly inspect the .summary() of the model again, to see how many new trainable parameters this extra hidden layer has introduced.

This model isn't much bigger, but the extra layer means that the 2,080 parameters in the new hidden layer can focus on higher levels of abstraction than the first hidden layer. Let's see how it compares after training.

In the cells below, compile and fit the model using the same parameters you did for model_1.

results_2 = None
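For example (a sketch, using the same parameters as before):

model_2.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])
results_2 = model_2.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))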

Now, visualize the training results again.
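For example:

visualize_training_results(results_2)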

Slightly better validation accuracy, with no evidence of overfitting -- great! If you run the model for more epochs, you'll see its performance continue to improve until the validation metrics plateau and the model begins to overfit to the training data.

A Bit of Tuning

As a final exercise, let's see what happens to the model's performance if we switch activation functions from 'tanh' to 'relu'. In the cell below, recreate model_2, but replace all 'tanh' activations with 'relu'. Then, compile, train, and plot the results using the same parameters as the other two.

model_3 = None
results_3 = None
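One possible way to complete this step (a sketch):

model_3 = Sequential()
model_3.add(Dense(64, activation='relu', input_shape=(784,)))  # 'tanh' swapped for 'relu'
model_3.add(Dense(32, activation='relu'))
model_3.add(Dense(10, activation='softmax'))

model_3.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])
results_3 = model_3.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))
visualize_training_results(results_3)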

Performance improved even further! ReLU is one of the most commonly used activation functions today -- it's especially useful in computer vision problems like image classification, as we've just seen.

Summary

In this lab, you once again practiced and reviewed the process of building a neural network. This time, you built a more complex network with additional layers which improved the performance of your model on the MNIST dataset!
