Image Classification with MLPs - Lab

Introduction

For the final lab in this section, we'll build a more advanced Multi-Layer Perceptron to solve image classification for a classic dataset, MNIST! This dataset consists of 70,000 labeled images of handwritten digits, and it has a special place in the history of Deep Learning.

Objectives

  • Build a multi-layer neural network image classifier using Keras

Packages

First, let's import all the classes and packages you'll need for this lab.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.datasets import mnist
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'  # Prevents kernel shutdown caused by a duplicate OpenMP library conflict (e.g. via xgboost)

Data

Before we get into building the model, let's load our data and take a look at a sample image and label.

The MNIST dataset is often used for benchmarking model performance in the world of AI/Deep Learning research. Because it's commonly used, Keras actually includes a helper function to load the data and labels from MNIST -- it even loads the data in a format already split into training and test sets!

Run the cell below to load the MNIST dataset. Note that if this is the first time you are working with MNIST through Keras, this will take a few minutes while Keras downloads the data.

(X_train, y_train), (X_test, y_test) = mnist.load_data()

Great!

Now, let's quickly take a look at an image from the MNIST dataset -- we can visualize it using Matplotlib. Run the cell below to visualize the first image and its corresponding label.

sample_image = X_train[0]
sample_label = y_train[0]
plt.imshow(sample_image)
print('Label: {}'.format(sample_label))

Great! That was easy. Now, we'll see that preprocessing image data has a few extra steps in order to get it into a shape where an MLP can work with it.

Preprocessing Images For Use With MLPs

A grayscale image is really just a matrix -- a grid of pixel values between 0 and 255. We can see this easily enough by looking at a raw image:

sample_image

This is a problem in its current format, because MLPs take their input as vectors, not matrices or tensors. If the images were all different sizes, we would have a more significant problem on our hands, because it would be hard to reshape each image into a vector the exact same size as our input layer. However, this isn't a problem with MNIST, because every image is a grayscale 28x28-pixel image. This means we can just concatenate the rows (or columns) of each image into a single 784-dimensional vector! Since every image is concatenated in exactly the same way, positional information is preserved (e.g. the pixel value for the second pixel in the second row of an image will always land at index 29, counting from 0).

Let's get started. In the cell below, print the .shape of both X_train and X_test
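One way to do this (a minimal sketch):

print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)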

We can interpret these numbers as saying "X_train consists of 60,000 images that are 28x28". We'll need to reshape each image from (28, 28), a 28x28 matrix, to (784,), a 784-element vector. However, we need to make sure that the first number in our reshape call for both X_train and X_test still corresponds to the number of observations in each.

In the cell below:

  • Use the .reshape() method to reshape X_train. The first parameter should be 60000, and the second parameter should be 784
  • Similarly, reshape X_test to 10000 and 784
  • Also, chain both .reshape() calls with an .astype('float32'), so that we convert our data from type uint8 to float32
X_train = None
X_test = None
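One possible way to complete this cell, following the steps above:

# Flatten each 28x28 image into a 784-element vector and cast from uint8 to float32
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')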

Now, let's check the shape of our training and test data again to see if it worked.

Great! Now, we just need to normalize our data!

Normalizing Image Data

Since all pixel values will always be between 0 and 255, we can just scale our data by dividing every element by 255! Run the cell below to do so now.

X_train /= 255.
X_test /= 255.

Great! We've now finished preprocessing our image data. However, we still need to deal with our labels.

Preprocessing our Labels

Let's take a quick look at the first 10 labels in our training data:
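For example (a sketch):

print(y_train[:10])  # the first 10 integer labels -- the first one is 5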

As we can see, the labels for each digit image in the training set are stored as the corresponding integer value -- if the image is of a 5, then its label is 5. Since there are ten possible digits, this is a multiclass classification problem, which means we need to one-hot encode our labels before we can use them for training.

Luckily, Keras provides a really easy utility function to handle this for us.

In the cell below:

  • Use the function to_categorical() to one-hot encode our labels. This function can be found in the keras.utils sub-module. Pass in the following parameters:
    • The object we want to one-hot encode, which will be y_train/y_test
    • The number of classes contained in the labels, 10
y_train = None
y_test = None
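One possible way to complete this cell:

from keras.utils import to_categorical

# One-hot encode the integer labels into 10-dimensional vectors
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)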

Great. Now, let's examine the label for the first data point, which we saw was 5 before.
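For example (a sketch):

print(y_train[0])  # expect a 10-element vector with a 1 at index 5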

Perfect! As we can see, index 5 (the sixth element) is set to 1, while everything else is set to 0. That was easy! Now, let's get to the fun part -- building our model!

Building our Model

For the remainder of this lab, we won't hold your hand as much -- flex your newfound Keras muscles and build an MLP with the following specifications:

  • A Dense hidden layer with 64 neurons, and a 'tanh' activation function. Also, since this is the first hidden layer, be sure to pass in input_shape=(784,) in order to create a correctly-sized input layer!
  • Since this is a multiclass classification problem, our output layer will need to be a Dense layer where the number of neurons is the same as the number of classes in the labels. Also, be sure to set the activation function to 'softmax'
model_1  = None
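One possible way to build this model, following the specifications above:

model_1 = Sequential()
model_1.add(Dense(64, activation='tanh', input_shape=(784,)))  # hidden layer, sized to the 784-element input vectors
model_1.add(Dense(10, activation='softmax'))                   # output layer, one neuron per class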

Now, compile your model with the following parameters:

  • loss='categorical_crossentropy'
  • optimizer='sgd'
  • metrics = ['acc']
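For example (a sketch):

model_1.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])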

Let's quickly inspect the shape of our model before training it and see how many training parameters we have. In the cell below, call the model's .summary() method.
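For example:

model_1.summary()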

50,890 trainable parameters! Note that while this may seem large, deep neural networks in production may have hundreds of layers and many millions of trainable parameters!

Let's get on to training. In the cell below, fit the model. Use the following parameters:

  • Our training data and labels
  • epochs=5
  • batch_size=64
  • validation_data=(X_test, y_test)
results_1 = None
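One possible way to complete this cell:

results_1 = model_1.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))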

Visualizing our Loss and Accuracy Curves

Now, let's inspect the model's performance and see if we detect any overfitting or other issues. In the cell below, create two plots:

  • The loss and val_loss over the training epochs
  • The acc and val_acc over the training epochs

HINT: Consider copying over the visualization function from the previous lab in order to save time!

def visualize_training_results(results):
    pass
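If you don't have the previous lab's function handy, here is a minimal sketch. It assumes results is the History object returned by .fit() and that the metric keys are 'acc'/'val_acc', as set in .compile() above (depending on your Keras version they may appear as 'accuracy'/'val_accuracy'):

def visualize_training_results(results):
    history = results.history

    # Loss curves
    plt.figure()
    plt.plot(history['val_loss'])
    plt.plot(history['loss'])
    plt.legend(['val_loss', 'loss'])
    plt.title('Loss')
    plt.xlabel('Epochs')
    plt.show()

    # Accuracy curves
    plt.figure()
    plt.plot(history['val_acc'])
    plt.plot(history['acc'])
    plt.legend(['val_acc', 'acc'])
    plt.title('Accuracy')
    plt.xlabel('Epochs')
    plt.show()

visualize_training_results(results_1)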

Pretty good! Note that since our validation scores are currently higher than our training scores, it's extremely unlikely that our model is overfitting to the training data. This is a good sign -- it means we can probably trust the result that our model is ~91.7% accurate at classifying handwritten digits!

Building a Bigger Model

Now, let's add another hidden layer and see how this changes things. In the cells below, create a second model. This model should have the following architecture:

  • Input layer and first hidden layer same as model_1
  • Another Dense hidden layer, this time with 32 neurons and a 'tanh' activation function
  • An output layer same as model_1
model_2 = None
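One possible way to build this model:

model_2 = Sequential()
model_2.add(Dense(64, activation='tanh', input_shape=(784,)))  # same first hidden layer as model_1
model_2.add(Dense(32, activation='tanh'))                      # new second hidden layer
model_2.add(Dense(10, activation='softmax'))                   # same output layer as model_1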

Let's quickly inspect the .summary() of the model again, to see how many new trainable parameters this extra hidden layer has introduced.

This model isn't much bigger, but the extra layer means that the 2,080 parameters in the new hidden layer can focus on higher levels of abstraction than the first hidden layer. Let's see how it compares after training.

In the cells below, compile and fit the model using the same parameters you did for model_1.

results_2 = None
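For example (a sketch, using the same parameters as before):

model_2.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])
results_2 = model_2.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))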

Now, visualize the training results again.
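For example:

visualize_training_results(results_2)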

Slightly better validation accuracy, with no evidence of overfitting -- great! If you run the model for more epochs, you'll see its performance continue to improve until the validation metrics plateau and the model begins to overfit to the training data.

A Bit of Tuning

As a final exercise, let's see what happens to the model's performance if we switch activation functions from 'tanh' to 'relu'. In the cell below, recreate model_2, but replace all 'tanh' activations with 'relu'. Then, compile, train, and plot the results using the same parameters as the other two.

model_3 = None
results_3 = None
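One possible way to complete this step (a sketch):

model_3 = Sequential()
model_3.add(Dense(64, activation='relu', input_shape=(784,)))  # 'tanh' swapped for 'relu'
model_3.add(Dense(32, activation='relu'))
model_3.add(Dense(10, activation='softmax'))

model_3.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['acc'])
results_3 = model_3.fit(X_train, y_train, epochs=5, batch_size=64,
                        validation_data=(X_test, y_test))
visualize_training_results(results_3)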

Performance improved even further! ReLU is one of the most commonly used activation functions today -- it's especially useful in computer vision problems like image classification, as we've just seen.

Summary

In this lab, you once again practiced and reviewed the process of building a neural network. This time, you built a more complex network with additional layers which improved the performance of your model on the MNIST dataset!
