The goals / steps of this project are the following:
- Use the simulator to collect data of good driving behavior
- Build a convolutional neural network in Keras that predicts steering angles from images
- Train and validate the model with a training and validation set
- Test that the model successfully drives around track one without leaving the road
- Summarize the results with a written report
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
My project includes the following files:
- model.py containing the script to create and train the model
- drive.py for driving the car in autonomous mode
- model.h5 containing a trained convolutional neural network
- readme.md summarizing the results
- video.mp4, a recording from the center camera of the car driving autonomously
- sdcnd.mp4, a recording of the self-driving car during testing
Using the Udacity provided simulator and my drive.py file, the car can be driven autonomously around the track by executing
python drive.py model.h5
The model.py file contains the code for training and saving the convolution neural network. The file shows the pipeline I used for training and validating the model, and it contains comments to explain how the code works.
My model's architecture is based on this paper by NVIDIA. I made some changes to the network for my own training.
The model relies on convolutional layers:
model.add(Convolution2D(24, 5, 5, subsample=(2,2), activation='relu'))
model.add(Convolution2D(36, 5, 5, subsample=(2,2), activation='relu'))
model.add(Convolution2D(48, 5, 5, subsample=(2,2), activation='relu'))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(Convolution2D(88, 3, 3, activation='relu'))
I used five convolutional layers with the rectified linear unit (ReLU) as the activation function. After trying multiple combinations of filter depth, width, height, and stride, I settled on the values above.
I cropped the images with the Keras Cropping2D() layer to keep only the relevant portion of each image. This reduces training time and improves the results.
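To make the effect of cropping and striding concrete, here is a minimal sketch that traces the tensor shape through the five convolution layers listed above. It assumes unpadded ('valid') convolutions and a 160x320x3 input. The crop amounts (50 rows from the top, 20 from the bottom) are assumptions chosen for illustration, since this write-up does not state the values used, and the helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def trace_shapes(height, width, crop_top, crop_bottom, conv_layers):
    """Return (height, width, depth) after cropping and after each conv layer."""
    height -= crop_top + crop_bottom
    shapes = [(height, width, 3)]
    for filters, kernel, stride in conv_layers:
        height = conv_output_size(height, kernel, stride)
        width = conv_output_size(width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


# (filters, kernel size, stride) for the five convolution layers.
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (88, 3, 1)]

# Crop amounts are assumed for illustration only; they are not from the write-up.
shapes = trace_shapes(160, 320, crop_top=50, crop_bottom=20, conv_layers=CONV_LAYERS)
for shape in shapes:
    print(shape)
```

With these assumed crop values, the feature map shrinks from 90x320x3 after cropping to 4x33x88 after the last convolution, which shows how much less data the later layers have to process.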
I have taken a few steps to reduce the probability of over-fitting and build an efficient model:
- Normalizing the data with a Keras Lambda() layer.
- Augmenting the data by flipping the images. This increased the training data to three times its original size.
- Training on another track to help the model generalize better.
- Using a max-pooling layer and dropout in the network.
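The flipping augmentation in the list above can be sketched as follows. This is an illustrative version that treats an image as a nested list of pixel rows; the function name `flip_sample` is hypothetical and not taken from model.py. The key point is that mirroring the image left to right must be paired with negating the steering angle.

```python
def flip_sample(image, steering_angle):
    """Mirror an image left to right and negate its steering angle.

    `image` is a list of rows, each row a list of pixel values.
    """
    flipped = [row[::-1] for row in image]
    return flipped, -steering_angle


# A tiny 2x3 stand-in "image" used only to show the effect.
tiny_image = [[1, 2, 3], [4, 5, 6]]
flipped_image, flipped_angle = flip_sample(tiny_image, 0.25)
print(flipped_image, flipped_angle)
```

Because most laps on a circuit turn in one direction, mirrored samples help balance left and right steering examples.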
The model was trained and validated on different data sets to ensure that the model was not overfitting. The model was tested by running it through the simulator and ensuring that the vehicle could stay on the track.
The model used an Adam optimizer, so the learning rate was not tuned manually. Other parameters, such as the height, width, and depth of the filters, the number of layers, the dropout rate, and the number of epochs, were tuned after trying numerous combinations. I trained on a 1050 Ti GPU (4 GB), and a batch size of 64 worked well on it.
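As an illustration of what a batch size of 64 means in practice, the sketch below splits a list of samples into batches for one pass over the data. It is an assumption that batching would be done this way; the write-up does not say whether a generator was used, and `batch_generator` is a hypothetical name.

```python
import random


def batch_generator(samples, batch_size=64, shuffle=True, seed=None):
    """Yield successive batches for one pass; the last batch may be smaller."""
    order = list(samples)
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


# 150 stand-in samples split into batches of 64.
batches = list(batch_generator(range(150), batch_size=64, seed=0))
print([len(batch) for batch in batches])  # [64, 64, 22]
```

Loading one batch at a time keeps only a small slice of the images in memory, which matters on a GPU with 4 GB.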
The training data was collected by driving the car:
- In the Center lane
- Recovering from Left & Right sides of the road
- Clockwise and Anticlockwise driving on the track
- Driving the car on a different track
- Capturing more data on turns
For details about how I created the training data, see the next section.
The strategy while designing the model was to create a complex enough model that can learn the behavior of our training data and drive the car autonomously on the simulator.
I decided to use a convolutional neural network similar to NVIDIA's architecture. I thought this model would be appropriate because it was designed for a similar problem, and it worked well.
The first step was to build a training data set. For this, I used the simulator provided by Udacity and captured more than 90,000 images (center, left, and right cameras). The simulator creates a log file that records the left, center, and right camera images along with the steering angle, speed, and other values.
I decided to use all the images as training features and the steering angle as the output. This made the task a regression problem: predicting the steering angle.
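A minimal sketch of turning the simulator log into (image path, steering angle) pairs might look like this. It assumes the log is a CSV whose first four columns are the center, left, and right image paths followed by the steering angle, with no header row. The `correction` parameter is a hypothetical addition: the write-up does not say whether the side-camera angles were adjusted, so it defaults to 0.0, which uses the recorded angle unchanged. The file names in the demo row are made up.

```python
import csv
import io


def load_samples(log_file, correction=0.0):
    """Return (image_path, steering_angle) pairs for all three cameras.

    `correction` is an assumed offset applied to side-camera angles.
    """
    samples = []
    for row in csv.reader(log_file):
        center, left, right = (path.strip() for path in row[:3])
        steering = float(row[3])
        samples.append((center, steering))
        samples.append((left, steering + correction))
        samples.append((right, steering - correction))
    return samples


# One made-up log row, only to show the shape of the result.
demo_log = io.StringIO(
    "IMG/center_1.jpg, IMG/left_1.jpg, IMG/right_1.jpg, 0.5, 0.9, 0.0, 30.1\n"
)
samples = load_samples(demo_log, correction=0.25)
print(samples)
```

Each log row yields three training samples, which is how a single drive produces images from all three cameras.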
I then used the training data to train the CNN.
The final step was to run the simulator to see how well the car drove around track one. There were a few spots where the vehicle fell off the track and into the water. To improve the driving behavior in these cases, I collected extra training data on turns and added the dropout layer. This gave good results, in line with my expectations, after the seventh try.
At the end of the process, the vehicle is able to drive autonomously around the track without leaving the road.
The final model architecture consisted of a convolution neural network with the following layers and layer sizes:
Layer | Description |
---|---|
Input | 160x320x3 RGB image |
Cropping2D | Cropping layer to crop the images |
Lambda Layer | Normalization of pixel values using Lambda layer |
Convolution2D | 24 filters of 5x5, 2x2 stride, ReLU activation |
Convolution2D | 36 filters of 5x5, 2x2 stride, ReLU activation |
Convolution2D | 48 filters of 5x5, 2x2 stride, ReLU activation |
Convolution2D | 64 filters of 3x3, default stride, ReLU activation |
Convolution2D | 88 filters of 3x3, default stride, ReLU activation |
MaxPooling2D | Max-pooling layer that reduces the spatial dimensions, which helps limit over-fitting |
Flatten | Flattens the output of the preceding layers into a 1D vector |
Dense | Fully connected layer with 320 output nodes |
Dropout | Reduces the probability of overfitting by dropping 50% of the nodes during training |
Dense | Fully connected layer with 100 output nodes |
Dense | Fully connected layer with 50 output nodes |
Dense | Fully connected layer with 1 output node, the final steering angle |
To capture good driving behavior, I first recorded two laps on track one using center lane driving. Here is an example image of center lane driving:
I then recorded the vehicle recovering from the left side and right sides of the road back to center so that the vehicle would learn to come back to the track:
Then I repeated this process on track two in order to get more data points.
To augment the data set, I also flipped images and angles, expecting this to help the model generalize. For example, here is an image that has then been flipped:
After the collection process, I had 91,491 images captured by the cameras. I then augmented the data by flipping images. This produced three times as much data, i.e., 274,473 data points.
Finally, I randomly shuffled the data set and put 20% of the data into a validation set.
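The shuffle-and-split step can be sketched in plain Python as below. This is an illustrative version, not the code from model.py; the function name and the `seed` parameter are assumptions added so the split is reproducible.

```python
import random


def shuffle_and_split(samples, validation_fraction=0.2, seed=None):
    """Shuffle the samples and hold out a fraction of them for validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Ten stand-in samples: 8 go to training and 2 to validation.
train, validation = shuffle_and_split(range(10), seed=42)
print(len(train), len(validation))  # 8 2
```

Shuffling before splitting matters here because consecutive frames from the same stretch of road are nearly identical, so an unshuffled split would hold out only one part of the track.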
I used this training data to train the model. The validation set helped determine whether the model was overfitting or underfitting. The ideal number of epochs was 3, as evidenced by the losses. Even with 5 epochs the loss was still fluctuating a bit, so I reduced the count to 3 to keep the training time reasonable. I used an Adam optimizer, so manually tuning the learning rate wasn't necessary.