Explore the EMNIST letter dataset with various dimensionality reduction techniques and visualizations, and then create an ML model to predict handwritten letters.
Heroku Demo Link - may be inactive in the future
Colab Demo - same as Heroku App but can be run straight from Colab
- EMNIST Letter Dataset
- Dimensionality Reduction Techniques
- ML Model
- User Application
- Web Deployment
When exploring various ways to import the EMNIST dataset, we came across a pre-made Python library called emnist, which made loading the various sets of data into our project straightforward. After writing a few custom functions, we were able to generate quick visualizations of the letters along with their labels.
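A minimal sketch of this loading and labelling step is shown here. It assumes the `emnist` package's `extract_training_samples` function and the Letters split, whose labels run from 1 to 26. The helper names `label_to_letter`, `load_letters`, and `show_samples` are hypothetical and are not taken from the project's notebooks.

```python
import string


def label_to_letter(label):
    """Map an EMNIST Letters label (1..26) to a letter for display."""
    if not 1 <= label <= 26:
        raise ValueError(f"expected a label in 1..26, got {label}")
    return string.ascii_uppercase[label - 1]


def load_letters():
    """Load the Letters training split (the data is downloaded on first use)."""
    # Imported lazily so the helpers above work without the package installed.
    from emnist import extract_training_samples

    images, labels = extract_training_samples("letters")
    return images, labels


def show_samples(images, labels, count=8):
    """Plot the first `count` images with their letters as titles."""
    import matplotlib.pyplot as plt

    # squeeze=False keeps `axes` two-dimensional even when count == 1.
    _, axes = plt.subplots(1, count, figsize=(count * 1.5, 2), squeeze=False)
    for ax, image, label in zip(axes[0], images, labels):
        ax.imshow(image, cmap="gray")
        ax.set_title(label_to_letter(int(label)))
        ax.axis("off")
    plt.show()
```

Calling `show_samples(*load_letters())` would display the first eight training images with their letters.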
During the data exploration stage, we applied several dimensionality reduction techniques to better visualize how the letter images cluster together. The highlights are PCA (Principal Component Analysis), which is a good baseline technique, and UMAP (Uniform Manifold Approximation and Projection), which provides clearer clusters and better separation. The images are shown below, and the full code can be found in the dev_notebooks folder.
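To illustrate the PCA baseline, the sketch below projects flattened 28x28 images onto their first two principal components using NumPy's SVD. It runs on random stand-in data rather than EMNIST, and the function name `pca_project` is hypothetical; the project's actual code lives in the dev_notebooks folder and may differ.

```python
import numpy as np


def pca_project(images, n_components=2):
    """Project images onto their top principal components.

    Returns the projected points and the explained-variance ratio
    of each kept component.
    """
    # Flatten each image into one row vector.
    X = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    # PCA requires mean-centered features.
    X_centered = X - X.mean(axis=0)
    # Rows of `components` are the principal directions, sorted by variance.
    _, singular_values, components = np.linalg.svd(X_centered, full_matrices=False)
    projected = X_centered @ components[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return projected, explained[:n_components]


# Stand-in data: 200 random "images" in place of EMNIST letters.
rng = np.random.default_rng(0)
fake_images = rng.random((200, 28, 28))
embedding, explained_ratio = pca_project(fake_images)
```

Each row of `embedding` can then be drawn as a point in a scatter plot, colored by its letter label. UMAP projections come from a separate library and are not covered by this sketch.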
After creating a few different versions of a TensorFlow deep neural network, we ultimately used an architecture that performed well in a Kaggle competition on the MNIST 0-9 digit dataset.
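The exact architecture is not reproduced in this README, so the following is only an illustrative sketch of a small Keras convolutional network for 26 letter classes. The layer sizes, the dropout rate, and the helper names `prepare_inputs` and `build_model` are assumptions and not the project's model.

```python
import numpy as np

NUM_CLASSES = 26  # one class per letter in the EMNIST Letters split


def prepare_inputs(images, labels):
    """Scale pixels to [0, 1], add a channel axis, and shift labels to 0..25."""
    x = np.asarray(images, dtype=np.float32) / 255.0
    x = x.reshape(-1, 28, 28, 1)
    y = np.asarray(labels, dtype=np.int64) - 1  # Letters labels start at 1
    return x, y


def build_model():
    """Build and compile a small CNN (illustrative layer sizes only)."""
    # Imported lazily so prepare_inputs works without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer labels, no one-hot needed
        metrics=["accuracy"],
    )
    return model
```

Training would then look like `x, y = prepare_inputs(images, labels)` followed by `build_model().fit(x, y, epochs=5, validation_split=0.1)`, where the epoch count and validation split are again placeholders.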
To give users an interactive way to draw handwritten letters and see model predictions in real time, we built our front-end application with Streamlit. This demo can easily be run in our Streamlit_App.ipynb Colab notebook.
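One step such an app needs is turning the user's drawing into the 28x28 input the model expects. The sketch below assumes the drawing arrives as an RGBA array whose height and width are multiples of 28 (for example 280x280) with light strokes on a dark background. Both that format and the function name `canvas_to_model_input` are assumptions, not details taken from the app.

```python
import numpy as np


def canvas_to_model_input(rgba, size=28):
    """Downsample an RGBA drawing to a (1, size, size, 1) float batch in [0, 1]."""
    rgba = np.asarray(rgba, dtype=np.float32)
    height, width = rgba.shape[:2]
    if height % size or width % size:
        raise ValueError("canvas height and width must be multiples of `size`")
    # Average the color channels into grayscale and scale to [0, 1].
    gray = rgba[..., :3].mean(axis=-1) / 255.0
    # Split the image into size x size blocks and average each block.
    blocks = gray.reshape(size, height // size, size, width // size)
    small = blocks.mean(axis=(1, 3))
    # Add batch and channel axes for a Keras-style model.
    return small.reshape(1, size, size, 1)


# Stand-in drawing: a white vertical bar on a black 280x280 canvas.
canvas = np.zeros((280, 280, 4), dtype=np.uint8)
canvas[100:180, 130:150, :3] = 255
model_input = canvas_to_model_input(canvas)
```

The resulting array can be passed to a model's `predict` method, and the index of the highest probability mapped back to a letter.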
To make our letter-drawing prediction application as accessible as possible, we created a self-contained web app and deployed it using Heroku; it can be found in the heroku_app folder.
coryroyce akanksha0911 AbrahamKong Karishma-Kuria
Live Demo Video: https://youtu.be/wI2hDtuRbUw
Project PPT Link: https://docs.google.com/presentation/d/1svfLKvYqRFLc2mmICwIRXpxxdTbY63B64NXi0cnEuyM/edit#slide=id.g106ebbdb28d_0_51
Project Report Link: https://docs.google.com/document/d/1S3f0IWZUfYTdzhxm7Hy8ESF2y3pgSkd_AvRUsdQ6OI0
EMNIST dataset: https://www.nist.gov/itl/iad/image-group/emnist-dataset
Direct download: http://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/gzip.zip
Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373
Importing and formatting Image data inspired by ArangurenAndres/EMNSIT-Image-classification
Mapping and original file reference Website
Parts of the visualization and PCA were inspired by Rahul228646's Kaggle notebook
Example Data Visualizations for Images from Kaggle Notebook
Image Classification in 10 Minutes with MNIST Dataset Article
How to Develop a CNN for MNIST Article
Kaggle Competition with MNIST 0-9 digits Article