emnist_letter_exploration_and_prediction's Introduction

Explore EMNIST Letter Dataset with Prediction

Explore the EMNIST letter dataset with various dimensionality reduction techniques and visualizations, and then create an ML model to predict handwritten letters.

Project Writeup

Project Presentation Slides

Project Presentation Video

Heroku Demo Link - may be inactive in the future

Colab Demo - same as Heroku App but can be run straight from Colab

Project Overview

  1. EMNIST Letter Dataset
  2. Dimensionality Reduction Techniques
  3. ML Model
  4. User Application
  5. Web Deployment

Loading EMNIST Letter Dataset

When exploring various ways to import the EMNIST dataset, we came across a pre-made Python library called emnist, which made loading the various sets of data into our project straightforward. After writing a few custom functions, we were able to generate quick visualizations of the letters along with their labels.

Letter Data Visualization with Label
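
The loading and plotting step described above could look roughly like the following. This is a minimal sketch, not the project's actual helper code: the function names (label_to_letter, load_letters, show_letters) are hypothetical, and it assumes the "letters" split uses class labels 1 to 26 for A to Z. The emnist and matplotlib imports are kept inside the functions so the label helper can be used without them installed.

```python
import string


def label_to_letter(label: int) -> str:
    """Map an EMNIST Letters class label (assumed to be 1..26) to 'A'..'Z'."""
    if not 1 <= label <= 26:
        raise ValueError(f"expected a label in 1..26, got {label}")
    return string.ascii_uppercase[label - 1]


def load_letters():
    """Load the training split of the 'letters' dataset via the emnist package."""
    # Imported lazily: the first call downloads and caches the dataset.
    from emnist import extract_training_samples

    images, labels = extract_training_samples("letters")
    return images, labels


def show_letters(images, labels, count: int = 8) -> None:
    """Plot the first `count` images in a row, titled with their letters."""
    import matplotlib.pyplot as plt

    # squeeze=False keeps `axes` two-dimensional even when count == 1.
    _, axes = plt.subplots(1, count, figsize=(count * 1.5, 2), squeeze=False)
    for ax, image, label in zip(axes[0], images, labels):
        ax.imshow(image, cmap="gray")
        ax.set_title(label_to_letter(int(label)))
        ax.axis("off")
    plt.show()
```

Typical use would be `images, labels = load_letters()` followed by `show_letters(images, labels)`.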

Dimensionality Reduction Techniques

During the data exploration stage, we applied several dimensionality reduction techniques to better visualize how the letter images cluster. The highlights are PCA (Principal Component Analysis), which is a good baseline technique, and UMAP (Uniform Manifold Approximation and Projection), which provides clearer clusters and better separation. Images are shown below, and the full code can be found in the dev_notebooks folder.

PCA of Letters

UMAP of Letters
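
As a rough illustration of the two projections, the sketch below flattens each image into a vector and reduces it to two dimensions. It is not the notebook code: pca_2d is a small NumPy-only PCA written here for illustration (the notebooks may well use a library implementation instead), and umap_2d assumes the umap-learn package is installed. Both function names are hypothetical.

```python
import numpy as np


def pca_2d(images, n_components: int = 2) -> np.ndarray:
    """Project flattened images onto their top principal components."""
    x = np.asarray(images, dtype="float64").reshape(len(images), -1)
    x = x - x.mean(axis=0)  # PCA requires mean-centred features
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T


def umap_2d(images, random_state: int = 42) -> np.ndarray:
    """Embed flattened images in 2-D with UMAP (assumes umap-learn is installed)."""
    import umap  # lazy import so pca_2d works without umap-learn

    x = np.asarray(images, dtype="float32").reshape(len(images), -1)
    reducer = umap.UMAP(n_components=2, random_state=random_state)
    return reducer.fit_transform(x)
```

Either result is an array of shape (n_samples, 2) that can be passed to a scatter plot colored by letter label.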

ML Model

After trying a few different versions of a TensorFlow deep neural network, we ultimately used an architecture that performed well in a Kaggle competition on the 0-9 MNIST digit dataset.

Model Architecture - Image from cdeotte/25-million-images-0-99757-mnist
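
For orientation, a convolutional network of this general style can be sketched in Keras as follows. This is an assumption-laden illustration rather than the project's exact model: the layer sizes, dropout rates, and the names prepare_batch and build_model are hypothetical, and the labels are assumed to be 1 to 26 and shifted to 0 to 25 for training.

```python
import numpy as np

NUM_CLASSES = 26  # one class per letter


def prepare_batch(images, labels):
    """Scale pixels to [0, 1], add a channel axis, and shift labels to 0..25."""
    x = np.asarray(images, dtype="float32") / 255.0
    x = x.reshape(-1, 28, 28, 1)
    y = np.asarray(labels, dtype="int64") - 1  # assumes labels are 1..26
    return x, y


def build_model():
    """Build a small CNN; layer choices here are illustrative only."""
    import tensorflow as tf  # lazy import so prepare_batch works without TensorFlow

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.BatchNormalization(),
        # A strided convolution downsamples in place of a pooling layer.
        layers.Conv2D(32, 5, strides=2, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Conv2D(64, 3, activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(64, 5, strides=2, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.4),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training would then be along the lines of `x, y = prepare_batch(images, labels)` followed by `build_model().fit(x, y, epochs=..., validation_split=...)`, with the hyperparameters chosen by experiment.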

User Application

To create an interactive experience in which users draw handwritten letters and receive model predictions in real time, we used Streamlit to build our front-end application. This demo can easily be run in our Streamlit_App.ipynb Colab notebook.

App Example
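
A drawing-and-prediction app of this kind could be wired together roughly as below. This is a sketch under several assumptions, not the project's app: it assumes the streamlit-drawable-canvas component is used for drawing, that the canvas returns an RGBA array that includes the background color, that the canvas is at least 28 pixels on each side, and that a trained Keras model is saved at the hypothetical path "model.h5". The helper names are also hypothetical.

```python
import string

import numpy as np


def canvas_to_model_input(rgba: np.ndarray, size: int = 28) -> np.ndarray:
    """Turn an (H, W, 4) RGBA canvas into a (1, size, size, 1) array in [0, 1]."""
    gray = rgba[..., :3].astype("float32").mean(axis=-1) / 255.0
    block_h, block_w = gray.shape[0] // size, gray.shape[1] // size
    # Crop so both sides divide evenly, then average each block of pixels.
    gray = gray[: block_h * size, : block_w * size]
    small = gray.reshape(size, block_h, size, block_w).mean(axis=(1, 3))
    return small.reshape(1, size, size, 1)


def index_to_letter(index: int) -> str:
    """Map a model output index (assumed 0..25) to 'A'..'Z'."""
    return string.ascii_uppercase[index]


def main(model_path: str = "model.h5") -> None:
    """Streamlit entry point; call main() at the bottom of the app script."""
    import streamlit as st
    import tensorflow as tf
    from streamlit_drawable_canvas import st_canvas

    st.title("Draw a letter")
    model = tf.keras.models.load_model(model_path)
    # White strokes on black roughly match the EMNIST image style.
    canvas = st_canvas(
        stroke_width=20,
        stroke_color="#FFFFFF",
        background_color="#000000",
        height=280,
        width=280,
        drawing_mode="freedraw",
        key="canvas",
    )
    if canvas.image_data is not None:
        probabilities = model.predict(canvas_to_model_input(canvas.image_data))[0]
        st.write(f"Prediction: {index_to_letter(int(np.argmax(probabilities)))}")
```

With `main()` called at the end of a file such as app.py (a hypothetical name), the script would be started with `streamlit run app.py`.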

Web Deployment

To make our letter-drawing prediction application as accessible as possible, we created a self-contained web app and deployed it using Heroku. The app can be found in the heroku_app folder.

Initial Demo Link

Contributors

coryroyce akanksha0911 AbrahamKong Karishma-Kuria

Reference

Live Demo Video: https://youtu.be/wI2hDtuRbUw

Project PPT Link: https://docs.google.com/presentation/d/1svfLKvYqRFLc2mmICwIRXpxxdTbY63B64NXi0cnEuyM/edit#slide=id.g106ebbdb28d_0_51

Project Report Link: https://docs.google.com/document/d/1S3f0IWZUfYTdzhxm7Hy8ESF2y3pgSkd_AvRUsdQ6OI0

Data Set References

Original EMNIST Paper

EMNIST dataset: https://www.nist.gov/itl/iad/image-group/emnist-dataset

Direct download: http://www.itl.nist.gov/iaui/vip/cs_links/EMNIST/gzip.zip

Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373

Importing and formatting Image data inspired by ArangurenAndres/EMNSIT-Image-classification

Mapping and original file reference Website

Data Reduction References

Parts of the visualization and PCA were inspired by Rahul228646's Kaggle notebook

Example Data Visualizations for Images from Kaggle Notebook

Classification Reference

Image Classification in 10 Minutes with MNIST Dataset Article

How to Develop a CNN for MNIST Article

Kaggle Competition with MNIST 0-9 digits Article
