
iASL-Backend


This repository contains the back-end material for the iASL application: specifically, the machine learning model for the neural network used by the app.

The main code for the training portion of the model can be found in the scripts folder. The main script, train.sh, runs the training network and saves the model to the output/p1_train folder. The training process saves three files, plus one additional weights file for every epoch.

The ASL finger dataset used is located here:
https://www.kaggle.com/grassknoted/asl-alphabet

Training

The data_processing directory includes a collect_data.py file, which contains the class responsible for batch data generation. This class takes a class mapping: a dictionary whose keys are the labels and whose values are the corresponding integer indices. It builds a master mapping from each image file to its label and allows for parallel batch processing during training. The training script depends on this data generator.

To train a model on a dataset, provide a file list containing one file name per line, along with a corresponding ref file that lists the labels in the same order as the images. Use the gen_labels.py script to generate the ref file once the file list has been created. Hyperparameter values can be found in the parameter file params/params_train.txt. After each epoch, the model writes a weights file to output/p1_train.

To run the training script, edit the parameter file as necessary and run train.sh. This script sources _runtime_env.sh, which contains the environment variables the training code needs. Note that train.py will fail to run if the runtime file has not been sourced.
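To illustrate how a batch data generator of this kind fits together, here is a minimal sketch. The class name, constructor signature, and stubbed-out image loading are illustrative assumptions, not the repository's actual collect_data.py implementation; it only shows the file-list/ref-file/class-mapping relationship described above.

```python
import math
import numpy as np

class BatchGenerator:
    """Illustrative batch generator: pairs a file list with its ref labels
    and yields (images, labels) batches of a fixed size."""

    def __init__(self, file_list, ref_labels, class_mapping, batch_size=32):
        # class_mapping: {label: integer index}, as described above.
        # ref_labels must be in the same order as file_list.
        self.files = file_list
        self.labels = [class_mapping[label] for label in ref_labels]
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch (last batch may be smaller).
        return math.ceil(len(self.files) / self.batch_size)

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        batch_files = self.files[lo:hi]
        # The real generator would read and decode each image file here;
        # this stub returns zero arrays of the expected 50x50x3 image shape.
        images = np.zeros((len(batch_files), 50, 50, 3), dtype=np.float32)
        labels = np.array(self.labels[lo:hi], dtype=np.int64)
        return images, labels
```

A generator like this can be handed to a training loop that requests batch `i` for each step, which is what makes parallel batch processing possible.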

Realtime Detection

The realtime-detection.py script takes only a parameter file. This file should specify a model directory (relative to the iASL_OUT directory) where the model and weights are stored; the script loads these files to build the model. On start-up, a green box appears in the top left of the screen. Place your hand inside it within the first three seconds; this activates object tracking so that the region of interest follows your hand. Each displayed frame is sent to the model for classification, and on detection the confidence and label are shown in a black box. Running detect.sh supplies the parameter file and sources the runtime file, so it is recommended to run the bash script rather than the Python script directly.
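The per-frame classification step that produces the displayed confidence and label can be sketched as follows. This is an illustrative helper, not code from realtime-detection.py; the `index_to_label` dictionary is assumed to invert the class mapping used during training.

```python
import numpy as np

def decode_prediction(probs, index_to_label):
    """Map a model's softmax output vector to (label, confidence) for display.

    probs: 1-D array of class probabilities for one frame.
    index_to_label: {integer index: label}, the inverse of the training
    class mapping (an assumption for this sketch).
    """
    idx = int(np.argmax(probs))          # most probable class
    return index_to_label[idx], float(probs[idx])
```

The returned pair is exactly what the script overlays in the black box: the predicted label and the model's confidence in it.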

Video Classification Model

In the containerized directory, there is an environment for deploying a video classifier. This directory includes a Dockerfile that allows for building a Docker image that can be deployed.

Example:

    docker build -t tf-server .
    docker run -d -p 8080:8080 tf-server

This is made possible by Flask in the app.py script, which exposes port 8080 and accepts requests on the /predict route. This POST method takes an argument, vid_stuff, containing a Base64-encoded string of 40 images of size 50x50x3; the decoded data must exactly match a flattened 40x50x50x3 array. These inputs are resized back up to 40x150x150x3 before inference. The mdl_dir directory contains a JSON file with the model architecture and an hdf5 file with the latest weights; this model can be loaded and trained further on your own data. The input to the model is a sequence of shape 40x150x150x3. It uses the Inception3D architecture, which is built on 3D convolutional neural networks. The top layer of this model currently has 55 output nodes, since the model was trained on only 55 words. To increase the number of output nodes, a new Inception3D model must be instantiated with a new top layer.
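A client talking to this endpoint must flatten the 40x50x50x3 clip and Base64-encode it before sending. The sketch below shows one way to do that; the use of a JSON body for the vid_stuff argument and the localhost URL are assumptions for illustration, since the README does not pin down the exact request format.

```python
import base64
import json
import urllib.request

# Shape the /predict endpoint expects: 40 frames of 50x50 RGB pixels.
FRAMES, H, W, C = 40, 50, 50, 3

def encode_clip(raw_bytes):
    """Base64-encode a flattened 40x50x50x3 byte buffer for the vid_stuff field."""
    assert len(raw_bytes) == FRAMES * H * W * C, "clip has the wrong shape"
    return base64.b64encode(raw_bytes).decode("ascii")

def send_clip(raw_bytes, url="http://localhost:8080/predict"):
    """POST an encoded clip to a locally running tf-server container
    (hypothetical request format: a JSON body with a vid_stuff key)."""
    payload = json.dumps({"vid_stuff": encode_clip(raw_bytes)}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The size assertion in encode_clip mirrors the server-side requirement that the input exactly match a flattened 40x50x50x3 array.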

Contributors

applebaumian, lozaa, shakeelalibhai, tareke-dev

