Lipify - A Lip Reading Application


Project Dependencies:
  • Python>=3.7.1
  • tensorflow>=2.1.0
  • opencv-python>=4.2.0
  • dlib
  • moviepy>=1.0.1
  • numpy>=1.18.1
  • Pillow
  • matplotlib
  • tqdm
  • pyDot
  • seaborn
  • scikit-learn
  • imutils>=0.5.3

Note: All dependencies are listed in 'setup.py'.


Project's Dataset Structure:
  • GP DataSet/
    | --> align/
    | --> video/
  • Videos-After-Extraction/
    | --> S1/
    | --> ....
    | --> S20/
  • New-DataSet-Videos/
    | --> S1/
    | --> ....
    | --> S20/
  • S1/
    | --> Adverb/
    | --> Alphabet/
    | --> Colors/
    | --> Commands/
    | --> Numbers/
    | --> Prepositions/
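As a quick sanity check before running any of the scripts, the layout above can be verified with a short helper. This is only a sketch: the function names and the `root` argument are hypothetical, the "...." entries are expanded to speakers S1 through S20, and the category folders are checked under `S1/` exactly as listed.

```python
from pathlib import Path

# Category folders listed under S1/ in the structure above.
CATEGORIES = ["Adverb", "Alphabet", "Colors", "Commands", "Numbers", "Prepositions"]


def expected_dirs(root, speakers=range(1, 21)):
    """Build the list of directories the layout above describes.

    `root` is a hypothetical base folder that holds the dataset.
    """
    root = Path(root)
    dirs = [root / "GP DataSet" / "align", root / "GP DataSet" / "video"]
    for s in speakers:
        dirs.append(root / "Videos-After-Extraction" / f"S{s}")
        dirs.append(root / "New-DataSet-Videos" / f"S{s}")
    for category in CATEGORIES:
        dirs.append(root / "S1" / category)
    return dirs


def missing_dirs(root):
    """Return every expected directory that does not exist yet."""
    return [d for d in expected_dirs(root) if not d.is_dir()]
```

Calling `missing_dirs(".")` from the project folder returns an empty list when the whole structure is in place.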


Dataset Info:

We use the GRID Corpus dataset, which is publicly available at this link.
You can download the dataset using our script, GridCorpus-Downloader.sh,
which was adapted from the code provided here.

To download, run the following command in your terminal:
bash GridCorpus-Downloader.sh FirstSpeaker SecondSpeaker
where FirstSpeaker and SecondSpeaker are integers giving the first and last speaker numbers to download.

  • NOTE: Speaker 21 is missing from the GRID Corpus dataset due to technical issues.
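To see which speakers a given pair of arguments would cover, the selection can be sketched in Python. This is an illustration, not the downloader itself: it assumes the two arguments form an inclusive range, and the function name is hypothetical.

```python
# Speaker 21 is not part of the GRID Corpus (see the note above).
MISSING_SPEAKERS = {21}


def speakers_to_download(first_speaker, second_speaker):
    """Return the speaker numbers in the inclusive range, skipping missing ones.

    Assumption: FirstSpeaker and SecondSpeaker bound an inclusive range.
    """
    if first_speaker > second_speaker:
        raise ValueError("first_speaker must not exceed second_speaker")
    return [
        s
        for s in range(first_speaker, second_speaker + 1)
        if s not in MISSING_SPEAKERS
    ]
```

For example, a range that spans speaker 21 simply skips it, so no download is attempted for data that does not exist.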

Dataset Segmentation Steps:
  1. Run DatasetSegmentation.py
  2. Run Pre-Processing/frameManipulator.py

* After running the above scripts, all resulting videos will be 1 second long at 30 FPS.
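One common way to bring clips of different lengths to a fixed 30 frames (1 second at 30 FPS) is to resample frame indices evenly, repeating frames for short clips and dropping frames for long ones. The sketch below illustrates that idea only; it is not necessarily what frameManipulator.py does, and both function names are hypothetical.

```python
def resample_indices(n_frames, target=30):
    """Pick `target` evenly spaced frame indices from a clip of `n_frames` frames."""
    if n_frames < 1 or target < 2:
        raise ValueError("need at least 1 source frame and a target of 2 or more")
    step = (n_frames - 1) / (target - 1)
    return [round(i * step) for i in range(target)]


def to_fixed_length(frames, target=30):
    """Return a new list of exactly `target` frames taken from `frames`."""
    return [frames[i] for i in resample_indices(len(frames), target)]
```

The first and last source frames are always kept, so the start and end of each word segment are preserved.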

CNN Models Training Steps:
  • Model code can be found in the "NN-Models" directory.

  • First, change the common path value to the directory containing your training and test data.

  • Run each network to start training.

  • Early stopping was used to stop training once the model reached its best validation accuracy.

  • The resulting accuracies after training can be found in: Project Accuracies

or in the following illustration: General CNN Architecture
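The early stopping rule mentioned in the steps above can be illustrated without any framework. The sketch below assumes a patience-based rule on validation accuracy; the function name and the patience value are hypothetical, and the project's actual settings live in the model scripts. In tf.keras the same behaviour is provided by the `tf.keras.callbacks.EarlyStopping` callback.

```python
def early_stopping_epoch(val_accuracies, patience=3):
    """Return (stop_epoch, best_epoch) as 0-based epoch indices.

    Training stops once `patience` epochs have passed without the
    validation accuracy exceeding its best value so far.
    """
    best_acc = float("-inf")
    best_epoch = -1
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran through every epoch without triggering the stop condition.
    return len(val_accuracies) - 1, best_epoch
```

Keeping the weights from `best_epoch` rather than from `stop_epoch` is what lets the model end up at its best validation accuracy.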


CNN Architecture:

All of our networks share the same architecture; the only
difference is the output layer, as shown in:

Train & Test Accuracies of each category
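Since only the output layer differs, its size is presumably the number of words in each category. The sketch below lists the GRID Corpus vocabulary per category and derives that size. It assumes each network classifies the full vocabulary of its category, which may not match the project's actual label sets; the names `GRID_VOCABULARY` and `output_units` are hypothetical.

```python
import string

# GRID Corpus sentence vocabulary, grouped by the category folders used above.
GRID_VOCABULARY = {
    "Adverb": ["again", "now", "please", "soon"],
    # GRID uses every letter except "w".
    "Alphabet": [c for c in string.ascii_lowercase if c != "w"],
    "Colors": ["blue", "green", "red", "white"],
    "Commands": ["bin", "lay", "place", "set"],
    "Numbers": ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"],
    "Prepositions": ["at", "by", "in", "with"],
}


def output_units(category):
    """Number of output classes a network for `category` would need."""
    return len(GRID_VOCABULARY[category])
```

Under that assumption, a shared model builder would take `output_units(category)` as its only category-specific parameter.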


TODOs:
  • Dataset preprocessing module
  • Initial Convolutional Neural networks' architecture
  • Facial detection algorithm
  • Optimization of the networks' architectures
  • Unit testing of project files
  • Proper documentation for the whole project

License:

MIT License

Contributors:
  • amrkh97
  • kandayozo
