Git Product home page Git Product logo

lipnet-pytorch's Introduction

LipNet: End-to-End Sentence-level Lipreading

The state-of-art PyTorch implementation of 'LipNet: End-to-End Sentence-level Lipreading' by Yannis M. Assael, Brendan Shillingford, Shimon Whiteson, and Nando de Freitas (https://arxiv.org/abs/1611.01599). This version achieves the best performance in all evaluation metrics.

LipNet Demo

Results

Scenario Image Size (W x H) CER WER
Unseen speakers (Origin) 100 x 50 6.7% 13.6%
Overlapped speakers (Origin) 100 x 50 2.0% 5.6%
Unseen speakers (Ours) 128 x 64 6.7% 13.3%
Overlapped speakers (Ours) 128 x 64 1.9% 4.6%

Notes:

  • Contribution in sharing the results of this model is highly appreciated

Data Statistics

Scenario Train Validation
Unseen speakers (Origin) 28775 3971
Overlapped speakers (Origin) 24331 8415
Unseen speakers (Ours) 28837 3986
Overlapped speakers (Ours) 24408 8415

Preprocessing

Link of processed lip images and text: https://pan.baidu.com/s/165swxO8A-GUZ03-InTL6VA

Google Drive: https://drive.google.com/drive/folders/1Wn2EJw2101nF59eNDXEto6qXqfgDDucL

Download all parts and concatenate the files using the command

cat GRID_LIP_160x80_TXT.zip.* > GRID_LIP_160x80_TXT.zip
unzip GRID_LIP_160x80_TXT.zip
rm GRID_LIP_160x80_TXT.zip

We provide examples of face detection and alignment in scripts/ folder for your own dataset.

Training And Testing

python main.py

Data path and hyperparameters are configured in options.py. Please pay attention that you may need to modify options.py to make the program work as expected.

Dependencies

  • PyTorch 1.0+
  • opencv-python

Bibtex

@article{assael2016lipnet,
  title={LipNet: End-to-End Sentence-level Lipreading},
  author={Assael, Yannis M and Shillingford, Brendan and Whiteson, Shimon and de Freitas, Nando},
  journal={GPU Technology Conference},
  year={2017},
  url={https://github.com/Fengdalu/LipNet-PyTorch}
}

License

The MIT License

lipnet-pytorch's People

Contributors

fengdalu avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.