Sign Language Translator (SiLaTra) Server-Side

This is the server side of the SiLaTra system. The system is targeted at the hearing- and speech-impaired community, whose members use sign language to communicate with each other. When they communicate with people outside this community, there is a communication gap. This system is an attempt to bridge that gap.

Currently, the system supports recognition of:

  • 33 hand poses (whose recognition needs only 1 frame):
    • 23 letters (A-Z except H and J, which are conveyed through gestures and hence are not covered. The V hand pose is equivalent to 2, so it is not counted among the letters)
    • 10 digits (0-9)
  • 12 Gestures (whose recognition needs a sequence of at least 5 frames):
    1. After
    2. All The Best
    3. Apple
    4. Good Afternoon
    5. Good Morning
    6. Good Night
    7. I Am Sorry
    8. Leader
    9. Please Give Me Your Pen
    10. Strike
    11. That is Good
    12. Towards

Sign Language Translator (SiLaTra) Client-Side

The system can be experienced using the latest version of the Sign Language Translator (SiLaTra) Client-Side Android Application.

Download the Android app by clicking here. The application was working as of 28 April 2020. Let us know if you run into any issues.

IEEE Conference Paper

The research done for this project and all the findings have been compiled in the paper: Kartik Shenoy, Tejas Dastane, Varun Rao, Devendra Vyavaharkar, "Real-time Indian Sign Language (ISL) Recognition", 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), July 2018

Demo Videos

The Demo Videos of this application can be found here:

Installation and Usage

Quick Installation

You can quickly see the system working by unzipping Receiver.zip. You can either execute the bundled executable directly or have server.py invoke it (server.py randomly allocates the port number for the socket connection; a few changes in server.py will be needed for this). The main advantage of this zip is that it saves hours of configuration: you run the executable ./Receiver directly on Linux. It has been compiled using PyInstaller. You can execute it as follows:

Receiver [-h] [--portNo PORTNO] [--displayWindows DISPLAYWINDOWS]
                [--recognitionMode RECOGNITIONMODE]
                [--socketTimeOutEnable SOCKETTIMEOUTENABLE]
                [--stabilize STABILIZE] [--recordVideos RECORDVIDEOS]
                [--subDir SUBDIR]

Main Entry Point

optional arguments:
-h, --help            show this help message and exit
--portNo PORTNO       
                        Usage: python3 Receiver.py --portNo 12345
--displayWindows DISPLAYWINDOWS
                        Usage: python3 Receiver.py --displayWindows True | False
--recognitionMode RECOGNITIONMODE
                        Usage: python3 Receiver.py --recognitionMode SIGN | GESTURE
--socketTimeOutEnable SOCKETTIMEOUTENABLE
                        Usage: python3 Receiver.py --socketTimeOutEnable True | False
--stabilize STABILIZE
                        Usage: python3 Receiver.py --stabilize True | False
--recordVideos RECORDVIDEOS
                        Usage: python3 Receiver.py --recordVideos True | False --subDir GN
--subDir SUBDIR       
                        Usage: python3 Receiver.py --recordVideos True | False --subDir GN

Example usage: ./Receiver --portNo 49164 --displayWindows False --recognitionMode SIGN --socketTimeOutEnable True

Detailed Installation

If you want to install all the dependencies so that you can tweak the model and modify the code, follow these steps:

Dependency Details

The python3 dependencies that can be installed directly with pip3 install libName are hmmlearn, scikit-learn (imported as sklearn, used for kNN), pandas, netifaces, numpy, imutils, dlib (for face detection), Flask and nose. The code also uses argparse, atexit and pickle, but these ship with the Python 3 standard library and need no installation. The OpenCV library cannot be installed on Ubuntu directly using the pip3 install command.

pip3 install hmmlearn scikit-learn pandas netifaces numpy imutils dlib Flask nose

Building the latest OpenCV involves many steps, which are described well by PyImageSearch here: Ubuntu 16.04: How to install OpenCV - PyImageSearch

Local Usage

You can start the Flask server on the local machine with the command: python3 server.py. You can then use these services from the latest Silatra Android App. Specify the IP address of the local machine in Settings and keep Direct Connection unchecked. Now click the Message icon and then the Capture button to start the transmission feed. Before clicking Capture, you can select the mode: SIGN or GESTURE. When you are done, click the Stop button. You can use the flash button to switch on the flash.

Note: Direct connection should be used only when you know what you are doing. Its usage is as follows:

  • Instead of starting server.py, start Receiver.py directly, specifying the desired port number. Example: python3 Receiver.py --portNo 49164 --displayWindows False --recognitionMode SIGN --socketTimeOutEnable True
  • After this, in Settings, specify the machine's IP address and this port number (49164). The port can be any open port on which a TCP server socket can be established.

With server.py, you only need to run python3 server.py and specify the IP address in Settings; server.py handles the socket invocation internally.
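The port selection that server.py performs can be sketched roughly as follows. This is a minimal illustration, not the code from server.py: the function names, the retry count and the port range (the IANA dynamic range, which happens to include the 49164 used in the examples) are all assumptions.

```python
import random
import socket

# Assumed range (IANA dynamic/private ports); the range used by server.py is not documented here.
PORT_RANGE = (49152, 65535)


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP server socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def pick_free_port(max_tries=50):
    """Try random ports until one can be bound, or give up after max_tries."""
    for _ in range(max_tries):
        port = random.randint(*PORT_RANGE)
        if port_is_free(port):
            return port
    raise RuntimeError("no free port found")


if __name__ == "__main__":
    port = pick_free_port()
    # A server could now launch the receiver on this port, for example:
    # subprocess.Popen(["python3", "Receiver.py", "--portNo", str(port)])
    print(port)
```

Note that the probe socket is closed before the receiver binds the port, so another process could claim the port in between; a real server would need to handle a failed bind in the receiver.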

Receiver.py arguments Usage

Receiver.py [-h] [--portNo PORTNO] [--displayWindows DISPLAYWINDOWS]
                [--recognitionMode RECOGNITIONMODE]
                [--socketTimeOutEnable SOCKETTIMEOUTENABLE]
                [--stabilize STABILIZE] [--recordVideos RECORDVIDEOS]
                [--subDir SUBDIR]

Main Entry Point

optional arguments:
-h, --help            show this help message and exit
--portNo PORTNO       
                        Usage: python3 Receiver.py --portNo 12345
--displayWindows DISPLAYWINDOWS
                        Usage: python3 Receiver.py --displayWindows True | False
--recognitionMode RECOGNITIONMODE
                        Usage: python3 Receiver.py --recognitionMode SIGN | GESTURE
--socketTimeOutEnable SOCKETTIMEOUTENABLE
                        Usage: python3 Receiver.py --socketTimeOutEnable True | False
--stabilize STABILIZE
                        Usage: python3 Receiver.py --stabilize True | False
--recordVideos RECORDVIDEOS
                        Usage: python3 Receiver.py --recordVideos True | False --subDir GN
--subDir SUBDIR       
                        Usage: python3 Receiver.py --recordVideos True | False --subDir GN

Example usage: python3 Receiver.py --portNo 49164 --displayWindows False --recognitionMode SIGN --socketTimeOutEnable True
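For readers who want to reproduce this interface, the flags above could be declared with argparse along these lines. This is a sketch, not the parser from Receiver.py: the defaults, the help strings and the str_to_bool helper are assumptions. The helper matters because argparse's type=bool would treat the string "False" as True.

```python
import argparse


def str_to_bool(value):
    """Parse the literal strings 'True' / 'False' shown in the usage text."""
    if value == "True":
        return True
    if value == "False":
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    # Flag names come from the usage text; defaults below are assumed.
    parser = argparse.ArgumentParser(prog="Receiver.py", description="Main Entry Point")
    parser.add_argument("--portNo", type=int, help="TCP port for the server socket")
    parser.add_argument("--displayWindows", type=str_to_bool, default=False)
    parser.add_argument("--recognitionMode", choices=["SIGN", "GESTURE"], default="SIGN")
    parser.add_argument("--socketTimeOutEnable", type=str_to_bool, default=False)
    parser.add_argument("--stabilize", type=str_to_bool, default=False)
    parser.add_argument("--recordVideos", type=str_to_bool, default=False)
    parser.add_argument("--subDir", help="subdirectory for recorded training videos")
    return parser


# Parse the example command line from this section.
args = build_parser().parse_args(
    ["--portNo", "49164", "--displayWindows", "False",
     "--recognitionMode", "SIGN", "--socketTimeOutEnable", "True"]
)
print(args.portNo, args.recognitionMode, args.socketTimeOutEnable)
```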

Installation on AWS Ubuntu flavour

An AWS Ubuntu 16.04 server free-tier image was chosen with a 30 GB SSD. It has 1 GB of RAM, so we extended the swap by 3 GB, with reference to: Linux Add a Swap File – HowTo - nixCraft. All the dependencies were installed exactly as specified above, along with any intermediate dependencies missing from that list. dlib could not be installed directly on this instance (either from source or using pip); this was the main reason the swap file was added.

Next, we needed to run server.py as a Flask server that assigns the port number and invokes the server socket for recognition. For remote execution, server.py needs to run in the background. Using & did not help: once we quit the remote shell, the server was terminated because it was a child of the remote shell. We therefore used Supervisor (supervisorctl). For installing it and configuring the invocation of this server, we referred to: How To Install and Manage Supervisor on Ubuntu and Debian VPS - Digital Ocean. The conf file can be found here (silatra_server.conf). It is stored at /etc/supervisor/conf.d/silatra_server.conf on the AWS Linux instance. After this, run the commands specified in How To Install and Manage Supervisor on Ubuntu and Debian VPS - Digital Ocean. Running sudo supervisorctl shows that the server is running.

Dataset

The training images, videos can be found here: Hand Poses, Gestures Dataset - SiLaTra

This zip file contains:

  • Dataset/Hand_Poses_Dataset, which consists of all static signs (Letters, Digits and Gesture Signs, which are the intermediate signs of gestures)
  • Dataset/Gesture_Videos_Dataset, which holds the videos of gestures used for training the HMM
  • Dataset/TalkingHands_Original_Videos_Not Used For Training_Just For Reference, which consists of videos downloaded from Talking Hands. These were used as a reference for generating the training videos stored in Dataset/Gesture_Videos_Dataset.

Constraints

  • The sign demonstrator must wear a full-sleeved shirt.
  • The lighting conditions must be good for segmentation.
  • The hand pose should not be an outlier, i.e. the angle of the hand should not be far from that of the ideal pose.
  • The background may contain at most small skin-coloured objects. For gesture recognition to work correctly, it must not contain any skin-coloured objects.
  • The gestures must not be too fast or too slow.
  • The head must be completely present in gestures.
  • While demonstrating gestures or hand poses, the face should not be occluded (covered) by hand.

Repo Structure Description

Archived Codes

Contains the older versions of the code in the outer directory. These have been deprecated. They hold remnants of older versions such as "Feature Extraction Using Fourier Descriptors - C++ implementation" and "Gesture Recognition using Automata", along with other temporary code used for experimentation. They have been kept so that earlier efforts can be revisited, in the hope that they may someday help someone.

Gesture Videos By TalkingHands

This folder contains gesture videos downloaded from the Talking Hands website. They were used as references for creating the training videos. Some of them were finally selected as supported gestures because they were easy: the gestures supported in the final version are one-handed and never involve the hand overlapping the face.

SiLaTra_Server

The folder SiLaTra_Server is the entry point into the server module. The server is highly modular. All the models, modules and dependency files are stored in this folder, so when you download it, all these code dependencies are already present. The entry point into the server module is server.py. It picks a port number at random, checks whether it is open, and invokes Receiver.py to start a TCP socket on this port. Receiver.py manages recognition. The resources used by Receiver.py in the SiLaTra_Server folder are:

  • silatra_utils.py: Contains functions for feature extraction from segmented hand, managing sign prediction by finding modal values in a stream of predictions and functions dealing with displaying signs on screen through windows.

  • Modules/FaceEliminator.py: Contains functions for blackening face and neck area when given the detected face coordinates and segmented mask.

  • Modules/PersonStabilizer.py: Used for stabilizing person as object using face as reference.

  • Modules/TimingMod.py: Used for keeping track of time required for individual process activities (for performance measurement).

  • Gesture_Modules/hmmGestureClassify.py: Contains everything required for handling gesture recognition: accessing the models, predicting probabilities for each one of them, and comparing these probabilities to find the recognized gesture.

  • Gesture_Modules/directionTracker.py: Used for tracking hand centroid for determining motion for gesture recognition.

  • Models: Contains k-NN, HMM Models for recognition
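Two of the ideas in the list above, taking the modal value over a stream of predictions and picking the gesture whose model gives the highest probability, can be sketched as below. This is an illustrative sketch only, not the code in silatra_utils.py or hmmGestureClassify.py: the class and function names and the window size are assumptions, and the models are assumed to expose a score method returning a log-likelihood (as hmmlearn models do).

```python
from collections import Counter, deque


class ModalSmoother:
    """Report the most frequent label among the last `window` predictions."""

    def __init__(self, window=5):  # window size is an assumed value
        self.recent = deque(maxlen=window)

    def update(self, label):
        self.recent.append(label)
        # most_common(1) returns [(label, count)] for the modal label
        return Counter(self.recent).most_common(1)[0][0]


def classify_gesture(models, observations):
    """Return the gesture name whose model scores the sequence highest.

    `models` maps a gesture name to any object with a score(observations)
    method that returns a log-likelihood.
    """
    return max(models, key=lambda name: models[name].score(observations))


# A single misclassified frame ("B") does not change the reported sign.
smoother = ModalSmoother(window=5)
for raw in ["A", "A", "B", "A", "A"]:
    stable = smoother.update(raw)
print(stable)  # prints A
```

Smoothing over a short window trades a few frames of latency for stability against single-frame misclassifications.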

The server socket is created by Receiver.py. There are 2 ways of defining the port:

  • Port No specified using terminal interaction: python3 Receiver.py

  • Port No specified in commandline itself: python3 Receiver.py --portNo 12345

Once this is running, the IP address and the port can be specified in the Android application, and the user can see the real-time translation on screen. The program also provides an argument for recording gesture videos for training the HMM, used as: python3 Receiver.py --recordVideos True --subDir GN. Here, --subDir names the subdirectory where the developer wants to store the training videos.

SilatraPythonModuleBuilder

Open a terminal in this folder and install the module with the command: python3 setup.py install. This installs the silatra_cpp module, which provides faster skin segmentation. It applies the same morphology operations as the Python code but uses RGB+YUV colour models for skin segmentation. For speed, it is implemented entirely in C++ with a Python wrapper. If you want to use the silatra_cpp module, you need to modify the code in the appropriate places.

Utilities

Detailed description of the contents within this folder is provided in the README.md file inside the Utilities Folder.

Developers

Contributors: dev-td7, devendravyavaharkar, kartik2112, vrr-21


silatra's Issues

Github LFS Issue

If you commit large files without first running git lfs track "*.sav" and then try to push the commits, you get an error like:

Git LFS: (0 of 0 files, 2 skipped) 0 B / 0 B, 158.28 MB skipped
Counting objects: 24, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (24/24), done.
Writing objects: 100% (24/24), 37.03 MiB | 1.18 MiB/s, done.
Total 24 (delta 9), reused 0 (delta 0)
remote: Resolving deltas: 100% (9/9), completed with 5 local objects.
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: Trace: 2bdd15b682b0d95b5313d22c1f07d9fe
remote: error: See http://git.io/iEPt8g for more information.
remote: error: File Sign recognition/KNN_Grid_ModelDump.sav is 126.20 MB; this exceeds GitHub's file size limit of 100.00 MB
To https://github.com/kartik2112/Silatra.git
 ! [remote rejected] kartik-interfacing -> kartik-interfacing (pre-receive hook declined)

Quick Resolve

For the C++ to Python neural network initialization, move the model-loading code out of the Python function to module level. The model can then be loaded once and reused for every call, which could speed up recognition to some extent.
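One way to get this load-once behaviour is to cache the loader, as in the sketch below. It assumes the model is stored as a pickle file (the .sav dumps in this repository suggest that, but the format is an assumption), and load_model is a hypothetical name, not a function from the codebase.

```python
import functools
import pickle


@functools.lru_cache(maxsize=None)
def load_model(path):
    """Unpickle the model at `path` once; later calls return the cached object."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Inside the per-frame recognition function, calling load_model(path) is then
# cheap after the first call, because the file is not read or unpickled again.
```

The cache is keyed on the path string, so a model file replaced on disk is not reloaded until the process restarts or load_model.cache_clear() is called.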

Mobile app integration

Can we use this model directly on a local machine and use an API call to get the output on the mobile screen?
