google / shuwa


shuwa's Introduction

Shuwa Gesture Toolkit

Shuwa (手話) is Japanese for "Sign Language"

Shuwa Gesture Toolkit is a framework that detects and classifies arbitrary gestures in short videos. It is particularly useful for recognizing basic words in sign language. We collected thousands of example videos of people signing Japanese Sign Language (JSL) and Hong Kong Sign Language (HKSL) to train the baseline model for recognizing gestures and facial expressions.

The Shuwa Gesture Toolkit also allows you to train new gestures, so it can be trained to recognize any sign from any sign language in the world.

[Web Demo]

How it works

By combining the results of a Holistic model over multiple frames, we can create a reasonable set of requirements for interpreting sign language, which includes body language, facial expression, and hand gestures.

The next step is to predict the sign feature vector using a classifier model. Lastly, the class prediction is output using K-Nearest Neighbor classification.
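The final K-Nearest Neighbor step can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the function name `knn_predict`, the parameter `k`, and the majority vote are assumptions, and only the squared-difference distance mirrors the `np.square(self.knn_database - feats)` line that appears in a traceback further down this page.

```python
import numpy as np


def knn_predict(database, labels, feats, k=3):
    """Return the majority label among the k stored vectors closest to feats.

    Hypothetical helper: `database` is an (n_samples, n_features) array of
    recorded sign feature vectors and `labels` holds one label per row.
    """
    database = np.asarray(database, dtype=np.float32)
    feats = np.asarray(feats, dtype=np.float32)
    if database.ndim != 2 or database.shape[0] == 0:
        raise ValueError("feature database is empty; record at least one sign first")

    # Squared Euclidean distance from the query to every stored vector.
    dists = np.square(database - feats).sum(axis=1)

    # Indices of the k closest samples, nearest first.
    nearest = np.argsort(dists)[:k]

    # Majority vote; ties go to the label that was seen first (the nearest one).
    votes = {}
    for idx in nearest:
        votes[labels[idx]] = votes.get(labels[idx], 0) + 1
    return max(votes, key=votes.get)


database = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["hello", "hello", "thanks", "thanks"]
print(knn_predict(database, labels, [0.2, 0.1], k=3))
```

In this toy example the query lies next to the two "hello" samples, so the vote selects "hello". A real feature vector would be much longer than two values.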

Installation

Install Python 3.9
Install dependencies
```
pip3 install -r requirements.txt 
```

Run demo

python3 webcam_demo.py

Use Record mode to add more signs.
Play mode.

Try Hong Kong Sign Language or Japanese Sign Language

Download hksl_jsl_samples.zip
Extract it to the root directory
Run
```
python3 webcam_demo.py
```

Train classifier

You can add a custom sign by using Record mode in the full demo program.
If you want to train the classifier from scratch, you can check out the process here.


shuwa's Issues

The prediction of the JS version of the pose model is different from the Python version.

I ran inference with the JS version of the PoseNet model, and its predictions are worse than those of its Python counterpart. Is there a reason for the performance degradation between the Python and JS PoseNet models? @bit-kim

Training script issue

Hi, thank you for sharing this code!

We would like to ask for help with an error that we encountered while running the training script on our data, which contains only 10 words.
In the Jupyter notebook, model.fit_generator() produces the following error:

InvalidArgumentError: Input to reshape is a tensor with 10 values, but the requested shape has 1
[[node Reshape
(defined at /Users/abc/opt/anaconda3/envs/githubshuwa/lib/python3.7/site-packages/tensorflow_addons/losses/triplet.py:257)
]] [Op:__inference_train_function_13545]

We used the following tool versions:

Python 3.7
TensorFlow 2.6

We also changed the batch size from 32 to 1 because the demo web app expects the model to have an input shape with a batch size of 1.

We would like to ask:

What versions of Python and TensorFlow should we use?
Are there other possible causes for the error that we encountered?
How can we train with a batch size larger than 1 without affecting the expected input shape of the model?

Thank you very much in advance!
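One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the message would fit a loss that reshapes the labels to `[batch_size, 1]` while receiving a single one-hot label with 10 classes, that is, 10 values for a batch of 1. The sketch below uses NumPy as a stand-in to show the same shape mismatch; it does not call TensorFlow or `tensorflow_addons`, and the variable names are hypothetical.

```python
import numpy as np

num_classes = 10  # 10 words, as in the report above

# Hypothetical batch of size 1 with a one-hot label: shape (1, 10), 10 values.
one_hot = np.eye(num_classes, dtype=np.float32)[[3]]

# Assumed behavior of the loss: reshape labels to (batch_size, 1).
try:
    np.reshape(one_hot, (one_hot.shape[0], 1))
    reshape_error = None
except ValueError as err:
    # 10 values cannot fit into a (1, 1) array.
    reshape_error = str(err)
print(reshape_error)

# Integer class ids have one value per sample, so the same reshape succeeds.
sparse = np.argmax(one_hot, axis=1)                   # shape (1,)
reshaped = np.reshape(sparse, (sparse.shape[0], 1))   # shape (1, 1)
print(reshaped.shape)
```

If this assumption holds, checking the shape of the labels yielded by the data generator would be a reasonable first step.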

How to generate txt files corresponding to different gestures when training your own dataset

Web Demo Issue

Hi! We noticed that the output layer of the model was updated. The new architecture is now incompatible with the web demo application.

signClassify.js expects to get an array of 2 elements from classifyModel.predict(..), but the new model only returns 1 element.
Please refer to Line 103 or 454 of ./web_demo/public/js/ML/signClassify.js

Does it mean the web demo will also be updated?

Thank you!!!

Readme

Thank you for this interesting software. What version of Python should I use? 3.8 fails. 3.6 works, but is it the best choice? Also, python3 hand_landmark\webcam_demo_hand.py fails on Linux. This needs a forward slash, not a backslash. Also, the crop_utils module can't be found. Does something need to be added to PYTHONPATH? Any other hints would be appreciated. BTW, I can't type input to record in extract_knn_features.py. Thanks again. I'm on Pop!_OS 20.04.

I get a `ValueError: operands could not be broadcast together with shapes (0,) (336,)` when launching the Play demo

Whenever I launch the demo, I get the following error:

Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/lib/python3.9/tkinter/__init__.py", line 1892, in __call__
    return self.func(*args)
  File "/home/Reborn/Documents/Projects/sign_language/lsf/webcam_demo.py", line 57, in record_btn_cb
    res_txt = self.translator_manager.run_knn(feats)
  File "/home/Reborn/Documents/Projects/sign_language/lsf/modules/translator/translator_manager.py", line 100, in run_knn
    dists = np.square(self.knn_database - feats)
ValueError: operands could not be broadcast together with shapes (0,) (336,)

I'm on Python 3.9 and Debian 10.
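The shapes in the message indicate that `self.knn_database` has shape `(0,)` while `feats` has 336 values. The sketch below reproduces that mismatch with NumPy alone. Why the database is empty is not stated in this thread; having no recorded or extracted samples is only an assumption, and `squared_diffs` is a hypothetical guard, not a function from the repository.

```python
import numpy as np

feats = np.zeros(336, dtype=np.float32)  # one query feature vector
empty_db = np.array([])                  # stand-in for an empty database, shape (0,)

# Subtracting a 336-value vector from an empty array cannot be broadcast.
try:
    np.square(empty_db - feats)
    message = None
except ValueError as err:
    message = str(err)
print(message)


def squared_diffs(database, feats):
    """Return squared differences, or None when no samples are stored."""
    database = np.asarray(database)
    if database.size == 0:
        return None  # nothing to compare against yet
    return np.square(database - feats)


print(squared_diffs(empty_db, feats))
print(squared_diffs(np.ones((2, 336), dtype=np.float32), feats).shape)
```

With at least one stored sample of matching length, the subtraction broadcasts as expected.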