Welcome to our ASL to Text with Emotional Analysis project! This project aims to bridge the communication gap between people who use American Sign Language (ASL) and those who communicate through spoken or written language. Using machine learning and natural language processing, we provide a platform that interprets ASL alphabet gestures as text and adds emotional analysis to convey the sentiment behind the signs. Our goal is to foster a more inclusive society by making ASL more accessible and comprehensible to everyone, particularly individuals with Autism Spectrum Disorder (ASD), who may find it challenging to interpret emotional cues in sign language. This project was created by students at the University of Washington, Seattle: Hrudhai Umas, Rohan Sabhaya, and Kaden Kapadia.
Communication gaps, particularly around sign language, can lead to subtle differences in interpretation. Our project addresses this by developing a model that detects emotional cues in ASL and expresses them in natural language. The aim is to support people who are learning or mastering ASL and to promote a more inclusive environment for communication.
To see our project in action, check out this demo video showcasing how the system works, including real-time ASL sign recognition and sentiment analysis. Click Here For A Demo Video Of How It Works
- Please note that, for compute reasons, our project currently covers only the space sign and the letters through S. Our next steps are to add the remaining letters through Z and to incorporate dynamic words so users do not have to spell out every word.
- Python Version: 3.8 - 3.11 (Tested on Python 3.9.6) (IDE: PyCharm)
- Libraries: os, OpenCV, pickle, MediaPipe, NumPy, PyTorch, scikit-learn, transformers
- Hardware: A webcam for capturing sign language gestures
pip install opencv-python mediapipe numpy torch scikit-learn transformers
The project is structured into four main steps, each encapsulated in its own Python script:
- collect_imgs.py: Captures ASL signs using a webcam and stores the images in a data folder. Each letter of the alphabet, including a sign for space, is represented by 300 images.
- create_dataset.py: Processes the captured images to extract hand landmarks using MediaPipe, storing the data in a serialized format for training.
- NeuralNetworkML.py: Trains a neural network model on the processed dataset to recognize ASL signs.
- RealTimePredictionAndNLP.py: Utilizes the trained model for real-time ASL sign recognition and performs sentiment analysis on the interpreted signs.
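The landmark-extraction step described for create_dataset.py can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `normalize_landmarks` and `extract_features`, the single-hand limit, and the choice to shift coordinates by their minimum are all assumptions made for the example. It uses the MediaPipe Hands solution, which reports 21 landmarks per detected hand.

```python
from typing import List, Sequence, Tuple


def normalize_landmarks(points: Sequence[Tuple[float, float]]) -> List[float]:
    """Flatten (x, y) landmarks into one feature vector.

    Coordinates are shifted so the smallest x and y become 0, which makes
    the features independent of where the hand sits in the frame.
    (Assumed preprocessing; the real script may differ.)
    """
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    features: List[float] = []
    for x, y in points:
        features.extend([x - min_x, y - min_y])
    return features


def extract_features(image_path: str) -> List[float]:
    """Return features for the first detected hand, or [] if none is found."""
    # Imported lazily so normalize_landmarks works without these packages.
    import cv2
    import mediapipe as mp

    image = cv2.imread(image_path)
    if image is None:  # unreadable or missing file
        return []
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB input
    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        results = hands.process(rgb)
    if not results.multi_hand_landmarks:
        return []
    hand = results.multi_hand_landmarks[0]
    return normalize_landmarks([(lm.x, lm.y) for lm in hand.landmark])
```

With 21 landmarks per hand, each image would yield a 42-value vector under this scheme, which could then be serialized with pickle for training.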
The data folder contains images of the ASL signs for each letter of the alphabet and a sign for space, with 300 images per sign. It's automatically generated and populated by running collect_imgs.py.
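To confirm that a capture run produced the expected 300 images per sign, a small check like the one below may help. It is only a sketch: it assumes the data folder holds one subfolder per sign with image files inside, which is a guess about the layout, and the helper names `count_images` and `incomplete_classes` are hypothetical.

```python
from pathlib import Path
from typing import Dict

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats
EXPECTED_PER_SIGN = 300  # image count per sign stated in this README


def count_images(data_dir: str) -> Dict[str, int]:
    """Count image files in each sign subfolder of data_dir (assumed layout)."""
    root = Path(data_dir)
    counts: Dict[str, int] = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def incomplete_classes(
    counts: Dict[str, int], expected: int = EXPECTED_PER_SIGN
) -> Dict[str, int]:
    """Return only the signs whose image count differs from the expected one."""
    return {name: n for name, n in counts.items() if n != expected}
```

Calling `incomplete_classes(count_images("data"))` would return an empty dictionary when every sign folder holds exactly 300 images.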
To use the system, follow these steps in order. The first three steps have already been done for you, so you only need to run RealTimePredictionAndNLP.py for the program to work. Run steps 1-3 only if you want to capture your own images, build your own dataset, and train the model on it.
- (Optional, already done) Run collect_imgs.py to collect images for each ASL sign. Only needed if you would like to train your own ASL model.
- (Optional, already done) Run create_dataset.py to process the images and create a dataset. Required if you ran collect_imgs.py.
- (Optional, already done) Run NeuralNetworkML.py to train the machine learning model.
- Finally, execute RealTimePredictionAndNLP.py for real-time sign recognition and sentiment analysis.
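The run order above could be wrapped in a small helper script. This is a sketch only: the names `steps_to_run` and `run_steps` and the `retrain` flag are hypothetical, and it assumes the four scripts sit in the current working directory.

```python
import subprocess
import sys
from typing import List

# Optional steps, in the order listed in this README.
TRAINING_STEPS = ["collect_imgs.py", "create_dataset.py", "NeuralNetworkML.py"]
PREDICTION_STEP = "RealTimePredictionAndNLP.py"


def steps_to_run(retrain: bool) -> List[str]:
    """Return the scripts to execute, in order."""
    return (TRAINING_STEPS if retrain else []) + [PREDICTION_STEP]


def run_steps(retrain: bool = False) -> None:
    """Run each script with the current Python interpreter."""
    for script in steps_to_run(retrain):
        print(f"Running {script} ...")
        # check=True stops the sequence if a script exits with an error.
        subprocess.run([sys.executable, script], check=True)
```

Calling `run_steps()` launches only the real-time script, while `run_steps(retrain=True)` goes through image capture, dataset creation, and training first.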
This project relies on several libraries for image processing, machine learning, and natural language processing. Make sure you have the following dependencies installed:
- os: Standard library in Python, no need for separate installation.
- cv2 (OpenCV): For image processing and capturing webcam footage.
- pickle: Standard library in Python for serializing and deserializing Python object structures.
- mediapipe: For hand landmark detection.
- numpy: For numerical computations and handling arrays.
- torch (PyTorch): For implementing and training neural network models.
- sklearn (scikit-learn): For data preprocessing and splitting the dataset.
- transformers: For natural language processing, specifically for sentiment analysis.
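A quick way to check that the third-party dependencies are importable before running any script is sketched below. The helper name `missing_packages` is hypothetical; the mapping pairs each import name with the pip package name from the install command in this README.

```python
from importlib.util import find_spec
from typing import Dict, List

# Import name -> pip package name.
REQUIRED: Dict[str, str] = {
    "cv2": "opencv-python",
    "mediapipe": "mediapipe",
    "numpy": "numpy",
    "torch": "torch",
    "sklearn": "scikit-learn",
    "transformers": "transformers",
}


def missing_packages(required: Dict[str, str] = REQUIRED) -> List[str]:
    """Return pip names of packages whose module cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if find_spec(module) is None  # None means the module is not installed
    ]
```

If `missing_packages()` returns a non-empty list, passing those names to `pip install` should resolve it. The standard-library modules os and pickle need no check.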
Our project stands on the shoulders of giants, leveraging open-source libraries such as OpenCV, MediaPipe, PyTorch, scikit-learn, and transformers. We extend our gratitude to all the contributors of these projects. Please also let us know if there is anything we can improve on!
By embracing the motto "The World is One Family," we are committed to removing communication barriers and fostering inclusivity across all boundaries. Join us in our journey towards creating a more accessible and understanding world.