Deep facial expression recognition using OpenCV and TensorFlow. Recognizes facial expressions from images or a camera stream.

License: GNU General Public License v3.0

Python 100.00%

cnn cnn-classification machine-learning deep-learning facial-expression-recognition facial-landmarks tflearn tensorflow fer2013 opencv python hog-features images image-classification

facial-expression-recognition-using-cnn's Introduction

Facial expression recognition using CNN in Tensorflow

Using a Convolutional Neural Network (CNN) to recognize facial expressions from images or video/camera stream.

1. Motivation

2. Why is Fer2013 challenging?

3. Classification results

4. How to use?

Install the dependencies
Download and prepare the data
Train the model
Optimize the hyperparameters
Evaluate a trained model
Recognizing facial expressions from an image file
Recognizing facial expressions in real time from video/camera

5. Contributing

1. Motivation

The goal is to get a quick baseline to compare if the CNN architecture performs better when it uses only the raw pixels of images for training, or if it's better to feed some extra information to the CNN (such as face landmarks or HOG features). The results show that the extra information helps the CNN to perform better.

To train the model, we used the Fer2013 dataset, which contains more than 30,000 images of facial expressions grouped into seven categories: Angry, Disgust, Fear, Happy, Sad, Surprise and Neutral.

The faces are first detected using OpenCV, then the face landmarks are extracted using dlib. We also extract the HOG features, and we feed the raw image data together with the face landmarks and HOG features into a convolutional neural network.
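
The sliding-window variant of the HOG features, listed among the experiments and the conversion options, can be pictured as cutting each face image into overlapping patches and describing each patch separately. The sketch below only enumerates the patch coordinates for a 48x48 Fer2013 image. The function name `sliding_windows`, the window size (24) and the step (6) are assumptions chosen for illustration, not values confirmed by this README, and the HOG computation itself is left out.

```python
def sliding_windows(height, width, window_size, step):
    """Yield (top, left, bottom, right) boxes for every window that fits fully."""
    for top in range(0, height - window_size + 1, step):
        for left in range(0, width - window_size + 1, step):
            yield (top, left, top + window_size, left + window_size)


# Fer2013 faces are 48x48 pixels; window_size and step are illustrative assumptions.
boxes = list(sliding_windows(48, 48, window_size=24, step=6))
print(len(boxes))            # 25 patches per image with these settings
print(boxes[0], boxes[-1])   # first and last patch coordinates
```

Each patch would then get its own HOG descriptor, and the descriptors would be concatenated into one feature vector per image.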

For our experiments, we used 2 CNN models:

2. Why is Fer2013 challenging?

Fer2013 is a challenging dataset. The images are not aligned and some of them are incorrectly labeled, as we can see from the following images. Moreover, some samples do not contain faces.

This makes the classification harder because the model has to generalize well and be robust to incorrect data. The best accuracy reported on this dataset, as far as I know, is 75.2%, described in this paper: [Facial Expression Recognition using Convolutional Neural Networks: State of the Art, Pramerdorfer & al. 2016]

3. Classification Results (training on 5 expressions)

Experiments	SVM	Model A	Model B	Difference
CNN (on raw pixels)	-----	72.4%	73.5%	+1.1%
CNN + Face landmarks	46.9%	73.5%	74.4%	+0.9%
CNN + Face landmarks + HOG	55.0%	68.7%	73.2%	+4.5%
CNN + Face landmarks + HOG + sliding window	59.4%	71.4%	75.1%	+3.7%

As expected:

The CNN models give better results than the SVM (you can find the code for the SVM implementation in the following repository: Facial Expressions Recognition using SVM)
Combining more features, such as face landmarks and HOG, slightly improves the accuracy.
Since CNN Model B uses deep convolutions, it gives better results in all experiments (by up to 4.5%).

It's interesting to note that using HOG features in CNN Model A decreased the results compared to using only the raw data. This may be caused by overfitting or by a failure to extract the correlation between the different inputs.
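
The Difference column and the Model A drop discussed above can be re-derived from the accuracies in the results table. The snippet below is only an arithmetic check on those published numbers; the variable names are illustrative.

```python
# Accuracies (%) copied from the results table: (Model A, Model B).
results = {
    "CNN (on raw pixels)": (72.4, 73.5),
    "CNN + Face landmarks": (73.5, 74.4),
    "CNN + Face landmarks + HOG": (68.7, 73.2),
    "CNN + Face landmarks + HOG + sliding window": (71.4, 75.1),
}

# Model B minus Model A, rounded to one decimal like the table.
differences = {name: round(b - a, 1) for name, (a, b) in results.items()}
for name, diff in differences.items():
    print("{}: {:+.1f}%".format(name, diff))

# Change in Model A when landmarks + HOG are added on top of raw pixels.
raw_a = results["CNN (on raw pixels)"][0]
hog_a = results["CNN + Face landmarks + HOG"][0]
model_a_change = round(hog_a - raw_a, 1)
print("Model A, raw pixels -> landmarks + HOG: {:+.1f} points".format(model_a_change))
```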

In the following table, we can see the effects of the batch normalization on improving the results:

Batch norm effects	on Model A	on Model B
CNN (on raw pixels)	+7.4%	+39.3%
CNN + Face landmarks	+26.2%	+50.0%
CNN + Face landmarks + HOG	+1.9%	+50.1%
CNN + Face landmarks + HOG + sliding window	+16.7%	+16.9%

In the previous experiments, I used only 5 expressions for the training: Angry, Happy, Sad, Surprise and Neutral.

The accuracy of the best model trained on the whole dataset (7 emotions) dropped to 61.4%. The state-of-the-art result on this dataset, as far as I know, is 75.2%, described in this paper.

Note: the code was tested in python 2.7 and 3.6.

4. How to use?

4.1. Install dependencies

Tensorflow
Tflearn
Numpy
Argparse
[optional] Hyperopt + pymongo + networkx
[optional] dlib, imutils, opencv 3
[optional] scipy, pandas, skimage

It is easier to use an Anaconda environment to install the dependencies (especially opencv and dlib).

4.2. Download and prepare the data

Download Fer2013 dataset and the Face Landmarks model
- Kaggle Fer2013 challenge
- Dlib Shape Predictor model
Unzip the downloaded files

And put the files fer2013.csv and shape_predictor_68_face_landmarks.dat in the root folder of this package.
Convert the dataset to extract Face Landmarks and HOG Features
```
python convert_fer2013_to_images_and_landmarks.py
```
You can also use these optional arguments according to your needs:
- -j, --jpg (yes|no): save images as .jpg files (default=no)
- -l, --landmarks (yes|no): extract Dlib Face landmarks (default=yes)
- -ho, --hog (yes|no): extract HOG features (default=yes)
- -hw, --hog_windows (yes|no): extract HOG features using a sliding window (default=yes)
- -hi, --hog_images (yes|no): extract HOG images (default=no)
- -o, --onehot (yes|no): one hot encoding (default=yes)
- -e, --expressions (list of numbers): choose the facial expressions you want to use: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral (default=0,1,2,3,4,5,6)
Examples:
```
python convert_fer2013_to_images_and_landmarks.py
python convert_fer2013_to_images_and_landmarks.py --landmarks=yes --hog=no --hog_windows=no --jpg=no --expressions=1,4,6
```
The script will create a folder containing the prepared data, saved as numpy arrays. Make sure the --onehot argument is set to yes (the default value).
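
To make the yes/no flags and the comma-separated `--expressions` list more concrete, here is a minimal argparse sketch of how such options could be parsed. It is not the script's actual code: the helpers `yes_no` and `parse_expressions` are hypothetical, and only three of the documented options are reproduced.

```python
import argparse

EXPRESSION_NAMES = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def yes_no(value):
    """Convert a 'yes'/'no' string into a boolean (hypothetical helper)."""
    value = value.lower()
    if value not in ("yes", "no"):
        raise argparse.ArgumentTypeError("expected 'yes' or 'no', got %r" % value)
    return value == "yes"


def parse_expressions(value):
    """Convert '1,4,6' into [1, 4, 6], rejecting ids outside 0..6."""
    ids = [int(item) for item in value.split(",")]
    for label in ids:
        if not 0 <= label < len(EXPRESSION_NAMES):
            raise argparse.ArgumentTypeError("unknown expression id %d" % label)
    return ids


parser = argparse.ArgumentParser()
parser.add_argument("--landmarks", type=yes_no, default=True)
parser.add_argument("--hog", type=yes_no, default=True)
parser.add_argument("--expressions", type=parse_expressions, default=list(range(7)))

# Parse an explicit argument list so the sketch runs without a command line.
args = parser.parse_args(["--landmarks=yes", "--hog=no", "--expressions=1,4,6"])
print(args.landmarks, args.hog, [EXPRESSION_NAMES[i] for i in args.expressions])
```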

4.3. Train the model

Choose your parameters in 'parameters.py'
Launch training:

python train.py --train=yes

The variable output_size in parameters.py (line 20), should correspond to the number of facial expressions you want to train on. By default it is set to 7 expressions.
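
The reason output_size has to match the number of selected expressions is that each label is one-hot encoded into a vector with one slot per expression. The plain-Python sketch below illustrates that relationship; `remap_labels` and `one_hot` are hypothetical helpers and the sample labels are made up, so this is an illustration of the idea rather than the repository's implementation.

```python
def remap_labels(labels, selected):
    """Map original Fer2013 ids to consecutive indices 0..len(selected)-1."""
    new_index = {original: i for i, original in enumerate(selected)}
    return [new_index[label] for label in labels if label in new_index]


def one_hot(index, size):
    """Return a list of length `size` with a single 1.0 at position `index`."""
    vector = [0.0] * size
    vector[index] = 1.0
    return vector


selected = [0, 3, 4, 5, 6]        # Angry, Happy, Sad, Surprise, Neutral
output_size = len(selected)       # the value output_size would need to take here
raw_labels = [3, 1, 6, 0, 2, 5]   # made-up sample; ids 1 and 2 are dropped

encoded = [one_hot(i, output_size) for i in remap_labels(raw_labels, selected)]
print(encoded)
```

If the label vectors and output_size disagree in length, the network's output layer and the training targets no longer have the same shape.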

Train and evaluate:

python train.py --train=yes --evaluate=yes

N.B: make sure the parameter "save_model" (in parameters.py) is set to True if you want to train and evaluate

4.4. Optimize training hyperparameters

For this section, you'll need to install first these optional dependencies:

pip install hyperopt pymongo networkx

Launch the hyperparameter search:

python optimize_hyperparams.py --max_evals=20

You should then retrain your model with the best parameters

N.B: the accuracies displayed are for validation_set only (not test_set)
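
Conceptually, the search samples a configuration, scores it on the validation set, and keeps the best one, repeating this max_evals times. The sketch below shows that loop with plain random search from the standard library rather than hyperopt. The function `validation_accuracy` is a made-up stand-in for a real training run, and the parameter names and ranges are assumptions.

```python
import random


def validation_accuracy(params):
    """Toy stand-in for training a model and scoring it on the validation set."""
    # Peaks at learning_rate=0.016 and keep_prob=0.95; purely illustrative.
    return (1.0
            - abs(params["learning_rate"] - 0.016) * 10
            - abs(params["keep_prob"] - 0.95))


def random_search(max_evals, seed=0):
    """Sample `max_evals` configurations and return the best (params, score)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(max_evals):
        params = {
            "learning_rate": rng.uniform(0.001, 0.1),
            "keep_prob": rng.uniform(0.5, 1.0),
        }
        score = validation_accuracy(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(max_evals=20)
print(best_params, round(best_score, 3))
```

Because the score comes from the validation set, the retrained model should still be checked on the test set afterwards.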

4.5. Evaluate a trained model (calculating test accuracy)

Modify 'parameters.py':

Set "save_model_path" parameter to the path of your pretrained file.

Launch evaluation on test_set:

python train.py --evaluate=yes

4.6. Recognizing facial expressions from an image file

For this section you will need to install dlib and opencv 3 dependencies
Modify 'parameters.py':

Set "save_model_path" parameter to the path of your pretrained file

Predict emotions from a file

python predict.py --image path/to/image.jpg

4.7. Recognizing facial expressions in real time from video

For this section you will need to install dlib, imutils and opencv 3 dependencies
Modify 'parameters.py':

Set "save_model_path" parameter to the path of your pretrained file

Predict emotions from the video/camera stream

python predict-from-video.py

A window will appear with a box around the face and the predicted expression. Press 'q' key to stop.

N.B: If you changed the number of expressions while training the model (default 7 expressions), please update the emotions array in parameters.py line 51.
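
The emotions array needs one name per model output, in the same order as the expressions selected during conversion, because the predicted index is looked up in it. The sketch below shows that lookup; `emotions_for` and `decode` are hypothetical helpers and the probabilities are made up, so treat it as an illustration rather than the code used by predict.py.

```python
ALL_EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]


def emotions_for(selected):
    """Names of the selected expressions, in the order of the model outputs."""
    return [ALL_EMOTIONS[i] for i in selected]


def decode(probabilities, emotions):
    """Return (label, confidence) for the highest-scoring output."""
    if len(probabilities) != len(emotions):
        raise ValueError("model has %d outputs but %d emotion names"
                         % (len(probabilities), len(emotions)))
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return emotions[best], probabilities[best]


emotions = emotions_for([3, 4, 6])   # Happy, Sad, Neutral
probabilities = [0.10, 0.70, 0.20]   # made-up model output
print(decode(probabilities, emotions))
```

With the default 7-name array and a 3-output model, index 0 would be reported as "Angry" even though the model's first output means "Happy".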

5. Contributing

Some ideas for interested contributors:

Automatically downloading the data
Adding data augmentation?
Adding other features extraction techniques?
Improving the models

Feel free to add or suggest more ideas. Please report any bug in the issues section.


facial-expression-recognition-using-cnn's Issues

Hello, some questions about the classification accuracy

I tested your model, but the recognition rate was only 41%.

Issue in train.py

@amineHorseman sir, when I run python train.py --train=yes in cmd, it throws an error at line 22 in train.py, which leads to line 31 in data_loader.py, and I get an error saying "all the input array dimensions except for the concatenation axis must match exactly." I have not changed anything in the code. Thank you.

convert_fer2013_to_images_and_landmarks

First of all, I verified that the fer2013.csv contains 28709 training data, 3436 publictest, and 3589 privatetest.

However, after running convert_fer2013_to_images_and_landmarks, I only get 3436 training data, 56 publictest data, and 8 privatetest data, which I think happens because of this line:
if labels[i] in SELECTED_LABELS and nb_images_per_label[get_new_label(labels[i])] < IMAGES_PER_LABEL:

I guess you want to cap the number of images per expression so that the data is balanced, but because the same limit is shared across the training, public and private data, it results in an imbalanced data distribution.

Was that intentional? I think the training becomes really weird with only 3k training, 56 validation and 8 test samples.

Thanks

data_dict['X2'] = np.concatenate((data_dict['X2'], np.load(DATASET.train_folder + '/hog_features.npy')), axis=1)

When I ran 'python train.py --train=yes', the line 'data_dict['X2'] = np.concatenate((data_dict['X2'], np.load(DATASET.train_folder + '/hog_features.npy')), axis=1)' from data_loader.py reported an error: 'numpy.AxisError: axis 1 is out of bounds for array of dimension 1'. I do not know how to correct it. I tried changing axis=1 to axis=0, but then the line 'return X[start]' from utils.py raised a new error: 'index 9006 is out of bounds for axis 0 with size 0'.
Anyway, did anyone run this code successfully, and how should I correct my code? Thanks.

Unexpected error occured

While running predict.py in the command prompt, I'm getting this unexpected error message: "Python has stopped working".
Please help.

training accuracy

about the accuracy

Hi, I have some problems with the optimizer parameters. I ran optimize_hyperparams.py with max_evals=20, but still cannot reach the accuracy of 75.1%. In fact, my results are always under 70%.
Is the value of max_evals too small? I use 5 emotions (angry, happy, sad, surprise, neutral). Could you help me?

[Solved] ValueError: Cannot feed value of shape

When using python train.py --train=yes --evaluate=yes it gives the following error:
Traceback (most recent call last):
File "train.py", line 118, in <module>
train()
File "train.py", line 57, in train
n_epoch=TRAINING.epochs)
File "/anaconda3/lib/python3.6/site-packages/tflearn/models/dnn.py", line 216, in fit
callbacks=callbacks)
File "/anaconda3/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 339, in fit
show_metric)
File "/anaconda3/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 818, in _train
feed_batch)
File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1111, in _run
str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (48,) for Tensor 'output/Y:0', which has shape '(?, 5)'

How to overcome this error?

provide video example

Dear Horseman,
Could you please provide an example video?
When I tried to run it on a video, the error "can grab frames " occurred.

Landmarks and hog feature

Great work, man.
I want to know how you are feeding the landmarks and HOG features to your network.
Are you passing them in the same form as the image, or as a label?

about the confusion matrix

Hi, I am a newcomer to expression recognition research. I want to know how to get the confusion matrix. Thank you in advance for your reply.

[Enhancement] Automatic Dataset download

A good feature to automate the benchmarking is to add a module for automatic dataset download.

question about Classification Results

why do you train on 5 expressions instead of 7 expressions?

TypeError: __call__(): incompatible function arguments. The following argument types are supported:

Mac-Pro:facial-expression-recognition-using-cnn morewayshealthcareserver$ python3 predict.py --image img/adrian.jpg
loading pretrained model...
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tflearn/initializations.py:119: UniformUnitScaling.__init__ (from tensorflow.python.ops.init_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tflearn/objectives.py:66: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Traceback (most recent call last):
File "predict.py", line 93, in <module>
emotion, confidence = predict(image, model, shape_predictor)
File "predict.py", line 54, in predict
face_landmarks = np.array([get_landmarks(image, face_rects, shape_predictor)])
File "predict.py", line 39, in get_landmarks
return np.matrix([[p.x, p.y] for p in predictor(image, rects[0]).parts()])
TypeError: __call__(): incompatible function arguments. The following argument types are supported:
1. (self: dlib.shape_predictor, image: array, box: dlib.rectangle) -> dlib.full_object_detection

Invoked with: <dlib.shape_predictor object at 0x13da5b378>, None, rectangle(0,0,48,48)

3 expression training

I have tried to train on 3 expressions, 3,4,6: Sad, Happy and Neutral.
But when I run the prediction on a certain image, it predicts "Angry".
Can I change image_height and image_width to 256 x 256 to improve the accuracy?
image_height = 48
image_width = 48
What should I do to improve accuracy?

Thanks for your great works.

Issue with train.py

@amineHorseman Quick observation. Noticed that when we run "convert_fer2013_to_images_and_landmarks.py" for expressions less than 7, and when we run train.py, we get an exception which says "axis 1 out of bounds for array of dimension 1".

about zerodivisionerror

hello amine!
I tried to train 5 emotions with using landmarks.
the train environment is like this:

emotions = 5
model = B
optimizer = 'momentum'
learning_rate = 0.016
learning_rate_decay = 0.864
optimizer_param (momentum) = 0.95
keep_prob = 0.956
epochs = 100
use landmarks = True
use hog + landmarks = False
use hog sliding window + landmarks = False
use batchnorm after conv = True
use batchnorm after fc = False

and I also changed [output_size=5] in parameters.py. but when I tried to train,

Traceback (most recent call last):
File "train.py", line 124, in <module>
train()
File "train.py", line 60, in train
n_epoch=TRAINING.epochs)
File "/anaconda3/envs/tf/lib/python3.6/site-packages/tflearn/models/dnn.py", line 216, in fit
callbacks=callbacks)
File "/anaconda3/envs/tf/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 339, in fit
show_metric)
File "/anaconda3/envs/tf/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 847, in _train
e = evaluate_flow(self.session, eval_ops, self.test_dflow)
File "/anaconda3/envs/tf/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 1003, in evaluate_flow
res = [r / dataflow.n_samples for r in res]
File "/anaconda3/envs/tf/lib/python3.6/site-packages/tflearn/helpers/trainer.py", line 1003, in <listcomp>
res = [r / dataflow.n_samples for r in res]
ZeroDivisionError: float division by zero

Is there anything else that needs to be fixed? :)

unable to open shape predictor.

File "G:/Asad/Projects/facial-expression-recognition-using-cnn-master/facial-expression-recognition-using-cnn-master/convert_fer2013_to_images_and_landmarks.py", line 61, in <module>
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')

RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

A question about convert_fer2013_to_images_and_landmarks.py

Hello, prof!
When I run convert_fer2013_to_images_and_landmarks.py with --jpg=yes to save the images, too few images are saved; for example, the PrivateTest folder contains only 8 images.
This is my command:
python convert_fer2013_to_images_and_landmarks.py --jpg=yes --landmarks=yes --hog=yes --hog_windows=yes --hog_images=yes --onehot=yes --expressions=0,1,2,3,4,5,6

Run model on test image using predict.py

Hi, I am facing an issue when running python predict.py --image pic.jpg to test the prediction on a random image. I downloaded an angry face image from the internet to check the model prediction, but I am getting the error below (please check the screenshot).
Your help would be much appreciated!

Regards,
Ajay Sharma

Issue in executing optimize_hyperparams.py

Hi,
Can anybody tell me why I am getting this error when I try to run optimize_hyperparams.py: line 7, import pprint(
SyntaxError: invalid syntax

Thanks

Could not run successfully while using Anaconda

I'm using the code in Anaconda Spyder, and optimize_hyperparams.py quits with an error like this:
usage: [-h] -m MAX_EVALS
: error: the following arguments are required: -m/--max_evals
An exception has occurred, use %tb to see the full traceback.
SystemExit: 2

Unable to open shape predictor

RuntimeError: Unable to open shape_predictor_68_face_landmarks.dat

convert_fer2013_to_images_and_landmarks.py

Are these two sentences wrongly written?
if ONE_HOT_ENCODING:
    np.save(OUTPUT_FOLDER_NAME + '/' + category + '/labels.npy', labels_list)
else:
    np.save(OUTPUT_FOLDER_NAME + '/' + category + '/labels.npy', labels_list)
@amineHorseman

Publication paper

Is there a published paper available for this work?

Testing python 3 compatibility

Can anyone test the repository on python 3.6 and list the problems needed to be solved?

add speech recognition

How can I add speech recognition to the emotion recognition, so that it can respond with speech based on our emotion and our phrase? I tried, but without success. Please help me.

error while using predict.py

hi,

I'm constantly getting this error whenever I try to run predict.py. I am not sure what is wrong. Is it because of some package update?

Use standard file APIs to check for files with this prefix.
/home/gaurav/anaconda3/envs/openface/lib/python2.7/site-packages/skimage/feature/_hog.py:150: skimage_deprecation: Default value of block_norm==L1 is deprecated and will be changed to L2-Hys in v0.15. To supress this message specify explicitly the normalization method.
skimage_deprecation)
/home/gaurav/anaconda3/envs/openface/lib/python2.7/site-packages/skimage/feature/_hog.py:248: skimage_deprecation: Argument visualise is deprecated and will be changed to visualize in v0.16
'be changed to visualize in v0.16', skimage_deprecation)
Traceback (most recent call last):
File "predict.py", line 93, in <module>
emotion, confidence = predict(image, model, shape_predictor)
File "predict.py", line 57, in predict
hog_features = sliding_hog_windows(image)
File "predict.py", line 47, in sliding_hog_windows
cells_per_block=(1, 1), visualise=False))
File "/home/gaurav/anaconda3/envs/openface/lib/python2.7/site-packages/skimage/feature/_hog.py", line 211, in hog
g_row, g_col = _hog_channel_gradient(image)
File "/home/gaurav/anaconda3/envs/openface/lib/python2.7/site-packages/skimage/feature/_hog.py", line 41, in _hog_channel_gradient
g_col[:, 0] = 0
IndexError: index 0 is out of bounds for axis 1 with size 0

sliding window

Hello, I am a newcomer to expression recognition research. I want to know what the sliding window does with the data. Why can it improve the accuracy? Thank you in advance for your reply.
