realtimefer's Introduction

Real time facial expression recognition for webcam application

Real time Demo (using one RGB camera)

[Image: real-time demo GIF]

Frameworks

  • Face detection is handled by MediaPipe, developed by Google.

  • Facial expression recognition uses a DCNN (deep convolutional neural network) trained on the FER+ dataset, which is maintained by Microsoft.

[Image: neural network (nn) diagram]

(Thanks to @zc for the image)

Language & Dependencies

  • Language: Python 3.6

  • Dependencies:

    • pytorch
    • opencv-python
    • mediapipe (modified)
    • CUDA10.1 (optional)
    • ...
  • You can install all the dependencies with the command python -m pip install -r requirements.txt

Details

  • Usage:

    • Run from prebuilt exe: see release-windows-v0.1 (Recommended)
    • Run from source:
      • Replace drawing_utils.py in your installed mediapipe package with src/drawing_utils.py, which I have slightly modified.
      • Contact me by email to get the trained model, then run:
      python camdemo.py --camera 0
  • Performance:

    • Absolutely REALTIME! The model averages above 60 FPS on a plain PC. If possible, use a GPU for better performance!
    • My poor computer: Intel i7-7700K CPU (4.2 GHz) with an NVIDIA Quadro P2000 (5 GB memory)
  • Model Structure: the model is quite simple. It uses ResNet50 as the backbone for feature extraction, followed by two fully connected layers. The output is a 10-element vector, one value per emotion class.

  • Accuracy: the model reaches 79.8% accuracy on the FER+ validation subset after 14 epochs of training with the softCE loss.

    epoch  KLdiv  softCE  weightedSoftCE
    0      0.005  0.005   0.005
    1      0.55   0.598   0.56
    2      0.58   0.652   0.668
    3      *      0.695   0.697
    4      *      0.726   0.71
    5      *      0.753   0.68
    6      *      0.76    0.665
    ...    ...    ...     ...
    14     *      0.798   0.742
  • Loss function:

    • Unlike the original FER dataset, each image in FER+ was labeled by 10 crowd-sourced taggers. The default cross-entropy implementation in PyTorch that I used takes a single hard label per image, which throws away the information in the 10 taggers' votes. So I implemented a soft cross-entropy (softCE) that trains the model to fit the probability distribution over emotion classes, and it gave pretty good results.

    • Another reason to use the softCE loss is that some human emotions are hard to tell apart, such as happiness and surprise.

    • FER+ is a very imbalanced dataset (see image below), so I tried a weightedSoftCE loss (similar in spirit to focal loss), but it did not help, and I don't yet understand why. If you happen to know, tell me! Also, when training with weightedSoftCE, the loss fluctuated up and down a lot, which suggests it is not very numerically stable.
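
The soft cross-entropy idea described above can be sketched in a few lines of PyTorch. This is only an illustration under assumptions, not the code used to train the released model: the function name soft_cross_entropy and the vote_counts argument are hypothetical, and it assumes each image comes with a 10-element vector holding the number of taggers who chose each class.

```python
import torch
import torch.nn.functional as F


def soft_cross_entropy(logits, vote_counts):
    """Cross-entropy between the predicted distribution and the tagger votes.

    logits:      (batch, 10) raw model outputs (hypothetical shape)
    vote_counts: (batch, 10) float tensor, taggers per class for each image
    """
    # Turn the raw vote counts into a probability distribution per image.
    totals = vote_counts.sum(dim=1, keepdim=True).clamp(min=1e-8)
    target = vote_counts / totals
    # log q_k for every class, computed in a numerically stable way.
    log_probs = F.log_softmax(logits, dim=1)
    # -sum_k p_k * log q_k per image, then averaged over the batch.
    return -(target * log_probs).sum(dim=1).mean()


# Made-up example: 7 taggers chose "happiness" (index 1), 3 chose "surprise" (index 2).
votes = torch.tensor([[0.0, 7.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
logits = torch.zeros(1, 10, requires_grad=True)
loss = soft_cross_entropy(logits, votes)
loss.backward()  # gradients flow back to the logits as with any other loss
```

When all votes fall on a single class, this reduces to the ordinary hard-label cross-entropy. Recent PyTorch releases (1.10 and later) also accept class probabilities directly as the target of F.cross_entropy, which may make a hand-written version unnecessary.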

[Image: FER+ class distribution]

Expressions  neutral  happiness  surprise  sadness  anger  disgust  fear  contempt  unknown  NF
index        0        1          2         3        4      5        6     7         8        9
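
To show how the index table above maps a model output to an expression name, here is a small framework-free sketch. It assumes the 10 output values are unnormalized scores, so a softmax is applied first. The helper names (softmax, decode) and the example scores are hypothetical and are not taken from camdemo.py.

```python
import math

# Index-to-expression mapping, copied from the table above.
EXPRESSIONS = ["neutral", "happiness", "surprise", "sadness", "anger",
               "disgust", "fear", "contempt", "unknown", "NF"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores):
    """Return (expression name, probability) for the highest-scoring class."""
    if len(scores) != len(EXPRESSIONS):
        raise ValueError("expected one score per expression class")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EXPRESSIONS[best], probs[best]


# Made-up scores for one detected face; index 1 ("happiness") is the largest.
example_scores = [0.2, 3.1, 1.4, -0.5, -1.0, -2.0, -0.8, -1.5, -3.0, -3.0]
label, confidence = decode(example_scores)
print(label, round(confidence, 3))
```

In a real pipeline the scores would come from the network's forward pass on a cropped face, and low-confidence predictions or the "unknown" and "NF" classes could be filtered out before drawing the label.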

Potential applications

  • Online education for children, where it could help identify whether children are paying attention. In on-site meetings or school classrooms, it could help judge the quality of a talk.

[Image: online education example]

  • On-site Human–Machine Interaction.
