REAL-TIME-VIDEO-FEEDBACK-USING-FACIAL-EXPRESSION-RECOGNITION.

Problem Statement

Feedback detection using facial expression recognition can outperform existing feedback techniques. Pen-and-paper feedback systems are often unreliable and time consuming, and the same is true of online feedback forms. For instance, in the competitive market of content-producing platforms there is currently no widely available system that lets the audience give feedback at a particular point of a session or piece of content. Real-time, contextual audience feedback can help improve the content of a lecture and steadily refine its delivery.

Proposed Methodology

In the proposed system we take a Winograd convolution approach to implement feedback detection using facial expression recognition. After pre-processing and feature extraction, the Winograd-convolution network classifies each face's expression and generates a facial expression report. To keep track of the emotion of a particular person over time, we use position-based person tagging and generate a report over the whole period.
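The per-frame pipeline can be pictured as in the sketch below. This is only an illustration, not the code in final.py: the face detector (OpenCV's bundled Haar cascade), the label order in EMOTIONS, and the tag_face helper (covered in the tracking section later) are assumed names for this example.

    import cv2

    # Illustrative sketch only; final.py may use a different detector and label order.
    EMOTIONS = ['Neutral', 'Happy', 'Surprise', 'Sad', 'Disgust', 'Anger', 'Fear']  # order is illustrative
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

    def process_frame(frame, model, tracked):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
            face = cv2.resize(gray[y:y + h, x:x + w], (48, 48)) / 255.0   # 48x48 grayscale input
            probs = model.predict(face.reshape(1, 48, 48, 1))[0]          # CNN inference
            emotion = EMOTIONS[probs.argmax()]
            tag_face(tracked, (x, y, w, h), emotion)  # position-based person tagging (hypothetical helper)
        return tracked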


Instructions


Files Structure:

  1. FER_CNN.ipynb - Notebook tutorial for training the CNN.
  2. final.py - Uses the pre-trained model for inference. Produces recorded_face_data.xlsx, which contains the frame-by-frame facial expressions of the people detected.
  3. generateGraphs.py - Generates statistics from the recorded_face_data.xlsx file.
  4. model.json - Neural network architecture.
  5. weights.h5 - Trained model weights.

CNN Architecture

  • TRAINING AND CLASSIFICATION: Supervised learning is an important technique for solving classification problems.
  • CNN: The task is to classify the seven basic emotions a person expresses: Neutral, Happy, Surprise, Sad, Disgust, Anger, and Fear. To achieve this, a Winograd convolution model was used.
    • Winograd Convolution: An important point of optimization to discuss here is Winograd convolution. A convolution layer with filters of size 3x3 needs 9 parameters per filter. Now consider using two layers with filter sizes 3x1 and 1x3: by the properties of matrix multiplication the stack still covers a 3x3 region, but the number of parameters drops to 3 + 3 = 6 instead of 9. This comes at a small cost in accuracy, but given the performance gain we go with the two convolution layers (3x1 and 1x3) instead of a single 3x3 layer. The layers used in our CNN model are listed below; in the first four phases we extract features and obtain a verbose feature map. (A minimal Keras sketch of these phases is given after the hyper-parameter list below.)

    • Phase 1
      Convolutional: Filters- 64, Size: 3x1, Stride- 1, Active padding, Input- 48x48x1
      Convolutional: Filters- 64, Size: 1x3, Stride- 1, Active padding
      Batch Normalization
      Activation: ReLU
      MaxPool: Size- 2x2, Stride- None, Active padding
      Dropout: 0.25

    • Phase 2
      Convolutional: Filters- 128, Size: 3x1, Stride- 1, Active padding
      Convolutional: Filters- 128, Size: 1x3, Stride- 1, Active padding
      Batch Normalization
      Activation: ReLU
      MaxPool: Size- 2x2, Stride- None, Active padding
      Dropout: 0.25

    • Phase 3
      Convolutional: Filters- 256, Size: 3x1, Stride- 1, Active padding
      Convolutional: Filters- 256, Size: 1x3, Stride- 1, Active padding
      Batch Normalization
      Activation: ReLU
      MaxPool: Size- 2x2, Stride- None, Active padding
      Dropout: 0.25

    • Phase 4
      Convolutional: Filters- 512, Size: 3x1, Stride- 1, Active padding
      Convolutional: Filters- 512, Size: 1x3, Stride- 1, Active padding
      Batch Normalization
      Activation: ReLU
      MaxPool: Size- 2x2, Stride- None, Active padding
      Dropout: 0.25
      Flatten

      Using this flattened feature map (512 filters in the last convolutional phase), fully connected layers (FCN or Dense layers) can be used:

    • Phase 5
      FCN: Units- 512
      Batch Normalization
      Activation: ReLU
      Dropout: 0.25

    • Phase 6
      FCN: Units- 256
      Batch Normalization
      Activation: ReLU
      Dropout: 0.25

    • Phase 7
      FCN: Units- 7
      Activation: Softmax

      Below are the hyper-parameters defined for the CNN:
      Batch size: the number of input samples fed to the network together.
      Batch size = 32
      Epoch: one epoch is one full pass of all the training data through the network.
      Num epochs = 30
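
      A minimal Keras sketch of Phases 1-7 and the hyper-parameters above follows. It assumes "Active padding" corresponds to Keras' padding='same'; the canonical architecture is the one shipped in model.json and weights.h5, and the optimizer/loss here are illustrative choices, not taken from the repository.

      from keras.models import Sequential
      from keras.layers import (Conv2D, BatchNormalization, Activation,
                                MaxPooling2D, Dropout, Flatten, Dense)

      def build_fer_cnn(input_shape=(48, 48, 1), num_classes=7):
          model = Sequential()
          # Phases 1-4: factorized 3x1 + 1x3 convolutions, BN, ReLU, 2x2 max-pooling, dropout
          for i, filters in enumerate([64, 128, 256, 512]):
              extra = {'input_shape': input_shape} if i == 0 else {}
              model.add(Conv2D(filters, (3, 1), strides=1, padding='same', **extra))
              model.add(Conv2D(filters, (1, 3), strides=1, padding='same'))
              model.add(BatchNormalization())
              model.add(Activation('relu'))
              model.add(MaxPooling2D(pool_size=(2, 2)))
              model.add(Dropout(0.25))
          model.add(Flatten())
          # Phases 5-6: fully connected blocks
          for units in [512, 256]:
              model.add(Dense(units))
              model.add(BatchNormalization())
              model.add(Activation('relu'))
              model.add(Dropout(0.25))
          # Phase 7: softmax over the seven emotion classes
          model.add(Dense(num_classes, activation='softmax'))
          return model

      model = build_fer_cnn()
      model.compile(optimizer='adam',                     # optimizer and loss are assumptions
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])
      # model.fit(x_train, y_train, batch_size=32, epochs=30)  # hyper-parameters from the text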

  • POSITION BASED PERSON TRACKING: Steady footage is required because position-based person tagging is used to distinguish the people sitting in the room. Captured faces are stored by their position; newly appeared faces (faces not previously captured) are added based on their area of intersection with previously stored faces, and the facial expression of each stored face is recorded frame by frame. See the dedicated section below for details.

More About POSITION BASED PERSON TRACKING

We used this algorithm to detect the real-time facial expressions of the attendees from footage captured by a steady camera. Steady footage is required because position-based person tagging is used to distinguish the people sitting in the room. Each captured face is stored in a dictionary on the basis of its position. To add newly appeared faces (faces that were not previously captured), we developed an algorithm that registers a new face based on its area of intersection with previously added faces: a detection that overlaps a stored face strongly enough is treated as the same person, otherwise it is added as a new person. For every face in the dictionary, the facial expression appearing in its box is recorded frame by frame, and finally the statistics for each person are plotted on a graph. A sketch of this matching logic follows the flowchart below.

Position based person tracking flowchart
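
A hypothetical sketch of this matching step is shown below. It is not the exact algorithm in final.py: the box format (x, y, w, h), the overlap threshold, and the tracked-dictionary layout are assumptions made for illustration.

    def intersection_area(a, b):
        # Boxes are (x, y, w, h); returns the overlapping area of the two boxes.
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        iy = max(0, min(ay + ah, by + bh) - max(ay, by))
        return ix * iy

    def tag_faces(detections, tracked, overlap_ratio=0.5):
        # tracked maps person_id -> {'box': (x, y, w, h), 'expressions': [...]}.
        # detections is a list of ((x, y, w, h), emotion_label) pairs for one frame.
        for box, emotion in detections:
            matched_id = None
            for person_id, person in tracked.items():
                # Same person if the new box overlaps the stored box enough.
                if intersection_area(box, person['box']) >= overlap_ratio * box[2] * box[3]:
                    matched_id = person_id
                    break
            if matched_id is None:                       # newly appeared face
                matched_id = len(tracked)
                tracked[matched_id] = {'box': box, 'expressions': []}
            tracked[matched_id]['expressions'].append(emotion)
        return tracked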



Requirements

opencv-python==4.1.2.30
Keras==2.3.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
tensorboard==2.1.0
tensorflow==2.1.0
tensorflow-estimator==2.1.0
tflearn==0.3.2
h5py==2.10.0
matplotlib==3.2.1
numpy==1.18.1
openpyxl==3.0.3
PyYAML==5.3
scipy==1.4.1
ffmpeg-python==0.2.0

Guide to run

Clone Repository

git clone https://github.com/slashcaret/FeedbackUsingFER.git

Create a new Python 3.6.5 environment and switch to it

// If you have Anaconda installed, open the Anaconda prompt and enter the following commands:

1. conda create -n feedbackFER python==3.6.5 --no-default-packages
2. conda activate feedbackFER

// Change directory to the cloned git repo and install the requirements

3. pip install -r requirements.txt

Execution and Feedback Generation

  • On existing video:

    For this application, a clip from a 360p town-hall video of then US President Obama, found on YouTube, was used.
    [Here is the link, 11:02-11:51](https://youtu.be/fEKx5FuMUR4)

    python final.py video_name.mp4 30
    

    Here, video_name.mp4 is the video file on which you want to generate feedback and 30 is the frame rate (fps). Both arguments are mandatory.

    Note: The camera must be stable and fixed, and the positions of the people must not change during execution. This application is best suited for classrooms, cinema halls, theatres, podiums and stand-up shows. A little head movement is tolerable, since the vicinity of each face was taken into account while developing the solution.

  • Realtime Feedback Generation using Webcam

    python final.py webcam 30
    

    When execution is complete (you can stop it at any time by pressing the Esc key),
    a "recorded_face_data.xlsx" file will be generated.
    This file records the facial expressions of the people present in each frame of the video.

  • To generate graphs/statistics

    generateGraphs.py uses the recorded_face_data.xlsx file as input to calculate the average classroom/hall sentiment and to generate a graph of the average sentiment of each person during the seminar/class/lecture. A hypothetical sketch of this kind of aggregation is shown after the command below.

    python generateGraphs.py
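
    The snippet below is a hypothetical sketch of this kind of aggregation, not the actual contents of generateGraphs.py: it assumes the worksheet has one column per person (header row) and one row per frame, with each cell holding that person's predicted emotion label.

    from collections import Counter
    from openpyxl import load_workbook
    import matplotlib.pyplot as plt

    sheet = load_workbook('recorded_face_data.xlsx').active
    people = [cell.value for cell in sheet[1]]                # assumed header: person identifiers
    counts = {name: Counter() for name in people}
    for row in sheet.iter_rows(min_row=2, values_only=True):  # one row per frame
        for name, emotion in zip(people, row):
            if emotion:
                counts[name][emotion] += 1

    # Plot each person's dominant emotion count as a simple bar chart.
    dominant = {name: c.most_common(1)[0] for name, c in counts.items() if c}
    plt.bar(list(dominant.keys()), [count for _, count in dominant.values()])
    plt.ylabel('frames showing dominant emotion')
    plt.title('Per-person sentiment (sketch)')
    plt.show()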
    

Results

  1. Initially...

  2. During execution of final.py

  3. During execution of final.py: detecting the facial expressions of attendees

  4. After execution: faces of the people are saved

  5. Preview of recorded_face_data.xlsx

  6. Generating graphs by running generateGraphs.py
