3d-facial-reconstruction

Thesis

3D facial reconstruction, expression recognition and transfer from monocular RGB images with a deep convolutional auto-encoding neural network

Abstract

The present work implements an automatic system for coding and reconstructing 3D faces from low-resolution RGB images using machine learning algorithms. Given a 3D morphable model, each face is represented as a vector of variables (the "code vector") that describes the shape, expression and color of the face. Multiplying these parameter vectors with the PCA bases provided by the morphable model yields the 3D coordinates of the reconstructed face. As part of this work, an algorithm was developed that creates two-dimensional synthetic faces solely from the information captured by the code vector. The synthetic faces were used to train the neural network that acts as the encoding phase of the auto-encoding system, and bootstrapping techniques were used to generalize the network to real-world facial images.
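The linear decoding step described above can be illustrated with a minimal NumPy sketch. It assumes the usual PCA morphable-model form, in which the basis-times-parameter products are added to a mean face; the mean term, the function name `reconstruct_vertices`, the array sizes and the random bases are all illustrative assumptions and are not taken from this repository.

```python
import numpy as np


def reconstruct_vertices(mean_shape, shape_basis, expr_basis, shape_params, expr_params):
    """Return an (N, 3) array of vertex coordinates from a linear morphable model.

    Assumed model: flattened vertices = mean + shape_basis @ shape_params
                                             + expr_basis @ expr_params
    """
    flat = mean_shape + shape_basis @ shape_params + expr_basis @ expr_params
    return flat.reshape(-1, 3)


# Toy model with made-up sizes: 4 vertices, 3 shape and 2 expression components.
rng = np.random.default_rng(0)
n_vertices, n_shape, n_expr = 4, 3, 2
mean_shape = rng.normal(size=3 * n_vertices)
shape_basis = rng.normal(size=(3 * n_vertices, n_shape))
expr_basis = rng.normal(size=(3 * n_vertices, n_expr))

# All-zero parameters reproduce the mean face.
neutral = reconstruct_vertices(
    mean_shape, shape_basis, expr_basis, np.zeros(n_shape), np.zeros(n_expr)
)

# Non-zero parameters move the vertices along the basis directions.
vertices = reconstruct_vertices(
    mean_shape,
    shape_basis,
    expr_basis,
    np.array([0.5, -1.0, 0.2]),
    np.array([1.0, 0.0]),
)
print(vertices.shape)
```

Per-vertex color could be decoded in the same way from a color basis and the color part of the code vector, under the same assumptions.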

This work demonstrates not only the potential of 3D facial reconstruction from RGB images, but also the ability to manipulate the reconstructed 3D face by changing its expression, color or lighting. To support such manipulation, a neural network was implemented to identify the facial expression from the information encoded in the code vector. Until now, the problem tackled by the present work has been solved with iterative algorithms based on linear combinations of existing prototype samples, which require a large amount of data from three-dimensional scans. Here, an attempt is made to solve this problem purely with machine learning and synthetic data.
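Because shape, expression and color are all stored in the code vector, changing an expression can be thought of as editing one block of that vector before decoding. The sketch below shows the idea for expression transfer between two faces. The `[shape | expression | color]` layout, the block sizes and the function name `transfer_expression` are hypothetical; the layout actually used in this repository may differ.

```python
import numpy as np

# Hypothetical code-vector layout: [shape | expression | color]; sizes are made up.
N_SHAPE, N_EXPR, N_COLOR = 5, 3, 4
EXPR = slice(N_SHAPE, N_SHAPE + N_EXPR)


def transfer_expression(source_code, target_code):
    """Return a copy of target_code whose expression block comes from source_code."""
    result = target_code.copy()
    result[EXPR] = source_code[EXPR]
    return result


# Two toy code vectors standing in for encoder outputs.
source = np.arange(N_SHAPE + N_EXPR + N_COLOR, dtype=float)
target = -np.ones(N_SHAPE + N_EXPR + N_COLOR)

edited = transfer_expression(source, target)
print(edited)
```

Decoding `edited` with the morphable model would then give the target identity and color with the source expression, assuming the blocks are independent in this way.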

Results

Autoencoding network for 3D reconstruction

Below are the results of the 3D reconstruction auto-encoding network:

  • A comparison between the results of Xception, ResNet50 and InceptionV3 architectures. Image (i.) shows the original face and images (ii.) - (iv.) depict the reconstructions with Xception, ResNet50 and InceptionV3, respectively.

Encoder Architectures Comparison

  • A comparison between the reconstructions of ResNet50 at four different stages of training. Image (i.) shows the original face, image (ii.) shows the initial reconstruction by a ResNet50 encoder and images (iii.) - (iv.) show the reconstructions after each bootstrapping iteration.

Reconstructions after each bootstrapping iteration

Facial expression recognition network

Below are the results of the 2-hidden-layer expression classification network.

  • The network can classify between 7 different expressions, namely anger, disgust, fear, happiness, neutral, sadness and surprise. The images below depict the base expressions that were used as a reference when creating the synthetic dataset.

Expression Vector Bases

  • The confusion matrix (CM) shows the accuracy of the network, which was trained on synthetic data, on 700 real faces from the MUG Facial Expression Database.

Confusion Matrix
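For readers who want to produce a similar evaluation, a confusion matrix over the seven expressions can be computed as in the sketch below. The labels are made up for illustration and are not results from the thesis; the row and column convention (rows are true labels, columns are predictions) and the function names are assumptions.

```python
EXPRESSIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]


def confusion_matrix(true_labels, predicted_labels, classes=EXPRESSIONS):
    """Count predictions per class: rows are true labels, columns are predictions."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Made-up example labels, not data from the MUG database.
y_true = ["anger", "anger", "fear", "happiness", "neutral", "surprise"]
y_pred = ["anger", "disgust", "fear", "happiness", "neutral", "fear"]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
for name, row in zip(EXPRESSIONS, cm):
    print(f"{name:>10}: {row}")
print(f"accuracy: {acc:.3f}")
```

Off-diagonal entries show which expressions are confused with each other, which a single accuracy figure hides.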

Dependencies

pip install -r requirements.txt

Data Dependencies

The following files have to be downloaded and placed in the ./DATASET directory:

For validation on real faces:

How images are pre-processed for reconstructions

Image Preprocess

Bugs & Requests

Please report bugs and request features using the Issue Tracker.

Contributors

anapt, dependabot[bot]
