awesome-avatar

This repository organizes papers, code, and other resources related to avatars (talking-face and talking-body).

🔆 This project is ongoing; pull requests are welcome!

If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit the list and open a pull request.

TO DO LIST

  • Main paper list
  • Researchers list
  • Toolbox for avatar
  • Add paper link
  • Add paper notes
  • Add code if available
  • Add project page if available
  • Datasets and metrics
  • Related links

Researchers and labs

  1. NVIDIA Research
  2. Aliaksandr Siarohin @ Snap Research
  3. Ziwei Liu @ Nanyang Technological University
  4. Xiaodong Cun @ Tencent AI Lab
  5. Max Planck Institute for Informatics

Papers

Example: [Conference'year] Title, First-author Affiliation, ProjectPage, Code
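
To keep entries consistent, the format above can be produced by a small helper. This is only a sketch: `format_entry`, its argument names, the two-digit year, and the placeholder values are all assumptions made for illustration, not part of this repository.

```python
def format_entry(venue, year, title, affiliation, project_page=None, code=None):
    """Format one paper entry as "[Venue'YY] Title, Affiliation, ProjectPage, Code".

    Hypothetical helper: the two-digit year and the "(url)" suffix for links
    are assumptions. Links that are not available are simply omitted.
    """
    short_year = str(year)[-2:]  # e.g. 2023 -> "23"
    fields = [f"[{venue}'{short_year}] {title}", affiliation]
    if project_page:
        fields.append(f"ProjectPage ({project_page})")
    if code:
        fields.append(f"Code ({code})")
    return ", ".join(fields)


# Placeholder values for illustration only (not a real paper).
print(format_entry("CONF", 2023, "Example Title", "Example Lab"))
```

Titles that contain commas still read fine in this format, but they would make the entry harder to parse back automatically.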

2D talking-face synthesis

3D talking-face synthesis

Talking-body synthesis

3D animation

Co-speech gesture synthesis

Pose2image

Datasets

Talking-face

Audio-Visual Datasets for English Speakers

| Dataset name | Environment | Year | Resolution | Subject | Duration | Sentence |
|---|---|---|---|---|---|---|
| VoxCeleb1 | Wild | 2017 | 360p~720p | 1251 | 352 hours | 100k |
| VoxCeleb2 | Wild | 2018 | 360p~720p | 6112 | 2442 hours | 1128k |
| HDTF | Wild | 2020 | 720p~1080p | 300+ | 15.8 hours | |
| LSP | Wild | 2021 | 720p~1080p | 4 | 18 minutes | 100k |

Audio-Visual Datasets for Chinese Speakers

| Dataset name | Environment | Year | Resolution | Subject | Duration | Sentence |
|---|---|---|---|---|---|---|
| CMLR | Lab | 2019 | | 11 | | 102k |
| MAVD | Lab | 2023 | 1920x1080 | 64 | 24 hours | 12k |
| CN-Celeb | Wild | 2020 | | 3000 | 1200 hours | |
| CN-Celeb-AV | Wild | 2023 | | 1136 | 660 hours | |
| CN-CVS | Wild | 2023 | | 2500+ | 300+ hours | |

Talking-body

TBD

Metrics

Talking-face

Lip-Sync

| Metric name | Description | Code/Paper |
|---|---|---|
| LMD↓ | Mouth landmark distance | |
| MA↑ | Intersection-over-Union (IoU) between the predicted mouth area and the ground-truth mouth area | |
| Sync↑ | Confidence score from SyncNet (Sync) | wav2lip |
| LSE-C↑ | Lip Sync Error - Confidence | wav2lip |
| LSE-D↓ | Lip Sync Error - Distance | wav2lip |
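
The two geometric lip-sync metrics (LMD and MA) can be sketched in a few lines. This is a minimal illustration, assuming mouth landmarks and binary mouth masks have already been extracted by some detector. The function names are hypothetical, and papers differ in how they normalize the landmark distance, so no normalization is applied here.

```python
import math


def landmark_distance(pred, target):
    """Mean Euclidean distance between matching mouth landmarks (LMD-style).

    `pred` and `target` are sequences of frames; each frame is a list of
    (x, y) landmark coordinates in the same order. Normalization (e.g. by
    face size) varies between papers and is deliberately left out.
    """
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for (px, py), (tx, ty) in zip(pred_frame, target_frame):
            total += math.hypot(px - tx, py - ty)
            count += 1
    return total / count


def mask_iou(pred_mask, target_mask):
    """Intersection-over-Union of two binary masks given as 2D lists of 0/1."""
    intersection = union = 0
    for pred_row, target_row in zip(pred_mask, target_mask):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# One frame with two landmarks: distances are 0 and 5, so the mean is 2.5.
print(landmark_distance([[(0.0, 0.0), (3.0, 4.0)]], [[(0.0, 0.0), (0.0, 0.0)]]))
# One overlapping pixel out of three covered pixels.
print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```

Sync, LSE-C, and LSE-D depend on a pretrained SyncNet model and are not reproduced here; the wav2lip repository referenced in the table provides evaluation scripts for them.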

Image Quality (identity preserving)

| Metric name | Description | Code/Paper |
|---|---|---|
| MAE↓ | Mean Absolute Error between images | mmagic |
| MSE↓ | Mean Squared Error between images | mmagic |
| PSNR↑ | Peak Signal-to-Noise Ratio | mmagic |
| SSIM↑ | Structural similarity between images | mmagic |
| FID↓ | Fréchet Inception Distance | mmagic |
| IS↑ | Inception Score | mmagic |
| NIQE↓ | Natural Image Quality Evaluator | mmagic |
| CSIM↑ | Cosine similarity of identity embeddings | InsightFace |
| CPBD↑ | Cumulative Probability of Blur Detection | python-cpbd |
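
The simpler pixel-level and embedding-level metrics follow directly from their textbook definitions. The sketch below is a pure-Python illustration, not the mmagic or InsightFace implementation: images are assumed to be flattened lists of pixel values with the same length, identity embeddings are assumed to come from a face recognition model, and all function names are hypothetical.

```python
import math


def mae(a, b):
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def cosine_similarity(u, v):
    """Cosine similarity of two embedding vectors (CSIM-style)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


reference = [0, 0, 0, 0]
generated = [0, 0, 0, 10]
print(mae(reference, generated))   # 2.5
print(mse(reference, generated))   # 25.0
print(psnr(reference, generated))  # about 34.15 dB for 8-bit images
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```

SSIM, FID, IS, NIQE, and CPBD involve windowed statistics or pretrained networks and are better taken from the libraries listed in the table.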

Diversity

| Metric name | Description | Code/Paper |
|---|---|---|
| Diversity of head motions↑ | Standard deviation of the head-motion feature embeddings extracted from the generated frames using Hopenet (Ruiz et al., 2018) | SadTalker |
| Beat Align Score↑ | Alignment between the audio and the generated head motions, computed as in Bailando (Siyao et al., 2022) | SadTalker |
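
The head-motion diversity score can be illustrated as a standard deviation over per-frame embeddings. This is a rough sketch under stated assumptions: the embeddings are assumed to have been extracted already (for example by a head pose estimator such as Hopenet), the function name is hypothetical, and averaging the per-dimension standard deviations into one number is one possible aggregation, not necessarily the one used in SadTalker.

```python
import math


def head_motion_diversity(embeddings):
    """Mean per-dimension standard deviation of a list of embedding vectors.

    `embeddings` holds one feature vector per generated frame. The population
    standard deviation is computed for each dimension and then averaged
    (assumption: the aggregation across dimensions is not specified here).
    """
    n = len(embeddings)
    dims = len(embeddings[0])
    stds = []
    for d in range(dims):
        values = [e[d] for e in embeddings]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        stds.append(math.sqrt(variance))
    return sum(stds) / dims


# A static head gives zero diversity; a moving head gives a positive score.
print(head_motion_diversity([[1.0, 2.0], [1.0, 2.0]]))  # 0.0
print(head_motion_diversity([[0.0, 0.0], [2.0, 0.0]]))  # 0.5
```

The Beat Align Score additionally requires beat detection on both the audio and the motion signal and is not sketched here.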

Talking-body

TBD

Toolbox

  1. A general toolbox for AIGC, including common metrics and models https://github.com/open-mmlab/mmagic
  2. face3d: Python tools for processing 3D face https://github.com/yfeng95/face3d
  3. 3DMM model fitting using PyTorch https://github.com/ascust/3DMM-Fitting-Pytorch
  4. OpenFace: a facial behavior analysis toolkit https://github.com/TadasBaltrusaitis/OpenFace
  5. autocrop: Automatically detects and crops faces from batches of pictures https://github.com/leblancfg/autocrop
  6. OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation https://github.com/CMU-Perceptual-Computing-Lab/openpose
  7. GFPGAN: Practical Algorithm for Real-world Face Restoration https://github.com/TencentARC/GFPGAN
  8. CodeFormer: Robust Blind Face Restoration https://github.com/sczhou/CodeFormer

Related Links

If you are interested in avatars and digital humans, we also recommend checking out these related collections:

Contributors

jason-cs18, supergoodgame