yiranran / audio-driven-talkingface-headpose

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (Arxiv 2020) and "Predicting Personalized Head Movement From Short Video and Speech Signal" (TMM 2022)

Home Page: https://ieeexplore.ieee.org/document/9894719


audio-driven-talkingface-headpose's Introduction

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose

We provide PyTorch implementations for our arXiv paper "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (http://arxiv.org/abs/2002.10137) and our IEEE TMM paper "Predicting Personalized Head Movement From Short Video and Speech Signal" (https://ieeexplore.ieee.org/document/9894719).

Note that this code is protected under patent. It is for research purposes at your university (research institution) only. If you are interested in business/for-profit use, please contact Prof. Liu (the corresponding author, email: [email protected]).

We provide a demo video here (please search for "Talking Face" on this page and click the "demo video" button).


Our Proposed Framework

Prerequisites

  • Linux or macOS
  • NVIDIA GPU
  • Python 3
  • MATLAB

Getting Started

Installation

  • You can create a virtual env and install all the dependencies with
pip install -r requirements.txt

Download pre-trained models

  • Includes pre-trained general models and models needed for face reconstruction, identity feature extraction, etc.
  • Download from BaiduYun (extraction code: usdm) or GoogleDrive and copy to the corresponding subfolders (Audio, Deep3DFaceReconstruction, render-to-video); a download sketch follows this list.
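
For the GoogleDrive route, a minimal download sketch using the third-party gdown package (not a dependency of this repo; the file ID below is a placeholder, take the real one from the GoogleDrive link above):

import gdown  # pip install gdown

# <FILE_ID> is a placeholder -- substitute the ID from the GoogleDrive link.
gdown.download("https://drive.google.com/uc?id=<FILE_ID>", "pretrained_models.zip")
# Unzip and copy the contents into the Audio/, Deep3DFaceReconstruction/
# and render-to-video/ subfolders as described above.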

Download face model for 3d face reconstruction

Fine-tune on a target person's short video

    1. Prepare a talking face video that satisfies: 1) contains a single person, 2) 25 fps, 3) longer than 12 seconds, 4) without large body translation (e.g. moving from the left to the right of the screen). An example is here. Rename the video to [person_id].mp4 (e.g. 1.mp4) and copy it to the Data subfolder.

Note: You can convert a video to 25 fps with

ffmpeg -i xxx.mp4 -r 25 xxx1.mp4
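
If you want to verify a clip before fine-tuning, a small sanity check along these lines (not part of the repo; assumes opencv-python is installed) confirms the frame rate and duration:

import cv2

cap = cv2.VideoCapture("1.mp4")  # your renamed [person_id].mp4
fps = cap.get(cv2.CAP_PROP_FPS)
frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
cap.release()
print("fps=%.2f, duration=%.1fs" % (fps, frames / fps))
assert round(fps) == 25, "re-encode with ffmpeg -r 25 first"
assert frames / fps > 12, "the video must be longer than 12 seconds"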
    2. Extract frames and landmarks by
cd Data/
python extract_frame1.py [person_id].mp4
    3. Conduct 3D face reconstruction. First compile the code in Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels to a .so, following its readme, and modify line 28 in rasterize_triangles.py to point to your directory. Then run
cd Deep3DFaceReconstruction/
CUDA_VISIBLE_DEVICES=0 python demo_19news.py ../Data/[person_id]

This process takes about 2 minutes on a Titan Xp.
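
For reference, the line-28 edit mentioned in step 3 concerns the tf.load_op_library call that loads the compiled kernel; a sketch of the edited line (variable name as in the upstream tf_mesh_renderer, so treat it as approximate, and substitute the absolute path to your own checkout):

import tensorflow as tf

# Point this at the .so compiled in the previous step (absolute path).
rasterize_triangles_module = tf.load_op_library(
    '/your/path/to/Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels/rasterize_triangles_kernel.so')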

    4. Fine-tune the audio network. Run
cd Audio/code/
python train_19news_1.py [person_id] [gpu_id]

The saved models are in Audio/model/atcnet_pose0_con3/[person_id]. This process takes about 5 minutes on a Titan Xp.

    5. Fine-tune the GAN network. Run
cd render-to-video/
python train_19news_1.py [person_id] [gpu_id]

The saved models are in render-to-video/checkpoints/memory_seq_p2p/[person_id]. This process takes about 40 minutes on a Titan Xp.

Test on a target person

Place the audio file (.wav or .mp3) for testing under Audio/audio/. Run [with generated poses]

cd Audio/code/
python test_personalized.py [audio] [person_id] [gpu_id]

or [with poses from short video]

cd Audio/code/
python test_personalized2.py [audio] [person_id] [gpu_id]

This program will print 'saved to xxx.mov' if the videos are successfully generated. It outputs two .mov files: one with the face only (_full9.mov) and one with the background (_transbigbg.mov).
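
As a convenience (not part of the repo), a short check that both outputs landed where expected, assuming the results layout used by test_personalized.py (the ../results/atcnet_pose0_con3/[person_id]/ directory is a guess based on the paths in the issues below):

import glob, os

result_dir = "../results/atcnet_pose0_con3/1"  # hypothetical: [person_id] = 1
for pattern in ("*_full9.mov", "*_transbigbg.mov"):
    hits = glob.glob(os.path.join(result_dir, "**", pattern), recursive=True)
    print(pattern, "->", hits or "not found")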

Colab

A colab demo is here.

Citation

If you use this code for your research, please cite our papers:

@article{yi2020audio,
  title     = {Audio-driven talking face video generation with learning-based personalized head pose},
  author    = {Yi, Ran and Ye, Zipeng and Zhang, Juyong and Bao, Hujun and Liu, Yong-Jin},
  journal   = {arXiv preprint arXiv:2002.10137},
  year      = {2020}
}
@article{YiYSZZWBL22,
  title     = {Predicting Personalized Head Movement From Short Video and Speech Signal},
  author    = {Yi, Ran and Ye, Zipeng and Sun, Zhiyao and Zhang, Juyong and Zhang, Guoxin and Wan, Pengfei and Bao, Hujun and Liu, Yong-Jin},
  journal   = {IEEE Transactions on Multimedia},
  year      = {2022},
  pages     = {1-13},
  doi       = {10.1109/TMM.2022.3207606}
}

Acknowledgments

The face reconstruction code is from Deep3DFaceReconstruction, the arcface code is from insightface, and the GAN code is developed based on pytorch-CycleGAN-and-pix2pix.

audio-driven-talkingface-headpose's People

Contributors

yiranran

audio-driven-talkingface-headpose's Issues

No such file or directory

Hello, I am running the code from your readme on Colab.
1. Running !cd render-to-video/; python train_19news_1.py 1 0
works fine; model files up to epoch 60 are saved, and the final message is: processing (0095)-th image... ['/content/drive/My Drive/GAN/Audio-driven-TalkingFace-HeadPose/render-to-video/../Deep3DFaceReconstruction/output/render/19_news/1/bm/frame395_renderold_bm.png']
2. Running the test: !cd Audio/code/; python test_personalized.py 5_00006 1 0
Problem 1:
in test_memory.py, N = dataset.len() evaluates to 0;
Problem 2:
the output is:
cp: cannot stat '../../render-to-video/results/memory_seq_p2p/1/test_60/imagesrseq_1_5_00006_full9//R_1_reassign2-00002_blend2_fake.png': No such file or directory
cp: cannot stat '../../render-to-video/results/memory_seq_p2p/1/test_60/imagesrseq_1_5_00006_full9//R_1_reassign2-00002_blend2_fake.png': No such file or directory
Traceback (most recent call last):
File "test_personalized.py", line 102, in
os.remove(video_name)
FileNotFoundError: [Errno 2] No such file or directory: '../results/atcnet_pose0_con3/1/5_00006_99/1_5_00006wav_results_full9.mp4'

Note: I am using your latest visualizer.py file.
Where might the problem be?

undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs

Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/60fa6525d50100b7b5f2200eb3a7cc53/sandbox/processwrapper-sandbox/48/execroot/tf_mesh_renderer/bazel-out/k8-fastbuild/bin/mesh_renderer/mesh_renderer_test.runfiles/tf_mesh_renderer/mesh_renderer/mesh_renderer_test.py", line 26, in <module>
import mesh_renderer
File "/root/audio2/Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/mesh_renderer.py", line 24, in <module>
import rasterize_triangles
File "/root/audio2/Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/rasterize_triangles.py", line 29, in <module>
'tf_mesh_renderer/mesh_renderer/kernels/rasterize_triangles_kernel.so'))
File "/root/anaconda3/envs/audio2/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /root/audio2/Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels/rasterize_triangles_kernel.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrESs

Is there a solution to this problem?
tf=1.14.0; the .so file is linked.
I have tried changing the bazel and GCC versions, without success.
Setting -D_GLIBCXX_USE_CXX11_ABI=1 did not help either.

Can anyone offer a solution?

FileNotFoundError: [Errno 2] No such file or directory: 'arcface/iden_feat/19_news/xxx/framexx.npy'

Thanks for your great work!
I ran into some problems when trying to run step 5
cd render-to-video/
python train_19news_1.py [person_id] [gpu_id]

the errors are
FileNotFoundError: [Errno 2] No such file or directory: 'arcface/iden_feat/19_news/311/frame79.npy' (the number seems to be random…)
and
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/memory_seq_p2p/60_net_G.pth'

Maybe errors during training lead to the test error… but I have no idea how to solve it…
Can you give me some help?

the full report:

19_news/311 311_bmold_win3

                                                         < M A T L A B (R) >
                                               Copyright 1984-2018 The MathWorks, Inc.
                                               R2018a (9.4.0.813654) 64-bit (glnxa64)
                                                          February 23, 2018

For online documentation, see http://www.mathworks.com/support
For product information, visit www.mathworks.com.

Elapsed time is 17.994871 seconds.
Illegal instruction (core dumped)
Illegal instruction (core dumped)
----------------- Options ---------------
Nw: 3
alpha: 0.3
attention: 1
batch_size: 1
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: True [default: False]
crop_size: 256
dataroot: 311_bmold_win3 [default: None]
dataset_mode: aligned_feature_multi
direction: AtoB
display_env: memory_seq_311 [default: main]
display_freq: 400
display_id: 0
display_ncols: 4
display_port: 8097
display_server: http://localhost
display_winsize: 256
do_saturate_mask: False
epoch: 0 [default: latest]
epoch_count: 1
gan_mode: vanilla
gpu_ids: 0
iden_feat_dim: 512
iden_feat_dir: arcface/iden_feat/
iden_thres: 0.98
init_gain: 0.02
init_type: normal
input_nc: 3
isTrain: True [default: None]
lambda_L1: 100.0
lambda_mask: 2.0 [default: 0.1]
lambda_mask_smooth: 1e-05
load_iter: 0 [default: 0]
load_size: 286
lr: 0.0001 [default: 0.0002]
lr_decay_iters: 50
lr_policy: linear
max_dataset_size: inf
mem_size: 30000
model: memory_seq [default: cycle_gan]
n_layers_D: 3
name: memory_seq_p2p/311 [default: experiment_name]
ndf: 64
netD: basic
netG: unetac_adain_256
ngf: 64
niter: 60 [default: 100]
niter_decay: 0 [default: 100]
no_dropout: False
no_flip: False
no_html: False
norm: batch
num_threads: 4
output_nc: 3
phase: train
pool_size: 0
preprocess: resize_and_crop
print_freq: 100
resizemethod: lanczos
save_by_iter: False
save_epoch_freq: 5
save_latest_freq: 5000
serial_batches: False
spatial_feat_dim: 512
suffix:
top_k: 256
update_html_freq: 1000
verbose: False
----------------- End -------------------
dataset [AlignedFeatureMultiDataset] was created
The number of training images = 298
initialize network with normal
initialize network with normal
model [MemorySeqModel] was created
loading the model from ./checkpoints/memory_seq_p2p/0_net_G.pth
loading the model from ./checkpoints/memory_seq_p2p/0_net_D.pth
loading the model from ./checkpoints/memory_seq_p2p/0_net_mem.pth
---------- Networks initialized -------------
[Network G] Total number of parameters : 259.056 M
[Network D] Total number of parameters : 2.775 M
[Network mem] Total number of parameters : 11.952 M

create web directory ./checkpoints/memory_seq_p2p/311/web...
Traceback (most recent call last):
File "train.py", line 45, in <module>
for i, data in enumerate(dataset): # inner loop within one epoch
File "/Audio-driven-TalkingFace-HeadPose-master/render-to-video/data/__init__.py", line 90, in __iter__
for i, data in enumerate(self.dataloader):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/Audio-driven-TalkingFace-HeadPose-master/render-to-video/data/aligned_feature_multi_dataset.py", line 94, in __getitem__
B_feat = np.load(os.path.join(self.opt.iden_feat_dir,ss[-3],ss[-2],ss[-1][:-4]+'.npy'))
File "/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py", line 422, in load
fid = open(os_fspath(file), "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'arcface/iden_feat/19_news/311/frame79.npy'

----------------- Options ---------------
Nw: 3
alpha: 0.3
aspect_ratio: 1.0
attention: 1
batch_size: 1
blinkframeid: 41
checkpoints_dir: ./checkpoints
crop_size: 256
dataroot: 311_bmold_win3 [default: None]
dataset_mode: aligned_feature_multi
direction: AtoB
display_winsize: 256
do_saturate_mask: False
epoch: 60 [default: latest]
eval: False
gpu_ids: 0
iden_feat_dim: 512
iden_feat_dir: arcface/iden_feat/
iden_thres: 0.98
imagefolder: images60 [default: images]
init_gain: 0.02
init_type: normal
input_nc: 3
isTrain: False [default: None]
load_iter: 0 [default: 0]
load_size: 256
max_dataset_size: inf
mem_size: 30000
model: memory_seq [default: test]
n: 26
n_layers_D: 3
name: memory_seq_p2p/311 [default: experiment_name]
ndf: 64
netD: basic
netG: unetac_adain_256
ngf: 64
no_dropout: False
no_flip: False
norm: batch
ntest: inf
num_test: 200 [default: 50]
num_threads: 4
output_nc: 3
phase: test
preprocess: resize_and_crop
resizemethod: lanczos
results_dir: ./results/
serial_batches: False
spatial_feat_dim: 512
suffix:
test_batch_list:
test_use_gt: 0
top_k: 256
verbose: False
----------------- End -------------------
dataset [AlignedFeatureMultiDataset] was created
initialize network with normal
model [MemorySeqModel] was created
loading the model from ./checkpoints/memory_seq_p2p/60_net_G.pth
Traceback (most recent call last):
File "test.py", line 47, in
model.setup(opt) # regular setup: load and print networks; create schedulers
File "/Audio-driven-TalkingFace-HeadPose-master/render-to-video/models/base_model.py", line 89, in setup
self.load_networks(load_suffix)
File "/Audio-driven-TalkingFace-HeadPose-master/render-to-video/models/base_model.py", line 202, in load_networks
state_dict = torch.load(load_path, map_location=str(self.device))
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 381, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/memory_seq_p2p/60_net_G.pth'

Retrain the LSTM network with LRW-1000

I would like to use the LRW-1000 dataset to train the model, thereby generating talking-head videos with Chinese audio, but I can only find the fine-tuning code and not any training code for the LSTM network.
Could you explain how I can retrain the LSTM network using LRW-1000 and, if possible, what the procedure is?
Thank you very much!

build tf_mesh_renderer: unable to load packages?

ERROR: /home/puaiuc/opensources/Audio-driven-TalkingFace-HeadPose-master/Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels/BUILD:7:1: error loading package '@com_google_googletest//': Extension file not found. Unable to load package for '@rules_cc//cc:defs.bzl': The repository could not be resolved and referenced by '//mesh_renderer/kernels:rasterize_triangles_impl_test'
ERROR: Analysis of target '//mesh_renderer/kernels:rasterize_triangles_impl_test' failed; build aborted: error loading package '@com_google_googletest//': Extension file not found. Unable to load package for '@rules_cc//cc:defs.bzl': The repository could not be resolved
INFO: Elapsed time: 0.158s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 2 targets configured)
FAILED: Build did NOT complete successfully (0 packages loaded, 2 targets configured)

I compiled it on my own server.
bazel version: 0.19.2
tensorflow version: 1.13.2
Both bazel and tf were compiled from source.
This has troubled me for a long time and I don't know how to solve it. Help me, please. Thanks.

build tf_mesh_renderer error

When I built tf_mesh_renderer using runtests.sh, it failed:
(screenshot)
Can you give me some help?
I used the following two commands to install bazel 2.2.0:
chmod +x bazel-<version>-installer-linux-x86_64.sh
./bazel-<version>-installer-linux-x86_64.sh --user

RasterizeTriangles expects vertices to have shape (-1, 3).

Hello, thank you for open-sourcing such good work. I ran into the following problem while studying the code; can you tell me a solution? Thank you.

Use tf.where in 2.0, which has the same broadcast rule as np.where
2021-03-30 16:08:24.343185: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
Traceback (most recent call last):
File "/home/research/weiwenqi/anaconda3/envs/snowflake_weiwenqi/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/research/weiwenqi/anaconda3/envs/snowflake_weiwenqi/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/research/weiwenqi/anaconda3/envs/snowflake_weiwenqi/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: RasterizeTriangles expects vertices to have shape (-1, 3).
[[{{node while/RasterizeTriangles}}]]

About lip sync

I got the code running and the results look good. Thanks!
However, the lip sync in the videos generated from my samples does not seem very good, and the lip-sync scores in the paper are also not high. Could this be improved by adding training data or by other means?

After setting up the environment on Colab, I ran !python test_personalized.py 5_00006 31 0 for the first demo and found that it does not work. Is there anything I should pay attention to?

python atcnet_test1.py --device_ids 1 --model_name ../model/atcnet_pose0_con3/31/atcnet_lstm_99.pth --pose 1 --relativeframe 0 --sample_dir ../results/atcnet_pose0_con3/31/5_00006_99 --in_file ../audio/5_00006.wav
device 1
Traceback (most recent call last):
File "atcnet_test1.py", line 117, in
test()
File "atcnet_test1.py", line 85, in test
state_dict = multi2single(config.model_name, 0)
File "atcnet_test1.py", line 22, in multi2single
checkpoint = torch.load(model_path)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 382, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '../model/atcnet_pose0_con3/31/atcnet_lstm_99.pth'
choose_bg_gexinghua2 19_news/31 5_00006 0 atcnet_pose0_con3/31/5_00006_99
../results/chosenbg/5_00006_19_news/31_atcnet_pose0_con3_31_5_00006_99/reassign
Traceback (most recent call last):
File "test_personalized.py", line 66, in
bgdir = choose_bg_gexinghua2_reassign2('19_news/'+person, audiobasen, start, audiomodel, num=num, tran=pingyi, speed=speed)
File "/content/drive/Shared drives/masterWork/Audio2Vdeio/Audio-driven-TalkingFace-HeadPose-master/Audio/code/choose_bg_gexinghua2_reassign.py", line 116, in choose_bg_gexinghua2_reassign2
os.makedirs(tardir2)
File "/usr/lib/python3.6/os.py", line 210, in makedirs
makedirs(head, mode, exist_ok)
File "/usr/lib/python3.6/os.py", line 210, in makedirs
makedirs(head, mode, exist_ok)
File "/usr/lib/python3.6/os.py", line 210, in makedirs
makedirs(head, mode, exist_ok)
File "/usr/lib/python3.6/os.py", line 220, in makedirs
mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '../results/chosenbg'

Heavy jitter in the synthesized face

Thanks for sharing. I found that the face in the synthesized video jitters heavily, which makes it look unrealistic. I also noticed the paper Everybody's Talkin': Let Me Talk as You Want, whose architecture is very similar to yours; could it serve as a reference for improving the jitter problem?

About training the GAN

Hello, thank you very much for sharing. I am trying to train the GAN, but I do not know how to determine the training targets. Training with CycleGAN requires a set of non-realistic images and a set of realistic images, and I would like the two sets to correspond one-to-one. I have prepared some non-realistic images composited from the 3D face and the background, but I do not know how to obtain the corresponding set of real images.
How should I determine the other set of training data?

About jitter in the results

Hello, I successfully ran your project, but the generated results jitter quite heavily. Do you know which step or module might cause this?

Colab not working, how to run in colab?

I am unable to run it on my local machine and have a problem with bazel. When I try Google Colab it's not working either; bazel only passes the first test. Also, when I run !CUDA_VISIBLE_DEVICES=0 python demo_19news.py ../Data/[person id]
I get this error:
Traceback (most recent call last):
File "demo_19news.py", line 1, in <module>
import tensorflow as tf
File "/usr/local/lib/python3.6/dist-packages/tensorflow/__init__.py", line 28, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/__init__.py", line 83, in <module>
from tensorflow.python import keras
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/__init__.py", line 26, in <module>
from tensorflow.python.keras import activations
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/activations.py", line 24, in <module>
from tensorflow.python.keras.utils.generic_utils import deserialize_keras_object
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/__init__.py", line 39, in <module>
from tensorflow.python.keras.utils.multi_gpu_utils import multi_gpu_model
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/multi_gpu_utils.py", line 22, in <module>
from tensorflow.python.keras.engine.training import Model
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py", line 40, in <module>
from tensorflow.python.keras.engine import network
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py", line 39, in <module>
from tensorflow.python.keras import saving
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/saving/__init__.py", line 33, in <module>
from tensorflow.python.keras.saving.saved_model import export_saved_model
ImportError: cannot import name 'export_saved_model'

How to build .so?

(1) Hello, the procedure (https://github.com/yiranran/Audio-driven-TalkingFace-HeadPose/blob/master/Deep3DFaceReconstruction/tf_mesh_renderer/README.md) for compiling the .so is not very detailed. Please tell me in detail how to compile the corresponding .so. The following screenshot shows the undefined symbol error I get when running the command directly, because the latest .so file has not been compiled.
(screenshot)

(2) The link https://github.com/yiranran/Audio-driven-TalkingFace-HeadPose/blob/master/Deep3DFaceReconstruction/tf_mesh_renderer/README.md introduces how to build the .so, but how do I get runtests.sh?
(screenshot)

Can't find op.h file when trying to run locally as well as in the Google Colab demo

I am up to the 3D face reconstruction part (step iii in the readme, the Build tf_mesh_renderer step in the Google Colab demo). I tried to run this codeblock from the Google Colab demo and it failed. Here is the output:

rasterize_triangles_grad.cc:18:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
 #include "tensorflow/core/framework/op.h"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
rasterize_triangles_op.cc:19:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
 #include "tensorflow/core/framework/op.h"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

I tried looking in the tensorflow/core/framework part of the path and the op.h file was nowhere to be found. The closest thing to that is op_def_pb2.py.

Here is the original code in that codeblock from the Colab demo:

!cp /usr/local/lib/python3.6/dist-packages/tensorflow/libtensorflow_framework.so.1 /usr/lib/
!cd /usr/lib/ && ln -s libtensorflow_framework.so.1 libtensorflow_framework.so
!cd Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels/;\
  g++ -std=c++11 -shared rasterize_triangles_grad.cc rasterize_triangles_op.cc rasterize_triangles_impl.cc rasterize_triangles_impl.h -o rasterize_triangles_kernel.so -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -I /usr/local/lib/python3.6/dist-packages/tensorflow/include -I /usr/local/lib/python3.6/dist-packages/tensorflow/include/external/nsync/public -L /usr/local/lib/python3.6/dist-packages/tensorflow -ltensorflow_framework -O2

I am using Ubuntu 20.04.2.0 LTS through VirtualBox, Python 3.8, and Miniconda3. I am using pip to install packages and pip installs them in site-packages instead of dist-packages. There is a /usr/lib/local/python3.8/dist-packages path but the dist-packages directory is empty.

I was able to create the symbolic link with the first 2 commands; however, I had to use libtensorflow_framework.so.2. The original code from the Colab demo specified libtensorflow_framework.so.1, but that file was nowhere to be found. Only libtensorflow_framework.so.2 was there.

!cp /home/miniconda3/lib/python3.8/site-packages/tensorflow/libtensorflow_framework.so.2 /home/miniconda3/lib/
!cd /home/miniconda3/lib/ && ln -s libtensorflow_framework.so.2 libtensorflow_framework.so
!cd Deep3DFaceReconstruction/tf_mesh_renderer/mesh_renderer/kernels/;\
  g++ -std=c++11 -shared rasterize_triangles_grad.cc rasterize_triangles_op.cc rasterize_triangles_impl.cc rasterize_triangles_impl.h -o rasterize_triangles_kernel.so -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -I /home/miniconda3/lib/python3.8/site-packages/tensorflow/include -I /home/miniconda3/lib/python3.8/site-packages/tensorflow/include/external/nsync/public -L /home/miniconda3/lib/python3.8/site-packages/tensorflow/ -ltensorflow_framework -O2

Anyone have an idea of what happened and/or where the op.h file is?
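
A side note (not part of the original report): rather than hard-coding the include and library paths in the g++ command, TensorFlow can report its own build paths through tf.sysconfig, which also shows the ABI flag it was built with:

import tensorflow as tf

print(tf.sysconfig.get_include())        # dir containing tensorflow/core/framework/op.h
print(tf.sysconfig.get_lib())            # dir containing libtensorflow_framework.so*
print(tf.sysconfig.get_compile_flags())  # includes the matching -D_GLIBCXX_USE_CXX11_ABI value
print(tf.sysconfig.get_link_flags())     # e.g. -L<libdir> -ltensorflow_framework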

What is the difference between test_personalized.py and test_personalized2.py?

    What is the difference between the pipelines of these two versions? It looks like test_personalized2 does not call choose_bg_gexinghua2_reassign; is it a version with a simplified overall pipeline? Are the results the same?
    Also, the whole project assumes person is an integer, while it would normally be a string; is there a special reason for using a numeric type?
    Is there a reason for the 280 ms prediction input length? Why 7 frames rather than 1? Or why not 5 or 6 frames?
    There is a multi2single function in atcnet_test1.py; what does it do? It looks like it just loads the model, so why is it called multi2single? Is it related to the 280 ms input length?

Thanks!

FileNotFoundError: [Errno 2] No such file or directory: '/home/yugaljain03/audio_face_animation/audio_driven_headpose/Audio-driven-TalkingFace-HeadPose/render-to-video/../Deep3DFaceReconstruction/output/render/19_news/33/bm/frame18_renderold_bm.png'

@yiranran I am unable to find the renderold_bm.png files in the bm folder. Which script should I run to generate these files?
Please help me to solve this..

PS - I am running the demo colab file mentioned in the README, and I already installed octave per the instructions given in the demo colab link in the Readme.
Thanks

Error running ./runtests.sh

Hello, I followed the steps, and when running (tensorflow)$ ./runtests.sh I got an error:

(tensorflow) [liu@no3@node05 tf_mesh_renderer]$ ./runtests.sh
./runtests.sh: line 2: bazel: command not found
Is this a problem with runtests.sh?

basic information about code

I ran `runtests.sh' and got these errors:

Extracting Bazel installation...
Starting local Bazel server and connecting to it...
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/rules_cc/archive/8bd6cd75d03c01bb82561a96d9c1f9f7157b13d0.zip failed: class java.io.IOException connect timed out
WARNING: Download from https://mirror.bazel.build/github.com/bazelbuild/rules_java/archive/7cf3cefd652008d0a64a419c34c13bdca6c8f178.zip failed: class java.io.IOException connect timed out
INFO: SHA256 (https://github.com/google/googletest/archive/master.zip) = dc68f063f82052444a11186a476f3ff02aeb88038e7390229ce4b773bd4ea158
DEBUG: Rule 'com_google_googletest' indicated that a canonical reproducible form can be obtained by modifying arguments sha256 = "dc68f063f82052444a11186a476f3ff02aeb88038e7390229ce4b773bd4ea158"
DEBUG: Repository com_google_googletest instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule http_archive defined at:
  /home/puaiuc/.cache/bazel/_bazel_puaiuc/dbf87e3e58b07c8f4c9b973173f78069/external/bazel_tools/tools/build_defs/repo/http.bzl:336:16: in <toplevel>
INFO: Repository remote_coverage_tools instantiated at:
  no stack (--record_rule_instantiation_callstack not enabled)
Repository rule http_archive defined at:
  /home/puaiuc/.cache/bazel/_bazel_puaiuc/dbf87e3e58b07c8f4c9b973173f78069/external/bazel_tools/tools/build_defs/repo/http.bzl:336:16: in <toplevel>
WARNING: Download from https://mirror.bazel.build/bazel_coverage_output_generator/releases/coverage_output_generator-v2.1.zip failed: class java.io.IOException connect timed out
ERROR: An error occurred during the fetch of repository 'remote_coverage_tools':
   java.io.IOException: Error downloading [https://mirror.bazel.build/bazel_coverage_output_generator/releases/coverage_output_generator-v2.1.zip] to /home/puaiuc/.cache/bazel/_bazel_puaiuc/dbf87e3e58b07c8f4c9b973173f78069/external/remote_coverage_tools/coverage_output_generator-v2.1.zip: connect timed out
ERROR: /home/puaiuc/.cache/bazel/_bazel_puaiuc/dbf87e3e58b07c8f4c9b973173f78069/external/bazel_tools/tools/test/BUILD:36:1: @bazel_tools//tools/test:coverage_report_generator depends on @remote_coverage_tools//:coverage_report_generator in repository @remote_coverage_tools which failed to fetch. no such package '@remote_coverage_tools//': java.io.IOException: Error downloading [https://mirror.bazel.build/bazel_coverage_output_generator/releases/coverage_output_generator-v2.1.zip] to /home/puaiuc/.cache/bazel/_bazel_puaiuc/dbf87e3e58b07c8f4c9b973173f78069/external/remote_coverage_tools/coverage_output_generator-v2.1.zip: connect timed out
ERROR: Analysis of target '//mesh_renderer:mesh_renderer_test' failed; build aborted: Analysis failed

Can you help me? Thanks.

A question about the GAN fine-tuning step

Hello, when fine-tuning the GAN on my own machine I ran into an "imglist len 0" problem.
(screenshot)
This then also causes errors in the subsequent feature steps.
(screenshot)
I guess it may be caused by the dataset not being prepared correctly.
How should I solve this? How should the dataset be set up? Is there any documentation?

No 60_net_G.pth file

Running this project on colab.
Getting an error when running train_19news_1.py.
Here is the colab: https://colab.research.google.com/drive/1FXoqSLC_y6UpDDcbxGefwMDBh9UTipvZ

And the error:

/content/Audio-driven-TalkingFace-HeadPose/render-to-video
19_news/11 11_bmold_win3
sh: 1: matlab: not found
loading models/model-r100-ii/model 0
[20:08:28] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v1.2.0. Attempting to upgrade...
[20:08:28] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!

Segmentation fault: 11

Stack trace:
  [bt] (0) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x3c27360) [0x7f214559b360]
loading models/model-r100-ii/model 0
[20:08:30] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v1.2.0. Attempting to upgrade...
[20:08:30] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!

Segmentation fault: 11

Stack trace:
  [bt] (0) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x3c27360) [0x7f5400af1360]
----------------- Options ---------------
                       Nw: 3                             
                    alpha: 0.3                           
                attention: 1                             
               batch_size: 1                             
                    beta1: 0.5                           
          checkpoints_dir: ./checkpoints                 
           continue_train: True                          	[default: False]
                crop_size: 256                           
                 dataroot: 11_bmold_win3                 	[default: None]
             dataset_mode: aligned_feature_multi         
                direction: AtoB                          
              display_env: memory_seq_11                 	[default: main]
             display_freq: 400                           
               display_id: 1                             
            display_ncols: 4                             
             display_port: 8097                          
           display_server: http://localhost              
          display_winsize: 256                           
         do_saturate_mask: False                         
                    epoch: 0                             	[default: latest]
              epoch_count: 1                             
                 gan_mode: vanilla                       
                  gpu_ids: 0                             
            iden_feat_dim: 512                           
            iden_feat_dir: arcface/iden_feat/            
               iden_thres: 0.98                          
                init_gain: 0.02                          
                init_type: normal                        
                 input_nc: 3                             
                  isTrain: True                          	[default: None]
                lambda_L1: 100.0                         
              lambda_mask: 2.0                           	[default: 0.1]
       lambda_mask_smooth: 1e-05                         
                load_iter: 0                             	[default: 0]
                load_size: 286                           
                       lr: 0.0001                        	[default: 0.0002]
           lr_decay_iters: 50                            
                lr_policy: linear                        
         max_dataset_size: inf                           
                 mem_size: 30000                         
                    model: memory_seq                    	[default: cycle_gan]
               n_layers_D: 3                             
                     name: memory_seq_p2p/11             	[default: experiment_name]
                      ndf: 64                            
                     netD: basic                         
                     netG: unetac_adain_256              
                      ngf: 64                            
                    niter: 60                            	[default: 100]
              niter_decay: 0                             	[default: 100]
               no_dropout: False                         
                  no_flip: False                         
                  no_html: False                         
                     norm: batch                         
              num_threads: 4                             
                output_nc: 3                             
                    phase: train                         
                pool_size: 0                             
               preprocess: resize_and_crop               
               print_freq: 100                           
             resizemethod: lanczos                       
             save_by_iter: False                         
          save_epoch_freq: 5                             
         save_latest_freq: 5000                          
           serial_batches: False                         
         spatial_feat_dim: 512                           
                   suffix:                               
                    top_k: 256                           
         update_html_freq: 1000                          
                  verbose: False                         
----------------- End -------------------
dataset [AlignedFeatureMultiDataset] was created
The number of training images = 298
initialize network with normal
initialize network with normal
model [MemorySeqModel] was created
loading the model from ./checkpoints/memory_seq_p2p/0_net_G.pth
loading the model from ./checkpoints/memory_seq_p2p/0_net_D.pth
loading the model from ./checkpoints/memory_seq_p2p/0_net_mem.pth
---------- Networks initialized -------------
[Network G] Total number of parameters : 259.056 M
[Network D] Total number of parameters : 2.775 M
[Network mem] Total number of parameters : 11.952 M
-----------------------------------------------
Setting up a new session...
Exception in user code:
------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 159, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py", line 80, in create_connection
    raise err
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py", line 70, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.6/http/client.py", line 1254, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1300, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1249, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.6/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/usr/lib/python3.6/http/client.py", line 974, in send
    self.connect()
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 181, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 168, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f506009f198>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/memory_seq_11 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f506009f198>: Failed to establish a new connection: [Errno 111] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/visdom/__init__.py", line 711, in _send
    data=json.dumps(msg),
  File "/usr/local/lib/python3.6/dist-packages/visdom/__init__.py", line 677, in _handle_post
    r = self.session.post(url, data=data)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8097): Max retries exceeded with url: /env/memory_seq_11 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f506009f198>: Failed to establish a new connection: [Errno 111] Connection refused',))
[Errno 99] Cannot assign requested address
[Errno 99] Cannot assign requested address
[Errno 99] Cannot assign requested address
Visdom python client failed to establish socket to get messages from the server. This feature is optional and can be disabled by initializing Visdom with `use_incoming_socket=False`, which will prevent waiting for this request to timeout.


Could not connect to Visdom server. 
 Trying to start a server....
Command: /usr/bin/python3 -m visdom.server -p 8097 &>/dev/null &
create web directory ./checkpoints/memory_seq_p2p/11/web...
Traceback (most recent call last):
  File "train.py", line 45, in <module>
    for i, data in enumerate(dataset):  # inner loop within one epoch
  File "/content/Audio-driven-TalkingFace-HeadPose/render-to-video/data/__init__.py", line 90, in __iter__
    for i, data in enumerate(self.dataloader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/Audio-driven-TalkingFace-HeadPose/render-to-video/data/aligned_feature_multi_dataset.py", line 54, in __getitem__
    A = Image.open(AB_path).convert('RGB')
  File "/usr/local/lib/python3.6/dist-packages/PIL/Image.py", line 2809, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/content/Audio-driven-TalkingFace-HeadPose/render-to-video/../Deep3DFaceReconstruction/output/render/19_news/11/bm/frame111_renderold_bm.png'

----------------- Options ---------------
                       Nw: 3                             
                    alpha: 0.3                           
             aspect_ratio: 1.0                           
                attention: 1                             
               batch_size: 1                             
             blinkframeid: 41                            
          checkpoints_dir: ./checkpoints                 
                crop_size: 256                           
                 dataroot: 11_bmold_win3                 	[default: None]
             dataset_mode: aligned_feature_multi         
                direction: AtoB                          
          display_winsize: 256                           
         do_saturate_mask: False                         
                    epoch: 60                            	[default: latest]
                     eval: False                         
                  gpu_ids: 0                             
            iden_feat_dim: 512                           
            iden_feat_dir: arcface/iden_feat/            
               iden_thres: 0.98                          
              imagefolder: images60                      	[default: images]
                init_gain: 0.02                          
                init_type: normal                        
                 input_nc: 3                             
                  isTrain: False                         	[default: None]
                load_iter: 0                             	[default: 0]
                load_size: 256                           
         max_dataset_size: inf                           
                 mem_size: 30000                         
                    model: memory_seq                    	[default: test]
                        n: 26                            
               n_layers_D: 3                             
                     name: memory_seq_p2p/11             	[default: experiment_name]
                      ndf: 64                            
                     netD: basic                         
                     netG: unetac_adain_256              
                      ngf: 64                            
               no_dropout: False                         
                  no_flip: False                         
                     norm: batch                         
                    ntest: inf                           
                 num_test: 200                           	[default: 50]
              num_threads: 4                             
                output_nc: 3                             
                    phase: test                          
               preprocess: resize_and_crop               
             resizemethod: lanczos                       
              results_dir: ./results/                    
           serial_batches: False                         
         spatial_feat_dim: 512                           
                   suffix:                               
          test_batch_list:                               
              test_use_gt: 0                             
                    top_k: 256                           
                  verbose: False                         
----------------- End -------------------
dataset [AlignedFeatureMultiDataset] was created
initialize network with normal
model [MemorySeqModel] was created
loading the model from ./checkpoints/memory_seq_p2p/60_net_G.pth
Traceback (most recent call last):
  File "test.py", line 47, in <module>
    model.setup(opt)               # regular setup: load and print networks; create schedulers
  File "/content/Audio-driven-TalkingFace-HeadPose/render-to-video/models/base_model.py", line 89, in setup
    self.load_networks(load_suffix)
  File "/content/Audio-driven-TalkingFace-HeadPose/render-to-video/models/base_model.py", line 202, in load_networks
    state_dict = torch.load(load_path, map_location=str(self.device))
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 525, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 212, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 193, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/memory_seq_p2p/60_net_G.pth'

About the audio

I noticed that this dataset contains only video. Did you separate the audio manually yourselves?

Help: test_personalized2.py FileNotFoundError: [Errno 2] No such file or directory: '../results/atcnet_pose0_con3/31/03Fsi1831_99/31_03Fsi1831wav_results_full9.mp4'

After running python test_personalized2.py 03Fsi1831 31 0 I got:

processing (0105)-th image... ['/home/zhangzhanwang/projects/virtual_host/Audio-driven-TalkingFace-HeadPose/Audio/code/../results/atcnet_pose0_con3/31/03Fsi1831_99/R_31_reassign2/00107_blend2.png']
control 1 ../results/atcnet_pose0_con3/31/03Fsi1831_99/31_03Fsi1831wav_results_full9.mp4
Traceback (most recent call last):
File "test_personalized2.py", line 170, in <module>
os.remove(video_name)
FileNotFoundError: [Errno 2] No such file or directory: '../results/atcnet_pose0_con3/31/03Fsi1831_99/31_03Fsi1831wav_results_full9.mp4'
I don't know what is going on. What should I do?

Hello, I hit a FileNotFoundError when running test_personalized

Hello, I created the virtual python environment following the readme steps and installed the modules from requirements. But when running test_personalized (python test_personalized.py [5_00006] [31] 0) I ran into a problem:
FileNotFoundError: [Errno 2] No such file or directory: '../../Deep3DFaceReconstruction/output/coeff/19_news/31/frame0.mat'
How should I fix this?

Unable to reproduce on custom video

I followed the Colab tutorial and could get it running. I am now trying to do the same on a custom video and audio. When fine-tuning the audio net, we do

!cd Audio/code/; python train_19news_1.py 32 0

When I run this I get

32 lack frame0.mat
32 lack frame1.mat
32 lack frame2.mat
...
...
32 lack frame298.mat
32 lack frame299.mat
not all 300 frames are reconstructed successfully

My video is called 32.mp4 and I used ffmpeg to make sure it is 25 fps. Other than that I haven't modified anything in the notebook. Where is it going wrong?

Also, if anyone was able to reproduce it on a custom video, do share the notebook

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED while running demo_talkingface

I encountered the below error while running the following cell
!cd Audio/code/; python train_19news_1.py 31 0

The error occurs while running on the provided sample video itself ('Data/31.mp4').

The full Traceback is as follow:

python atcnet.py --pose 1 --relativeframe 0 --dataset news --newsname 19_news/31 --start 0 --model_dir ../model/atcnet_pose0_con3/31/ --continue_train 1 --lr 0.0001 --less_constrain 1 --smooth_loss 1 --smooth_loss2 1 --model_name ../model/atcnet_lstm_general.pth --sample_dir ../sample/atcnet_pose0_con3/31 --device_ids 0 --max_epochs 100
device 0
---------- Networks initialized -------------
[Network] Total number of parameters : 29.431 M
-----------------------------------------------
Traceback (most recent call last):
  File "atcnet.py", line 328, in <module>
    main(config)
  File "atcnet.py", line 305, in main
    t = trainer.Trainer(config)
  File "/content/Audio-driven-TalkingFace-HeadPose/Audio/code/atcnet.py", line 81, in __init__
    self.generator     = self.generator.cuda()
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 265, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 193, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 127, in _apply
    self.flatten_parameters()
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/rnn.py", line 123, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

How can I overcome this?

tf_mesh_renderer error

Hello, I ran into some problems when building tf_mesh_renderer. Can you give me some answers?

INFO: Analyzed 11 targets (29 packages loaded, 391 targets configured).
INFO: Found 8 targets and 3 test targets...
ERROR: /home/research/.cache/bazel/_bazel_research/5b8c8045b34c2ba9a89ac5750f4648b0/external/com_google_googletest/BUILD.bazel:67:11: Compiling googletest/src/gtest-matchers.cc failed: (Exit 1): gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 33 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -MD -MF ... (remaining 33 argument(s) skipped)

Thank you for any advice.

Some questions about the dataset of LRW when training

Hi, thanks for the great job.
I have a small question about the training process of the audio-to-expression-and-pose mapping part.
For testing there is a fine-tuning process, and I found that in dataset.py the class "News_1D_lstm_3dmm_pose" is selected.

I guess that when training with the LRW dataset, the corresponding dataset class should be "LRW_1D_lstm_3dmm_pose"; is that correct?

If it is correct, I saw there is a random index here:
"r = random.choice([x for x in range(3, 8)])",
so it seems the MFCC features and expression features are randomly sampled in time.
So during training, the features for one sample may not be equal across iterations because of this random selection?
I am just wondering whether this will cause problems during training.

Thank you!

cv2.error: OpenCV(4.1.0) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

I have this problem in step 2, which is extracting the landmarks:

Traceback (most recent call last):
File "extract_frame1.py", line 64, in
detect_dir(mp4[:-4])
File "extract_frame1.py", line 47, in detect_dir
detect_image(imagename=file, savepath=file[:-4]+'.txt')
File "extract_frame1.py", line 25, in detect_image
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.error: OpenCV(4.1.0) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

Can't find files when executing test_personalized.py

Hi Yiran,

Firstly, thanks for your good work, especially for the excellent rendering result; that is not easy to achieve with a 3DMM.

Secondly, I hit an error in the last step of test_personalized.py, as shown below:

processing (1245)-th image... ['/home/acm/Audio-driven-TalkingFace-HeadPose/Audio/code/../results/atcnet_pose0_con3/13/ttsbb_99/R_13_reassign2/01247_blend2.png'] control 1
cp: cannot stat '../../render-to-video/results/memory_seq_p2p/13/test_60/imagesrseq_13_ttsbb_full9//R_13_reassign2-00002_blend2_fake.png': No such file or directory
cp: cannot stat '../../render-to-video/results/memory_seq_p2p/13/test_60/imagesrseq_13_ttsbb_full9//R_13_reassign2-00002_blend2_fake.png': No such file or directory
Traceback (most recent call last):
File "test_personalized.py", line 100, in <module>
os.remove(video_name)
FileNotFoundError: [Errno 2] No such file or directory: '../results/atcnet_pose0_con3/13/ttsbb_99/13_ttsbbwav_results_full9.mp4'
I checked the folder ../render-to-video/results/memory_seq_p2p/13/test_60/imagesrseq_13_ttsbb_full9/ and there are no such files. I also reviewed the code but failed to find which step creates these files; the previous steps seem to have run successfully.

ps. Conda environment

Where does the '102' in Preprocess come from?

In Preprocess.py, the constant 102 appears in the functions Preprocess and process_img. For example:

  1. trans_params = np.array([w0, h0, 102.0 / s, t[0], t[1]])
  2. w = (w0 / s * 102).astype(np.int32)
    h = (h0 / s * 102).astype(np.int32)

What does the 102 mean, and where does it come from?

Error in background merging using cv2.seamlessClone (file trans_with_bigbg.py)

I am getting the following error on many videos from the VoxCeleb2 test set.

File "test_personalized.py", line 105, in
merge_with_bigbg(audiobasen,n)
File "Audio-driven-TalkingFace-HeadPose/Audio/code/trans_with_bigbg.py", line 76, in merge_with_bigbg
output = cv2.seamlessClone(img1,img,mask,center,cv2.NORMAL_CLONE)
cv2.error: OpenCV(4.1.0) /io/opencv/modules/core/src/matrix.cpp:466: error: (-215:Assertion failed) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function 'Mat'

I tried to resize the videos (originally 224x224) by increasing the frame dimensions, but I get the same error. Could you kindly suggest what the issue may be and how to resolve it?

FileNotFoundError: './checkpoints/memory_seq_p2p/60_net_G.pth'

Hi, Yiran:
Thank you for your good work.
I ran into some problems.
When I run step 5 or test on a target person, the program tries to load ./checkpoints/memory_seq_p2p/60_net_G.pth after "initialize network with normal".
I checked the folders and did not find such a file.
Is this file generated at run time, or did I miss it in some step?
Looking forward to your response.
Thank you very much.
(screenshots)
