Comments (8)
After running "cd vocaset && python process_voca_data.py", the script produces
vertices_npy/ and wav/.
But config/vocaset/stage2.yaml sets wav_path: wav_mini, and since wav/ != wav_mini,
the loader reports "Loaded data: Train-0, Val-0, Test-0".
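A one-line fix for the mismatch, sketched below. The sed and the stand-in file are illustrative; in the repo you would apply the same substitution to config/vocaset/stage2.yaml:

```shell
# Sketch of the fix: make wav_path in stage2.yaml match the real wav/ folder.
# Demonstrated on a stand-in config written to /tmp; for the repo, run the
# same sed against config/vocaset/stage2.yaml.
printf 'dataset: vocaset\nwav_path : wav_mini\n' > /tmp/stage2_demo.yaml
sed -i 's/wav_mini/wav/' /tmp/stage2_demo.yaml
grep wav_path /tmp/stage2_demo.yaml   # -> wav_path : wav
```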
Also, "sh scripts/render.sh" fails with "Failed to open PLY file."; changing
--dataset_dir . to --dataset_dir ./ fixes it.
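A guess at why the trailing slash matters, based only on the observed fix: if the render code joins dataset_dir to a relative file name by plain string concatenation rather than os.path.join, "." yields a broken path. Hypothetical illustration (the real render script may differ):

```python
# Hypothetical: plain string concatenation makes the trailing slash of
# --dataset_dir significant. (The actual render script may join differently.)
template = "vocaset/FLAME_sample.ply"

print("." + template)    # .vocaset/FLAME_sample.ply  (not a valid path)
print("./" + template)   # ./vocaset/FLAME_sample.ply (resolves correctly)
```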
- You should train stage 1 of CodeTalker first, and then set the path to the trained model weights in the stage-2 configs accordingly.
- These are pretrained wav2vec2 model weights released by Facebook. If they cannot be downloaded automatically, you can download the files manually and change the path to your local wav2vec2-base-960h folder.
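The two settings would then look roughly like the fragment below. The key names are assumptions; check the shipped config/vocaset/stage2.yaml for the actual field names:

```yaml
# Sketch only -- key names are assumptions, verify against the repo's
# config/vocaset/stage2.yaml.
vqvae_pretrained_path: RUN/vocaset/CodeTalker_s1/model/model.pth.tar  # stage-1 weights
wav2vec2model_path: facebook/wav2vec2-base-960h  # or a local wav2vec2-base-960h folder
```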
Can we use the released checkpoints, i.e. copy
vocaset_stage1.pth.tar to RUN/vocaset/CodeTalker_s1/model/model.pth.tar and
vocaset_stage2.pth.tar to RUN/vocaset/CodeTalker_s2/model/model.pth.tar,
and then run "sh scripts/test.sh CodeTalker_s2 config/vocaset/stage2.yaml vocaset s2"?
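The checkpoint placement above can be scripted as follows. Paths are taken from the commands quoted in this thread, and the released checkpoints are assumed to sit in vocaset/; adjust if your layout differs:

```shell
# Place the released checkpoints where test.sh expects them.
# Run from the CodeTalker repo root; source paths are assumptions.
mkdir -p RUN/vocaset/CodeTalker_s1/model RUN/vocaset/CodeTalker_s2/model
cp vocaset/vocaset_stage1.pth.tar RUN/vocaset/CodeTalker_s1/model/model.pth.tar
cp vocaset/vocaset_stage2.pth.tar RUN/vocaset/CodeTalker_s2/model/model.pth.tar
```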
Sure you can do that.
[2023-05-06 07:41:54,511 INFO test_pred.py line 23 7740]=>=> creating model ...
Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2Model were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2023-05-06 07:42:01,650 INFO test_pred.py line 28 7740]=>=> loading checkpoint 'RUN/vocaset/CodeTalker_s2/model/model.pth.tar'
[2023-05-06 07:42:02,127 INFO test_pred.py line 31 7740]=>=> loaded checkpoint 'RUN/vocaset/CodeTalker_s2/model/model.pth.tar'
Loading data...
Loaded data: Train-0, Val-0, Test-0
Traceback (most recent call last):
File "main/test_pred.py", line 75, in <module>
main()
File "main/test_pred.py", line 37, in main
dataset = get_dataloaders(cfg)
File "/home/nvme1n1p1/CodeTalker/dataset/data_loader.py", line 107, in get_dataloaders
dataset["train"] = data.DataLoader(dataset=train_data, batch_size=args.batch_size, shuffle=True, num_workers=args.workers)
File "/root/anaconda3/envs/codetalker/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 344, in __init__
sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
File "/root/anaconda3/envs/codetalker/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 108, in __init__
"value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
This is the result I get. Do we need any other data?
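For context, the failure in the traceback is mechanical: the loader scans the folder named by wav_path, finds nothing, builds an empty dataset, and PyTorch's RandomSampler then rejects num_samples=0. A toy sketch of the scanning step (count_wavs is hypothetical, not the repo's actual loader code):

```python
import os

def count_wavs(dataset_dir, wav_path):
    """Hypothetical stand-in for the loader's scan of cfg.wav_path."""
    folder = os.path.join(dataset_dir, wav_path)
    if not os.path.isdir(folder):
        return 0  # a non-existent wav_mini/ yields zero samples -> Train-0
    return sum(name.endswith(".wav") for name in os.listdir(folder))
```

With wav_path set to wav_mini while the audio lives in wav/, this returns 0, which is exactly the num_samples=0 that RandomSampler refuses.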
As this method animates a 3D neutral face given speech signals, you need to download and preprocess the data according to the instructions in Dataset Preparation for VOCASET or BIWI.
I am using VOCASET; my directory layout is:
vocaset/
├── data_verts.npy
├── FLAME_sample.ply
├── init_expression_basis.npy
├── processed_audio_deepspeech.pkl
├── process_voca_data.py
├── raw_audio_fixed.pkl
├── readme.pdf
├── subj_seq_to_idx.pkl
├── templates.pkl
├── vertices_npy
│ ├── condition_FaceTalk_170725_00137_TA_subject_FaceTalk_170809_00138_TA.npy
│ ├── FaceTalk_170725_00137_TA_sentence01.npy
│ ├── FaceTalk_170725_00137_TA_sentence40.npy
│ ├── FaceTalk_170728_03272_TA_sentence01.npy
│ ├── FaceTalk_170728_03272_TA_sentence40.npy
│ ├── FaceTalk_170731_00024_TA_sentence01.npy
│ ├── FaceTalk_170731_00024_TA_sentence40.npy
│ ├── FaceTalk_170809_00138_TA_sentence01.npy
│ ├── FaceTalk_170811_03274_TA_sentence01.npy
│ ├── FaceTalk_170811_03275_TA_sentence01.npy
│ ├── FaceTalk_170811_03275_TA_sentence40.npy
│ ├── FaceTalk_170904_00128_TA_sentence01.npy
│ ├── FaceTalk_170904_00128_TA_sentence40.npy
│ ├── FaceTalk_170904_03276_TA_sentence01.npy
│ ├── FaceTalk_170904_03276_TA_sentence40.npy
│ ├── FaceTalk_170908_03277_TA_sentence01.npy
│ ├── FaceTalk_170908_03277_TA_sentence40.npy
│ ├── FaceTalk_170912_03278_TA_sentence01.npy
│ ├── FaceTalk_170912_03278_TA_sentence40.npy
│ ├── FaceTalk_170913_03279_TA_sentence01.npy
│ ├── FaceTalk_170913_03279_TA_sentence40.npy
│ ├── FaceTalk_170915_00223_TA_sentence01.npy
│ └── FaceTalk_170915_00223_TA_sentence40.npy
├── vocaset_stage1.pth.tar
├── vocaset_stage2.pth.tar
└── wav
├── FaceTalk_170725_00137_TA_sentence01.wav
├── FaceTalk_170725_00137_TA_sentence40.wav
├── FaceTalk_170728_03272_TA_sentence01.wav
├── FaceTalk_170728_03272_TA_sentence40.wav
├── FaceTalk_170731_00024_TA_sentence01.wav
├── FaceTalk_170731_00024_TA_sentence40.wav
├── FaceTalk_170809_00138_TA_sentence01.wav
├── FaceTalk_170809_00138_TA_sentence40.wav
├── FaceTalk_170811_03274_TA_sentence03.wav
├── FaceTalk_170811_03274_TA_sentence40.wav
├── FaceTalk_170811_03275_TA_sentence01.wav
├── FaceTalk_170811_03275_TA_sentence40.wav
├── FaceTalk_170904_00128_TA_sentence01.wav
├── FaceTalk_170904_00128_TA_sentence40.wav
├── FaceTalk_170904_03276_TA_sentence01.wav
├── FaceTalk_170904_03276_TA_sentence40.wav
├── FaceTalk_170908_03277_TA_sentence01.wav
├── FaceTalk_170908_03277_TA_sentence40.wav
├── FaceTalk_170912_03278_TA_sentence01.wav
├── FaceTalk_170913_03279_TA_sentence01.wav
├── FaceTalk_170915_00223_TA_sentence40.wav
└── man.wav
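Given a tree like the above, a quick sanity check before launching test.sh is to count the files in the folders the config actually points at (folder names as in the listing; run from inside vocaset/):

```shell
# Count data files in the folders the config references; both counts
# should be non-zero before training or testing is launched.
ls vertices_npy/*.npy 2>/dev/null | wc -l
ls wav/*.wav 2>/dev/null | wc -l
```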
I found the reason. Thank you very much!