Comments (8)

MagicRedZero commented on June 29, 2024

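After running
cd vocaset && python process_voca_data.py
the vocaset folder contains two output directories: vertices_npy/ and wav/.

However, config/vocaset/stage2.yaml sets wav_path: wav_mini. Since wav/ and wav_mini are different folders, no samples are found and the loader prints "Loaded data: Train-0, Val-0, Test-0".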
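
A quick way to catch this mismatch before launching a run is to check that the folder named by wav_path exists and actually contains audio. The sketch below is a minimal, hypothetical helper (count_wavs is not part of CodeTalker). It assumes the dataset root is vocaset/ and that wav_path is resolved relative to that root, as the folder names in this comment suggest.

```python
from pathlib import Path


def count_wavs(dataset_root, wav_path):
    """Return the number of .wav files directly under dataset_root/wav_path.

    Hypothetical helper, not part of CodeTalker. Returns 0 when the
    folder does not exist, which matches the symptom described above
    (an empty train/val/test split).
    """
    wav_dir = Path(dataset_root) / wav_path
    if not wav_dir.is_dir():
        return 0
    return sum(1 for p in wav_dir.iterdir() if p.suffix == ".wav")


if __name__ == "__main__":
    # Compare the value from stage2.yaml with the folder that the
    # preprocessing script actually wrote.
    for candidate in ("wav_mini", "wav"):
        print(candidate, count_wavs("vocaset", candidate))
```

If wav_mini reports 0 and wav reports a positive count, either rename wav/ to wav_mini or set wav_path: wav in config/vocaset/stage2.yaml so the two agree.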


Running sh scripts/render.sh fails with "Failed to open PLY file." Changing --dataset_dir . to --dataset_dir ./ resolves it.
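
The comment does not say why the trailing slash matters. One plausible explanation, and it is only an assumption about the render script, is that the PLY path is built by plain string concatenation rather than a path join. The sketch below uses a hypothetical relative path (vocaset/FLAME_sample.ply) to show how the two approaches behave.

```python
import os

# Hypothetical relative path to the template mesh; the real render
# script may build its path differently.
TEMPLATE = "vocaset/FLAME_sample.ply"


def concat_path(dataset_dir):
    # Plain concatenation: only correct if dataset_dir ends with "/".
    return dataset_dir + TEMPLATE


def joined_path(dataset_dir):
    # os.path.join inserts the separator when it is missing.
    return os.path.join(dataset_dir, TEMPLATE)


if __name__ == "__main__":
    for d in (".", "./"):
        print(repr(d), concat_path(d), joined_path(d))
```

With concatenation, "." yields ".vocaset/FLAME_sample.ply", which does not exist, while "./" yields a valid path. A path join gives the same location for both inputs.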

from codetalker.

Doubiiu commented on June 29, 2024
  1. You should train stage 1 of CodeTalker first, and then set the path to the trained model weights in the stage 2 configs accordingly.
  2. These are the pretrained model weights of wav2vec2 released by Facebook. If they cannot be downloaded automatically, you can download the files manually and change the path to your local wav2vec2-base-960h folder.
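
As a rough illustration of point 2, the sketch below picks a local wav2vec2-base-960h folder when one is available and falls back to the hub identifier otherwise. resolve_wav2vec2_source is a hypothetical helper, not part of CodeTalker, and the folder location is an assumption; where CodeTalker itself stores this path is not shown in this thread.

```python
from pathlib import Path

HUB_ID = "facebook/wav2vec2-base-960h"


def resolve_wav2vec2_source(local_dir=None):
    """Return a local model folder if it looks usable, else the hub id.

    A folder is treated as usable when it contains config.json, which
    from_pretrained needs in order to load a model from disk.
    """
    if local_dir is not None:
        folder = Path(local_dir)
        if (folder / "config.json").is_file():
            return str(folder)
    return HUB_ID


if __name__ == "__main__":
    # "wav2vec2-base-960h" is an assumed location for the manual download.
    source = resolve_wav2vec2_source("wav2vec2-base-960h")
    print("loading wav2vec2 from:", source)
    # With the transformers package installed, this value can be passed to
    # Wav2Vec2Model.from_pretrained(source), which accepts either a hub id
    # or a path to a local folder.
```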

MagicRedZero commented on June 29, 2024

Can we use the released checkpoints by copying them as follows,
vocaset_stage1.pth.tar > RUN/vocaset/CodeTalker_s1/model/model.pth.tar,
vocaset_stage2.pth.tar > RUN/vocaset/CodeTalker_s2/model/model.pth.tar
and then run "sh scripts/test.sh CodeTalker_s2 config/vocaset/stage2.yaml vocaset s2"?
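
The copy step described here can be sketched as follows. install_checkpoints is a hypothetical helper, not part of CodeTalker. The source and destination names come from this comment; the folder that holds the downloaded checkpoints is an assumption passed in by the caller.

```python
import shutil
from pathlib import Path

# Source file name -> destination path, as listed in the comment above.
CHECKPOINTS = {
    "vocaset_stage1.pth.tar": "RUN/vocaset/CodeTalker_s1/model/model.pth.tar",
    "vocaset_stage2.pth.tar": "RUN/vocaset/CodeTalker_s2/model/model.pth.tar",
}


def install_checkpoints(src_dir, repo_root, mapping=CHECKPOINTS):
    """Copy each checkpoint to its expected path and return the new paths."""
    installed = []
    for name, rel_dst in mapping.items():
        src = Path(src_dir) / name
        dst = Path(repo_root) / rel_dst
        # Create RUN/vocaset/.../model/ if it does not exist yet.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        installed.append(dst)
    return installed
```

Called from the repository root as install_checkpoints("vocaset", "."), this would place both files where the test script is expected to look, assuming the checkpoints were downloaded into vocaset/.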

Doubiiu commented on June 29, 2024

Sure, you can do that.

MagicRedZero commented on June 29, 2024

[2023-05-06 07:41:54,511 INFO test_pred.py line 23 7740]=>=> creating model ...
Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']

- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2Model were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2023-05-06 07:42:01,650 INFO test_pred.py line 28 7740]=>=> loading checkpoint 'RUN/vocaset/CodeTalker_s2/model/model.pth.tar'
[2023-05-06 07:42:02,127 INFO test_pred.py line 31 7740]=>=> loaded checkpoint 'RUN/vocaset/CodeTalker_s2/model/model.pth.tar'
Loading data...
Loaded data: Train-0, Val-0, Test-0
Traceback (most recent call last):
  File "main/test_pred.py", line 75, in <module>
    main()
  File "main/test_pred.py", line 37, in main
    dataset = get_dataloaders(cfg)
  File "/home/nvme1n1p1/CodeTalker/dataset/data_loader.py", line 107, in get_dataloaders
    dataset["train"] = data.DataLoader(dataset=train_data, batch_size=args.batch_size, shuffle=True, num_workers=args.workers)
  File "/root/anaconda3/envs/codetalker/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 344, in __init__
    sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
  File "/root/anaconda3/envs/codetalker/lib/python3.7/site-packages/torch/utils/data/sampler.py", line 108, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0

This is the result I get. Do we need any other data?
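
The ValueError at the end of the traceback is raised because RandomSampler receives an empty dataset, so the real problem is the earlier "Loaded data: Train-0, Val-0, Test-0" line. A minimal sketch of a fail-fast check that could run before the DataLoader is built follows. require_samples is a hypothetical helper, not part of CodeTalker, and it only assumes that each split supports len().

```python
def require_samples(splits):
    """Raise a descriptive error if any split is empty.

    `splits` maps a split name (e.g. "train") to any sized collection.
    Returns the per-split sample counts when every split is non-empty.
    """
    empty = [name for name, data in splits.items() if len(data) == 0]
    if empty:
        raise RuntimeError(
            "No samples found for split(s): " + ", ".join(empty)
            + ". Check that the wav and vertices folders named in the "
            "config exist and contain matching files."
        )
    return {name: len(data) for name, data in splits.items()}


if __name__ == "__main__":
    # Toy data standing in for the real dataset objects.
    print(require_samples({"train": [0, 1, 2], "val": [3], "test": [4]}))
```

This turns the sampler error into a message that points at the data folders instead of at PyTorch internals.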

Doubiiu commented on June 29, 2024

Since this method animates a neutral 3D face from speech signals, you need to download and preprocess the data according to the instructions in the Dataset Preparation section for VOCASET or BIWI.

MagicRedZero commented on June 29, 2024

I am using VOCASET. This is my directory layout:
vocaset/
├── data_verts.npy
├── FLAME_sample.ply
├── init_expression_basis.npy
├── processed_audio_deepspeech.pkl
├── process_voca_data.py
├── raw_audio_fixed.pkl
├── readme.pdf
├── subj_seq_to_idx.pkl
├── templates.pkl
├── vertices_npy
│   ├── condition_FaceTalk_170725_00137_TA_subject_FaceTalk_170809_00138_TA.npy
│   ├── FaceTalk_170725_00137_TA_sentence01.npy
│   ├── FaceTalk_170725_00137_TA_sentence40.npy
│   ├── FaceTalk_170728_03272_TA_sentence01.npy
│   ├── FaceTalk_170728_03272_TA_sentence40.npy
│   ├── FaceTalk_170731_00024_TA_sentence01.npy
│   ├── FaceTalk_170731_00024_TA_sentence40.npy
│   ├── FaceTalk_170809_00138_TA_sentence01.npy
│   ├── FaceTalk_170811_03274_TA_sentence01.npy
│   ├── FaceTalk_170811_03275_TA_sentence01.npy
│   ├── FaceTalk_170811_03275_TA_sentence40.npy
│   ├── FaceTalk_170904_00128_TA_sentence01.npy
│   ├── FaceTalk_170904_00128_TA_sentence40.npy
│   ├── FaceTalk_170904_03276_TA_sentence01.npy
│   ├── FaceTalk_170904_03276_TA_sentence40.npy
│   ├── FaceTalk_170908_03277_TA_sentence01.npy
│   ├── FaceTalk_170908_03277_TA_sentence40.npy
│   ├── FaceTalk_170912_03278_TA_sentence01.npy
│   ├── FaceTalk_170912_03278_TA_sentence40.npy
│   ├── FaceTalk_170913_03279_TA_sentence01.npy
│   ├── FaceTalk_170913_03279_TA_sentence40.npy
│   ├── FaceTalk_170915_00223_TA_sentence01.npy
│   └── FaceTalk_170915_00223_TA_sentence40.npy
├── vocaset_stage1.pth.tar
├── vocaset_stage2.pth.tar
└── wav
    ├── FaceTalk_170725_00137_TA_sentence01.wav
    ├── FaceTalk_170725_00137_TA_sentence40.wav
    ├── FaceTalk_170728_03272_TA_sentence01.wav
    ├── FaceTalk_170728_03272_TA_sentence40.wav
    ├── FaceTalk_170731_00024_TA_sentence01.wav
    ├── FaceTalk_170731_00024_TA_sentence40.wav
    ├── FaceTalk_170809_00138_TA_sentence01.wav
    ├── FaceTalk_170809_00138_TA_sentence40.wav
    ├── FaceTalk_170811_03274_TA_sentence03.wav
    ├── FaceTalk_170811_03274_TA_sentence40.wav
    ├── FaceTalk_170811_03275_TA_sentence01.wav
    ├── FaceTalk_170811_03275_TA_sentence40.wav
    ├── FaceTalk_170904_00128_TA_sentence01.wav
    ├── FaceTalk_170904_00128_TA_sentence40.wav
    ├── FaceTalk_170904_03276_TA_sentence01.wav
    ├── FaceTalk_170904_03276_TA_sentence40.wav
    ├── FaceTalk_170908_03277_TA_sentence01.wav
    ├── FaceTalk_170908_03277_TA_sentence40.wav
    ├── FaceTalk_170912_03278_TA_sentence01.wav
    ├── FaceTalk_170913_03279_TA_sentence01.wav
    ├── FaceTalk_170915_00223_TA_sentence40.wav
    └── man.wav
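
The thread never states what the reason turned out to be. One check that the listing above invites is whether every audio clip has a vertex file with the same stem, since a few names appear in only one of the two folders. A minimal sketch follows, assuming (this is not confirmed in the thread) that samples are paired by file stem across wav/ and vertices_npy/. Both functions are hypothetical helpers, not part of CodeTalker.

```python
from pathlib import Path


def unpaired_stems(wav_names, npy_names):
    """Return (wav_only, npy_only): stems present in just one listing."""
    wav_stems = {Path(n).stem for n in wav_names if n.endswith(".wav")}
    npy_stems = {Path(n).stem for n in npy_names if n.endswith(".npy")}
    return sorted(wav_stems - npy_stems), sorted(npy_stems - wav_stems)


def unpaired_in_dirs(wav_dir, npy_dir):
    """Same check, reading the file names from two folders on disk."""
    wav_names = [p.name for p in Path(wav_dir).iterdir()]
    npy_names = [p.name for p in Path(npy_dir).iterdir()]
    return unpaired_stems(wav_names, npy_names)
```

For the directory shown above, unpaired_in_dirs("vocaset/wav", "vocaset/vertices_npy") would report man among the audio-only stems, because there is no man.npy.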

MagicRedZero commented on June 29, 2024

I found the reason. Thank you very much.
