Comments (5)

YifengMa9 commented on August 29, 2024

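Hello. Your error is most likely caused by your Python version. Try a newer version such as 3.8 or 3.9 (according to https://stackoverflow.com/questions/75529492/importerror-cannot-import-name-ordereddict-from-typing, anything from 3.7.2 onward should work).
The error message comes from Python type hints. It has nothing to do with the CUDA version or similar; it depends on the Python version.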
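
As a rough illustration of the version requirement above, the following sketch checks whether the running interpreter is new enough to provide `typing.OrderedDict`. The helper name `supports_typing_ordereddict` and the constant `MIN_VERSION` are made up for this example and are not part of dreamtalk; the 3.7.2 threshold is the one cited from the Stack Overflow answer.

```python
import sys

# Assumption taken from the linked answer: typing.OrderedDict exists from 3.7.2 onward.
MIN_VERSION = (3, 7, 2)


def supports_typing_ordereddict(version_info=sys.version_info):
    """Return True if the given (major, minor, micro) version is at least 3.7.2."""
    return tuple(version_info[:3]) >= MIN_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if supports_typing_ordereddict():
        print(f"Python {current} should be able to import typing.OrderedDict")
    else:
        print(f"Python {current} is too old; upgrade to 3.7.2 or later")
```

Running this inside the conda environment before launching inference shows quickly whether the interpreter version is the likely culprit.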

chaorenai commented on August 29, 2024

After upgrading Python to 3.10.6, I ran into the following error during inference:

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>python inference_for_demo_video.py ^
More? --wav_path data/audio/acknowledgement_english.m4a ^
More? --style_clip_path data/style_clip/3DMM/M030_front_neutral_level1_001.mat ^
More? --pose_path data/pose/RichardShelby_front_neutral_level1_001.mat ^
More? --image_path data/src_img/uncropped/male_face.png ^
More? --cfg_scale 1.0 ^
More? --max_gen_len 30 ^
More? --output_name acknowledgement_english@M030_front_neutral_level1_001@male_face
ffmpeg version N-112686-g3f890fbfd9-20231104 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 13.2.0 (crosstool-NG 1.25.0.232_c175b21)
configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --enable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20231104
libavutil 58. 31.100 / 58. 31.100
libavcodec 60. 32.102 / 60. 32.102
libavformat 60. 17.100 / 60. 17.100
libavdevice 60. 4.100 / 60. 4.100
libavfilter 9. 13.100 / 9. 13.100
libswscale 7. 6.100 / 7. 6.100
libswresample 4. 13.100 / 4. 13.100
libpostproc 57. 4.100 / 57. 4.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/audio/acknowledgement_english.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A isommp42
creation_time : 2023-12-20T14:25:20.000000Z
iTunSMPB : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Duration: 00:00:16.62, start: 0.044000, bitrate: 246 kb/s
Stream #0:00x1: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, mono, fltp, 244 kb/s (default)
Metadata:
creation_time : 2023-12-20T14:25:20.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face\acknowledgement_english@M030_front_neutral_level1_001@male_face_16K.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A isommp42
iTunSMPB : 00000000 00000840 00000000 00000000000C23C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ISFT : Lavf60.17.100
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s (default)
Metadata:
creation_time : 2023-12-20T14:25:20.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
encoder : Lavc60.32.102 pcm_s16le
[out#0/wav @ 0000021b689badc0] video:0kB audio:518kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.014706%
size= 518kB time=00:00:16.57 bitrate= 256.1kbits/s speed= 685x
C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Some weights of the model checkpoint at jonatasgrosman/wav2vec2-large-xlsr-53-english were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']

  • This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
  File "C:\Users\sunny\Documents\dreamtalk\inference_for_demo_video.py", line 178, in <module>
    speech_array, sampling_rate = torchaudio.load(wav_16k_path)
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchaudio\_backend\utils.py", line 203, in load
    backend = dispatcher(uri, format, backend)
  File "C:\Users\sunny\.conda\envs\dreamtalk\lib\site-packages\torchaudio\_backend\utils.py", line 115, in dispatcher
    raise RuntimeError(f"Couldn't find appropriate backend to handle uri {uri} and format {format}.")
RuntimeError: Couldn't find appropriate backend to handle uri tmp/acknowledgement_english@M030_front_neutral_level1_001@male_face\acknowledgement_english@M030_front_neutral_level1_001@male_face_16K.wav and format None.

(dreamtalk) C:\Users\sunny\Documents\dreamtalk>

YifengMa9 commented on August 29, 2024

You can try the following.
On Windows:
pip install soundfile
On Linux:
pip install sox

See #2.
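
To see whether the package suggested above is already present before rerunning inference, a small check along these lines may help. It is only a sketch: the helper names `suggested_package` and `is_importable` are hypothetical, and the mapping from operating system to package simply mirrors the suggestion in this comment rather than anything documented by torchaudio.

```python
import importlib.util
import platform


def suggested_package(system=None):
    """Return the pip package suggested in this thread for the given OS name."""
    system = system or platform.system()
    # Mirrors the advice above: soundfile on Windows, sox otherwise.
    return "soundfile" if system == "Windows" else "sox"


def is_importable(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    package = suggested_package()
    if is_importable(package):
        print(f"{package} is installed")
    else:
        print(f"{package} is missing; try: pip install {package}")
```

If torchaudio itself is installed, `torchaudio.list_audio_backends()` (available in torchaudio 2.1 and later, as far as I know) should also report which backends it was able to find.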

chaorenai commented on August 29, 2024

Thanks, it runs now. It is not yet on the same level as SadTalker, though. Keep it up!

weiran0129 commented on August 29, 2024

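> Thanks, it runs now. It is not yet on the same level as SadTalker, though. Keep it up!

Hello, I am a student and would like to learn about this model and how to set up its environment. Could you give me some guidance? Thank you.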
