
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

1Fudan University  2Baidu Inc  3ETH Zurich  4Nanjing University


📸 Showcase

[Showcase video: head.mp4]

🎬 Honoring Classic Films

Devil Wears Prada · Green Book · Infernal Affairs
Patch Adams · Tough Love · Shawshank Redemption

Explore more examples.

📰 News

  • 2024/06/28: 🎉🎉🎉 We are proud to announce the release of our model training code. Try your own training data with the tutorial here.
  • 2024/06/21: 🚀🚀🚀 Made a Gradio demo available on a 🤗 Hugging Face Space.
  • 2024/06/20: 🌟🌟🌟 Received numerous contributions from the community, including a Windows version, ComfyUI, WebUI, and a Docker template.
  • 2024/06/15: ✨✨✨ Released sample images and audio clips for inference testing on 🤗 Hugging Face.
  • 2024/06/15: 🎉🎉🎉 Launched the first version on 🫡 GitHub.

🤝 Community Resources

Explore the resources developed by our community to enhance your experience with Hallo. Thanks to all of them!

Join our community and explore these amazing resources to make the most of Hallo. Enjoy, and elevate your creative projects!

🔧️ Framework

[Framework overview diagram]

⚙️ Installation

  • System requirements: Ubuntu 20.04/Ubuntu 22.04, CUDA 12.1
  • Tested GPUs: A100

Create conda environment:

  conda create -n hallo python=3.10
  conda activate hallo

Install packages with pip:

  pip install -r requirements.txt
  pip install .

In addition, ffmpeg is required:

  apt-get install ffmpeg

🗝️️ Usage

The entry point for inference is scripts/inference.py. Before testing your own cases, complete the following steps:

  1. Download all required pretrained models.
  2. Prepare source image and driving audio pairs.
  3. Run inference.

📥 Download Pretrained Models

You can easily get all the pretrained models required for inference from our HuggingFace repo.

Clone the pretrained models into the ${PROJECT_ROOT}/pretrained_models directory with the command below:

git lfs install
git clone https://huggingface.co/fudan-generative-ai/hallo pretrained_models
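
If git lfs gives you trouble (see the smudge-filter issue further below), the same snapshot can be fetched with the Hugging Face CLI instead; this is a sketch assuming huggingface_hub is installed:

pip install -U "huggingface_hub[cli]"
huggingface-cli download fudan-generative-ai/hallo --local-dir pretrained_models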

Or you can download them separately from their source repos.

Finally, these pretrained models should be organized as follows:

./pretrained_models/
|-- audio_separator/
|   |-- download_checks.json
|   |-- mdx_model_data.json
|   |-- vr_model_data.json
|   `-- Kim_Vocal_2.onnx
|-- face_analysis/
|   `-- models/
|       |-- face_landmarker_v2_with_blendshapes.task  # face landmarker model from mediapipe
|       |-- 1k3d68.onnx
|       |-- 2d106det.onnx
|       |-- genderage.onnx
|       |-- glintr100.onnx
|       `-- scrfd_10g_bnkps.onnx
|-- motion_module/
|   `-- mm_sd_v15_v2.ckpt
|-- sd-vae-ft-mse/
|   |-- config.json
|   `-- diffusion_pytorch_model.safetensors
|-- stable-diffusion-v1-5/
|   `-- unet/
|       |-- config.json
|       `-- diffusion_pytorch_model.safetensors
`-- wav2vec/
    `-- wav2vec2-base-960h/
        |-- config.json
        |-- feature_extractor_config.json
        |-- model.safetensors
        |-- preprocessor_config.json
        |-- special_tokens_map.json
        |-- tokenizer_config.json
        `-- vocab.json
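
To sanity-check the download, here is a minimal Python sketch; it spot-checks one file per subdirectory from the tree above, so extend the list if you want full coverage:

# verify the pretrained model layout described above
from pathlib import Path

root = Path("./pretrained_models")
expected = [
    "audio_separator/Kim_Vocal_2.onnx",
    "face_analysis/models/scrfd_10g_bnkps.onnx",
    "motion_module/mm_sd_v15_v2.ckpt",
    "sd-vae-ft-mse/diffusion_pytorch_model.safetensors",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.safetensors",
    "wav2vec/wav2vec2-base-960h/model.safetensors",
]
missing = [p for p in expected if not (root / p).exists()]
print("All pretrained models found." if not missing else f"Missing: {missing}")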

🛠️ Prepare Inference Data

Hallo has a few simple requirements for input data:

For the source image:

  1. It should be cropped to a square.
  2. The face should be the main focus, making up 50%-70% of the image.
  3. The face should be facing forward, with a rotation angle of less than 30° (no side profiles).
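
To satisfy the square-crop requirement, here is a minimal center-crop sketch with Pillow; the helper is hypothetical, not part of this repo, and it ignores where the face sits in the frame, so treat it only as a starting point:

# center-crop a portrait image to a square (hypothetical helper)
from PIL import Image

def center_square_crop(src, dst):
    img = Image.open(src)
    side = min(img.size)               # side of the largest centered square
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img.crop((left, top, left + side, top + side)).save(dst)

center_square_crop("portrait.jpg", "portrait_square.jpg")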

For the driving audio:

  1. It must be in WAV format.
  2. It must be in English since our training datasets are only in this language.
  3. Ensure the vocals are clear; background music is acceptable.

We have provided some samples for your reference.
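
If your audio is not already a WAV file, a hedged ffmpeg one-liner can convert it; 16 kHz mono matches the sample_rate in the default inference config, and input.mp3 is a placeholder name:

ffmpeg -i input.mp3 -ar 16000 -ac 1 driving_audio.wav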

🎮 Run Inference

Simply run scripts/inference.py and pass source_image and driving_audio as input:

python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav

Animation results will be saved as ${PROJECT_ROOT}/.cache/output.mp4 by default. You can pass --output to specify the output file name. You can find more inference examples in the examples folder.

For more options:

usage: inference.py [-h] [-c CONFIG] [--source_image SOURCE_IMAGE] [--driving_audio DRIVING_AUDIO] [--output OUTPUT] [--pose_weight POSE_WEIGHT]
                    [--face_weight FACE_WEIGHT] [--lip_weight LIP_WEIGHT] [--face_expand_ratio FACE_EXPAND_RATIO]

options:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
  --source_image SOURCE_IMAGE
                        source image
  --driving_audio DRIVING_AUDIO
                        driving audio
  --output OUTPUT       output video file name
  --pose_weight POSE_WEIGHT
                        weight of pose
  --face_weight FACE_WEIGHT
                        weight of face
  --lip_weight LIP_WEIGHT
                        weight of lip
  --face_expand_ratio FACE_EXPAND_RATIO
                        face region expand ratio
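
For example, an invocation that overrides the output path and the blending weights might look like this; the weight values mirror the defaults in configs/inference/default.yaml and are illustrative, not tuned recommendations:

python scripts/inference.py \
  --source_image examples/reference_images/1.jpg \
  --driving_audio examples/driving_audios/1.wav \
  --output .cache/demo.mp4 \
  --pose_weight 1.1 \
  --face_weight 1.1 \
  --lip_weight 1.1 \
  --face_expand_ratio 1.1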

Training

Prepare Data for Training

The training data consists of talking-face videos, similar to the source images used for inference, and must meet the following requirements:

  1. It should be cropped to a square.
  2. The face should be the main focus, making up 50%-70% of the image.
  3. The face should be facing forward, with a rotation angle of less than 30° (no side profiles).

Organize your raw videos into the following directory structure:

dataset_name/
|-- videos/
|   |-- 0001.mp4
|   |-- 0002.mp4
|   |-- 0003.mp4
|   `-- 0004.mp4

You can use any dataset_name, but ensure the videos directory is named as shown above.

Next, process the videos with the following commands:

python -m scripts.data_preprocess --input_dir dataset_name/videos --step 1
python -m scripts.data_preprocess --input_dir dataset_name/videos --step 2

Note: Execute steps 1 and 2 sequentially as they perform different tasks. Step 1 converts videos into frames, extracts audio from each video, and generates the necessary masks. Step 2 generates face embeddings using InsightFace and audio embeddings using Wav2Vec, and requires a GPU. For parallel processing, use the -p and -r arguments. The -p argument specifies the total number of instances to launch, dividing the data into p parts. The -r argument specifies which part the current process should handle. You need to manually launch multiple instances with different values for -r.
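
For instance, to split step 2 across four processes on one machine, a sketch (each -r value runs as its own background job; adjust -p to your instance count):

python -m scripts.data_preprocess --input_dir dataset_name/videos --step 2 -p 4 -r 0 &
python -m scripts.data_preprocess --input_dir dataset_name/videos --step 2 -p 4 -r 1 &
python -m scripts.data_preprocess --input_dir dataset_name/videos --step 2 -p 4 -r 2 &
python -m scripts.data_preprocess --input_dir dataset_name/videos --step 2 -p 4 -r 3 &
wait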

Generate the metadata JSON files with the following commands:

python scripts/extract_meta_info_stage1.py -r path/to/dataset -n dataset_name
python scripts/extract_meta_info_stage2.py -r path/to/dataset -n dataset_name

Replace path/to/dataset with the path to the parent directory of videos, such as dataset_name in the example above. This will generate dataset_name_stage1.json and dataset_name_stage2.json in the ./data directory.

Training

Update the data meta path settings in the configuration YAML files, configs/train/stage1.yaml and configs/train/stage2.yaml:

#stage1.yaml
data:
  meta_paths:
    - ./data/dataset_name_stage1.json

#stage2.yaml
data:
  meta_paths:
    - ./data/dataset_name_stage2.json

Start training with the following command:

accelerate launch -m \
  --config_file accelerate_config.yaml \
  --machine_rank 0 \
  --main_process_ip 0.0.0.0 \
  --main_process_port 20055 \
  --num_machines 1 \
  --num_processes 8 \
  scripts.train_stage1 --config ./configs/train/stage1.yaml
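
Stage 2 is launched the same way, swapping in the stage-2 module and config; this is a sketch assuming the same accelerate settings as above:

accelerate launch -m \
  --config_file accelerate_config.yaml \
  --machine_rank 0 \
  --main_process_ip 0.0.0.0 \
  --main_process_port 20055 \
  --num_machines 1 \
  --num_processes 8 \
  scripts.train_stage2 --config ./configs/train/stage2.yaml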

Accelerate Usage Explanation

The accelerate launch command is used to start the training process with distributed settings.

accelerate launch [arguments] {training_script} --{training_script-argument-1} --{training_script-argument-2} ...

Arguments for Accelerate:

  • -m, --module: Interpret the launch script as a Python module.
  • --config_file: Configuration file for Hugging Face Accelerate.
  • --machine_rank: Rank of the current machine in a multi-node setup.
  • --main_process_ip: IP address of the master node.
  • --main_process_port: Port of the master node.
  • --num_machines: Total number of nodes participating in the training.
  • --num_processes: Total number of processes for training, matching the total number of GPUs across all machines.

Arguments for Training:

  • {training_script}: The training script, such as scripts.train_stage1 or scripts.train_stage2.
  • --{training_script-argument-1}: Arguments specific to the training script. Our training scripts accept one argument, --config, to specify the training configuration file.

For multi-node training, you need to manually run the command on each node separately, with a different --machine_rank value for each.
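
For example, a hedged two-node sketch with 8 GPUs per node (MASTER_IP is a placeholder for the first node's address):

# on node 0 (the master)
accelerate launch -m \
  --config_file accelerate_config.yaml \
  --machine_rank 0 \
  --main_process_ip MASTER_IP \
  --main_process_port 20055 \
  --num_machines 2 \
  --num_processes 16 \
  scripts.train_stage1 --config ./configs/train/stage1.yaml

# on node 1
accelerate launch -m \
  --config_file accelerate_config.yaml \
  --machine_rank 1 \
  --main_process_ip MASTER_IP \
  --main_process_port 20055 \
  --num_machines 2 \
  --num_processes 16 \
  scripts.train_stage1 --config ./configs/train/stage1.yaml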

For more settings, refer to the Accelerate documentation.

📅️ Roadmap

Status  Milestone                                                ETA
✅      Inference source code meets everyone on GitHub           2024-06-15
✅      Pretrained models on Huggingface                         2024-06-15
✅      Releasing data preparation and training scripts          2024-06-28
🚀      Improving the model's performance on Mandarin Chinese    TBD
Other Enhancements
  • Enhancement: Test and ensure compatibility with Windows operating system. #39
  • Bug: Output video may lose several frames. #41
  • Bug: Sound volume affecting inference results (audio normalization).
  • Enhancement: Inference code logic optimization. The current solution does not show significant performance improvements; we are trying other approaches.

📝 Citation

If you find our work useful for your research, please consider citing the paper:

@misc{xu2024hallo,
  title={Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation},
  author={Mingwang Xu and Hui Li and Qingkun Su and Hanlin Shang and Liwei Zhang and Ce Liu and Jingdong Wang and Yao Yao and Siyu Zhu},
  year={2024},
  eprint={2406.08801},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

🌟 Opportunities Available

Multiple research positions are open at the Generative Vision Lab, Fudan University! These include:

  • Research assistants
  • Postdoctoral researchers
  • PhD candidates
  • Master's students

Interested individuals are encouraged to contact us at [email protected] for further information.

⚠️ Social Risks and Mitigations

The development of portrait image animation technologies driven by audio inputs poses social risks, such as the ethical implications of creating realistic portraits that could be misused for deepfakes. To mitigate these risks, it is crucial to establish ethical guidelines and responsible use practices. Privacy and consent concerns also arise from using individuals' images and voices. Addressing these involves transparent data usage policies, informed consent, and safeguarding privacy rights. By addressing these risks and implementing mitigations, the research aims to ensure the responsible and ethical development of this technology.

🤗 Acknowledgements

We would like to thank the contributors to the magic-animate, AnimateDiff, ultimatevocalremovergui, AniPortrait and Moore-AnimateAnyone repositories, for their open research and exploration.

If we have missed any open-source projects or related articles, we will gladly update the acknowledgements right away.

👏 Community Contributors

Thank you to all the contributors who have helped to make this project better!

alexpvpmindustry, aricgamma, ashleykleynhans, cocktailpeanut, crystallee-ai, danieldunderfelt, eify, eltociear, samge0, siyuzhu-fudan, subazinga, xumingw

hallo's Issues

Could not build wheels for insightface (on Windows)

When I tried to run this command on Windows:

pip install insightface==0.7.3

I ran into the problem below:

Processing d:\bingoyes\hallo\insightface-0.7.3.tar.gz
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (1.26.4)
Requirement already satisfied: onnx in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (1.16.1)
Requirement already satisfied: tqdm in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (4.66.4)
Requirement already satisfied: requests in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (2.32.3)
Requirement already satisfied: matplotlib in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (3.9.0)
Requirement already satisfied: Pillow in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (10.3.0)
Requirement already satisfied: scipy in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (1.13.1)
Requirement already satisfied: scikit-learn in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (1.5.0)
Requirement already satisfied: scikit-image in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (0.23.2)
Requirement already satisfied: easydict in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (1.13)
Requirement already satisfied: cython in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from insightface==0.7.3) (3.0.10)
Collecting albumentations (from insightface==0.7.3)
  Using cached albumentations-1.4.8-py3-none-any.whl.metadata (37 kB)     
Collecting prettytable (from insightface==0.7.3)
  Using cached prettytable-3.10.0-py3-none-any.whl.metadata (30 kB)       
Requirement already satisfied: PyYAML in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from albumentations->insightface==0.7.3) (6.0.1)    
Requirement already satisfied: typing-extensions>=4.9.0 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from albumentations->insightface==0.7.3) (4.12.2)
Collecting pydantic>=2.7.0 (from albumentations->insightface==0.7.3)
  Using cached pydantic-2.7.4-py3-none-any.whl.metadata (109 kB)
Collecting albucore>=0.0.4 (from albumentations->insightface==0.7.3)
  Using cached albucore-0.0.10-py3-none-any.whl.metadata (3.1 kB)
Requirement already satisfied: opencv-python-headless>=4.9.0.80 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from albumentations->insightface==0.7.3) (4.9.0.80)
Requirement already satisfied: networkx>=2.8 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-image->insightface==0.7.3) (3.3) 
Requirement already satisfied: imageio>=2.33 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-image->insightface==0.7.3) (2.34.1)
Requirement already satisfied: tifffile>=2022.8.12 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-image->insightface==0.7.3) (2024.5.22)
Requirement already satisfied: packaging>=21 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-image->insightface==0.7.3) (24.1)
Requirement already satisfied: lazy-loader>=0.4 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-image->insightface==0.7.3) (0.4)
Requirement already satisfied: joblib>=1.2.0 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-learn->insightface==0.7.3) (1.4.2)
Requirement already satisfied: threadpoolctl>=3.1.0 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from scikit-learn->insightface==0.7.3) (3.5.0)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (1.2.1)
Requirement already satisfied: cycler>=0.10 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (0.12.1) 
Requirement already satisfied: fonttools>=4.22.0 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (4.53.0)
Requirement already satisfied: kiwisolver>=1.3.1 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (1.4.5)
Requirement already satisfied: pyparsing>=2.3.1 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (3.1.2)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from matplotlib->insightface==0.7.3) (2.9.0.post0)
Requirement already satisfied: protobuf>=3.20.2 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from onnx->insightface==0.7.3) (4.25.3)   
Collecting wcwidth (from prettytable->insightface==0.7.3)
  Using cached wcwidth-0.2.13-py2.py3-none-any.whl.metadata (14 kB)       
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from requests->insightface==0.7.3) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from requests->insightface==0.7.3) (3.7)      
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from requests->insightface==0.7.3) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from requests->insightface==0.7.3) (2024.6.2)
Requirement already satisfied: colorama in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from tqdm->insightface==0.7.3) (0.4.6)
Requirement already satisfied: tomli>=2.0.1 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from albucore>=0.0.4->albumentations->insightface==0.7.3) (2.0.1)
Collecting annotated-types>=0.4.0 (from pydantic>=2.7.0->albumentations->insightface==0.7.3)
  Using cached annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)    
Collecting pydantic-core==2.18.4 (from pydantic>=2.7.0->albumentations->insightface==0.7.3)
  Using cached pydantic_core-2.18.4-cp310-none-win_amd64.whl.metadata (6.7 kB)
Requirement already satisfied: six>=1.5 in c:\users\bingoyes\.conda\envs\hallo\lib\site-packages (from python-dateutil>=2.7->matplotlib->insightface==0.7.3) (1.16.0)
Using cached albumentations-1.4.8-py3-none-any.whl (156 kB)
Using cached prettytable-3.10.0-py3-none-any.whl (28 kB)
Using cached albucore-0.0.10-py3-none-any.whl (8.4 kB)
Using cached pydantic-2.7.4-py3-none-any.whl (409 kB)
Using cached pydantic_core-2.18.4-cp310-none-win_amd64.whl (1.9 MB)       
Using cached wcwidth-0.2.13-py2.py3-none-any.whl (34 kB)
Using cached annotated_types-0.7.0-py3-none-any.whl (13 kB)
Building wheels for collected packages: insightface
  Building wheel for insightface (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for insightface (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [215 lines of output]
      WARNING: pandoc not enabled
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-cpython-310
      creating build\lib.win-amd64-cpython-310\insightface
      copying insightface\__init__.py -> build\lib.win-amd64-cpython-310\insightface
      creating build\lib.win-amd64-cpython-310\insightface\app
      copying insightface\app\common.py -> build\lib.win-amd64-cpython-310\insightface\app
      copying insightface\app\face_analysis.py -> build\lib.win-amd64-cpython-310\insightface\app
      copying insightface\app\mask_renderer.py -> build\lib.win-amd64-cpython-310\insightface\app
      copying insightface\app\__init__.py -> build\lib.win-amd64-cpython-310\insightface\app
      creating build\lib.win-amd64-cpython-310\insightface\commands       
      copying insightface\commands\insightface_cli.py -> build\lib.win-amd64-cpython-310\insightface\commands
      copying insightface\commands\model_download.py -> build\lib.win-amd64-cpython-310\insightface\commands
      copying insightface\commands\rec_add_mask_param.py -> build\lib.win-amd64-cpython-310\insightface\commands
      copying insightface\commands\__init__.py -> build\lib.win-amd64-cpython-310\insightface\commands
      creating build\lib.win-amd64-cpython-310\insightface\data
      copying insightface\data\image.py -> build\lib.win-amd64-cpython-310\insightface\data
      copying insightface\data\pickle_object.py -> build\lib.win-amd64-cpython-310\insightface\data
      copying insightface\data\rec_builder.py -> build\lib.win-amd64-cpython-310\insightface\data
      copying insightface\data\__init__.py -> build\lib.win-amd64-cpython-310\insightface\data
      creating build\lib.win-amd64-cpython-310\insightface\model_zoo      
      copying insightface\model_zoo\arcface_onnx.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\attribute.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\inswapper.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\landmark.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\model_store.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\model_zoo.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\retinaface.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\scrfd.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      copying insightface\model_zoo\__init__.py -> build\lib.win-amd64-cpython-310\insightface\model_zoo
      creating build\lib.win-amd64-cpython-310\insightface\thirdparty     
      copying insightface\thirdparty\__init__.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty
      creating build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\constant.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\download.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\face_align.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\filesystem.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\storage.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\transform.py -> build\lib.win-amd64-cpython-310\insightface\utils
      copying insightface\utils\__init__.py -> build\lib.win-amd64-cpython-310\insightface\utils
      creating build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d
      copying insightface\thirdparty\face3d\__init__.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d
      creating build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\io.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\light.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\render.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\transform.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\vis.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      copying insightface\thirdparty\face3d\mesh\__init__.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh
      creating build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy
      copying insightface\thirdparty\face3d\mesh_numpy\io.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy
      copying insightface\thirdparty\face3d\mesh_numpy\light.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy
      copying insightface\thirdparty\face3d\mesh_numpy\render.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy        
      copying insightface\thirdparty\face3d\mesh_numpy\transform.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy     
      copying insightface\thirdparty\face3d\mesh_numpy\vis.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy
      copying insightface\thirdparty\face3d\mesh_numpy\__init__.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh_numpy      
      creating build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\morphable_model
      copying insightface\thirdparty\face3d\morphable_model\fit.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\morphable_model 
      copying insightface\thirdparty\face3d\morphable_model\load.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\morphable_model
      copying insightface\thirdparty\face3d\morphable_model\morphabel_model.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\morphable_model
      copying insightface\thirdparty\face3d\morphable_model\__init__.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\morphable_model
      running egg_info
      writing insightface.egg-info\PKG-INFO
      writing dependency_links to insightface.egg-info\dependency_links.txt
      writing entry points to insightface.egg-info\entry_points.txt       
      writing requirements to insightface.egg-info\requires.txt
      writing top-level names to insightface.egg-info\top_level.txt       
      reading manifest file 'insightface.egg-info\SOURCES.txt'
      writing manifest file 'insightface.egg-info\SOURCES.txt'
      C:\Users\bingoyes\AppData\Local\Temp\pip-build-env-ibn8tcr3\overlay\Lib\site-packages\setuptools\command\build_py.py:207: _Warning: Package 'insightface.thirdparty.face3d.mesh.cython' is absent from the `packages` configuration.
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'insightface.thirdparty.face3d.mesh.cython' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration. 

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'insightface.thirdparty.face3d.mesh.cython' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'insightface.thirdparty.face3d.mesh.cython' to be distributed and are
              already explicitly excluding 'insightface.thirdparty.face3d.mesh.cython' via
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html


              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages. 
              ********************************************************************************

      !!
        check.warn(importable)
      C:\Users\bingoyes\AppData\Local\Temp\pip-build-env-ibn8tcr3\overlay\Lib\site-packages\setuptools\command\build_py.py:207: _Warning: Package 'insightface.data.images' is absent from the `packages` configuration.      
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'insightface.data.images' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration. 

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'insightface.data.images' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'insightface.data.images' to be distributed and are
              already explicitly excluding 'insightface.data.images' via  
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html


              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages. 
              ********************************************************************************

      !!
        check.warn(importable)
      C:\Users\bingoyes\AppData\Local\Temp\pip-build-env-ibn8tcr3\overlay\Lib\site-packages\setuptools\command\build_py.py:207: _Warning: Package 'insightface.data.objects' is absent from the `packages` configuration.     
      !!

              ********************************************************************************
              ############################
              # Package would be ignored #
              ############################
              Python recognizes 'insightface.data.objects' as an importable package[^1],
              but it is absent from setuptools' `packages` configuration. 

              This leads to an ambiguous overall configuration. If you want to distribute this
              package, please make sure that 'insightface.data.objects' is explicitly added
              to the `packages` configuration field.

              Alternatively, you can also rely on setuptools' discovery methods
              (for example by using `find_namespace_packages(...)`/`find_namespace:`
              instead of `find_packages(...)`/`find:`).

              You can read more about "package discovery" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/package_discovery.html

              If you don't want 'insightface.data.objects' to be distributed and are
              already explicitly excluding 'insightface.data.objects' via 
              `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.

              You can read more about "package data files" on setuptools documentation page:

              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html


              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages. 
              ********************************************************************************

pp -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh\cython
      creating build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\Tom_Hanks_54745.png -> build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\mask_black.jpg -> build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\mask_blue.jpg -> build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\mask_green.jpg -> build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\mask_white.jpg -> build\lib.win-amd64-cpython-310\insightface\data\images
      copying insightface\data\images\t1.jpg -> build\lib.win-amd64-cpython-310\insightface\data\images

      creating build\lib.win-amd64-cpython-310\insightface\data\objects
      copying insightface\data\objects\meanshape_68.pkl -> build\lib.win-amd64-cpython-310\insightface\data\objects
      copying insightface\thirdparty\face3d\mesh\cython\mesh_core_cython.c -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh\cython
      copying insightface\thirdparty\face3d\mesh\cython\mesh_core_cython.cpp -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh\cython
      copying insightface\thirdparty\face3d\mesh\cython\mesh_core_cython.pyx -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh\cython
      copying insightface\thirdparty\face3d\mesh\cython\setup.py -> build\lib.win-amd64-cpython-310\insightface\thirdparty\face3d\mesh\cython
      running build_ext
      building 'insightface.thirdparty.face3d.mesh.cython.mesh_core_cython' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for insightface
Failed to build insightface
ERROR: Could not build wheels for insightface, which is required to install pyproject.toml-based projects

When FPS>25, OSError: Error in file

When FPS > 25, "OSError: Error in file" appears. I tried 60 FPS, 50 FPS, and 30 FPS, and the error occurred each time:

(Screenshots for 50 FPS and 60 FPS attached.)
30 FPS also fails; only 25 FPS works without issues.
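
A possible workaround, given that the default config exports video at 25 fps: re-encode the source video to 25 fps before processing. This is an untested sketch, with input.mp4 as a placeholder name:

ffmpeg -i input.mp4 -r 25 input_25fps.mp4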

RuntimeError: The size of tensor a (128) must match the size of tensor b (64) at non-singleton dimension 4

pipeline_output = pipeline(
    ref_image=pixel_values_ref_img,
    audio_tensor=audio_tensor,
    face_emb=source_image_face_emb,
    face_mask=source_image_face_region,
    pixel_values_full_mask=source_image_full_mask,
    pixel_values_face_mask=source_image_face_mask,
    pixel_values_lip_mask=source_image_lip_mask,
    width=1024,
    height=1024,
    video_length=clip_length,
    num_inference_steps=config.inference_steps,
    guidance_scale=config.cfg_scale,
    generator=generator,
    motion_scale=motion_scale,
)

This happens after changing the resolution to:

width=1024
height=1024

Traceback (most recent call last):
  File "F:\workplace\hallo-webui\scripts\inference.py", line 424, in <module>
    inference_process(
  File "F:\workplace\hallo-webui\scripts\inference.py", line 364, in inference_process
    pipeline_output = pipeline(
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\workplace\hallo-webui\hallo\animate\face_animate.py", line 401, in __call__
    noise_pred = self.denoising_unet(
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\workplace\hallo-webui\hallo\models\unet_3d.py", line 605, in forward
    sample = sample + mask_cond_fea
RuntimeError: The size of tensor a (128) must match the size of tensor b (64) at non-singleton dimension 4

git clone https://huggingface.co/fudan-generative-ai/hallo fails with "smudge filter lfs failed"

Currently,

!git lfs install
!git clone https://huggingface.co/fudan-generative-ai/hallo pretrained_models

fails with:

Updated git hooks.
Git LFS initialized.
Cloning into 'pretrained_models'...
remote: Enumerating objects: 61, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 61 (delta 8), reused 0 (delta 0), pack-reused 4 (from 1)
Unpacking objects: 100% (61/61), 17.38 KiB | 26.00 KiB/s, done.
Downloading hallo/net.pth (4.9 GB)
Error downloading object: hallo/net.pth (e886a96): Smudge error: Error downloading hallo/net.pth (e886a9610b71a0f05a4cc65b4eb5bf3cebabfc75b06f8818c40ac225e69a0015): expected OID e886a9610b71a0f05a4cc65b4eb5bf3cebabfc75b06f8818c40ac225e69a0015, got 974a61a7c6ef1748966794cfba0f2535831680593c781791e671f49dad3c7300 after 4850540707 bytes written

Errors logged to /content/drive/MyDrive/hallo/source/hallo/pretrained_models/.git/lfs/logs/20240620T194058.384401071.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' failed
fatal: hallo/net.pth: smudge filter lfs failed
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

Is there an option to reduce the resolution?

The paper showed a dramatically faster inference rate at smaller resolutions. I tried reducing the input image size (half width, half height), but this did not significantly improve inference time. Do I need to change the code, or is there an option to change the image size during inference?

Invalid: Protobuf parsing failed?

Hi, when I run it I get this error; any idea what could be wrong?

GPU: A10G 24GB VRAM

  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "/home/ubuntu/hallo-webui/scripts/inference.py", line 424, in <module>
    inference_process(
  File "/home/ubuntu/hallo-webui/scripts/inference.py", line 181, in inference_process
    with ImageProcessor(img_size, face_analysis_model_path) as image_processor:
  File "/home/ubuntu/hallo-webui/hallo/datasets/image_processor.py", line 97, in __init__
    self.face_analysis = FaceAnalysis(
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/insightface/app/face_analysis.py", line 31, in __init__
    model = model_zoo.get_model(onnx_file, **kwargs)
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 96, in get_model
    model = router.get_model(providers=providers, provider_options=provider_options)
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 40, in get_model
    session = PickableInferenceSession(self.onnx_file, **kwargs)
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/insightface/model_zoo/model_zoo.py", line 25, in __init__
    super().__init__(model_path, **kwargs)
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/ubuntu/hallo-webui/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 472, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidProtobuf: [ONNXRuntimeError] : 7 : INVALID_PROTOBUF : Load model from ./pretrained_models/face_analysis/models/1k3d68.onnx failed:Protobuf parsing failed.

Everything is working fine, but why does the inferred video file have no sound, only lip movements?

100% 2/2 [00:04<00:00, 2.48s/it]
100% 2/2 [00:00<00:00, 9.86it/s]
2024-06-15 19:03:42,073 - INFO - mdx_separator - Saving Vocals stem to aud_1_(Vocals)_Kim_Vocal_2.wav...
2024-06-15 19:03:42,377 - INFO - common_separator - Clearing input audio file paths, sources and stems...
2024-06-15 19:03:42,377 - INFO - separator - Separation duration: 00:00:06
The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
['conv_norm_out.bias, conv_norm_out.weight, conv_out.bias, conv_out.weight']
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
loaded weight from ./pretrained_models/hallo/net.pth
100% 40/40 [00:25<00:00, 1.55it/s]
100% 16/16 [00:00<00:00, 30.32it/s]
100% 40/40 [00:25<00:00, 1.57it/s]
100% 16/16 [00:00<00:00, 32.29it/s]
100% 40/40 [00:25<00:00, 1.57it/s]
100% 16/16 [00:00<00:00, 32.26it/s]
100% 40/40 [00:25<00:00, 1.56it/s]
100% 16/16 [00:00<00:00, 32.27it/s]
100% 40/40 [00:25<00:00, 1.56it/s]
100% 16/16 [00:00<00:00, 32.26it/s]
100% 40/40 [00:25<00:00, 1.56it/s]
100% 16/16 [00:00<00:00, 32.25it/s]
100% 40/40 [00:25<00:00, 1.56it/s]
100% 16/16 [00:00<00:00, 32.23it/s]
Moviepy - Building video /content/output1.mp4.
MoviePy - Writing audio in output1TEMP_MPY_wvf_snd.mp3
MoviePy - Done.
Moviepy - Writing video /content/output1.mp4

Moviepy - Done !
Moviepy - video ready /content/output1.mp4

HelloGitHub Badge

🎉 Congratulations! Your project has been featured and recommended by the HelloGitHub community. We invite you to join the HelloGitHub Badge Program. Joining will grant you the following privileges:

  • Community Recognition: The badge indicates that your project has successfully passed the HelloGitHub community's stringent selection and recommendation process.
  • Increased Exposure: Displaying the badge will draw more traffic to your project, attracting additional users and contributors.
  • Enhanced Interaction: Users can quickly understand your project through the badge and engage with it (like, comment, bookmark).
  • Feedback Collection: Gather genuine feedback from a broad user base, aiding in the continuous improvement of your project.
  • Special Identification: Once verified, your comments will feature a distinctive mark and be prioritized for pinning.

📌 Click here to claim the badge and join the HelloGitHub Badge Program, allowing your open-source project to shine even brighter.


HelloGitHub is a community focused on discovering, sharing, and promoting open-source projects. Since its inception in 2016, it has grown from a monthly newsletter into a dynamic community with over 10,000 users. Our footprint extends across multiple content platforms, earning the trust and support of 500,000 fans worldwide.


GPU not working

PS D:\1Git\hallo> python scripts/inference.py --source_image .\img.jpg --driving_audio .\audio.wav
A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
  File "C:\Users\akash\AppData\Local\Programs\Python\Python310\lib\site-packages\xformers\__init__.py", line 55, in _is_triton_available
    from xformers.triton.softmax import softmax as triton_softmax  # noqa
  File "C:\Users\akash\AppData\Local\Programs\Python\Python310\lib\site-packages\xformers\triton\softmax.py", line 11, in <module>
    import triton
ModuleNotFoundError: No module named 'triton'
WARNING:py.warnings:C:\Users\akash\AppData\Local\Programs\Python\Python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
warnings.warn(

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
WARNING:py.warnings:C:\Users\akash\AppData\Local\Programs\Python\Python310\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1.
P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1718569438.682961 2464 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1718569438.725895 23228 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1718569438.745459 19520 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
WARNING:py.warnings:C:\Users\akash\AppData\Local\Programs\Python\Python310\lib\site-packages\google\protobuf\symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.
warnings.warn('SymbolDatabase.GetPrototype() is deprecated. Please '

Processed and saved: ./.cache\img_sep_background.png
Processed and saved: ./.cache\img_sep_face.png
Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
INFO:audio_separator.separator.separator:Separator version 0.17.2 instantiating with output_dir: ./.cache\audio_preprocess, output_format: WAV
INFO:audio_separator.separator.separator:Operating System: Windows 10.0.22631
INFO:audio_separator.separator.separator:System: Windows Node: SmashingStar Release: 10 Machine: AMD64 Proc: Intel64 Family 6 Model 154 Stepping 3, GenuineIntel
INFO:audio_separator.separator.separator:Python Version: 3.10.11
INFO:audio_separator.separator.separator:PyTorch Version: 2.3.0+cu121
INFO:audio_separator.separator.separator:FFmpeg installed: ffmpeg version 2024-06-03-git-77ad449911-full_build-www.gyan.dev Copyright (c) 2000-2024 the FFmpeg developers
INFO:audio_separator.separator.separator:ONNX Runtime GPU package installed with version: 1.18.0
INFO:audio_separator.separator.separator:ONNX Runtime CPU package installed with version: 1.18.0
INFO:audio_separator.separator.separator:CUDA is available in Torch, setting Torch device to CUDA
WARNING:audio_separator.separator.separator:CUDAExecutionProvider not available in ONNXruntime, so acceleration will NOT be enabled
INFO:audio_separator.separator.separator:Loading model Kim_Vocal_2.onnx...
17.2kiB [00:00, 866kiB/s]
4.38kiB [00:00, 583kiB/s]
12.0kiB [00:00, 1.50MiB/s]
INFO:audio_separator.separator.separator:Load model duration: 00:00:13
INFO:audio_separator.separator.separator:Starting separation process for audio_file_path: .\audio.wav

Please tell me how to fix this. I gave it a 1-minute-plus input and it has been running for 3 hours; I don't think it is using the GPU at all.

OOM error when loading models

Hi,

when I try to test inference with a 1-minute audio clip, I get an OOM error on RAM; it seems some model files are being loaded on the CPU. My environment is WSL2 Ubuntu with 20 GB RAM and an RTX 4090. The detailed log is below.

Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
INFO:audio_separator.separator.separator:Separator version 0.17.2 instantiating with output_dir: ./.cache/audio_preprocess, output_format: WAV
INFO:audio_separator.separator.separator:Operating System: Linux #1 SMP Fri Mar 29 23:14:13 UTC 2024
INFO:audio_separator.separator.separator:System: Linux Node: developer Release: 5.15.153.1-microsoft-standard-WSL2 Machine: x86_64 Proc: x86_64
INFO:audio_separator.separator.separator:Python Version: 3.10.14
INFO:audio_separator.separator.separator:PyTorch Version: 2.2.2+cu121
INFO:audio_separator.separator.separator:FFmpeg installed: ffmpeg version N-115648-g7a3369398f Copyright (c) 2000-2024 the FFmpeg developers
INFO:audio_separator.separator.separator:ONNX Runtime CPU package installed with version: 1.18.0
INFO:audio_separator.separator.separator:CUDA is available in Torch, setting Torch device to CUDA
WARNING:audio_separator.separator.separator:CUDAExecutionProvider not available in ONNXruntime, so acceleration will NOT be enabled
INFO:audio_separator.separator.separator:Loading model Kim_Vocal_2.onnx...
INFO:audio_separator.separator.separator:Load model duration: 00:00:00
INFO:audio_separator.separator.separator:Starting separation process for audio_file_path: assets/derniere.wav
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:16<00:00, 1.06it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13/13 [00:00<00:00, 19.25it/s]
INFO:audio_separator.separator.separator:Saving Vocals stem to derniere_(Vocals)_Kim_Vocal_2.wav...
INFO:audio_separator.separator.separator:Clearing input audio file paths, sources and stems...
INFO:audio_separator.separator.separator:Separation duration: 00:00:17
The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
['conv_norm_out.bias, conv_norm_out.weight, conv_out.bias, conv_out.weight']
INFO:hallo.models.unet_3d:loaded temporal unet's pretrained weights from pretrained_models/stable-diffusion-v1-5/unet ...
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
INFO:hallo.models.unet_3d:Loaded 453.20928M-parameter motion module
loaded weight from ./pretrained_models/hallo/net.pth
Traceback (most recent call last):
  File "/home/developer/projects/hallo/scripts/inference.py", line 375, in <module>
    inference_process(command_line_args)
  File "/home/developer/projects/hallo/scripts/inference.py", line 323, in inference_process
    pipeline_output = pipeline(
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/hallo/animate/face_animate.py", line 335, in __call__
    ref_image_latents = self.vae.encode(ref_image_tensor).latent_dist.mean
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/diffusers/models/autoencoders/autoencoder_kl.py", line 260, in encode
    h = self.encoder(x)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/diffusers/models/autoencoders/vae.py", line 143, in forward
    sample = self.conv_in(sample)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/developer/miniconda3/envs/hallo/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 192.00 MiB. GPU 0 has a total capacity of 23.99 GiB of which 17.16 GiB is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 5.09 GiB is allocated by PyTorch, and 132.88 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
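
As the error message itself suggests, one low-effort thing to try before re-running (a sketch, not a guaranteed fix):

export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav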

Command-line inference not working with config

I am specifying the config file, but it tries to load an image that is not specified anywhere:

(venv) C:\sd\hallo>python scripts/inference.py --config "C:\sd\hallo\configs\inference\default.yaml"
A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
  File "C:\sd\hallo\venv\lib\site-packages\xformers\__init__.py", line 55, in _is_triton_available
    from xformers.triton.softmax import softmax as triton_softmax  # noqa
  File "C:\sd\hallo\venv\lib\site-packages\xformers\triton\softmax.py", line 11, in <module>
    import triton
ModuleNotFoundError: No module named 'triton'
INFO:albumentations.check_version:A new version of Albumentations is available: 1.4.9 (you have 1.4.8). Upgrade using: pip install --upgrade albumentations
WARNING:py.warnings:C:\sd\hallo\venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names. Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
  warnings.warn(

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis\models\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
Traceback (most recent call last):
  File "C:\sd\hallo\scripts\inference.py", line 374, in <module>
    inference_process(command_line_args)
  File "C:\sd\hallo\scripts\inference.py", line 162, in inference_process
    source_image_lip_mask = image_processor.preprocess(
  File "C:\sd\hallo\scripts\hallo\datasets\image_processor.py", line 115, in preprocess
    source_image = Image.open(source_image_path)
  File "C:\sd\hallo\venv\lib\site-packages\PIL\Image.py", line 3277, in open
    fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\sd\\hallo\\test_data\\source_images\\6.jpg'

default.yaml

source_image: .\examples\reference_images\FACE.png
driving_audio: .\examples\driving_audios\1.wav

weight_dtype: fp16

data:
  n_motion_frames: 2
  n_sample_frames: 16
  source_image:
    width: 512
    height: 512
  driving_audio:
    sample_rate: 16000
  export_video:
    fps: 25

inference_steps: 40
cfg_scale: 3.5

audio_ckpt_dir: ./pretrained_models/hallo

base_model_path: ./pretrained_models/stable-diffusion-v1-5

motion_module_path: ./pretrained_models/motion_module/mm_sd_v15_v2.ckpt

face_analysis:
  model_path: ./pretrained_models/face_analysis

wav2vec:
  model_path: ./pretrained_models/wav2vec/wav2vec2-base-960h
  features: all

audio_separator:
  model_path: ./pretrained_models/audio_separator/Kim_Vocal_2.onnx

vae:
  model_path: ./pretrained_models/sd-vae-ft-mse

save_path: ./.cache

face_expand_ratio: 1.1
pose_weight: 1.1
face_weight: 1.1
lip_weight: 1.1

unet_additional_kwargs:
  use_inflated_groupnorm: true
  unet_use_cross_frame_attention: false
  unet_use_temporal_attention: false
  use_motion_module: true
  use_audio_module: true
  motion_module_resolutions:
    - 1
    - 2
    - 4
    - 8
  motion_module_mid_block: true
  motion_module_decoder_only: false
  motion_module_type: Vanilla
  motion_module_kwargs:
    num_attention_heads: 8
    num_transformer_block: 1
    attention_block_types:
      - Temporal_Self
      - Temporal_Self
    temporal_position_encoding: true
    temporal_position_encoding_max_len: 32
    temporal_attention_dim_div: 1
  audio_attention_dim: 768
  stack_enable_blocks_name:
    - "up"
    - "down"
    - "mid"
  stack_enable_blocks_depth: [0,1,2,3]

enable_zero_snr: true

noise_scheduler_kwargs:
  beta_start: 0.00085
  beta_end: 0.012
  beta_schedule: "linear"
  clip_sample: false
  steps_offset: 1
  ### Zero-SNR params
  prediction_type: "v_prediction"
  rescale_betas_zero_snr: True
  timestep_spacing: "trailing"

sampler: DDIM
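
A possible workaround, assuming the script accepts --source_image and --driving_audio alongside --config (the other commands quoted in this thread use those flags), is to pass the inputs explicitly so the stale path baked into the YAML is never consulted:

  python scripts/inference.py --config "C:\sd\hallo\configs\inference\default.yaml" --source_image "examples\reference_images\FACE.png" --driving_audio "examples\driving_audios\1.wav"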


Inference speed

Based on Table 7 in the paper:

  Setting                    Memory     Time
  Inference w. HADVS         9.77 GB    1.63 s
  Inference w/o HADVS        9.76 GB    1.63 s
  Inference (256 × 256)      6.62 GB    0.46 s
  Inference (1024 × 1024)    20.66 GB   10.29 s

As far as I am aware, users are not currently achieving these speeds. Am I right in assuming those measured times are per frame?

Segmentation fault

Hello guys, nice work! But I have a problem while running the demo.

loaded weight from ./pretrained_models/hallo/net.pth
Segmentation fault (core dumped)

How do I solve this issue?

Thanks in advance!
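
A core dump after the weights load leaves no Python traceback, which makes it hard to say more. One low-effort diagnostic, using only the standard library's fault handler, prints the Python stack at the moment of the crash and at least narrows down which call is faulting:

  python -X faulthandler scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav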

Generating consistent features

I have noticed that teeth in the generated videos are always hit or miss. Is it because of the nature of the base SD model (which finds it difficult to maintain consistency in smaller elements)? Also, do you guys know how this can be avoided or minimized?

dialogue.mp4

HTTPSConnectionPool(host='raw.githubusercontent.com', port=443)

Running this command:

python scripts/inference.py --source_image ./examples/reference_images/1.jpg --driving_audio ./examples/driving_audios/1.wav

produces this error:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /TRvlvr/application_data/main/filelists/download_checks.json (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x00000233F5201A50>: Failed to resolve 'raw.githubusercontent.com' ([Errno 11004] getaddrinfo failed)"))

Does fixing this error require a VPN/proxy to reach the blocked host?
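
The audio separator fetches its model file lists from raw.githubusercontent.com on first run, so that host must be reachable at least once. A workaround sketch, assuming you have some proxy available (the address below is hypothetical): either route the request through it, or fetch the file by hand on a machine that can reach GitHub:

  export HTTPS_PROXY=http://127.0.0.1:7890   # hypothetical proxy address
  # or fetch the file list manually:
  curl -L -o pretrained_models/audio_separator/download_checks.json \
      https://raw.githubusercontent.com/TRvlvr/application_data/main/filelists/download_checks.json

Whether the library then picks up the local copy depends on where it looks for its cache, so the proxy route is the safer bet.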

IndexError: cannot do a non-empty take from an empty axes.

Hi there,

Thanks for sharing this work, it looks perfect!

After following the setup instructions, when I try to run

python scripts/inference.py --source_image 1.png --driving_audio 1.wav

I got an issue like:

[screenshot of the traceback]

I would appreciate it if you would help.

Regards,

When cloning the model repo using git clone, I encountered an error: the download of `hallo/net.pth` failed.

LFS is confirmed to be installed and working properly, and a proxy was also used for the download.

run:

git lfs clone https://huggingface.co/fudan-generative-ai/hallo pretrained_models

error:

WARNING: git lfs clone is deprecated and will not be updated
with new flags from git clone

git clone has been updated in upstream Git to have comparable
speeds to git lfs clone.
Cloning into 'pretrained_models'...
remote: Enumerating objects: 52, done.
remote: Counting objects: 100% (48/48), done.
remote: Compressing objects: 100% (42/42), done.
remote: Total 52 (delta 6), reused 0 (delta 0), pack-reused 4 (from 1)
Unpacking objects: 100% (52/52), 11.47 KiB | 25.00 KiB/s, done.
expected OID e886a9610b71a0f05a4cc65b4eb5bf3cebabfc75b06f8818c40ac225e69a0015, got 974a61a7c6ef1748966794cfba0f2535831680593c781791e671f49dad3c7300 after 4850540707 bytes written
Failed to fetch some objects from 'https://huggingface.co/fudan-generative-ai/hallo.git/info/lfs'
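
An OID mismatch after roughly 4.8 GB written usually means the transfer was truncated or corrupted in flight rather than an LFS setup problem. A retry sketch, using only standard git-lfs commands, re-fetches just the failing object inside the existing clone:

  cd pretrained_models
  git lfs pull --include "hallo/net.pth"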


When I run the hallo demo,

python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav

I get the following:

INFO:audio_separator.separator.separator:Starting separation process for audio_file_path: examples/driving_audios/1.wav
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:07<00:00, 2.43s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 4.48it/s]
INFO:audio_separator.separator.separator:Saving Vocals stem to 1_(Vocals)_Kim_Vocal_2.wav...
INFO:audio_separator.separator.separator:Clearing input audio file paths, sources and stems...
INFO:audio_separator.separator.separator:Separation duration: 00:00:09
The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
['conv_norm_out.bias', 'conv_norm_out.weight', 'conv_out.bias', 'conv_out.weight']
INFO:hallo.models.unet_3d:loaded temporal unet's pretrained weights from pretrained_models/stable-diffusion-v1-5/unet ...
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
INFO:hallo.models.unet_3d:Loaded 453.20928M-parameter motion module
Traceback (most recent call last):
  File "/app/scripts/inference.py", line 372, in <module>
    inference_process(command_line_args)
  File "/app/scripts/inference.py", line 244, in inference_process
    torch.load(
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1040, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1258, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
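
"invalid load key, 'v'" from torch.load typically means net.pth is not an actual checkpoint but a Git LFS pointer file, whose first bytes are the ASCII text "version https://git-lfs..." (hence the 'v'). A quick check-and-fix sketch:

  head -c 100 pretrained_models/hallo/net.pth   # an LFS pointer starts with "version https://git-lfs..."
  cd pretrained_models
  git lfs install
  git lfs pull                                   # replaces pointer files with the real objects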


Mac Support

I tried my best to get this to work on a Mac, by:

  1. installing eva-decord instead of decord
  2. skipping xformers when on a mac
  3. dynamically importing xformers only when xformers is available
  4. changing the torch device to mps

as well as trying multiple different combinations of torch dtype, from float16 to float32 to bfloat16.

While I DID get it to run, I had the following issues:

  1. By default it runs out of memory (it uses around 70 GB of memory and never makes progress).
  2. If I use an image size of 256 (instead of the default 512) it would at least run, but the end result is a blank black video.

Any idea how to run this on a Mac? Or any plans to implement Mac support?

Web-UI Request

Thank you very much for the great work!

It would be super nice if you could add a simple web UI with the following functionality to enhance the user experience, especially as a tool used for remote education in developing countries:

  1. An audio input field with the option of using the microphone for a maximum recording of 5 minutes. The audio is recorded as a WAV file and automatically used as the input audio file.
  2. An option for integrating a locally hosted LLM that generates text, which is then converted to a WAV audio file via a locally hosted TTS model (for a maximum of 5 minutes' worth of text) and automatically used as the input audio file.
  3. For the image input field, an option for an animated "default idle state" that displays after submitting the image file, giving the user a feeling of immersion with a "live" character.

Thank you!

Use Case Example - Great job on this devs!

I wanted to share a quick demonstration of a home use case for this that I made - though it is a bit silly: https://youtu.be/Q6PgKZfQeK8

I am running that on a 3090 Ti and Ubuntu 22.04. As a side question, would there be a way to get this to utilize my second 3090 Ti as well (see the sketch below)? Great job on this project and thanks for sharing it openly!
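
As far as the published inference script shows, generation runs on a single GPU; there is no built-in multi-GPU mode. What you can do today is pin which card a job uses and run two independent jobs in parallel, one per GPU. A sketch (the second input pair is illustrative):

  CUDA_VISIBLE_DEVICES=0 python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav &
  CUDA_VISIBLE_DEVICES=1 python scripts/inference.py --source_image other_face.jpg --driving_audio other_audio.wav &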

Which feature needs CUDA 12.1?

I don't want to upgrade CUDA, since on my system that requires upgrading the kernel.

I want to know which feature needs CUDA 12.1, and whether there is any plan to drop this requirement.

The inference code sometimes halts silently

When running the inference code, especially when I'm trying to create a longer video (40 seconds or more), it often just halts along the way, silently (sometimes halfway through, sometimes when it's almost done).

There are no error messages when this happens; the program simply stops altogether. Has anyone else had this problem? In my experience it happens around 30% of the time. I am watching the task manager and I don't see anything out of the ordinary.

Windows 10 - no luck.

cmd.log
You can see that the install instructions are not complete. All of this could be scripted. The problem with Triton was solved, but in the end still no luck. I can't dig that deep into the Python code, so maybe this log will be helpful for the people who come after me.

Training code :)

Would you please be so kind and also release the training code? :)

ERROR: Failed building wheel for insightface

            `find_namespace_packages(...)/find_namespace` or `find_packages(...)/find`,
              you can try to use `exclude_package_data`, or `include-package-data=False` in
              combination with a more fine grained `package-data` configuration.
      
              You can read more about "package data files" on setuptools documentation page:
      
              - https://setuptools.pypa.io/en/latest/userguide/datafiles.html
      
      
              [^1]: For Python, any directory (with suitable naming) can be imported,
                    even if it does not contain any `.py` files.
                    On the other hand, currently there is no concept of package data
                    directory, all directories are treated like packages.
              ********************************************************************************
      
      !!
        check.warn(importable)
      creating build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/Tom_Hanks_54745.png -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/mask_black.jpg -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/mask_blue.jpg -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/mask_green.jpg -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/mask_white.jpg -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      copying insightface/data/images/t1.jpg -> build/lib.linux-x86_64-cpython-310/insightface/data/images
      creating build/lib.linux-x86_64-cpython-310/insightface/data/objects
      copying insightface/data/objects/meanshape_68.pkl -> build/lib.linux-x86_64-cpython-310/insightface/data/objects
      creating build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/mesh_core.cpp -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/mesh_core.h -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/mesh_core_cython.c -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/mesh_core_cython.cpp -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/mesh_core_cython.pyx -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      copying insightface/thirdparty/face3d/mesh/cython/setup.py -> build/lib.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      running build_ext
      building 'insightface.thirdparty.face3d.mesh.cython.mesh_core_cython' extension
      creating build/temp.linux-x86_64-cpython-310
      creating build/temp.linux-x86_64-cpython-310/insightface
      creating build/temp.linux-x86_64-cpython-310/insightface/thirdparty
      creating build/temp.linux-x86_64-cpython-310/insightface/thirdparty/face3d
      creating build/temp.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh
      creating build/temp.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython
      gcc -pthread -B /home/ubuntu/anaconda3/envs/hallo/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/ubuntu/anaconda3/envs/hallo/include -fPIC -O2 -isystem /home/ubuntu/anaconda3/envs/hallo/include -fPIC -Iinsightface/thirdparty/face3d/mesh/cython -I/tmp/pip-build-env-jcpgxzhb/overlay/lib/python3.10/site-packages/numpy/core/include -I/home/ubuntu/anaconda3/envs/hallo/include/python3.10 -c insightface/thirdparty/face3d/mesh/cython/mesh_core.cpp -o build/temp.linux-x86_64-cpython-310/insightface/thirdparty/face3d/mesh/cython/mesh_core.o
      gcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
      compilation terminated.
      error: command '/usr/bin/gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for insightface
Failed to build insightface
ERROR: Could not build wheels for insightface, which is required to install pyproject.toml-based projects
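
The decisive line is near the bottom of the log: "gcc: fatal error: cannot execute 'cc1plus'" means the C++ front end of GCC is not installed, so the Cython extension inside insightface cannot compile. On Ubuntu the usual fix is:

  sudo apt-get update
  sudo apt-get install g++        # or the full toolchain: sudo apt-get install build-essential
  pip install insightface         # then retry the build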

Cant Run Web ui On Pinokio

Hallo sir, please help me. When I click Start, this happens:

Unable to create process using 'F:\PinokioBrowser\pinokio\bin\miniconda\python.exe F:\PinokioBrowser\pinokio\bin\miniconda\Scripts\conda-script.py shell.cmd.exe activate base '
Traceback (most recent call last):
  File "F:\PinokioBrowser\pinokio\api\hallo.git\app\scripts\app.py", line 1, in <module>
    from inference import inference_process
  File "F:\PinokioBrowser\pinokio\api\hallo.git\app\scripts\inference.py", line 34, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'


What should I do?
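
The traceback shows that the Python interpreter Pinokio launches cannot even import torch, i.e. the app is not running inside the environment where the dependencies were installed. A quick sanity check, assuming you can open a terminal in the environment Pinokio uses:

  python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If that fails too, the requirements need to be (re)installed into that specific environment.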
