
Comments (9)

AricGamma avatar AricGamma commented on July 26, 2024

Can you show your cases? Let me know what the problem is.

from hallo.

Song367 avatar Song367 commented on July 26, 2024

Original image: 3466 × 1942. Output video: 512 × 512.


Song367 avatar Song367 commented on July 26, 2024

Can you show your cases? Let me know what the problem is.

The parameters are default, and the final output video resolution is 512 × 512.


AricGamma avatar AricGamma commented on July 26, 2024

You can modify data.source_image.width and data.source_image.height in the inference config to generate higher resolution videos. However, please be mindful of your VRAM usage.
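For illustration, the dotted names above would correspond to a nested fragment like the one below in the inference config (configs/inference/default.yaml, as named later in this thread). This is only a sketch: the nesting is inferred from the key names, 1024 is the value tried in this thread, and any other keys in the file are omitted and should stay as they are.

```yaml
data:
  source_image:
    width: 1024   # default run in this thread produced 512
    height: 1024  # keep equal to width for square output
```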


AricGamma avatar AricGamma commented on July 26, 2024

Original image: 3466 × 1942. Output video: 512 × 512.

BTW, please use square images.


Song367 avatar Song367 commented on July 26, 2024

You can modify data.source_image.width and data.source_image.height in the inference config to generate higher resolution videos. However, please be mindful of your VRAM usage.

pipeline_output = pipeline(
    ref_image=pixel_values_ref_img,
    audio_tensor=audio_tensor,
    face_emb=source_image_face_emb,
    face_mask=source_image_face_region,
    pixel_values_full_mask=source_image_full_mask,
    pixel_values_face_mask=source_image_face_mask,
    pixel_values_lip_mask=source_image_lip_mask,
    width=1024,
    height=1024,
    video_length=clip_length,
    num_inference_steps=config.inference_steps,
    guidance_scale=config.cfg_scale,
    generator=generator,
    motion_scale=motion_scale,
)

I changed these two arguments in the call:
    width=1024
    height=1024

Traceback (most recent call last):
  File "F:\workplace\hallo-webui\scripts\inference.py", line 424, in <module>
    inference_process(
  File "F:\workplace\hallo-webui\scripts\inference.py", line 364, in inference_process
    pipeline_output = pipeline(
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "F:\workplace\hallo-webui\hallo\animate\face_animate.py", line 401, in __call__
    noise_pred = self.denoising_unet(
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\workplace\hallo-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\workplace\hallo-webui\hallo\models\unet_3d.py", line 605, in forward
    sample = sample + mask_cond_fea
RuntimeError: The size of tensor a (128) must match the size of tensor b (64) at non-singleton dimension 4


AricGamma avatar AricGamma commented on July 26, 2024

Do not modify the code. Just modify the data.source_image.width and data.source_image.height in configs/inference/default.yaml.
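The 128 vs 64 in the error above is consistent with the two sizes being set in different places. A minimal sketch of the arithmetic, assuming (this is not stated in the thread) that the VAE downsamples each side by 8 as in Stable Diffusion, and that the masks are presumably still built from the 512 default in the config; `latent_size` and `sizes_match` are hypothetical helpers, not part of hallo:

```python
# Assumption: the VAE shrinks each spatial dimension by 8x, as in Stable
# Diffusion; the thread does not state this factor.
VAE_SCALE_FACTOR = 8


def latent_size(pixels: int, scale: int = VAE_SCALE_FACTOR) -> int:
    """Return the latent-space length for an image side of `pixels`."""
    if pixels % scale != 0:
        raise ValueError(f"{pixels} is not a multiple of {scale}")
    return pixels // scale


def sizes_match(pipeline_side: int, config_side: int) -> bool:
    """True when pipeline latents and config-sized masks have equal sides."""
    return latent_size(pipeline_side) == latent_size(config_side)


# width=1024 hard-coded in the call, masks still sized from the 512 default.
print(latent_size(1024), latent_size(512), sizes_match(1024, 512))  # 128 64 False

# Both sizes taken from the same config value.
print(sizes_match(1024, 1024))  # True
```

Under these assumptions, setting the size once in the config keeps the reference image, the masks and the pipeline call on the same resolution, which is why editing only the call fails.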


Song367 avatar Song367 commented on July 26, 2024

Do not modify the code. Just modify the data.source_image.width and data.source_image.height in configs/inference/default.yaml.

I'll try


abdur75648 avatar abdur75648 commented on July 26, 2024

@Song367 Did it work?
Were you able to generate 1024 x 1024 frames?
Also, how was the quality? Is it just 512 x 512 resized or what?
