Excellent work! I was inspired by it and want to try it with some other work. I underst


Is the training based on LoRA, or does it just tune the original model parameters? (zero123plus, closed, 3 comments)

WillowKaze commented on May 24, 2024

Is the training based on LoRA, or does it just tune the original model parameters?

from zero123plus.

Comments (3)

WillowKaze commented on May 24, 2024

Another thing that confuses me is the "scaled by a factor of 5" mentioned in the paper. Is that implemented in the code? I could not find it.


eliphatfs commented on May 24, 2024

Original parameters. LoRA would barely work if the rank is as low as 4, but can work to some extent if the rank is more than 64.
ControlNet generally assumes local spatial correlations between the input condition and the output image, so you will need multi-view control images. For off-the-shelf ones you may want to try https://github.com/haofanwang/ControlNet-for-Diffusers. I am not familiar with T2I-Adapter, so I cannot tell now. The ControlNet is trained after the UNet has been trained, the same as regular ControlNets.
It is not possible. In general, if you do not change the schedule, you will need other tricks such as Gaussian blob initialization (I am referring to the Instant3D paper by Adobe) to provide more global cues. I think changing the schedule is the principled way to go, though. Epsilon models can be used to enhance local details, which is one of the future works I mentioned in the Zero123++ report and am currently working on.
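To give a rough sense of why the rank matters in the first point above, here is a minimal sketch that counts the trainable parameters a LoRA update adds to a single linear layer compared with tuning the full weight. The layer width of 320 is a hypothetical value chosen for illustration, not a number taken from the Zero123++ code, and the sketch only shows capacity; it does not reproduce the experiments described above.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix (d_out x d_in) is tuned."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of a LoRA update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d = 320  # hypothetical layer width, for illustration only
    full = full_param_count(d, d)
    for rank in (4, 64):
        lora = lora_param_count(d, d, rank)
        print(f"rank={rank}: {lora} LoRA params vs {full} full params "
              f"({100.0 * lora / full:.1f}%)")
```

For this hypothetical layer, rank 4 trains only a few percent of the parameters that full tuning would, while rank 64 trains a much larger share.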


eliphatfs commented on May 24, 2024

By default, the SD VAE output needs to be rescaled by about 0.18 (vae.config.scaling_factor) before it is sent into the diffusion model. We skipped that step for the condition branch, so relative to the usual convention it is scaled by a factor of roughly 5. We also have an extra function called scale_latents that normalizes the residual by shifting and rescaling the latents according to statistics we computed from Objaverse renders.
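The arithmetic behind the "factor of 5" can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: 0.18215 is the commonly used SD VAE scaling factor (the comment above only says "about 0.18"), latents are plain Python lists rather than tensors, and the `shift` and `scale` arguments are placeholders for the Objaverse statistics, which are not given in this thread.

```python
SD_VAE_SCALING_FACTOR = 0.18215  # assumed value of vae.config.scaling_factor


def to_diffusion_space(raw_latents, scaling_factor=SD_VAE_SCALING_FACTOR):
    """Usual convention: multiply raw VAE latents by the scaling factor."""
    return [x * scaling_factor for x in raw_latents]


def relative_scale_when_skipped(scaling_factor=SD_VAE_SCALING_FACTOR):
    """How much larger unscaled latents are than conventionally scaled ones."""
    return 1.0 / scaling_factor


def scale_latents(latents, shift, scale):
    """Shift-and-rescale normalization; shift and scale are placeholder statistics."""
    return [(x - shift) * scale for x in latents]


def unscale_latents(latents, shift, scale):
    """Inverse of scale_latents with the same placeholder statistics."""
    return [x / scale + shift for x in latents]


if __name__ == "__main__":
    print(f"relative scale when the step is skipped: {relative_scale_when_skipped():.2f}")
    raw = [1.0, -2.0, 0.5]
    normalized = scale_latents(raw, shift=0.1, scale=0.5)  # hypothetical statistics
    print(unscale_latents(normalized, shift=0.1, scale=0.5))
```

Skipping the multiplication by roughly 0.18 leaves the condition latents about 1 / 0.18 ≈ 5.5 times larger than they would otherwise be, which matches the rough "factor of 5" wording.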

