
luciddreamer's People

Contributors

abnervictor · eltociear · haodong2000 · yixunliang


luciddreamer's Issues

How to generate a 3D model?

Thanks for sharing this excellent work!
I would like to ask how I can get the generated 3D model afterwards?

Looking forward to your reply!

Error when installing requirements

I was following the instructions and everything installs, but it gives this error at the end:

ERROR: Could not find a version that satisfies the requirement triton (from versions: none)
ERROR: No matching distribution found for triton

Any ideas? I'm using Anaconda on Windows 11, with an RTX 4090.

Code Release & Related Questions

Hi, really great work.

I was wondering whether your codebase is based on threestudio. Also, do you have an intended timescale for the release, and which license do you intend to use?

Some questions about train.py

[screenshot of the errors]
I have installed the GaussianDreamer environment using:

pip install submodules/diff-gaussian-rasterization/
pip install submodules/simple-knn/

But when I run python train.py --opt './configs/bagel.yaml', I get the errors shown in the screenshot above.
For error 1, modifying rendered_image, radii, depth = rasterizer(xxx) into rendered_image, radii, depth, _ = rasterizer(xxx) lets it run,
but then I encountered error 2. Can you give me some advice? Thank you!
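
One version-tolerant workaround for error 1 is to unpack the rasterizer output by slice instead of hard-coding its arity, since builds of diff-gaussian-rasterization differ in whether they return a fourth value. A minimal sketch (the wrapper name is hypothetical):

    def call_rasterizer(rasterizer, *args, **kwargs):
        # Some builds return (image, radii, depth), others add a fourth value;
        # slicing the first three tolerates both instead of hard-coding the arity.
        outputs = rasterizer(*args, **kwargs)
        rendered_image, radii, depth = outputs[:3]
        return rendered_image, radii, depth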

About XFormers

When I run the code, the warning is as follows:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.12.1+cu113)
Python 3.9.16 (you have 3.9.18)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details

But in the environment.yml:

  • cudatoolkit=11.6
  • python=3.9
  • pytorch=1.12.1

The PyTorch version is not the same as the one required by xFormers; will this have an impact on the results?
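
A quick way to see exactly what xFormers is comparing at import time is to print the installed versions (a minimal diagnostic sketch):

    import torch
    import xformers

    print("torch:", torch.__version__)        # 1.12.1+cu113 per environment.yml
    print("CUDA build:", torch.version.cuda)  # CUDA version this PyTorch was compiled with
    print("xformers:", xformers.__version__)  # the wheel built for PyTorch 2.0.1+cu118

As the warning itself states, the consequence of the mismatch is that the accelerated kernels (memory-efficient attention, SwiGLU, etc.) are disabled; training still runs, just without those speedups.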

Some questions about the algorithm

Hi, excellent work with impressive results.

While reading the paper, I had some questions.

In Algorithm 1, lines 6 and 7 use a notation $j$, but the discussion of it is missing. I guess $j = i \delta_S$. Is that right?

Besides, line 3 claims that $t \sim U(1,1000)$. However, it seems that $t$ should be $n \delta_S + \delta_T$.
For example, if $\delta_S=200$ and $\delta_T=50$, then $t$ can only be 50, 250, 450, 650, or 850, rather than uniform on $U(1,1000)$. Is there a problem here?
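
To make the concern concrete, the reachable timesteps under this reading can be enumerated directly (a trivial sketch using the values above):

    # Reachable timesteps under t = n * delta_S + delta_T, with the values above
    delta_S, delta_T = 200, 50
    print([n * delta_S + delta_T for n in range(5)])  # [50, 250, 450, 650, 850]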

Thanks! :)

Supplementary Material Issue

Thanks for the great work!

In the main text, it is mentioned that details are provided in the supplementary material. However, it cannot be found in the arXiv version.

Could you please update it to add the supplementary material? Thx :)

Method Issue

Section 3.2 mentions that, following multi-step DDIM sampling, Eqn. (13) is derived from Eqn. (11).

However, this is quite confusing, since Eqn. (11) seems incorrect.

The DDIM sampling seems to be:

$\frac{\tilde{x}_s}{\sqrt{\overline{\alpha}_s}}=\frac{x_t}{\sqrt{\overline{\alpha}_t}}+(\gamma(s)-\gamma(t)) \epsilon(x_t; y, \phi)$.

Since $\overline{\alpha}_0=1$, it can derive Eqn. (13).

Also, the notation for the sampled latents $\tilde{x}_s, \dots$ is missing.
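
For reference, a minimal sketch of this deterministic ($\sigma = 0$) DDIM update in code; abar_t and abar_s denote the cumulative products $\overline{\alpha}_t$ and $\overline{\alpha}_s$, and dividing the result by $\sqrt{\overline{\alpha}_s}$ recovers the form above with $\gamma(t)=\sqrt{1-\overline{\alpha}_t}/\sqrt{\overline{\alpha}_t}$ (my reading of the notation):

    import torch

    def ddim_step(x_t: torch.Tensor, eps: torch.Tensor, abar_t: float, abar_s: float) -> torch.Tensor:
        # Predict x_0 from the current latent and the noise estimate,
        # then jump deterministically to timestep s (sigma = 0).
        x0_pred = (x_t - (1 - abar_t) ** 0.5 * eps) / abar_t ** 0.5
        return abar_s ** 0.5 * x0_pred + (1 - abar_s) ** 0.5 * eps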

How to find the pretrained diffusion model's config.json file

Firstly, you may need to change model_key: in the configs/<config_file>.yaml to link the local pretrained diffusion model (Stable Diffusion 2.1-base by default)

I wonder how to find the config.json file. After downloading the files to "/home/xxx/LucidDreamer-main/stabilityai/stable-diffusion-2-1-base", I came across this error:
[screenshot of the error]

These are the files in the Stable Diffusion model directory:
[screenshot of the directory listing]
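
In case it helps, one way to materialize the full local layout, model_index.json plus each component's config.json, is to let diffusers download and re-save the pipeline. This is a hedged sketch assuming the repo loads the model through Hugging Face diffusers:

    from diffusers import StableDiffusionPipeline

    # Downloads the full repository layout, including model_index.json and the
    # per-component config.json files, then writes it to the local path.
    pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1-base")
    pipe.save_pretrained("/home/xxx/LucidDreamer-main/stabilityai/stable-diffusion-2-1-base")

model_key in the YAML config can then point at that directory.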

Comparison results of ISM with the multi-step DDIM baseline

Hi, thanks for the awesome work and the code; it has given me lots of valuable insights about SDS!

From the paper, I think the multi-step DDIM baseline can also solve the low feature consistency and low quality of the vanilla SDS loss, and the proposed ISM is a sped-up version of this baseline. How does the proposed ISM compare with the multi-step DDIM baseline in terms of running time and quality of the text-to-3D results?

Question about algorithm

Thank you for your great work! In Algorithm 1, the added noise is fixed (determined by the UNet and x_0). However, in the "train_step_perpneg" function, random noise is added, which differs from Algorithm 1.

How can I generate a zero-shot avatar?

I can't find the training code for zero-shot avatars.
This code is only for heads. Do you have plans to add zero-shot avatar generation?

How to finetune?

I want to load a trained LucidDreamer model and then fine-tune it further. What should I do?

About the reproduction of figs in the paper

Hi,

Thanks for your great work.

I tried to reproduce the figures, such as Fig. 1 in the paper, following the training scripts in ./configs with some modifications, but the results are less than satisfactory.

Could you provide more of the training configs behind Fig. 1 in the paper?

Thanks.

Issues with Dependency Installation and Training for LucidDreamer Gradio Demo

Hello,

I'm currently working with the LucidDreamer project and have been following the installation instructions in the Gradio Demo guide. I would like to report some issues I encountered during this process, along with the solutions that worked for me.

  1. Initial Setup:
    As per the guide's instructions, I started by creating a new Conda environment with the following command:

    conda create -n LD_Demo python=3.9.16 cudatoolkit=11.8 -y

    This step was completed successfully, setting up an environment with Python 3.9.16 and CUDA Toolkit 11.8.

  2. Dependency Installation Issues:
    However, I encountered problems when trying to install specific dependencies:

    pip install git+https://github.com/YixunLiang/diff-gaussian-rasterization.git
    pip install git+https://github.com/YixunLiang/simple-knn.git

    The error message I received was:

    raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
    RuntimeError: The detected CUDA version (12.3) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.

    Although my system has CUDA version 12.3, I had anticipated that creating the Conda environment with cudatoolkit=11.8 would resolve any version conflicts (a quick check for this kind of mismatch is sketched after this list). To address the issue, I had to uninstall and then reinstall PyTorch and its associated libraries within the Conda environment:

    pip uninstall torch torchvision torchaudio
    pip install torch torchvision torchaudio

    After these adjustments, I was able to successfully install the dependencies and run the Gradio demo. It's also worth noting that the command mentioned in the documentation seems to be outdated. The correct command now appears to be python app.py --cuda $LD_CUDA, not gradio.demo.py.

  3. Training Issues:
    During training, I encountered several errors related to xFormers:

    `flshattF` is not supported because:
        xFormers wasn't build with CUDA support
        dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
        Operator wasn't built - see `python -m xformers.info` for more info

    I resolved this issue by following the solution in this thread:

    pip install -U xformers --no-deps -qq
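
As a postscript, the kind of mismatch reported in step 2 can be confirmed before building the extensions by printing what PyTorch was compiled against versus the toolkit the build system will pick up (a minimal diagnostic sketch):

    import torch
    from torch.utils.cpp_extension import CUDA_HOME

    print("torch built with CUDA:", torch.version.cuda)     # 11.7 in the error above
    print("toolkit used for extension builds:", CUDA_HOME)  # resolved to the 12.3 install here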

I hope this information helps in improving the setup process for future users. Any updates to the documentation or advice on these issues would be greatly appreciated.

Thank you for your time and effort in maintaining this project.

Best regards,

leo4life

Inquiry regarding method part

Thanks for sharing the code and presenting such a great paper!

I have a question about how Equation 7 is derived from Equation 5, and how the gradient computation and the additional gamma term enter the equation. Could you please provide some insights or explanations on this?

Thank you!

About the visualization

Thanks for sharing this excellent work!
I would like to ask how I can visualize the Gaussian points and the intermediate rendered results. Are there any relevant tutorials or READMEs?

Looking forward to your reply!
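
A common first step (a hypothetical sketch, assuming open3d is installed and a run has written the point_cloud.ply mentioned in another issue here) is to load the exported point cloud:

    import open3d as o3d

    # Loads only point positions/colors from the exported .ply; the full Gaussian
    # attributes (scales, rotations, SH coefficients) are not rendered this way.
    pcd = o3d.io.read_point_cloud("point_cloud.ply")
    o3d.visualization.draw_geometries([pcd])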

Results Generated by Model on Civitai are blurred and distorted

Hello, I've been attempting to replicate the results from your paper, specifically the prompt "A portrait of Hatsune Miku, robot", using the Civitai model. Unfortunately, the outcomes I'm getting are quite poor and don't resemble the results shown in the paper.

I am unsure whether there is a specific configuration I might be missing. Could you provide a config file that reproduces the results as they appear in the publication?

Thank you very much for your assistance.

[screenshot of the blurred, distorted output]

Question about the DDIM process in the paper

Hi, thanks for the impressive work! I checked Eq.(11) in the latest arXiv paper but found that it is not consistent with the original DDIM process.
I also noticed #15, but there seems to be a mistake there.

In the paper "Denoising diffusion implicit models", Eq.(12) shows:

ddim

Set $\sigma_t=0$, we obtain:
ddim2

However, from the paper and #15, the DDIM process seems to be:

$\frac{x_s}{\sqrt{\overline{\alpha}_s}}=\frac{x_t}{\sqrt{\overline{\alpha}_t}}+(\gamma(s)-\gamma(t))\,\epsilon(x_t; y, \phi)$

I think they are not equivalent. In the original DDIM process, we have:
$x_{t-1} = \frac{\sqrt{\alpha_{t-1}}}{\sqrt{\alpha_{t}}}\cdot x_t + ...$

In the paper's version of DDIM, the coefficient of $x_t$ is:

$\frac{\sqrt{\overline{\alpha}_{t-1}}}{\sqrt{\overline{\alpha}_{t}}}$

But:

$\frac{\sqrt{\overline{\alpha}_{t-1}}}{\sqrt{\overline{\alpha}_{t}}} \neq \frac{\sqrt{\alpha_{t-1}}}{\sqrt{\alpha_{t}}}$

Is there a mistake in the paper, or am I missing something?


Update: Sorry, I checked again and found that $\alpha$ does not share the same meaning in the LucidDreamer paper and the DDIM paper: $\alpha_t$ in the DDIM paper is equal to $\overline{\alpha_t}$ in the LucidDreamer paper. So the DDIM process is equivalent.
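
The update can be double-checked numerically: reading the DDIM paper's $\alpha_t$ as the cumulative product $\overline{\alpha}_t$, the two updates coincide (a small sketch with arbitrary values):

    import torch

    torch.manual_seed(0)
    x_t, eps = torch.randn(4), torch.randn(4)
    abar_t, abar_s = 0.5, 0.8

    # DDIM Eq.(12) with sigma = 0, reading its alpha as the cumulative product abar
    x_s_ddim = abar_s ** 0.5 * (x_t - (1 - abar_t) ** 0.5 * eps) / abar_t ** 0.5 \
               + (1 - abar_s) ** 0.5 * eps

    # LucidDreamer form: x_s / sqrt(abar_s) = x_t / sqrt(abar_t) + (gamma(s) - gamma(t)) * eps
    gamma = lambda a: (1 - a) ** 0.5 / a ** 0.5
    x_s_lucid = abar_s ** 0.5 * (x_t / abar_t ** 0.5 + (gamma(abar_s) - gamma(abar_t)) * eps)

    print(torch.allclose(x_s_ddim, x_s_lucid))  # True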

RuntimeError: numel: integer multiplication overflow

I ran into the following exception after making some modifications to the learning rates.

Training progress:  21%|▍ | 1030/5000 [15:14<1:43:53,  1.57s/it, Loss=0.9026756]Error executing job with overrides: ['+wandb_key=xxx']
Traceback (most recent call last):
  File "/LucidDreamer/train.py", line 622, in main
    training(lp, op, pp, gcp, gp, hg_params, cfg.test_iterations, cfg.save_iterations, cfg.checkpoint_iterations,
  File "/LucidDreamer/train.py", line 349, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background,
  File "/LucidDreamer/gaussian_renderer/__init__.py", line 146, in render
    rendered_image, radii, depth_alpha = rasterizer(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 186, in forward
    return rasterize_gaussians(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 28, in rasterize_gaussians
    return _RasterizeGaussians.apply(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 78, in forward
    num_rendered, color, depth, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
RuntimeError: numel: integer multiplication overflow

Specifically, the optimization params that I use are as follows:

as_latent_ratio: 0.2
densification_interval: 100
densify_from_iter: 100
densify_grad_threshold: 0.00075
densify_until_iter: 3000
feature_lr: 0.01
feature_lr_final: 0.0005
fovy_scale_up_factor:
- 0.75
- 1.1
geo_iter: 0
iterations: 5000
lambda_scale: 0.0
lambda_tv: 0.0
opacity_lr: 0.01
opacity_reset_interval: 300
percent_dense: 0.003
phi_scale_up_factor: 1.5
position_lr_delay_mult: 0.01
position_lr_final: 1.6e-06
position_lr_init: 0.00016
position_lr_max_steps: 30000
pro_frames_num: 600
pro_render_45: false
progressive_view_init_ratio: 0.2
progressive_view_iter: 500
rotation_lr: 0.01
rotation_lr_final: 0.0005
save_process: true
scale_up_cameras_iter: 500
scale_up_factor: 0.95
scaling_lr: 0.01
scaling_lr_final: 0.0005
use_control_net_iter: 10000000
use_progressive: false
warmup_iter: 1500

I've also checked this issue from the original gaussian-splatting repo, but it was of little help: graphdeco-inria/gaussian-splatting#24

I wonder whether similar issues have been encountered before, and what possible methods there are to mitigate this.
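
One hedged hypothesis: this overflow in the rasterizer is often a symptom of the number of Gaussians exploding after densification (plausible once learning rates are raised), so the internal buffers exceed what the kernel can index. A hypothetical probe, assuming the GaussianModel API from the original gaussian-splatting codebase (get_xyz), could be called just before render():

    def log_gaussian_count(iteration, gaussians):
        # A sharply growing count right before the crash implicates the
        # densification settings (densify_grad_threshold, percent_dense)
        # or the raised learning rates rather than the rasterizer itself.
        n = gaussians.get_xyz.shape[0]
        print(f"iter {iteration}: {n} gaussians")
        return n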

How to extract mesh/get normal?

May I ask what the data format is for the final point_cloud_rgb.txt and point_cloud.ply, and whether it is possible to provide code for extracting a mesh / obtaining normals?
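
As a stopgap for the format question, the fields stored in the exported .ply can be listed directly (a minimal sketch assuming the plyfile package; the property names depend on the exporter):

    from plyfile import PlyData

    ply = PlyData.read("point_cloud.ply")
    for element in ply.elements:
        # Prints each element (e.g. 'vertex') and its stored property names
        print(element.name, [prop.name for prop in element.properties])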
