
torch-bakedsdf's People

Contributors

alvaro-budria, bennyguo, hugoycj, mvwouden, sanskar107, stefan-baumann, wangyida, yyeboah

torch-bakedsdf's Issues

Question about training with bakedsdf-blender

Hi, thanks for your great work! When I use the bakedsdf-blender.yaml config parameters and run the following command,
python launch.py --config configs/bakedsdf-colmap.yaml --gpu 0 --train dataset.root_dir=$1 \
    --resume_weights_only --resume latest
the output is incorrect. Can you tell me what might be causing it?

About the loss value

Why is the loss value NaN during training, while the PSNR value during evaluation is normal?

OutOfMemoryError: how to reduce the GPU memory footprint?

When I try to run:
python launch.py --config configs/bakedsdf-colmap.yaml --gpu 0 --train
I get this error on my GeForce RTX 2070 GPU (8 GB of VRAM):

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.10 GiB (GPU 0; 7.79 GiB total capacity; 3.30 GiB already allocated; 341.44 MiB free; 5.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Epoch 0: : 16it [02:12,  8.29s/it, loss=1.99, train/inv_s=20.10, train/num_rays=1226.0]
  1. How can I change the config to reduce the GPU memory footprint?
  2. What is the minimum amount of VRAM required?
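
The error message above itself points at max_split_size_mb and PYTORCH_CUDA_ALLOC_CONF as a way to reduce fragmentation. A minimal sketch of setting it from Python follows, assuming the code runs before PyTorch makes its first CUDA allocation (placing it before `import torch` is the safe choice). The helper name and the value 128 are illustrative assumptions, not project defaults, and this only addresses fragmentation: it does not lower the memory the model actually needs.

```python
import os


def configure_cuda_allocator(max_split_size_mb: int = 128) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF so the caching allocator avoids splitting large blocks.

    Hypothetical helper: call it before torch touches the GPU.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


# Example: apply the setting at the very top of the training entry point.
configure_cuda_allocator(128)
```

The same effect can be had by exporting the variable in the shell before launching training.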

The code is stuck.

Thanks for your great work. I tried to run your code on my data but the code is stuck here:

python launch.py --config configs/bakedsdf-colmap.yaml --gpu 0 --train   dataset.root_dir=./data/360_v2/garden/
Global seed set to 42
/home/zh/.conda/envs/bakedsdf/lib/python3.8/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Using 16bit None Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
[rank: 0] Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------

You are using a CUDA device ('NVIDIA GeForce RTX 4090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name  | Type          | Params
----------------------------------------
0 | model | BakedSDFModel | 41.9 M
----------------------------------------
41.9 M    Trainable params
0         Non-trainable params
41.9 M    Total params
83.865    Total estimated model params size (MB)
Epoch 0: : 0it [00:00, ?it/s]

When I cancel this task, it exits from this line:

Epoch 0: : 0it [00:00, ?it/s]^C^C/home/zh/.conda/envs/bakedsdf/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py:48: UserWarning: Detected KeyboardInterrupt, attempting graceful shutdown...
  rank_zero_warn("Detected KeyboardInterrupt, attempting graceful shutdown...")

This is my environment:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             5.1                       1_gnu    defaults
absl-py                   2.0.0                    pypi_0    pypi
antlr4-python3-runtime    4.9.3                    pypi_0    pypi
ca-certificates           2023.12.12           h06a4308_0    defaults
cachetools                5.3.2                    pypi_0    pypi
certifi                   2023.11.17               pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
contourpy                 1.1.1                    pypi_0    pypi
cycler                    0.12.1                   pypi_0    pypi
filelock                  3.13.1                   pypi_0    pypi
fonttools                 4.47.0                   pypi_0    pypi
fsspec                    2023.12.2                pypi_0    pypi
google-auth               2.26.1                   pypi_0    pypi
google-auth-oauthlib      1.0.0                    pypi_0    pypi
grpcio                    1.60.0                   pypi_0    pypi
idna                      3.6                      pypi_0    pypi
imageio                   2.33.1                   pypi_0    pypi
imageio-ffmpeg            0.4.9                    pypi_0    pypi
importlib-metadata        7.0.1                    pypi_0    pypi
importlib-resources       6.1.1                    pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
kiwisolver                1.4.5                    pypi_0    pypi
ld_impl_linux-64          2.38                 h1181459_1    defaults
libffi                    3.4.4                h6a678d5_0    defaults
libgcc-ng                 11.2.0               h1234567_1    defaults
libgomp                   11.2.0               h1234567_1    defaults
libstdcxx-ng              11.2.0               h1234567_1    defaults
lightning-utilities       0.10.0                   pypi_0    pypi
markdown                  3.5.1                    pypi_0    pypi
markdown-it-py            3.0.0                    pypi_0    pypi
markupsafe                2.1.3                    pypi_0    pypi
matplotlib                3.7.4                    pypi_0    pypi
mdurl                     0.1.2                    pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    defaults
nerfacc                   0.3.3                    pypi_0    pypi
networkx                  3.1                      pypi_0    pypi
ninja                     1.11.1.1                 pypi_0    pypi
numpy                     1.24.4                   pypi_0    pypi
nvidia-cublas-cu12        12.1.3.1                 pypi_0    pypi
nvidia-cuda-cupti-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-nvrtc-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-runtime-cu12  12.1.105                 pypi_0    pypi
nvidia-cudnn-cu12         8.9.2.26                 pypi_0    pypi
nvidia-cufft-cu12         11.0.2.54                pypi_0    pypi
nvidia-curand-cu12        10.3.2.106               pypi_0    pypi
nvidia-cusolver-cu12      11.4.5.107               pypi_0    pypi
nvidia-cusparse-cu12      12.1.0.106               pypi_0    pypi
nvidia-nccl-cu12          2.18.1                   pypi_0    pypi
nvidia-nvjitlink-cu12     12.3.101                 pypi_0    pypi
nvidia-nvtx-cu12          12.1.105                 pypi_0    pypi
oauthlib                  3.2.2                    pypi_0    pypi
omegaconf                 2.2.3                    pypi_0    pypi
opencv-python             4.9.0.80                 pypi_0    pypi
openssl                   3.0.12               h7f8727e_0    defaults
packaging                 23.2                     pypi_0    pypi
pillow                    10.2.0                   pypi_0    pypi
pip                       23.3.1           py38h06a4308_0    defaults
protobuf                  4.25.1                   pypi_0    pypi
pyasn1                    0.5.1                    pypi_0    pypi
pyasn1-modules            0.3.0                    pypi_0    pypi
pybind11                  2.11.1                   pypi_0    pypi
pygments                  2.17.2                   pypi_0    pypi
pymcubes                  0.1.4                    pypi_0    pypi
pyparsing                 3.1.1                    pypi_0    pypi
pyransac3d                0.6.0                    pypi_0    pypi
python                    3.8.18               h955ad1f_0    defaults
python-dateutil           2.8.2                    pypi_0    pypi
pytorch-lightning         1.9.5                    pypi_0    pypi
pyyaml                    6.0.1                    pypi_0    pypi
readline                  8.2                  h5eee18b_0    defaults
requests                  2.31.0                   pypi_0    pypi
requests-oauthlib         1.3.1                    pypi_0    pypi
rich                      13.7.0                   pypi_0    pypi
rsa                       4.9                      pypi_0    pypi
scipy                     1.10.1                   pypi_0    pypi
setuptools                68.2.2           py38h06a4308_0    defaults
six                       1.16.0                   pypi_0    pypi
sqlite                    3.41.2               h5eee18b_0    defaults
sympy                     1.12                     pypi_0    pypi
tensorboard               2.14.0                   pypi_0    pypi
tensorboard-data-server   0.7.2                    pypi_0    pypi
tinycudann                1.7                      pypi_0    pypi
tk                        8.6.12               h1ccaba5_0    defaults
torch                     2.1.2                    pypi_0    pypi
torch-efficient-distloss  0.1.3                    pypi_0    pypi
torchmetrics              1.2.1                    pypi_0    pypi
torchvision               0.16.2                   pypi_0    pypi
tqdm                      4.66.1                   pypi_0    pypi
trimesh                   4.0.8                    pypi_0    pypi
triton                    2.1.0                    pypi_0    pypi
typing-extensions         4.9.0                    pypi_0    pypi
urllib3                   2.1.0                    pypi_0    pypi
werkzeug                  3.0.1                    pypi_0    pypi
wheel                     0.41.2           py38h06a4308_0    defaults
xz                        5.4.5                h5eee18b_0    defaults
zipp                      3.17.0                   pypi_0    pypi
zlib                      1.2.13               h5eee18b_0    defaults

Could you give me a hand : )
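
One way to see where a run like this is blocked, without waiting for Ctrl-C, is Python's standard faulthandler module, which can dump the stack of every thread after a timeout. A minimal sketch, assuming it is called near the top of the training script; the helper names and the log file name are hypothetical and not part of torch-bakedsdf.

```python
import faulthandler
import os
import tempfile


def watch_for_hang(log_path: str, timeout_s: float = 120.0):
    """Dump all thread stacks to log_path every timeout_s seconds until cancelled."""
    # The file must stay open for as long as the watchdog is armed.
    log_file = open(log_path, "w")
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    return log_file


def stop_watching(log_file) -> None:
    """Cancel the pending dump and close the log file."""
    faulthandler.cancel_dump_traceback_later()
    log_file.close()


# Usage: arm the watchdog, run training, then disarm it.
log_path = os.path.join(tempfile.gettempdir(), "bakedsdf_hang.log")
log_file = watch_for_hang(log_path, timeout_s=120.0)
# ... start training here ...
stop_watching(log_file)
```

If the process is still sitting at `0it` after the timeout, the log shows which call each thread is waiting in, for example a data loader or an extension build.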

Error when running "python BakedSDF2FBX.py glbfile"

Steps:

  1. python launch.py --config configs/bakedsdf-blender.yaml --gpu 0 --train --exp_dir /home/disk2/out/torch-bakedsdf/ dataset.root_dir=/home/disk2/data/nerf_synthetic/lego/ name=nues-blender-lego
  2. python export.py --exp_dir /home/disk2/out/torch-bakedsdf/nues-blender-lego/@20231123-153121/ --output-dir /home/disk2/out/torch-bakedsdf/nues-blender-lego/@20231123-153121/
  3. python BakedSDF2FBX.py /home/disk2/out/torch-bakedsdf/nues-blender-lego/@20231123-153121/nues-blender-lego.glb

Output: [screenshot in the original issue; not preserved here]

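How to learn only the appearance for a mesh reconstructed by another method

Hello,
For a mesh reconstructed by another method, such as a mesh model reconstructed with COLMAP, how can its appearance information be learned in order to generate a texture for the mesh?
In other words, given a mesh and COLMAP poses as input, how can torch-bakedsdf be used to produce a textured mesh model?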

The install command does not appear to work properly

pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install -r requirements.txt

gives me the following output:

Collecting git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
  Cloning https://github.com/NVlabs/tiny-cuda-nn/ to c:\users\sydian\appdata\local\temp\pip-req-build-jx3n58h2
  Running command git clone --filter=blob:none --quiet https://github.com/NVlabs/tiny-cuda-nn/ 'C:\Users\Sydian\AppData\Local\Temp\pip-req-build-jx3n58h2'
  Resolved https://github.com/NVlabs/tiny-cuda-nn/ to commit e02068459c4c36ba8b6fc40e312a301dca39ce44
  Running command git submodule update --init --recursive -q
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      C:\Users\Sydian\AppData\Local\Temp\pip-req-build-jx3n58h2\bindings/torch\setup.py:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
        from pkg_resources import parse_version
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\Users\Sydian\AppData\Local\Temp\pip-req-build-jx3n58h2\bindings/torch\setup.py", line 9, in <module>
      ModuleNotFoundError: No module named 'torch'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.        
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

This is a newly created conda environment with Python 3.10.
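
The traceback above ends in `ModuleNotFoundError: No module named 'torch'`, raised from the tiny-cuda-nn `setup.py`, which imports torch while it runs. In a freshly created environment that suggests PyTorch has not been installed yet. A small hedged check, using only the standard library, that can be run before retrying the install; the helper name is hypothetical.

```python
import importlib.util


def missing_build_prereqs(modules=("torch",)):
    """Return the names from `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = missing_build_prereqs()
if missing:
    print("Install these first, then retry the tiny-cuda-nn build:", ", ".join(missing))
else:
    print("torch is importable in this environment.")
```

If torch is reported missing, installing a PyTorch build that matches the local CUDA toolkit before the two pip commands is the likely fix.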

Error while building nerfacc

I get the following error when I train the model using python launch.py --config configs/neus-colmap.yaml --gpu 0 --train:

nvcc fatal   : Value 'c++17' is not defined for option 'std'
ninja: build stopped: subcommand failed.
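
nvcc accepts `-std=c++17` starting with CUDA 11.0, so this message usually means the nvcc found on PATH belongs to an older toolkit. A minimal sketch that checks the text printed by `nvcc --version`; the function name is hypothetical, and treating an old toolkit as the cause here is an assumption, since the issue does not state the CUDA version.

```python
import re


def nvcc_supports_cpp17(version_text: str) -> bool:
    """Return True if the `nvcc --version` text reports CUDA release 11.0 or newer."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("could not find a 'release X.Y' field in the nvcc output")
    major = int(match.group(1))
    return major >= 11


# Example with the last line that `nvcc --version` typically prints.
print(nvcc_supports_cpp17("Cuda compilation tools, release 10.2, V10.2.89"))
```

To use it on a real machine, pass in the captured output of `nvcc --version`, for instance via `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.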

command fails

Hi, and thank you for making this code available.

I am trying to run on the mip360 garden dataset, and when I run:

python launch.py --config configs/neus-colmap.yaml --gpu 0 --train dataset.root_dir=D://NERF//BakedSDF//torch-bakedsdf-main//torch-bakedsdf-main//load//unbounded360//garden//

I see this error:

D:\NERF\BakedSDF\torch-bakedsdf-main\torch-bakedsdf-main>python launch.py --config configs/neus-colmap.yaml --gpu 0 --train
Global seed set to 42
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
fatal: not a git repository (or any of the parent directories): .git
D:\NERF\BakedSDF\torch-bakedsdf-main\torch-bakedsdf-main\utils\callbacks.py:76: UserWarning: Code snapshot is not saved. Please make sure you have git installed and are in a git repository.
  rank_zero_warn("Code snapshot is not saved. Please make sure you have git installed and are in a git repository.")

  | Name  | Type      | Params
------------------------------------
0 | model | NeuSModel | 28.0 M
------------------------------------
28.0 M    Trainable params
0         Non-trainable params
28.0 M    Total params
55.913    Total estimated model params size (MB)
Traceback (most recent call last):
  File "D:\NERF\BakedSDF\torch-bakedsdf-main\torch-bakedsdf-main\launch.py", line 130, in <module>
    main()
  File "D:\NERF\BakedSDF\torch-bakedsdf-main\torch-bakedsdf-main\launch.py", line 119, in main
    trainer.fit(system, datamodule=dm)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 770, in fit
    self._call_and_handle_interrupt(
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 723, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 811, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1236, in _run
    results = self._run_stage()
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1323, in _run_stage
    return self._run_train()
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1353, in _run_train
    self.fit_loop.run()
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\fit_loop.py", line 266, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 208, in advance
    batch_output = self.batch_loop.run(batch, batch_idx)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\batch\training_batch_loop.py", line 88, in advance
    outputs = self.optimizer_loop.run(split_batch, optimizers, batch_idx)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\base.py", line 204, in run
    self.advance(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 203, in advance
    result = self._run_optimization(
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 256, in _run_optimization
    self._optimizer_step(optimizer, opt_idx, batch_idx, closure)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 369, in _optimizer_step
    self.trainer._call_lightning_module_hook(
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1595, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\core\lightning.py", line 1646, in optimizer_step
    optimizer.step(closure=optimizer_closure)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\core\optimizer.py", line 168, in step
    step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\strategies\strategy.py", line 193, in optimizer_step
    return self.precision_plugin.optimizer_step(model, optimizer, opt_idx, closure, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\plugins\precision\native_amp.py", line 85, in optimizer_step
    closure_result = closure()
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 148, in __call__
    self._result = self.closure(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 134, in closure
    step_output = self._step_fn()
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\optimization\optimizer_loop.py", line 427, in _training_step
    training_step_output = self.trainer._call_strategy_hook("training_step", *step_kwargs.values())
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1765, in _call_strategy_hook
    output = fn(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\strategies\dp.py", line 125, in training_step
    return self.model(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\parallel\data_parallel.py", line 169, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\overrides\data_parallel.py", line 64, in forward
    output = super().forward(*inputs, **kwargs)
  File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\overrides\base.py", line 82, in forward
    output = self.module.training_step(*inputs, **kwargs)
  File "D:\NERF\BakedSDF\torch-bakedsdf-main\torch-bakedsdf-main\systems\neus.py", line 95, in training_step
    train_num_rays = int(self.train_num_rays * (self.train_num_samples / out['num_samples_full'].sum().item()))
ZeroDivisionError: division by zero
Epoch 0: : 0it [02:25, ?it/s]
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

What might be causing this?

Thank you!
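
The last frame of the traceback divides by `out['num_samples_full'].sum().item()`, so the exception means no samples at all were produced for that batch. Note also that the command echoed in the log does not include the `dataset.root_dir` override, which may be worth checking. A minimal sketch of guarding the division follows; the function and its arguments are hypothetical stand-ins that mirror the expression in the traceback, and the guard only avoids the crash without explaining why the sample count is zero.

```python
def next_train_num_rays(train_num_rays: int, train_num_samples: int, num_samples_full: int) -> int:
    """Rescale the ray batch size so the sample count stays near train_num_samples."""
    if num_samples_full <= 0:
        # No samples were produced for this batch: keep the current ray count
        # instead of dividing by zero.
        return train_num_rays
    return int(train_num_rays * (train_num_samples / num_samples_full))


# With zero samples the ray count is left unchanged.
print(next_train_num_rays(256, 1024, 0))
```

A persistent zero usually deserves investigation upstream, for example the dataset path, the camera poses, or the scene bounds.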
