
nex-code's Introduction

NeX: Real-time View Synthesis with Neural Basis Expansion

Open NeX in Colab

NeX

We present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce NeXt-level view-dependent effects---in real time. Unlike traditional MPI that uses a set of simple RGBα planes, our technique models view-dependent effects by instead parameterizing each pixel as a linear combination of basis functions learned from a neural network. Moreover, we propose a hybrid implicit-explicit modeling strategy that improves upon fine detail and produces state-of-the-art results. Our method is evaluated on benchmark forward-facing datasets as well as our newly-introduced dataset designed to test the limit of view-dependent modeling with significantly more challenging effects such as the rainbow reflections on a CD. Our method achieves the best overall scores across all major metrics on these datasets with more than 1000× faster rendering time than the state of the art.


Getting started

conda env create -f environment.yml
./download_demo_data.sh
conda activate nex
python train.py -scene data/crest_demo -model_dir crest -http
tensorboard --logdir runs/

Installation

We provide environment.yml to help you set up a conda environment.

conda env create -f environment.yml

Dataset

Shiny dataset

Download: Shiny dataset.

We provide 2 directories named shiny and shiny_extended.

  • shiny contains benchmark scenes used to report the scores in our paper.
  • shiny_extended contains additional challenging scenes used on our project page and in our video.

NeRF's real forward-facing dataset

Download: Undistorted front facing dataset

For the real forward-facing dataset, NeRF is trained on the raw images, which may contain lens distortion. We instead use the undistorted images provided by COLMAP.

However, you can try running other scenes from Local Lightfield Fusion (e.g., airplant) without any changes to the dataset files. In this case, the images are not automatically undistorted.

Deepview's spaces dataset

Download: Modified spaces dataset

We slightly modified the file structure of Spaces dataset in order to determine the plane placement and split train/test sets.

Using your own images

To run NeX on your own images, you need to install COLMAP on your machine.

Then, put your images into a directory with the following structure:

<scene_name>
|-- images
     | -- image_name1.jpg
     | -- image_name2.jpg
     ...

The training code will automatically prepare a scene for you. You may have to tune planes.txt to get a better reconstruction (see the dataset explanation).

Training

Run with the paper's config

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http

This implementation uses scikit-image to resize images during training by default. The results and scores in the paper are generated using OpenCV's resize function. If you want the same behavior, please add the -cv2resize argument.
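For reference, here is a minimal sketch of the two resize paths (not the project's exact code; the filename and target size are placeholders):

import cv2
import numpy as np
from skimage.transform import resize as sk_resize

img = cv2.imread('image_name1.jpg')  # BGR; channel order does not matter for this comparison
h, w = 756, 1008                     # placeholder target size

# default path: scikit-image returns floats in [0, 1] and applies anti-aliasing when downscaling
img_sk = sk_resize(img, (h, w), anti_aliasing=True)

# -cv2resize path: OpenCV returns uint8 and uses bilinear interpolation without anti-aliasing
img_cv = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)

# the two results differ slightly, which can shift PSNR/SSIM by a small margin
print(np.abs(img_sk * 255 - img_cv.astype(np.float64)).mean())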

Note that this code has been tested on an Nvidia V100 32GB and on 4x RTX 2080Ti GPUs.

For a GPU/GPUs with less memory (e.g., a single RTX 2080Ti), you can run using the following command:

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http -layers 12 -sublayers 6 -hidden 256

Note that if your GPU runs out of memory, you can try reducing the number of layers, sublayers, and sampled rays.
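For example (the sampled-ray count appears to be exposed via the -ray argument, as mentioned in the issues below; check python train.py -h for the exact flag name):

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http -layers 12 -sublayers 6 -hidden 256 -ray 4000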

Rendering

To generate a WebGL viewer and a video result, run:

python train.py -scene ${scene} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -predict -http

Video rendering

To generate a video that matches the real forward-facing rendering path, add the -nice_llff argument, or -nice_shiny for the Shiny dataset.
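For example, combining the flags above to render the LLFF-style path together with the WebGL viewer:

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -predict -nice_llff -http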

Citation

@inproceedings{Wizadwongsa2021NeX,
    author = {Wizadwongsa, Suttisak and Phongthawee, Pakkapon and Yenphraphai, Jiraphon and Suwajanakorn, Supasorn},
    title = {NeX: Real-time View Synthesis with Neural Basis Expansion},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year = {2021},
}

Visit us 🦉

Vision & Learning Laboratory VISTEC - Vidyasirimedhi Institute of Science and Technology

nex-code's People

Contributors

pureexe


nex-code's Issues

Noticeable difference in brightness between ground truth images in shiny dataset, and those in evaluation folders

Hi! While poking around in the data for the cd shiny sequence, I noticed that there is a difference in brightness between the ground truth heldout images provided in the evaluation folder, and those provided in the dataset folder (it's a lot more obvious if you overlay the images and cycle between them). The same brightness discrepancy appears for most of the other sequences as well. I'm wondering if you can tell me why this is the case. Additionally, after running NeRF on your data and writing out the ground truth images, I get images that look identical to those in the dataset folder, but not the evaluation folder:

(CD Evaluation Folder)
000_gt_shiny

(CD Dataset Folder)
frame_0001

(NeRF Ground Truth)
000_gt

Thanks in advance!

Low contrast outputs

Training seems to work fine with the preview image in tensorboard getting sharper over time.

However, when I run with the -predict flag, the evaluation, video frames, and realtime demo all show up as grey images with low-contrast aberrations. During saving, I get a lot of warnings like "UserWarning: {path}_ours.png is a low contrast image".

I'm not sure what to do from here.

[screenshots of the low-contrast output]

Pose conversion, resize, and camera principal point

  1. I was curious about the nerf_pose_to_ours function, and I read the article below. But there are still some things I don't understand. #13

As I understand the Pose value is the Camera to world matrix, and each column represents the x-axis, y-axis, z-axis, and location of the camera in the world coordinate system.

If I want to change the camera's coordinate axes from the OpenGL convention (right, up, backward) to the OpenCV convention (right, down, forward), the pose values [r1, r2, r3, t] become [r1, -r2, -r3, t], don't they? Why does the translation change? Isn't the world coordinate system fixed?

The poses_bounds.npy file stores the camera coordinate axes as (down, right, backward). When you change this from the NeRF to the OpenGL coordinate system (right, up, backward), don't you do it like the following? Why is the method of changing the coordinate axes in the nerf_to_our_pose function different from the method below?

nex-code/utils/load_llff.py

Lines 245 to 254 in eeff38c

def load_llff_data(basedir, factor=8, recenter=True, bd_factor=.75, spherify=False, path_zflat=False, split_train_val = 0, render_style=''):
# poses, bds, imgs = _load_data(basedir, factor=factor) # factor=8 downsamples original imgs by 8x
poses, bds, intrinsic = _load_data(basedir, factor=factor, load_imgs=False) # factor=8 downsamples original imgs by 8x
print('Loaded', basedir, bds.min(), bds.max())
# Correct rotation matrix ordering and move variable dim to axis 0
#poses [R | T] [3, 4, images]
poses = np.concatenate([poses[:, 1:2, :], -poses[:, 0:1, :], poses[:, 2:, :]], 1)

Does it even matter for the world coordinate system whether we use the OpenCV or the OpenGL convention? Isn't the world coordinate system determined independently? Maybe it's because the nerf_to_our_pose function is applied after recentering? I'm confused.

  2. It makes sense to multiply the focal length by the scale factor when resizing the image. By the way, why add 0.5 to the principal point, multiply, and then subtract 0.5 again? Can't we just multiply by sw? I'm curious about the hidden meaning here.

nex-code/utils/sfm_utils.py

Lines 188 to 191 in eeff38c

cam['fx'] = ocam['fx'] * sw
cam['fy'] = ocam['fy'] * sh
cam['px'] = (ocam['px']+0.5) * sw - 0.5
cam['py'] = (ocam['py']+0.5) * sh - 0.5
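(For what it's worth, one common reading of this formula, sketched below, is that it converts the principal point from integer pixel-index coordinates to continuous corner-origin coordinates before scaling and back afterwards, so pixel centers stay aligned; this is my interpretation, not an official answer.)

# Hypothetical illustration of the pixel-center convention behind px' = (px + 0.5) * sw - 0.5.
# If integer pixel indices refer to pixel centers, the continuous coordinate measured from the
# image corner is x_corner = x_index + 0.5. Resizing by a factor sw scales corner-origin
# coordinates linearly, so the resized index is (x_index + 0.5) * sw - 0.5.

def scale_principal_point(px, sw):
    return (px + 0.5) * sw - 0.5

# Halving a 1008-wide image (sw = 0.5): the geometric image center (index 503.5) maps exactly
# to the center of the 504-wide result (index 251.5), whereas plain multiplication by sw
# would give 251.75 and introduce a small shift.
print(scale_principal_point(503.5, 0.5))  # 251.5
print(503.5 * 0.5)                        # 251.75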

adjust the number of rays

Hi, how do I adjust the number of rays in each step of the rendering phase? I see that the ray argument is only used in training.
[screenshot]

Enquiry related to the normalization

Dear author,

Reading your paper was an enriching, if somewhat overwhelming, experience.

I have some confusion regarding your code and would appreciate it if you could explain.

In poses_avg, the code calculates the new center of the world and gets a normalized vector, vec2.

Then, in the viewmatrix function, it normalizes z (vec2) a second time, and vec0 and vec1 are obtained via cross products and normalization.

Why are the cross products necessary? Could we simply normalize the vectors without them?

Thanks

def viewmatrix(z, up, pos):
    vec2 = normalize(z)
    vec1_avg = up
    vec0 = normalize(np.cross(vec1_avg, vec2))
    vec1 = normalize(np.cross(vec2, vec0))
    m = np.stack([vec0, vec1, vec2, pos], 1)
    return m

def poses_avg(poses):
    # poses [images, 3, 4] not [images, 3, 5]
    # hwf = poses[0, :3, -1:],

    center = poses[:, :3, 3].mean(0)
    vec2 = normalize(poses[:, :3, 2].sum(0))
    up = poses[:, :3, 1].sum(0)
    c2w = np.concatenate([viewmatrix(vec2, up, center)], 1)
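(One way to see the role of the cross products, sketched below with made-up numbers: the averaged forward and up directions are generally not perpendicular, so normalizing them alone would not give a valid rotation; the cross products re-orthogonalize the frame. This is my reading, not an official answer.)

import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

z = normalize(np.array([0.1, 0.2, 1.0]))  # averaged forward direction
up = np.array([0.0, 1.0, 0.3])            # averaged up direction, not perpendicular to z

x = normalize(np.cross(up, z))            # first cross product: axis perpendicular to both
y = normalize(np.cross(z, x))             # second cross product: re-orthogonalized up
R = np.stack([x, y, z], 1)

print(np.round(R.T @ R, 6))               # identity: the columns form an orthonormal basis
print(np.dot(normalize(up), z))           # nonzero: up and z alone are not orthogonal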

Pre-trained models?

Will there be plans to release the pre-trained models on all three dataset?

low quality outputs

Hello, and thanks for sharing this nice work! I've been trying to get better outputs from NeX, but all I got was low-quality output.

I trained on 2K images with this command:

python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http -layers 12 -sublayers 6 -hidden 256

but I got output images like these (the images that are automatically created in the video_output directory after training).

Even though I created a dataset following the image collection method shown in LLFF and successfully installed COLMAP and the other requirements, I have no idea what's wrong.

If there is any other way to get better outputs, please help me out. :)

My environment is as follows:
RTX-2080Ti
Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Ubuntu 18.04.
CUDA Toolkit 10.2

Thanks

[Training] AssertionError: Torch not compiled with CUDA enabled

Hi,

I am trying to run NeX on my machine (Windows, RTX 2080).

After I created the conda env via environment.yml, I first received this error when running the training code:

  1. OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
    OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/

Then I set the variable KMP_DUPLICATE_LIB_OK to true and reran. Now I received a second error:
  2. AssertionError: Torch not compiled with CUDA enabled.

I looked online, and everything suggested installing torch and torchvision via conda, but in environment.yml the authors note that installing via conda gives an error about CUDA, so torch is installed via pip instead.

Any idea?
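(A quick sanity check, independent of this repo, that shows whether the installed torch build actually has CUDA support:)

import torch
print(torch.__version__)          # a '+cpu' suffix indicates a CPU-only build
print(torch.version.cuda)         # None for CPU-only builds
print(torch.cuda.is_available())  # must be True before training can run on the GPU

If this prints False, reinstalling a CUDA-enabled torch wheel that matches your local CUDA version usually resolves this error.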

MPI generation at multiple resolutions

Hi,

Great work and thanks for making the code available.

I have trained a model at a resolution of 1008px, but I can't find a way to generate MPIs at lower resolutions from that high-res model (for example, 800px and 640px for mobile viewing).

How did you approach this when you created your examples for Mobile, Low, High, and VR viewing?

Have you generated a model for each specific resolution or is there another way to generate the MPI without training several models?

Thanks in advance!

CUDA problem when following the `Getting started` steps

When I follow the steps:

conda env create -f environment.yml
./download_demo_data.sh
conda activate nex
python train.py -scene data/crest_demo -model_dir crest -http

I hit the following error, and I don't know what's happening:

"train.py" in <module>
  751:  train()
  train.py
"train.py" in train
  633:  output = model(dataset.sfm, feature, output_shape, sel)
  train.py
"module.py" in _call_impl
  889:  result = self.forward(*input, **kwargs)
"train.py" in forward
  334:  warp, ref_coords = computeHomoWarp(sfm,
  train.py
"train.py" in computeHomoWarp
  158:  prod = coords @ pt.transpose(Hs, 1, 2).cuda()
  train.py
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

360 degree is possible?

Hi,

Is it possible to train and view images captured 360 degrees around a subject (i.e., scenes and objects captured from all around)?

Enquiry regarding the calculation of evaluation results

Dear author,

Thanks for providing us such an excellent work.

I found that on the scene seasoning, the PSNR evaluated on 8 views is 0.3 dB higher than the result shown in the paper.

I want to confirm with you the calculation method of your evaluation results.

Do you use 8 views per scene to calculate the average PSNR?

Thanks

License for Shiny

Hi, I could not find a license file for the Shiny dataset and was wondering whether the MIT license applies to the dataset as well, or whether there are restrictions that limit the dataset to non-commercial research use.

COLMAP

If I install the prebuilt binary, how do I use it and tell the code that it is COLMAP? Every time I try to run it, I just get "No COLMAP found in this machine".

shiny dataset

Can someone provide a Google Drive or Baidu Netdisk version of the Shiny dataset? OneDrive sucks.
Thanks!!!

Colab custom image demo error

Hello, Thank you for your masterpiece.

The following error occurs when I train your model from colab to custom image.

My video's resolution is 1280 x 720

How to solve that?

Thanks.


Loaded data/demo 1.435987851575115 25.984873491396872
recentered (3, 4)
Overriding offset 200-> 80
dmin = 7.938341, dmax = 20.183123, invz = 0, offset = 80
TRAINING IMAGES: 37
VALIDATE IMAGES: 6
Basis Network: DataParallel(
(module): ReluMLP(
(activation): LeakyReLU(negative_slope=0.01)
(seq1): Sequential(
(0): Linear(in_features=12, out_features=64, bias=True)
(1): LeakyReLU(negative_slope=0.01)
(2): Linear(in_features=64, out_features=64, bias=True)
(3): LeakyReLU(negative_slope=0.01)
(4): Linear(in_features=64, out_features=8, bias=True)
)
)
)
Mpi Size: torch.Size([12, 3, 385, 560])
All combined layers: 72
[ 7.93834066 8.11080238 8.2832641 8.45572582 8.62818754 8.80064925
8.97311097 9.14557269 9.31803441 9.49049613 9.66295784 9.83541956
10.00788128 10.180343 10.35280471 10.52526643 10.69772815 10.87018987
11.04265159 11.2151133 11.38757502 11.56003674 11.73249846 11.90496018
12.07742189 12.24988361 12.42234533 12.59480705 12.76726877 12.93973048
13.1121922 13.28465392 13.45711564 13.62957735 13.80203907 13.97450079
14.14696251 14.31942423 14.49188594 14.66434766 14.83680938 15.0092711
15.18173282 15.35419453 15.52665625 15.69911797 15.87157969 16.04404141
16.21650312 16.38896484 16.56142656 16.73388828 16.90634999 17.07881171
17.25127343 17.42373515 17.59619687 17.76865858 17.9411203 18.11358202
18.28604374 18.45850546 18.63096717 18.80342889 18.97589061 19.14835233
19.32081405 19.49327576 19.66573748 19.8381992 20.01066092 20.18312263]
Using inverse depth: False, Min depth: 7.938340663909912, Max depth: 20.183122634887695
Layer of MLP: 6
Hidden Channel of MLP: 128
Main Network DataParallel(
(module): VanillaMLP(
(activation): LeakyReLU(negative_slope=0.01)
(seq1): Sequential(
(0): Linear(in_features=50, out_features=128, bias=True)
(1): LeakyReLU(negative_slope=0.01)
(2): Linear(in_features=128, out_features=128, bias=True)
(3): LeakyReLU(negative_slope=0.01)
(4): Linear(in_features=128, out_features=128, bias=True)
(5): LeakyReLU(negative_slope=0.01)
(6): Linear(in_features=128, out_features=128, bias=True)
(7): LeakyReLU(negative_slope=0.01)
(8): Linear(in_features=128, out_features=128, bias=True)
(9): LeakyReLU(negative_slope=0.01)
(10): Linear(in_features=128, out_features=25, bias=True)
)
)
)
"train.py" in
785: train()
train.py
"train.py" in train
576: start_epoch = loadFromCheckpoint(ckpt, model, optimizer)
train.py
"train.py" in loadFromCheckpoint
743: model.load_state_dict(checkpoint['model_state_dict'])
train.py
"module.py" in load_state_dict
1224: self.__class__.__name__, "\n\t".join(error_msgs)))
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py
RuntimeError: Error(s) in loading state_dict for Network:
size mismatch for mpi_c: copying a param with shape torch.Size([12, 3, 460, 560]) from checkpoint, the shape in current model is torch.Size([12, 3, 385, 560]).
CPU times: user 38.2 ms, sys: 8.82 ms, total: 47.1 ms
Wall time: 5.15 s

How to determine the parameters in the plane.txt?

Hi,

I am trying to reproduce the colmap results in your shiny datasets. I tried the exact same command for the scene "food". It generates 'hwf_cxcy.npy' and 'poses_bounds.npy', which are similar to your results.

However, how do I determine the 4 parameters in 'plane.txt'? For example, in your 'plane.txt', the four numbers for this scene are:

2.6 100 1 300

Where do these numbers come from? My colmap result shows:

Post-colmap
Images # 49
Points (50725, 3) Visibility (50725, 49)
Depth stats -0.06422890217140988 273.68360285004354 25.033925606886992

How do I determine the first two numbers in your 'plane.txt' based on this information? And what about the last two numbers?

Moreover, I observe that for some other scenes the 'plane.txt' contains only 3 numbers. For example, for the scene 'lab', the numbers in the file are:

46 206 1

How do I deal with the missing 4th value? What does it mean?

Thank you very much!

3d Reconstruction

Hello, could this work also be used for 3D object reconstruction, like NeRF, which constructs a neural radiance field and extracts a colored mesh?

What is correct loss drop please?

My laptop's performance is poor, so I use '-layers 12' and '-sublayers 2'.
But the output image is all white, and my loss did not drop significantly.
So what should a correct loss drop look like, please?

rendered_val\images does not exist

"train.py" in
770: train()
train.py
"train.py" in train
693: generateAlpha(model, dataset, dataloader_val, writer, runpath, epoch)
train.py
"train.py" in generateAlpha
516: out = evaluation(model,
train.py
"mpi_utils.py" in evaluation
280: info_dict = measurement(model, dataset.sfm, dataloader, ray, webpath + outputFile + '/rendered_val')
C:\Files JuanVi\2. Proyectos\ML\nex-code-main\utils\mpi_utils.py
"mpi_utils.py" in measurement
412: io.imsave(os.path.join(write_path, '{}_ours.png'.format(filename)), (predict_image * 255).astype(np.uint8))
C:\Files JuanVi\2. Proyectos\ML\nex-code-main\utils\mpi_utils.py
"_io.py" in imsave
136: return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)
C:\Users\capit\anaconda3\lib\site-packages\skimage\io_io.py
"manage_plugins.py" in call_plugin
207: return func(*args, **kwargs)
C:\Users\capit\anaconda3\lib\site-packages\skimage\io\manage_plugins.py
"functions.py" in imwrite
303: writer = get_writer(uri, format, "i", **kwargs)
C:\Users\capit\anaconda3\lib\site-packages\imageio\core\functions.py
"functions.py" in get_writer
217: request = Request(uri, "w" + mode, **kwargs)
C:\Users\capit\anaconda3\lib\site-packages\imageio\core\functions.py
"request.py" in init
124: self._parse_uri(uri)
C:\Users\capit\anaconda3\lib\site-packages\imageio\core\request.py
"request.py" in _parse_uri
265: raise FileNotFoundError("The directory %r does not exist" % dn)
C:\Users\capit\anaconda3\lib\site-packages\imageio\core\request.py
FileNotFoundError: The directory 'C:\Files JuanVi\2. Proyectos\ML\nex-code-main\runs\evaluation\crest\000200\rendered_val\images' does not exist

This error popped up after training for an hour or so. I launched training with
python train.py -scene data\crest_demo -model_dir crest -http -layers 6 -sublayers 2 -hidden 16 -num_workers 4
on Windows 10.

Any ideas on how to fix this are welcome. I can confirm that the images folder does not exist inside the rendered_val folder.
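(A possible workaround, not an official fix: create the output directory right before saving, e.g. in utils/mpi_utils.py around the imsave call shown in the traceback; variable names below are taken from that traceback.)

import os

out_path = os.path.join(write_path, '{}_ours.png'.format(filename))
os.makedirs(os.path.dirname(out_path), exist_ok=True)  # create rendered_val/images if it is missing
io.imsave(out_path, (predict_image * 255).astype(np.uint8))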

[ValueError: invalid dmin dmax] during training.

I took 24 photos of a scene with similar orientations and distances. However, when I tried to train the model, it crashed and threw an exception:
"train.py" in <module> 785: train()
"train.py" in train 559: ), dataset.sfm)
"train.py" in __init__ 312: raise ValueError("invalid dmin dmax")
ValueError: invalid dmin dmax

In the Setup environment procedure, the results are:
Points (973, 3) Visibility (973, 24) Depth stats -4.047071137830014 47.063423996075706 14.221906796195023

I succeeded in training with 12 photos before; however, this time it goes wrong.
Due to privacy reasons, I cannot attach the photos here, sorry about that.

VR example on the video

I saw your video and I was very impressed. Thank you for making this open source.
I wonder if it is possible to get more information about the VR applications.
Can I use a 360 image? Will the image resolution affect the results?
What format is the generated 3D output, if there is one?

Thanks

cuda error: an illegal memory access was encountered

Hi @pureexe, thanks for your great work. When I train on my own dataset for a while, I get a CUDA error. When I switch to training on only one GPU, it can train for longer, but it eventually triggers the same error.

The error message is like below:
train.py in forward: cof = pt.repeat_interleave(cof, args.sublayers, 0)
RuntimeError: an illegal memory access was encountered

But when I use the demo room dataset to train, the training phase seems normal. I use COLMAP to preprocess my dataset and get hwf_cxcy.npy and poses_bounds.npy using the scripts you provided.

By the way, when training on my own dataset, how should I set planes.txt? I hope you can give some advice, thanks~

Shiny dataset dataloader

How can I check the given data loader and data format for the Shiny dataset?

In addition, what camera coordinate convention does the Shiny dataset use?

For example, (x, y, z) is (right, up, backward) in NeRF.

Can I use the LLFF loader for the Shiny dataset as well?

Black boundaries in some cases of Shiny dataset

Hi, thanks for your great work!

I found that there are some black lines at the boundaries of some images in the Shiny dataset (for example, CD):

[screenshot]

After I resize the image to the target width (1008), the black line still exists:

[screenshot]

Could you help figure out the reason behind this issue? I would like to know if I should remove the boundary pixels during training (and also in testing).

Thank you very much!

Code of different basis

Can you provide code for FS (Fourier's series), JH (Jacobi spherical harmonics), HSH (hemispherical harmonics), SH (spherical harmonics), and TS (Taylor's series)?

Thanks!

Colab broken?

Hi all, I just tried to use the Colab notebook with custom images and got the error below when running the scene cell. I tried it in Chrome on Windows and on Mac, same error. Should it work at the moment?

Upload widget is only available when the cell has been executed in the current browser session. Please rerun this cell to enable.
---------------------------------------------------------------------------
MessageError                              Traceback (most recent call last)
<ipython-input-2-8dd6a665c864> in <module>()
     40   display(HTML('<small>*minimum 12 images</small>'))
     41   dir = os.getcwd()
---> 42   uploaded = files.upload()
     43   preupload_datasets = [os.path.join(dir, f) for f in uploaded.keys()]
     44   del uploaded

2 frames
/usr/local/lib/python3.7/dist-packages/google/colab/_message.py in read_reply_from_input(message_id, timeout_sec)
    104         reply.get('colab_msg_id') == message_id):
    105       if 'error' in reply:
--> 106         raise MessageError(reply['error'])
    107       return reply.get('data', None)
    108 

MessageError: TypeError: Cannot read property '_uploadFiles' of undefined

real-time rendering issue

For real-time rendering, is it true that H_i(v) is first computed offline for the pre-defined reference view v and then warped to new (unknown) views? That would mean there is no network inference in the real-time rendering stage.

Is it possible to share the code of the online viewer (WebGL)?

hwf_cxcy document typo

I found that the code saves hwf_cxcy.npy as a [6, 1] matrix with separate focal lengths for x and y.

But the dataset explanation says:

hwf_cxcy.npy stores the camera intrinsics (height, width, focal length, principal point x, principal point y) in a 1x5 numpy array.

hwf_cxcy = np.array([h, w, fx, fy, cx, cy]).reshape([6,1])

CUDA out of memory error

My GPUs: Nvidia 1080Ti x 4

Loading Model @ Epoch 4000
"train.py" in
751: train()
train.py
"train.py" in train
584: generateAlpha(model, dataset, dataloader_val, None, runpath, dataloader_train = dataloader_train)
train.py
"train.py" in generateAlpha
494: info = getMPI(model, dataset.sfm, dataloader = dataloader_train)
train.py
"train.py" in getMPI
455: out = model.seq1(bigcoords)
train.py
"module.py" in _call_impl
889: result = self.forward(*input, **kwargs)
/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/module.py
"data_parallel.py" in forward
167: outputs = self.parallel_apply(replicas, inputs, kwargs)
/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py
"data_parallel.py" in parallel_apply
177: return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py
"parallel_apply.py" in parallel_apply
86: output.reraise()
/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py
"_utils.py" in reraise
429: raise self.exc_type(msg)
/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/_utils.py
RuntimeError: Caught RuntimeError in replica 0 on device 1.
Original Traceback (most recent call last):
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/nex/utils/mlp.py", line 29, in forward
return self.seq1(x)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/modules/activation.py", line 714, in forward
return F.leaky_relu(input, self.negative_slope, self.inplace)
File "/root/anaconda3/envs/nex/lib/python3.8/site-packages/torch/nn/functional.py", line 1378, in leaky_relu
result = torch._C._nn.leaky_relu(input, negative_slope)
RuntimeError: CUDA out of memory. Tried to allocate 3.51 GiB (GPU 1; 10.92 GiB total capacity; 4.26 GiB already allocated; 2.61 GiB free; 7.55 GiB reserved in total by PyTorch)

Run time issues

hi,
After model training is over, why does it take 3 seconds to render a picture?
Graphics card: V100 32GB.
[screenshot]

Render depth

First of all, thanks for sharing such great work.

I noticed that although there is an option to 'render_depth' in render_video, it seems that depth rendering is not implemented?

generated MPIs cannot be seen on mobile

Hey,

I'm able to open the demos you generated on my mobile phone.
However, I'm not able to see my own generated MPIs on my mobile; they simply do not show up, and I only see the layer slider.
I hosted my MPI on the web and used the same viewer you use for mobile. The textures are all loaded, but then nothing shows up.
Do I need to make any further modifications to the MPI in order to view it on the mobile web viewer?
Thanks for your support and great work

Firas

Shiny Dataset Download

After downloading the Shiny dataset through the OneDrive link, I can't decompress the zip file.

To extract the file, I repaired the archive with the following command:

zip -FF my_zip --out my_zip_ver2.zip

But the file still has some problems after repairing.

When I decompress the repaired file, there is a log file that records the errors:

[screenshots of the error logs]

  • Many files do not exist: 1062 files are missing.

Is there any problem with the uploaded file?

Or how can I download and extract the complete dataset?

I tried downloading on macOS and Ubuntu, but both failed.

Do I have to use Windows to download the file?

When is warping computed?

Are the sampled x, y, d already warped before being fed into the network? Because in the algorithm in the paper, warping is computed after the color information is regressed.

Question about viewing direction, basis function, gpu memory

All 3D points along one ray have the same viewing direction. So when rendering, isn't it enough to input only one viewing direction, rather than feeding all the duplicate viewing directions into the basis function?

Below is the result I checked myself; out2 is the basis function value.
[screenshot]

As you can see, all 32 values have the same value of 0.1176. Since the input is the same, the output is of course the same.
My question is, do I really need to waste network memory? Instead of having 32 inputs, isn't it enough to have just 1 input?

Enquiry about the size of mpi_b

Dear author,

From the output of your code, the spatial size of MPI_b is 400 by 400, which is different from the spatial size of the other outputs.

So does your program need an interpolation step when rendering in the web viewer?

Why does MPI_b need to be downsampled when saving, but use the same size during training?

Thanks

The question about training and predict

Hello, thank you for your great work!

I wonder whether a model trained on one scene can be applied to other scenes.
When I try to apply a model to other datasets, the output frames still show the trained scene itself.
Is this because I made a mistake when I ran the code, or was it designed this way?

Thanks.

COLMAP

[screenshot]
When using the data generated by COLMAP for training, are these two npy files automatically generated by the training code? python train.py -scene ${PATH_TO_SCENE} -model_dir ${MODEL_TO_SAVE_CHECKPOINT} -http

Is there any requirements for the training data or args?

I took 12 casual photos of the same scene with small differences in position and angle and trained with the Colab demo. The training procedure (image size 400, 40 epochs) seems to be fine. However, the results are extremely blurred and I cannot see anything. I am not sure if there are any requirements for the input images.

Mismatch between imgs 19 and poses 5

My Setup

  • I'm using Colab.
  • I'm using my own set of images.
  • I didn't tweak anything after COLMAP.

My Problem

When training, it can't load the data because util/load_llff.py:107 returns None after the mismatch warning.
This is the screenshot:
[screenshot]

Another Problem

I tried to remove the mismatched images, and eventually COLMAP gives the following error: FileNotFoundError: [Errno 2] No such file or directory: 'data/demo/dense/sparse/cameras.bin'
Here is the screenshot of the log:
[screenshot]
I wonder what these mean and how to fix these problems.
If more information is needed, just let me know.
Any help would be greatly appreciated.

Incorrect Colmap Option for Training

Thanks for the repo!

In utils/colmap_runner.py the --output_path option is added when --export_path should be specified.

  cmd("colmap mapper \
    --database_path " + dataset + "/database.db \
    --image_path " + dataset + "/images \
    --Mapper.ba_refine_principal_point 1 \
    --Mapper.num_threads 16 \
    --Mapper.extract_colors 0 \
    --output_path " + dataset + "/sparse")

should be

  cmd("colmap mapper \
    --database_path " + dataset + "/database.db \
    --image_path " + dataset + "/images \
    --Mapper.ba_refine_principal_point 1 \
    --Mapper.num_threads 16 \
    --Mapper.extract_colors 0 \
    --export_path " + dataset + "/sparse")
