google / mipnerf
License: Apache License 2.0
python scripts/convert_blender_data.py --blenderdir /nerf_synthetic --outdir /multiscale
should be:
python scripts/convert_blender_data.py --blenderdir ./nerf_synthetic --outdir ./multiscale
Thanks for the code. Any suggestions on running my own data?
As you said, in
Line 193 in 84c969e
But you use directions (not unit-norm) rather than viewdirs (unit-norm):
Lines 187 to 195 in 84c969e
Lines 194 to 200 in 84c969e
How to understand "The variance of the conical frustum with respect to its radius r is equal to the variance of the frustum with respect to x or (by symmetry) y. "in Supplemental Material?
I think
Var(r)=E(r^2)-E(r)^2 where -R<r<R so E(r)=0
E(r^2)=E(x^2+y^2)=E(x^2)+E(y^2)=E(x^2)-E(x)^2+E(y^2)-E(y)^2=Var(x)+Var(y) where E(x)=0 E(y)=0
SO Var(r)=Var(x)+Var(y) rather than Var(r)=Var(x)
Where is my mistake?
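In case it helps locate the discrepancy, here is a quick Monte Carlo check of these moments for a uniform disk (my own sanity-check sketch, not repo code); note that the polar radius is non-negative, so its mean is not zero:
import numpy as np

rng = np.random.default_rng(0)
R, n = 1.0, 1_000_000
r = R * np.sqrt(rng.uniform(size=n))   # sqrt gives uniform area density
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
x, y = r * np.cos(theta), r * np.sin(theta)

print("E(r)          =", r.mean())            # ~ 2R/3, not 0, since r >= 0
print("Var(x)        =", x.var())             # ~ R^2 / 4
print("Var(y)        =", y.var())             # ~ R^2 / 4
print("Var(r)        =", r.var())             # ~ R^2 / 18
print("Var(x)+Var(y) =", x.var() + y.var())   # ~ R^2 / 2 = E(r^2)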
pip install -r requirements.txt
doesn't work; the pinned dependencies are outdated and need to be updated.
Positional encoding (PE) or IPE is required to allow NeRF/mip-NeRF to learn high frequencies. I understand that volume density and color as a function of a 3D point can have discontinuities, especially at object boundaries. However, for a given 3D position, the color as a function of viewing direction is a smooth function, right? I can't imagine a scenario where it would have high frequencies. So why do we need PE for the viewing direction?
@jonbarron, I've pondered over this question for quite some time now, but didn't understand. Any intuition will be helpful.
Hi, this is super interesting work that seems to solve a key problem with NeRF.
One thing I noticed when reading through your paper was that you used a modified RGB activation function.
I tried using this padded sigmoid with normal NeRF, and I noticed that it tends to cause background pixels to have non-zero density because they can saturate all the way to black or white; did you encounter the same thing? I was looking at the acc image returned from volumetric integration of the weights. I'm not sure if it's significant, but I was wondering whether you compared the normal sigmoid to the widened one.
I was also wondering if you experimented with shrinking the range of the sigmoid? I tried lowering it, and that seems to produce much cleaner acc images, at the cost of a slightly (negligibly) reduced RGB range.
Thanks!
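For reference, the activation in question (the widening line is quoted from the repo's MLP; the wrapper function is my own):
import jax

def padded_sigmoid(raw_rgb, rgb_padding=0.001):
    # The widened RGB activation: the sigmoid output in (0, 1) is
    # stretched to (-rgb_padding, 1 + rgb_padding).
    rgb = jax.nn.sigmoid(raw_rgb)
    return rgb * (1 + 2 * rgb_padding) - rgb_padding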
Hi, I appreciate your excellent work. I found that when training mip-NeRF on an LLFF scene (fern, specifically), it reconstructs well at the original resolution (252x189) and below.
However, when I try to render higher-resolution images (e.g. 512x378), they contain a lot of noise that vanilla NeRF doesn't produce. What might be the reason?
Hello! Sorry if this is the wrong place to post this question.
In mip-NeRF, during inference, LLFF scenes have their near and far distances set to 0 and 1 through the use of NDC.
However, in mip-NeRF 360, during training, without NDC, I assume the near and far distances are taken directly from the COLMAP calculations.
If so, how do you set these values during inference? Does it have something to do with the contract(.) operator?
Or perhaps I am approaching this wrongly?
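For context, a minimal sketch of the contraction as described in the mip-NeRF 360 paper (my transcription, not code from this repo):
import jax.numpy as jnp

def contract(x):
    # mip-NeRF 360's contract(.): points inside the unit ball are left
    # unchanged; points outside are squashed into a ball of radius 2.
    norm = jnp.linalg.norm(x, axis=-1, keepdims=True)
    return jnp.where(norm <= 1.0, x, (2.0 - 1.0 / norm) * (x / norm))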
I use the following command to train on the multiscale datasets, but the process gets "Killed":
bash ./scripts/train_multiblender.sh
I have generated the multiscale datasets and set the correct path in ./scripts/train_multiblender.sh. It works well with ./scripts/train_blender.sh and the original datasets.
My machine has 16 GB of RAM and 4 GB of swap, and I'd like to know the minimum requirements.
Thanks.
Hi, thanks for this remarkable work.
I'm trying to reimplement mip-NeRF in PyTorch on LLFF. Currently, I've finished a pipeline identical to this repo, but training converges to a lower PSNR on the same validation images, e.g. 22.5 in my implementation vs. 24.9 in this repo after 200k iterations. Could this be caused by the nature of the optimizers in PyTorch, or maybe I missed something in the mip-NeRF implementation? Would you please provide some suggestions?
Thank you!
The radii computation code differs between non-NDC and NDC spaces. In particular, without NDC, the radii computation uses only dx derived from directions, but when NDC is enabled, (dx + dy) / 2 is used, derived from ndc_origins. Can you please shed some light on why it is done this way?
Hi, this is a very interesting work that solves a key problem in NeRF. I don't quite understand your code for calculating the covariance. Can you explain why it is computed like this?
# Outer product d d^T along the ray direction.
d_outer = d[..., :, None] * d[..., None, :]
eye = jnp.eye(d.shape[-1])
# I - d d^T / ||d||^2: projection onto the plane perpendicular to d.
null_outer = eye - d[..., :, None] * (d / d_mag_sq)[..., None, :]
# Variance along the ray, scaled by the direction's outer product...
t_cov = t_var[..., None, None] * d_outer[..., None, :, :]
# ...plus variance perpendicular to the ray.
xy_cov = r_var[..., None, None] * null_outer[..., None, :, :]
cov = t_cov + xy_cov
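As I read it, this is a direct transcription of the covariance formula in the paper (Eq. 16, if I'm counting right): Sigma = sigma_t^2 (d d^T) + sigma_r^2 (I - d d^T / ||d||^2), i.e. the variance along the ray plus the variance in the plane perpendicular to it.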
Hi, when trying to reproduce the results, I installed CUDA 10.1, Python 3.6, TensorFlow 2.3.1, and Keras 2.7.0.
I then hit an issue when importing tensorflow and running from keras import optimizers:
"another metric with the same name already exists".
Could you tell me which versions you used in your experiments?
Thanks in advance!
Hi,
Thanks for making the mip-NeRF codebase public. For the focal length calculation, why isn't the actual sensor size used (e.g. 36 mm for the Blender dataset) here? I believe the formula for calculating focal length from the FOV uses the true sensor size.
Line 349 in 84c969e
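For context, as I understand the referenced line, the focal length is computed in pixel units from the horizontal field of view, so the image width plays the role of the sensor size (a sketch; focal_from_fov is my own name):
import numpy as np

def focal_from_fov(width_px, camera_angle_x):
    # Pinhole focal length in pixels from the horizontal FOV; since the
    # focal length is expressed in pixel units, the "sensor size" is the
    # image width in pixels rather than a physical 36 mm.
    return 0.5 * width_px / np.tan(0.5 * camera_angle_x)

# e.g. an 800-pixel-wide Blender render with camera_angle_x ~ 0.6911 rad
print(focal_from_fov(800, 0.6911112))  # ~ 1111.1 px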
Hello!
Thanks for sharing this awesome work! :)
I'm curious, have you thought about the correct way of extracting isosurfaces from the trained implicit function?
For the vanilla NeRF model it is possible to extract the level surface at two scales using the separately trained coarse and fine networks. Here, as far as I understand, it is possible to extract the level surface at an arbitrary scale: I could just query the network with the positional encoding obtained for a desired point x and some manually selected variance, which determines the scale.
Does this approach make sense to you, or are there reasons why it could fail?
Line 183 in 84c969e
Hello, author! I recently tried to reproduce mip-NeRF myself, and I have a question about the IPE part.
In NeRF the encoding is ordered as [(sin x, cos x), ...], while mip-NeRF seems to put all the sin-related features together, followed by the cos-related ones, [(sin x, ...), (cos x, ...)], as the paper's IPE equation shows.
So I'm curious: have you tried encoding in the same order as in NeRF, or does the current layout work better? Thank you so much! I just noticed this while writing the code, so I wanted to ask for some advice.
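For anyone comparing the two layouts, here is a minimal sketch of the diagonal IPE with the sin-block/cos-block ordering (my paraphrase of the repo's integrated positional encoding, using E[sin(y)] = sin(mu) * exp(-var/2) for Gaussian y):
import jax.numpy as jnp

def integrated_pos_enc(x_mean, x_var, min_deg=0, max_deg=16):
    # Scale the per-coordinate means and variances by 2^l for each level.
    scales = 2.0 ** jnp.arange(min_deg, max_deg)
    y = (x_mean[..., None, :] * scales[:, None]).reshape(x_mean.shape[:-1] + (-1,))
    y_var = (x_var[..., None, :] * scales[:, None] ** 2).reshape(x_var.shape[:-1] + (-1,))
    # All sin features first, then all cos features: [(sin x, ...), (cos x, ...)]
    return jnp.concatenate(
        [jnp.sin(y) * jnp.exp(-0.5 * y_var),
         jnp.cos(y) * jnp.exp(-0.5 * y_var)], axis=-1)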
Hi, it seems that mip-NeRF takes much longer to train compared to NeRF. Is there any explanation or comparison? Thanks!
N/A; Sorry, this "issue" was a HUGE mistake.
I would like to know if there is any problem in evaluating LPIPS.
Hi, thanks for your exciting work. I have one question.
In the code, the initialization values of the directions are (x, y, 1), so I think the values of the origins should be (x, y, 0). That is, the directions would be X_directions = R * X_c + t, and the origins would be X_directions - R[0:3, 2].
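For comparison, my paraphrase of the usual NeRF-style ray generation (as I read the repo's dataset code; names are mine): the origin of every ray is the camera center c2w[:3, 3], the same for all pixels, rather than a per-pixel (x, y, 0) point.
import numpy as np

def generate_rays(H, W, focal, c2w):
    # Per-pixel directions in camera coordinates, scaled by the focal length.
    x, y = np.meshgrid(np.arange(W, dtype=np.float32),
                       np.arange(H, dtype=np.float32), indexing="xy")
    camera_dirs = np.stack(
        [(x - 0.5 * W + 0.5) / focal,
         -(y - 0.5 * H + 0.5) / focal,
         -np.ones_like(x)], axis=-1)
    # Rotate into world coordinates; the origin is the camera center.
    directions = camera_dirs @ c2w[:3, :3].T
    origins = np.broadcast_to(c2w[:3, 3], directions.shape)
    return origins, directions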
After the sigmoid activation I noticed that you are doing this:
rgb = rgb * (1 + 2 * self.rgb_padding) - self.rgb_padding
where rgb_padding = 0.001 by default, which widens the output range to (-rgb_padding, 1 + rgb_padding).
Is this intentional, or did you mean to move the range to (rgb_padding, 1 - rgb_padding)?
If we estimate the poses of the lego dataset using COLMAP, will it work? I tried doing so and trained with train_llff.sh, but the results were inconsistent.
Hi,
I ran the command
bash scripts/train_blender.sh
and the terminal indicated the following error:
scripts/train_blender.sh: line 31: 11426 Segmentation fault (core dumped) python -m train --data_dir=$DATA_DIR --train_dir=$TRAIN_DIR --gin_file=configs/blender.gin --logtostderr
Could you tell me how to address it? Thanks
Hi, thanks for your exciting work. I have two questions about the avg metric.
First, I'm confused about the meaning of the avg metric; what advantage does it provide?
Second, I find that when you compute MSE from PSNR, the implementation in the code differs from that in the paper, so I'm a bit confused. Can you help me?
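For reference, my understanding of the "avg" metric from the paper is the geometric mean of MSE = 10^(-PSNR/10), sqrt(1 - SSIM), and LPIPS; a sketch under that assumption (not repo code):
import numpy as np

def avg_metric(psnr, ssim, lpips):
    # Geometric mean of three "lower is better" quantities; MSE is
    # recovered from PSNR via MSE = 10 ** (-PSNR / 10).
    mse = 10.0 ** (-psnr / 10.0)
    return (mse * np.sqrt(1.0 - ssim) * lpips) ** (1.0 / 3.0)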
Hello, thank you very much for your high-quality work. I have some questions I'd like your help with.
When I tried to reproduce your code, I found the environment in requirements.txt difficult to set up; there were always dependency issues.
Do you have an updated set of dependencies? Looking forward to your reply, thanks!
I can't derive Sigma in Eq. (8); could you please explain it a bit more?
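For anyone else stuck here, a sketch of how I read the repo's numerically stable evaluation of these frustum moments (conical_frustum_to_gaussian), reparameterized by the segment midpoint mu and half-width h; please verify the exact coefficients against the code:
def conical_frustum_moments(t0, t1, base_radius):
    # Midpoint / half-width reparameterization of the segment [t0, t1].
    mu = (t0 + t1) / 2.0
    h = (t1 - t0) / 2.0
    # Mean distance along the ray (not simply the midpoint mu).
    t_mean = mu + (2.0 * mu * h**2) / (3.0 * mu**2 + h**2)
    # Variance along the ray.
    t_var = h**2 / 3.0 - (4.0 / 15.0) * (h**4 * (12.0 * mu**2 - h**2)
                                         / (3.0 * mu**2 + h**2) ** 2)
    # Variance perpendicular to the ray.
    r_var = base_radius**2 * (mu**2 / 4.0 + (5.0 / 12.0) * h**2
                              - (4.0 / 15.0) * h**4 / (3.0 * mu**2 + h**2))
    return t_mean, t_var, r_var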
Hi, I am training on an RTX 3080 and it crashes every 5000 iterations when executing this code:
vis_suite = vis.visualize_suite(pred_distance, pred_acc)
Here is the error message:
Traceback (most recent call last):
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/feihu/mipnerf-main/train.py", line 321, in <module>
app.run(main)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/data/feihu/mipnerf-main/train.py", line 295, in main
vis_suite = vis.visualize_suite(pred_distance, pred_acc)
File "/data/feihu/mipnerf-main/internal/vis.py", line 140, in visualize_suite
'depth_normals': visualize_normals(depth, acc)
File "/data/feihu/mipnerf-main/internal/vis.py", line 125, in visualize_normals
normals = depth_to_normals(scaled_depth)
File "/data/feihu/mipnerf-main/internal/vis.py", line 38, in depth_to_normals
dy = convolve2d(depth, f_blur[None, :] * f_edge[:, None])
File "/data/feihu/mipnerf-main/internal/vis.py", line 30, in convolve2d
return jsp.signal.convolve2d(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/scipy/signal.py", line 85, in convolve2d
return _convolve_nd(in1, in2, mode, precision=precision)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/scipy/signal.py", line 65, in _convolve_nd
result = lax.conv_general_dilated(in1[None, None], in2[None, None], strides,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/lax/convolution.py", line 147, in conv_general_dilated
return conv_general_dilated_p.bind(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 323, in bind
return self.bind_with_trace(find_top_trace(args), args, params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 326, in bind_with_trace
out = trace.process_primitive(self, map(trace.full_raise, args), params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 675, in process_primitive
return primitive.impl(*tracers, **params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 98, in apply_primitive
compiled_fun = xla_primitive_callable(prim, *unsafe_map(arg_spec, args),
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/util.py", line 219, in wrapper
return cached(config._trace_context(), *args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/util.py", line 212, in cached
return f(*args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 148, in xla_primitive_callable
compiled = _xla_callable_uncached(lu.wrap_init(prim_fun), device, None,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 230, in _xla_callable_uncached
return lower_xla_callable(fun, device, backend, name, donated_invars, False,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 704, in compile
self._executable = XlaCompiledComputation.from_xla_computation(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 806, in from_xla_computation
compiled = compile_or_get_cached(backend, xla_computation, options)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 768, in compile_or_get_cached
return backend_compile(backend, computation, compile_options)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/profiler.py", line 206, in wrapper
return func(*args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 713, in backend_compile
return backend.compile(built_c, compile_options=options)
jaxlib.xla_extension.XlaRuntimeError: UNKNOWN: Failed to determine best cudnn convolution algorithm for:
%cudnn-conv = (f32[1,1,800,800]{3,2,1,0}, u8[0]{0}) custom-call(f32[1,1,800,800]{3,2,1,0} %Arg_0.1, f32[1,1,3,3]{3,2,1,0} %Arg_1.2), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convForward", metadata={op_name="jit(conv_general_dilated)/jit(main)/conv_general_dilated[window_strides=(1, 1) padding=((1, 1), (1, 1)) lhs_dilation=(1, 1) rhs_dilation=(1, 1) dimension_numbers=ConvDimensionNumbers(lhs_spec=(0, 1, 2, 3), rhs_spec=(0, 1, 2, 3), out_spec=(0, 1, 2, 3)) feature_group_count=1 batch_group_count=1 lhs_shape=(1, 1, 800, 800) rhs_shape=(1, 1, 3, 3) precision=(<Precision.HIGHEST: 2>, <Precision.HIGHEST: 2>) preferred_element_type=None]" source_file="/data/feihu/mipnerf-main/internal/vis.py" source_line=30}, backend_config="{\"conv_result_scale\":1,\"activation_mode\":\"0\",\"side_input_scale\":0}"
Original error: UNIMPLEMENTED: DNN library is not found.
To ignore this failure and try to use a fallback algorithm (which may have suboptimal performance), use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.
I have found that JAX shows this message when it runs out of memory, so I changed my batch_size from 1024 to 512, but training still uses 10 GB. How can I reduce GPU memory usage?
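Not repo-specific, but for what it's worth, JAX's documented GPU memory environment variables change the preallocation behavior; they must be set before jax is imported:
import os

# JAX preallocates most of the GPU's memory by default; see JAX's GPU
# memory allocation docs for these variables.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"  # allocate on demand
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = ".5"    # or cap the fraction

import jax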
Hi, a wonderful work!
I am wondering why the radius of the cone is set to r = 2/sqrt(12) * pixel_size.
I know that this setting is meant to make the variance of the cone match that of the pixel in world coordinate space; I'm just curious about the derivation. Could you please give me a hint on how to arrive at this result?
Thanks,
Yu
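One way to get there (my own derivation, consistent with the code comment about being halfway between inscribed and circumscribed): for a uniform square pixel of side s, Var(x) = s^2/12, while for a uniform disk of radius r, Var(x) = E(rho^2 cos^2 theta) = r^2/4. Setting r^2/4 = s^2/12 gives r = (2/sqrt(12)) * s = s/sqrt(3) ≈ 0.577 s, which indeed lies between the inscribed radius 0.5 s and the circumscribed radius s/sqrt(2) ≈ 0.707 s.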
I do not understand why you have only provided code for the multiscale Blender dataset. What about a multiscale LLFF dataset? Is it because the multiscale representation does not work for LLFF? Looking forward to your reply.
Hi,
While sampling points along rays, the code uses rays.directions for the direction vectors instead of rays.viewdirs.
Lines 70 to 81 in 84c969e
Original NeRF uses normalized direction vectors for the sampling points. Can you clarify whether we need to replace rays.directions with rays.viewdirs?
# Distance from each unit-norm direction vector to its x-axis neighbor.
dx = [
    np.sqrt(np.sum((v[:-1, :, :] - v[1:, :, :])**2, -1)) for v in directions
]
dx = [np.concatenate([v, v[-2:-1, :]], 0) for v in dx]
# Cut the distance in half, and then round it out so that it's
# halfway between inscribed by / circumscribed about the pixel.
radii = [v[..., None] * 2 / np.sqrt(12) for v in dx]
Hi,
I think you are using internal packages in scripts/summarize.ipynb.
For example:
from google3.pyglib import gfile
with gfile.Open(filename) as f:
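In case it's useful, a portable stand-in when running the notebook outside Google (my suggestion, assuming filename is a local path; tf.io.gfile.GFile offers the same interface if TensorFlow is installed):
filename = "results.json"  # hypothetical local path
with open(filename) as f:  # replaces gfile.Open(filename)
    data = f.read()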
Sorry to bother you, but after I downloaded shiny.zip, unzipping it failed. It seems the zip file is internally corrupted?