google / mipnerf
License: Apache License 2.0
python scripts/convert_blender_data.py --blenderdir /nerf_synthetic --outdir /multiscale
should be:
python scripts/convert_blender_data.py --blenderdir ./nerf_synthetic --outdir ./multiscale
Thanks for the code. Any suggestions on running my own data?
As you said, in
Line 193 in 84c969e
But you use directions (not unit-norm) rather than viewdirs (unit-norm):
Lines 187 to 195 in 84c969e
Lines 194 to 200 in 84c969e
How to understand "The variance of the conical frustum with respect to its radius r is equal to the variance of the frustum with respect to x or (by symmetry) y. "in Supplemental Material?
I think
Var(r)=E(r^2)-E(r)^2 where -R<r<R so E(r)=0
E(r^2)=E(x^2+y^2)=E(x^2)+E(y^2)=E(x^2)-E(x)^2+E(y^2)-E(y)^2=Var(x)+Var(y) where E(x)=0 E(y)=0
SO Var(r)=Var(x)+Var(y) rather than Var(r)=Var(x)
Where is my mistake?
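In case it helps locate the discrepancy, here is a quick Monte Carlo check of these moments for a uniform disk (my own sanity-check sketch, not repo code); note that the polar radius is non-negative, so its mean is not zero:
import numpy as np

rng = np.random.default_rng(0)
R, n = 1.0, 1_000_000
r = R * np.sqrt(rng.uniform(size=n))   # sqrt gives uniform area density
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
x, y = r * np.cos(theta), r * np.sin(theta)

print("E(r)          =", r.mean())            # ~ 2R/3, not 0, since r >= 0
print("Var(x)        =", x.var())             # ~ R^2 / 4
print("Var(y)        =", y.var())             # ~ R^2 / 4
print("Var(r)        =", r.var())             # ~ R^2 / 18
print("Var(x)+Var(y) =", x.var() + y.var())   # ~ R^2 / 2 = E(r^2)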
pip install -r requirements.txt
doesn't work; the pinned dependencies are outdated and need to be updated.
Positional encoding (PE) or IPE is required to allow NeRF/mip-NeRF to learn high frequencies. I understand that volume density and color as a function of a 3D point can have discontinuities, especially at object boundaries. However, for a given 3D position, the color as a function of viewing direction is a smooth function, right? I can't imagine a scenario where it would have high frequencies. So why do we need PE for the viewing direction?
@jonbarron, I've pondered over this question for quite some time now, but didn't understand. Any intuition will be helpful.
Hi, this is super interesting work that seems to solve a key problem with NeRF.
One thing I noticed when reading through your paper was that you used a modified RGB activation function.
I tried using this padded sigmoid with normal NeRF, and I noticed that it tends to cause background pixels to have non-zero density because they can saturate all the way to black or white; did you encounter the same thing? I was looking at the acc image returned from volumetric integration of the weights. I'm not sure if it's significant, but I was wondering whether you compared the normal sigmoid to the widened one.
I was also wondering if you experimented with shrinking the range of the sigmoid? I tried lowering it, and that seems to produce much cleaner acc images, at the cost of a slightly (negligibly) reduced RGB range.
Thanks!
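For reference, the activation in question (the widening line is quoted from the repo's MLP; the wrapper function is my own):
import jax

def padded_sigmoid(raw_rgb, rgb_padding=0.001):
    # The widened RGB activation: the sigmoid output in (0, 1) is
    # stretched to (-rgb_padding, 1 + rgb_padding).
    rgb = jax.nn.sigmoid(raw_rgb)
    return rgb * (1 + 2 * rgb_padding) - rgb_padding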
Hi, I appreciate your excellent work. I found that when training mip-NeRF on an LLFF scene (fern, specifically), it reconstructs well at the original resolution (252x189) and below.
However, when I try to render higher-resolution images (e.g. 512x378), they contain a lot of noise that vanilla NeRF doesn't produce. What might be the reason?
Hello! Sorry if this is the wrong place to post this question.
In mip-NeRF, during inference, LLFF scenes have their near and far distances set to 0 and 1 through the use of NDC.
However, in mip-NeRF 360, during training, without NDC, I assume the near and far distances are taken directly from the COLMAP calculations.
If so, how do you set these values during inference? Does it have something to do with the contract(.) operator?
Or perhaps I am approaching this wrongly?
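For context, a minimal sketch of the contraction as described in the mip-NeRF 360 paper (my transcription, not code from this repo):
import jax.numpy as jnp

def contract(x):
    # mip-NeRF 360's contract(.): points inside the unit ball are left
    # unchanged; points outside are squashed into a ball of radius 2.
    norm = jnp.linalg.norm(x, axis=-1, keepdims=True)
    return jnp.where(norm <= 1.0, x, (2.0 - 1.0 / norm) * (x / norm))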
I use the following command to train on the multiscale datasets, but the process gets "Killed":
bash ./scripts/train_multiblender.sh
I have generated the multiscale datasets and set the correct path in ./scripts/train_multiblender.sh. It works well with ./scripts/train_blender.sh and the original datasets.
My machine has 16 GB of RAM and 4 GB of swap, and I'd like to know the minimum requirements.
Thanks.
Hi, thanks for this remarkable work.
I'm trying to reimplement mip-NeRF in PyTorch on LLFF. Currently, I've finished a pipeline identical to this repo, but training converges to a lower PSNR on the same validation images, e.g. 22.5 in my implementation vs. 24.9 in this repo after 200k iterations. Could this be caused by the nature of the optimizers in PyTorch, or maybe I missed something in the mip-NeRF implementation? Would you please provide some suggestions?
Thank you!
The radii computation code differs between non-NDC and NDC spaces. In particular, without NDC, the radii computation uses only dx derived from directions, but when NDC is enabled, (dx + dy) / 2 is used, derived from ndc_origins. Can you please shed some light on why it is done this way?
Hi, this is a very interesting work that solves a key problem in NeRF. I don't quite understand your code for calculating the covariance. Can you explain why it is computed like this?
# Outer product d d^T along the ray direction.
d_outer = d[..., :, None] * d[..., None, :]
eye = jnp.eye(d.shape[-1])
# I - d d^T / ||d||^2: projection onto the plane perpendicular to d.
null_outer = eye - d[..., :, None] * (d / d_mag_sq)[..., None, :]
# Variance along the ray, scaled by the direction's outer product...
t_cov = t_var[..., None, None] * d_outer[..., None, :, :]
# ...plus variance perpendicular to the ray.
xy_cov = r_var[..., None, None] * null_outer[..., None, :, :]
cov = t_cov + xy_cov
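As I read it, this is a direct transcription of the covariance formula in the paper (Eq. 16, if I'm counting right): Sigma = sigma_t^2 (d d^T) + sigma_r^2 (I - d d^T / ||d||^2), i.e. the variance along the ray plus the variance in the plane perpendicular to it.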
Hi, when trying to reproduce the results, I installed CUDA 10.1, Python 3.6, TensorFlow 2.3.1, and Keras 2.7.0.
I then hit an issue when importing tensorflow and running from keras import optimizers:
"another metric with the same name already exists".
Could you tell me which versions you used in your experiments?
Thanks in advance!
Hi,
Thanks for making the mip-NeRF codebase public. For the focal length calculation, why isn't the actual sensor size used (e.g. 36 mm for the Blender dataset) here? I believe the formula for calculating focal length from the FOV uses the true sensor size.
Line 349 in 84c969e
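For context, as I understand the referenced line, the focal length is computed in pixel units from the horizontal field of view, so the image width plays the role of the sensor size (a sketch; focal_from_fov is my own name):
import numpy as np

def focal_from_fov(width_px, camera_angle_x):
    # Pinhole focal length in pixels from the horizontal FOV; since the
    # focal length is expressed in pixel units, the "sensor size" is the
    # image width in pixels rather than a physical 36 mm.
    return 0.5 * width_px / np.tan(0.5 * camera_angle_x)

# e.g. an 800-pixel-wide Blender render with camera_angle_x ~ 0.6911 rad
print(focal_from_fov(800, 0.6911112))  # ~ 1111.1 px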
Hello!
Thanks for sharing this awesome work! :)
I'm curious, have you thought about the correct way of extracting isosurfaces from the trained implicit function?
For the vanilla NeRF model it is possible to extract the level surface at two scales using the separately trained coarse and fine networks. Here, as far as I understand, it is possible to extract the level surface at an arbitrary scale: I could just query the network with the positional encoding obtained for a desired point x and some manually selected variance, which determines the scale.
Does this approach make sense to you, or are there reasons why it could fail?
Line 183 in 84c969e
Hello, author! I recently tried to reproduce mip-NeRF myself, and I have a question about the IPE part.
In NeRF the encoding is ordered as [(sin x, cos x), ...], while mip-NeRF seems to put all the sin-related features together, followed by the cos-related ones, [(sin x, ...), (cos x, ...)], as the paper's IPE equation shows.
So I'm curious: have you tried encoding in the same order as in NeRF, or does the current layout work better? Thank you so much! I just noticed this while writing the code, so I wanted to ask for some advice.
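For anyone comparing the two layouts, here is a minimal sketch of the diagonal IPE with the sin-block/cos-block ordering (my paraphrase of the repo's integrated positional encoding, using E[sin(y)] = sin(mu) * exp(-var/2) for Gaussian y):
import jax.numpy as jnp

def integrated_pos_enc(x_mean, x_var, min_deg=0, max_deg=16):
    # Scale the per-coordinate means and variances by 2^l for each level.
    scales = 2.0 ** jnp.arange(min_deg, max_deg)
    y = (x_mean[..., None, :] * scales[:, None]).reshape(x_mean.shape[:-1] + (-1,))
    y_var = (x_var[..., None, :] * scales[:, None] ** 2).reshape(x_var.shape[:-1] + (-1,))
    # All sin features first, then all cos features: [(sin x, ...), (cos x, ...)]
    return jnp.concatenate(
        [jnp.sin(y) * jnp.exp(-0.5 * y_var),
         jnp.cos(y) * jnp.exp(-0.5 * y_var)], axis=-1)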
Hi, it seems that mip-NeRF takes much longer to train compared to NeRF. Is there any explanation or comparison? Thanks!
N/A; Sorry, this "issue" was a HUGE mistake.
I would like to know if there is any problem in evaluating LPIPS.
Hi, thanks for your exciting work. I have one question.
In the code, the initialization values of the directions are (x, y, 1), so I think the values of the origins should be (x, y, 0). That is, the directions would be X_directions = R * X_c + t, and the origins would be X_directions - R[0:3, 2].
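For comparison, my paraphrase of the usual NeRF-style ray generation (as I read the repo's dataset code; names are mine): the origin of every ray is the camera center c2w[:3, 3], the same for all pixels, rather than a per-pixel (x, y, 0) point.
import numpy as np

def generate_rays(H, W, focal, c2w):
    # Per-pixel directions in camera coordinates, scaled by the focal length.
    x, y = np.meshgrid(np.arange(W, dtype=np.float32),
                       np.arange(H, dtype=np.float32), indexing="xy")
    camera_dirs = np.stack(
        [(x - 0.5 * W + 0.5) / focal,
         -(y - 0.5 * H + 0.5) / focal,
         -np.ones_like(x)], axis=-1)
    # Rotate into world coordinates; the origin is the camera center.
    directions = camera_dirs @ c2w[:3, :3].T
    origins = np.broadcast_to(c2w[:3, 3], directions.shape)
    return origins, directions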
After the sigmoid activation I noticed that you are doing this:
rgb = rgb * (1 + 2 * self.rgb_padding) - self.rgb_padding
where rgb_padding = 0.001 by default, which widens the output range to (-rgb_padding, 1 + rgb_padding).
Is this intentional, or did you mean to move the range to (rgb_padding, 1 - rgb_padding)?
If we estimate the poses of the lego dataset using COLMAP, will it work? I tried doing so and trained with train_llff.sh, but the results were inconsistent.
Hi,
I ran the command
bash scripts/train_blender.sh
and the terminal indicated the following error:
scripts/train_blender.sh: line 31: 11426 Segmentation fault (core dumped) python -m train --data_dir=$DATA_DIR --train_dir=$TRAIN_DIR --gin_file=configs/blender.gin --logtostderr
Could you tell me how to address it? Thanks
Hi, thanks for your exciting work. I have two questions about the avg metric.
First, I'm confused about the meaning of the avg metric; what advantage does it provide?
Second, I find that when you compute MSE from PSNR, the implementation in the code differs from that in the paper, so I'm a bit confused. Can you help me?
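For reference, my understanding of the "avg" metric from the paper is the geometric mean of MSE = 10^(-PSNR/10), sqrt(1 - SSIM), and LPIPS; a sketch under that assumption (not repo code):
import numpy as np

def avg_metric(psnr, ssim, lpips):
    # Geometric mean of three "lower is better" quantities; MSE is
    # recovered from PSNR via MSE = 10 ** (-PSNR / 10).
    mse = 10.0 ** (-psnr / 10.0)
    return (mse * np.sqrt(1.0 - ssim) * lpips) ** (1.0 / 3.0)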
Hello, thank you very much for your high-quality work. I have some questions I'd like your help with.
When I tried to reproduce your code, I found the environment in requirements.txt difficult to set up; there were always dependency issues.
Do you have an updated set of dependencies? Looking forward to your reply, thanks!
I can't derive Sigma in Eq. (8); could you please explain it a bit more?
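For anyone else stuck here, a sketch of how I read the repo's numerically stable evaluation of these frustum moments (conical_frustum_to_gaussian), reparameterized by the segment midpoint mu and half-width h; please verify the exact coefficients against the code:
def conical_frustum_moments(t0, t1, base_radius):
    # Midpoint / half-width reparameterization of the segment [t0, t1].
    mu = (t0 + t1) / 2.0
    h = (t1 - t0) / 2.0
    # Mean distance along the ray (not simply the midpoint mu).
    t_mean = mu + (2.0 * mu * h**2) / (3.0 * mu**2 + h**2)
    # Variance along the ray.
    t_var = h**2 / 3.0 - (4.0 / 15.0) * (h**4 * (12.0 * mu**2 - h**2)
                                         / (3.0 * mu**2 + h**2) ** 2)
    # Variance perpendicular to the ray.
    r_var = base_radius**2 * (mu**2 / 4.0 + (5.0 / 12.0) * h**2
                              - (4.0 / 15.0) * h**4 / (3.0 * mu**2 + h**2))
    return t_mean, t_var, r_var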
Hi, I am training on an RTX 3080 and it crashes every 5000 iterations when executing this code:
vis_suite = vis.visualize_suite(pred_distance, pred_acc)
Here is the error message:
Traceback (most recent call last):
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/feihu/mipnerf-main/train.py", line 321, in <module>
app.run(main)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/data/feihu/mipnerf-main/train.py", line 295, in main
vis_suite = vis.visualize_suite(pred_distance, pred_acc)
File "/data/feihu/mipnerf-main/internal/vis.py", line 140, in visualize_suite
'depth_normals': visualize_normals(depth, acc)
File "/data/feihu/mipnerf-main/internal/vis.py", line 125, in visualize_normals
normals = depth_to_normals(scaled_depth)
File "/data/feihu/mipnerf-main/internal/vis.py", line 38, in depth_to_normals
dy = convolve2d(depth, f_blur[None, :] * f_edge[:, None])
File "/data/feihu/mipnerf-main/internal/vis.py", line 30, in convolve2d
return jsp.signal.convolve2d(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/scipy/signal.py", line 85, in convolve2d
return _convolve_nd(in1, in2, mode, precision=precision)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/scipy/signal.py", line 65, in _convolve_nd
result = lax.conv_general_dilated(in1[None, None], in2[None, None], strides,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/lax/convolution.py", line 147, in conv_general_dilated
return conv_general_dilated_p.bind(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 323, in bind
return self.bind_with_trace(find_top_trace(args), args, params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 326, in bind_with_trace
out = trace.process_primitive(self, map(trace.full_raise, args), params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/core.py", line 675, in process_primitive
return primitive.impl(*tracers, **params)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 98, in apply_primitive
compiled_fun = xla_primitive_callable(prim, *unsafe_map(arg_spec, args),
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/util.py", line 219, in wrapper
return cached(config._trace_context(), *args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/util.py", line 212, in cached
return f(*args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 148, in xla_primitive_callable
compiled = _xla_callable_uncached(lu.wrap_init(prim_fun), device, None,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 230, in _xla_callable_uncached
return lower_xla_callable(fun, device, backend, name, donated_invars, False,
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 704, in compile
self._executable = XlaCompiledComputation.from_xla_computation(
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 806, in from_xla_computation
compiled = compile_or_get_cached(backend, xla_computation, options)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 768, in compile_or_get_cached
return backend_compile(backend, computation, compile_options)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/profiler.py", line 206, in wrapper
return func(*args, **kwargs)
File "/home/feihu/.conda/envs/metanerf/lib/python3.9/site-packages/jax/_src/dispatch.py", line 713, in backend_compile
return backend.compile(built_c, compile_options=options)
jaxlib.xla_extension.XlaRuntimeError: UNKNOWN: Failed to determine best cudnn convolution algorithm for:
%cudnn-conv = (f32[1,1,800,800]{3,2,1,0}, u8[0]{0}) custom-call(f32[1,1,800,800]{3,2,1,0} %Arg_0.1, f32[1,1,3,3]{3,2,1,0} %Arg_1.2), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convForward", metadata={op_name="jit(conv_general_dilated)/jit(main)/conv_general_dilated[window_strides=(1, 1) padding=((1, 1), (1, 1)) lhs_dilation=(1, 1) rhs_dilation=(1, 1) dimension_numbers=ConvDimensionNumbers(lhs_spec=(0, 1, 2, 3), rhs_spec=(0, 1, 2, 3), out_spec=(0, 1, 2, 3)) feature_group_count=1 batch_group_count=1 lhs_shape=(1, 1, 800, 800) rhs_shape=(1, 1, 3, 3) precision=(<Precision.HIGHEST: 2>, <Precision.HIGHEST: 2>) preferred_element_type=None]" source_file="/data/feihu/mipnerf-main/internal/vis.py" source_line=30}, backend_config="{\"conv_result_scale\":1,\"activation_mode\":\"0\",\"side_input_scale\":0}"
Original error: UNIMPLEMENTED: DNN library is not found.
To ignore this failure and try to use a fallback algorithm (which may have suboptimal performance), use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.
I have found that JAX shows this message when it runs out of memory, so I changed my batch_size from 1024 to 512, but training still uses 10 GB. How can I reduce GPU memory usage?
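Not repo-specific, but for what it's worth, JAX's documented GPU memory environment variables change the preallocation behavior; they must be set before jax is imported:
import os

# JAX preallocates most of the GPU's memory by default; see JAX's GPU
# memory allocation docs for these variables.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"  # allocate on demand
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = ".5"    # or cap the fraction

import jax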
Hi, a wonderful work!
I am wondering why the radius of the cone is set to r = 2/sqrt(12) * pixel_size.
I know that this setting is meant to make the variance of the cone match that of the pixel in world coordinate space; I'm just curious about the derivation. Could you please give me a hint on how to arrive at this result?
Thanks,
Yu
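One way to get there (my own derivation, consistent with the code comment about being halfway between inscribed and circumscribed): for a uniform square pixel of side s, Var(x) = s^2/12, while for a uniform disk of radius r, Var(x) = E(rho^2 cos^2 theta) = r^2/4. Setting r^2/4 = s^2/12 gives r = (2/sqrt(12)) * s = s/sqrt(3) ≈ 0.577 s, which indeed lies between the inscribed radius 0.5 s and the circumscribed radius s/sqrt(2) ≈ 0.707 s.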
I do not understand why you have only provided code for the multiscale Blender dataset. What about a multiscale LLFF dataset? Is it because the multiscale representation does not work for LLFF? Looking forward to your reply.
Hi,
While sampling points along rays, the code uses rays.directions for the direction vectors instead of rays.viewdirs.
Lines 70 to 81 in 84c969e
Original NeRF uses normalized direction vectors for the sampling points. Can you clarify whether we need to replace rays.directions with rays.viewdirs?
# Distance from each unit-norm direction vector to its x-axis neighbor.
dx = [
    np.sqrt(np.sum((v[:-1, :, :] - v[1:, :, :])**2, -1)) for v in directions
]
dx = [np.concatenate([v, v[-2:-1, :]], 0) for v in dx]
# Cut the distance in half, and then round it out so that it's
# halfway between inscribed by / circumscribed about the pixel.
radii = [v[..., None] * 2 / np.sqrt(12) for v in dx]
Hi,
I think you are using internal packages in scripts/summarize.ipynb.
For example:
from google3.pyglib import gfile
with gfile.Open(filename) as f:
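In case it's useful, a portable stand-in when running the notebook outside Google (my suggestion, assuming filename is a local path; tf.io.gfile.GFile offers the same interface if TensorFlow is installed):
filename = "results.json"  # hypothetical local path
with open(filename) as f:  # replaces gfile.Open(filename)
    data = f.read()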
Sorry to bother you, but after I downloaded shiny.zip, unzipping it failed. It seems the zip file is internally corrupted?