harry-zhi / semantic_nerf
The implementation of "In-Place Scene Labelling and Understanding with Implicit Scene Representation" [ICCV 2021].
License: Other
Thank you for your great contribution and for sharing this wonderful code.
I am trying to run Semantic-NeRF on a custom dataset that consists of a set of images I want to train the network on. How do I generate the traj_w_c file and the other required files?
Thank you in advance for your help.
Thanks for the elegant codebase. When switching the depth type from z-dim to euclidean, an error occurs at this line:
semantic_nerf/SSR/models/rays.py
Line 57 in bb98f10
A possible fix might be:
dirs = dirs * (1. / norm)
dirs = dirs[:, :, :, None]
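The idea behind the suggested fix can be illustrated outside the repository. The following is a minimal NumPy sketch, not the code in `rays.py`: the function name `pixel_directions`, the pinhole parameters, and the camera axis convention (x right, y down, z forward) are all assumptions made for illustration. With z-depth, every direction keeps a unit z component; with Euclidean depth, each direction is scaled to unit length so that `origin + depth * dir` lands on the surface.

```python
import numpy as np

def pixel_directions(H, W, fx, fy, cx, cy, depth_type="z"):
    """Per-pixel ray directions in the camera frame, shape (H, W, 3).

    Hypothetical helper; the axis convention is an assumption.
    """
    u, v = np.meshgrid(np.arange(W, dtype=np.float64),
                       np.arange(H, dtype=np.float64), indexing="xy")
    dirs = np.stack([(u - cx) / fx, (v - cy) / fy, np.ones_like(u)], axis=-1)
    if depth_type == "euclidean":
        # Scale every direction to unit length; keepdims lets the (H, W, 1)
        # norm broadcast against the (H, W, 3) directions.
        norm = np.linalg.norm(dirs, axis=-1, keepdims=True)
        dirs = dirs * (1.0 / norm)
    return dirs

dirs_z = pixel_directions(4, 6, fx=3.0, fy=3.0, cx=2.5, cy=1.5, depth_type="z")
dirs_e = pixel_directions(4, 6, fx=3.0, fy=3.0, cx=2.5, cy=1.5,
                          depth_type="euclidean")
```

Keeping the norm with `keepdims=True` avoids shape mismatches when the directions are later reshaped or given a trailing axis.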
Hi,
Thank you for the wonderful work. You said in #5 that the camera poses of Replica are randomly generated. However, the rendered images seem to follow a smooth trajectory rather than come from random camera poses. May I ask which algorithm was used to generate the trajectory?
Thanks!
Hello, thanks for the great work,
I am processing the Replica dataset with nerfstudio and trying to relate your traj_w_c.txt to nerfstudio's transform.json, but the two do not match. I even tried keeping the camera coordinate system in the OpenCV convention, but I didn't find any similarities between the two files. Is there any way to produce your traj_w_c.txt with nerfstudio?
Thank you!
Hi, I was able to run Semantic-NeRF on an outdoor unbounded scene by following your kind suggestions in #38 (comment).
The semantic images rendered during training look good, but I am now facing a problem with mesh reconstruction (snapshot attached). I would be glad to have any suggestions or guidance on this.
Hi! Thanks again for your great work! Integrating semantics into the NeRF model is a really cool idea. I have run into trouble with visualization: training now works perfectly on my computer, but when I try to visualize the mesh I always get a CUDA out of memory error. I tried reducing chunk and netchunk to 8, but it still occurs. Do you have any idea why? Also, I noticed that for visualization you use a different test.yaml. Would it be possible to share this test.yaml?
Hi,
I really appreciate you sharing this work; it's amazing.
I have two questions about the dataset.
Hi @Harry-Zhi,
First of all, this is amazing work. I have a question: what do you mean by pixel-wise denoising? Is it similar to image deblurring?
For Replica and/or scannet.
It would be very helpful.
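Hello! When generating the dataset with habitat-sim in the data_generation folder, what are the camera intrinsics? Or, how can I obtain the camera intrinsics? Thank you for your reply.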
Hi Harry, thanks a lot for sharing your work!
I am new to NeRF and I was trying to replicate your results using the pre-rendered Replica data, but I'm getting the following error:
"semantic = cv2.imread(self.semantic_list[idx], cv2.IMREAD_UNCHANGED)
IndexError: list index out of range"
The reason I could figure out was that the files in semantic_class folder do not have 900 samples starting with "semantic_class_" which was used for searching in the code (after about 100 files, the filename changes to vis_sem_).
Could you help me figure out why that is the case?
I'm using the following command to train on just room 0 sequence 1 set as PATHtoRENDERED_REPLICA_DATA:
!python3 train_SSR_main.py --config_file /content/drive/MyDrive/semantic_nerf/SSR/configs/SSR_room0_config.yaml --sparse_views --sparse_ratio 0.6
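One way to check what a prefix-based search actually picks up is to list the folder independently of the training code. This is a minimal sketch, not the repository's data loader: the helper name `list_by_prefix` and the `.png` extension are assumptions, while the `semantic_class_` and `vis_sem_` prefixes come from the report above.

```python
import os
import re
import tempfile

def list_by_prefix(folder, prefix, ext=".png"):
    """Return files named '<prefix><index><ext>', sorted by integer index."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(ext) + r"$")
    matches = []
    for name in os.listdir(folder):
        m = pattern.match(name)
        if m:
            matches.append((int(m.group(1)), os.path.join(folder, name)))
    # Sort numerically so that index 10 comes after index 2.
    return [path for _, path in sorted(matches)]

# Tiny demo with made-up file names in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["semantic_class_0.png", "semantic_class_10.png",
                 "semantic_class_2.png", "vis_sem_class_0.png"]:
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_by_prefix(tmp, "semantic_class_")]
```

Comparing `len(found)` with the number of frames the loader expects would show whether the label files are genuinely missing or only named differently.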
Hello!
Thanks for your great work!! I am very interested in your semantic-nerf.
I encountered a problem when applying another model to your pre-rendered Replica dataset. The same parameters performed well on room0 but very poorly on scenes such as office4. Have you encountered a similar problem? Are the hyperparameters basically the same for every scene?
In the config I only see a single "gpu" entry.
Can we use more than one GPU to speed up training?
Hi, Thanks for the amazing work!
I downloaded the pre-rendered Replica dataset from the provided link but was unable to find the instance label folder for each scene.
The directory semantic_instance is missing from each scene. In the dataset this is optionally loaded here.
It would be great if the link could be updated with the instance ID folders as well.
Alternatively, it would be really great if the data rendering scripts from Habitat could be open-sourced.
Thanks!
Hi,
Thanks for sharing your work.
The code works fine for indoor scenes. I am trying to run it on images of an outdoor scene, e.g. a building, but I am unable to generate the rendered scenes. I would be glad if you could tell me which parameters I need to change to get the required output for outdoor scenes.
Can you share a copy of info_semantic.json so we can directly use prerendered replica data instead of having to download the whole dataset?
I noticed in your paper that the model was trained for 200,000 iterations on a single RTX2080Ti GPU. Could you please provide an estimate of the time it took to complete this training?
Hello, thanks for your excellent job! Generating a semantic model in a nerf-way is so cool. With the shared codes, I succeeded in training with Replica data and stopped the training when the testing results are perceptually good. However, when generating colored meshes, I met some trouble because no "mesh.ply" file can be found. I wonder where I can get the "mesh.ply" file.
Hi,
thanks for your code and efforts! I have a question about the details of the label mapping.
In the paper, you mentioned "manually map these labels to the popular NYUv2-13 definition", could you please share the details of it?
I've used the habitat-sim as you suggested and want to have exactly the same label mapping.
For example in the script of Habitat-sim:
self.labels = {
    'rug': 6,
    'wall': 12,
    'floor': 5,
    'ceiling': 3,
    'chair': 4,
    'table': 10,
    'window': 13,
    .......
}
The "Semantic View Synthesis with Sparse Labels" part seems to use more classes. Could you please share the label mapping for it as well?
Thanks a lot for your reply!
I follow nerfstudio to convert the provided poses to nerfstudio format with the following:
c2w = read_poses_c2w[idx].reshape((4,4))
# Convert from COLMAP's camera coordinate system to ours
c2w[0:3, 1:3] *= -1
c2w = c2w[np.array([1, 0, 2, 3]), :]
c2w[2, :] *= -1
However, it does not work for nerfstudio. Can you give some hints?
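One common source of mismatch is the camera axis convention. OpenCV-style cameras look along +z with y pointing down, while OpenGL-style cameras (the convention nerfstudio documents) look along -z with y pointing up. The sketch below only flips the camera axes of a camera-to-world matrix; whether traj_w_c.txt actually stores OpenCV-style camera-to-world poses is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def opencv_to_opengl_c2w(c2w):
    """Flip the camera y and z axes of a 4x4 camera-to-world matrix.

    Assumes the input uses the OpenCV convention (x right, y down, z forward).
    """
    out = np.array(c2w, dtype=np.float64, copy=True)
    # Columns 1 and 2 of the rotation are the camera y and z axes in world space.
    out[:3, 1:3] *= -1.0
    return out

c2w_cv = np.eye(4)
c2w_cv[:3, 3] = [1.0, 2.0, 3.0]
c2w_gl = opencv_to_opengl_c2w(c2w_cv)
```

The camera position is untouched and the rotation stays a proper rotation, so applying the function twice returns the original pose. Any additional world-axis permutation, like the row swap in the snippet above, is a separate choice about the world frame.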
Hey,
once again congrats for the amazing work.
I am trying to train Semantic-NeRF on a custom dataset. I have pre-processed the dataset to match the format of the Replica dataset, and the classes are also similar, so I managed to train the model without modifying the data loader (replica_datasets.py). Yet the model cannot learn the 3D representation because of a camera pose incompatibility.
Specifically, as far as I can understand from trainer.py and the set_params_replica function, the RGB images are assumed to be captured by a pinhole camera. In my case the camera model is different, so I want to modify the function so that the camera poses in the traj_w_c.txt file are processed correctly. Is it enough to change the fx, fy, etc. parameters? What if the camera model is more complex and has additional parameters such as p?
Any tips or ideas on how to implement a different camera model than pin-hole are more than welcome!
Hello author, I am a NeRF beginner and I would like to know whether there are pre-trained weights that can be used to run the model directly and produce semantically segmented images. And how should I run this project?
Hi,
Thanks for sharing your code. I am looking to explore this system with other datasets and would like to know the format of the "traj_w_c.txt" file in the pre-rendered Replica dataset(s).
Cheers,
Kurt
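For inspecting such a file, a small loader helps. This is a sketch under a stated assumption: each line holds 16 whitespace-separated floats forming one row-major 4x4 matrix, which is consistent with the `reshape((4,4))` call in the nerfstudio snippet earlier on this page but is not confirmed by the author. The function name `load_poses` is hypothetical.

```python
import io
import numpy as np

def load_poses(source):
    """Read N lines of 16 floats each into an (N, 4, 4) array of poses."""
    flat = np.loadtxt(source, ndmin=2)
    if flat.shape[1] != 16:
        raise ValueError(f"expected 16 values per line, got {flat.shape[1]}")
    return flat.reshape(-1, 4, 4)

# Demo on an in-memory file with two made-up poses.
pose_a = np.eye(4)
pose_b = np.eye(4)
pose_b[:3, 3] = [1.0, 2.0, 3.0]
text = "\n".join(" ".join(map(str, p.ravel().tolist())) for p in (pose_a, pose_b))
poses = load_poses(io.StringIO(text))
```

Printing the last row of a few matrices is a quick sanity check: for homogeneous transforms it should be `0 0 0 1`.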
I really appreciate your excellent work!
I want to build on your work, but when I ran your most recent code on Replica room_0, I found that the void class is not displayed in the output image, although the generated sem.mp4 displays the void class normally. Why is this? I recall that when I used your code a few months ago, the output image showed the void class (in black). Please give me some advice on how to switch back to the old behaviour.
Thanks a lot!!!
Hi, thanks for the great work!
It seems that Semantic-NeRF does not use NDC space. May I ask why?
Hello!
Can you provide more files (parameter configs, pretrained models, ...) for the ScanNet experiments in the paper?
Thanks
Hello!
Thank you a lot for your great work!
I am planning to use your method for a robotics application and I have a question. Do you have any guide on how to prepare my own data for training and rendering?
As I understand it, the input includes 2D color images, depth images, camera poses and semantic classes. Is that correct? Should these data be stored as a dataset?
Thank you in advance!
The error occurs on both Replica and ScanNet datasets.
Traceback (most recent call last):
File "train_SSR_main.py", line 224, in <module>
train()
File "train_SSR_main.py", line 212, in train
ssr_trainer.step(global_step)
File "/opt/data3/semantic_nerf-main/SSR/training/trainer.py", line 1027, in step
number_classes=self.num_valid_semantic_class, ignore_label=self.ignore_label)
File "/opt/data3/semantic_nerf-main/SSR/training/training_utils.py", line 67, in calculate_segmentation_metrics
conf_mat = confusion_matrix(true_labels, predicted_labels, labels=list(range(number_classes)))
File "/home/zy/miniconda3/envs/nerf/lib/python3.7/site-packages/sklearn/utils/validation.py", line 72, in inner_f
return f(**kwargs)
File "/home/zy/miniconda3/envs/nerf/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 276, in confusion_matrix
y_type, y_true, y_pred = _check_targets(y_true, y_pred)
File "/home/zy/miniconda3/envs/nerf/lib/python3.7/site-packages/sklearn/metrics/_classification.py", line 81, in _check_targets
check_consistent_length(y_true, y_pred)
File "/home/zy/miniconda3/envs/nerf/lib/python3.7/site-packages/sklearn/utils/validation.py", line 256, in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [19814400, 11767603]
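The two lengths in the message differ, which would happen if, for example, only one of the two label arrays had been filtered before the metric call. That diagnosis is a guess, not something the traceback proves. The sketch below shows the general pattern of applying one shared mask to both arrays before building a confusion matrix; the function name and the `ignore_label=-1` default are assumptions.

```python
import numpy as np

def confusion_matrix_masked(true_labels, pred_labels, num_classes, ignore_label=-1):
    """Confusion matrix over pixels whose ground truth is not ignore_label."""
    true_labels = np.asarray(true_labels).ravel()
    pred_labels = np.asarray(pred_labels).ravel()
    if true_labels.shape != pred_labels.shape:
        raise ValueError("labels and predictions must have the same length")
    # One mask, applied to both arrays, keeps them aligned.
    valid = true_labels != ignore_label
    t = true_labels[valid].astype(np.int64)
    p = pred_labels[valid].astype(np.int64)
    counts = np.bincount(t * num_classes + p, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

conf = confusion_matrix_masked([0, 1, 2, -1, 1], [0, 2, 2, 1, 1], num_classes=3)
```

Rows are ground-truth classes and columns are predictions; the fourth pixel is skipped because its ground truth is the ignore label.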
Hi, I want to test the semantic_nerf model. Is extracting a colour mesh file the way to test it? If so, why is a mesh file necessary; does the NeRF model need to generate a mesh model? In short, how can I run testing?
Hi,
Thank you for your inspiring work and the code released!
I'm trying to run your demo on "room0" of Replica with
python3 train_SSR_main.py --config_file /SSR/configs/SSR_room0_config.yaml
Whenever validation starts, the run cannot continue and is interrupted by "CUDA out of memory". However, I use the same GPU (RTX2080-Ti with 11GB memory) as mentioned in your paper. Also, no matter how much I decrease "chunk" and "netchunk" in the config file (even down to 1024*1), the "CUDA out of memory" error still happens during validation.
I'm wondering whether you could give me some advice.
home/xxxx/.conda/envs/semantic_nerf/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/xxxx/.conda/envs/semantic_nerf/lib/python3.7/site-packages/torch/cuda/__init__.py:125: UserWarning:
NVIDIA GeForce RTX 3070 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70 sm_75.
If you want to use the NVIDIA GeForce RTX 3070 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
prepare rays
prepare rays
prepare rays
Begin
0%| | 0/200001 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train_SSR_main.py", line 224, in <module>
train()
File "train_SSR_main.py", line 212, in train
ssr_trainer.step(global_step)
File "/home/xxxx/Desktop/work/semantic_nerf/SSR/training/trainer.py", line 882, in step
sampled_data = self.sample_data(global_step, self.rays, self.H, self.W, no_batching=True, mode="train")
File "/home/xxxx/Desktop/work/semantic_nerf/SSR/training/trainer.py", line 650, in sample_data
sampled_rays = rays[index_batch, index_hw, :]
RuntimeError: CUDA error: no kernel image is available for execution on the device
Thank you for your dataset! Where can I find the camera intrinsic matrix K?
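If the images come from an ideal pinhole renderer with a known horizontal field of view, K can be derived from the image size. The sketch below uses the standard relation fx = (W / 2) / tan(hfov / 2). The image size, the 90 degree field of view, square pixels, and the principal point at the image centre are all example assumptions, not values confirmed for this dataset.

```python
import math

def intrinsics_from_hfov(width, height, hfov_deg):
    """3x3 pinhole intrinsic matrix from image size and horizontal FOV."""
    fx = 0.5 * width / math.tan(math.radians(hfov_deg) / 2.0)
    fy = fx  # assumes square pixels
    # Principal point at the image centre; some codebases use width / 2 instead.
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]

# Example values only: a 640x480 image with a 90 degree horizontal FOV.
K = intrinsics_from_hfov(640, 480, 90.0)
```

With a 90 degree field of view the focal length in pixels equals half the image width, which makes the result easy to verify by hand.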
I want to run this on my Windows 10 computer with an RTX 2060 Super (8 GB) GPU. Maybe the GPU is not good enough :), but I just want to get the code running end to end.
Traceback (most recent call last):
File "train_SSR_main.py", line 225, in <module>
train()
File "train_SSR_main.py", line 213, in train
ssr_trainer.step(global_step)
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\training\trainer.py", line 981, in step
rgbs, disps, deps, vis_deps, sems, vis_sems, sem_uncers, vis_sem_uncers = self.render_path(self.rays_vis, save_dir=trainsavedir)
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\training\trainer.py", line 1156, in render_path
output_dict = self.render_rays(rays[i])
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\training\trainer.py", line 703, in render_rays
all_ret = batchify_rays(fn, flat_rays, self.chunk)
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\training\training_utils.py", line 10, in batchify_rays
ret = render_fn(rays_flat[i:i + chunk])
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\training\trainer.py", line 745, in volumetric_rendering
raw_coarse = run_network(pts_coarse_sampled, viewdirs, self.ssr_net_coarse,
File "D:\Users\admin\Documents\GitHub\semantic_nerf\SSR\models\model_utils.py", line 31, in run_network
embedded = torch.cat([embedded, embedded_dirs], -1)
RuntimeError: CUDA out of memory. Tried to allocate 1.65 GiB (GPU 0; 8.00 GiB total capacity; 5.00 GiB already allocated; 910.21 MiB free; 5.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
After searching on Google, I understand that I need to reduce the batch_size or the chunk size, but I don't know where in the project code to change them or what reduced values are appropriate.
Which one should I reduce first, batch_size or chunk size?
Thanks!
Edit: I'm a beginner in NeRF. Apologies if this is a basic question :)
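The traceback shows rendering going through `batchify_rays(fn, flat_rays, self.chunk)`, which evaluates rays in slices of `chunk`. The sketch below illustrates that pattern with NumPy and a toy render function; it is not the repository's implementation, and `toy_render` is a made-up stand-in for the network. A smaller chunk lowers the peak size of the intermediate arrays without changing the result, although other reports on this page suggest that lowering it alone does not always resolve the error.

```python
import numpy as np

def batchify(fn, inputs, chunk):
    """Apply fn to slices of at most `chunk` rows and concatenate the results."""
    outputs = [fn(inputs[i:i + chunk]) for i in range(0, inputs.shape[0], chunk)]
    return np.concatenate(outputs, axis=0)

def toy_render(rays):
    # Stand-in for a network forward pass: one output value per ray.
    return np.sin(rays).sum(axis=-1, keepdims=True)

rays = np.linspace(0.0, 1.0, 10 * 6).reshape(10, 6)
full = toy_render(rays)
chunked = batchify(toy_render, rays, chunk=3)
```

The chunk size trades memory for speed, so halving it repeatedly until validation fits is a reasonable first experiment.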
Hi,
thanks for your code and efforts! I have two questions about pre-rendering the dataset.
Hi, I wonder what the camera format in traj_w_c.txt is. Does the filename "*_w_c" suggest that it is in world-to-camera format? Thank you very much.
Thank you for your awesome work!!!
I plan to set the semantics of some instances to void when training the model, because I don't want those instances to be segmented. How can I do that?
thanks!!!
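One way to approach this is to remap the unwanted classes in the ground-truth label images before training. The sketch below is a generic NumPy illustration: the function name is hypothetical, and `void_label=0` is an assumed id that should be replaced by whatever void or ignore id the data loader actually uses.

```python
import numpy as np

def set_classes_to_void(label_map, classes_to_ignore, void_label=0):
    """Return a copy of label_map with the given class ids replaced by void_label."""
    label_map = np.asarray(label_map)
    out = label_map.copy()
    out[np.isin(label_map, classes_to_ignore)] = void_label
    return out

labels = np.array([[1, 2, 3],
                   [3, 4, 1]])
masked = set_classes_to_void(labels, classes_to_ignore=[3, 4])
```

Since the maintainer's quoted reply on this page says no loss is applied on void regions, pixels remapped this way would not supervise the semantic head.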
I want to render a replica dataset with the exact same format as used in iMap.
So far I was able to generate a smooth trajectory, navigate the agent along it, and store the respective depth maps and images which look nice. However, I'm failing when trying to run a method on it, while it works perfectly fine on the replica data from iMap.
I'm a little bit confused about how exactly to test/verify that my poses are legit. The snippet I used to get them assuming that there's a legit trajectory:
for ix, point in enumerate(tqdm(path_points)):
    if ix >= len(path_points) - 1:
        break
    tangent = path_points[ix + 1] - point
    agent_state.position = point
    tangent_orientation_matrix = mn.Matrix4.look_at(point, point + tangent, np.array([0, 1.0, 0]))
    tangent_orientation_q = mn.Quaternion.from_matrix(tangent_orientation_matrix.rotation())
    agent_state.rotation = utils.quat_from_magnum(tangent_orientation_q)
    agent.set_state(agent_state)
    pose = np.eye(4)
    pose[:3, :3] = qt.as_rotation_matrix(agent_state.rotation)
    pose[:3, 3] = agent_state.position
    poses.append(pose)
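A few structural checks can rule out malformed poses before looking at coordinate conventions. The following is a minimal sketch with a hypothetical helper `check_pose`; it verifies that the rotation block is orthonormal with determinant +1 and that the last row is `0 0 0 1`. Passing these checks does not prove that the axis convention matches what a given method expects.

```python
import numpy as np

def check_pose(pose, tol=1e-6):
    """Basic validity checks for a 4x4 rigid-body transform."""
    R = pose[:3, :3]
    return {
        "orthonormal": bool(np.allclose(R @ R.T, np.eye(3), atol=tol)),
        "right_handed": bool(abs(np.linalg.det(R) - 1.0) < tol),
        "homogeneous": bool(np.allclose(pose[3], [0.0, 0.0, 0.0, 1.0], atol=tol)),
    }

good = np.eye(4)
bad = np.eye(4)
bad[0, 0] = -1.0  # a reflection: orthonormal, but determinant is -1
good_report = check_pose(good)
bad_report = check_pose(bad)
```

If all poses pass, a remaining mismatch is more likely a convention issue, such as camera-to-world versus world-to-camera or flipped camera axes.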
Thank you very much for your work and sharing
However, when I download the pre-rendered Replica data, it shows failure at the end of the download.
Is there any other way to download it?
Thanks for publishing your code! When I run the label propagation task, I find that the semantic loss is NaN because gt_label is almost entirely 0 after single-click labelling. You mention in #3 that "we do not apply any loss on the void regions hence the network is able to predict arbitrary classes without penalty (though in fact it tend to predict some reasonable classes based on the similarity in appearance or geometry). And the void region also does not contribute to the evaluation metrics".
So it is easy to understand why the semantic loss is NaN. However, in this case, how can the network be trained to achieve the results you demonstrate? What should I do?
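The NaN itself can be avoided with a guard: averaging a loss over zero labelled pixels divides by zero, so a batch without any labelled pixel should contribute zero instead. The NumPy sketch below illustrates only that guard. It is not the repository's loss, the name `masked_cross_entropy` and `ignore_label=-1` are assumptions, and it does not answer how the author's setup reaches the reported results.

```python
import math
import numpy as np

def masked_cross_entropy(logits, labels, ignore_label=-1):
    """Mean cross-entropy over labelled samples; 0.0 if none are labelled."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    valid = labels != ignore_label
    if not valid.any():
        return 0.0  # nothing to supervise in this batch
    z = logits[valid]
    z = z - z.max(axis=1, keepdims=True)  # numerically stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(z.shape[0]), labels[valid]]
    return float(-picked.mean())

uniform_logits = np.zeros((3, 4))
loss_none = masked_cross_entropy(uniform_logits, np.array([-1, -1, -1]))
loss_one = masked_cross_entropy(uniform_logits, np.array([1, -1, -1]))
```

With very sparse labels, it may also help to make sure each batch samples at least some labelled pixels, so that the semantic head receives a gradient.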
Hi, I am a sincere follower of your work! Semantic-NeRF is a very surprising piece of work!
I wonder if you could release the Multi-view Semantic Fusion code?
The error is caused by a missing sample_step key:
Experiment GPU is 0.
INFO - 2022-08-30 21:44:12,892 - trainer - Using gpu's: 0
----- ScanNet Dataset with NYUv2-40 Conventions-----
processing ScanNet scene: scene0010_00
Traceback (most recent call last):
File "train_SSR_main.py", line 224, in <module>
train()
File "train_SSR_main.py", line 168, in train
sample_step=config["experiment"]["sample_step"],
KeyError: 'sample_step'
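The `KeyError` means the loaded config has no `sample_step` entry under `experiment`. Adding that key to the YAML file is the direct fix. As a workaround, the lookup can fall back to a default, as in the sketch below. The helper name is hypothetical, and the fallback value of 1 is an assumption; the intended value for ScanNet is not stated on this page.

```python
def get_experiment_option(config, key, default=None):
    """Read config['experiment'][key], with an optional fallback value."""
    experiment = config.get("experiment", {})
    if key in experiment:
        return experiment[key]
    if default is None:
        raise KeyError(f"'{key}' is missing from the 'experiment' section of the config")
    return default

# Hypothetical config that lacks the key, as in the traceback above.
config = {"experiment": {}}
sample_step = get_experiment_option(config, "sample_step", default=1)
```

Raising an explicit error when no default is given makes a missing key easier to trace than a bare `KeyError` deep inside training code.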
How can I define the bounding box? For example, as in LLFF: self.scene_bbox = torch.tensor([[-1.5, -1.67, -1.0], [1.5, 1.67, 1.0]]).
I really don't know what exactly it represents or what size is suitable.
I want to test on another NeRF. Please help me.
best wishes
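A scene bounding box of that form is a pair of corners, minimum and maximum, in world coordinates. One simple way to pick them is to take the extent of some points known to lie in the scene and add a margin. The sketch below does this with NumPy; the helper name, the choice of points (for example back-projected depth samples or camera centres), and the padding value are all assumptions rather than the author's procedure.

```python
import numpy as np

def bbox_from_points(points, pad=0.1):
    """Axis-aligned bounding box [[min_xyz], [max_xyz]] around points, with padding."""
    points = np.asarray(points, dtype=np.float64).reshape(-1, 3)
    lower = points.min(axis=0) - pad
    upper = points.max(axis=0) + pad
    return np.stack([lower, upper])

# Made-up points standing in for scene samples.
pts = np.array([[-1.0, 0.0, 0.5],
                [2.0, -1.0, 0.0],
                [0.0, 1.0, -0.5]])
bbox = bbox_from_points(pts, pad=0.5)
```

For indoor scenes the cameras sit inside the room, so a box built from camera centres alone will usually need generous padding to enclose the walls.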
The code is wonderful, but I am having some trouble generating per-scene semantic images with it.
o3d_mesh_canonical_clean.vertex_colors = o3d.utility.Vector3dVector(v_colors/255.0)
Could you please give me some advice?
Thanks for your work!
In the paper, Section 4.4 (Semantic Fusion) includes further experiments on Semantic Label Denoising and Super-Resolution. How can I get the processed data? Could you upload the code that processes the data?
Looking forward to your reply!