
seganygaussians's People

Contributors

jumpat

seganygaussians's Issues

A problem when running python train_contrastive_feature.py

Looking for config file in output/2718f284-3/cfg_args
Config file found: output/2718f284-3/cfg_args
Optimizing output/2718f284-3
Loading trained model at iteration 30000, None [08/03 10:36:33]
Found transforms_train.json file, assuming Blender data set! [08/03 10:36:33]
Reading Training Transforms [08/03 10:36:33]
Reading Test Transforms [08/03 10:37:00]
Loading Training Cameras [08/03 10:37:28]
Loading Test Cameras [08/03 10:37:32]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train_contrastive_feature.py", line 212, in <module>
training(lp.extract(args), op.extract(args), pp.extract(args), args.iteration)
File "train_contrastive_feature.py", line 121, in training
sam_features = viewpoint_cam.original_features.cuda()
AttributeError: 'NoneType' object has no attribute 'cuda'
Training progress: 0%|

Great work. I encountered this problem when training on my own data. What confuses me a little is the exact formats of transforms_train.json and transforms_test.json: in my experimental setup the two files are completely identical. Is that the cause of this bug? Looking forward to your answer~
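
Note: the AttributeError above means viewpoint_cam.original_features is None, i.e. no SAM features were extracted or loaded for that camera. A minimal sketch of a guard that makes the failure explicit (attribute names are taken from the traceback; the error message is illustrative):

    # Hypothetical guard before using a camera's SAM features.
    if viewpoint_cam.original_features is None:
        raise RuntimeError(
            "No SAM features found for this camera; run the SAM feature "
            "extraction step before train_contrastive_feature.py"
        )
    sam_features = viewpoint_cam.original_features.cuda()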

Version of pytorch3d

Hi, thanks for this great work! I've been having problems installing the dependencies. Could you please let me know which version of pytorch3d you used?

Thank you!

A problem with prompt_segmenting

When I use the original images in the dataset without any preprocessing, I get an error like this:

root@I1993568c2b00601e90:/hy-tmp/SegAnyGAussians# python prompt_segmenting.py
Looking for config file in ./output/e01280aa-a/cfg_args
Config file found: ./output/e01280aa-a/cfg_args
Loading trained model at iteration -1, 30000
Reading camera 34/34
Loading Training Cameras
Loading Test Cameras
There are 34 views in the dataset.
0.05493569374084473
0.0071582794189453125
10
running k-means on cuda..
resuming
[running kmeans]: 9it [00:00, 525.51it/s, center_shift=0.000000, iteration=9, tol=0.000100]
0.0347135066986084
tensor(3410, device='cuda:0')
0.11548805236816406
tensor(563.6510, device='cuda:0') std_nearest_k_distance
/hy-tmp/SegAnyGAussians/prompt_segmenting.py:153: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
target_mask = torch.tensor(sam_mask, device=grad_catch_2dmask.device)
tensor(4.4595, device='cuda:0') std_nearest_k_distance
tensor(0.8292, device='cuda:0') std_nearest_k_distance
tensor(0.3955, device='cuda:0') std_nearest_k_distance
tensor(0.3851, device='cuda:0') std_nearest_k_distance
tensor(0.3400, device='cuda:0') std_nearest_k_distance
tensor([[0.0000, 0.4895]], device='cuda:0') test threshold
Traceback (most recent call last):
File "/hy-tmp/SegAnyGAussians/prompt_segmenting.py", line 563, in
filtered_points, filtered_mask, thresh = postprocess_grad_based_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask=mask_.clone(), feature_gaussians=feature_gaussians, view=view, sam_mask=ref_mask.clone(), pipeline_args=pipeline.extract(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/hy-tmp/SegAnyGAussians/prompt_segmenting.py", line 196, in postprocess_grad_based_statistical_filtering
mask = nearest_k_distance.mean(dim = -1) <= test_threshold
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: The size of tensor a (3349) must match the size of tensor b (2) at non-singleton dimension 1

Output the segmented model

Thank you for open-sourcing your code. I am a newbie reproducing the given code, and I found that the final outputs are rendered pictures. Where can I find the 3D model obtained after segmentation? Looking forward to your reply.
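
Note: elsewhere in these issues the prompt_segmenting notebook is shown to contain a write_ply helper, so one plausible way to export the segmented Gaussians' centers as a 3D point cloud is a sketch like the following (variable names assumed from the notebook):

    # Sketch: dump the centers of the selected Gaussians to a .ply point cloud.
    # `xyz` are the Gaussian centers, `mask` the segmentation mask over Gaussians,
    # and write_ply is the helper used in prompt_segmenting.ipynb.
    selected_xyz = xyz[mask.cpu()].data
    write_ply('./segmentation_res/vanilla_seg.ply', selected_xyz)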

OOM problem

ref_img_camera_id = 0
mask_img_camera_id = 0

view = cameras[ref_img_camera_id]
img = view.original_image * 255
img = cv2.resize(img.permute([1,2,0]).detach().cpu().numpy().astype(np.uint8),dsize=(1024,1024),fx=1,fy=1,interpolation=cv2.INTER_LINEAR)
predictor.set_image(img)
sam_feature = predictor.features
# sam_feature = view.original_features

start_time = time.time()
bg_color = [0 for i in range(FEATURE_DIM)]
background = torch.tensor(bg_color, dtype=torch.float32, device="cuda")
rendered_feature = render_contrastive_feature(view, feature_gaussians, pipeline.extract(args), background)['render']
time1 = time.time() - start_time

H, W = sam_feature.shape[-2:]

print(time1)
plt.imshow(img)

I got this OOM error in the block above when using prompt_segmenting. How can I avoid the OOM problem?

My laptop is an ASUS with an RTX 2080 Super (8 GB VRAM), running Windows 11 with WSL2 Ubuntu 22.04 in an Anaconda environment.

OutOfMemoryError                          Traceback (most recent call last)
Cell In[15], line 7
      5 img = view.original_image * 255
      6 img = cv2.resize(img.permute([1,2,0]).detach().cpu().numpy().astype(np.uint8),dsize=(1024,1024),fx=1,fy=1,interpolation=cv2.INTER_LINEAR)
----> 7 predictor.set_image(img)
      8 sam_feature = predictor.features
      9 # sam_feature = view.original_features

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/predictor.py:60, in SamPredictor.set_image(self, image, image_format)
57 input_image_torch = torch.as_tensor(input_image, device=self.device)
58 input_image_torch = input_image_torch.permute(2, 0, 1).contiguous()[None, :, :, :]
---> 60 self.set_torch_image(input_image_torch, image.shape[:2])

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/utils/_contextlib.py:115, in context_decorator..decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/predictor.py:89, in SamPredictor.set_torch_image(self, transformed_image, original_image_size)
87 self.input_size = tuple(transformed_image.shape[-2:])
88 input_image = self.model.preprocess(transformed_image)
---> 89 self.features = self.model.image_encoder(input_image)
90 self.is_image_set = True

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:112, in ImageEncoderViT.forward(self, x)
109 x = x + self.pos_embed
111 for blk in self.blocks:
--> 112 x = blk(x)
114 x = self.neck(x.permute(0, 3, 1, 2))
116 return x

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:174, in Block.forward(self, x)
171 H, W = x.shape[1], x.shape[2]
172 x, pad_hw = window_partition(x, self.window_size)
--> 174 x = self.attn(x)
175 # Reverse window partition
176 if self.window_size > 0:

File ~/anaconda3/envs/seggau/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:234, in Attention.forward(self, x)
231 attn = (q * self.scale) @ k.transpose(-2, -1)
233 if self.use_rel_pos:
--> 234 attn = add_decomposed_rel_pos(attn, q, self.rel_pos_h, self.rel_pos_w, (H, W), (H, W))
236 attn = attn.softmax(dim=-1)
237 x = (attn @ v).view(B, self.num_heads, H, W, -1).permute(0, 2, 3, 1, 4).reshape(B, H, W, -1)

File ~/SuGaR/SegAnyGAussians/third_party/segment-anything/segment_anything/modeling/image_encoder.py:358, in add_decomposed_rel_pos(attn, q, rel_pos_h, rel_pos_w, q_size, k_size)
354 rel_h = torch.einsum("bhwc,hkc->bhwk", r_q, Rh)
355 rel_w = torch.einsum("bhwc,wkc->bhwk", r_q, Rw)
357 attn = (
--> 358 attn.view(B, q_h, q_w, k_h, k_w) + rel_h[:, :, :, :, None] + rel_w[:, :, :, None, :]
359 ).view(B, q_h * q_w, k_h * k_w)
361 return attn

OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 8.00 GiB total capacity; 5.93 GiB already allocated; 0 bytes free; 7.19 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
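
Note: as the error message itself suggests, two hedged mitigations on an 8 GB card are setting PYTORCH_CUDA_ALLOC_CONF and switching to a smaller SAM backbone. A sketch (the checkpoint filename is the officially released one, but treat the snippet as an assumption, not the repo's supported path):

    import os
    # Must be set before CUDA is initialized; reduces fragmentation per the hint above.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

    import torch
    from segment_anything import sam_model_registry, SamPredictor

    # vit_b is the smallest official SAM image encoder and needs far less VRAM than vit_h.
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth").cuda()
    predictor = SamPredictor(sam)
    torch.cuda.empty_cache()  # release cached blocks before encoding the image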

How to render the background?

Hi! I greatly appreciate your innovative work.

Following your instructions, I can render the target scene with python render.py -m output/XXX --precomputed_mask segmentation_res/final_mask.pt --target scene --segment. But if I want to render the background instead of the target, i.e. everywhere not selected by the target Gaussians, how can I achieve it?

I tried swapping 0 and 1 in the mask in gaussian_renderer/__init__.py (mask = torch.logical_not(mask.bool()).float()), but it didn't work.

Thanks for your help.
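
Note: one hedged alternative to patching the renderer is to invert the saved mask tensor itself and pass the inverted file to render.py; a sketch (paths follow the command above, but this is an assumption, not a documented workflow):

    import torch

    # Build a background mask by inverting the saved segmentation mask.
    mask = torch.load('segmentation_res/final_mask.pt')
    torch.save(torch.logical_not(mask.bool()), 'segmentation_res/background_mask.pt')
    # Then render with:
    #   python render.py -m output/XXX --precomputed_mask segmentation_res/background_mask.pt --target scene --segment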

dumb question about sam encoder

Hello gang,

I read the paper for your method but I have a dumb question.

Isn't it computationally more efficient to directly embed the mask on the images and then generate the model via 3DGS, instead of applying the SAM map during generation?

I'm probably misunderstanding something here. Does your method also reduce the number of ellipsoids in the point cloud compared to editing the images directly? I'm not clear on the benefits here.

About interaction methods

Hey, first of all, this work is genuinely impressive and meaningful; I tried it and the results are good.

I noticed that the interaction in the Jupyter notebook specifies a single point directly in code.

If I want to interact with multiple points, a box, or scribbles, do you have any suggestions?

Some questions about `queries`

Thank you for your excellent work! But I have some questions about the queries at the inference stage. The paper says: "With a rendered feature map F_r^v for a specific view v, we generate queries for positive points and negative points by directly retrieving their corresponding features on F_r^v." How should I understand these queries, and how are they generated?

Moreover, is there any relationship between these queries at the inference stage and the mask queries T_M_i at the training stage?

Thanks a lot!
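
Note: for intuition, a query is plausibly just the feature vector at a prompted pixel of the rendered feature map; a minimal sketch under that reading (all names and coordinates are hypothetical, not from the repo):

    import torch
    import torch.nn.functional as F

    feature_map = torch.randn(32, 512, 512)      # stand-in for the rendered F_r^v, shape (C, H, W)
    positive_points = [(100, 200), (150, 220)]   # user-prompted (x, y) pixels
    queries = torch.stack([feature_map[:, y, x] for x, y in positive_points])
    queries = F.normalize(queries, dim=-1)       # normalize to match a cosine-similarity setup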

How to install this project

I downloaded and unzipped the project's zip archive, then downloaded the contents of the third_party directory and put them in place.
I then ran the dependency-installation command from the README, but the pytorch3d package could not be found, so I commented out that dependency.
Running the installation command again produced all sorts of errors; I have encountered at least two different kinds.
The first error is too long, so I can only include the tail of the output, which is:
Traceback (most recent call last):
File "", line 36, in
File "", line 34, in
File "/media/sx639-2/E/jzx's files/SegAnyGAussians-main/submodules/simple-knn/setup.py", line 33, in
'build_ext': BuildExtension
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/init.py", line 103, in setup
return distutils.core.setup(**attrs)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/dist.py", line 963, in run_command
super().run_command(command)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/command/install.py", line 78, in run
return orig.install.run(self)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/install.py", line 697, in run
self.run_command('build')
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/dist.py", line 963, in run_command
super().run_command(command)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/dist.py", line 963, in run_command
super().run_command(command)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 88, in run
_build_ext.run(self)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
build_ext.build_extensions(self)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
self._build_extensions_serial()
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
self.build_extension(ext)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 249, in build_extension
_build_ext.build_extension(self, ext)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/setuptools/_distutils/command/build_ext.py", line 555, in build_extension
depends=ext.depends,
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 595, in unix_wrap_ninja_compile
with_cuda=with_cuda)
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1492, in _write_ninja_file_and_compile_objects
error_prefix='Error compiling objects for extension')
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> simple_knn

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
failed

CondaEnvException: Pip failed

—— —— —— —— —— —— —— —— —— —— —— —— —— ——
The second error I encountered is:

Pip subprocess error:
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [17 lines of output]
Traceback (most recent call last):
File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/tokenize.py", line 385, in find_cookie
line_string = line.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 0: invalid continuation byte

  During handling of the above exception, another exception occurred:
  
  Traceback (most recent call last):
    File "<string>", line 36, in <module>
    File "<pip-setuptools-caller>", line 28, in <module>
    File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/tokenize.py", line 449, in open
      encoding, lines = detect_encoding(buffer.readline)
    File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/tokenize.py", line 426, in detect_encoding
      encoding = find_cookie(first)
    File "/home/sx639-2/.conda/envs/SAGA/lib/python3.7/tokenize.py", line 390, in find_cookie
      raise SyntaxError(msg)
  SyntaxError: invalid or missing encoding declaration for "/media/sx639-2/E/jzx's files/SegAnyGAussians-main/third_party/segment-anything/setup.py"
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
failed

CondaEnvException: Pip failed

—— —— —— —— —— —— —— —— —— —— —— —— —— ——
I am running this in PyCharm on Linux, and the GPU is an RTX 3090.
I don't understand why these errors occur; please tell me the correct way to install. Thank you.

Questions about 3D Gaussian training

Thank you very much for your open-source code; I have learned a lot. But when I was training on the leaves scene of the NeRF-LLFF dataset, COLMAP could not estimate correct poses and the camera model could not be created. Have you encountered similar problems?

==============================================================================
Finding good initial image pair
==============================================================================

  => No good initial image pair found.

Elapsed time: 0.754 [minutes]
ERROR: failed to create sparse model

Misuse of `iteration` in `train_contrastive_feature.py`

I've noticed a potential issue in train_contrastive_feature.py.

In line 187:

scene.save_feature(iteration, target = 'contrastive_feature')
torch.save(sam_proj.state_dict(), os.path.join(scene.model_path, "point_cloud/iteration_{}/".format(iteration) + "sam_proj.pt"))

The iteration variable here seems to be incorrectly used. It appears to be opt.iterations as per the loop on line 110:

for iteration in range(first_iter, opt.iterations + 1):

However, since the model is being saved to the directory "point_cloud/iteration_{}/", it seems more likely that it should be args.iteration rather than opt.iterations.

The problem is, if args.iteration is not specified, its default value would be -1, which would not work for the directory path. Therefore, simply renaming iteration wouldn't solve the issue.

As a workaround, I replaced iteration with scene.loaded_iter:

scene.save_feature(scene.loaded_iter, target = 'contrastive_feature')
torch.save(sam_proj.state_dict(), os.path.join(scene.model_path, "point_cloud/iteration_{}/".format(scene.loaded_iter) + "sam_proj.pt"))

This seems to have solved the issue for me. Could you please confirm if this is the correct approach?

Render different perspectives

Hello, thank you very much for your work. After obtaining the Gaussian model, how can I render from a viewpoint that differs from the camera poses of the photos in the input dataset? Looking forward to your reply!

About evaluation and test.

Dear author, thank you very much for open-sourcing this work; it is amazing.

I am a beginner in 3D rendering and SAM. I would like to learn how to test on the NVOS and SPIn-NeRF datasets. Will you release the relevant test code?

For rendering the segmentation results

python render.py -m <path to the pre-trained 3DGS model> --precomputed_mask --target scene --segment

I have saved multiple precomputed masks; how can I segment these multiple targets at once?
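
Note: a hedged workaround is to merge the saved masks with a logical OR into one mask file and pass that single file to render.py; a sketch with hypothetical paths:

    import torch

    # Combine several per-target Gaussian masks into a single mask.
    paths = ['segmentation_res/mask_a.pt', 'segmentation_res/mask_b.pt']
    merged = torch.load(paths[0]).bool()
    for p in paths[1:]:
        merged |= torch.load(p).bool()
    torch.save(merged, 'segmentation_res/merged_mask.pt')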

How to view the segmentation results

I followed all the training steps one by one, but after running python render.py I don't know how to view the 3D segmentation result; opening the output directly with the gaussian-splatting SIBR viewer throws an error.
How should I view the experimental results?

argument error

[screenshot of the error]

When I try to use python train_scene.py -s, I get the error shown above.

A code problem in prompt_segmenting.ipynb

Hi, great work; I have been studying the code recently. I'd like to ask how to handle an error that occurs while running the postprocess_grad_based_statistical_filtering function in prompt_segmenting.ipynb. Is the wrong function being called? As far as I can see, render calls the original 3DGS renderer, which has no mask parameter. Thanks!

TypeError Traceback (most recent call last)
Cell In[27], line 7
4 # write_ply('./segmentation_res/vanilla_seg.ply', selected_xyz)
6 selected_xyz, thresh, mask_ = postprocess_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask = mask.clone(), max_time=1)
----> 7 filtered_points, filtered_mask, thresh = postprocess_grad_based_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask=mask_.clone(), feature_gaussians=feature_gaussians, view=view, sam_mask=ref_mask.clone(), pipeline_args=pipeline.extract(args))
8 # filtered_points, thresh = postprocess_statistical_filtering(pcd=selected_xyz.clone(), max_time=3)
9
10 # print(thresh)
11 # write_ply('./segmentation_res/filtered_seg.ply', filtered_points)
12 time5 = time.time() - start_time

Cell In[21], line 136
133 grad_catch_mask[precomputed_mask, :] = 1
134 grad_catch_mask.requires_grad = True
--> 136 grad_catch_2dmask = render(
137 view,
138 feature_gaussians,
139 pipeline_args,
140 background,
141 filtered_mask=~precomputed_mask,
142 override_color=torch.zeros(feature_gaussians.get_opacity.shape[0], 3, device = 'cuda'),
143 override_mask=grad_catch_mask,
144 )['mask']
147 target_mask = torch.tensor(sam_mask, device=grad_catch_2dmask.device)
148 target_mask = torch.nn.functional.interpolate(target_mask.unsqueeze(0).unsqueeze(0).float(), size=grad_catch_2dmask.shape[-2:] , mode='bilinear').squeeze(0).repeat([3,1,1])

File /data/users/yyy/SegAnyGAussians/gaussian_renderer/__init__.py:100, in render(viewpoint_camera, pc, pipe, bg_color, scaling_modifier, override_color, override_mask, filtered_mask)
94 colors_precomp = override_color
96 # print("Render time checker: prepare vars", time.time() - start_time)
97
98
99 # Rasterize visible Gaussians to image, obtain their radii (on screen).
--> 100 rendered_image, rendered_mask, radii = rasterizer(
101 means3D = means3D,
102 means2D = means2D,
103 shs = shs,
104 colors_precomp = colors_precomp,
105 opacities = opacity,
106 mask = mask,
107 scales = scales,
108 rotations = rotations,
109 cov3D_precomp = cov3D_precomp)
111 # print("Render time checker: main render", time.time() - start_time)
112
113 # Those Gaussians that were frustum culled or had a radius of 0 were not visible.
114 # They will be excluded from value updates used in the splitting criteria.
115 return {"render": rendered_image,
116 "mask": rendered_mask,
117 "viewspace_points": screenspace_points,
118 "visibility_filter" : radii > 0,
119 "radii": radii}

File ~/anaconda3/envs/yyy/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)

File ~/anaconda3/envs/yyy/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None

TypeError: GaussianRasterizer.forward() got an unexpected keyword argument 'mask'

How to use custom data?

If I want to use my own custom images, how do I process them? Can I do the same as with gaussian splatting?

Understanding opacity

Hello, I have benefited a lot from your SAGA project!

A question arose while I was reading the code: after using precomputed_mask to extract the Gaussians corresponding to the target from the whole scene, the target can be rendered with render. But how is the normalization of opacity preserved in this process? Since some of the background Gaussians are not added to the rasterization, could the rasterization fail to terminate?

Sincerely looking forward to your answer!

A question about segmentation

Hello, I found that at inference time the Gaussians obtained purely by feature matching (before post-processing) are very sparse; after rendering they look like this:
[rendered result screenshot]
Could you explain why the Gaussians obtained by feature matching are so sparse that the 3D prior-based growing algorithm is needed to refine them? Is it because the features are not learned well enough?

Estimated release date?

Omg! This is a great tool for the community. Is there an estimated release date for the code?

A question about the correspondence loss

Thank you for open-sourcing your code. While reading the code I found something I don't understand; it is located in train_contrastive_feature.py:

        norm_rendered_feature = torch.nn.functional.normalize(torch.nn.functional.interpolate(rendered_features.unsqueeze(0), (H,W), mode = 'bilinear').squeeze(), dim=0, p=2)
        correspondence = torch.relu(torch.einsum('CHW,CJK->HWJK', norm_rendered_feature, norm_rendered_feature))

Two identical norm_rendered_feature tensors are multiplied to produce the result called correspondence. I don't understand this step. Can you explain it?
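
Note: since norm_rendered_feature is L2-normalized over the channel dimension, the einsum contracts that dimension, so correspondence[h, w, j, k] is the ReLU-clipped cosine similarity between the features at pixels (h, w) and (j, k). An equivalent (slower, illustrative) sketch:

    # Equivalent to the einsum above; memory/time is O((H*W)^2).
    f = norm_rendered_feature            # (C, H, W), L2-normalized over C
    C, H, W = f.shape
    flat = f.reshape(C, H * W)           # (C, HW)
    corr = torch.relu(flat.T @ flat)     # (HW, HW) pairwise cosine similarities
    correspondence = corr.reshape(H, W, H, W)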

Render.py script AttributeError: 'str' object has no attribute 'squeeze'

I want to start with a thanks! This project seems to have massive potential, and though it was a rough start with some issues setting up the environment, I have been able to reach the end of prompt_segmenting.ipynb and I'm currently testing the repo on the 360_v2 bicycle dataset.

I am facing this error while running the render.py script -

(saga) root@a100-instance-ajna:~/SegAnyGAussians# python render.py -m output/77fa46cb-3 --precomputed_mask segmentation_res --target scene --segment
Looking for config file in output/77fa46cb-3/cfg_args
Config file found: output/77fa46cb-3/cfg_args
Rendering output/77fa46cb-3
Using precomputed mask segmentation_res
Loading trained model at iteration 30000, None [01/03 03:58:06]
Reading camera 194/194 [01/03 03:58:07]
Loading Training Cameras [01/03 03:58:07]
Loading Test Cameras [01/03 03:59:47]
Traceback (most recent call last):
  File "render.py", line 128, in <module>
    render_sets(model.extract(args), args.iteration, pipeline.extract(args), args.skip_train, args.skip_test, args.segment, args.target, args.idx, args.precomputed_mask)
  File "render.py", line 87, in render_sets
    gaussians.segment(precomputed_mask)
  File "/root/SegAnyGAussians/scene/gaussian_model.py", line 338, in segment
    mask = mask.squeeze()
AttributeError: 'str' object has no attribute 'squeeze'

Please suggest where I can make a change so that I can view the segmentation results! Would a Gaussian splat viewer like antimatter15.com/splat work?

Thanks :)
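
Note: a hedged reading of the traceback is that --precomputed_mask was given a directory (segmentation_res), so the raw string reaches gaussians.segment(). Pointing the flag at the actual tensor file, or loading it before the call, should help; a sketch:

    # Either pass the .pt file on the command line:
    #   python render.py -m output/77fa46cb-3 --precomputed_mask segmentation_res/final_mask.pt --target scene --segment
    # or load the mask before calling segment() (illustrative patch):
    import torch
    precomputed_mask = torch.load('segmentation_res/final_mask.pt')
    gaussians.segment(precomputed_mask)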

Converting spherical harmonics to RGB

Sorry to bother you again; last time I asked you about SH2RGB. I recently tried it, but the conversion result differs from the one in your paper. Is something handled incorrectly?
[figure from the paper]
For this scene, I read f_dc_0, f_dc_1, f_dc_2 from the ply, passed them into the SH2RGB function, stored the result as rgb in the ply, and visualized it with the viewer; the result is:
[viewer screenshot]
The CloudCompare result is:
[CloudCompare screenshot]
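
Note: for reference, the DC-term conversion in the vanilla 3DGS codebase (utils/sh_utils.py) is linear, so a mismatch usually comes from missing clipping or color handling rather than the formula itself; a sketch:

    import numpy as np

    C0 = 0.28209479177387814  # zeroth-order SH basis constant, as in 3DGS utils/sh_utils.py

    def SH2RGB(sh):
        return sh * C0 + 0.5

    # f_dc: (N, 3) array of f_dc_0, f_dc_1, f_dc_2 read from the ply.
    # Clip to [0, 1] before writing the values back as colors:
    # rgb = np.clip(SH2RGB(f_dc), 0.0, 1.0)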

Question about mask in diff_rasterizer

Hello! Thanks for your great work. I'm really interested in the code below (the embedded snippet from the CUDA rasterizer did not survive extraction):

I have several questions about this CUDA line:

  1. Is this the mask value for a given pixel, given by the following equation:
    $m = \sum_{i}\alpha_{i}\prod_{j<i}(1-\alpha_{j})$
    just like Eq. (1) in the main paper with $c_{i}$ omitted?

  2. "mask": rendered_mask,

    Is this the returned rendered mask corresponding to the CUDA line?

Hope for your reply!
Sincerely.
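
Note: for intuition, the equation above is standard front-to-back alpha compositing with the color term dropped; a per-pixel sketch:

    # Front-to-back compositing of one pixel's mask value, matching
    # m = sum_i alpha_i * prod_{j<i} (1 - alpha_j).
    def composite_mask(alphas):
        m, transmittance = 0.0, 1.0
        for a in alphas:                 # Gaussians sorted near to far
            m += a * transmittance
            transmittance *= 1.0 - a
        return m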

RuntimeError: Function _RasterizeGaussiansBackward returned an invalid gradient at index 2 - got [0, 0, 3] but expected shape compatible with [0, 16, 3]

Hello, I encountered this problem when training the Gaussians on the nerf_synthetic dataset:
RuntimeError: Function _RasterizeGaussiansBackward returned an invalid gradient at index 2 - got [0, 0, 3] but expected shape compatible with [0, 16, 3].
I searched the gaussian-splatting issues but found no concrete fix. Have you encountered and solved this?

Pretrained models

Hi, authors! Thank you for open-sourcing your codes!

I'm wondering if you could provide your pretrained models? Thank you.

Missing seg_cfg_args

When I tried to obtain the 2D rendered masks, it complained that the seg_cfg_args file is missing. The output folder indeed contains only cfg_args and no seg_cfg_args. Where is seg_cfg_args generated?

Some questions about Gaussian Splatting for segmentation

Sorry to bother you again. I do not know much about Gaussian splatting, but my supervisor asked me: why does today's 3D segment-everything work involve 2D SAM instead of going straight to 3D segmentation of everything? I hope you can give me some tips or guidance for further research. Looking forward to your comments! Thanks!

python train_contrastive_feature.py -m <path to the pre-trained 3DGS model> cannot find the corresponding files

Hello fellow alum, your work is impressive! When I run python train_contrastive_feature.py -m <path to the pre-trained 3DGS model>, it reports:

(gaussian_splatting) root@[email protected]:/home/SegAnyGAussians# python train_contrastive_feature.py -m /home/SegAnyGAussians/output/154f4d6d-5/
Looking for config file in /home/SegAnyGAussians/output/154f4d6d-5/cfg_args
Config file found: /home/SegAnyGAussians/output/154f4d6d-5/cfg_args
Optimizing /home/SegAnyGAussians/output/154f4d6d-5/
Loading trained model at iteration 30000, None [02/02 19:24:29]
Reading camera 1/20
Traceback (most recent call last):
File "train_contrastive_feature.py", line 214, in <module>
training(lp.extract(args), op.extract(args), pp.extract(args), args.iteration)
File "train_contrastive_feature.py", line 79, in training
scene = Scene(dataset, gaussians, feature_gaussians, load_iteration=iteration, shuffle=False, target='contrastive_feature', mode='train', sample_rate=sample_rate)
File "/home/SegAnyGAussians/scene/init.py", line 98, in init
scene_info = sceneLoadTypeCallbacks["Colmap"](args.source_path, args.images, args.eval, need_features = args.need_features, need_masks = args.need_masks, sample_rate = sample_rate)
File "/home/SegAnyGAussians/scene/dataset_readers.py", line 162, in readColmapSceneInfo
cam_infos_unsorted = readColmapCameras(cam_extrinsics=cam_extrinsics, cam_intrinsics=cam_intrinsics, images_folder=os.path.join(path, reading_dir), features_folder='/home/SegAnyGAussians/data/nerf_llff_data/fern/features/' if need_features else None, masks_folder='/home/SegAnyGAussians/data/nerf_llff_data/fern/sam_masks/' if need_masks else None, sample_rate=sample_rate)
File "/home/SegAnyGAussians/scene/dataset_readers.py", line 111, in readColmapCameras
masks = torch.load(os.path.join(masks_folder, image_name.split('.')[0] + ".pt")) if masks_folder is not None else None
File "/root/miniconda3/envs/gaussian_splatting/lib/python3.7/site-packages/torch/serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "/root/miniconda3/envs/gaussian_splatting/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/root/miniconda3/envs/gaussian_splatting/lib/python3.7/site-packages/torch/serialization.py", line 211, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/SegAnyGAussians/data/nerf_llff_data/fern/sam_masks/IMG_4026.pt'

Roughly, it cannot find the IMG_4026.pt file, but I don't understand why it looks directly in the sam_masks folder. I tried renaming the sam_masks and features folders, but that didn't work either; it then went straight into features instead.

So I went to scene/dataset_readers.py and hard-coded absolute paths at line 162: cam_infos_unsorted = readColmapCameras(cam_extrinsics=cam_extrinsics, cam_intrinsics=cam_intrinsics, images_folder=os.path.join(path, reading_dir), features_folder='/home/SegAnyGAussians/data/nerf_llff_data/fern/features/' if need_features else None, masks_folder='/home/SegAnyGAussians/data/nerf_llff_data/fern/sam_masks/' if need_masks else None, sample_rate=sample_rate), but it had no effect.

Finally, I moved all the files from features into sam_masks, which didn't work either; it then reported: Looking for config file in /home/SegAnyGAussians/output/154f4d6d-5/cfg_args
Config file found: /home/SegAnyGAussians/output/154f4d6d-5/cfg_args
Optimizing /home/SegAnyGAussians/output/154f4d6d-5/
Loading trained model at iteration 30000, None [02/02 19:35:01]
Reading camera 20/20 [02/02 19:35:02]
Loading Training Cameras [02/02 19:35:02]
Loading Test Cameras [02/02 19:35:11]
Training progress: 0%| | 0/30000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "train_contrastive_feature.py", line 214, in
training(lp.extract(args), op.extract(args), pp.extract(args), args.iteration)
File "train_contrastive_feature.py", line 128, in training
sam_masks = torch.nn.functional.interpolate(sam_masks.unsqueeze(0), size=sam_features.shape[-2:] , mode='nearest').squeeze()
File "/root/miniconda3/envs/gaussian_splatting/lib/python3.7/site-packages/torch/nn/functional.py", line 3855, in interpolate
"Input and output must have the same number of spatial dimensions, but got "
ValueError: Input and output must have the same number of spatial dimensions, but got input with with spatial dimensions of [256, 64, 64] and output size of torch.Size([64, 64]). Please provide input tensor in (N, C, d1, d2, ...,dK) format and output size in (o1, o2, ...,oK) format.

CUDA out of memory when using > 500 images

When I use 550 images (640 × 480) and run train_contrastive_feature.py with no resizing (downsample = 1), I get a CUDA out-of-memory error.
Does this mean there is no need to use this many images? Can you give some advice?
RTX 3090
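
Note: per a traceback elsewhere in these issues, the Scene constructor used by train_contrastive_feature.py accepts a sample_rate argument, so subsampling the training cameras is one hedged way to cut memory (the value 0.5 below is illustrative):

    # Sketch: load only a fraction of the training cameras.
    scene = Scene(dataset, gaussians, feature_gaussians, load_iteration=iteration,
                  shuffle=False, target='contrastive_feature', mode='train',
                  sample_rate=0.5)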

How can I accelerate extract_segment_everything_masks.py ?

I tried running extract_segment_everything_masks.py on a dataset of about 300 images; it took about 22 minutes, whereas the 3DGS reconstruction with these photos takes less time, so preparing the data takes longer than the reconstruction itself.
Is there a way to accelerate extract_segment_everything_masks.py? Some loss of accuracy is acceptable. (Downsampling the images does not seem to speed up this step.)
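
Note: one hedged way to trade accuracy for speed, assuming the script uses SAM's automatic mask generator, is to reduce the prompt-grid density via the points_per_side parameter of the official API:

    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").cuda()
    # The default points_per_side is 32 (a 32x32 prompt grid); 16 quarters the
    # number of prompts and roughly quarters per-image time, at some loss of recall.
    mask_generator = SamAutomaticMaskGenerator(sam, points_per_side=16)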

Usage instruction?

Hey, thanks for your excellent work!
Could you please provide usage instructions?

How to crop out an object in a pretrained 3d gaussian model

Thanks for your brilliant work. I was wondering if this method could clip an object out of a 3D Gaussian point cloud. For example, if I want a clean object with all of its background removed from a pretrained Gaussian model, can it extract the Gaussian splats that belong to the object and remove all splats that belong to the background?

questions about train_contrastive_feature.py and convert.py

Hello Mr. Jumpat,

First, I want to say thank you for your wonderful contribution to the 3DGS field; 3DGS is new and what you did is amazing.
I'm an undergraduate student working on my final project, and I'm very new to this field, so I'm slow to understand how your code works. I have a few noob questions:

  1. I don't understand what train_contrastive_feature.py does; would you please explain it to me in easy terms?
  2. If I want to use my own images as the data, would it be sufficient to just run convert.py, keep the resulting data structure, and proceed to the next steps? To be clear: after converting, do I have to change the data structure format to be the same as the example?

For now I can only try to understand the code without running it, because the machine I'm working on is under repair. I might ask more questions during implementation; I sincerely hope you're okay with that. I'm sorry for the long questions; thank you, and I'm looking forward to your reply!

error in prompt_segmenting.ipynb: grad is all positive

Happy New Year! I ran into a problem when running the following code:

start_time = time.time()
selected_xyz = xyz[mask.cpu()].data
selected_score = point_scores[mask.cpu()]
# write_ply('./segmentation_res/vanilla_seg.ply', selected_xyz)

selected_xyz, thresh, mask_ = postprocess_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask = mask.clone(), max_time=1)
filtered_points, filtered_mask, thresh = postprocess_grad_based_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask=mask_.clone(), feature_gaussians=feature_gaussians, view=view, sam_mask=ref_mask.clone(), pipeline_args=pipeline.extract(args))
# filtered_points, thresh = postprocess_statistical_filtering(pcd=selected_xyz.clone(), max_time=3)

# print(thresh)
# write_ply('./segmentation_res/filtered_seg.ply', filtered_points)
time5 = time.time() - start_time
print(time5)

When execution reaches postprocess_grad_based_statistical_filtering, the error message is as follows:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_18342/824582037.py in <module>
      5 
      6 selected_xyz, thresh, mask_ = postprocess_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask = mask.clone(), max_time=1)
----> 7 filtered_points, filtered_mask, thresh = postprocess_grad_based_statistical_filtering(pcd=selected_xyz.clone(), precomputed_mask=mask_.clone(), feature_gaussians=feature_gaussians, view=view, sam_mask=ref_mask.clone(), pipeline_args=pipeline.extract(args))
      8 # filtered_points, thresh = postprocess_statistical_filtering(pcd=selected_xyz.clone(), max_time=3)
      9 

/tmp/ipykernel_18342/3925413618.py in postprocess_grad_based_statistical_filtering(pcd, precomputed_mask, feature_gaussians, view, sam_mask, pipeline_args)
    186     ).dists
    187     mean_nearest_k_distance, std_nearest_k_distance = test_nearest_k_distance[:,:,1:].mean(), test_nearest_k_distance[:,:,1:].std()
--> 188     test_threshold = torch.max(test_nearest_k_distance)
    189     print(test_threshold, "test threshold")
    190 

RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.

Tracing back, the problem arises in this part of postprocess_grad_based_statistical_filtering:

    grad_score = grad_catch_mask.grad[precomputed_mask != 0].clone().squeeze()
    grad_score = -grad_score
    
    pos_grad_score = grad_score.clone()
    print("pos_grad_score", pos_grad_score)
    pos_grad_score[pos_grad_score <= 0] = 0
    pos_grad_score[pos_grad_score <= pos_grad_score.mean() + pos_grad_score.std()] = 0
    pos_grad_score[pos_grad_score != 0] = 1

    confirmed_mask = pos_grad_score.bool()

    if type(pcd) == np.ndarray:
        pcd = torch.from_numpy(pcd).cuda()
    else:
        pcd = pcd.cuda()

    confirmed_point = pcd[confirmed_mask == 1]

I tracked the value of grad_score and found that grad_catch_mask.grad[precomputed_mask != 0] is entirely positive, so grad_score is entirely negative. Since all negative values are then set to 0, pos_grad_score is all zeros, which makes confirmed_mask all False and confirmed_point an empty set; taking max() over an empty tensor raises this error.

From the context, my understanding is that grad_score comes from rendering an image with 3DGS and computing gradient information on it. But I am not sure how to handle the case where the gradients are all positive.

Hello author

When I run prompt_segmenting, the third cell raises a problem:
[screenshot of the error]

Codes in prompt_segmenting.py

Hi. I am a little bit confused about your code in prompt_segmenting.py:

# Filter out the points confirmed to be negative

final_mask = point_mask.float().detach().clone().unsqueeze(-1)
final_mask.requires_grad = True

background = torch.zeros(final_mask.shape[0], 3, device = 'cuda')
rendered_mask_pkg = gaussian_renderer.render_mask(cameras[ref_img_camera_id], feature_gaussians, pipeline.extract(args), background, precomputed_mask=final_mask)

tmp_target_mask = torch.tensor(origin_ref_mask, device=rendered_mask_pkg['mask'].device)
tmp_target_mask = torch.nn.functional.interpolate(tmp_target_mask.unsqueeze(0).unsqueeze(0).float(), size=rendered_mask_pkg['mask'].shape[-2:] , mode='bilinear').squeeze(0)
tmp_target_mask[tmp_target_mask > 0.5] = 1
tmp_target_mask[tmp_target_mask != 1] = 0

loss = 30*torch.pow(tmp_target_mask - rendered_mask_pkg['mask'], 2).sum()
loss.backward()

grad_score = final_mask.grad.clone()
final_mask = final_mask - grad_score
final_mask[final_mask < 0] = 0
final_mask[final_mask != 0] = 1
final_mask *= point_mask.unsqueeze(-1)

Could you please explain the code here? Why do you compute the loss and call backward() to get the final_mask?
Thank you.
