
Comments (10)

Nik-V9 commented on July 29, 2024

Hi, Thanks for trying out the code!

SplaTAM requires depth input for running SLAM & reconstruction. Our dataloaders, by default, expect both an rgb and depth folder. We haven't tested the offline setting in the NeRFCapture App. We have only tested our scripts, which interface with NeRFCapture in Online mode.

I just checked capturing an offline dataset with NeRFCapture. It looks like both the RGB and depth PNGs are saved under the images folder. You would need to move the .depth.png images to a depth folder and rename them to .png, in addition to renaming the images folder. This should be a pretty simple script to write.
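Something along these lines should do it (a rough sketch; the exact capture folder layout and filenames are assumptions based on the description above):

import shutil
from pathlib import Path

# Rough sketch: reorganize an offline NeRFCapture dump into the rgb/ and depth/
# layout the dataloaders expect. Paths and naming are assumptions.
capture_dir = Path("path/to/offline_capture")
images_dir = capture_dir / "images"
depth_dir = capture_dir / "depth"
depth_dir.mkdir(exist_ok=True)

# Move <id>.depth.png -> depth/<id>.png
for f in sorted(images_dir.glob("*.depth.png")):
    shutil.move(str(f), str(depth_dir / f.name.replace(".depth.png", ".png")))

# Only the color PNGs remain in images/; rename the folder itself to rgb/
images_dir.rename(capture_dir / "rgb")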


Nik-V9 commented on July 29, 2024

I think the offline mode in the Nerfcapture app is broken as pointed out by the app's developer here jc211/NeRFCapture#10 (comment). I tried renaming the files yesterday but that does not seem to cut it: the tensor dimensions were off and the depth PNG files themselves looked wrong: they do not seem to store the full depth range. This is probably a bit different to how the online mode works.

Yes, this is correct. Looks like the offline mode is broken. So far, we have only used the online mode.

I'm actually trying to get this to work using our own iOS data collection app (not related to Nerfcapture or SplaTAM), see here for details, but I'm not sure if we got the depth conversion correct yet.

This looks like a cool app.

So to summarize: if an app/script exports RGBD data where the depth PNGs have a depth scale of 6553.5, and the camera intrinsics are correctly set in transforms.json, it should probably work?

Yes! We need depth and intrinsics in addition to RGB for SLAM. The depth scale doesn't specifically have to be 6553.5 (as long as the pixel intensity to meter scaling is known). That's what our iPhone dataloader is currently hardcoded to:

config_dict["camera_params"]["png_depth_scale"] = 6553.5 # Depth is in mm
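So if your own app exports depth in a different unit, only this value would change; for example (hypothetical, for PNGs stored in millimeters):

# Hypothetical: a capture app that writes depth PNGs in millimeters (1000 units = 1 m)
config_dict["camera_params"]["png_depth_scale"] = 1000.0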


oseiskar commented on July 29, 2024

I think the offline mode in the Nerfcapture app is broken as pointed out by the app's developer here jc211/NeRFCapture#10 (comment). I tried renaming the files yesterday but that does not seem to cut it: the tensor dimensions were off and the depth PNG files themselves looked wrong: they do not seem to store the full depth range. This is probably a bit different to how the online mode works.

I'm actually trying to get this to work using our own iOS data collection app (not related to Nerfcapture or SplaTAM), see here for details, but I'm not sure if we got the depth conversion correct yet.

If I understood this comment correctly:

  • the online mode reads (uint32 or float32?) data from the Nerfcapture app
  • then scales that by some number (1/10?) and saves to PNG
  • the scale of the original depth image (before saving to PNG) is assumed to be 65535 units = 1m
  • so the PNG depth scale here is 6553.5 units = 1m
  • other datasets are configured to use other depth scaling, more typically 1000 units = 1m, i.e., depth in millimeters

So to summarize: if an app/script exports RGBD data where the depth PNGs have a depth scale of 6553.5, and the camera intrinsics are correctly set in transforms.json, it should probably work?
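A quick numerical sanity check of that reading (my interpretation, not verified against the app's source):

# 65535 units = 1 m in the raw capture; the online script divides by 10 before saving
raw_units_per_meter = 65535.0
png_units_per_meter = raw_units_per_meter / 10.0   # = 6553.5
max_depth_in_png = 65535 / png_units_per_meter     # = 10.0 m fits in a uint16 PNG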


LemonSoda-RPG commented on July 29, 2024

Hi, Thanks for trying out the code!

SplaTAM requires depth input for running SLAM & reconstruction. Our dataloaders, by default, expect both an rgb and depth folder. We haven't tested the offline setting in the NeRFCapture App. We have only tested our scripts, which interface with NeRFCapture in Online mode.

I just checked capturing an offline dataset with NeRFCapture. It looks like both the RGB and depth PNGs are saved under the images folder. You would need to move the .depth.png images to a depth folder and rename them to .png, in addition to renaming the images folder. This should be a pretty simple script to write.

Hello, I am running the program using WSL and I don't know how to keep WSL and my phone on the same network segment, so I used NeRFCapture for offline data collection. However, after the capture completed, I found that the folder on my phone only contains color images and transforms.json, with no depth maps. My phone is an iPhone 14; is it unable to capture depth maps?


StarsTesla commented on July 29, 2024

@Nik-V9 I did check the images dir; there are only RGB images. Does the data need to be collected with a LiDAR-equipped iPhone? Maybe this could be improved by using something like MiDaS or MVSNet to estimate depth?


H-tr commented on July 29, 2024

Hi, Thanks for trying out the code!
SplaTAM requires depth input for running SLAM & reconstruction. Our dataloaders, by default, expect both an rgb and depth folder. We haven't tested the offline setting in the NeRFCapture App. We have only tested our scripts, which interface with NeRFCapture in Online mode.
I just checked capturing an offline dataset with NeRFCapture. It looks like both the RGB and depth PNGs are saved under the images folder. You would need to move the .depth.png images to a depth folder and rename them to .png, in addition to renaming the images folder. This should be a pretty simple script to write.

Hello, I am running the program using WSL and I don't know how to keep WSL and my phone on the same network segment, so I used NeRFCapture for offline data collection. However, after the capture completed, I found that the folder on my phone only contains color images and transforms.json, with no depth maps. My phone is an iPhone 14; is it unable to capture depth maps?

Hi, I also faced the same issue and found that only the Pro models support LiDAR (see attached image).


Nik-V9 commented on July 29, 2024

@Nik-V9 I did check the images dir; there are only RGB images. Does the data need to be collected with a LiDAR-equipped iPhone? Maybe this could be improved by using something like MiDaS or MVSNet to estimate depth?

Yes, you need a LiDAR-equipped iPhone for the demo.

Using a depth estimation network would make the method correct only up to scale (not metric), since monocular depth has no absolute scale. Your camera tracking performance would also be influenced by the accuracy and multi-view consistency of the depth estimation network. An RGB-only SLAM method using 3D Gaussians is currently future research and something we might consider.


pablovela5620 commented on July 29, 2024

I think the offline mode in the Nerfcapture app is broken as pointed out by the app's developer here jc211/NeRFCapture#10 (comment). I tried renaming the files yesterday but that does not seem to cut it: the tensor dimensions were off and the depth PNG files themselves looked wrong: they do not seem to store the full depth range. This is probably a bit different to how the online mode works.

Yes, this is correct. Looks like the offline mode is broken. So far, we have only used the online mode.

I'm actually trying to get this to work using our own iOS data collection app (not related to Nerfcapture or SplaTAM), see here for details, but I'm not sure if we got the depth conversion correct yet.

This looks like a cool app.

So to summarize: if an app/script exports RGBD data where the depth PNGs have a depth scale of 6553.5, and the camera intrinsics are correctly set in transforms.json, it should probably work?

Yes! We need depth and intrinsics in addition to RGB for SLAM. The depth scale doesn't specifically have to be 6553.5 (as long as the pixel intensity to meter scaling is known). That's what our iPhone dataloader is currently hardcoded to:

config_dict["camera_params"]["png_depth_scale"] = 6553.5 # Depth is in mm

Where does the 6553.5 number come from? I'm also trying to get this working; I see you use a depth_scale of 10 and this magic number of 6553.5, but I don't fully understand. What is encoded in the depth image? To get the actual metric value, would I need to divide by 6553.5 and multiply by 10?


Nik-V9 commented on July 29, 2024

Hi @pablovela5620, the 6553.5 is the scaling factor for the depth PNG image. When you load the depth image, you need to divide the pixel values by this number to get metric depth. By default, the raw iPhone depth image has a pixel intensity of 65535 corresponding to 1 meter. When we save the depth image, we divide by 10 before writing the PNG, which gives the 65535 / 10 = 6553.5 scale.
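For anyone loading these PNGs by hand, a minimal sketch of that conversion (the file path and use of PIL here are just for illustration):

import numpy as np
from PIL import Image

PNG_DEPTH_SCALE = 6553.5  # 65535 units/m divided by 10 at save time

depth_png = np.asarray(Image.open("depth/0.png"))          # 16-bit PNG, integer units
depth_m = depth_png.astype(np.float32) / PNG_DEPTH_SCALE   # metric depth in meters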


jeezrick commented on July 29, 2024

I think the offline mode in the Nerfcapture app is broken as pointed out by the app's developer here jc211/NeRFCapture#10 (comment). I tried renaming the files yesterday but that does not seem to cut it: the tensor dimensions were off and the depth PNG files themselves looked wrong: they do not seem to store the full depth range. This is probably a bit different to how the online mode works.

Yes, this is correct. Looks like the offline mode is broken. So far, we have only used the online mode.

I'm actually trying to get this to work using our own iOS data collection app (not related to Nerfcapture or SplaTAM), see here for details, but I'm not sure if we got the depth conversion correct yet.

This looks like a cool app.

So to summarize: if an app/script exports RGBD data where the depth PNGs have a depth scale of 6553.5, and the camera intrinsics are correctly set in transforms.json, it should probably work?

Yes! We need depth and intrinsics in addition to RGB for SLAM. The depth scale doesn't specifically have to be 6553.5 (as long as the pixel intensity to meter scaling is known). That's what our iPhone dataloader is currently hardcoded to:

config_dict["camera_params"]["png_depth_scale"] = 6553.5 # Depth is in mm

Where does the 6553.5 number come from? I'm also trying to get this working; I see you use a depth_scale of 10 and this magic number of 6553.5, but I don't fully understand. What is encoded in the depth image? To get the actual metric value, would I need to divide by 6553.5 and multiply by 10?

Because they save the depth array like this:

import numpy as np
from PIL import Image

def save_depth_as_png(depth, filename, png_depth_scale):
    depth = depth * png_depth_scale  # scale metric depth (meters) to PNG units
    depth = depth.astype(np.uint16)  # quantize to 16-bit integers (0..65535)
    depth = Image.fromarray(depth)
    depth.save(filename)

When saving it like this, you need to consider that the range of uint16 is 0 to 65535 (2^16 - 1). So I guess what they did is first clamp the actual depth value to [0, 10.0] m, then multiply it by 6553.5, and then convert it to uint16 without any risk of overflow (but losing some accuracy in the float-to-int conversion).
So, after loading the image, just divide it by 6553.5 and you get back the clamped depth value, like this part of the code in SplaTAM/datasets/gradslam_datasets/basedataset.py:

    def _preprocess_depth(self, depth: np.ndarray):
        r"""Preprocesses the depth image by resizing, adding channel dimension, and scaling values to meters. Optionally
        converts depth from channels last :math:`(H, W, 1)` to channels first :math:`(1, H, W)` representation.

        Args:
            depth (np.ndarray): Raw depth image

        Returns:
            np.ndarray: Preprocessed depth

        Shape:
            - depth: :math:`(H_\text{old}, W_\text{old})`
            - Output: :math:`(H, W, 1)` if `self.channels_first == False`, else :math:`(1, H, W)`.
        """
        depth = cv2.resize(
            depth.astype(float),
            (self.desired_width, self.desired_height),
            interpolation=cv2.INTER_NEAREST,
        )
        if len(depth.shape) == 2:
            depth = np.expand_dims(depth, -1)
        if self.channels_first:
            depth = datautils.channels_first(depth)
        return depth / self.png_depth_scale
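Putting the save and load halves together, a quick roundtrip check (a sketch that mirrors the two snippets above, ignoring the resize step):

import numpy as np
from PIL import Image

PNG_DEPTH_SCALE = 6553.5

# Fake metric depth already clamped to [0, 10] m
depth_m = np.random.uniform(0.0, 10.0, size=(192, 256)).astype(np.float32)

# Save as in save_depth_as_png: scale, quantize to uint16, write a 16-bit PNG
Image.fromarray((depth_m * PNG_DEPTH_SCALE).astype(np.uint16)).save("test_depth.png")

# Load as in _preprocess_depth: read, cast to float, divide by the PNG depth scale
loaded_m = np.asarray(Image.open("test_depth.png")).astype(np.float32) / PNG_DEPTH_SCALE

# Only uint16 quantization error remains (< 1/6553.5 m, i.e. well under a millimeter)
assert np.allclose(depth_m, loaded_m, atol=1.0 / PNG_DEPTH_SCALE)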

