
Comments (3)

krrish94 commented on May 29, 2024

Thanks for your interest @paucarre!

In gradslam, at each time step we align the "entire map" with the current frame. This implicitly means that we're performing global optimization at each step (and wouldn't need explicit loop closure detection). However, this also means that currently gradslam can operate in relatively small environments (as global alignment at each timestep would become computationally expensive for large scenes).

from gradslam.

krrish94 commented on May 29, 2024

The differentiable map fusion module indeed uses camera poses as input (which may in turn be computed from odometry). This can be used for learning odometry estimation (but not loop closure detection, since we don't have an explicit loop closure step in gradslam).


paucarre commented on May 29, 2024

Thanks for your answer @krrish94 !

That clarifies a lot.

I was wondering whether you think the following setup is feasible for solving the memory/resources problem:

  1. Using a neural network, downsample the high-memory 3D color map to a low-memory 2D map, so that
    mapping becomes feasible with limited memory resources (a common constraint in robotics)

To be specific, say the original high-memory map, called Original_Map, is a pointcloud (as per the GradSLAM documentation).
Assume the Original_Map pointcloud has the following tensor-size structure:

Original_Map = [
  points = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ],
  normals = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ],
  colors = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ]
]

where K_i are very high (lots of points for each sample).
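For concreteness, this ragged tensor-list structure can be sketched in plain NumPy (a minimal illustration; the specific K_i values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-sample point counts K_i (in practice these are very large).
K = [100_000, 250_000, 180_000]

# Original_Map: for each of the N samples, a (K_i, 3) array of
# points, normals, and RGB colors.
original_map = {
    "points":  [rng.standard_normal((k, 3)) for k in K],
    "normals": [rng.standard_normal((k, 3)) for k in K],
    "colors":  [rng.random((k, 3)) for k in K],
}

# Every field holds N arrays, each of shape (K_i, 3).
for tensors in original_map.values():
    for t, k in zip(tensors, K):
        assert t.shape == (k, 3)
```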

We want to reduce K_i in Original_Map by flattening and down-sampling it into a 2D Compact_Map (a "bird's-eye map").

A neural network would transform Original_Map into a Compact_Map, where Compact_Map has the following structure:

Compact_Map = [
  points = [ (H_1, 2), (H_2, 2), ... , (H_N, 2) ],
  normals = [ (H_1, 2), (H_2, 2), ... , (H_N, 2) ],
  embeddings = [ (H_1, L), (H_2, L), ... , (H_N, L) ]
]

Where:

  • H_i is significantly smaller than K_i (H_i << K_i). This implies the number of samples is reduced, and the embeddings will contain whatever information is required for SLAM.
  • The embedding dimension L would be larger than the 3 color channels, but it can be kept relatively small (say 16 or 32)

Then we perform both differentiable mapping and map fusion on the Compact_Map with low memory, as the
tensor itself could be orders of magnitude smaller (because we reduce the tensor size).
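A non-learned stand-in makes the intended size reduction concrete: flatten points onto a 2D grid in the x-y plane and pool everything that lands in the same cell. This is only a minimal sketch; the grid resolution and the mean-pooled "embedding" (color plus height) are my assumptions, where a real downsampling network would learn a richer L-dimensional embedding:

```python
import numpy as np

def flatten_to_bev(points, colors, cell_size=0.5):
    """Pool a (K, 3) point cloud into a 2D bird's-eye map.

    Returns (H, 2) cell centers and (H, 4) pooled features
    (mean RGB plus mean height), with H << K for dense clouds.
    """
    # Assign each point to a grid cell in the x-y plane.
    cells = np.floor(points[:, :2] / cell_size).astype(np.int64)
    uniq, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    # Mean-pool color and height per cell (a learned network would
    # produce the embeddings here instead).
    feats = np.concatenate([colors, points[:, 2:3]], axis=1)
    pooled = np.zeros((len(uniq), feats.shape[1]))
    np.add.at(pooled, inverse, feats)
    pooled /= np.bincount(inverse).astype(float)[:, None]

    centers = (uniq + 0.5) * cell_size
    return centers, pooled

rng = np.random.default_rng(0)
pts = rng.uniform(-5, 5, size=(50_000, 3))   # K = 50_000 points
cols = rng.random((50_000, 3))
centers, emb = flatten_to_bev(pts, cols)
assert centers.shape[0] < pts.shape[0]       # H << K
```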

  2. Compact_Map to Original_Map Correspondence

Then the only thing left is to establish a correspondence between Compact_Map and Original_Map.
We call this network Local_Correspondence_Network; it could be something along the lines of an autoencoder,
which wouldn't require a labeled training set.

As the correspondences between Compact_Map and Original_Map are local (they are similar from a 2D perspective), one could simply use a sparse 3D convolutional neural network. The loss could just be a reconstruction loss on pointclouds, which I assume is feasible. The idea is that the reconstruction error ensures the colors and point-height information in Original_Map are preserved in Compact_Map.
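The reconstruction loss on pointclouds could, for instance, be a Chamfer distance between the original and reconstructed clouds, a common choice for point-cloud reconstruction. A minimal NumPy sketch, assuming clouds small enough that the pairwise distance matrix fits in memory:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    # Pairwise squared distances, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    # Each point's distance to its nearest neighbor in the other set.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

rng = np.random.default_rng(0)
cloud = rng.standard_normal((500, 3))
assert chamfer_distance(cloud, cloud) == 0.0        # identical clouds
assert chamfer_distance(cloud, cloud + 1.0) > 0.0   # shifted clouds
```

In practice one would use a batched GPU implementation (e.g. a KD-tree or chunked distance computation) rather than the full dense matrix.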

Furthermore, there could be a map-consistency loss. This loss can be generated by constructing the map with Original_Map and GradSLAM and, on a separate path, the map with Compact_Map and GradSLAM. Since during training we can potentially use a high-end computer, it might be fine to generate the whole map. The loss would be something like the difference between 2D correspondences (whether the 3D map from Original_Map can be seen as a "bird's-eye view" like Compact_Map).

I see the following advantages in Local_Correspondence_Network:

  • It can be reused, as it compacts local parts of the map irrespective of the global structure (due to its convolutional nature)
  • As it's convolutional, it will itself be reasonably small ("low memory")
  • It can downsample contiguous parts of the map, but also samples from sensors directly, which enables real-time on-robot SLAM.
  3. Production setup

Then the following steps would take place:

  • Train a Local_Correspondence_Network using samples from sensor data (e.g., an Intel RealSense D435)

  • Downsample all samples from the sensor using Local_Correspondence_Network and build Compact_Map using GradSLAM layers.

  • Load Compact_Map as well as Local_Correspondence_Network into the robot and do the following:

    • For each sample from the robot's sensor, downsample it using Local_Correspondence_Network
    • Use Compact_Map and GradSLAM on the robot for localization within the map.
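The on-robot loop in the last two bullets could look roughly like the following. This is a pure sketch: `local_correspondence_network` and `localize` are hypothetical stand-ins (strided subsampling and a centroid offset) for the trained network and the GradSLAM-based localization step:

```python
import numpy as np

def local_correspondence_network(frame):
    """Hypothetical stand-in: downsample one (K, 3) sensor frame to (H, 3)."""
    return frame[::100]  # placeholder: strided subsampling, H = K / 100

def localize(compact_map, compact_frame):
    """Hypothetical stand-in: pose as the offset between centroids."""
    return compact_frame.mean(axis=0) - compact_map.mean(axis=0)

rng = np.random.default_rng(0)
compact_map = rng.standard_normal((1_000, 3))     # prebuilt Compact_Map

# On-robot loop: downsample each incoming frame, then localize it
# against the compact map.
for _ in range(3):
    frame = rng.standard_normal((100_000, 3))     # raw sensor sample
    compact_frame = local_correspondence_network(frame)
    pose = localize(compact_map, compact_frame)
    assert compact_frame.shape[0] < frame.shape[0]
```

In the real setup, the downsampling step would run the trained network and the localization step would run GradSLAM's differentiable layers against Compact_Map.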

Overall, I was wondering if you could help me to understand:

  1. Whether the idea makes any sense
  2. Whether the framework, as it is now, is capable of such setup

