Comments (3)
Thanks for your interest @paucarre!
In gradslam, at each time step we align the "entire map" with the current frame. This implicitly means that we're performing global optimization at each step (and wouldn't need explicit loop closure detection). However, it also means that gradslam can currently operate only in relatively small environments, as global alignment at each timestep would become computationally expensive for large scenes.
from gradslam.
The differentiable map fusion module indeed uses camera poses as input (which may in turn be computed from odometry). This can be used for learning odometry estimation, but not loop closure detection, since we don't have an explicit loop closure step in gradslam.
Thanks for your answer @krrish94 !
That clarifies a lot.
I was wondering whether you think the following setup is feasible for solving the memory/resources problem:
- Using a neural network, downsample the high-memory 3D color map to a low-memory 2D map where it's feasible to do mapping with limited memory resources (which happens often in robotics).
To be specific, say the original high-memory map, called `Original_Map`, is a pointcloud (as per the gradslam documentation). Assume the `Original_Map` pointcloud has the following tensor-size structure:

```
Original_Map = [
    points  = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ],
    normals = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ],
    colors  = [ (K_1, 3), (K_2, 3), ... , (K_N, 3) ]
]
```

where the `K_i` are very large (lots of points for each sample).
We want to reduce `K_i` in `Original_Map` by flattening and down-sampling it to a 2D `Compact_Map` (a "bird's-eye map"). A neural network would transform the original `Original_Map` into a `Compact_Map` with the following structure:

```
Compact_Map = [
    points     = [ (H_1, 2), (H_2, 2), ... , (H_N, 2) ],
    normals    = [ (H_1, 2), (H_2, 2), ... , (H_N, 2) ],
    embeddings = [ (H_1, L), (H_2, L), ... , (H_N, L) ]
]
```
Where:
- `H_i` is very significantly smaller than `K_i` (`H_i << K_i`). This means the number of points is reduced, and `embeddings` will contain whatever information is required for SLAM.
- The embedding dimension `L` would be bigger than the 3 color channels, but it can be kept relatively small (say 16 or 32).
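To make the intended savings concrete, here is a minimal, non-learned sketch of the compaction in NumPy: a bird's-eye "pillar pooling" baseline standing in for what the proposed network would learn. All sizes (`K`, `L`, the cell size) are made-up illustrative values, not gradslam APIs.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 100_000                      # points in one Original_Map sample (assumed)
L = 16                           # embedding dimension of Compact_Map (assumed)
CELL = 0.25                      # bird's-eye grid cell size in meters (assumed)

points = rng.uniform(-10, 10, size=(K, 3)).astype(np.float32)  # (K, 3) xyz
colors = rng.uniform(0, 1, size=(K, 3)).astype(np.float32)     # (K, 3) rgb

# Flatten to 2D: bin points into bird's-eye cells over (x, y).
cells = np.floor(points[:, :2] / CELL).astype(np.int64)
uniq, inv = np.unique(cells, axis=0, return_inverse=True)
inv = inv.ravel()
H = len(uniq)                    # compacted point count, H << K

# One embedding per cell: mean rgb, mean height, occupancy count, zero padding.
emb = np.zeros((H, L), dtype=np.float32)
counts = np.bincount(inv, minlength=H).astype(np.float32)
for c in range(3):               # mean color per cell
    emb[:, c] = np.bincount(inv, weights=colors[:, c], minlength=H) / counts
emb[:, 3] = np.bincount(inv, weights=points[:, 2], minlength=H) / counts  # mean z
emb[:, 4] = counts               # occupancy

compact_points = (uniq.astype(np.float32) + 0.5) * CELL  # (H, 2) cell centers

orig_bytes = points.nbytes + colors.nbytes               # ignoring normals
compact_bytes = compact_points.nbytes + emb.nbytes
print(H, orig_bytes / compact_bytes)                     # compression ratio
```

Even this crude pooling shrinks the sample dramatically; a learned `Local_Correspondence_Network` would choose *what* to keep in the `L` embedding channels instead of fixed statistics.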
Then we perform both differentiable mapping and map fusion on the `Compact_Map` with low memory, as the tensor itself could be orders of magnitude smaller (because we reduce the tensor size).
`Compact_Map` to `Original_Map` correspondence

Then the only thing that's left is to establish a correspondence between `Compact_Map` and `Original_Map`.
We call such a network `Local_Correspondence_Network`; it could be something along the lines of an autoencoder, which wouldn't require a labeled training set. As the correspondences between `Compact_Map` and `Original_Map` are local (they are similar from a 2D point of view), one could just use a sparse 3D convolutional neural network. The loss could simply be a reconstruction loss based on pointclouds, which I guess is feasible. The idea is that the reconstruction error ensures the color and point-height information in `Original_Map` is preserved in `Compact_Map`.
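As a sketch of that reconstruction loss, a Chamfer-style distance between the original pointcloud and a pointcloud decoded back from `Compact_Map` is one plausible choice. This is a plain NumPy illustration (a real training loop would need a differentiable GPU implementation, e.g. the `chamferdist` package gradslam already uses); the decoder here is a stand-in.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between pointclouds a (M, 3) and b (N, 3)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # (M, N) pairwise
    # Each point's squared distance to its nearest neighbor in the other cloud.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

rng = np.random.default_rng(0)
original = rng.normal(size=(1000, 3))
# Stand-in for the decoder's output: a subsampled, slightly perturbed copy.
reconstructed = original[::10] + 0.01 * rng.normal(size=(100, 3))
print(chamfer_distance(original, reconstructed))
```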
Furthermore, there could be a map-consistency loss. This loss can be generated by constructing the map from `Original_Map` with gradslam and, on a different path, constructing the map from `Compact_Map` with gradslam. Since during training we can potentially use a high-end computer, it might be OK to generate the whole map. The loss would be something like the difference with respect to 2D correspondences (whether the 3D map built from `Original_Map`, seen as a "bird's-eye view", matches `Compact_Map`).
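One minimal way to realize that map-consistency loss, assuming both maps are rasterized onto a fixed bird's-eye grid and compared with an L1 difference (the grid extent and cell size below are made-up parameters):

```python
import numpy as np

def birds_eye_grid(points_xy: np.ndarray, cell: float = 0.25,
                   extent: float = 10.0) -> np.ndarray:
    """Occupancy counts of 2D points on a fixed bird's-eye grid."""
    n = int(2 * extent / cell)
    idx = np.floor((points_xy + extent) / cell).astype(np.int64)
    idx = np.clip(idx, 0, n - 1)
    grid = np.zeros((n, n), dtype=np.float32)
    np.add.at(grid, (idx[:, 0], idx[:, 1]), 1.0)
    return grid

def map_consistency_loss(map_a_xy: np.ndarray, map_b_xy: np.ndarray) -> float:
    ga = birds_eye_grid(map_a_xy)
    gb = birds_eye_grid(map_b_xy)
    # Normalize so the loss compares occupancy patterns, not raw point counts.
    ga /= max(ga.sum(), 1.0)
    gb /= max(gb.sum(), 1.0)
    return float(np.abs(ga - gb).sum())  # L1 difference between the two views

rng = np.random.default_rng(0)
full_map = rng.uniform(-10, 10, size=(50_000, 2))   # map built from Original_Map
compact_map = full_map[::25]                        # stand-in for Compact_Map
print(map_consistency_loss(full_map, compact_map))
```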
I see the following advantages in `Local_Correspondence_Network`:
- It can be reused, as it just compacts parts of the map locally, irrespective of the global structure (due to its convolutional nature).
- As it's convolutional, it will itself be reasonably small ("low memory").
- It can downsample contiguous parts of the map, but also samples from sensors directly, which enables real-time, on-robot SLAM.
Production setup

Then the following steps would take place:
- Train a `Local_Correspondence_Network` using samples from sensor data (like an Intel D435).
- Downsample all samples from the sensor using `Local_Correspondence_Network` and build `Compact_Map` using gradslam layers.
- Load `Compact_Map` as well as `Local_Correspondence_Network` into the robot and do the following:
  - For each sample from the robot's sensor, downsample it using `Local_Correspondence_Network`.
  - Use `Compact_Map` and gradslam in the robot for localization within the map.
Overall, I was wondering if you could help me to understand:
- Whether the idea makes any sense
- Whether the framework, as it is now, supports such a setup