Comments (6)

JoseLamarca commented on September 3, 2024

Thank you for the issue. I forgot to include the YAML file. I have just updated the README; there is a single file for all the sequences, "stereo0.yalm". It is now included in the zip.
Best

from defslam.

Gatsby23 commented on September 3, 2024

OK, thank you very much, and thanks for your contribution to SLAM in changing scenes. However, have you tested the algorithm on classical scenes, such as the KITTI dataset?

JoseLamarca commented on September 3, 2024

Nope. Unfortunately, the core algorithm for monocular non-rigid mapping assumes that the reconstructed scene is a single smooth surface. That assumption holds for some intracorporeal sequences but not for classical scenes. How to map discontinuous deformable scenes is a line to explore in future work! :)

Gatsby23 commented on September 3, 2024

I'm sorry, I'm new to the deformable SLAM area. I don't understand why the algorithm is unsuitable for rigid scenes. Is it that the classical scenes don't have enough frames? Could you point me to some material about that?

JoseLamarca commented on September 3, 2024

The algorithm is suitable for rigid areas; proof of that is the abdominal sequence, which is more or less rigid. The problem with the classical sequences is the discontinuous areas. For the monocular case, we assume that the surface is smooth, which is not usually valid for the classical datasets. That is apart from the complexity issues that algorithms with RGB-D and stereo cameras could have in those scenes [1] and [2].

When we chose an algorithm for the mapping, there were two main state-of-the-art non-rigid reconstruction approaches: going for orthographic cameras and assuming that you can see the real size of the object in your image (there is a vast body of good work in the literature, highly recommended [3]), or going for perspective cameras and imposing isometry or inextensibility. We decided to go for perspective cameras because our scenarios have strong perspective effects.

The problem with perspective cameras appears when you try to reconstruct each point independently in a non-rigid environment: you always have gauge freedom in the scale of each point, i.e. you can track a small fly very close to the camera or a large elephant further away and get exactly the same measurements in the camera. Without any restrictions, it would be equivalent to running a separate SLAM for every single point, so you have to constrain them in some way. What you can estimate independently of this effect is the normals of the points, assuming isometry [3,4] (in the paper [2], sec. 5.B). That is why you have to impose a regularization to recover the depth.

Following the work of Parashar et al. in [3] and Chhatkuli et al. in [4], we estimate the surface by imposing minimum bending, which gives a kind of minimal surface. In our case, working inside the body, it is a valid assumption for many laparoscopies like the ones in the Hamlyn dataset. In contrast, assuming that the points are connected by a smooth surface does not fit the scenes in classical datasets like EuRoC or TUM, unless you follow a single object.
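
To make the fly/elephant ambiguity concrete, here is a minimal sketch assuming an ideal pinhole camera with unit focal length. The function names and point values are hypothetical and are not taken from the DefSLAM code; the only point is that moving a 3D point along its viewing ray does not change its image coordinates.

```python
import math

def project(point, focal=1.0):
    """Pinhole (perspective) projection of a 3D point with positive depth z."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def scale_along_ray(point, s):
    """Move a point along its viewing ray by multiplying all coordinates by s."""
    return tuple(s * c for c in point)

# Hypothetical values: a "fly" 0.1 m from the camera and an "elephant"
# 100x farther along the same viewing ray (hence 100x larger in the scene).
fly = (0.01, 0.02, 0.1)
elephant = scale_along_ray(fly, 100.0)

# Both points land on the same image coordinates (up to float rounding),
# so a single monocular measurement cannot tell them apart.
same_pixel = all(
    math.isclose(a, b) for a, b in zip(project(fly), project(elephant))
)
print(same_pixel)
```

With independently moving points, every point carries its own unknown scale factor like `s` above, which is why some extra constraint between points is needed.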

To generalize monocular deformable SLAM in those scenes, we should find another regularization.
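
As a rough illustration of why a smoothness prior clashes with discontinuous scenes, the sketch below computes a simple bending-style penalty (sum of squared second finite differences) on a 1D depth profile. This is an assumed toy penalty with made-up depth values, not the regularizer implemented in DefSLAM.

```python
def bending_energy(depths):
    """Sum of squared second finite differences of a 1D depth profile."""
    return sum(
        (depths[i - 1] - 2 * depths[i] + depths[i + 1]) ** 2
        for i in range(1, len(depths) - 1)
    )

# Hypothetical depth profiles along one image row.
smooth = [1.0 + 0.01 * i for i in range(10)]  # gently sloping surface
step = [1.0] * 5 + [3.0] * 5                  # depth jump, e.g. an object in front of a wall

print(bending_energy(smooth))  # close to zero: a slope has no bending
print(bending_energy(step))    # large: the jump is heavily penalized
```

A minimizer of such a penalty would smear the depth jump into a ramp, which is the wrong answer for a scene made of separate objects.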

These papers should be enough for a smooth introduction to deformable reconstruction:
[1] R. A. Newcombe, D. Fox, and S. M. Seitz. DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time. In CVPR, 2015.
[2] J. Song, J. Wang, L. Zhao, S. Huang, and G. Dissanayake. MIS-SLAM: Real-time large-scale dense deformable SLAM system in minimally invasive surgery based on heterogeneous computing.
[1] Y. Dai, H. Li, and M. He. A simple prior-free method for non-rigid structure-from-motion factorization.
[2] DefSLAM: Tracking and mapping of deforming scenes from monocular sequences.
[3] S. Parashar, D. Pizarro, and A. Bartoli. Isometric non-rigid shape-from-motion with Riemannian geometry solved in linear time.
[4] A. Chhatkuli, D. Pizarro, and A. Bartoli. Non-rigid shape-from-motion for isometric surfaces using infinitesimal planarity.

I hope this long explanation is useful!

Gatsby23 commented on September 3, 2024

Thank you for your reply! Thank you very much!
