
Comments (6)

tangchuangqi commented on August 25, 2024

I also have the same concern. I think cam_t_m2c and cam_R_m2c are not available during the evaluation.
cam_t_m2c and cam_R_m2c are used to generate the new_target in the Loss_refiner, and I found that the new_target is very close to the model_points after the transformation.
So I have an idea: it may be OK to use the model_points to replace the new_target. I'm going to verify the idea in the coming days.


ekdnltrla commented on August 25, 2024

@tangchuangqi
Thank you for your answer.
I have a question about 'eval_linemod'.
The outputs of the estimator and refiner are 'pred_r', 'pred_t', 'pred_c', and 'idx', so I assumed they are the rotation (quaternion), translation, confidence, and index.
But the result on a LINEMOD example after refinement is different from the label of the data.
I'm really confused. Is some additional calculation needed to get the rotation and translation we expect?


j96w commented on August 25, 2024

Hey guys, I think you misunderstand how we use cam_R_m2c and cam_t_m2c. cam_R_m2c and cam_t_m2c are the ground-truth pose we used to build the target (the model points rotated by cam_R_m2c and translated by cam_t_m2c). During our evaluation, the target is only used to calculate the distance between our prediction and the target. For real-time testing, you don't need this target or this distance, because you can't have them.

You should just use the pred_r, pred_t, pred_c output by the network and choose the result with the maximum confidence (pred_r[argmax(pred_c)], pred_t[argmax(pred_c)]) as your pose estimation prediction. For the next refinement iteration, you only need to inversely apply your previous pose estimate to the input point cloud (generated from the depth) and feed it into the network to get the pose of the second iteration. After that, you should compose the current result with your previous estimate (please follow 'eval_ycb' to do this). Repeat this process until you finish the refinement, as sketched below. The whole evaluation process needs neither cam_R_m2c and cam_t_m2c nor the target.
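A minimal NumPy sketch of this select-then-refine loop (names like `select_best_pose` and the `refiner(...)` call signature are my assumptions, not the repo's exact API):

```python
import numpy as np

def quaternion_to_matrix(q):
    """Unit quaternion (w, x, y, z) -> 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])

def select_best_pose(pred_r, pred_t, pred_c):
    """Keep only the per-point prediction with the highest confidence."""
    best = np.argmax(pred_c)
    return pred_r[best], pred_t[best]

def refine(refiner, cloud, emb, idx, r0, t0, iterations=2):
    """Hypothetical refinement loop: re-express the cloud in the current
    estimate's frame, predict a residual pose, and accumulate it."""
    R, t = quaternion_to_matrix(r0), np.asarray(t0, dtype=float)
    for _ in range(iterations):
        cloud_local = (cloud - t) @ R            # inverse transform: R^T (p - t)
        dr, dt = refiner(cloud_local, emb, idx)  # residual pose (assumed API)
        t = R @ np.asarray(dt) + t               # compose translation first
        R = R @ quaternion_to_matrix(dr)         # then compose rotation
    return R, t
```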

@ekdnltrla for your question about 'eval_linemod':
The final result differs from the label of the data because, instead of adding the residual pose onto the previous pose estimate, we invert the predicted residual pose and use it to inversely transform the target before calculating the distance. If you want the same numbers as the label of the data, please follow 'eval_ycb', where you can see how to accumulate the residual poses into a final pose output, as shown below.
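In 4x4 homogeneous form, the accumulation done in 'eval_ycb' amounts to one matrix product per iteration. A sketch with placeholder values (not the repo's exact code):

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation into a 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

R_prev, t_prev = np.eye(3), np.zeros(3)               # placeholder previous estimate
R_res, t_res = np.eye(3), np.array([0.0, 0.0, 0.01])  # placeholder predicted residual

# compose the residual onto the running estimate, one product per iteration
T_final = to_homogeneous(R_prev, t_prev) @ to_homogeneous(R_res, t_res)
R_final, t_final = T_final[:3, :3], T_final[:3, 3]
```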

@tangchuangqi for your idea of using the model_points to replace the new_target:
Sorry, you can't do that. The new_target is the target rotated by the inverse of your predicted rotation and translated by the inverse of your predicted translation; it is a new target for the next pose estimation iteration. The reason you think it's very close to the model points is mainly that the initial pose estimate is very accurate, so this inverse transformation brings the target back somewhere close to the original model points, but it is still not the model points.
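Concretely, the relationship looks like this (a NumPy sketch of the idea behind Loss_refiner, with toy values; the repo does the equivalent in batched torch ops):

```python
import numpy as np

rng = np.random.default_rng(0)
model_points = rng.normal(size=(500, 3))           # object model points
R_gt, t_gt = np.eye(3), np.array([0.0, 0.0, 1.0])  # toy ground-truth pose
target = model_points @ R_gt.T + t_gt              # target = R_gt p + t_gt

# a nearly (but not exactly) correct prediction
R_pred, t_pred = R_gt, t_gt + np.array([0.002, 0.0, 0.0])

# new_target: the target inversely transformed by the predicted pose
new_target = (target - t_pred) @ R_pred            # R_pred^T (target - t_pred)

# close to model_points only because the prediction is accurate, never equal
print(np.abs(new_target - model_points).max())
```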


j96w commented on August 25, 2024

I have cleaned up 'eval_linemod' to make it easier to understand and added a comment showing where the final pose output is (the same format as the ground-truth label of the dataset). Again, the 'target' (the model points transformed by cam_R_m2c and cam_t_m2c) is not required during the pose estimation process and is only used to calculate the distance between the prediction and the ground truth.


ekdnltrla commented on August 25, 2024

@j96w
Thank you for your kindness!
With your answer I could understand the code and get the result I wanted.

And there's one thing I noticed while training on the LINEMOD dataset.
In the code "dataset.py", there are some steps that compute "cloud" using "target_t", unlike the YCB dataset.
Maybe because of this, my result was strange. After removing those steps, I got the test result I expected.
If I have misunderstood something, please let me know your opinion.

Thank you.


j96w commented on August 25, 2024

Those steps convert the distance metric to meters (YCB doesn't need that, since its original metric is already meters). Just keep in mind that if you change the distance metric, you probably also need to adjust the hyperparameter w during training to reach the best performance.
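For reference, the conversion is just a rescale; a sketch with toy values (the 1000.0 factor assumes LINEMOD's millimeter units):

```python
import numpy as np

cloud = np.array([[120.0, -45.0, 980.0]])    # example back-projected point, in mm
target_t = np.array([100.0, -50.0, 1000.0])  # example ground-truth translation, in mm

# LINEMOD is in millimeters; rescale to meters so the distance loss
# (and the hyperparameter w) behaves as it does on YCB
cloud = cloud / 1000.0
target_t = target_t / 1000.0
```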

