
Comments (7)

vitoralbiero commented on June 5, 2024

Hello,
Thank you for your interest in our work.

In img2pose, we assume fixed camera intrinsics, as described in the paper, and we obtain the head 6DoF pose to transform the 3D points corresponding to the head, so I do not have a function ready for your specific use case.

The translation t is added to the points after they are rotated by R. Take a look at the transform_points function, which might help you.
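
As a rough illustration of that order of operations (rotate by R first, then add t), here is a minimal NumPy sketch. It is not the repository's transform_points function; the name apply_pose and the sample values are made up for this example.

```python
import numpy as np

def apply_pose(points, R, t):
    """Apply p' = R @ p + t to an (N, 3) array of points.

    apply_pose is a hypothetical helper for illustration only.
    """
    points = np.asarray(points, dtype=float)
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float)
    # Row-vector form of R @ p: multiply by R transposed, then translate.
    return points @ R.T + t

# Example: rotate 90 degrees about the z axis, then shift 5 units along z.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
moved = apply_pose([[1.0, 0.0, 0.0]], R_z90, [0.0, 0.0, 5.0])
```

Applying t before R would give a different result, which is why the order matters when inverting the pose later.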

from img2pose.

RobinRenggli commented on June 5, 2024

As I am still struggling with this issue, I am asking again in the hopes someone can help me out.

I want to use the 6DoF pose to get a camera pose. The following sketch shows what I want to achieve:

[Image: sketch]

I know I can obtain the camera rotation by inverting the rotation matrix of the pose. But I lack the understanding of this model to transform the translation of the pose into a translation of the camera.

Here's my current attempt in code with img2_pose being the pose provided by your model:
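import numpy as np
from scipy.spatial.transform import Rotation

# img2_pose: 6-vector from the model, [rx, ry, rz, tx, ty, tz]
rotation = np.zeros((4, 4))
# Camera rotation: inverse (transpose) of the pose rotation
r = Rotation.from_rotvec(img2_pose[0:3]).as_matrix().T
face_pose = img2_pose[3:]
# Attempted camera translation: pose translation rotated by r
camera_pose = r.dot(face_pose)
rotation[0:3, 0:3] = r
rotation[0:3, 3] = camera_pose
rotation[3, :] = [0, 0, 0, 1]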

By rendering an average face over my input pictures I can test whether my approach is working or not. It shows that the rotation is correct, but the translation is off.

I know this is not directly related to this project, but I feel I can't solve this issue because I do not understand the coordinate systems etc. of this paper well enough. If this is not the appropriate place to ask this question, feel free to close the issue again.


vitoralbiero commented on June 5, 2024

The units of the translation vector (tvec) are the arbitrary units of the 3D face model used as a reference to annotate the images; they do not represent pixels or other human-understandable units.
Their reference point is the center of the image, and they are consistent across images; it is the camera intrinsics that change with the image dimensions.

The tvec is the same as the output of SolvePnP from OpenCV, and you can read more about it here: https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d

To get the camera position, you can follow the posts below, as they explain how it can be done.
https://stackoverflow.com/questions/18637494/camera-position-in-world-coordinate-from-cvsolvepnp
https://answers.opencv.org/question/64315/solvepnp-object-to-camera-pose/
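
For reference, the inversion described in those posts can be sketched as follows. This assumes the pose follows the SolvePnP convention mentioned above, i.e. x_cam = R x_obj + t, so that the camera center in the object frame is C = -Rᵀ t. The function names are hypothetical, and Rodrigues' formula is written out in NumPy so the snippet does not need OpenCV or SciPy.

```python
import numpy as np

def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: axis-angle vector -> 3x3 rotation matrix."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def camera_in_object_frame(rotvec, tvec):
    """Invert an object-to-camera pose (assumed x_cam = R x_obj + t).

    Returns a 4x4 camera-to-object matrix [[R^T, -R^T t], [0, 0, 0, 1]].
    """
    R = rotvec_to_matrix(rotvec)
    t = np.asarray(tvec, dtype=float)
    cam_to_obj = np.eye(4)
    cam_to_obj[:3, :3] = R.T
    cam_to_obj[:3, 3] = -R.T @ t  # note the minus sign
    return cam_to_obj

# Example with made-up values: 90 degrees about z, translation (1, 2, 3).
pose_rotvec = np.array([0.0, 0.0, np.pi / 2])
pose_tvec = np.array([1.0, 2.0, 3.0])
cam_to_obj = camera_in_object_frame(pose_rotvec, pose_tvec)
camera_center = cam_to_obj[:3, 3]
```

A quick sanity check is that the camera center maps back to the origin of the camera frame: R @ camera_center + t should be zero.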

Hope this helps.


RobinRenggli commented on June 5, 2024

Is the rotvec also a result of SolvePnP?

I think I know how to solve my issue in theory, but I'm struggling with the transformations from one coordinate frame to another. This is of course my own issue to solve and I don't expect you to help me with this, I just want to be certain I understand the output of your model correctly, such that I can remove any uncertainties.

Am I correct in assuming that the output of your model is the pose and position of the face in the object frame? It is in an OpenCV format, not OpenGL.

What I need is a camera-to-world transform (in OpenGL format) according to the pose, which I think you are using when you render the 3D face over the image. If that were the case, shouldn't I be able to simply take the matrix you use for the rendering and convert it to the OpenGL format?

By following the steps outlined in the links you gave me and converting the result to an OpenGL format, I arrive at a correct rotation matrix, but the translation is always off. This might be due to a bug in my code or to me misunderstanding the output of your model. I just want to rule out the latter.

P.S.: Thank you for helping me out, your answers have already improved my understanding of the problem by a lot!
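
For context, the OpenCV-to-OpenGL step is commonly done by flipping the camera's y and z axes. The sketch below assumes the usual conventions (OpenCV camera: x right, y down, looking along +z; OpenGL camera: x right, y up, looking along -z) and that the world frame itself is left unchanged; whether a particular renderer expects exactly this is an assumption. The helper name is hypothetical.

```python
import numpy as np

# Flips the camera's local y and z axes; the x axis is shared.
CV_TO_GL = np.diag([1.0, -1.0, -1.0, 1.0])

def c2w_opencv_to_opengl(c2w_cv):
    """Re-express a 4x4 camera-to-world matrix for an OpenGL-style camera.

    Right-multiplying changes only the camera's local axes, so the
    translation column (the camera position in the world) is untouched.
    """
    return np.asarray(c2w_cv, dtype=float) @ CV_TO_GL

# Example with a made-up camera-to-world matrix (identity rotation).
c2w_cv = np.eye(4)
c2w_cv[:3, 3] = [1.0, 2.0, 3.0]
c2w_gl = c2w_opencv_to_opengl(c2w_cv)
```

Under this convention only the rotation columns change sign, so a translation that is off after the conversion would point to the inversion step rather than to the axis flip.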


RobinRenggli commented on June 5, 2024

I'm adding some of my renderings that illustrate the problem. This is the result when I follow the steps outlined in the links you provided:

[Images: example1, example2]


vitoralbiero commented on June 5, 2024

Yes, the rotvec is also a result of SolvePnP.
The entire output of img2pose is in the same format as SolvePnP, which is in OpenCV format.

From the examples you sent, it looks like tx is off, but ty and tz seem correct (or at least not as off as tx).
Is the prediction correct before the conversion? I mean, if you render the original estimated pose, what does it look like? If you haven't checked this yet, you can do it by using this notebook.


vitoralbiero commented on June 5, 2024

Closing this issue due to inactivity.

