I want to use img2pose to extract the camera pose (for the purpose of using it as the

How to extract camera extrinsics? (img2pose, closed)

vitoralbiero commented on June 5, 2024

How to extract camera extrinsics?

Comments (7)

vitoralbiero commented on June 5, 2024

Hello,
Thank you for your interest in our work.

In img2pose, we assume fixed camera intrinsics, as described in the paper, and estimate the head's 6DoF pose, which transforms the 3D points corresponding to the head. I therefore do not have a function ready for your specific use case.

The translation t is added to the points after they are rotated by R. Take a look at the transform_points function, which might help you.
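A minimal sketch of that order of operations (rotate by R, then add t), using only NumPy. The helper names `rotvec_to_matrix` and `apply_pose` are hypothetical and are not the repository's transform_points; the pose layout (rotation vector in the first three entries, translation in the last three) is assumed from the code posted later in this thread.

```python
import numpy as np

def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: axis-angle vector -> 3x3 rotation matrix."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    # Skew-symmetric cross-product matrix of the unit axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def apply_pose(points, pose):
    """Hypothetical helper: rotate (N, 3) points by R, then add t."""
    pose = np.asarray(pose, dtype=float)
    R = rotvec_to_matrix(pose[:3])   # assumed layout: rotvec first
    t = pose[3:6]                    # assumed layout: translation last
    return np.asarray(points, dtype=float) @ R.T + t

# Example: 90 degree rotation about the y axis, then a shift by (1, 2, 3)
pose = np.array([0.0, np.pi / 2, 0.0, 1.0, 2.0, 3.0])
points = np.array([[1.0, 0.0, 0.0]])
moved = apply_pose(points, pose)
```

Here the point (1, 0, 0) is first rotated to (0, 0, -1) and only then translated, so the order matters when inverting the transform.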

RobinRenggli commented on June 5, 2024

As I am still struggling with this issue, I am asking again in the hopes someone can help me out.

I want to use the 6DoF pose to get a camera pose. The following sketch shows what I want to achieve:

I know I can obtain the camera rotation by inverting the rotation matrix of the pose. But I lack the understanding of this model to transform the translation of the pose into a translation of the camera.

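Here's my current attempt in code, with img2_pose being the pose provided by your model:
import numpy as np
from scipy.spatial.transform import Rotation

rotation = np.zeros((4, 4))  # 4x4 homogeneous camera matrix
# Camera rotation: transpose (inverse) of the pose rotation
r = Rotation.from_rotvec(img2_pose[0:3]).as_matrix().T
face_pose = img2_pose[3:]  # translation part of the pose
camera_pose = r.dot(face_pose)
rotation[0:3, 0:3] = r
rotation[0:3, 3] = camera_pose
rotation[3, :] = [0, 0, 0, 1]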

By rendering an average face over my input pictures I can test whether my approach is working or not. It shows that the rotation is correct, but the translation is off.

I know this is not directly related to this project, but I feel I can't solve this issue because I do not understand the coordinate systems etc. of this paper well enough. If this is not the appropriate place to ask this question, feel free to close the issue again.

vitoralbiero commented on June 5, 2024

The translation vector (tvec) is expressed in the arbitrary units of the 3D face model used as a reference to annotate the images; it does not represent pixels or other human-understandable units.
Its reference point is the center of the image, and the units are consistent across images; what changes with the image dimensions is the camera intrinsics.
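As a rough illustration of intrinsics that depend on the image dimensions, the sketch below builds a pinhole matrix with the principal point at the image center and projects camera-space points with it. The function names and the focal length value are placeholders chosen for the example, not the values img2pose uses; see the paper for those.

```python
import numpy as np

def make_intrinsics(width, height, focal_length):
    """Pinhole intrinsics with the principal point at the image center."""
    return np.array([[focal_length, 0.0, width / 2.0],
                     [0.0, focal_length, height / 2.0],
                     [0.0, 0.0, 1.0]])

def project(points_cam, K):
    """Project (N, 3) camera-space points to (N, 2) pixel coordinates."""
    points_cam = np.asarray(points_cam, dtype=float)
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

# Hypothetical values: the focal length is a placeholder, not img2pose's.
K = make_intrinsics(width=640, height=480, focal_length=1000.0)
pixels = project([[0.0, 0.0, 5.0], [0.5, 0.0, 5.0]], K)
```

A point on the optical axis lands on the image center regardless of the focal length, which is one way to sanity-check the principal point.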

The tvec is the same as the output of SolvePnP from OpenCV; you can read more about it here: https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga549c2075fac14829ff4a58bc931c033d

To get the camera position, you can follow the posts below, as they explain how it can be done.
https://stackoverflow.com/questions/18637494/camera-position-in-world-coordinate-from-cvsolvepnp
https://answers.opencv.org/question/64315/solvepnp-object-to-camera-pose/
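The approach in those posts amounts to inverting the rigid transform x_cam = R x_obj + t, which gives a camera rotation of R^T and a camera position of -R^T t in the object frame. A minimal sketch, assuming R is already a 3x3 rotation matrix (for example, converted from the rotation vector); the function name is hypothetical:

```python
import numpy as np

def camera_pose_from_object_pose(R, t):
    """Invert x_cam = R @ x_obj + t into a 4x4 camera-to-object matrix."""
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)
    cam_to_obj = np.eye(4)
    cam_to_obj[:3, :3] = R.T          # inverse rotation
    cam_to_obj[:3, 3] = -R.T @ t      # camera center in the object frame
    return cam_to_obj

# Example: 90 degree rotation about the y axis and a translation of (1, 2, 3)
R = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0]])
t = np.array([1.0, 2.0, 3.0])
cam_to_obj = camera_pose_from_object_pose(R, t)
camera_position = cam_to_obj[:3, 3]
```

Note the minus sign in the translation: a plain `R.T @ t` would omit it. A quick check is that `R @ camera_position + t` should come out as the zero vector, since the camera center maps to the origin of the camera frame.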

Hope this helps.

RobinRenggli commented on June 5, 2024

Is the rotvec also a result of SolvePnP?

I think I know how to solve my issue in theory, but I'm struggling with the transformations from one coordinate frame to another. This is of course my own issue to solve and I don't expect you to help me with this, I just want to be certain I understand the output of your model correctly, such that I can remove any uncertainties.

Am I correct in assuming that the output of your model is the pose and position of the face in the object frame? It is in an OpenCV format, not OpenGL.

What I need is a camera-to-world transform (in OpenGL format) according to the pose, which I think you are using when you render the 3D face over the image. If that were the case, shouldn't I be able to simply take the matrix you use for the rendering and convert it to the OpenGL format?
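For reference, a common way to do that conversion is sketched below. It assumes the usual conventions (OpenCV camera: x right, y down, z forward; OpenGL camera: x right, y up, z backward) and a 4x4 camera-to-world matrix already expressed in the OpenCV convention; the function name is hypothetical and this is not taken from the img2pose rendering code.

```python
import numpy as np

# Flips the camera's y and z axes; the camera position is left unchanged.
CV_TO_GL = np.diag([1.0, -1.0, -1.0, 1.0])

def opencv_to_opengl_c2w(c2w_cv):
    """Convert a 4x4 camera-to-world matrix from OpenCV to OpenGL axes."""
    return np.asarray(c2w_cv, dtype=float) @ CV_TO_GL

# Example: camera at (3, -2, -1) with some rotation, in the OpenCV convention
c2w_cv = np.eye(4)
c2w_cv[:3, :3] = np.array([[0.0, 0.0, -1.0],
                           [0.0, 1.0, 0.0],
                           [1.0, 0.0, 0.0]])
c2w_cv[:3, 3] = [3.0, -2.0, -1.0]
c2w_gl = opencv_to_opengl_c2w(c2w_cv)
```

Only the second and third columns of the rotation change sign; the translation column stays the same, so a translation error usually originates before this step.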

By following the steps outlined in the links you gave me and converting the result to an OpenGL format, I arrive at a correct rotation matrix, but the translation is always off. This might be due to a bug in my code or due to me misunderstanding the output of your model. I just want to rule out the latter.

P.S.: Thank you for helping me out; your answers have already improved my understanding of the problem a lot!

RobinRenggli commented on June 5, 2024

I'm adding some of my renderings that illustrate the problem. This is the result when I follow the steps outlined in the links you provided:

vitoralbiero commented on June 5, 2024

Yes, the rotvec is also a result of SolvePnP.
The entire output of img2pose is in the same format as SolvePnP, which is in OpenCV format.

From the example you sent, it looks like t_x is off, but t_y and t_z seem correct (or at least not as far off as t_x).
Is the prediction correct before the conversion? I mean, if you render the original estimated pose, what does it look like? If you haven't checked this yet, you can do it by using this notebook.

vitoralbiero commented on June 5, 2024

Closing this issue for inactivity.
