Comments (7)
Thanks for sharing this dataset.
I finally figured out how to convert the annotations to OpenCV-style extrinsics and intrinsics, which may be helpful for others:
import numpy as np

def co3d_annotation_to_opencv_pose(entry):
    p = entry.viewpoint.principal_point
    f = entry.viewpoint.focal_length
    h, w = entry.image.size
    K = np.eye(3)
    s = (min(h, w) - 1) / 2
    K[0, 0] = f[0] * (w - 1) / 2
    K[1, 1] = f[1] * (h - 1) / 2
    K[0, 2] = -p[0] * s + (w - 1) / 2
    K[1, 2] = -p[1] * s + (h - 1) / 2
    R = np.asarray(entry.viewpoint.R).T  # note the transpose here
    T = np.asarray(entry.viewpoint.T)
    pose = np.concatenate([R, T[:, None]], 1)
    # flip the direction of the x and y axes
    pose = np.diag([-1, -1, 1]).astype(np.float32) @ pose
    # "pose" is the extrinsic [R|t] and "K" is the intrinsic:
    # x_img = K @ (R @ x_wrd + t), with x_img in pixels
    return K, pose
However, I still have a question: how do I convert points from the estimated depth maps into the coordinate system of "pointcloud.ply"?
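For context, this is the unprojection I have in mind (a minimal sketch, assuming K and pose come from a converter like the one above, and that the depth value d is the z-coordinate in the OpenCV camera frame):

```python
import numpy as np

def unproject_depth(u, v, d, K, pose):
    """Map a pixel (u, v) with depth d (camera-frame z) to world coordinates.

    pose is the 3x4 extrinsic [R|t] with x_cam = R @ x_wrd + t.
    """
    R, t = pose[:, :3], pose[:, 3]
    x_cam = np.linalg.inv(K) @ np.array([u, v, 1.0]) * d  # back-project to the camera frame
    return R.T @ (x_cam - t)  # invert the extrinsic: x_wrd = R^T (x_cam - t)
```

Projecting a world point with K and pose and then unprojecting it with this function should return the original point, so a round trip is an easy sanity check.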
from co3d.
Please note that the PyTorch3D NDC convention has −1 and 1 coordinates at the corners of the image, not at the centres of the corner pixels, so you should not subtract 1 from h, w, and min(h, w).
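Applied to the snippet above, the intrinsics conversion without the "- 1" terms would look roughly like this (a sketch mirroring that snippet's variable names, not the dataset loader's exact code):

```python
import numpy as np

def ndc_intrinsics_to_screen(f, p, h, w):
    # Same mapping as the earlier snippet, but with the "- 1" terms dropped,
    # since NDC coordinates -1/1 sit at the image corners, not pixel centres.
    s = min(h, w) / 2
    K = np.eye(3)
    K[0, 0] = f[0] * w / 2          # fx in pixels
    K[1, 1] = f[1] * h / 2          # fy in pixels
    K[0, 2] = -p[0] * s + w / 2     # cx in pixels
    K[1, 2] = -p[1] * s + h / 2     # cy in pixels
    return K
```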
Please try to use the provided data loaders. If they do not fulfil some needs, please let us know.
The reference for parsing the viewpoint (applying the crop if needed) is https://github.com/facebookresearch/co3d/blob/main/dataset/co3d_dataset.py#L490 .
For conversion to OpenCV format, PyTorch3D has a function
https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/utils/camera_conversions.py#L65
with the actual implementation in
https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/renderer/camera_conversions.py#L61 .
Hi, we store the intrinsics in the PyTorch3D convention. More info here:
https://pytorch3d.org/docs/cameras
Thanks for your reply! I know I need to convert the given principal point to screen space. But after converting, the principal point is still located at the centre of the image, which is not very common. Did you warp the images or apply other preprocessing steps to ensure this?
The location of the principal point is decided by the COLMAP image rectification algorithm.
I just checked the raw COLMAP data, and it seems that the COLMAP undistorter also resamples the image so that the principal point coincides exactly with the centre of the image.
Thanks for spotting this.
Thanks a lot! Note also that we should use the parameters of train_dataset[idx].camera instead of entry.viewpoint when we crop images, because after cropping and resizing, principal_point and focal_length may change.
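For reference, this is roughly how a crop and resize transform a pixel-space intrinsics matrix (my own sketch, not the dataset loader's code; the crop offsets and scale factors are hypothetical parameters):

```python
import numpy as np

def adjust_intrinsics_for_crop_resize(K, crop_x0, crop_y0, scale_x, scale_y):
    """Shift the principal point by the crop offset, then scale the
    focal lengths and principal point by the resize factors."""
    K = K.copy()
    K[0, 2] -= crop_x0   # cropping shifts the principal point
    K[1, 2] -= crop_y0
    K[0, 0] *= scale_x   # resizing scales the focal lengths...
    K[1, 1] *= scale_y
    K[0, 2] *= scale_x   # ...and the principal point
    K[1, 2] *= scale_y
    return K
```

This is why the camera returned by the data loader (which applies these steps internally) differs from the raw entry.viewpoint values.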
My test code is here:

import numpy as np

train_dataset = datasets['train']

def co3d_annotation_to_opencv_pose(idx):
    camera = train_dataset[idx].camera
    p = camera.principal_point[0]
    f = camera.focal_length[0]
    R = camera.R[0]
    T = camera.T[0]
    _, h, w = train_dataset[idx].image_rgb.size()
    K = np.eye(3)
    s = (min(h, w) - 1) / 2
    K[0, 0] = f[0] * (w - 1) / 2
    K[1, 1] = f[1] * (h - 1) / 2
    K[0, 2] = -p[0] * s + (w - 1) / 2
    K[1, 2] = -p[1] * s + (h - 1) / 2
    R = np.asarray(R).T  # note the transpose here
    T = np.asarray(T)
    pose = np.concatenate([R, T[:, None]], 1)
    # flip the direction of the x and y axes
    pose = np.diag([-1, -1, 1]).astype(np.float32) @ pose
    return K, pose