Comments (7)
Thanks for sharing this dataset.
I finally figured out how to convert the annotations to OpenCV-style extrinsics and intrinsics, which may be helpful for others:
import numpy as np

def co3d_annotation_to_opencv_pose(entry):
    p = entry.viewpoint.principal_point
    f = entry.viewpoint.focal_length
    h, w = entry.image.size
    K = np.eye(3)
    s = (min(h, w) - 1) / 2
    K[0, 0] = f[0] * (w - 1) / 2
    K[1, 1] = f[1] * (h - 1) / 2
    K[0, 2] = -p[0] * s + (w - 1) / 2
    K[1, 2] = -p[1] * s + (h - 1) / 2
    R = np.asarray(entry.viewpoint.R).T  # note the transpose here
    T = np.asarray(entry.viewpoint.T)
    pose = np.concatenate([R, T[:, None]], 1)
    # flip the direction of the x and y axes
    pose = np.diag([-1, -1, 1]).astype(np.float32) @ pose
    # "pose" is the extrinsic [R|t] and "K" is the intrinsic:
    # x_img = K @ (R @ x_wrd + t), with x_img in pixels
    return K, pose
However, I still have a question: how do I convert points from the estimated depth maps into the coordinate system of "pointcloud.ply"?
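For context, this is the unprojection I have in mind (a minimal sketch, assuming K and pose come from a converter like the one above, and that the depth value d is the z-coordinate in the OpenCV camera frame):

```python
import numpy as np

def unproject_depth(u, v, d, K, pose):
    """Map a pixel (u, v) with depth d (camera-frame z) to world coordinates.

    pose is the 3x4 extrinsic [R|t] with x_cam = R @ x_wrd + t.
    """
    R, t = pose[:, :3], pose[:, 3]
    x_cam = np.linalg.inv(K) @ np.array([u, v, 1.0]) * d  # back-project to the camera frame
    return R.T @ (x_cam - t)  # invert the extrinsic: x_wrd = R^T (x_cam - t)
```

Projecting a world point with K and pose and then unprojecting it with this function should return the original point, so a round trip is an easy sanity check.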
from co3d.
Please note that the PyTorch3D NDC convention has −1 and 1 coordinates at the corners of the image, not at the centres of the corner pixels, so you should not subtract 1 from h, w, and min(h, w).
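Applied to the snippet above, the intrinsics conversion without the "- 1" terms would look roughly like this (a sketch mirroring that snippet's variable names, not the dataset loader's exact code):

```python
import numpy as np

def ndc_intrinsics_to_screen(f, p, h, w):
    # Same mapping as the earlier snippet, but with the "- 1" terms dropped,
    # since NDC coordinates -1/1 sit at the image corners, not pixel centres.
    s = min(h, w) / 2
    K = np.eye(3)
    K[0, 0] = f[0] * w / 2          # fx in pixels
    K[1, 1] = f[1] * h / 2          # fy in pixels
    K[0, 2] = -p[0] * s + w / 2     # cx in pixels
    K[1, 2] = -p[1] * s + h / 2     # cy in pixels
    return K
```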
Please try to use the provided data loaders. If they do not fulfil some needs, please let us know.
The reference for parsing the viewpoint (applying the crop if needed) is https://github.com/facebookresearch/co3d/blob/main/dataset/co3d_dataset.py#L490 .
For conversion to OpenCV format, PyTorch3D has a function
https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/utils/camera_conversions.py#L65
with the actual implementation in
https://github.com/facebookresearch/pytorch3d/blob/main/pytorch3d/renderer/camera_conversions.py#L61 .
Hi, we store the intrinsics in the PyTorch3D convention. More info here:
https://pytorch3d.org/docs/cameras
Thanks for your reply! I know I need to convert the given principal point to screen space. But after converting, the principal point is still located at the centre of the image, which is not very common. Did you warp the images or apply other preprocessing steps to ensure this?
The location of the principal point is decided by the COLMAP image rectification algorithm.
I just checked the raw COLMAP data, and it seems that the COLMAP undistorter also resamples the image so that the principal point coincides exactly with the centre of the image.
Thanks for spotting this.
Thanks a lot! Note also that we should use the parameters of train_dataset[idx].camera instead of entry.viewpoint when we crop images, because after cropping and resizing, principal_point and focal_length may change.
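For reference, this is roughly how a crop and resize transform a pixel-space intrinsics matrix (my own sketch, not the dataset loader's code; the crop offsets and scale factors are hypothetical parameters):

```python
import numpy as np

def adjust_intrinsics_for_crop_resize(K, crop_x0, crop_y0, scale_x, scale_y):
    """Shift the principal point by the crop offset, then scale the
    focal lengths and principal point by the resize factors."""
    K = K.copy()
    K[0, 2] -= crop_x0   # cropping shifts the principal point
    K[1, 2] -= crop_y0
    K[0, 0] *= scale_x   # resizing scales the focal lengths...
    K[1, 1] *= scale_y
    K[0, 2] *= scale_x   # ...and the principal point
    K[1, 2] *= scale_y
    return K
```

This is why the camera returned by the data loader (which applies these steps internally) differs from the raw entry.viewpoint values.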
My test code is here:

import numpy as np

train_dataset = datasets['train']

def co3d_annotation_to_opencv_pose(idx):
    camera = train_dataset[idx].camera
    p = camera.principal_point[0]
    f = camera.focal_length[0]
    R = camera.R[0]
    T = camera.T[0]
    _, h, w = train_dataset[idx].image_rgb.size()
    K = np.eye(3)
    s = (min(h, w) - 1) / 2
    K[0, 0] = f[0] * (w - 1) / 2
    K[1, 1] = f[1] * (h - 1) / 2
    K[0, 2] = -p[0] * s + (w - 1) / 2
    K[1, 2] = -p[1] * s + (h - 1) / 2
    R = np.asarray(R).T  # note the transpose here
    T = np.asarray(T)
    pose = np.concatenate([R, T[:, None]], 1)
    # flip the direction of the x and y axes
    pose = np.diag([-1, -1, 1]).astype(np.float32) @ pose
    return K, pose