facebookresearch / co3d
Tooling for the Common Objects In 3D dataset.
License: Other
Hey - the link list file that we should get at the bottom of this page:
https://ai.facebook.com/datasets/co3d-downloads/
is broken - it shows only one link to one dataset (apples)
Hello,
First of all, thanks a lot for releasing this dataset!
Are the code for computing the metrics and the trained SVM parameters available, so that the point cloud quality score can be reproduced?
Thank you in advance
Hi, thanks for the dataset!!
I can download it on my local machine by clicking on the links at https://ai.facebook.com/datasets/co3d-downloads/. But this is a bit tedious for remote machines.
Is there a recommended way for automating the download for the entire dataset onto a remote machine?
Thanks in advance!
Hi, thank you for your great work.
I was wondering if it is possible to release pretrained weights of 3D reconstruction models discussed in the paper? I think it will be very useful to the vision community in general.
Hi,
Thanks for your awesome work.
I am interested in the source data for plotting Figure 2 (left). I really like Figure 2 in the paper and want to plot a figure in a similar style.
Would you mind sharing the plot script?
It would be great if the checksums (e.g. SHA256) of the 51 zip files were provided. It would be even better if the checksums were verified in download_dataset.py. Or are the checksums already available somewhere?
Thanks for the help!
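Until official checksums are available, a local check along these lines could be used. This is a minimal sketch that relies only on the standard-library hashlib module; the `expected` mapping of file names to digests is hypothetical, and nothing here is part of download_dataset.py.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(expected, root="."):
    """Return the archive names whose digest differs from `expected`.

    `expected` is a hypothetical mapping {file name: hex digest};
    no official checksum list is assumed to exist.
    """
    return [
        name
        for name, want in expected.items()
        if sha256_of_file(Path(root) / name) != want.lower()
    ]
```

Reading in chunks keeps memory use flat even for multi-gigabyte archives.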
Thanks for open-sourcing this awesome dataset.
But when I go into a category folder, I don't see any ground-truth 3D point cloud files.
Hi, thanks for the great work!
I'm wondering whether it is possible to release the raw video data or the subsampled but undistorted video frames. It seems that when running COLMAP, intrinsics are not assumed to be shared across frames in each video sequence, which finally leads to inconsistently estimated intrinsics and differently sized undistorted images. Moreover, I think some of the reconstructions with inaccurately estimated camera poses or 3D structures could be avoided if a more optimized reconstruction pipeline is applied.
I am trying to use the CO3Dv2 dataset; however, I ran into some weird issues with the set_lists/set_lists_fewview_train.json fewview train JSON subset lists.
As defined in `co3d.implicitron.dataset.json_index_dataset_map_provider_v2.py`, line 104, each JSON file should contain the following structure:
Each `set_lists_<subset_name_l>.json` file contains the following dictionary:
{
    "train": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "val": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "test": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
}
In the case of the tv, hydrant, donut (and I believe all) categories, in set_lists_fewview_train.json, all of the frames (image_path) under "train" are also under "test".
However, set_lists_fewview_dev.json and set_lists_fewview_test.json contain clearly separated "train" and "test" frames.
I am not sure if this behavior is a design choice or a bug. My goal is to train a model only on the training set, and not on the dev or test sets. What would be the correct JSON subset list and subset to use?
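The overlap described above can be checked with a few lines. This is a minimal sketch that assumes only the dictionary layout quoted in this post, where each entry is (sequence_name, frame_number, image_path); the helper names are illustrative and not part of the co3d tooling.

```python
import json


def load_set_lists(path):
    """Load one set_lists_<subset_name>.json file into a dict."""
    with open(path) as f:
        return json.load(f)


def split_overlap(set_lists, a="train", b="test"):
    """Return the sorted image paths that appear in both splits.

    Each entry is expected to be (sequence_name, frame_number, image_path),
    so index 2 is the image path.
    """
    paths_a = {entry[2] for entry in set_lists.get(a, [])}
    paths_b = {entry[2] for entry in set_lists.get(b, [])}
    return sorted(paths_a & paths_b)
```

An empty result means the two splits share no frames; a non-empty result lists the shared image paths.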
Hi @davnov134, thanks for releasing the v2 of the dataset! It's very useful.
I am trying to perform depth-based warping between two images of the same sequence. What are the "units" of the depth maps with respect to the camera extrinsics (translations)? I am, so far, getting incorrect warpings which indicates that maybe the depths and camera poses have a different scale? Can you please advise?
I am using the _load_16big_png_depth function to load depths, and have already taken the scale_adjustment attribute into account.
Many thanks!
Hi,
Thanks for your great work and convenient script for downloading the dataset.
Do you plan to provide the md5sum for each zip file?
Hello, thanks for the amazing work!
I failed to get good results when training the original NeRF model on the CO3D dataset (single sequence). Have you tried this, and if so, how were the results?
Thank you very much!
I am trying to run eval_demo.py with visualizations. Line 203 of evaluation/evaluate_new_view_synthesis.py contains:
from tools.vis_utils import make_depth_image
However, I cannot find the corresponding file in the tools folder. Is this something that has not been released yet?
Thanks for your amazing dataset!
I encountered some weird results (as shown below) when back-projecting the depth map to generate a point cloud. The intrinsic matrix is the one obtained by @liuyuan in issue #4, and the depth map comes directly from car/106_12650_23736/depths/frame000001.jpg.geometric.png. It seems the intrinsic matrix does not match the depth map.
Can you give me some quick advice or references?
Thanks
Hi, thanks for the great work!
I have a question regarding loading data of multiple categories at the same time.
I saw here that you mark it as future work. Do you have plans to finish and release it?
Thanks!
Thanks @davnov134 for this great dataset.
I somehow cannot gain access to the full 50 categories.
As shown by the screenshot below, I can only find 31/50 categories from the dataset webpage or the batch download file. For instance, I didn't seem to find the "car" category.
Not sure if I have missed something. Thanks in advance :)
Thanks for the great work and dataset!
Our team is working on per-scene optimization (the same setting as "Sec. 5.2. Single-scene reconstruction" in your paper) and would really like to try it on your dataset.
Do you plan to release, as a subset, the data used for comparing single-scene reconstruction methods?
Have a nice day :)
Are the units of the point clouds inches or mm? And how do they relate to the units of the camera translation vector (extrinsics)?
Thanks for releasing the dataset!
I tried to run eval_demo.py after installation but encountered an error. The details are as follows:
Traceback (most recent call last):
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 209, in <module>
    main()
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 56, in main
    category, task=task, single_sequence_id=single_sequence_id
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 110, in evaluate_dbir_for_category
    test_restrict_sequence_id=single_sequence_id,
  File "F:\Github-Projects\co3d-master\dataset\dataset_zoo.py", line 182, in dataset_zoo
    datasets[dataset] = Co3dDataset(**params)
  File "<string>", line 29, in __init__
  File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 287, in __post_init__
    self._load_frames()
  File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 531, in _load_frames
    zipfile, List[types.FrameAnnotation]
  File "F:\Github-Projects\co3d-master\dataset\types.py", line 132, in load_dataclass
    return _dataclass_from_dict(asdict, cls)
  File "F:\Github-Projects\co3d-master\dataset\types.py", line 152, in _dataclass_from_dict
    types = typing.get_args(typeannot)
AttributeError: module 'typing' has no attribute 'get_args'
I'm using Python 3.6.8, PyTorch 1.7.1 and PyTorch3D 0.5.0. Can you provide some advice on this? Thanks in advance.
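The AttributeError in this traceback is consistent with typing.get_args having been added to the standard library in Python 3.8, so it does not exist on 3.6.8. Upgrading the interpreter is probably the simplest route. For illustration only, a fallback could look like the sketch below; `get_args_compat` is a hypothetical helper, and whether the rest of the codebase runs on 3.6 is not confirmed here.

```python
import typing


def get_args_compat(tp):
    """Return the type arguments of a generic alias such as List[int].

    Uses typing.get_args when the interpreter provides it (Python 3.8+),
    and falls back to the alias' __args__ attribute otherwise.
    """
    get_args = getattr(typing, "get_args", None)
    if get_args is not None:
        return get_args(tp)
    # Older interpreters: parametrized aliases expose __args__;
    # plain classes and bare generics yield an empty tuple here.
    return getattr(tp, "__args__", None) or ()
```

The fallback only approximates typing.get_args and may differ for special forms such as Callable.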
Could anyone clarify how the world coordinate system is oriented? Is there any way to figure out the gravity direction? Thanks!
Dear authors,
Thank you for your great dataset and code. I am looking to transform an image and corresponding depth map into a point cloud. I see that you have helpfully provided a function get_rgbd_point_cloud in the point cloud utils file for this purpose. However, I am having difficulty with this function.
Specifically, I am able to create a point cloud, but it does not have the expected shape. I've created a minimal example below:
import torch
from pytorch3d.implicitron.dataset.dataloader_zoo import FrameData
from pytorch3d.implicitron.dataset.dataset_zoo import dataset_zoo
from pytorch3d.implicitron.tools.point_cloud_utils import get_rgbd_point_cloud
from pytorch3d.structures import Pointclouds
from pytorch3d.vis.plotly_vis import plot_scene
# Dataset arguments (copied from the single-sequence implicitron config)
dataset_args = {
    'dataset_name': 'co3d_singlesequence',
    'dataset_root': '/path/to/co3d',
    'category': 'hydrant',
    'limit_to': -1,
    'limit_sequences_to': -1,
    'n_frames_per_sequence': 1,
    'test_on_train': False,
    'load_point_clouds': True,
    'mask_images': False,
    'mask_depths': False,
    'restrict_sequence_name': (),
    'test_restrict_sequence_id': 0,
    'assert_single_seq': True,
    'only_test_set': False,
    'aux_dataset_kwargs': {
        'box_crop': True,
        'box_crop_context': 0.3,
        'image_width': 800,
        'image_height': 800,
        'remove_empty_masks': True
    },
    'path_manager': None
}
# Load dataset
datasets = dataset_zoo(**dataset_args)
# Get first item from the dataset
item: FrameData = datasets['train'][0]
# Check shapes
print(item.image_rgb.shape) # -> torch.Size([3, 800, 800])
print(item.depth_map.shape) # -> torch.Size([1, 800, 800])
print(item.depth_mask.shape) # -> torch.Size([1, 800, 800])
# Get point cloud
rendered_pointcloud = get_rgbd_point_cloud(
    camera=item.camera,
    image_rgb=torch.unsqueeze(item.image_rgb, dim=0),
    depth_map=item.depth_map,
    mask=item.depth_mask,
    mask_thr=0.50,
)
# Check shapes
print(rendered_pointcloud.points_packed().shape) # -> torch.Size([97710, 3])
# Plot a single point cloud using plotly
plot_scene({'Pointcloud': {'scene': rendered_pointcloud}})
# Get point cloud with constant depth
rendered_pointcloud = get_rgbd_point_cloud(
    camera=item.camera,
    image_rgb=torch.unsqueeze(item.image_rgb, dim=0),
    depth_map=item.depth_map,
    mask=item.depth_mask,
    mask_thr=0.50,
)
# Check shapes
print(rendered_pointcloud.points_packed().shape) # -> torch.Size([97710, 3])
# Plot a single point cloud using plotly
plot_scene({'Pointcloud': {'scene': rendered_pointcloud}})
I obtain the following point cloud:
(This is with a different sequence_id from the minimal example above, but the result is similar)
I was expecting something along the lines of the point cloud in item.sequence_point_cloud (although of course only a partial point cloud, because I only have a single view here). When I visualize that point cloud, it renders as expected.
I realize that I am most likely just misunderstanding something and using this function incorrectly, but I'm not sure how I should be using it.
Thank you and all the best,
Luke
Hi, for each sequence, would it be possible to provide the (anonymized) user ID of the user that provided/uploaded the video?
I know this might be a big ask, but I am trying to set up an "instance retrieval" use case with the dataset. I am working on the assumption that videos uploaded by the same user would have a similar background, which would be useful in my project.
Please let me know if providing such annotations would be possible. :)
Thanks,
Yash
Hi,
Thanks for releasing the dataset. I encountered some problems when trying to download the dataset.
I encountered a timeout error when downloading the couch data from the CO3D downloads page. I tried to download it again, but the file came out at 647MB and was broken. Timeout errors were also raised when I tried to download the data of some other categories. I then switched to the download_dataset.py script but still encountered the timeout error.
Can you help me with this? Thanks very much.
Hi! Thanks for your wonderful dataset!
I have a question about the camera intrinsics matrix. I found for all data the principal_point is [0, 0], which is really rare for real-world cameras. Could you please explain it briefly? Thanks in advance.
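One possible reading is that the intrinsics are stored in normalized device coordinates (NDC) rather than pixels, in which case a principal point of [0, 0] would denote the image center. The sketch below shows the conversion under that assumption, following the PyTorch3D convention in which NDC is scaled by half of the shorter image side. The function name is hypothetical, and whether every CO3D annotation follows exactly this convention is not confirmed in this thread.

```python
def ndc_to_pixel_intrinsics(focal_ndc, principal_ndc, image_width, image_height):
    """Convert NDC focal length and principal point to pixel units.

    Assumes the PyTorch3D-style isotropic NDC convention: both axes are
    scaled by half of the shorter image side, and the NDC origin sits at
    the image center with +X pointing left and +Y pointing up.
    """
    half_min = min(image_width, image_height) / 2.0
    fx = focal_ndc[0] * half_min
    fy = focal_ndc[1] * half_min
    # The sign flip accounts for NDC +X/+Y pointing opposite to pixel axes.
    cx = image_width / 2.0 - principal_ndc[0] * half_min
    cy = image_height / 2.0 - principal_ndc[1] * half_min
    return (fx, fy), (cx, cy)
```

Under this convention a principal point of (0, 0) always maps to (image_width / 2, image_height / 2), whatever the image size.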
Hello, thank you for the nice contribution of this 3D dataset.
When I try to download all dataset categories with download_dataset.py and the --link_list_file option, it takes a long time and always fails to download all categories in one go.
Also, when the download breaks and is restarted, it starts again from the beginning.
I got the following error message:
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
I asked about this problem at the ICCV 2021 unsupervised 3D in the wild workshop, and David said it is related to the combination of a slow connection and the timeout settings on the FB server.
I am waiting for this setting to be changed. Could you help with it? When will it be fixed?
Please help me. Thank you so much for your great work.
Sincerely, Stella Yang.
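As a stopgap for interrupted transfers, a partially downloaded archive can in principle be continued with an HTTP Range request instead of restarting from zero. The sketch below uses only the standard-library urllib; both function names are hypothetical, and it assumes the server honors Range requests and that the download link is still valid, neither of which is confirmed for the CO3D links.

```python
import os
import urllib.request


def build_resume_request(url, partial_path):
    """Build a request that asks only for the bytes missing from `partial_path`."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        request.add_header("Range", f"bytes={offset}-")
    return request, offset


def resume_download(url, partial_path, chunk_size=1 << 20, timeout=60):
    """Append the remaining bytes to `partial_path`, or restart if needed."""
    request, offset = build_resume_request(url, partial_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Status 206 means the server honored the range; any other
        # successful status carries the whole file, so overwrite instead.
        mode = "ab" if offset > 0 and response.status == 206 else "wb"
        with open(partial_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

If the local file is already complete, a server will typically answer the range request with an error status, which urlopen raises as an HTTPError.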
Thanks for the great work on CO3Dv2 and also for answering so many questions on GitHub!
I recently noticed that the focal length of the 1st camera (the novel view we want to render) within the fewview_test subset is drastically different from that of the rest of the context cameras (i.e. 1.8 vs 3.7).
Meanwhile, within the fewview_dev subset, the focal lengths of the 1st camera and of the rest of the context cameras are roughly in the same range (i.e. 2.6 vs 2.8).
What is the underlying mechanism behind this phenomenon?
Looking at the sample submission code with dbir, I see that we render a cropped image and then paste the crop onto the original image with paste_render_to_original_image. Therefore, when I run python example_co3d_challenge_submission.py on fewview_dev, it first renders the cropped 800x800 image. This behavior appears to be in line with the focal lengths for the fewview_dev subset, but not the fewview_test subset.
Thanks!
Hi. Thanks for great work.
We recently tried rendering scenes on CO3D v2 and found some issues regarding the camera parameters.
Although our code successfully rendered scenes on CO3D v1, our model fails to render reliably on CO3D v2.
We have not changed the code.
I didn't find any documentation of v2, so I would like to ask in which details CO3D v2 differs from CO3D v1.
Was there any change to the camera coordinate system?
Otherwise, was there any difference in the MVS step?
Thanks for releasing this useful dataset. I was trying to download the data following the CDN links found in the text file, but for the URLs I get "URL signature expired" error from any browser and any machine I try it from. How do I solve this?
Hello, many thanks for this impressive work!
Do you have any plans to share the code for NerFormer?
hey,
I am working with Python 3.7, PyTorch 1.9.1 and PyTorch3D 0.6.0 (which is the latest), but the demo still fails with the exception:
Traceback (most recent call last):
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/orweiser/code/co3d/eval_demo.py", line 209, in <module>
    main()
  File "/home/orweiser/code/co3d/eval_demo.py", line 56, in main
    category, task=task, single_sequence_id=single_sequence_id
  File "/home/orweiser/code/co3d/eval_demo.py", line 110, in evaluate_dbir_for_category
    test_restrict_sequence_id=single_sequence_id,
  File "/home/orweiser/code/co3d/dataset/dataset_zoo.py", line 182, in dataset_zoo
    datasets[dataset] = Co3dDataset(**params)
  File "<string>", line 30, in __init__
  File "/home/orweiser/code/co3d/dataset/co3d_dataset.py", line 289, in __post_init__
    assert_pytorch3d_has_new_ndc_convention()
  File "/home/orweiser/code/co3d/tools/camera_utils.py", line 212, in assert_pytorch3d_has_new_ndc_convention
    "This codebase uses the new Pytorch3D NDC convention."
Hi,
Unless I am mistaken, some image files are incorrectly placed within the dataset. I've only observed this in the car category, but can't rule it out elsewhere.
For instance, consider the car sequence '336_34852_64130'. This is associated with frame indexes, contiguously, from 9284 to 9385 (102 frames). However, an inspection of the image sequence reveals that about 20 of these frames come from a different sequence (see image for a subset - incorrect cars begin at 9306).
This issue does not seem to only apply to the RGB images. Depth maps and foreground probabilities (pictured here for frame 9307) are "correctly" linked to the RGB images, and thus incorrect for the sequence.
Would it be possible to look into this?
Thank you
Hi, the dataset seems to be available only to the staff of Facebook AI Research. Could it be released to other people?
Hi, the paper reports that COLMAP's SfM produces inaccurate camera poses on about 18% of the raw video collection.
Is it possible to have a few samples / a dataset of such failure cases? Very interested in looking at them.
Thanks!
Thanks for the awesome dataset.
Since I am interested in object size estimation, I wonder if the reconstruction has any absolute scale among different sequences? Or is the point cloud aligned per-category in rotation/translation/size?
https://ai.facebook.com/datasets/co3d-downloads/
Could you please upload details on which class label corresponds to each zip file? It would be very helpful, as downloading the whole 1.4TB of data is cumbersome.
Hi, thanks for releasing this dataset!
I'm trying to visualize the 3D annotations but I'm having some issues projecting the point cloud on the frame.
Essentially I'm picking a single frame from the dataset:
dataset = dataset_zoo("co3d_multisequence", "data_folder", "cup", load_point_clouds=True, test_on_train=False)
train_ds = dataset["train"]
frame = train_ds[42]
From that frame I extract the image and the related point cloud:
image = plt.imread(frame.image_path)
point_cloud = frame.sequence_point_cloud
pcl_points = point_cloud.points_list()[0]
I then project the point cloud to screen space:
pcl_proj = frame.camera.transform_points_screen(pcl_points, image_size=image.shape[:2])
However, drawing a set of the projected points on top of the image shows a misalignment:
pcl_proj = pcl_proj[torch.randperm(pcl_proj.size()[0])] # shuffling the point list to sample at random
_, ax = plt.subplots()
ax.imshow(image)
ax.scatter(pcl_proj[:1000,0], pcl_proj[:1000,1])
Hello,
Thanks for creating this awesome dataset :)
I'm trying to make 3D bounding boxes from the point clouds using the PyTorch3D Pointclouds.get_bounding_boxes method; however, my results so far look completely off. Here is my code for transforming object point clouds into 3D bounding boxes:
def bb_vertex_from_sizes(sizes, cloud_idx=0):
    sizes = sizes[cloud_idx]
    x_min, x_max = sizes[0]
    y_min, y_max = sizes[1]
    z_min, z_max = sizes[2]
    point_0 = [x_min, y_min, z_min]
    point_1 = [x_min, y_min, z_max]
    point_2 = [x_min, y_max, z_min]
    point_3 = [x_min, y_max, z_max]
    point_4 = [x_max, y_min, z_min]
    point_5 = [x_max, y_min, z_max]
    point_6 = [x_max, y_max, z_min]
    point_7 = [x_max, y_max, z_max]
    return torch.Tensor(np.stack([point_0, point_1, point_2, point_3, point_4, point_5, point_6, point_7]))
dataset = dataset_zoo("co3d_multisequence", "data", "cup", load_point_clouds=True, test_on_train=False)
train_ds = dataset["train"]
n = random.randint(0, 10000)
frame = train_ds[n]
image = frame.image_rgb.permute(1,2,0).numpy()
point_cloud = frame.sequence_point_cloud[0]
bbox = bb_vertex_from_sizes(point_cloud.get_bounding_boxes())
bbox_proj = frame.camera.transform_points_screen(bbox, image_size=image.shape[:2])
bbox_proj = bbox_proj.int().numpy()[:,:2]
fig, ax = plt.subplots(figsize=(15,10))
ax.imshow(image)
ax.scatter(bbox_proj[:, 0], bbox_proj[:, 1])
I just wanted to check whether this is possible at all before diving deeper into it.
The calculated 3D vertices do not seem to be correct, even when the point clouds seem to be without too many outliers.
I see that the point clouds sometimes have outliers, so my approach would be to filter these out, perhaps by only keeping the points inside the 2D bounding box of the object. However, I am not sure this is the best approach, and it wouldn't fix the bounding boxes in my current implementation.
Hope you can help :)
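As a side note on the corner construction in this post, the eight vertices of an axis-aligned box can be enumerated more compactly with itertools.product, in the same order as point_0 to point_7. This is only a sketch of that enumeration with a hypothetical helper name; it does not address the misalignment itself, and the resulting box is axis-aligned in whatever coordinate frame the bounds are given in.

```python
from itertools import product


def box_corners(bounds):
    """Enumerate the 8 corners of an axis-aligned box.

    `bounds` is ((x_min, x_max), (y_min, y_max), (z_min, z_max)).
    Corners come out with z varying fastest, then y, then x.
    """
    return [list(corner) for corner in product(*bounds)]
```

The returned list of lists can be passed to a tensor constructor in place of the stacked points.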
Hi, thanks for your great work on the big real-world dataset.
However, if you look at sequence car/216_22790_47232 and other sequences, you will see that many of the foreground masks wrongly annotate background cars. Here, only images 0038 and 0039 are correct.
So the corresponding point cloud car/216_22790_47232/pointcloud.ply is incorrect as well.
This happens in most of the sequences in car. Maybe the annotating algorithm masks several cars and you only retain one?
Are other categories better?
Thanks for the wonderful work!
We are trying to replicate the work of IDR on your dataset. However, we have a problem when using the preprocess_cameras.py provided by IDR to normalize the objects so that they fit in the unit sphere. If we try to run the IDR model without normalizing, an error occurs and no points are found during ray tracing. It seems the objects often fall far from the origin. Have you run into the same problem, and if so, how did you solve it?
Thanks a lot!
Hi, what is the coordinate convention of CO3D? Is it the same as the one used in PyTorch3D (X left, Y up, Z in)?
Thanks for releasing the dataset! I plan to train a NeRF model on CO3D using the implementation in PyTorch3D, but I have trouble choosing the near and far bounds of the camera. Can you provide some advice on how to calculate the two values, or give an experimental reference?
Hi,
Most of the zip files on the download page (https://ai.facebook.com/datasets/co3d-downloads/) are not the full archive and thus cannot be opened.
I have tried downloading using the download_dataset.py script and manually from the website.
Some of the links give HTTP 400 errors and do not download at all (e.g. broccoli, toytruck, microwave).
Most of the links download an incomplete file that does not match the checksums from #12. These files cannot be opened. For example, couch is a 648 MiB file that cannot be opened.
The categories that I was able to successfully download and decompress are donut, frisbee, plant, and tv.
Is it possible to double check the links on the website or host them on an alternative file sharing site?
Thanks in advance!
Hello, Thanks for your novel dataset.
I am interested in the NerFormer architecture introduced in your CO3D paper.
But I couldn't find the code for NerFormer.
Do you have any plans to share the model code on GitHub?
Hi! Thanks for your wonderful dataset!!
I watched the video on YouTube for an overview of the dataset and the experiments, so would the technical report with those baseline results be available on arXiv? (Not sure if I missed it...)
And do you plan to release the code for the baseline implementations mentioned in your video, especially the one for your CVPR paper?
Thanks!
Hi,
I tried to execute the unit tests as described here in the README, but got the following attribute error:
It seems that TestDatasetTypes.test_parsing() in test_types.py failed at the namedtuple type. If I comment out the following lines in test_types.py, the unit test passes.
parsed = types._dataclass_from_dict(NT(dct), NT)
self.assertEqual(parsed.annot, self.entry)
Hi, I am trying to fuse the depth maps together into one point cloud, but failed. I use the camera parameters given by the dataloader. Here is my code:
for i, frame in enumerate(train_dataset):
    depth = frame.depth_mask * frame.depth_map
    depth = depth[0].numpy()
    img = frame.image_rgb * frame.depth_map
    img = img.permute(1, 2, 0).numpy()
    rot, trans = frame.camera.R, frame.camera.T
    rot, trans = rot[0].numpy(), trans[0].numpy()
    extrinsic = np.eye(4)
    extrinsic[:3, :3] = rot
    extrinsic[:3, 3] = trans
    focal_length_px = frame.camera.focal_length[0].numpy() * np.array((h, w)) / 2
    principal_point = frame.camera.principal_point[0].numpy()
    principal_point_px = -1 * (principal_point - 1) * np.array((h, w)) / 2
    intrinsic = np.eye(4)
    intrinsic[[0, 1], [0, 1]] = focal_length_px
    intrinsic[:2, 2] = principal_point_px
    for u in range(w):
        for v in range(h):
            d = depth[u, v]
            if d == 0:
                continue
            coor_img = np.array((u, v, 1, 1 / d))[..., None]
            coor_world = d * np.linalg.inv(intrinsic @ extrinsic) @ coor_img
            xyz = coor_world[:3, 0]
            rgb = img[u, v]
            point = np.hstack([xyz, rgb])
            points.append(point)
Thanks a lot.