

CO3Dv2: Common Objects In 3D (version 2)

This repository contains a set of tools for working with the 2nd version of the Common Objects in 3D (CO3Dv2) dataset.

The original dataset has been introduced in our ICCV'21 paper: Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction. For accessing the original data, please switch to the v1 branch of this repository.

New features in CO3Dv2

  • [Common Objects in 3D Challenge](https://eval.ai/web/challenges/challenge-page/1819/overview), which allows transparent evaluation on a hidden test server - more details in the challenge README
  • 2x more sequences and 4x more frames
  • Improved image quality - fewer blocky artifacts thanks to better video decoding
  • Improved segmentation masks - stable tracking of the main foreground object without jumping to background objects
  • A downloadable single-sequence subset of ~100 sequences, consisting only of the sequences used to evaluate the many-view single-sequence task
  • Dataset files are hosted in 20 GB chunks facilitating more stable downloads
  • A novel, more user-friendly, dataset format
  • All images within a sequence are cropped to the same height x width

Download the dataset

The links to all dataset files are present in this repository in dataset/links.txt.

Automatic batch-download

We also provide a Python script that downloads all dataset files at once. To do so, execute the download script:

python ./co3d/download_dataset.py --download_folder DOWNLOAD_FOLDER

where DOWNLOAD_FOLDER is a local target folder for downloading the dataset files. Make sure to create this folder before commencing the download.

Size: All zip files of the dataset occupy 5.5 TB of disk space.

Single-sequence dataset subset

We also provide a subset of the dataset consisting only of the sequences selected for the many-view single-sequence task, where both training and evaluation are commonly conducted on a single image sequence. To download this subset, add the --single_sequence_subset option to download_dataset.py:

python ./co3d/download_dataset.py --download_folder DOWNLOAD_FOLDER --single_sequence_subset

Size: The single-sequence subset is much smaller than the full dataset and occupies 8.9 GB of disk space.

Common Objects in 3D Challenge

Together with releasing v2 of the dataset, we also organize the Common Objects in 3D Challenge hosted on EvalAI. Please visit the [challenge website](https://eval.ai/web/challenges/challenge-page/1819/overview) and the [challenge README](./co3d/challenge/README.md) for more information.

Installation

This is a Python 3 / PyTorch codebase.

  1. Install PyTorch.
  2. Install PyTorch3D.
    • Please note that PyTorch3D has to be built from source to enable the Implicitron module.
  3. Install the remaining dependencies in requirements.txt:
    pip install visdom tqdm requests h5py
  4. Install the CO3D package itself: pip install -e .

Dependencies

Note that the core data model in co3d/dataset/data_types.py is independent of PyTorch/PyTorch3D and can be imported and used with other machine-learning frameworks.

Getting started

  1. Install dependencies - See Installation above.
  2. Download the dataset (see Download the dataset above) to a given root folder CO3DV2_DATASET_ROOT.
  3. Set the environment variable CO3DV2_DATASET_ROOT to the dataset root:
    export CO3DV2_DATASET_ROOT="your_dataset_root_folder"
  4. Run example_co3d_challenge_submission.py:
    cd examples
    python example_co3d_challenge_submission.py
    
    Note that example_co3d_challenge_submission.py runs an evaluation of a simple depth-based image rendering (DBIR) model on all challenges and sets of the CO3D Challenge. Feel free to extend the script in order to provide your own submission to the CO3D Challenge.

Running tests

Unit tests can be executed with:

python -m unittest

Reproducing results

Implicitron is our open-source framework used to train all implicit shape learning methods from the CO3D paper. Please visit the following link for more details: https://github.com/facebookresearch/pytorch3d/tree/main/projects/implicitron_trainer

Dataset format

The dataset is organized in the filesystem as follows:

CO3DV2_DATASET_ROOT
    ├── <category_0>
    │   ├── <sequence_name_0>
    │   │   ├── depth_masks
    │   │   ├── depths
    │   │   ├── images
    │   │   ├── masks
    │   │   └── pointcloud.ply
    │   ├── <sequence_name_1>
    │   │   ├── depth_masks
    │   │   ├── depths
    │   │   ├── images
    │   │   ├── masks
    │   │   └── pointcloud.ply
    │   ├── ...
    │   ├── <sequence_name_N>
    │   ├── set_lists
    │       ├── set_lists_<subset_name_0>.json
    │       ├── set_lists_<subset_name_1>.json
    │       ├── ...
    │       ├── set_lists_<subset_name_M>.json
    │   ├── eval_batches
    │   │   ├── eval_batches_<subset_name_0>.json
    │   │   ├── eval_batches_<subset_name_1>.json
    │   │   ├── ...
    │   │   ├── eval_batches_<subset_name_M>.json
    │   ├── frame_annotations.jgz
    │   ├── sequence_annotations.jgz
    ├── <category_1>
    ├── ...
    ├── <category_K>

The dataset contains sequences named <sequence_name_i> from K categories with names <category_j>. Each category comprises sequence folders <category_k>/<sequence_name_i> containing the sequence images, depth maps, foreground masks, and valid-depth masks (in the images, depths, masks, and depth_masks folders respectively). Furthermore, <category_k>/set_lists/ stores M json files set_lists_<subset_name_l>.json, each describing a certain sequence subset.

Users specify the loaded dataset subset by setting the subset_name option of the dataset provider to one of the available subset names <subset_name_l>, as sketched below.
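For instance, with PyTorch3D's Implicitron installed, a dataset for a given category and subset can be created roughly as follows. This is only a sketch: the class and argument names (JsonIndexDatasetMapProviderV2, expand_args_fields, get_dataset_map, and the "apple" category) are assumptions based on the PyTorch3D Implicitron API and may differ between versions.

from pytorch3d.implicitron.dataset.json_index_dataset_map_provider_v2 import (
    JsonIndexDatasetMapProviderV2,
)
from pytorch3d.implicitron.tools.config import expand_args_fields

# Configurable Implicitron classes need their arguments expanded before instantiation.
expand_args_fields(JsonIndexDatasetMapProviderV2)

dataset_map = JsonIndexDatasetMapProviderV2(
    category="apple",              # hypothetical category name
    subset_name="manyview_dev_0",  # one of the available <subset_name_l>
    dataset_root="your_dataset_root_folder",
).get_dataset_map()

train_dataset = dataset_map.train  # likewise dataset_map.val and dataset_map.test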

frame_annotations.jgz and sequence_annotations.jgz are gzipped json files containing the lists of all frames and sequences of the given category, stored as lists of FrameAnnotation and SequenceAnnotation objects respectively.
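Since these are plain gzipped json files, they can also be inspected without the co3d package. A minimal sketch (CO3DV2_DATASET_ROOT and category_name are placeholders, as in the loading example further below):

import gzip
import json

with gzip.open(
    f"{CO3DV2_DATASET_ROOT}/{category_name}/frame_annotations.jgz", "rt", encoding="utf8"
) as f:
    frame_annotations_raw = json.load(f)  # a list of dicts, one per frame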

Set lists

Each set_lists_<subset_name_l>.json file contains the following dictionary:

{
    "train": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "val": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "test": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
}

defining the list of frames (identified with their sequence_name and frame_number) in the "train", "val", and "test" subsets of the dataset.
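As a minimal sketch (standard library only; the category name below is hypothetical), such a set list can be loaded directly with the json module:

import json
import os

category = "apple"  # hypothetical category name
subset_name = "fewview_dev"

set_list_path = os.path.join(
    os.environ["CO3DV2_DATASET_ROOT"], category, "set_lists", f"set_lists_{subset_name}.json"
)
with open(set_list_path, "r") as f:
    set_list = json.load(f)

# Each entry is a [sequence_name, frame_number, image_path] triplet.
for sequence_name, frame_number, image_path in set_list["train"]:
    pass  # e.g. collect the training frames of the subset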

Note that frame_number can be obtained only from frame_annotations.jgz and does not necessarily correspond to the numeric suffix of the corresponding image file name (e.g. a file <category_0>/<sequence_name_0>/images/frame00005.jpg can have its frame number set to 20, not 5).
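If the mapping from image files to frame numbers is needed, it can be recovered from frame_annotations.jgz using the data types described below. This sketch assumes the FrameAnnotation fields sequence_name, frame_number, and image.path defined in co3d/dataset/data_types.py:

from typing import List
from co3d.dataset.data_types import load_dataclass_jgzip, FrameAnnotation

frame_annotations = load_dataclass_jgzip(
    f"{CO3DV2_DATASET_ROOT}/{category_name}/frame_annotations.jgz", List[FrameAnnotation]
)
# Map each image path to its (sequence_name, frame_number) pair.
image_path_to_frame = {
    fa.image.path: (fa.sequence_name, fa.frame_number) for fa in frame_annotations
}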

Available subset names in CO3Dv2

In CO3Dv2, by default, each category contains a subset of the following set lists:

"set_lists_fewview_test.json"  # Few-view task on the "test" sequence set.
"set_lists_fewview_dev.json"  # Few-view task on the "dev" sequence set.
"set_lists_manyview_test.json"  # Many-view task on the "test" sequence of a category.
"set_lists_manyview_dev_0.json"  # Many-view task on the 1st "dev" sequence of a category.
"set_lists_manyview_dev_1.json"  # Many-view task on the 2nd "dev" sequence of a category.

Eval batches

Each eval_batches_<subset_name_l>.json file contains a list of evaluation examples in the following form:

[
    [  # batch 1
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    [  # batch 2
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
]

Note that the evaluation examples always come from the "test" part of the corresponding set list set_lists_<subset_name_l>.json.

The evaluation task then consists of generating the first image in each batch given the knowledge of the other ones. Hence, the first image in each batch represents the (unseen) target frame, for which only the camera parameters are known, while the rest of the images in the batch are the known source frames whose cameras and colors are given.

Note that for the Many-view task, where a user is given many known views of a particular sequence and the goal is to generate held-out views from the same sequence, eval_batches_manyview_<sequence_set>_<sequence_id>.json contains a single (target) frame per evaluation batch. Users can obtain the known views from the corresponding "train" list of frames in the set list set_lists_manyview_<sequence_set>_<sequence_id>.json, as sketched below.
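The following sketch (standard library only; the category name is hypothetical) shows how the two files fit together for the many-view task:

import json
import os

category = "apple"  # hypothetical category name
subset_name = "manyview_dev_0"
category_root = os.path.join(os.environ["CO3DV2_DATASET_ROOT"], category)

with open(os.path.join(category_root, "eval_batches", f"eval_batches_{subset_name}.json")) as f:
    eval_batches = json.load(f)
with open(os.path.join(category_root, "set_lists", f"set_lists_{subset_name}.json")) as f:
    set_list = json.load(f)

# Each many-view eval batch holds a single (target) frame; the known source
# views are the "train" frames of the corresponding set list.
target_frames = [batch[0] for batch in eval_batches]
known_views = set_list["train"]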

PyTorch-independent usage

The core data model in co3d/dataset/data_types.py is independent of PyTorch/PyTorch3D and can be imported and used with other machine-learning frameworks.

For example, in order to load the per-category frame and sequence annotations users can execute the following code:

from typing import List
from co3d.dataset.data_types import (
    load_dataclass_jgzip, FrameAnnotation, SequenceAnnotation
)
category_frame_annotations = load_dataclass_jgzip(
    f"{CO3DV2_DATASET_ROOT}/{category_name}/frame_annotations.jgz", List[FrameAnnotation]
)
category_sequence_annotations = load_dataclass_jgzip(
    f"{CO3DV2_DATASET_ROOT}/{category_name}/sequence_annotations.jgz", List[SequenceAnnotation]
)
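As a small follow-up example (hypothetical, standard library only), the loaded annotations are plain dataclass lists and can be inspected directly, e.g. to count the frames of each sequence:

from collections import Counter

frames_per_sequence = Counter(fa.sequence_name for fa in category_frame_annotations)
print(frames_per_sequence.most_common(5))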

Furthermore, all challenge-related code under co3d/challenge also does not depend on PyTorch.

Reference

If you use our dataset, please use the following citation:

@inproceedings{reizenstein21co3d,
	Author = {Reizenstein, Jeremy and Shapovalov, Roman and Henzler, Philipp and Sbordone, Luca and Labatut, Patrick and Novotny, David},
	Booktitle = {International Conference on Computer Vision},
	Title = {Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction},
	Year = {2021},
}

License

The CO3D codebase is released under the CC BY 4.0 license.

Overview video

The following presentation of the dataset was delivered at the Extreme Vision Workshop at CVPR 2021: Overview


co3d's Issues

About Figure2

Hi,

Thanks for your awesome work.

I am interested in the source data for plotting Figure 2 (left). I really like Figure 2 in the paper and would like to plot a figure in a similar style.

Would you mind sharing the plot script?

Incorrect frame-to-sequence labelling, or image files, in dataset

Hi,

Unless I am mistaken, there are errors in the placement of image files within the dataset. I've only observed this in the Car category, but can't rule it out elsewhere.

For instance, consider the car sequence '336_34852_64130'. This is associated with frame indexes, contiguously, from 9284 to 9385 (102 frames). However, an inspection of the image sequence reveals that about 20 of these frames come from a different sequence (see image for a subset - incorrect cars begin at 9306).

This issue does not seem to only apply to the RGB images. Depth maps and foreground probabilities (pictured here for frame 9307) are "correctly" linked to the RGB images, and thus incorrect for the sequence.

Would it be possible to look into this?
Thank you

Cannot run test

I am trying to run the tests on the car dataset but encounter an error (shown in the attached screenshot).


failed unit test due to AttributeError

Hi,

I tried to execute the unit tests as described in the README, but got the following attribute error (see the attached screenshot).

It seems that TestDatasetTypes.test_parsing() in test_types.py fails at the namedtuple type. If I comment out the following lines in test_types.py, the unit test passes.

        parsed = types._dataclass_from_dict(NT(dct), NT)
        self.assertEqual(parsed.annot, self.entry)

Which categories do you use in Table 3. of your paper?

Hello!
I've read your paper, and it says that you use 10 categories to test the category-centric new-view synthesis ability of your model. However, I could not find out which 10 categories you used. Could you please share their names?

Any documents of v2?

Hi. Thanks for great work.

We have recently tried rendering scenes from CO3D v2 and found some issues regarding camera parameters.
Although our code successfully rendered scenes from CO3D v1, our model fails to reliably render on CO3D v2.
We have not changed the code.

I could not find any documentation for v2, so I want to ask about the minor details in which CO3D-v2 differs from CO3D-v1.
Was there any change to the camera coordinate convention?
Otherwise, was there any difference in the MVS step?

Bugs in foreground masks

Hi, thanks for your great work on the big real-world dataset.

However, if you look at sequence car/216_22790_47232 and other sequences, you will see that many of the foreground masks are wrongly annotated on background cars. Here, only images 0038 and 0039 are correct.

As a result, the corresponding point cloud car/216_22790_47232/pointcloud.ply is incorrect as well.

This happens in most of the sequences in car. Maybe the annotation algorithm masks several cars and you only retain one?

Will other categories be better?

eval_demo.py fail - pytorch3d old ndc conventions

Hey,
I am working with Python 3.7, PyTorch 1.9.1, and PyTorch3D 0.6.0 (which is the latest), but the demo still fails with the following exception:

Traceback (most recent call last):
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/orweiser/.vscode-server/extensions/ms-python.python-2021.11.1422169775/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/media/data1/orweiser/anaconda3/envs/co3d/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/orweiser/code/co3d/eval_demo.py", line 209, in <module>
    main()
  File "/home/orweiser/code/co3d/eval_demo.py", line 56, in main
    category, task=task, single_sequence_id=single_sequence_id
  File "/home/orweiser/code/co3d/eval_demo.py", line 110, in evaluate_dbir_for_category
    test_restrict_sequence_id=single_sequence_id,
  File "/home/orweiser/code/co3d/dataset/dataset_zoo.py", line 182, in dataset_zoo
    datasets[dataset] = Co3dDataset(**params)
  File "<string>", line 30, in __init__
  File "/home/orweiser/code/co3d/dataset/co3d_dataset.py", line 289, in __post_init__
    assert_pytorch3d_has_new_ndc_convention()
  File "/home/orweiser/code/co3d/tools/camera_utils.py", line 212, in assert_pytorch3d_has_new_ndc_convention
    "This codebase uses the new Pytorch3D NDC convention."

Pointclouds cannot be fuzed together when projecting depth map back into space

Hi, I am trying to fuse the depth maps into a single point cloud, but failed. I use the camera parameters given by the dataloader. Here is my code:

import numpy as np

points = []  # fused (x, y, z, r, g, b) points; h, w denote the image height and width
for i, frame in enumerate(train_dataset):
    depth = frame.depth_mask * frame.depth_map
    depth = depth[0].numpy()
    img = frame.image_rgb * frame.depth_map
    img = img.permute(1, 2, 0).numpy()

    rot, trans = frame.camera.R, frame.camera.T
    rot, trans = rot[0].numpy(), trans[0].numpy()
    extrinsic = np.eye(4)
    extrinsic[:3, :3] = rot
    extrinsic[:3, 3] = trans

    focal_length_px = frame.camera.focal_length[0].numpy() * np.array((h, w)) / 2
    principal_point = frame.camera.principal_point[0].numpy()
    principal_point_px = -1 * (principal_point - 1) * np.array((h, w)) / 2
    intrinsic = np.eye(4)
    intrinsic[[0, 1], [0, 1]] = focal_length_px
    intrinsic[:2, 2] = principal_point_px
    for u in range(w):
        for v in range(h):
            d = depth[u, v]
            if d == 0:
                continue
            coor_img = np.array((u, v, 1, 1 / d))[..., None]
            coor_world = d * np.linalg.inv(intrinsic @ extrinsic) @ coor_img
            xyz = coor_world[:3, 0]
            rgb = img[u, v]
            point = np.hstack([xyz, rgb])
            points.append(point)

Thanks a lot.

About the camera intrinsics matrix

Hi! Thanks for your wonderful dataset!
I have a question about the camera intrinsics matrix. I found that for all data the principal_point is [0, 0], which is really rare for real-world cameras. Could you please explain this briefly? Thanks in advance.

3D bounding boxes from point clouds

Helloo

Thanks for creating this awesome dataset :)

I'm trying to make 3D bounding boxes from the point clouds using the PyTorch3D Pointclouds.get_bounding_boxes method; however, my results so far look completely off. Code for transforming object point clouds into 3D bounding boxes:

import random

import matplotlib.pyplot as plt
import numpy as np
import torch

from dataset.dataset_zoo import dataset_zoo  # dataset/dataset_zoo.py in this repository

def bb_vertex_from_sizes(sizes, cloud_idx=0):
    sizes = sizes[cloud_idx]
    x_min, x_max = sizes[0]
    y_min, y_max = sizes[1]
    z_min, z_max = sizes[2]

    point_0 = [x_min, y_min, z_min]
    point_1 = [x_min, y_min, z_max]
    point_2 = [x_min, y_max, z_min]
    point_3 = [x_min, y_max, z_max]
    point_4 = [x_max, y_min, z_min]
    point_5 = [x_max, y_min, z_max]
    point_6 = [x_max, y_max, z_min]
    point_7 = [x_max, y_max, z_max]

    return torch.Tensor(np.stack([point_0, point_1, point_2, point_3, point_4, point_5, point_6, point_7]))


dataset = dataset_zoo("co3d_multisequence", "data", "cup", load_point_clouds=True, test_on_train=False)
train_ds = dataset["train"]
n = random.randint(0, 10000)
frame = train_ds[n]
image = frame.image_rgb.permute(1,2,0).numpy()
point_cloud = frame.sequence_point_cloud[0]
bbox = bb_vertex_from_sizes(point_cloud.get_bounding_boxes())
bbox_proj = frame.camera.transform_points_screen(bbox, image_size=image.shape[:2])
bbox_proj = bbox_proj.int().numpy()[:,:2]


fig, ax = plt.subplots(figsize=(15,10))
ax.imshow(image)
ax.scatter(bbox_proj[:, 0], bbox_proj[:, 1])


Just wanted to check whether this would be possible at all before diving deeper into it.

The calculated 3D vertices do not seem to be correct, even when the point clouds appear to have few outliers.
I see that the point clouds sometimes have outliers, so my approach would be to filter these out, perhaps by only keeping the points inside the object's 2D bounding box. I'm not sure this is the best approach, though, and it wouldn't fix the bounding boxes produced by my current implementation.

Hope you can help :)

Camera focal length for the test set different from dev set

Thanks for the great work on CO3Dv2 and also answering so many questions on Github!

I recently noticed that the focal length of the 1st camera (the novel view we want to render) within the fewview_test subset is drastically different from that of the rest of the context cameras, i.e. 1.8 vs 3.7.

Meanwhile, within the fewview_dev subset, the focal lengths of the 1st camera and the rest of the context cameras are roughly in the same range, i.e. 2.6 vs 2.8.

What is the underlying mechanism behind this phenomenon?

Looking at the sample submission code with DBIR, I see that we render a cropped image and then paste the crop onto the original image with paste_render_to_original_image. Therefore, when I run python example_co3d_challenge_submission.py on fewview_dev, it first renders the cropped 800x800 image. This behavior appears to be in line with the focal lengths for the fewview_dev subset, but not the fewview_test subset.

Thanks!

Download only for single-scene reconstruction

Thanks for the great work and dataset!
Our team is working on per-scene optimization (the same setting as your paper's Sec. 5.2, "Single-scene reconstruction") and we would really like to try it on your dataset.
Do you have plans to release the data for comparing single-scene reconstruction methods as a subset?

Have a nice day :)

Unit of depth maps w.r.t. camera extrinsics

Hi @davnov134, thanks for releasing the v2 of the dataset! It's very useful.

I am trying to perform depth-based warping between two images of the same sequence. What are the "units" of the depth maps with respect to the camera extrinsics (translations)? I am, so far, getting incorrect warpings which indicates that maybe the depths and camera poses have a different scale? Can you please advise?

I am using the _load_16big_png_depth function to load depths. And have already taken into account the scale_adjustment attribute.

Many thanks!

Problems with training original NeRF using CO3D dataset

Hello, thanks for the amazing work!

I failed to produce good results when using the CO3D dataset (single sequence) to train the original NeRF model. I'm wondering whether you've tried it and what the results were.

Thank you very much!

where to find the full 50 categories?

Thanks @davnov134 for this great dataset.

I somehow cannot gain access to the full 50 categories.

As shown by the attached screenshots, I can only find 31/50 categories on the dataset webpage or in the batch download file. For instance, I could not find the "car" category.


Not sure if I have missed something. Thanks in advance :)

NerFormer

Hello, many thanks for this impressive work!
Do you have any plans to share the code for NerFormer?

Misalignment between frame image and point cloud when transforming to screen space

Hi, thanks for releasing this dataset!

I'm trying to visualize the 3D annotations but I'm having some issues projecting the point cloud on the frame.
Essentially I'm picking a single frame from the dataset:

dataset = dataset_zoo("co3d_multisequence", "data_folder", "cup", load_point_clouds=True, test_on_train=False)
train_ds = dataset["train"]
frame = train_ds[42]

from that frame I extract the image and the related point cloud

image = plt.imread(frame.image_path)
point_cloud = frame.sequence_point_cloud
pcl_points = point_cloud.points_list()[0]

I then project the point cloud to screen space

pcl_proj = frame.camera.transform_points_screen(pcl_points, image_size=image.shape[:2])

However, drawing a set of the projected points on top of the image shows a misalignment:

pcl_proj = pcl_proj[torch.randperm(pcl_proj.size()[0])] # shuffling the point list to sample at random
_, ax = plt.subplots()
ax.imshow(image)
ax.scatter(pcl_proj[:1000,0], pcl_proj[:1000,1])


Quick question about the get_rgbd_point_cloud function

Dear authors,

Thank you for your great dataset and code. I am looking to transform an image and corresponding depth map into a point cloud. I see that you have helpfully provided a function get_rgbd_point_cloud in the point cloud utils file for this purpose. However, I am having difficulty with this function.

Specifically, I am able to create a point cloud, but it does not have the expected shape. I've created a minimal example below:

import torch

from pytorch3d.implicitron.dataset.dataloader_zoo import FrameData
from pytorch3d.implicitron.dataset.dataset_zoo import dataset_zoo
from pytorch3d.implicitron.tools.point_cloud_utils import get_rgbd_point_cloud
from pytorch3d.structures import Pointclouds
from pytorch3d.vis.plotly_vis import plot_scene


# Dataset arguments (copied from the single-sequence implicitron config)
dataset_args = {
    'dataset_name': 'co3d_singlesequence',
    'dataset_root': '/path/to/co3d',
    'category': 'hydrant',
    'limit_to': -1,
    'limit_sequences_to': -1,
    'n_frames_per_sequence': 1,
    'test_on_train': False,
    'load_point_clouds': True,
    'mask_images': False,
    'mask_depths': False,
    'restrict_sequence_name': (),
    'test_restrict_sequence_id': 0,
    'assert_single_seq': True,
    'only_test_set': False,
    'aux_dataset_kwargs': {
        'box_crop': True,
        'box_crop_context': 0.3,
        'image_width': 800,
        'image_height': 800,
        'remove_empty_masks': True
    },
    'path_manager': None
}
# Load dataset
datasets = dataset_zoo(**dataset_args)

# Get first item from the dataset
item: FrameData = datasets['train'][0]

# Check shapes
print(item.image_rgb.shape)  # -> torch.Size([3, 800, 800])
print(item.depth_map.shape)  # -> torch.Size([1, 800, 800])
print(item.depth_mask.shape)  # -> torch.Size([1, 800, 800])

# Get point cloud
rendered_pointcloud = get_rgbd_point_cloud(
    camera=item.camera,
    image_rgb=torch.unsqueeze(item.image_rgb, dim=0),
    depth_map=item.depth_map,
    mask=item.depth_mask,
    mask_thr=0.50,
)

# Check shapes
print(rendered_pointcloud.points_packed().shape)  # -> torch.Size([97710, 3])

# Plot a single point cloud using plotly
plot_scene({'Pointcloud': {'scene': rendered_pointcloud}})

# Get point cloud with constant depth
rendered_pointcloud = get_rgbd_point_cloud(
    camera=item.camera,
    image_rgb=torch.unsqueeze(item.image_rgb, dim=0),
    depth_map=item.depth_map,
    mask=item.depth_mask,
    mask_thr=0.50,
)

# Check shapes
print(rendered_pointcloud.points_packed().shape)  # -> torch.Size([97710, 3])

# Plot a single point cloud using plotly
plot_scene({'Pointcloud': {'scene': rendered_pointcloud}})

I obtain the point cloud shown in the attached GIF.

(This is with a different sequence_id from the minimal example above, but the result is similar)

I was expecting something along the lines of the point cloud in item.sequence_point_cloud (although of course only a partial point cloud, because I only have a single view here). When I visualize that point cloud, it renders as expected.

I realize that I am most likely just misunderstanding something and using this function incorrectly, but I'm not sure how I should be using it.

Thank you and all the best,
Luke

Checksums for the download files

It would be great if the checksums (e.g. SHA256) of the 51 zip files were provided. It would be even better if the checksums were verified in download_dataset.py. Or do we already have the checksums available somewhere?

Thanks for the help!

Samples for cases where COLMAP SfM fails

Hi, the paper reports that COLMAP's SfM produces inaccurate camera poses on about 18% of the raw video collection.
Is it possible to have a few samples / a dataset of such failure cases? Very interested in looking at them.
Thanks!

Is it possible to provide a user ID for each sequence?

Hi, for each sequence, would it be possible to provide the (anonymized) ID of the user that provided/uploaded the video?

I know this might be a big ask, but I am trying to set up an "instance retrieval" use case with the dataset. I am working under the assumption that videos uploaded by the same user would have a similar background, which would be useful in my project.

Please let me know if providing such annotations would be possible. :)

Thanks,
Yash

Dataset Not Available

Hi, the dataset seems to be available only to Facebook AI Research staff. Could the dataset be released to other people?

Release raw video data?

Hi, thanks for the great work!

I'm wondering whether it is possible to release the raw video data or the subsampled but undistorted video frames. It seems that when running COLMAP, intrinsics are not assumed to be shared across frames in each video sequence, which finally leads to inconsistently estimated intrinsics and differently sized undistorted images. Moreover, I think some of the reconstructions with inaccurately estimated camera poses or 3D structures could be avoided if a more optimized reconstruction pipeline is applied.

How to implement IDR on co3d dataset?

Thanks for the wonderful work!
We are trying to replicate the work of IDR on your dataset. However, we run into a problem when using the preprocess_cameras.py provided by IDR to normalize the objects so that they fit inside the standard sphere. If we try to run the IDR model without normalizing, an error occurs and we can't find any points during ray tracing. It seems the objects often fall far from the origin. We wonder if you have had the same problem and how you solved it?
Thanks a lot!

NerFormer architecture

Hello, Thanks for your novel dataset.

I am interested in the NerFormer architecture introduced in your CO3D paper.
But I couldn't find the code for NerFormer.
Do you have any plans to share the model code on github?

Technical report and baseline implementation.

Hi! Thanks for your wonderful dataset!!

I watched the YouTube video for an overview of the dataset and the experiments; will the technical report with those baseline results be available on arXiv? (Not sure if I missed it...)
And do you plan to release the code for the baseline implementations mentioned in the video, especially the one from your CVPR paper?

Thanks!

Units of the pointclouds

Are the units of the point clouds inches or millimeters? And how do they relate to the units of the camera translation vector (extrinsics)?

Cannot download dataset

Hi,

Thanks for releasing the dataset. I encountered some problems when trying to download the dataset.

I encountered a timeout error when downloading the couch data from the CO3D downloads page. I tried to download it again, but found that the file was only 647 MB and broken. The timeout error was also raised when I tried to download some other categories. I turned to the download_dataset.py script but still encountered the timeout error.

Can you help me with this? Thanks very much.

Depth map and intrinsic

Thanks for your amazing dataset!

I encountered some weird results (shown in the attached GIF) when back-projecting the depth map to generate the point cloud. The intrinsic matrix is obtained as described by @liuyuan in issue #4, and the depth map is taken directly from car/106_12650_23736/depths/frame000001.jpg.geometric.png. It seems the intrinsic matrix does not correspond to the depth map.

Can you give me some quick advice or references?


Connection broken: ConnectionResetError(104, 'Connection reset by peer')

Hello, Thank you for the nice contribution on 3d dataset.

When I tried to download all dataset categories with download_dataset.py and the --link_list_file option, it took a long time and always failed to download all categories at once.
Also, when the download breaks and is restarted, it starts again from the beginning.

I got following error message,

requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

I asked about this problem at the ICCV 2021 unsupervised 3D in the wild workshop, and David said it is related to the combination of a slow connection and the timeout settings on the FB server.

I am waiting for this setting to be changed. Could you help with it, and when will it be fixed?
Please help me. Thank you so much for your great work.

Sincerely, Stella Yang.

AttributeError occured when run eval_demo.py

Thanks for releasing the dataset!

I tried to run eval_demo.py after installation but encountered an error. The details are as follows:

Traceback (most recent call last):
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 209, in <module>
    main()
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 56, in main
    category, task=task, single_sequence_id=single_sequence_id
  File "F:/Github-Projects/co3d-master/eval_demo.py", line 110, in evaluate_dbir_for_category
    test_restrict_sequence_id=single_sequence_id,
  File "F:\Github-Projects\co3d-master\dataset\dataset_zoo.py", line 182, in dataset_zoo
    datasets[dataset] = Co3dDataset(**params)
  File "<string>", line 29, in __init__
  File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 287, in __post_init__
    self._load_frames()
  File "F:\Github-Projects\co3d-master\dataset\co3d_dataset.py", line 531, in _load_frames
    zipfile, List[types.FrameAnnotation]
  File "F:\Github-Projects\co3d-master\dataset\types.py", line 132, in load_dataclass
    return _dataclass_from_dict(asdict, cls)
  File "F:\Github-Projects\co3d-master\dataset\types.py", line 152, in _dataclass_from_dict
    types = typing.get_args(typeannot)
AttributeError: module 'typing' has no attribute 'get_args'

I'm using Python 3.6.8, PyTorch 1.7.1, and PyTorch3D 0.5.0. Can you provide some advice on this? Thanks in advance.

fewview_train subset JSON contain frames that belong in both of train and test sets

I am trying to use the CO3Dv2 dataset; however, I ran into some weird issues with the set_lists/set_lists_fewview_train.json subset lists.

As defined in co3d.implicitron.dataset.json_index_dataset_map_provider_v2.py, line 104, each JSON file should contain the following structure:

Each `set_lists_<subset_name_l>.json` file contains the following dictionary:
{
    "train": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "val": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
    "test": [
        (sequence_name: str, frame_number: int, image_path: str),
        ...
    ],
}

In the case of the tv, hydrant, and donut categories (and, I believe, all categories), in set_lists_fewview_train.json, all of the frames (image_path) under "train" are also under "test".

However, set_lists_fewview_dev.json and set_lists_fewview_test.json contain clearly separated "train" and "test" frames.

I am not sure if this behavior is a design choice or a bug. My goal is to train a model only on the training set, and not the dev or test sets. What would be the correct JSON subset list and subset to use?

Pretrained weights available?

Hi, thank you for your great work.
I was wondering if it would be possible to release pretrained weights of the 3D reconstruction models discussed in the paper? I think this would be very useful to the vision community in general.

md5sum of dataset

Hi,

Thanks for your great work and convenient script for downloading the dataset.

Do you have plans to provide the md5sum for each zip file?

CDN link expired

Thanks for releasing this useful dataset. I was trying to download the data following the CDN links found in the text file, but for all URLs I get a "URL signature expired" error from any browser and any machine I try. How do I solve this?

Dataset zip files on webpage are incomplete

Hi,

Most of the zip files on the download page (https://ai.facebook.com/datasets/co3d-downloads/) are not full archives and thus cannot be opened.

I have tried downloading using the download_dataset.py script and manually from the website.

Some of the links give HTTP 400 errors and do not download at all (e.g. broccoli, toytruck, microwave).
Most of the links download an incomplete file that does not match the checksums from #12; these files cannot be opened. For example, couch is a 648 MiB file that cannot be opened.
The categories that I was able to successfully download and decompress are donut, frisbee, plant, and tv.

Is it possible to double check the links on the website or host them on an alternative file sharing site?

Thanks in advance!

The near and far bounds of camera

Thanks for releasing the dataset! I plan to train a NeRF model on CO3D using the implementation in PyTorch3D, but I have problems choosing the near and far bounds of the camera. Can you provide some advice on how to calculate the two values, or give an experimental reference?

absolute scale?

Thanks for the awesome dataset.
Since I am interested in object size estimation, I wonder if the reconstructions have an absolute scale that is consistent across different sequences? Or are the point clouds aligned per category in rotation/translation/size?

tools.vis_utils not included?

I am trying to run eval_demo.py with visualizations. Line 203 of evaluation/evaluate_new_view_synthesis.py imports make_depth_image from tools.vis_utils. However, I cannot find the corresponding file in the tools folder. Is this something that has not been released yet?

About loading data of multiple categories

Hi, thanks for the great work!

I have a question regarding loading data from multiple categories at the same time.
I saw that you mark it here as future work; do you have plans to finish and then release it?

Thanks!
