scannet / scannet

Home Page: http://www.scan-net.org/

License: Other

Makefile 0.06% C++ 17.76% C 37.35% Lua 1.11% Python 5.21% Ruby 0.03% Shell 0.20% Objective-C 29.82% Objective-C++ 3.81% CMake 0.01% Cuda 1.06% HLSL 0.13% CSS 0.30% HTML 0.09% JavaScript 2.31% MATLAB 0.37% Pug 0.38%
3d-reconstruction computer-graphics computer-vision deep-learning rgbd

scannet's Introduction

ScanNet

ScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations.

ScanNet Data

If you would like to download the ScanNet data, please fill out an agreement to the ScanNet Terms of Use, using your institutional email address, and send it to us at [email protected].

If you have not received a response within a week, it is likely that your email is bouncing - please check this before sending repeat requests. Please do not reply to the noreply email - your email won't be seen.

Please check the changelog for updates to the data release.

Data Organization

The data in ScanNet is organized by RGB-D sequence. Each sequence is stored under a directory named scene<spaceId>_<scanId>, or scene%04d_%02d, where each space corresponds to a unique location (0-indexed). The raw data captured during scanning, the camera poses and surface mesh reconstructions, and the annotation metadata are all stored together for the given sequence. The directory has the following structure:

<scanId>
|-- <scanId>.sens
    RGB-D sensor stream containing color frames, depth frames, camera poses and other data
|-- <scanId>_vh_clean.ply
    High quality reconstructed mesh
|-- <scanId>_vh_clean_2.ply
    Cleaned and decimated mesh for semantic annotations
|-- <scanId>_vh_clean_2.0.010000.segs.json
    Over-segmentation of annotation mesh
|-- <scanId>.aggregation.json, <scanId>_vh_clean.aggregation.json
    Aggregated instance-level semantic annotations on lo-res, hi-res meshes, respectively
|-- <scanId>_vh_clean_2.0.010000.segs.json, <scanId>_vh_clean.segs.json
    Over-segmentation of lo-res, hi-res meshes, respectively (referenced by aggregated semantic annotations)
|-- <scanId>_vh_clean_2.labels.ply
    Visualization of aggregated semantic segmentation; colored by nyu40 labels (see img/legend; ply property 'label' denotes the nyu40 label id)
|-- <scanId>_2d-label.zip
    Raw 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance.zip
    Raw 2d projections of aggregated annotation instances as 8-bit pngs
|-- <scanId>_2d-label-filt.zip
    Filtered 2d projections of aggregated annotation labels as 16-bit pngs with ScanNet label ids
|-- <scanId>_2d-instance-filt.zip
    Filtered 2d projections of aggregated annotation instances as 8-bit pngs
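
The space and rescan indices can be recovered from a scan directory name; here is a minimal Python sketch (a hypothetical helper, not part of the ScanNet toolkit):

import re

def parse_scan_id(scan_id):
    # scene%04d_%02d, e.g. "scene0002_00" -> space 2, rescan 0
    m = re.match(r"scene(\d{4})_(\d{2})$", scan_id)
    if m is None:
        raise ValueError("not a ScanNet scan id: %s" % scan_id)
    return int(m.group(1)), int(m.group(2))

space_id, rescan_id = parse_scan_id("scene0002_00")  # -> (2, 0)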

Data Formats

The following are overviews of the data formats used in ScanNet:

Reconstructed surface mesh file (*.ply): Binary PLY format mesh with +Z axis in upright orientation.
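
A minimal sketch for loading such a mesh in Python, assuming the third-party plyfile package (not part of the ScanNet toolkit; the file name is an example):

import numpy as np
from plyfile import PlyData

ply = PlyData.read("scene0000_00_vh_clean_2.ply")
vertices = np.stack([ply["vertex"][axis] for axis in ("x", "y", "z")], axis=1)
faces = np.vstack(ply["face"]["vertex_indices"])  # triangle vertex indices
# For *_vh_clean_2.labels.ply, the per-vertex nyu40 label id is stored in ply["vertex"]["label"].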

RGB-D sensor stream (*.sens): Compressed binary format with per-frame color, depth, camera pose and other data. See ScanNet C++ Toolkit for more information and parsing code. See SensReader/python for a very basic python data exporter.
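
A typical invocation of that exporter looks roughly like this (check reader.py --help for the full set of export options):

python reader.py --filename scene0000_00.sens --output_path out/scene0000_00 --export_depth_images --export_color_images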

Surface mesh segmentation file (*.segs.json):

{
  "params": {  // segmentation parameters
   "kThresh": "0.0001",
   "segMinVerts": "20",
   "minPoints": "750",
   "maxPoints": "30000",
   "thinThresh": "0.05",
   "flatThresh": "0.001",
   "minLength": "0.02",
   "maxLength": "1"
  },
  "sceneId": "...",  // id of segmented scene
  "segIndices": [1,1,1,1,3,3,15,15,15,15],  // per-vertex index of mesh segment
}
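
A minimal sketch for reading this file and grouping mesh vertices by segment (the file name is an example):

import json
from collections import defaultdict

with open("scene0000_00_vh_clean_2.0.010000.segs.json") as f:
    seg_indices = json.load(f)["segIndices"]  # one segment id per mesh vertex

seg_to_verts = defaultdict(list)
for vert_idx, seg_id in enumerate(seg_indices):
    seg_to_verts[seg_id].append(vert_idx)
# seg_to_verts[15] now lists all vertex indices belonging to segment 15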

Aggregated semantic annotation file (*.aggregation.json):

{
  "sceneId": "...",  // id of annotated scene
  "appId": "...", // id + version of the tool used to create the annotation
  "segGroups": [
    {
      "id": 0,
      "objectId": 0,
      "segments": [1,4,3],
      "label": "couch"
    },
  ],
  "segmentsFile": "..." // id of the *.segs.json segmentation file referenced
}

BenchmarkScripts/util_3d.py gives examples of parsing the semantic instance information from the *.segs.json, *.aggregation.json, and *_vh_clean_2.ply mesh files, with an example semantic segmentation visualization in BenchmarkScripts/3d_helpers/visualize_labels_on_mesh.py.
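
The gist of that mapping, as a rough sketch (the authoritative version is BenchmarkScripts/util_3d.py; the file names are examples, and using -1 for unannotated vertices is a choice made here, not a ScanNet convention):

import json
import numpy as np

with open("scene0000_00_vh_clean_2.0.010000.segs.json") as f:
    seg_indices = np.array(json.load(f)["segIndices"])
with open("scene0000_00.aggregation.json") as f:
    seg_groups = json.load(f)["segGroups"]

instance_ids = np.full(len(seg_indices), -1, dtype=np.int32)  # -1 = unannotated
labels = np.full(len(seg_indices), None, dtype=object)
for group in seg_groups:
    mask = np.in1d(seg_indices, group["segments"])  # vertices whose segment belongs to this object
    instance_ids[mask] = group["objectId"]
    labels[mask] = group["label"]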

2d annotation projections (*_2d-label.zip, *_2d-instance.zip, *_2d-label-filt.zip, *_2d-instance-filt.zip): Projection of 3d aggregated annotation of a scan into its RGB-D frames, according to the computed camera trajectory.
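
A minimal sketch for inspecting one of the projected label images, assuming Pillow and numpy (the path is an example of a frame extracted from <scanId>_2d-label.zip):

import numpy as np
from PIL import Image

label_img = np.array(Image.open("label/000000.png"))  # 16-bit png of ScanNet label ids
print(label_img.shape, label_img.dtype)
print(np.unique(label_img))  # label ids present in this frame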

ScanNet C++ Toolkit

Tools for working with ScanNet data. SensReader loads the ScanNet .sens data of compressed RGB-D frames, camera intrinsics and extrinsics, and IMU data.

Camera Parameter Estimation Code

Code for estimating camera parameters and depth undistortion. Required to compute sensor calibration files which are used by the pipeline server to undistort depth. See CameraParameterEstimation for details.

Mesh Segmentation Code

Mesh supersegment computation code, which we use to preprocess meshes in preparation for semantic annotation. Refer to the Segmentator directory for instructions on building and using the code.

BundleFusion Reconstruction Code

ScanNet uses the BundleFusion code for reconstruction. Please refer to the BundleFusion repository at https://github.com/niessner/BundleFusion. If you use BundleFusion, please cite the original paper:

@article{dai2017bundlefusion,
  title={BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration},
  author={Dai, Angela and Nie{\ss}ner, Matthias and Zollh{\"o}fer, Michael and Izadi, Shahram and Theobalt, Christian},
  journal={ACM Transactions on Graphics 2017 (TOG)},
  year={2017}
}

ScanNet Scanner iPad App

ScannerApp is designed for easy capture of RGB-D sequences using an iPad with an attached Structure.io sensor.

ScanNet Scanner Data Server

Server contains the server code that receives RGB-D sequences from iPads running the Scanner app.

ScanNet Data Management UI

WebUI contains the web-based data management UI used for providing an overview of available scan data and controlling the processing and annotation pipeline.

ScanNet Semantic Annotation Tools

Code and documentation for the ScanNet web-based semantic annotation interfaces are provided as part of the SSTK library. Please refer to https://github.com/smartscenes/sstk/wiki/Scan-Annotation-Pipeline for an overview.

Benchmark Tasks

We provide code for several scene understanding benchmarks on ScanNet:

  • 3D object classification
  • 3D object retrieval
  • Semantic voxel labeling

Train/test splits are given at Tasks/Benchmark.
Label mappings and trained models can be downloaded with the ScanNet data release.

See Tasks.

Labels

The label mapping file (scannet-labels.combined.tsv) in the ScanNet task data release contains mappings from the labels provided in the ScanNet annotations (id) to the object category sets of NYUv2, ModelNet, ShapeNet, and WordNet synsets. Download it along with the task data (--task_data) or by itself (--label_map).
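
A rough sketch for loading the mapping with Python's csv module; the column names used here ('raw_category', 'nyu40id') are assumptions based on the benchmark scripts, so check the header of your copy of the TSV:

import csv

raw_to_nyu40 = {}
with open("scannet-labels.combined.tsv", newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        raw_to_nyu40[row["raw_category"]] = row["nyu40id"]

print(raw_to_nyu40.get("couch"))  # nyu40 id for the raw 'couch' label, if present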

Citation

If you use the ScanNet data or code please cite:

@inproceedings{dai2017scannet,
    title={ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes},
    author={Dai, Angela and Chang, Angel X. and Savva, Manolis and Halber, Maciej and Funkhouser, Thomas and Nie{\ss}ner, Matthias},
    booktitle = {Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year = {2017}
}

Help

If you have any questions, please contact us at [email protected]

Changelog

License

The data is released under the ScanNet Terms of Use, and the code is released under the MIT license.

Copyright (c) 2017

scannet's People

Contributors

angeladai, angelxuanchang, mhalber, msavva, niessner, rozdavid

scannet's Issues

SemanticVoxelLabelling bad argument error

I'm trying to run the Semantic Voxel Labelling, using the test_scenes.lua file along with the provided test files. The command I'm running is: th test_scenes.lua --model trained_models/scannet.net --h5_list_path data/h5_scannet_samples/test_shape_voxel_data_list.txt. I'm not using the class mapping (is that necessary? which file would I use there?) or a different original number of classes. When I run it, I get the following output:

{
  gpu_index : 0
  model : "trained_models/scannet.net"
  h5_list_path : "data/h5_scannet_samples/test_shape_voxel_data_list.txt"
  orig_num_classes : 42
  class_mapping_file : ""
}
using #classes = 42
Loading model...
#test files = 15
/home/dan/torch/install/bin/luajit: bad argument #2 to '?' (start index out of bound at /tmp/luarocks_torch-scm-1-4825/torch7/generic/Tensor.c:984)
stack traceback:
        [C]: at 0x7fcd5fdcd350
        [C]: in function '__index'
        test_scenes.lua:85: in main chunk
        [C]: in function 'dofile'
        .../dan/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x00405d50

What could be the issue here? Thanks for any help!

aggregation index out of range

I am reading _vh_clean.aggregation.json for a per-point annotation; however, for scene train/0002_00, I get the following error: index 307925 is out of bounds for axis 1 with size 297362. It seems that an index stored in aggregation.json exceeds the total number of points.

How to download all the sequences for a specific category

Hi,

I am wondering, if I want to download all sequences in Living room / Lounge, is there any way to do that? It seems the only way is to download them by their IDs. However, how could we obtain all the IDs for the Living room / Lounge category?

Invalid label in the data for 3D semantic label prediction

According to the README file, <scanId>_vh_clean_2.labels.ply contains the ply property 'label', which denotes the ScanNet label id that we use for semantic label prediction. According to the documentation, these labels correspond to those of the 40-class NYU dataset as defined in http://kaldir.vc.in.tum.de/scannet_benchmark/labelids_all.txt

However, I found some invalid labels in the label ply file.

  • the file scans/scene0270_00/scene0270_00_vh_clean_2.labels.ply contains invalid label 50
  • the file scans/scene0384_00/scene0384_00_vh_clean_2.labels.ply contains invalid label 149

Taking a look at your evaluation code, you seem to simply ignore anything outside a subset of 20 classes, so these invalid labels may not be a serious problem. Accordingly, I am treating these invalid labels as the ignore label.

Could you please take a look into the data? Thank you.

How can I run ScanNet CameraParameterEstimation?

I have the ScannerApp on an iPad that has a Structure sensor.
The iPad also has the Structure Sensor Calibrator app, and I have run it once already. I have a ScanNet system set up on Ubuntu and Windows 10.
How can I run CameraParameterEstimation?
There is only a short description of the calibration.
Please, help me.

the world coordinate origin of Pose.txt or extrinsic matrix

Hi,

May I ask how the world coordinate origin is decided when producing the pose.txt? For example, given a series of frames, I would set the camera location of the first frame as the world coordinate origin; does ScanNet do the same? Also, do all scenes use the same world coordinate origin?

Thank you very much!

Integration with ROS [not issue, request for help/advice]

I would like to try it with the point cloud format produced by ROS/PCL. Any suggestion on how to integrate it with ROS, e.g. a ROS node that receives a PCL PointCloud, passes it to the trained model, and returns the classification as the result?

Some questions about 2d annotation projections

In semantic segmentation, the labels should range from 0 to C, where C is the total number of classes. However, in the 2d annotation projections (*_2d-label.zip), many labels have pixel values larger than C (for example, C = 40 but the pixel value equals 111).

scans_test folder for "xxx.labels.ply" empty

Hi,
Do these files exist, "scans_test/scene0707_00/scene0707_00_vh_clean_2.labels.ply" to "scans_test/scene0806_00/scene0806_00_vh_clean_2.labels.ply"? My folders are all empty.

Or is there another way to get the number of objects for the "scans_test" scans?

Thanks,
Florian

How to read depth.pgm file correctly?

Hi,
I am trying to interpret the depth values in the depth.pgm files after unpacking the .sens file with SensReader. I read the depth.pgm file with a Python script I found online:

import re
import numpy as np

def read_pgm(filename, byteorder='>'):
    """Return image data from a raw PGM file as a numpy array.

    Format specification: http://netpbm.sourceforge.net/doc/pgm.html
    """
    with open(filename, 'rb') as f:
        buffer = f.read()
    try:
        header, width, height, maxval = re.search(
            rb"(^P5\s(?:\s*#.*[\r\n])*"
            rb"(\d+)\s(?:\s*#.*[\r\n])*"
            rb"(\d+)\s(?:\s*#.*[\r\n])*"
            rb"(\d+)\s(?:\s*#.*[\r\n]\s)*)", buffer).groups()
    except AttributeError:
        raise ValueError("Not a raw PGM file: '%s'" % filename)
    return np.frombuffer(buffer,
                         dtype='u1' if int(maxval) < 256 else byteorder + 'u2',
                         count=int(width) * int(height),
                         offset=len(header)
                         ).reshape((int(height), int(width)))

When reading depth.pgm with this script, I found that most of the values are larger than 1000. What is the measuring unit of these numbers? Is it millimeters?

Thanks

ScannerApp with Bundlefusion

Hello, I am going to use the color and depth data from ScannerApp to reconstruct a 3D model with BundleFusion.
Now I can get the data from ScannerApp (.camera, .depth, .h264, .imu, .txt), but how can I merge these files into one .sens file?
Thank you!

Possible bug in the instance segmentation evaluation script

While examining evaluate_semantic_instance.py I found a possible bug:

bool_void = np.in1d(gt_ids, VALID_CLASS_IDS)

I believe that the gt_ids are label_id*1000+inst_id, whereas the VALID_CLASS_IDS are just label_ids, so bool_void will always be all zeros. Besides, the current formula seems to compute the non-void mask. The right way to compute bool_void is probably:

bool_void = np.logical_not(np.in1d((gt_ids/1000).astype(np.int32), VALID_CLASS_IDS))

This bug may not affect the result too much but should be fixed.

Missing data for scene0141_01

There are only instance and label folders for scene0141_01; the color, depth, intrinsic_color, and intrinsic_depth folders are missing.
Please have a check~

connection timed out

I am having trouble downloading the data using the provided script. Is the server down?

Are the ground-truth poses correct?

I have tried to use the poses in the dataset to warp an image to another view. If the ground-truth depth and pose were correct, the warped image would align perfectly with the image at that viewpoint, but it doesn't. So I wonder whether the ground-truth poses are computed correctly.

What does 'PI' mean in ScanNet Terms of Use?

I am trying to download this dataset, but I have no idea what 'PI' means in the application form where it says 'PI's name:'. Is this dataset only available to researchers at Stanford University and Princeton University?

Cannot create a new submission on ScanNet public benchmark

Hi all,

I was wondering whether anyone has run into a problem when uploading results to the ScanNet benchmark. I have successfully created an account and am trying to submit my 3D instance segmentation results.

However, after filling in the information required on the 'Create New Submission' page and clicking the 'Save' button, the website always returns "This page isn't working; kaldir.vc.in.tum.de is currently unable to handle this request. HTTP Error 500". Has anyone encountered this problem? Any comments are welcome.

Best,
Jianyuan

Ground truth labels for the 2D segmentation evaluation script

In the evaluation script evalPixelLevelSemanticLabeling.py for 2D segmentation, line 260 checks whether the label is contained in the valid class IDs:

if (not groundTruthImgPixel in VALID_CLASS_IDS):
    printError ("Unknown label with id {:}".format(groundTruthImgPixel))

With VALID_CLASS_IDS defined as follows:
VALID_CLASS_IDS = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28, 33, 34, 36, 39])

I generated the ground truth labels by setting the non-valid class IDs to 0, but then an error is still thrown for the ignore label (0):
ERROR: Unknown label with id 0

How do I construct the ground truth labels correctly?
Thanks

Depth Shift

Hi,

what does depth shift 1000 imply in the SensReader/python/Readme.md file? Does this mean that the depths should be multiplied by 1000 when they are read from the .png files?

About the mesh reconstruction part

The mesh reconstruction results look fantastic. Did you use 3DLite to finish the mesh reconstruction part in the cloud? Is there any plan to open-source that part?

Timestamp on sens data

When reading the timestamps with the code provided in SensReader, it appears that all the times are 0. This was tested on "scene0000_00" and "scene0706_00". I was just wondering if this is always the case?

Why is the number of classes in 'labels.ply' inconsistent with 'aggregation.json'?

For example, in scene0128_00 there are 4 classes in labels.ply but 6 in aggregation.json, with 'sofa' and 'table' missing in the aggregation file (using only 20 classes). And clearly there are a sofa and a table in scene0128_00; just open 'scene0128_00_vh_clean_2.ply' to see them.

This problem is not limited to one scene; actually, most scenes have an inconsistency between the class counts of these two files. I couldn't figure out the reason. What I want to work with is every class in labels.ply and every object of that class.

Cannot export color images and depth images using reader.py

I want to export the color images and depth images.
I use reader.py at 'ScanNet/SensReader/python/' with:
python reader.py --filename raw_data_ScanNet/scene0000_01/scene0000_01.sens --output_path datasets_ScanNet/scene0000_01 --export_depth_images --export_color_images
But I receive the following error from SensorData.py:
" line 57, in load
self.sensor_name = ''.join(struct.unpack('c'*strlen, f.read(strlen)))
TypeError: sequence item 0: expected str instance, bytes found "
My Python version is 3.7.
Could you tell me how to fix this problem?

Corrupted data

I noticed that some camera poses in the dataset contain invalid data, with matrices consisting only of inf values. It's not a big issue, because it only applies to a small portion of frames, but it should at least be documented (I haven't read about it anywhere so far).

Further, I think that many sequences don’t have timestamps, but sometimes random frames seem to have random stamps. Maybe the value wasn’t always initialised?

Here is a small program that can highlight these issues:

#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <tuple>
#include "../external/scannet/sensorData.h"

using namespace std;

tuple<bool,bool> analyze_sens(const string& input, bool verbose = false){

    // Input
    if(verbose){
        cout << "Loading data ... ";
        cout.flush();
    }
    ml::SensorData sd(input);
    if(verbose){
        cout << "done!" << endl;
        cout << sd << endl;
    }

    // Stats
    bool ts_d_monotonic = true;
    bool ts_c_monotonic = true;
    bool ts_d_available = false;
    bool ts_c_available = false;
    uint64_t ts_d_last = 0;
    uint64_t ts_c_last = 0;

    bool has_illegal_transformation = false;

    for (size_t i = 0; i < sd.m_frames.size(); i++) {

        // Test timestamps
        const ml::SensorData::RGBDFrame& frame = sd.m_frames[i];
        uint64_t t_d = frame.getTimeStampDepth();
        uint64_t t_c = frame.getTimeStampColor();
        if (t_d > 0) ts_d_available = true;
        if (t_c > 0) ts_c_available = true;
        if (t_d < ts_d_last) ts_d_monotonic = false;
        if (t_c < ts_c_last) ts_c_monotonic = false;
        ts_d_last = t_d;
        ts_c_last = t_c;

        // Test poses
        ml::mat4f t = frame.getCameraToWorld();
        if(t.matrix[15] != 1 || t.matrix[14] != 0 || t.matrix[13] != 0 || t.matrix[12] != 0){
            has_illegal_transformation = true;
            if(verbose)
                cout << "Found illegal transformation at frame " << to_string(i) << ": ["
                     << t.matrix[0] << ", " << t.matrix[1] << ", " << t.matrix[2] << ", " <<t.matrix[3] << "]["
                     << t.matrix[4] << ", " << t.matrix[5] << ", " << t.matrix[6] << ", " <<t.matrix[7] << "]["
                     << t.matrix[8] << ", " << t.matrix[9] << ", " << t.matrix[10] << ", " <<t.matrix[11] << "]["
                     << t.matrix[12] << ", " << t.matrix[13] << ", " << t.matrix[14] << ", " <<t.matrix[15] << "]]" << endl;
        }
    }

    if(verbose){
        cout << "Depth timestamps are monotonic: " << (ts_d_monotonic ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
        cout << "RGB   timestamps are monotonic: " << (ts_c_monotonic ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
        cout << "Depth timestamps are available: " << (ts_d_available ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
        cout << "RGB   timestamps are available: " << (ts_c_available ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
        cout << "All  camera  poses  were legal: " << (!has_illegal_transformation ? "\x1B[32m yes" : "\x1B[31m no") << "\x1B[0m \n";
        cout << endl;
    }
    return make_tuple(!ts_d_monotonic || !ts_c_monotonic, has_illegal_transformation);
}

int main(int argc, char* argv[])
{
    if(argc < 2 || argc > 3) {
        cout << "A tool to analyse scannet *.sens data.\n\n"
                "Error, invalid arguments.\n"
                "Mandatory: input *.sens file / input *.txt file\n"
                "Optional path to dataset dir (if txt is provided)."
             << endl;
        return 1;
    }

    // Input data
    string filename = argv[1];
    if(filename.substr(filename.find_last_of(".") + 1) == "txt"){
        // Analyse many sens files
        string sequence_name;
        string root = (argc == 3 ? argv[2] : "");
        ifstream in_stream(filename);
        while (getline(in_stream, sequence_name)){
            cout << "Checking " << sequence_name << "...";
            cout.flush();
            tuple<bool,bool> r = analyze_sens(root + "/" + sequence_name + "/" + sequence_name  + ".sens");

            if(get<0>(r))
                cout << "\x1B[31m Timestamp issue \x1B[0m";
            else
                cout << "\x1B[32m Timestamps good \x1B[0m";

            if(get<1>(r))
                cout << "\x1B[31m Pose issue \x1B[0m" << endl;
            else
                cout << "\x1B[32m Poses good \x1B[0m" << endl;
        }
        in_stream.close();
    } else {
        // Analyse single sens files
        analyze_sens(filename, true);
    }


    return 0;
}

If you run this program on sequence scene0003_01.sens, for example, you get a list of invalid poses like:

Found illegal transformation at frame 1054: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1071: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1079: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1080: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1081: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1082: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1083: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1084: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1085: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1086: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1087: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]
Found illegal transformation at frame 1088: [-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf][-inf, -inf, -inf, -inf]]

Sequence scene0066_00.sens on the other hand has issues with timestamps. You can also run the tool on a list of sequences, giving you output like:

Checking scene0000_00... Timestamps good Poses good
Checking scene0000_01... Timestamps good Poses good
Checking scene0000_02... Timestamps good Poses good
Checking scene0001_00... Timestamps good Pose issue
Checking scene0001_01... Timestamps good Pose issue
Checking scene0002_00... Timestamps good Poses good
Checking scene0002_01... Timestamps good Pose issue
Checking scene0003_00... Timestamps good Pose issue
Checking scene0003_01... Timestamps good Pose issue
Checking scene0003_02... Timestamps good Pose issue
Checking scene0004_00... Timestamps good Poses good
Checking scene0005_00... Timestamps good Poses good
...

Thanks for providing this great dataset by the way!

Color Point Mapping To Depth Point

Hi,
Can you show me how to map a color pixel to a depth pixel, given that the resolutions of the color images and depth images are different?
Thanks a lot!

selected camera view

First, thank you for your great work. I have a question.

Is there a list of selected camera views? For example, if I want to train my network to perform instance segmentation, I don't need the whole image trajectory, just a set of selected images.

For example, Sun-Pbrs (https://github.com/yindaz/pbrs) provides 'precomputed camera views', which are a subset of all camera viewpoints. After rendering images from these viewpoints, we get standard-viewpoint images (similar viewpoints to the images we find on the web).

Generate Data from ScanNet Data

[image attachment]

So given the data above from the ScanNet dataset, how can I run th training.lua on it? I know for sure that I need h5 files from this dataset for training and testing. Any idea how to convert the ScanNet training data to h5?

About the pose from SensorData.py

I am preprocessing the relative pose between color frames (t, t+1).
Is the pose aligned to the camera coordinate frame, i.e. defined as CameraToWorld?
If so, is it right to compute the relative pose as Pose_{1->2} = Pose_1 @ inv(Pose_2)?
Also, I wonder how the coordinate frame of Pose_{1->2} is aligned to the camera coordinates in this case.

Image size of 2D annotation

It seems that the 2D annotation images (label + instance) have a different size (1296x968) than the color and depth frames (640x480), so how can I associate the labels with the depth values? You cannot simply resize integer images or depth images.

Possible bug in scene 0217_00

It seems that the objects in the aggregation.json file for scene0217_00 are duplicated, and each item has 2 entries in the data. The repetition seems to start at objectId = 31.

Has anyone else found this or was it a problem with my download?

Project the label image

Could you tell me which software is used to generate the projected label images in _2d-label.zip? (How is an image generated from the given pose?)
