fuenwang / led2-net

CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering" official PyTorch implementation.

Home Page: https://fuenwang.ml/project/led2net/

License: MIT License

led2-net's Introduction

LED²-Net

This is the PyTorch implementation of our CVPR 2021 Oral paper "LED²-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering".

You can visit our project website and upload your own panorama to see the 3D results!

[Project Website] [Paper]

Prerequisite

This repo is primarily based on PyTorch. You can use the following command to install the dependencies:

pip install -r requirements.txt

Preparing Training Data

Under LED2Net/Dataset, we provide the dataloaders for Matterport3D and Realtor360. The annotation formats of the two datasets follow PanoAnnotator. The format is described in detail in LayoutMP3D.

Under config/, config_mp3d.yaml and config_realtor360.yaml are the configuration files for Matterport3D and Realtor360, respectively.

Matterport3D

To train/val on Matterport3D, please modify the two items in config_mp3d.yaml.

dataset_image_path: &dataset_image_path '/path/to/image/location'
dataset_label_path: &dataset_label_path '/path/to/label/location'

The dataset_image_path and dataset_label_path follow the folder structure:

  dataset_image_path/
  |-------17DRP5sb8fy/
          |-------00ebbf3782c64d74aaf7dd39cd561175/
                  |-------color.jpg
          |-------352a92fb1f6d4b71b3aafcc74e196234/
                  |-------color.jpg
          .
          .
  |-------gTV8FGcVJC9/
          .
          .
  dataset_label_path/
  |-------mp3d_train.txt
  |-------mp3d_val.txt
  |-------mp3d_test.txt
  |-------label/
          |-------Z6MFQCViBuw_543e6efcc1e24215b18c4060255a9719_label.json
          |-------yqstnuAEVhm_f2eeae1a36f14f6cb7b934efd9becb4d_label.json
          .
          .
          .
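
Before launching training, it can help to confirm that every label file has a matching panorama. The sketch below is one way to do that. It assumes (this is inferred from the file names in the tree above, not stated by the repository) that a label named <scene>_<pano>_label.json corresponds to dataset_image_path/<scene>/<pano>/color.jpg. The function name find_missing_images is hypothetical.

```python
from pathlib import Path


def find_missing_images(image_root, label_root):
    """Return expected color.jpg paths that do not exist on disk.

    Assumption: <scene>_<pano>_label.json maps to <scene>/<pano>/color.jpg.
    """
    missing = []
    for label_file in sorted(Path(label_root, "label").glob("*_label.json")):
        # Strip the "_label" suffix, then split scene ID from panorama ID.
        stem = label_file.stem[: -len("_label")]
        scene, pano = stem.rsplit("_", 1)
        image_file = Path(image_root) / scene / pano / "color.jpg"
        if not image_file.is_file():
            missing.append(image_file)
    return missing
```

Calling find_missing_images with the two paths set in config_mp3d.yaml returns an empty list when every label has its image under this assumed naming.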

Then run main.py and specify the config file path:

python main.py --config config/config_mp3d.yaml --mode train # For training
python main.py --config config/config_mp3d.yaml --mode val # For testing

Realtor360

To train/val on Realtor360, please modify the item in config_realtor360.yaml.

dataset_path: &dataset_path '/path/to/dataset/location'

The dataset_path follows the folder structure:

  dataset_path/
  |-------train.txt
  |-------val.txt
  |-------sun360/
          |-------pano_ajxqvkaaokwnzs/
                  |-------color.png
                  |-------label.json
          .
          .
  |-------istg/
          |-------1/
                  |-------1/
                          |-------color.png
                          |-------label.json
                  |-------2/
                          |-------color.png
                          |-------label.json
                  .
                  .
          .
          .
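
Because the sun360 and istg subfolders nest to different depths, a recursive search is a simple way to list the usable samples. The sketch below collects every folder that holds both color.png and label.json. The function name list_complete_samples is hypothetical, and the sketch does not read train.txt or val.txt, whose contents are not described here.

```python
from pathlib import Path


def list_complete_samples(dataset_path):
    """Return folders that contain both color.png and label.json."""
    samples = []
    for label_file in sorted(Path(dataset_path).rglob("label.json")):
        folder = label_file.parent
        # Keep the folder only if the panorama is present as well.
        if (folder / "color.png").is_file():
            samples.append(folder)
    return samples
```

A folder that has a label but no image is skipped, so comparing the result against the number of label.json files gives a quick count of incomplete samples.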

Then run main.py and specify the config file path:

python main.py --config config/config_realtor360.yaml --mode train # For training
python main.py --config config/config_realtor360.yaml --mode val # For testing

Run Inference

After finishing the training, you can use the following command to run inference on your own data (xxx.jpg or xxx.png).

python run_inference.py --config YOUR_CONFIG --src SRC_FOLDER/ --dst DST_FOLDER --ckpt XXXXX.pkl

This script will predict the layouts of all images (jpg or png) under SRC_FOLDER/ and store the results as json files under DST_FOLDER/.

Pretrained Weights

We provide the pretrained model of Realtor360 in this link.

Currently, we use DuLa-Net's post-processing for inference. We will release the version using HorizonNet's post-processing later.

Layout Visualization

To visualize the 3D layout, we provide a visualization tool in 360LayoutVisualizer. Please clone it and install the corresponding packages. Then run the following commands:

cd 360LayoutVisualizer/
python visualizer.py --img xxxxxx.jpg --json xxxxxx.json

Citation

@InProceedings{Wang_2021_CVPR,
    author    = {Wang, Fu-En and Yeh, Yu-Hsuan and Sun, Min and Chiu, Wei-Chen and Tsai, Yi-Hsuan},
    title     = {LED2-Net: Monocular 360deg Layout Estimation via Differentiable Depth Rendering},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {12956-12965}
}

led2-net's Issues

Bad result in visualizer

Hello,
when I visualize my JSON result with visualizer.py, I get a bad result.

Do you have any idea why?

Unused variable up_down_ratio in the Label2Mesh function in utils of the 360LayoutVisualizer project

Error in calculating IoU

Hi, thanks for your work.
However, I recently found that the IoU calculation can behave unreasonably in some situations, which may lead to errors.

Your code calculates IoU from images:

import numpy as np

def IoU_2D(pred, gt, dummy_height1=None, dummy_height2=None):
    # Count the pixels set in both masks and the pixels set in either mask.
    intersect = np.sum(np.logical_and(pred, gt))
    union = np.sum(np.logical_or(pred, gt))
    iou_2d = intersect / union

    return iou_2d

The HorizonNet code calculates IoU from polygons:

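from shapely.geometry import Polygon

# dt_floor_xy and gt_floor_xy are defined elsewhere in the HorizonNet code.
dt_poly = Polygon(dt_floor_xy)
gt_poly = Polygon(gt_floor_xy)

# 2D IoU
area_dt = dt_poly.area
area_gt = gt_poly.area
area_inter = dt_poly.intersection(gt_poly).area
iou2d = area_inter / (area_gt + area_dt - area_inter)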

With fp_meters=20 in the config file:

Use the floor plan image to calculate IoU2D: 0.8690
Use the floor plan polygon to calculate IoU2D: 0.8722

Use the floor plan image to calculate IoU2D: 0.7491
Use the floor plan polygon to calculate IoU2D: 0.6429

The two calculations clearly disagree. My tests show that, on the Matterport3D test set, the IoU calculated from images is generally greater than the IoU calculated from polygons. I think one reason is that the radius of many layouts in the test set exceeds fp_meters, so they are not included in the floor plan image.

I tried increasing fp_meters, but that enlarges the error caused by pixel rounding. With fp_meters=50 in the config file:

Use the floor plan image to calculate IoU2D: 0.8934
Use the floor plan polygon to calculate IoU2D: 0.8722
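
The clipping effect described in this issue can be reproduced with a small synthetic case. The sketch below is not the repository's code: it assumes that fp_meters is the side length, in meters, of a square floor plan image centered on the origin, and it uses two made-up axis-aligned rectangles so that the exact IoU has a closed form. The helper names raster_mask, iou_image and iou_exact are hypothetical.

```python
import numpy as np


def raster_mask(rect, fp_meters, size_px):
    """Rasterize an axis-aligned rectangle (x0, y0, x1, y1), in meters,
    onto a size_px x size_px image spanning fp_meters, centered on 0."""
    # Pixel-center coordinates in meters.
    centers = (np.arange(size_px) + 0.5) / size_px * fp_meters - fp_meters / 2
    xs, ys = np.meshgrid(centers, centers)
    x0, y0, x1, y1 = rect
    return (xs >= x0) & (xs <= x1) & (ys >= y0) & (ys <= y1)


def iou_image(a, b, fp_meters, size_px):
    """Pixel-counting IoU, in the style of IoU_2D."""
    mask_a = raster_mask(a, fp_meters, size_px)
    mask_b = raster_mask(b, fp_meters, size_px)
    return np.logical_and(mask_a, mask_b).sum() / np.logical_or(mask_a, mask_b).sum()


def iou_exact(a, b):
    """Closed-form IoU of two axis-aligned rectangles."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


pred = (-4.0, -3.0, 4.0, 3.0)   # fits inside a 20 m image
gt = (-4.0, -3.0, 14.0, 3.0)    # extends 4 m past the edge of a 20 m image

print("exact:            ", iou_exact(pred, gt))
print("image, 20 m span: ", iou_image(pred, gt, fp_meters=20, size_px=512))
print("image, 40 m span: ", iou_image(pred, gt, fp_meters=40, size_px=1024))
```

With these rectangles, the 20 m image cuts off part of the ground truth and reports a higher IoU than the exact value, while the 40 m image comes close to it. The 40 m case doubles the resolution to keep the same pixel size; at a fixed resolution, a larger span means coarser pixels and more rounding error.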

utilising several images

Hi! I have several panoramic images of the same place and would like to use them together to estimate the layout, or perhaps merge the layouts estimated from each image into a final layout. I realize this is not part of your project, but could you give me some advice or ideas on how to do that? Thanks!

CPU Support?

Hello everyone,

First of all, thanks for sharing your work; your results are great. I'm trying to replicate them, but I'm facing some issues. My computer doesn't have a GPU, so I tried changing the value of exp_args.device from 'cuda:0' to 'cpu', but I got the following error:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
So I followed that suggestion and changed the line params = torch.load(args.ckpt) to params = torch.load(args.ckpt, map_location=torch.device(device)). I think that should work, but now I'm getting the following error (probably unrelated to the device issue):

Traceback (most recent call last):
  File "run_inference.py", line 58, in <module>
    pred_fp_down_man, pred_fp_down_man_pts = LED2Net.DuLaPost.fit_layout(pred_fp_down)
  File "/home/leo/sbdinc/LED2-Net/LED2Net/DuLaPost/layout.py", line 86, in fit_layout
    data_cnt.sort(key=lambda x: cv2.contourArea(x), reverse=True)
AttributeError: 'tuple' object has no attribute 'sort'

My Python version is 3.8.10 and my environment is:

absl-py==1.0.0
attrdict==2.0.1
cachetools==5.0.0
certifi==2021.10.8
charset-normalizer==2.0.11
cycler==0.11.0
fonttools==4.29.1
fvcore==0.1.5.post20220119
google-auth==2.6.0
google-auth-oauthlib==0.4.6
grpcio==1.43.0
idna==3.3
imageio==2.14.1
importlib-metadata==4.10.1
iopath==0.1.9
kiwisolver==1.3.2
Markdown==3.3.6
matplotlib==3.5.1
networkx==2.6.3
numpy==1.22.1
oauthlib==3.2.0
opencv-python==4.5.5.62
packaging==21.3
Pillow==9.0.0
portalocker==2.3.2
protobuf==3.19.4
pyasn1==0.4.8
pyasn1-modules==0.2.8
pylsd-nova==1.2.0
pyparsing==3.0.7
python-dateutil==2.8.2
pytorch3d==0.3.0
PyWavelets==1.2.0
PyYAML==6.0
requests==2.27.1
requests-oauthlib==1.3.1
rsa==4.8
scikit-image==0.19.1
scipy==1.7.3
six==1.16.0
tabulate==0.8.9
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
termcolor==1.1.0
tifffile==2021.11.2
torch==1.10.2
torchaudio==0.10.2
torchvision==0.11.3
tqdm==4.62.3
typing-extensions==4.0.1
urllib3==1.26.8
Werkzeug==2.0.2
yacs==0.1.8
zipp==3.7.0

Any help is appreciated :)
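
The second traceback is consistent with cv2.findContours returning the contours as a tuple, which recent opencv-python releases do; a tuple has no in-place sort method. One possible workaround is to build a sorted list instead. The sketch below illustrates the idea with made-up contours and a hypothetical polygon_area helper standing in for cv2.contourArea, so it runs without OpenCV. It is a suggestion, not a fix confirmed by the maintainers.

```python
def polygon_area(points):
    """Shoelace area of a closed polygon; stands in for cv2.contourArea."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# Stand-in for the tuple of contours in the traceback (made-up data).
small = [(0, 0), (1, 0), (1, 1), (0, 1)]
large = [(0, 0), (4, 0), (4, 3), (0, 3)]
data_cnt = (small, large)

# data_cnt.sort(...) fails on a tuple; sorted() accepts any iterable
# and returns a new list, largest area first.
data_cnt = sorted(data_cnt, key=polygon_area, reverse=True)
print([polygon_area(c) for c in data_cnt])
```

In LED2Net/DuLaPost/layout.py, the equivalent change would presumably be to replace the data_cnt.sort(...) call with data_cnt = sorted(data_cnt, key=cv2.contourArea, reverse=True).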