
hybriddepth's Introduction

Hybrid Depth: Robust Depth Fusion for Mobile AR
By Leveraging Depth from Focus and Single-Image Priors

Ashkan Ganj¹ · Hang Su² · Tian Guo¹

¹Worcester Polytechnic Institute    ²NVIDIA Research


This work presents HybridDepth, a practical depth estimation solution based on focal stack images captured from a camera. The approach outperforms state-of-the-art models across several well-known datasets, including NYU Depth V2, DDFF12, and ARKitScenes.

News

  • 2024-07-25: We released the pre-trained models.
  • 2024-07-23: Model and GitHub repository are online.

TODOs

  • Add Hugging Face model.
  • Release Android Mobile Client for HybridDepth.

Pre-trained Models

We provide the following models, trained on different datasets. You can download them from the links below:

Model                        Checkpoint
Hybrid-Depth-NYU-5           Download
Hybrid-Depth-NYU-10          Download
Hybrid-Depth-DDFF12-5        Download
Hybrid-Depth-ARKitScenes-5   Download

Usage

Preparation

  1. Clone the repository and install the dependencies:
git clone https://github.com/cake-lab/HybridDepth.git
cd HybridDepth
conda env create -f environment.yml
conda activate hybriddepth
  2. Download the necessary files:
    • Download the necessary file here and place it in the checkpoints directory.
    • Download the checkpoints listed here and put them under the checkpoints directory.
  3. Install the CUDA package for image synthesis:
python utils/synthetic/gauss_psf/setup.py install

This will install the Python package for synthesizing images.
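
To verify the environment before moving on, a quick sanity check can help. Note that the module name gauss_psf below is an assumption inferred from the setup.py path; adjust it if the package installs under a different name:

import torch

# Confirm that a CUDA device is visible to PyTorch.
print('CUDA available:', torch.cuda.is_available())

# Confirm that the synthesis package built and installed correctly.
# `gauss_psf` is a hypothetical module name based on utils/synthetic/gauss_psf.
try:
    import gauss_psf
    print('gauss_psf imported successfully')
except ImportError as err:
    print('gauss_psf not installed:', err)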

Dataset Preparation

  1. NYU: Download the dataset following the instructions given here.

  2. DDFF12: Download the dataset following the instructions given here.

  3. ARKitScenes: Download the dataset following the instructions given here.

Using the HybridDepth model for prediction

For inference, you can run the provided notebook test.ipynb or use the following code:

import torch

from model import DepthNetModule  # adjust this import to the repository's actual module path

# Load the model checkpoint
model_path = './checkpoints/checkpoint.ckpt'
model = DepthNetModule()
# Load the weights
model.load_state_dict(torch.load(model_path))

model.eval()
model = model.to('cuda')
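
If load_state_dict fails, the .ckpt file may be a PyTorch Lightning checkpoint, in which the weights are typically nested under a state_dict key. The fallback below assumes that layout:

# Fallback for Lightning-style checkpoints (an assumption about the
# checkpoint layout, not documented by this repository).
ckpt = torch.load(model_path, map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # unwrap if nested
model.load_state_dict(state_dict)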

After loading the model, you can use the following code to process the input images and get the depth map:

from utils.io import prepare_input_image

data_dir = 'focal stack images directory'

# Load the focal stack images
focal_stack, rgb_img, focus_dist = prepare_input_image(data_dir)

# Inference
with torch.no_grad():
    out = model(rgb_img, focal_stack, focus_dist)

metric_depth = out[0].squeeze().cpu().numpy()  # the metric depth map
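
To inspect the result, you can render and save the depth map. matplotlib is assumed to be available here; it is not a stated dependency of the repository:

import matplotlib.pyplot as plt

# Render the metric depth map with a colorbar and save it to disk.
plt.imshow(metric_depth, cmap='inferno')
plt.colorbar(label='Depth (m)')
plt.axis('off')
plt.savefig('depth_map.png', bbox_inches='tight')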

Evaluation

First, set up the configuration file config.yaml in the configs directory. We already provide configuration files for all three datasets in the configs directory. In the configuration file, you can specify the path to the dataloader, the path to the model checkpoint, and other hyperparameters. Here is an example of the configuration file:

data:
  class_path: dataloader.dataset.NYUDataModule # Path to your dataloader Module in dataset.py
  init_args:
    nyuv2_data_root: 'path/to/dataset/root' # path to the NYUv2 (or other) dataset root
    img_size: [480, 640]  # Adjust if your DataModule expects a tuple for img_size
    remove_white_border: True
    num_workers: 0  # if you are using synthetic data, you don't need multiple workers
    use_labels: True

model:
  invert_depth: True # If the model outputs inverted depth

ckpt_path: checkpoints/checkpoint.ckpt

Then specify the configuration file in the evaluate.sh script in the scripts directory, which runs:

python cli_run.py test --config configs/config_file_name.yaml

Finally, run the following command:

cd scripts
sh evaluate.sh
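
For reference, depth-estimation evaluations typically report metrics such as AbsRel, RMSE, and δ1 accuracy. The sketch below is a generic illustration of those metrics, not the repository's exact evaluation code:

import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """Compute AbsRel, RMSE, and delta < 1.25 accuracy over valid pixels."""
    mask = gt > eps                     # ignore invalid (zero-depth) pixels
    pred, gt = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, rmse, delta1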

Training

First, set up the configuration file config.yaml in the configs directory. You only need to specify the dataset path and the batch size; the remaining hyperparameters are already set. For example, you can use the following configuration file for training on the NYUv2 dataset:

...
model:
  invert_depth: True
  # learning rate
  lr: 3e-4 # you can adjust this value
  # weight decay
  wd: 0.001 # you can adjust this value

data:
  class_path: dataloader.dataset.NYUDataModule # Path to your dataloader Module in dataset.py
  init_args:
    nyuv2_data_root: 'path/to/dataset/root' # path to the NYUv2 (or other) dataset root
    img_size: [480, 640]  # Adjust if your NYUDataModule expects a tuple for img_size
    remove_white_border: True
    batch_size: 24 # Adjust the batch size
    num_workers: 0  # if you are using synthetic data, you don't need multiple workers
    use_labels: True
ckpt_path: null # set to a checkpoint path to resume training

Then specify the configuration file in the train.sh script, which runs:

python cli_run.py train --config configs/config_file_name.yaml

Finally, run the following command:

cd scripts
sh train.sh
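
If you prefer to launch training from Python instead of train.sh, and assuming the project is built on PyTorch Lightning (cli_run.py and the YAML layout suggest LightningCLI), a minimal sketch follows. The import paths, constructor arguments, and max_epochs value are assumptions taken from the config above, not verified against the repository:

import pytorch_lightning as pl

from dataloader.dataset import NYUDataModule
from model import DepthNetModule  # adjust to the repository's actual module path

# Mirror the init_args from the YAML config above.
data = NYUDataModule(
    nyuv2_data_root='path/to/dataset/root',
    img_size=(480, 640),
    remove_white_border=True,
    batch_size=24,
    num_workers=0,
    use_labels=True,
)

# Mirror the model section of the YAML config above.
model = DepthNetModule(invert_depth=True, lr=3e-4, wd=0.001)

# max_epochs is an arbitrary placeholder; the repo's default is unknown.
trainer = pl.Trainer(accelerator='gpu', devices=1, max_epochs=100)
trainer.fit(model, datamodule=data)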

Citation

If our work assists you in your research, please cite it as follows:

@misc{ganj2024hybriddepthrobustdepthfusion,
      title={HybridDepth: Robust Depth Fusion for Mobile AR by Leveraging Depth from Focus and Single-Image Priors}, 
      author={Ashkan Ganj and Hang Su and Tian Guo},
      year={2024},
      eprint={2407.18443},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.18443}, 
}


hybriddepth's Issues

Unable to install gauss-psf-cuda package

Hi,

Thank you for your hard work on this project. I have encountered an issue while using the command conda env create -f environment.yml. It seems that the gauss-psf-cuda package cannot be found through pip install.

Could you please provide an alternative method to install this package or advise on how to resolve this issue?

Thank you for your assistance.

About installation

I looked inside your setup.py and environment.yml and noticed that the opencv-python library is missing. In your depthNet.py, I found that you are using absolute paths. I've changed all of them, but I still can't run the code.

Could you point out where I can fix this?
