PyTorch StereoNet

A customized implementation of "StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction"

The network architecture of StereoNet

Attention: not finished yet

  1. Stereo matching is not my main research field, and this repo was created for a homework assignment. It may not be complete, although I have tried to make it as good as I can. If you need a better version, please refer to https://github.com/meteorshowers/StereoNet.
  2. The StereoNet paper computes the cost volume by taking the difference between the padded image and the other image. Here I changed it to concatenate the two instead. If you want the paper's approach, just set it when you initialize the net.
  3. Training and testing only on the KITTI 2015 training set is not enough: the best result there was 74.5% (pixels with error smaller than 1). After pretraining on SceneFlow and fine-tuning on KITTI 2015, the accuracy reaches 90.054%, which is still below the accuracy reported in the paper. I have tried hard to match the paper's accuracy but have not managed to. Some details may be wrong.
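
To make the difference between the two cost-volume variants in item 2 concrete, here is a minimal sketch. It is not the code from this repo: the function name `build_cost_volume`, the `mode` argument, and the zero padding at the image border are all assumptions made for illustration, and the inputs are assumed to be feature maps of shape (B, C, H, W).

```python
import torch


def build_cost_volume(left_feat, right_feat, max_disp, mode="concat"):
    """Build a cost volume from two (B, C, H, W) feature maps.

    Hypothetical helper for illustration only.
    mode="difference": left minus shifted right, giving (B, C, D, H, W).
    mode="concat": left and shifted right stacked on the channel axis,
                   giving (B, 2C, D, H, W).
    """
    if mode not in ("concat", "difference"):
        raise ValueError("mode must be 'concat' or 'difference'")

    slices = []
    for d in range(max_disp):
        # Shift the right features d pixels to the right; the columns that
        # have no source pixel stay zero (assumed zero padding).
        shifted = torch.zeros_like(right_feat)
        if d == 0:
            shifted = right_feat
        else:
            shifted[:, :, :, d:] = right_feat[:, :, :, :-d]

        if mode == "difference":
            slices.append(left_feat - shifted)
        else:
            slices.append(torch.cat([left_feat, shifted], dim=1))

    # Stack the per-disparity slices along a new disparity axis.
    return torch.stack(slices, dim=2)


if __name__ == "__main__":
    left = torch.randn(2, 4, 5, 8)
    right = torch.randn(2, 4, 5, 8)
    print(build_cost_volume(left, right, 3, mode="concat").shape)
    print(build_cost_volume(left, right, 3, mode="difference").shape)
```

The concatenation variant doubles the channel count of the volume, so the first layer that consumes it needs twice as many input channels as with the difference variant.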

Experimental results so far

  1. Training and testing on the SceneFlow dataset:
    • epoch 22 total training loss = 3.956
    • average test EPE = 3.496
  2. Fine-tuning on KITTI 2015 with different settings, and the results:
    • 300 epochs, accuracy within a 3-pixel error threshold = 80.893% on the KITTI validation split
      optimizer = optim.RMSprop(model.parameters(), lr=1e-3, weight_decay=0.0001)
      scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
      
    • 300 epochs, accuracy within a 3-pixel error threshold = 83.527% on the KITTI validation split
      optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
      scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
      
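    • 300 epochs, accuracy within a 3-pixel error threshold = 90.054% on the KITTI validation split
      optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
      # step schedule: lower the learning rate after epoch 200
      if epoch <= 200:
          lr = 0.001
      else:
          lr = 0.0001
      # apply the chosen rate to the optimizer
      for param_group in optimizer.param_groups:
          param_group['lr'] = lr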
      
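    • 2000 epochs, accuracy within a 3-pixel error threshold = 93.680% on the KITTI validation split, after 4.98 hours of fine-tuning
      optimizer = optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
      # step schedule: lower the learning rate after epoch 200
      if epoch <= 200:
          lr = 0.001
      else:
          lr = 0.0001
      # apply the chosen rate to the optimizer
      for param_group in optimizer.param_groups:
          param_group['lr'] = lr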
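
The numbers above rely on two metrics: the average end-point error (EPE) and the share of pixels whose disparity error stays under a pixel threshold. The sketch below shows one common way to compute both. It is an illustration rather than this repo's evaluation code: the function names are hypothetical, and it assumes that a ground-truth disparity of 0 marks a pixel without ground truth, as in KITTI-style sparse disparity maps.

```python
import numpy as np


def _valid_mask(gt, valid):
    # Assumption: a ground-truth value of 0 means "no measurement".
    return (gt > 0) if valid is None else np.asarray(valid, dtype=bool)


def end_point_error(pred, gt, valid=None):
    """Mean absolute disparity error over the valid pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = _valid_mask(gt, valid)
    return float(np.abs(pred - gt)[mask].mean())


def threshold_accuracy(pred, gt, threshold=3.0, valid=None):
    """Percentage of valid pixels with absolute error below `threshold`."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    mask = _valid_mask(gt, valid)
    return float(100.0 * (np.abs(pred - gt)[mask] < threshold).mean())


if __name__ == "__main__":
    pred = np.array([[10.0, 20.0], [30.0, 40.0]])
    gt = np.array([[10.0, 22.0], [0.0, 44.0]])  # the 0 entry is ignored
    print(end_point_error(pred, gt))
    print(threshold_accuracy(pred, gt, threshold=3.0))
```

Passing an explicit `valid` mask overrides the `gt > 0` assumption, which is useful for dense ground truth where a disparity of 0 can be a real value.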
      

Prerequisites

  • PyTorch 1.0.0
  • CUDA Toolkit 10
  • numpy

Datasets:

  1. Pretrain: SceneFlow
  2. KITTI 2015

You can use an Anaconda virtual environment to get started quickly.

Install Anaconda

wget https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh
bash Anaconda3-5.3.1-Linux-x86_64.sh

Please refer to "Summary of Using Anaconda on Ubuntu" (Ubuntu系统下Anaconda使用方法总结) for more information about installing conda.

Create a virtual environment from my environment file

conda env create -n your_env_name -f environment.yaml

Training and Testing

Switch to the correct Python environment

conda activate your_env_name

Start training and testing

Pretrain on the SceneFlow dataset

cd pretrain-sceneflow
python sceneflow-pretrain.py

Fine-tune on KITTI 2015

cd finetune-kitti15
python finetune-kitti15.py

Coding Reference

Contributors

zhixuanli
