Git Product home page Git Product logo

deepvo's Introduction

Status

Not functional (i.e. does not converge with our data). But can be a useful starting point since the paper author's code is not public.


A TensorFlow implementation of DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks

This is our submission for the ANN with TensorFlow course, winter 2017. Please note that this implementation does not seem entirely correct. Convergence was observed only on a dataset with random moves forwards and backwards, without rotation.

Data Acquisition

In order to make use of the full 720 resolution of the LifeCam 3000, you must do two things

  • Tell the device driver to use this resolution via v4l2-ctl --set-fmt-video=width=1280,height=720,pixelformat=1 (the pixel format is probably not important, but you may need to adjust the ros node accordingly)
  • In the usb_cam_node, set height and width parameters appropriately.

Data Preprocessing

Bagfile conversion

The first thing to do is to convert the rosbag sensor recordings with the conversion tool (which you can find here) like this

bag_to_cam_pose_data -b <file>.bag  -d <outdir> -x -P

The -P flag is to dump one npy file for each image and pose. The -x flag is for writing float image arrays instead of uint8. This will create images and poses folders inside the chosen directory.

Further preprocessing

Use the preprocess_data.py script to prepare the data for our network

  • with -d <path-to-data> you give it the path where the images/ and poses/ folders are located. All modifications are done in-place
  • -f will map the images to (0, 1)
  • -m will subtract the mean (over the entire set) from each image
  • -p will add Pi to all pose angles. The robot's EKF output is in the range (-pi, pi), but we want (0, 2pi)

Potential Problems

  • We are not sure if the timestamps of pose and camera messages are correct and thus whether the training data is good enough
  • We have no control over the exposure time of the camera. Auto-exposure differences while driving around might make the problem more difficult

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.