rwightman / udacity-driving-reader

Quick docker based reader for udacity rosbag self-driving dataset. Dumps to png/jpg + csv or Tensorflow examples.

License: Apache License 2.0

udacity-driving-reader

Scripts to read and dump data from the rosbag format used in the udacity self-driving dataset(s).

The scripts were set up to run within a Docker container so that I can extract the data from the rosbag format without needing to install ROS on my system. The docker container is built from the ROS kinetic-perception image. I've added python modules and the latest Tensorflow on top of that.

I've run this code on Ubuntu 16.04 and 18.04 with Docker CE installed as per https://docs.docker.com/install/linux/docker-ce/ubuntu/. No other platform has been tried.

Since the original release of this script, this latest iteration has been updated to support bag files with compressed images and bag files that have been split into multiple files by time or topics. With support for this, support for a reordering buffer was added to bag2tf.
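To illustrate the idea of a reordering buffer, here is a minimal sketch. It is not the code used in bag2tf: the class name `ReorderBuffer`, the `capacity` parameter, and the `(timestamp, message)` tuple shape are all assumptions made for this example. The buffer holds a small number of messages in a heap and only releases the oldest one once the buffer is full, so messages that arrive slightly out of order (for instance from bags split by topic) come out sorted by timestamp.

```python
import heapq
import itertools


class ReorderBuffer:
    """Hold up to `capacity` messages and release them in timestamp order.

    Hypothetical illustration only; not the implementation used in bag2tf.
    """

    def __init__(self, capacity=8):
        self.capacity = capacity
        self._heap = []
        # Tie-breaker so messages with equal timestamps are never compared directly.
        self._counter = itertools.count()

    def push(self, timestamp, message):
        """Add a message and return the messages that are now safe to emit."""
        heapq.heappush(self._heap, (timestamp, next(self._counter), message))
        released = []
        while len(self._heap) > self.capacity:
            ts, _, msg = heapq.heappop(self._heap)
            released.append((ts, msg))
        return released

    def flush(self):
        """Drain everything left in the buffer, oldest first."""
        released = []
        while self._heap:
            ts, _, msg = heapq.heappop(self._heap)
            released.append((ts, msg))
        return released


def reorder(stream, capacity=8):
    """Run an iterable of (timestamp, message) pairs through a ReorderBuffer."""
    buf = ReorderBuffer(capacity)
    out = []
    for ts, msg in stream:
        out.extend(buf.push(ts, msg))
    out.extend(buf.flush())
    return out
```

A buffer like this only fixes disorder that is smaller than its capacity; a message arriving later than `capacity` positions out of place would still be emitted out of order.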

The latest versions scan all bag files and extract their info in yaml format before doing a second pass to read the data. This adds some time but provides a mechanism for supporting the variety of bag formats and splits now being used in the datasets. The info yaml files are also dumped as part of the bagdump process.

Installation

Check out this code and run it in place. I have not pushed the docker container to Docker Hub.

Usage

Build the docker container manually or using ./build.sh before executing any of the run scripts.

Run one of the run scripts for dumping to images + csv or Tensorflow sharded records files.

This and future versions of the scripts expect all datasets to exist in SEPARATE folders with only bag files for the same dataset in each folder. The input folder should thus be a folder with one folder per dataset. The bagdump script will mirror those input folders in the output, while the bag2tf will combine them all into one sharded stream.
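As a quick way to check that an input folder follows this layout before starting a long run, something like the sketch below could be used. The helper `find_datasets` is hypothetical and not part of this repository; it assumes bag files end in `.bag` and sit directly inside each dataset folder.

```python
import os


def find_datasets(input_dir):
    """Map each dataset folder name to the sorted list of .bag files inside it.

    Hypothetical helper for checking the expected layout:
    input_dir/<dataset folder>/<one or more .bag files>
    """
    datasets = {}
    for name in sorted(os.listdir(input_dir)):
        folder = os.path.join(input_dir, name)
        if not os.path.isdir(folder):
            # Bag files placed directly in input_dir do not match the layout.
            continue
        bags = sorted(f for f in os.listdir(folder) if f.endswith(".bag"))
        if bags:
            datasets[name] = bags
    return datasets
```

An empty result, or a missing dataset name, suggests the bag files are not one folder level below the input directory.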

The paths passed to the run scripts are used as docker volume mappings. These paths must be absolute paths on your local filesystem (relative to the root). Keep this in mind if you try to change the input/output args.

Dump to images + CSV

./run-bagdump.sh -i [absolute dir with folders containing bag files] -o [absolute output dir] -- [args to pass to python script]

For example, if your dataset bags are in /data/dataset2-1/dataset.bag, /data/udacity-datasetElCamino/*.bag etc., and you'd like the output in /output:

./run-bagdump.sh -i /data -o /output

The same as above, but you want to convert to png instead of jpg:

./run-bagdump.sh -i /data -o /output -- -f png
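Because the input and output paths become docker volume mappings, relative paths need to be made absolute before they reach the run script. The sketch below shows one way to do that from Python. The helpers `docker_volume_path` and `bagdump_command` are hypothetical names for this example; only the `-i`, `-o`, `--`, and `-f png` arguments come from the usage shown above.

```python
import os


def docker_volume_path(path):
    """Return an absolute, normalized version of `path`, expanding ~ first."""
    return os.path.abspath(os.path.expanduser(path))


def bagdump_command(input_dir, output_dir, script_args=()):
    """Build the argument list for run-bagdump.sh with absolute directories.

    Hypothetical wrapper; script_args are passed through after `--`.
    """
    cmd = [
        "./run-bagdump.sh",
        "-i", docker_volume_path(input_dir),
        "-o", docker_volume_path(output_dir),
    ]
    if script_args:
        cmd += ["--"] + list(script_args)
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the repository root, assuming the docker image has already been built.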

Dump to Tensorflow sharded files

Same basic arguments as for bagdump above. There are some additional arguments of note to pass to the python script.

The default arguments write all cameras into the same sharded stream along with latest steering entry. To write images to three separate streams, one for each camera, add an -s (or --separate) argument.

For example:

./run-bag2tf.sh -i /data -o /output -- --separate

udacity-driving-reader's Issues

docker login

y', docker image 'udacity-reader'...
Unable to find image 'udacity-reader:latest' locally
docker: Error response from daemon: pull access denied for udacity-reader, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

Error in ./build.sh

Hello, I'm getting the following error while trying to run sudo ./build.sh:

I don't know why this is happening since I'm using python 3.6.9 and already removed the 2.7 version.

Please help!

Issue while extracting the data

Hey,

Thanks a lot for the data set.

I am having an issue: when I try to use the bagdump.sh it gives an error
File "/usr/local/lib/python2.7/dist-packages/pandas/compat/__init__.py", line 421, in <module>
raise ImportError('dateutil 2.5.0 is the minimum required version')
ImportError: dateutil 2.5.0 is the minimum required version

Do you have any idea why this might be happening?

Thanks

Docker pull error

I am getting a manifest error when trying to install:

Error response from daemon: manifest for timeshi1995/udacity-driving-reader:latest not found: manifest unknown: manifest unknown

Dumping all the data in rosbag to csv

Hi,

Thanks for the dataset reader code.

I'm interested in extracting point correspondence between consecutive images in the dataset. I'm interested in 3D points present in sensor_msgs/PointCloud2 and the extrinsic camera parameters in either sensor_msgs/JointState or tf2_msgs/TFMessage.

Could you please provide a more generic code that can extract this data into csv files too?

Many thanks.

error ./run.sh: line 32: docker: command not found

Hi!
I tried to run ./run-bagdump.sh -i /data -o /output , but got the error
./run.sh: line 32: docker: command not found
I set up my folders as shown in this image,
please help me!!

ImportError: dateutil 2.5.0 is the minimum required version

error while using ./run-bagdump.sh

File "/usr/local/lib/python2.7/dist-packages/pandas/compat/__init__.py", line 421, in <module>
raise ImportError('dateutil 2.5.0 is the minimum required version')
ImportError: dateutil 2.5.0 is the minimum required version

I have dateutil version 2.7, yet it's giving the error.

Any advice on how to use the script in windows?

Hi thanks for your nice script.

However, I'm trying to run the script in Windows 10. Are there any instructions on how to do so? How about directly running ./build.sh in PowerShell, and then running the ./run-bagdump.sh command?

In addition, I'm new to this docker thing. Can you please briefly specify how to "Build the docker container manually"?

Or a simpler way, may I ask is there any way I can download the extracted image files and csv files?

Requirements to run the scripts?

Hello there,
was wondering if there are many machine specs required to run the script? because I'm trying to run it on a large dataset (80 GBs) and it's stuck at processing dataset, never making any progress or writing any images and it causes my ubuntu to freeze.
Thanks

Issues related to Dockerfile and run.sh

There were several issues I faced while trying to extract the data from .bag files:
Docker file:
When trying to run as is:

pandas throws errors stating numpy version is 11.5.0, while >=12.0.0 is needed.
Similarly, several other PyPI packages that are already installed in the ros:kinetic-vision image are left at older versions, since there's no --upgrade flag at the end of the pip commands (e.g. PyYaml, numpy, matplotlib, ipykernel, python-dateutil), which causes problems.

Run.sh:

Doesn't do anything for me since Docker expects absolute paths for OUTPUT_DIR and INPUT_DIR, while given paths are relative to current directory.

After fixing the above problems, everything works as expected. I think Docker file and Run.sh need small changes. I have opened a pull request, please have a look.

Accessing velodyne packets

The script doesn't extract Velodyne packets/data from the .bag file. While the yaml shows inclusion of the following topic: /velodyne_packets, the topic is not accessed via the script.
Can you please add that?

Data units

Hello,

What are the units of the sensor data?
Specifically, I want to use steering angle and speed. Are those in radians and kmh?
Moreover, what are the time units? It seems like it uses [sec*1e-9], but I want to make sure.

Btw, thanks for the great tool, it saved me a lot of time.

error ./run.sh: line 32: docker: command not found

#12
The absolute path also doesn't work!
Can you help me?
Thank you very much!

bagdump.py was not working as ROS environment was not sourced

Built the docker environment manually and tried to run python bagdump.py.

Got the error No handler could be found for logger "rosout"

Had to run source /opt/ros/<distro>/setup.bash in container's bash before running python bagdump.py.

Hope this helps ppl who are facing the same issue

download issue

I am using Windows 10 Home, which is not compatible with installing docker. Can you suggest any alternative?

Dumping rosbag data inquiry

I am new to docker and I have tried reading and watching tutorials on the internet for a while, but it's not helpful. I wonder how I should run the script files within the Container. Am I supposed to copy this reader directory to a container so that I can run the script files ?

Deal with compressed data stored in separate bag files

error in run-bagdump.sh

This is what I get in my terminal. Could someone help me, please?

root@MONTO:/mnt/j/ucity/udacity-driving-reader-master# ./run-bagdump.sh -i /mnt/j/ucity -o /mnt/j/ucity/data
Running 'python script/bagdump.py' with input dir '/mnt/j/ucity', output dir '/mnt/j/ucity/data', docker image 'udacity-reader'...
python: can't open file 'script/bagdump.py': [Errno 2] No such file or directory

thanks!

New error in ./build.sh

Even after modifying the docker file with the comment in issue #24 and pull request #28, the terminal gives me the following error:

29 | >>> RUN pip install
30 | >>> scipy==1.2.3
31 | >>> pandas==0.24.2
32 | >>> jupyter==1.0.0
33 |

ERROR: failed to solve: process "/bin/sh -c pip install scipy==1.2.3 pandas==0.24.2 jupyter==1.0.0" did not complete successfully: exit code: 1

bag2tf.py: Error when number of images is below 6000

Thank you very much for this script. Just a heads up that for small data sets, under 6000 images, the user will encounter the error below. It's easy to debug, but a warning might help future users. Thanks again.

File "script/bag2tf.py", line 475, in main
splits=split_list, center_only=center_only, debug_print=debug_print)
File "script/bag2tf.py", line 207, in __init__
writer = ShardWriter(self._outdir, s[0], scaled_images, max_num_shards=scaled_shards)
File "script/bag2tf.py", line 135, in __init__
self.num_entries_per_shard = num_entries // max_num_shards
ZeroDivisionError: float divmod()
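
The traceback shows a floor division by `max_num_shards`, which fails when the scaled shard count reaches zero. A minimal sketch of a guard is below. It is not the fix applied in the repository: the function `entries_per_shard` is hypothetical, and only the names `num_entries` and `max_num_shards` are taken from the traceback.

```python
def entries_per_shard(num_entries, max_num_shards):
    """Split num_entries across shards, never using fewer than one shard.

    Hypothetical guard: clamps the shard count to at least 1 so that a
    scaled shard count of 0 (or 0.0) cannot cause a ZeroDivisionError.
    """
    shards = max(1, int(max_num_shards))
    return max(1, int(num_entries) // shards)
```

With this guard a small dataset is written to a single shard instead of raising.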

Timestamps from interpolated.csv not existing in brake.csv

Hi!

I used this reader to convert the dataset for Udacity challenge 2 from .bag files to images and .csv files, but the timestamps in brake.csv correspond neither to the ones in interpolated.csv nor to the image filenames (link to the 2 files). This also happens for timestamps in gear.csv or gps.csv. As was recommended in the repository readme, I used this script.

Do you know why this may happen? or how can I correlate the timestamps in brake.csv, for instance with the images?

Thanks,
Miruna
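
One generic way to correlate two timestamp series that do not share exact values is nearest-timestamp matching. The sketch below is only an illustration under assumptions: it presumes both files use integer timestamps in the same unit and that the sensor rows are sorted by time, neither of which is confirmed here. The functions `nearest_index` and `match_nearest` are hypothetical and not part of this reader.

```python
import bisect


def nearest_index(sorted_timestamps, target):
    """Index of the entry in sorted_timestamps closest to target."""
    if not sorted_timestamps:
        raise ValueError("sorted_timestamps is empty")
    pos = bisect.bisect_left(sorted_timestamps, target)
    if pos == 0:
        return 0
    if pos == len(sorted_timestamps):
        return pos - 1
    before, after = sorted_timestamps[pos - 1], sorted_timestamps[pos]
    # On a tie, prefer the earlier sample.
    return pos if after - target < target - before else pos - 1


def match_nearest(image_timestamps, sensor_rows):
    """Pair each image timestamp with the (timestamp, value) sensor row nearest in time.

    sensor_rows must be sorted by timestamp.
    """
    sensor_times = [ts for ts, _ in sensor_rows]
    return [(t, sensor_rows[nearest_index(sensor_times, t)]) for t in image_timestamps]
```

For example, the rows of brake.csv could be loaded as `(timestamp, value)` tuples and matched against the timestamps taken from the image filenames.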

./run.sh: line 32: docker: command not found


root@bf9f40921c8d:/dataset/udacity-driving-reader-master# ./run-bagdump.sh -i /dataset/udacity-driving-reader-master/data -o /dataset/udacity-driving-reader-master/output

Running 'python script/bagdump.py' with input dir '/dataset/udacity-driving-reader-master/data', output dir '/dataset/udacity-driving-reader-master/output', docker image 'udacityreader'...

./run.sh: line 32: docker: command not found

root@bf9f40921c8d:/dataset/udacity-driving-reader-master#

error while using run-bagdump.sh

docker: Error response from daemon: pull access denied for udacity-reader, repository does not exist or may require 'docker login'.