Git Product home page Git Product logo

transfuser's Introduction

Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

Project Page | Paper | Supplementary | Video | Poster | Blog

This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our code or paper useful, please cite

@inproceedings{Prakash2021CVPR,
  author = {Prakash, Aditya and Chitta, Kashyap and Geiger, Andreas},
  title = {Multi-Modal Fusion Transformer for End-to-End Autonomous Driving},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}

Contents

  1. Setup
  2. Dataset
  3. Data Generation
  4. Training
  5. Evaluation
  6. CARLA Leaderboard Submission
  7. Acknowledgements

Setup

Install anaconda

wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
bash Anaconda3-2020.11-Linux-x86_64.sh
source ~/.profile

Clone the repo and build the environment

git clone https://github.com/autonomousvision/transfuser
cd transfuser
conda create -n transfuser python=3.7
pip3 install -r requirements.txt
conda activate transfuser

Download and setup CARLA 0.9.10.1

chmod +x setup_carla.sh
./setup_carla.sh

Dataset

The data is generated with leaderboard/team_code/auto_pilot.py in 8 CARLA towns using the routes and scenarios files provided at leaderboard/data on CARLA 0.9.10.1

chmod +x download_data.sh
./download_data.sh

We used two datasets for different experimental settings:

  • clear_weather_data: contains only ClearNoon weather. This dataset is used for the experiments described in the paper and generalization to new town results shown in the video.
  • 14_weathers_data: contains 14 preset weather conditions mentioned in leaderboard/team_code/auto_pilot.py. This dataset is used for training models for the leaderboard and the generalization to new weather results shown in the video.

The dataset is structured as follows:

- TownX_{tiny,short,long}: corresponding to different towns and routes files
    - routes_X: contains data for an individual route
        - rgb_{front, left, right, rear}: multi-view camera images at 400x300 resolution
        - seg_{front, left, right, rear}: corresponding segmentation images
        - depth_{front, left, right, rear}: corresponding depth images
        - lidar: 3d point cloud in .npy format
        - topdown: topdown segmentation images required for training LBC
        - 2d_bbs_{front, left, right, rear}: 2d bounding boxes for different agents in the corresponding camera view
        - 3d_bbs: 3d bounding boxes for different agents
        - affordances: different types of affordances
        - measurements: contains ego-agent's position, velocity and other metadata

We have provided two versions of the datasets used in our work:

  • Minimal dataset (63G): contains only rgb_front, lidar and measurements from the 14_weathers_data. This is sufficient to train all the models (except LBC which also requires topdown).
  • Large scale dataset (406G): contains multi-view camera data with different perception labels and affordances for both clear_weather_data and 14_weathers_data to facilitate further development of imitation learning agents.

Data Generation

In addition to the dataset, we have also provided all the scripts used for generating data and these can be modified as required for different CARLA versions.

Running CARLA Server

With Display

./CarlaUE4.sh --world-port=2000 -opengl

Without Display

Without Docker:

SDL_VIDEODRIVER=offscreen SDL_HINT_CUDA_DEVICE=0 ./CarlaUE4.sh --world-port=2000 -opengl

With Docker:

Instructions for setting up docker are available here. Pull the docker image of CARLA 0.9.10.1 docker pull carlasim/carla:0.9.10.1.

Docker 18:

docker run -it --rm -p 2000-2002:2000-2002 --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 carlasim/carla:0.9.10.1 ./CarlaUE4.sh --world-port=2000 -opengl

Docker 19:

docker run -it --rm --net=host --gpus '"device=0"' carlasim/carla:0.9.10.1 ./CarlaUE4.sh --world-port=2000 -opengl

If the docker container doesn't start properly then add another environment variable -e SDL_AUDIODRIVER=dsp.

Run the Autopilot

Once the CARLA server is running, rollout the autopilot to start data generation.

./leaderboard/scripts/run_evaluation.sh

The expert agent used for data generation is defined in leaderboard/team_code/auto_pilot.py. Different variables which need to be set are specified in leaderboard/scripts/run_evaluation.sh. The expert agent is based on the autopilot from this codebase.

Routes and Scenarios

Each route is defined by a sequence of waypoints (and optionally a weather condition) that the agent needs to follow. Each scenario is defined by a trigger transform (location and orientation) and other actors present in that scenario (optional). The leaderboard repository provides a set of routes and scenarios files. To generate additional routes, spin up a CARLA server and follow the procedure below.

Generating routes with intersections

The position of traffic lights is used to localize intersections and (start_wp, end_wp) pairs are sampled in a grid centered at these points.

python3 tools/generate_intersection_routes.py --save_file <path_of_generated_routes_file> --town <town_to_be_used>

Sampling individual junctions from a route

Each route in the provided routes file is interpolated into a dense sequence of waypoints and individual junctions are sampled from these based on change in navigational commands.

python3 tools/sample_junctions.py --routes_file <xml_file_containing_routes> --save_file <path_of_generated_file>

Generating Scenarios

Additional scenarios are densely sampled in a grid centered at the locations from the reference scenarios file. More scenario files can be found here.

python3 tools/generate_scenarios.py --scenarios_file <scenarios_file_to_be_used_as_reference> --save_file <path_of_generated_json_file> --towns <town_to_be_used>

Training

The training code and pretrained models are provided below.

mkdir model_ckpt
wget https://s3.eu-central-1.amazonaws.com/avg-projects/transfuser/models.zip -P model_ckpt
unzip model_ckpt/models.zip -d model_ckpt/
rm model_ckpt/models.zip

Note that we have updated the pretrained TransFuser model with the improved checkpoint submitted to the leaderboard. This model contains multiple bug fixes and is trained on a different dataset than the one provided in this repository. (We are currently unable to share the entire dataset due to some issues.)

Evaluation

Spin up a CARLA server (described above) and run the required agent. The adequate routes and scenarios files are provided in leaderboard/data and the required variables need to be set in leaderboard/scripts/run_evaluation.sh.

CUDA_VISIBLE_DEVICES=0 ./leaderboard/scripts/run_evaluation.sh

CARLA Leaderboard Submission

CARLA also has an official Autonomous Driving Leaderboard on which different models can be evaluated. Refer to the leaderboard_submission branch in this repository for building docker image and submitting to the leaderboard.

Acknowledgements

This implementation is based on code from several repositories.

Also, check out other works on autonomous driving from our group.

transfuser's People

Contributors

ap229997 avatar kait0 avatar kashyap7x avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.