This repository hosts a customized PPO-based agent for CARLA. The goal of this project is to make it easier to interact with and experiment in CARLA using reinforcement learning based agents, by wrapping CARLA in a gym-like environment that can handle custom reward functions, custom debug output, etc.

License: MIT License

Topics: deep-reinforcement-learning, autonomous-driving, ppo-agent, vae, agent-learning, agent-driving

carla-ppo's Introduction

CARLA PPO agent

About the Project

This project concerns how we may design environments in order to facilitate the training of deep reinforcement learning based autonomous driving agents. The goal of the project is to provide a working deep reinforcement learning framework that can learn to drive in visually complex environments, with a focus on a solution that:

  1. Works out-of-the-box.
  2. Learns in a short time, making it easier to quickly iterate on and test hypotheses.
  3. Provides tailored metrics to compare agents between runs.

We have used the urban driving simulator CARLA (version 0.9.5) as our environment.

Find a detailed project write-up in the doc/ folder (a PDF version is bundled with the repository; see also the citation at the bottom of this README).

Video of results:

Proximal Policy Gradient in CARLA 0.9.5

Use the timestamps in the video description to navigate to the experiments that interest you.

Contributions

Figure 1: Town07 lap

  • We provide two gym-like environments for CARLA*:
    1. Lap environment: This environment is focused on training an agent to follow a predetermined lap (see CarlaEnv/carla_lap_env.py)
    2. Route environment: This environment is focused on training agents that can navigate from point A to point B (see CarlaEnv/carla_route_env.py).
  • We provide an analysis of optimal PPO parameters, environment designs, reward functions, etc., with the aim of finding the optimal setup to train reinforcement learning based autonomous driving agents (see Chapter 4 of the project write-up for further details).
  • We show that how the VAE is trained and used can be consequential to the performance of a deep reinforcement learning agent, and we found that major improvements can be made by training the VAE to reconstruct semantic segmentation maps instead of the raw RGB input.
  • We have devised a model that can reliably solve the lap environment in ~8 hours on average on an NVIDIA GTX 970.
  • We provide an example of how sub-policies can be used to navigate with PPO, and we found it to have moderate success in the route environment (see the sub-policy branch).

* While there are existing examples of gym-like environments for CARLA, there is no implementation that is officially endorsed by CARLA. Furthermore, most of the third-party environments do not provide an example of an agent that works out-of-the-box.

Related Work

  1. Learning to Drive in a Day by Kendall et al. This paper by researchers at Wayve describes a method that showed how state representation learning with a variational autoencoder can be used to train a car to follow a straight country road in approximately 15 minutes.
  2. Learning to Drive Smoothly in Minutes by Raffin et al. This Medium article lays out the details of a method that was able to train an agent in the Donkey Car simulator in only 5 minutes, using a similar approach to (1). They further provide some solutions to the unstable steering we may observe when training with Kendall's straightforward speed-as-reward formulation.
  3. End-to-end Driving via Conditional Imitation Learning by Codevilla et al. This paper outlines an imitation learning model that learns to navigate arbitrary routes by using multiple actor networks, conditioned on the maneuver the vehicle should currently take. We have used a similar approach in our route environment agent.

Method Overview

This is a high-level overview of the method.

  1. Collect 10k 160x80x3 images by driving around manually.
  2. Train a VAE to reconstruct the images.
  3. Train a PPO-based agent whose input is the state representation encoded by the trained VAE, concatenated with a vector of measurements (steering, throttle, speed); see the sketch below.
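
As a rough sketch of step 3 (illustrative only; the function and attribute names below, such as vae.encode, are hypothetical and not the exact API of this repository):

import numpy as np

def make_observation(vae, frame, steering, throttle, speed):
    # Encode the camera frame into the VAE's latent vector (e.g. 64 dimensions),
    # then append the scalar measurements. The result is the PPO agent's input.
    z = vae.encode(frame[np.newaxis, ...])[0]
    measurements = np.array([steering, throttle, speed], dtype=np.float32)
    return np.concatenate([z, measurements])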

Figure 2: Method overview

How to Run

Prerequisites

  • Python 3.6
  • CARLA 0.9.5 (may also work with later versions)
    • Our code expects the CARLA Python API to be installed and available through import carla (see the environment variable example under Running a Trained Agent below).
    • We also recommend building an editor-less version of CARLA by running the make package command in the root directory of CARLA.
    • Note that the map we use, Town07, may not be included by default when running make package. Add +MapsToCook=(FilePath="/Game/Carla/Maps/Town07") to Unreal/CarlaUE4/Config/DefaultGame.ini before running make package to solve this.
  • TensorFlow for GPU (we used version 1.13; later 1.x versions may also work)
  • OpenAI gym (we used version 0.12.0)
  • OpenCV for Python (we used version 4.0.0)
  • A GPU with at least 4 GB VRAM (we used a GeForce GTX 970)

Running a Trained Agent

With the project, we provide a pretrained PPO agent for the lap environment. The checkpoint file for this model is located in the models folder.

The easiest way to get this model running is to first set an environment variable named CARLA_ROOT that points to the top-level directory of your CARLA installation.
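
For example, on Linux (the paths below are placeholders; the egg filename must match whatever is in your own PythonAPI/carla/dist folder):

export CARLA_ROOT=/path/to/CARLA_0.9.5
export PYTHONPATH=$PYTHONPATH:${CARLA_ROOT}/PythonAPI/carla/dist/carla-0.9.5-py3.6-linux-x86_64.egg   # adjust to the egg in your dist folder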

Afterward, we can simply call:

python run_eval.py --model_name pretrained_agent -start_carla

CARLA should then start automatically with our agent driving. This particular agent should be able to drive about 850 m along the designated lap (Figure 1).

Note that our environment has only been designed to work with Town07, since this map is the one that most closely resembles the environments of Kendall et al. and Raffin et al.

Training a New Agent

Set ${CARLA_ROOT} as described in Running a Trained Agent.

Then use the following command to train a new agent:

python train.py --model_name name_of_your_model -start_carla

This will start training an agent with the default parameters, and checkpoint and log files will be written to models/name_of_your_model.

Recordings of the evaluation episodes will also be written to models/name_of_your_model/videos by default, making it easier to evaluate an agent's behavior over time.

To view the training progress of an agent, and to compare trained agents in TensorBoard, use the following command:

tensorboard --logdir models/

Training the Variational Autoencoder

If you wish to collect data to train the variational autoencoder yourself, you may use the following command:

python CarlaEnv/collect_data.py --output_dir vae/my_data -start_carla

Press SPACE to begin recording frames. 10K images will be saved by default.

After you have collected data to train the VAE with, use the following command to train the VAE:

cd vae
python train_vae.py --model_name my_trained_vae --dataset my_data
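
For reference, the kind of objective such a VAE is typically trained with is a reconstruction term plus a beta-weighted KL term with a tolerance (a minimal NumPy sketch of the general technique; the exact loss and defaults in vae/train_vae.py may differ):

import numpy as np

def vae_loss(x, x_hat, mu, logvar, beta=1.0, kl_tolerance=0.0, z_dim=64):
    # Binary cross-entropy reconstruction loss between input x and reconstruction x_hat
    # (both assumed to be in [0, 1]).
    eps = 1e-8
    recon = -np.sum(x * np.log(x_hat + eps) + (1 - x) * np.log(1 - x_hat + eps))
    # KL divergence between the encoder's Gaussian q(z|x) and a unit Gaussian prior,
    # clipped from below by a "free bits"-style tolerance.
    kl = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar))
    kl = max(kl, kl_tolerance * z_dim)
    return recon + beta * kl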

To view the training progress and to compare trained VAEs in TensorBoard, use the following command:

cd vae
tensorboard --logdir models/

Inspecting VAE Reconstructions

Once we have a trained VAE, we can use the following command to inspect how its reconstructions look:

cd vae
python inspect_vae.py --model_dir models/my_trained_vae

Use the Set z by image button to seed your VAE with the latent z that is generated when the selected image is passed through the encoder. This is useful for comparing VAE reconstructions across models, since there is no guarantee that the features of the input will be encoded at the same indices of z.
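
In other words, the tool roughly does the following (a hypothetical sketch; encode and decode are not the exact method names in vae/models.py):

def compare_reconstructions(vae_a, vae_b, image):
    # Seed both VAEs with the z obtained by encoding the same image, then decode.
    # This gives comparable reconstructions even though each model may encode a
    # given input feature at different indices of z.
    z_a = vae_a.encode(image)
    z_b = vae_b.encode(image)
    return vae_a.decode(z_a), vae_b.decode(z_b)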

Inspecting the Agent's Decision Making

We may also use the following command to see how a trained agent responds to changes in the latent space vector z:

python inspect_agent.py --model_name name_of_your_model

File Overview

  • train.py: Script for training a PPO agent in the lap environment
  • run_eval.py: Script for running a trained model in eval mode
  • utils.py: Contains various math, TensorFlow, and DRL utility functions
  • ppo.py: Contains code for constructing the PPO model
  • reward_functions.py: Contains all reward functions
  • vae_common.py: Contains functions related to VAE loading and state encoding
  • inspect_agent.py: Script used to inspect the behavior of the agent as the VAE's latent space vector z is annealed
  • models/: Folder containing agent checkpoints, TensorBoard log files, and video recordings
  • doc/: Folder containing figures used in this README, in addition to a PDF version of the project write-up
  • vae/: Folder containing variational autoencoder related code
  • vae/train_vae.py: Script for training a variational autoencoder
  • vae/models.py: Contains code for constructing MLP and CNN-based VAE models
  • vae/inspect_vae.py: Script used to inspect how the latent space vector z affects the reconstructions of a trained VAE
  • vae/data/: Folder containing the images that were used to train the VAE model bundled with the repo
  • vae/models/: Folder containing VAE model checkpoints and TensorBoard logs
  • CarlaEnv/: Folder containing code related to the CARLA environments
  • CarlaEnv/carla_lap_env.py: Contains code for the CarlaLapEnv class
  • CarlaEnv/carla_route_env.py: Contains code for the CarlaRouteEnv class
  • CarlaEnv/collect_data.py: Script used to manually drive a car in the environment to collect images for training a VAE
  • CarlaEnv/hud.py: Code for the HUD displayed on the left-hand side of the spectating window
  • CarlaEnv/planner.py: Global route planner used to find routes from A to B; copied and modified from CARLA 0.9.4's PythonAPI
  • CarlaEnv/wrappers.py: Contains wrapper classes for several CARLA classes
  • CarlaEnv/agents/: Contains code used by the route planner

Future Work

Here are some ideas for how our work could be expanded or improved on:

  • Temporal models such as World Models or other LSTM models
  • Better state representations (higher resolution, sensor fusion with LiDAR, etc.)
  • Improve exploration (random distillation networks, Ornstein-Uhlenbeck noise, etc.)
  • Enforcing smooth driving, e.g. through reward functions that penalize fluctuating actions (see the sketch after this list)
  • Multi-agent training, or training with other vehicles on the road
  • Making the agent obey traffic rules
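
As an illustration of the smooth-driving idea above (a sketch only, not one of the reward functions in reward_functions.py):

def smoothness_penalty(prev_action, action, weight=0.5):
    # Penalize large changes in steering and throttle between consecutive steps,
    # discouraging the jittery control that plain speed-based rewards can produce.
    d_steer = abs(action[0] - prev_action[0])
    d_throttle = abs(action[1] - prev_action[1])
    return -weight * (d_steer + d_throttle)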

Known Issues

  • Setting a seed does not make simulations deterministic, even in a synchronous environment
  • The environment does not strictly conform to the OpenAI gym interface, meaning it cannot be used directly with their algorithms without modification

Cite this Project

@mastersthesis{11250_2625841,
  title={Accelerating Training of Deep Reinforcement Learning-based Autonomous Driving Agents Through Comparative Study of Agent and Environment Designs},
  url={https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/2625841},
  school={NTNU},
  publisher={NTNU Open Access},
  author={Vergara, Marcus Loo},
  year={2019},
  month={Oct}
}

carla-ppo's People

Contributors

bitsauce, tom-doerr


carla-ppo's Issues

Trained agent in different maps

Hello, I have attempted to run your trained agent in maps other than Town07. It seems that the agent runs in the environment for a few seconds, then the environment closes.
Is the trained agent only compatible with Town07?
Any advice on how to remedy the situation so the agent may be run in a variety of maps?

CARLA_ROOT error on Windows

It seems like this code will not work on Windows. I have spent ages solving different errors, but in the end it does not work due to a CARLA_ROOT error. If anyone has already run the code on Windows, please let me know.

Carla-ppo-master/common.py, in load_vae: raise Exception("Failed to load VAE")

Hello bitsauce, thanks for your work! I want to run the trained agent, and I did as you said in the README.md with the carla-0.9.5 release.
When I run "python3 run_eval.py --model_name TODO", there is an error:
Traceback (most recent call last):
File "run_eval.py", line 83, in
vae = load_vae(args.vae_model, args.vae_z_dim, args.vae_model_type)
File "…Carla-ppo-master/common.py", line 26, in load_vae
raise Exception("Failed to load VAE")
Exception: Failed to load VAE

I looked into the path "bce_cnn_zdim64_beta1_kl_tolerance0.0_data/" and found it is empty; this directory is also not in your project. However, the default vae_model path in run_eval.py is: parser.add_argument("--vae_model", type=str, default="bce_cnn_zdim64_beta1_kl_tolerance0.0_data").

Could you help me find out why this error occurs?

Sub-policy question

Hi! First, in ppo.py:
self.policy = self.loss = -self.policy_loss + self.value_loss - self.entropy_loss
You said "Reduce sum over all sub-policies (where only the active sub-policy will be non-zero due to previous filtering)", but the loss will be a list. How can a list of losses be backpropagated through
self.train_step = self.optimizer.minimize(self.loss, var_list=policy_params)?

Second, you first compute '_create_sub_policy'; in that part the loss is reduce-meaned and finally becomes a scalar. After filtering, all sub-policy modules will output the same value. Does it really work?

Can't find dist under ${CARLA_ROOT}

Hi bitsauce,
I have added CARLA_ROOT in my bashrc using:
export CARLA_ROOT=home/dc2-user/carla
export PYTHONPATH=$PYTHONPATH:${CARLA_ROOT}/PythonAPI/carla/dist/carla-0.9.10-py3.6-linux-x86_64.egg:${CARLA_ROOT}/PythonAPI/carla/agents:${CARLA_ROOT}/PythonAPI/carla

But I don't know why carla_lap_env.py can't find dist. Could you please help me?

Low Efficiency on Data Collection

While running the data collection Python script in this repository, performance is so low that it drops to 1 FPS even on a machine with a 3090 GPU.
Is this a problem with the way the data is collected, or a problem with CARLA itself?

make package problem!

When I build CARLA on Linux following the docs, the last step (make package) fails with the following error:
ERROR: Cook failed.
(see /home/wzj/Library/Logs/Unreal Engine/LocalBuildLogs/Log.txt for full exception trace)
AutomationTool exiting with ExitCode=25 (Error_UnknownCookFailure)
RunUAT ERROR: AutomationTool was unable to run successfully.
Util/BuildTools/Linux.mk:16: recipe for target 'package' failed
make: *** [package] Error 25

I have tried many times; what should I do?

screen doesn't move at some episodes

Hi
When I run the PPO agent, in some episodes the throttle of the car is not zero but the screen doesn't move or refresh. Do you know why, and how to deal with it?

THANK YOU

${CARLA_ROOT} has not been set

Hello. When I run run_eval.py it shows the error "Exception: ${CARLA_ROOT} has not been set!"


I am using Windows 10. I followed some instructions but could not fix it. Could you show me how to set CARLA_ROOT on Windows 10? Thanks.


sub-policy question

Could you explain your sub-policy model? In your thesis you said you trained one PPO actor-critic network for each of the following maneuvers: follow the road, turn left, turn right. But in your code I can only find one PPO model trained for all maneuvers!

Exception: Expected to find directory "dist" under ${CARLA_ROOT}!

Hi Bitsauce,
I have added the environment variable to my bashrc file and sourced it, something like this:
"export CARLA_ROOT=/home/chris/carla
export PYTHONPATH=$PYTHONPATH:${CARLA_ROOT}/PythonAPI/carla/dist/carla-0.9.10-py3.6-linux-x86_64.egg"

But I still get an error when I run the command "python3 run_eval.py --model_name pretrained_agent -start_carla".
The error is :
Traceback (most recent call last):
File "run_eval.py", line 117, in
start_carla=args.start_carla)
File "/home/chris/Carla-ppo-master/CarlaEnv/carla_lap_env.py", line 101, in init
raise Exception("Expected to find directory "dist" under ${CARLA_ROOT}!")
Exception: Expected to find directory "dist" under ${CARLA_ROOT}!

Could you give me a tip? I would appreciate it.

Yours,
Chris

free(): invalid pointer

When I run python3 Carla-ppo/train.py --model_name model1 -start_carla I get the following output:

free(): invalid pointer

As recommended, I'm using Carla 0.9.5.
The manual control script (PythonAPI/examples/manual_control.py) that comes with Carla works without any issues.
Any ideas what the issue could be?

module 'tensorflow' has no attribute 'AUTO_REUSE'

File "run_eval.py", line 16, in
from vae.models import ConvVAE, MlpVAE
File "C:\CARLA_0.9.13\PythonAPI\examples\Trials\RL\Carla-ppo\vae\models.py", line 33, in
class VAE():
File "C:\CARLA_0.9.13\PythonAPI\examples\Trials\RL\Carla-ppo\vae\models.py", line 40, in VAE
model_dir=".", loss_fn=bce_loss, training=True, reuse=tf.AUTO_REUSE,
AttributeError: module 'tensorflow' has no attribute 'AUTO_REUSE'

Specify requirements

Hi,
Please specify the requirements to run this code. I am getting the following error. I run it like this:
python3 run_eval.py --model_name pretrained_agent -start_carla

2024-06-28 09:07:56.997018: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/ryzen/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/cv2/../../lib64:
2024-06-28 09:07:56.997043: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "run_eval.py", line 15, in
from ppo import PPO
File "/home/Carla-ppo-master/ppo.py", line 7, in
import tensorflow_probability as tfp
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/init.py", line 20, in
from tensorflow_probability import substrates
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/substrates/init.py", line 21, in
from tensorflow_probability.python.internal import all_util
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/init.py", line 142, in
dir(globals()[pkg_name]) # Forces loading the package from its lazy loader.
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/internal/lazy_loader.py", line 61, in dir
module = self._load()
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/internal/lazy_loader.py", line 44, in _load
module = importlib.import_module(self.name)
File "/usr/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/experimental/init.py", line 35, in
from tensorflow_probability.python.experimental import bijectors
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/experimental/bijectors/init.py", line 17, in
from tensorflow_probability.python.bijectors.ldj_ratio import forward_log_det_jacobian_ratio
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/bijectors/init.py", line 46, in
from tensorflow_probability.python.bijectors.glow import Glow
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow_probability/python/bijectors/glow.py", line 45, in
tfkl = tfk.layers
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow/python/util/lazy_loader.py", line 62, in getattr
module = self._load()
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow/python/util/lazy_loader.py", line 45, in _load
module = importlib.import_module(self.name)
File "/usr/lib/python3.8/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/init.py", line 3, in
from keras import internal
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/internal/init.py", line 3, in
from keras.internal import backend
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/internal/backend/init.py", line 3, in
from keras.src.backend import _initialize_variables as initialize_variables
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/src/init.py", line 21, in
from keras.src import models
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/src/models/init.py", line 18, in
from keras.src.engine.functional import Functional
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/src/engine/functional.py", line 25, in
from keras.src import backend
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/src/backend.py", line 34, in
from keras.src.dtensor import dtensor_api as dtensor
File "/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/keras/src/dtensor/init.py", line 22, in
from tensorflow.compat.v2.experimental import dtensor as dtensor_api
ImportError: cannot import name 'dtensor' from 'tensorflow.compat.v2.experimental' (/home/Carla-ppo-master/bitsauce/lib/python3.8/site-packages/tensorflow/_api/v2/compat/v2/experimental/init.py)

How to resolve this?

The training of VAE

I would like to ask about VAE training. I only used your RGB dataset and got roughly the following results, but is the loss too large?


Also, I would like to ask about the plot obtained with vae_plots.py. Is it correct to read it as 64 latent features, with the vertical axis showing, on the same scale, a visualization of the different features for the same image? And what is the [-10, 10] scale at the bottom?

Thanks. @bitsauce

High-level command for RL agent

Hi @bitsauce, in your model the input state of the RL agent is the VAE output plus measurements (steering, throttle, brake), but it does not include a high-level command such as turn left, turn right, go straight, or follow the road. I want to add these high-level commands to train the RL agent; what should I do?

car stops for some episodes

I just got the car moving and started training. But I find that in most episodes the car stops at the starting point, and then a new episode starts. Do you know what may cause this? Thank you very much.

Training Problem

Hi, bitsauce.
I have trained a new agent in both synchronous and asynchronous environments for about 2k episodes, but I can't get any decent outcome like the one described in your paper; the reward keeps fluctuating.
So I carefully checked the training process, and I found that some episodes stopped before the car started moving, in both synchronous and asynchronous environments. Most of them got a reward of -10, as the data graph shows, so I think these are invalid training episodes. In particular, in the asynchronous setting the server frame rate is only slightly ahead of the client's (around 40:30). This may be a hardware issue, but it can't explain everything.
I trained agents in Town07 with CARLA 0.9.5.
Can you explain the reasons to me?
