dd_qnet's Introduction

Double Dueling Q Net

Requirements

Basic requirements:

Environment setup

The gym-gazebo environments can be found here.

  • Copy custom folder to gym-gazebo/gym_gazebo/envs.
  • Copy launch, models, worlds to gym-gazebo/gym_gazebo/envs/assets and skip duplicated files.
  • Replace __init__.py file in the gym-gazebo/gym_gazebo folder with the __init__.py file here.
  • In the gym-gazebo/gym_gazebo/envs/assets/launch/MazeColor.launch file, edit the line <arg name="world_file" default="/home/cloud/gym-gazebo/gym_gazebo/envs/assets/worlds/maze_color.world"/> so that default points to your own path to the maze_color.world file.
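
The last step above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name set_world_file and the /home/me/... path are made up for illustration, and the XML fragment is a shortened stand-in for the real MazeColor.launch file.

```python
import xml.etree.ElementTree as ET


def set_world_file(launch_xml: str, world_path: str) -> str:
    """Return launch XML with the default of <arg name="world_file"> replaced."""
    root = ET.fromstring(launch_xml)
    for arg in root.iter("arg"):
        if arg.get("name") == "world_file":
            arg.set("default", world_path)
    return ET.tostring(root, encoding="unicode")


# Shortened, illustrative fragment; the real launch file contains more elements.
example = (
    "<launch>"
    '<arg name="world_file" '
    'default="/home/cloud/gym-gazebo/gym_gazebo/envs/assets/worlds/maze_color.world"/>'
    "</launch>"
)

# Hypothetical local path; substitute the location of your own checkout.
patched = set_world_file(
    example, "/home/me/gym-gazebo/gym_gazebo/envs/assets/worlds/maze_color.world"
)
print(patched)
```

Note that ElementTree discards XML comments when parsing, so editing the line by hand may be preferable if the launch file contains comments you want to keep.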

Project structure

Environments

The environment files for the different training and testing situations are all stored in the custom_envs folder. To use an environment, copy the code in gazebo_turtlebot_maze_color*.py to gazebo_turtlebot_maze_color.py in gym-gazebo/gym_gazebo/envs/custom.

List of environments:

  • gazebo_turtlebot_maze_color.py: environment for training the CNN model, which uses both image and laser data as state input. To change the target position, change self.num_target = 1 on line 52 to another index (from 0 to 2).

  • gazebo_turtlebot_maze_color_laser_only.py: environment for training the laser model. The environment has 5 target positions; one is chosen at random at the start and again after each episode ends.

  • gazebo_turtlebot_maze_color_laser_only_ver2.py: environment for training the laser model. The environment has 5 hints at 5 corners of the maze, so the laser model can learn to go to a corner and turn back.

  • gazebo_turtlebot_maze_color_laser-image.py: environment for testing the laser model with the image processing strategy. It has 5 different reward and hint positions, which can be changed by setting self.num_target = 1 on line 59 of the code to another index (from 0 to 4).

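
Switching environments means copying one variant over the active file. A minimal sketch of that copy step follows; the helper activate_env is hypothetical and not part of the repository, and the paths in the usage comment assume the folder names given above, relative to wherever your checkouts live.

```python
import shutil
from pathlib import Path

# File name that gym-gazebo loads, as described above.
ACTIVE_NAME = "gazebo_turtlebot_maze_color.py"


def activate_env(variant: Path, custom_dir: Path) -> Path:
    """Copy one environment variant over the active environment file."""
    if not variant.is_file():
        raise FileNotFoundError(variant)
    target = custom_dir / ACTIVE_NAME
    shutil.copyfile(variant, target)  # overwrites any existing active file
    return target


# Example call (paths are assumptions about your local layout):
# activate_env(
#     Path("custom_envs/gazebo_turtlebot_maze_color_laser_only.py"),
#     Path("gym-gazebo/gym_gazebo/envs/custom"),
# )
```

Because the copy overwrites the active file, keep any local changes to it under a different name first.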

Models

All the main code is stored in the src folder. We have 2 types of model:

  • ddq_model.py: the model uses a CNN architecture with image and laser data as the input state. It follows the architecture proposed in this paper.
  • laser_model.py: the model uses a DNN with laser data as the input state.
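
For readers new to the terms in the project title, here is a minimal sketch of the two ideas as they are usually formulated: the dueling aggregation of a state value and per-action advantages, and the double Q-learning target. It uses plain Python lists rather than the networks in src, the function names are illustrative, and the repository's actual implementation may differ.

```python
def dueling_q(value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_q_target(reward, done, gamma, q_online_next, q_target_next):
    """Double Q-learning target for one transition.

    The online network's Q-values choose the next action; the target
    network's Q-values are used to evaluate that action.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


print(dueling_q(1.0, [1.0, 2.0, 3.0]))
print(double_q_target(1.0, False, 0.5, [0.1, 0.9], [4.0, 2.0]))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable; selecting with one network and evaluating with the other is what reduces the overestimation seen in plain Q-learning.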

Usage

Image CNN model

Replace code in gym-gazebo/.../custom/gazebo_turtlebot_maze_color.py with the code in gazebo_turtlebot_maze_color.py to use this model.

Train:

python qlearning.py

For more parameters: python qlearning.py --help

Test model: python test.py <from_pretrain_dir> <epsilon>

python test.py ddq_model 0.01
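
The <epsilon> argument is not described further here. Assuming it is the exploration rate of an epsilon-greedy policy, as is common for Q-learning agents, a value such as 0.01 would mean that about 1% of the actions are random. A minimal sketch under that assumption (the function name is illustrative and not taken from test.py):

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# With epsilon = 0.0 the choice is always greedy.
print(epsilon_greedy([0.1, 0.7, 0.2], 0.0))
```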

DNN laser model

Replace code in gym-gazebo/.../custom/gazebo_turtlebot_maze_color.py with the code in gazebo_turtlebot_maze_color_laser_only.py or gazebo_turtlebot_maze_color_laser_only_ver2.py to use this model.

Train:

python laser_learning.py

For more parameters: python qlearning --help

Test model: python test_laser_only.py <from_pretrain_dir> <epsilon>

python test_laser_only.py laser-only 0.1

Use laser model with hint detect strategy

Replace code in gym-gazebo/.../custom/gazebo_turtlebot_maze_color.py with the code in gazebo_turtlebot_maze_color_laser-image.py to use this model.

Run: python test_laser_image.py <from_pretrain_dir> <epsilon>

python test_laser_image.py laser-only 0.0

Pretrained models

Some of our pretrained models can be found here.

They can be used as pretrained models or trained further with the --from_pretrain or --continue_from parameters in the learning files. For example: python laser_learn --continue_from laser-only --output_dir laser-only

dd_qnet's People

Contributors

mattbui, tailongnguyen

dd_qnet's Issues

Environment for training laser model (gazebo_turtlebot_maze_color_laser_only)

Hi, when I train the laser model with gazebo_turtlebot_maze_color_laser_only.py, I get an error.
utils.py contains parser.add_argument("--num_pretrain_step", type=int, default=10000). As I understand it, this means that actions are selected randomly for the first 10,000 steps, and after that they are selected by the network.
But once the step count exceeds 10,000, I get the following error in the terminal:

`Done epoch in 10 steps, 10 random steps, Total reward: -191, Total step: 4981, Average loss: 0.0
Setting target 1th
Traceback (most recent call last):

File "laser_learn.py", line 48, in <module>
action = qnet.get_actions(state.reshape(1, -1))[0]
File "/home/nuc/gym-gazebo/dd_qnet/src/laser_model.py", line 55, in get_actions
qvalues = self.get_qvalues(states)
File "/home/nuc/gym-gazebo/dd_qnet/src/laser_model.py", line 47, in get_qvalues
predicted = self.model.predict(states)
File "/home/nuc/.local/lib/python3.5/site-packages/keras/engine/training.py", line 1152, in predict
x, _, _ = self._standardize_user_data(x)
File "/home/nuc/.local/lib/python3.5/site-packages/keras/engine/training.py", line 754, in _standardize_user_data
exception_prefix='input')
File "/home/nuc/.local/lib/python3.5/site-packages/keras/engine/training_utils.py", line 136, in standardize_input_data
str(data_shape))

ValueError: Error when checking input: expected input_1 to have shape (100,) but got array with shape (1,)`

Looking forward to your reply, thank you.
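
The behaviour described in this issue can be sketched as follows. This is a hypothetical illustration, not the code in laser_learn.py: it shows random actions during the first num_pretrain_step steps and adds a size check on the state before it reaches the network. The size 100 is taken from the error message; the traceback alone does not show why the state has only one value.

```python
import random


def choose_action(step, num_pretrain_step, num_actions, greedy_action):
    """Random action during pretraining, otherwise call greedy_action()."""
    if step < num_pretrain_step:
        return random.randrange(num_actions)
    return greedy_action()


def check_state(state, expected_size=100):
    """Raise a readable error if a flat state does not have expected_size values."""
    size = len(state)
    if size != expected_size:
        raise ValueError(f"expected {expected_size} laser values, got {size}")
    return state


# The network is only consulted once the pretraining steps are over, which
# would explain why a malformed state goes unnoticed until step 10,000.
action = choose_action(10000, 10000, 3, greedy_action=lambda: 2)
print(action)
```

Calling a check like check_state on the value returned by the environment's reset and step methods would show where the one-element state first appears.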
