DSAC

Implementation of Distributional Soft Actor Critic (DSAC). This repository is based on RLkit, a reinforcement learning framework implemented in PyTorch. The core DSAC algorithm is in rlkit/torch/dsac/

Requirements

  • python 3.6+
  • pytorch 1.0+
  • gym[all] 0.15+
  • scipy 1.0+
  • numpy
  • matplotlib
  • gtimer
  • pyyaml

Usage

You can write your experiment settings in YAML and run with

python dsac.py --config your_config.yaml --gpu 0 --seed 0

To run our implementation of SAC/TD3/TD4, replace dsac.py with sac.py/td3.py/td4.py. Set --gpu -1 to run on CPU.
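
For readers who want to see how the three flags fit together, here is a minimal sketch of a launcher's argument handling. It is not the code in dsac.py: the function names parse_args and device_name, the default values, and the "cuda:N" device string are assumptions made for illustration. Only the flag names and the meaning of --gpu -1 come from the usage line above.

```python
import argparse


def parse_args(argv=None):
    """Parse the three flags from the usage line (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Hypothetical launcher sketch")
    parser.add_argument("--config", type=str, required=True,
                        help="path to a YAML experiment config")
    parser.add_argument("--gpu", type=int, default=0,
                        help="GPU index, or -1 to run on CPU")
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed")
    return parser.parse_args(argv)


def device_name(gpu):
    """Map the --gpu value to a PyTorch-style device string (assumed convention)."""
    return "cpu" if gpu < 0 else f"cuda:{gpu}"


if __name__ == "__main__":
    args = parse_args(["--config", "your_config.yaml", "--gpu", "-1", "--seed", "0"])
    print(args.config, device_name(args.gpu), args.seed)
```

Passing a list to parse_args keeps the sketch testable without touching sys.argv; a real script would call parse_args() with no arguments.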

The experimental configurations used in the paper are in config/. A typical YAML configuration is given below:

env: Hopper-v2
version: normal-iqn-neutral # version for logging
eval_env_num: 10 # number of parallel environments for evaluation
expl_env_num: 10 # number of parallel environments for exploration
layer_size: 256 # hidden size of networks
num_quantiles: 32
replay_buffer_size: 1000000
algorithm_kwargs:
  batch_size: 256
  max_path_length: 1000
  min_num_steps_before_training: 10000
  num_epochs: 1000
  num_eval_paths_per_epoch: 10
  num_expl_steps_per_train_loop: 1000
  num_trains_per_train_loop: 1000
trainer_kwargs:
  alpha: 0.2
  discount: 0.99
  policy_lr: 0.0003
  zf_lr: 0.0003
  soft_target_tau: 0.005
  tau_type: iqn # quantile fraction generation method, choices: fix, iqn, fqf
  use_automatic_entropy_tuning: false
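
The tau_type option selects how quantile fractions are generated. The sketch below illustrates the two simpler choices in plain Python: "fix" uses evenly spaced fractions, and "iqn" draws random interval widths and normalizes them. It is an illustration under stated assumptions, not the repository's code: the function name quantile_fractions, the 0.1 offset that keeps sampled intervals from collapsing, and the use of interval midpoints are all assumptions. The "fqf" choice relies on a learned fraction proposal network and is left out.

```python
import random


def quantile_fractions(num_quantiles, tau_type="fix", rng=None):
    """Return (tau, tau_hat): cumulative fractions and interval midpoints.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random(0)
    if tau_type == "fix":
        # Evenly spaced fractions: every interval has width 1/N.
        widths = [1.0 / num_quantiles] * num_quantiles
    elif tau_type == "iqn":
        # Random widths; the 0.1 offset (an assumption) avoids near-empty intervals.
        raw = [rng.random() + 0.1 for _ in range(num_quantiles)]
        total = sum(raw)
        widths = [w / total for w in raw]
    else:
        raise NotImplementedError(f"tau_type {tau_type!r} is not covered by this sketch")

    # Cumulative sum of widths gives the upper edge of each interval.
    tau, acc = [], 0.0
    for w in widths:
        acc += w
        tau.append(acc)

    # Midpoint of each interval [tau_{i-1}, tau_i], with tau_0 = 0.
    lower_edges = [0.0] + tau[:-1]
    tau_hat = [(lo + hi) / 2.0 for lo, hi in zip(lower_edges, tau)]
    return tau, tau_hat


if __name__ == "__main__":
    tau, tau_hat = quantile_fractions(4, "fix")
    print(tau)
    print(tau_hat)
```

With num_quantiles: 32 as in the config above, either mode yields 32 fractions whose last cumulative value is 1.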

Learning under risk measures is available for DSAC and TD4. We provide 6 choices of risk metrics: neutral, std, VaR, cpw, wang, cvar. You can change the risk preference by adding two items to trainer_kwargs in your YAML config:

...

trainer_kwargs:
  ...
  risk_type: std
  risk_param: 0.1
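
To make the effect of risk_type and risk_param concrete, the sketch below applies three of the listed metrics to a list of equally weighted quantile estimates of the return, using their textbook definitions. It is not taken from the repository: the function name risk_adjusted_value, the sign convention for std (subtracting risk_param times the standard deviation, which is risk-averse for positive values), and the reading of risk_param as the tail fraction for cvar are assumptions. VaR, cpw, and wang are omitted.

```python
import math


def risk_adjusted_value(quantiles, risk_type="neutral", risk_param=0.0):
    """Collapse equally weighted quantile estimates into one scalar value.

    Hypothetical helper; definitions are textbook ones, not the repo's code.
    """
    n = len(quantiles)
    mean = sum(quantiles) / n
    if risk_type == "neutral":
        # Risk-neutral: the plain expectation.
        return mean
    if risk_type == "std":
        # Mean penalized by the (population) standard deviation.
        variance = sum((q - mean) ** 2 for q in quantiles) / n
        return mean - risk_param * math.sqrt(variance)
    if risk_type == "cvar":
        # Average of the worst risk_param fraction of outcomes (at least one).
        k = max(1, math.ceil(risk_param * n))
        return sum(sorted(quantiles)[:k]) / k
    raise ValueError(f"risk_type {risk_type!r} is not covered by this sketch")


if __name__ == "__main__":
    z = [1.0, 2.0, 3.0, 4.0]
    print(risk_adjusted_value(z, "neutral"))
    print(risk_adjusted_value(z, "std", 0.1))
    print(risk_adjusted_value(z, "cvar", 0.5))
```

Under these assumptions, a larger risk_param makes the std variant more conservative, while a smaller risk_param makes the cvar variant focus on a thinner lower tail.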
