The deeppde_actorcritic from jerome-cong

deeppde_actorcritic's Introduction

Accompanying code for the paper "Actor-Critic Method for High Dimensional Static Hamilton--Jacobi--Bellman Partial Differential Equations based on Neural Networks", which uses an actor-critic method to solve HJB equations. The code is written in TensorFlow 2.0.

Run the following command to solve the HJB equation directly:

python main.py --config_path=configs/lqr_d5.json
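To run several experiments in sequence, the command above can be wrapped in a small script. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `run_all` are hypothetical, and it assumes that `python` on the PATH is the interpreter with TensorFlow installed and that the script is started from the repository root.

```python
import subprocess


def build_command(config_path):
    """Return the argument list for one run of main.py with the given config."""
    return ["python", "main.py", f"--config_path={config_path}"]


def run_all(config_paths, dry_run=True):
    """Build one command per config file; execute them unless dry_run is set."""
    commands = [build_command(path) for path in config_paths]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only print the commands that would be executed.
    for cmd in run_all(["configs/lqr_d5.json", "configs/lqr_d10.json"]):
        print(" ".join(cmd))
```

Calling `run_all(..., dry_run=False)` would launch the runs one after another.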

Names of config files: "lqr" denotes the linear quadratic regulator; "vdp" denotes the stochastic Van der Pol oscillator; "ekn" denotes the diffusive Eikonal equation; "lqr_var" denotes the linear quadratic regulator with a non-constant diffusion coefficient. The suffix "_dN" gives the dimension N of the problem.

Experiments in the paper	Config names
Linear quadratic regulator (Figure 2)	lqr_d5.json, lqr_d10.json, lqr_d20.json
Stochastic Van der Pol oscillator (Figure 3)	vdp_d4.json, vdp_d10.json, vdp_d20.json
Diffusive Eikonal equation (Figure 4)	ekn_d5.json, ekn_d10.json, ekn_d20.json
Linear quadratic regulator with non-constant diffusion	lqr_var_d5.json, lqr_var_d10.json, lqr_var_d20.json
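The naming pattern in the table can be captured in a small lookup. This is a sketch under assumptions: the helper `config_path` and the dictionary `AVAILABLE_DIMS` are hypothetical names, and the `configs` directory is inferred from the example command above. Only the prefix and dimension combinations listed in the table are accepted.

```python
# Dimensions listed in the table for each problem prefix.
AVAILABLE_DIMS = {
    "lqr": (5, 10, 20),
    "vdp": (4, 10, 20),
    "ekn": (5, 10, 20),
    "lqr_var": (5, 10, 20),
}


def config_path(problem, dim, config_dir="configs"):
    """Build the config file path for a problem prefix and dimension."""
    if problem not in AVAILABLE_DIMS:
        raise ValueError(f"unknown problem prefix: {problem!r}")
    if dim not in AVAILABLE_DIMS[problem]:
        raise ValueError(f"no config listed for {problem} in dimension {dim}")
    return f"{config_dir}/{problem}_d{dim}.json"


if __name__ == "__main__":
    print(config_path("vdp", 4))
```

For example, `config_path("lqr_var", 10)` yields a path that can be passed to `--config_path`.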

Fields in config files:
"sample": "normal" samples the Brownian increments from a normal distribution; "bounded" uses bounded samples.
"scheme": "naive" uses the naive scheme from the paper; "adaptive" uses the stepsize-adaptive scheme.
"TD": "TD1" uses the variance-reduced least-squares temporal difference (VR-LSTD); "TD2" uses the least-squares temporal difference (LSTD).
"train": "actor-critic" trains both the value function and the control; "actor" trains only the control (given the correct value function); "critic" trains only the value function (given the correct control).
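The options above can be checked before launching a run. The sketch below is illustrative only: `check_fields` is a hypothetical helper, and it assumes the four fields have already been collected into one flat dictionary. The actual config files may nest these keys and will likely contain other settings that are not checked here.

```python
# Allowed values for each field, as described above.
ALLOWED_VALUES = {
    "sample": {"normal", "bounded"},
    "scheme": {"naive", "adaptive"},
    "TD": {"TD1", "TD2"},
    "train": {"actor-critic", "actor", "critic"},
}


def check_fields(fields):
    """Return a list of problems found in the given field dictionary."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        if key not in fields:
            problems.append(f"missing field: {key}")
        elif fields[key] not in allowed:
            problems.append(
                f"invalid value for {key}: {fields[key]!r} "
                f"(expected one of {sorted(allowed)})"
            )
    return problems


if __name__ == "__main__":
    example = {"sample": "normal", "scheme": "adaptive",
               "TD": "TD1", "train": "actor-critic"}
    print(check_fields(example))
```

An empty list means every field is present and holds one of the documented values.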
