This repository uses a deep reinforcement learning algorithm to solve the Reacher benchmark problem.
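The Branching Dueling Q-Network (BDQ) is a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN): a shared state representation feeds one network branch per action dimension, so each joint of the reacher gets its own set of discretised sub-actions. The code was adapted (and simplified) from here. The paper can be found here.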
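The branching idea can be illustrated with a minimal plain-Python sketch. It assumes the mean-subtraction aggregation described in the BDQ paper and an evenly spaced discretisation of each action dimension in [-1, 1]. The function names, the example numbers and the action range are hypothetical and are not taken from this repository, whose TensorFlow implementation may differ.

```python
# Sketch of the BDQ branching idea in plain Python (no TensorFlow).
# All names and numbers are illustrative assumptions, not the repository's code.

def branch_q_values(state_value, advantages):
    """Combine the shared state value with one branch's advantages.

    Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a')
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def greedy_joint_action(state_value, branch_advantages):
    """Pick the argmax sub-action independently in every branch."""
    joint = []
    for advantages in branch_advantages:
        q = branch_q_values(state_value, advantages)
        joint.append(max(range(len(q)), key=q.__getitem__))
    return joint


def to_continuous(indices, n_bins, low=-1.0, high=1.0):
    """Map each discrete sub-action index to an evenly spaced value in [low, high]."""
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in indices]


# Hypothetical network outputs for a 2 DOF reacher with 3 sub-actions per joint.
state_value = 0.5
branch_advantages = [
    [0.1, 0.4, -0.2],   # advantages for joint 1
    [-0.3, 0.0, 0.6],   # advantages for joint 2
]

indices = greedy_joint_action(state_value, branch_advantages)
torques = to_continuous(indices, n_bins=3)
print(indices, torques)  # prints: [1, 2] [0.0, 1.0]
```

Because each branch is maximised on its own, the number of network outputs grows linearly with the number of joints (sub-actions per joint times joints) rather than exponentially, which is what makes the 6 DOF case tractable with a discrete-action method.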
2 DOF reacher after 20,000 training episodes:
6 DOF reacher after 100,000 training episodes:
python 3.7.3
gym 0.9.1
tensorflow 1.14.0
mujoco_py 0.5.7
git clone https://github.com/PierreExeter/BDQ_for_reacher.git
Run the script train_continuous.py
Open the script enjoy_continuous.py
Select the trained model from the trained_models folder.
Run the script enjoy_continuous.py