bluerivertech / quanser-openai-driver

OpenAI Gym wrapper for the Quanser Qube and Quanser Aero

License: MIT License

Python 92.32% Cython 7.68%

quanser-openai-driver's Introduction

Quanser OpenAI Driver

An OpenAI Gym wrapper for the Quanser Qube Servo 2 and Quanser Aero.

Setup
Basic Usage
Warning

Setup

We have tested on Ubuntu 16.04 LTS and Ubuntu 18.04 LTS using Python 2.7 and Python 3.6.5.

Prerequisites

Install the HIL SDK from Quanser.
A mirror is available at https://github.com/quanser/hil_sdk_linux_x86_64.

You can install the HIL SDK by running:

    git clone https://github.com/quanser/hil_sdk_linux_x86_64.git
    sudo chmod a+x ./hil_sdk_linux_x86_64/setup_hil_sdk ./hil_sdk_linux_x86_64/uninstall_hil_sdk
    sudo ./hil_sdk_linux_x86_64/setup_hil_sdk

You must also have pip installed:

    sudo apt-get install python3-pip

Installation

We recommend using a virtual environment such as conda (recommended), virtualenv, or Pipenv.

You can install the driver by cloning and pip-installing:

    git clone https://github.com/BlueRiverTech/quanser-openai-driver.git
    cd quanser-openai-driver
    pip3 install -e .

Once that is set up, run the classical control baseline (make sure the Qube is connected to your computer):

    python tests/test.py --env QubeSwingupEnv --controller flip

Usage

Usage is very similar to most OpenAI Gym environments, but you must close the environment when you are finished. If the Env is not closed safely, the hardware may be left in a bad state; usually you will not be able to reopen the board.

This can be done with a context manager using a with statement:

    import gym
    from gym_brt import QubeSwingupEnv

    num_episodes = 10
    num_steps = 250

    with QubeSwingupEnv() as env:
        for episode in range(num_episodes):
            state = env.reset()
            for step in range(num_steps):
                action = env.action_space.sample()
                state, reward, done, _ = env.step(action)

Alternatively, the environment can be closed manually by calling env.close(). You can see an example here.
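
As a minimal sketch of the manual-close pattern, the code below wraps an episode loop in try/finally so that close() still runs if a step raises an exception. StubEnv and run_episodes are hypothetical names used only for illustration and are not part of gym_brt. StubEnv just mimics the reset/step/close interface so the sketch runs without hardware; with a real Qube you would pass a QubeSwingupEnv() instance instead.

```python
class StubEnv:
    """Hypothetical stand-in for a hardware env (not part of gym_brt)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.closed = False
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0

    def step(self, action):
        self._t += 1
        done = self._t >= self.episode_length
        # Same 4-tuple shape as the loop above: state, reward, done, info
        return float(self._t), 1.0, done, {}

    def close(self):
        self.closed = True


def run_episodes(env, policy, num_episodes=10, num_steps=250):
    """Run episodes and always close the env, even if a step raises."""
    returns = []
    try:
        for _ in range(num_episodes):
            state = env.reset()
            total = 0.0
            for _ in range(num_steps):
                state, reward, done, _info = env.step(policy(state))
                total += reward
                if done:
                    break  # stop stepping once the episode has ended
            returns.append(total)
    finally:
        # Runs on normal exit and on exceptions alike
        env.close()
    return returns
```

Putting close() in the finally clause gives the same guarantee as the with statement: the board is released whether the loop finishes normally or fails partway through.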

Environments

Information about various environments can be found in docs/envs and our whitepaper.

Control

Information about baselines can be found in docs/control.

Hardware Wrapper

Information about the Python wrapper for Quanser hardware and Qube Servo 2 simulator can be found in docs/quanser and our whitepaper.

Citing

If you use this in your research please cite the following whitepaper:

@misc{2001.02254,
  author = {{Polzounov}, Kirill and {Sundar}, Ramitha and {Redden}, Lee},
  title = "{Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware}",
  year = {2019},
  eprint = {arXiv:2001.02254},
  howpublished = {Accepted at the Workshop on Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.}
}

quanser-openai-driver's Issues

No controllers work in the simulator

Hello!

I am using Ubuntu 18.04 with Python 3.8.0, and I managed to use the flip controller with the QUBE 2 hardware successfully. However, when I try to use it with the simulator (by running python tests/test.py --env QubeSwingupEnv --controller flip -s -r), the controller seems to immediately swing the arm all the way to the left or right, ending the episode. The same happens with other controllers such as "sw".

The simulator still seems to work on its own, though; I can train RL agents against it and get successful results. But there seems to be a deviation between the simulator and hardware due to the controllers not working.

Am I doing something wrong? Could it be that I am using incorrect versions of the required Python modules?

Thanks in advance!
Best Regards

PPO Crash and Freeze when Reset

Hi,
first of all, thanks a lot for providing this interface!

I am also trying to apply the baseline algorithm for PPO on the pendulum. Did you also encounter problems with the interface then? At the end of an episode the rotary arm sometimes crashes against the encoder cable and the program freezes in the reset function when trying to apply a voltage to the Qube. Could not find a solution for this. Did you have similar issues and how did you solve them?

Thanks so much!

Recreating Simulink schemes for Ubuntu 20.04

Hi, I would like to create Simulink blocks from your Quanser OpenAI driver. I know there are already Simulink block schemes from Quanser: https://quanserinc.box.com/shared/static/08vjgurb59omat6s1xd9u42m1xhko199.zip
QUARC is needed to run those, and since QUARC is not available on Ubuntu, I would like to recreate them for Simulink with the Quanser OpenAI driver. I have already successfully run one of your examples as an S-function from MATLAB with Python:

    import gym
    from gym_brt import QubeSwingupEnv

    num_episodes = 10
    num_steps = 250

    with QubeSwingupEnv() as env:
        for episode in range(num_episodes):
            state = env.reset()
            for step in range(num_steps):
                action = env.action_space.sample()
                state, reward, done, _ = env.step(action)

How can I create and run basic experiments, like the C examples in the HIL SDK, using your driver? Is it even possible? I couldn't find any basic examples like the ones in the HIL SDK.

Weird noise

Hi, I have a Qube Servo 2 and an Ubuntu 20.04 server. I did everything according to the setup instructions, and I could run the Qube Servo 2 examples from hil_sdk without any issues or noise. When I tried the command "python3 tests/test.py --env QubeSwingupEnv --controller flip", my Qube 2 started to make a weird sound. The pendulum was moving, but the noise scared me, so I quickly shut it off. I am new to this, and I want to be able to run experiments online on a Qube connected to a server. Should it make that noise? Did I do something wrong? I ran the command twice and it made the sound both times. I can still run the examples from hil_sdk without problems. Thanks for any answers.

Simulation different from hardware

Hi!
I am using the simulator to collect preliminary data before moving to the hardware.
I started working on the master branch when I realized that there is a simulator_debug branch. There, in test/check_simulator.py, the test for the flip-up and hold control returns different results on the simulator and on the hardware (the hardware stabilizes at the top equilibrium, while this does not happen in simulation).
Do you have an idea of what may be causing this problem?

Thanks a lot for your help.