thomashirtz / gym-hybrid

Collection of OpenAI parametrized action-space environments.

Python 100.00%

openai-gym openai-gym-environments hybrid-action-space parametrized-action-space parametrized hybrid reinforcement-learning reinforcement-learning-environments gym-hybrid

gym-hybrid's Issues

Rendering Problem for gym>=0.25.0

Hi!

OpenAI Gym removed rendering.py as part of its move to pygame-based rendering, so
"from gym.envs.classic_control import rendering" now fails in environment.py.

openai/gym#2599
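
A possible stopgap, sketched here under assumptions rather than taken from the repository: guard the import so the environment module still loads when the legacy renderer is missing, and skip drawing in that case. The helper name load_rendering is hypothetical.

```python
import importlib


def load_rendering():
    """Return gym's legacy rendering module, or None when it is unavailable."""
    try:
        return importlib.import_module("gym.envs.classic_control.rendering")
    except Exception:
        # ImportError covers a removed module (newer gym) and a missing gym install;
        # older gym versions can also fail here on machines without a display.
        return None


rendering = load_rendering()

if rendering is None:
    # Hypothetical fallback: the environment would skip render() or use a pygame port.
    print("Legacy rendering unavailable; render() needs a pygame-based replacement.")
```

This only keeps the import from crashing; actual drawing on newer gym versions would still need to be ported to pygame.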

Confusion about the implementation of accelerate method in SlidingAgent

Hello, I am confused about the implementation of the accelerate method here.
According to the following formula from stackexchange (as mentioned in L58),

to calculate the magnitude of the sum of two polar vectors, we need to know the magnitudes and angles of both vectors. I think self.speed and speed represent the magnitudes, but I don't understand why they are used to compute the cosine via np.cos(value - self.speed) (instead of something that represents the angle). I also don't quite understand how you simulate inertia. From my physics knowledge, it may be necessary to use a variable representing the 'acceleration' or 'derivative' that specifies how fast the speed changes. Am I right?
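
For reference, the identity usually quoted for this (presumably the one the stackexchange answer gives) is the law of cosines: |v1 + v2| = sqrt(r1^2 + r2^2 + 2*r1*r2*cos(theta2 - theta1)). The cosine takes the difference of the two angles, not of the two magnitudes. The sketch below is independent of the repository's code, uses illustrative function names, and checks the identity against a plain Cartesian sum.

```python
import math


def polar_sum_magnitude(r1, theta1, r2, theta2):
    """Magnitude of the sum of two polar vectors (law of cosines)."""
    return math.sqrt(r1 ** 2 + r2 ** 2 + 2 * r1 * r2 * math.cos(theta2 - theta1))


def cartesian_sum_magnitude(r1, theta1, r2, theta2):
    """Same quantity, computed by converting both vectors to x/y components."""
    x = r1 * math.cos(theta1) + r2 * math.cos(theta2)
    y = r1 * math.sin(theta1) + r2 * math.sin(theta2)
    return math.hypot(x, y)


# Perpendicular vectors of length 3 and 4 give the familiar 3-4-5 triangle.
perpendicular = polar_sum_magnitude(3.0, 0.0, 4.0, math.pi / 2)

# An arbitrary pair, checked against the Cartesian computation.
polar = polar_sum_magnitude(1.5, 0.3, 0.7, 2.1)
cartesian = cartesian_sum_magnitude(1.5, 0.3, 0.7, 2.1)
print(perpendicular, polar, cartesian)
```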

Also, what does "direction of the agent" mean here, and how does it differ from the "angle of the velocity vector"?

Your timely response will be greatly appreciated.

PyPi Release

Hello, this is a really cool project! Would you be up for making a PyPI release so that people can install it via pip install gym-hybrid? A release would also pin the version, which will help people reproduce results in the future :D

AssertionError

Hi,
I was trying to run the code in a Jupyter Notebook. However, I get "AssertionError: Agent's minimum observation value is greater than it's maximum" when I run the code inside the test folder. Could you suggest what might cause this error?

Thank you

More test cases on hybrid env?

Hi, are you considering adding test cases for the hybrid env that check the reward value? For example, tests could verify that the reward is correct in some special states (such as moving towards the goal, moving away from the goal, or terminal states). I think it would greatly improve the robustness of the environment. 👍

Algorithm results about PDQN/HPPO in gym-hybrid

Hi, this is a nice project for hybrid action spaces, and I see you mention PDQN/HPPO in README.md. Do you have any experimental results for these algorithms in this environment? If not, we would like to invite you to implement the related algorithms and benchmarks together in our repo, DI-engine, and we will offer the corresponding support. Would you be willing to build a hybrid action space RL benchmark? Other comments are also welcome.

Improving action_space

Some algorithms, such as Q-PAMDP, need to know the exact shape of every action's parameter space. It would be nice to implement an action space that provides this information. The issue is that in this environment BREAK has no dimension, so it is not possible to copy the technique from https://github.com/cycraig/gym-platform

Possible solutions need to be investigated. One may be to give BREAK a dummy low and high, such as 0 and 0. However, a zero-width range may break some reinforcement learning algorithms.

import numpy as np
from gym import spaces


# Discrete action identifiers.
ACCELERATE = 0
TURN = 1
BREAK = 2

# Parameter bounds per action; BREAK takes no parameter, so its bounds are empty.
action_id_to_domain = {
    ACCELERATE: {'low': [0.], 'high': [1.]},
    TURN: {'low': [-1.], 'high': [1.]},
    BREAK: {'low': [], 'high': []},
}

# Tuple of (discrete action id, one Box of parameters per action, ordered by id).
action_space = spaces.Tuple(
    (
        spaces.Discrete(len(action_id_to_domain)),
        spaces.Tuple(
            tuple(
                spaces.Box(low=np.array(d['low']), high=np.array(d['high']), dtype=np.float32)
                for _, d in sorted(action_id_to_domain.items())
            )
        )
    )
)
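
As a rough illustration of the "exact shape" information mentioned above, the per-action parameter sizes and their offsets in a flat parameter vector can be derived directly from the domain dictionary. This is a sketch only: the helpers parameter_sizes and parameter_slices are hypothetical names, not part of gym-hybrid, and the flat-vector layout is an assumption about what a Q-PAMDP or PDQN implementation would expect.

```python
ACCELERATE, TURN, BREAK = 0, 1, 2

action_id_to_domain = {
    ACCELERATE: {'low': [0.], 'high': [1.]},
    TURN: {'low': [-1.], 'high': [1.]},
    BREAK: {'low': [], 'high': []},
}


def parameter_sizes(domains):
    """Number of continuous parameters per action, ordered by action id."""
    return [len(domains[action_id]['low']) for action_id in sorted(domains)]


def parameter_slices(domains):
    """Map each action id to its slice inside one flat parameter vector."""
    slices, start = {}, 0
    for action_id in sorted(domains):
        size = len(domains[action_id]['low'])
        slices[action_id] = slice(start, start + size)
        start += size
    return slices


sizes = parameter_sizes(action_id_to_domain)
slices = parameter_slices(action_id_to_domain)

# A flat vector holding the parameters of ACCELERATE and TURN; BREAK owns an empty slice.
flat_parameters = [0.5, -0.3]
print(sizes, flat_parameters[slices[TURN]], flat_parameters[slices[BREAK]])
```

With this layout BREAK simply maps to an empty slice, which avoids the zero-width dummy range at the cost of requiring algorithms to handle zero-length parameter vectors.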

Shape of action space

Hi,
I am trying to define the interval of the continuous action space, as in self.action_space = Tuple((Discrete(2), Box(-10, 10, (2,)))), for this parameterized action space. Could you please tell me how to define it in your gym environment?

parameters_min = np.array([0, -1])
parameters_max = np.array([1, +1])

self.action_space = spaces.Tuple((spaces.Discrete(3),
                                  spaces.Box(parameters_min, parameters_max)))

I defined it as

self.action_space = spaces.Tuple((spaces.Discrete(3),spaces.Box(parameters_min, parameters_max, shape=(0.1, 0.1))))

but got the error: low shape doesn’t match provided shape
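
A likely explanation, offered as a sketch rather than the maintainer's answer: in gym's Box, shape describes the array dimensions (a tuple of integers such as (2,)), not the interval, so the interval is changed through low and high. When low and high are arrays, shape can be omitted; when they are scalars, shape says how many parameters there are. The -10 and 10 bounds below are taken from the question, not from the environment.

```python
import numpy as np
from gym import spaces

# Option 1: array bounds, one entry per parameter; the shape (2,) is inferred.
parameters_min = np.array([-10.0, -10.0], dtype=np.float32)
parameters_max = np.array([10.0, 10.0], dtype=np.float32)
from_arrays = spaces.Box(low=parameters_min, high=parameters_max, dtype=np.float32)

# Option 2: scalar bounds broadcast over an explicit integer shape.
from_scalars = spaces.Box(low=-10.0, high=10.0, shape=(2,), dtype=np.float32)

# Either Box can be paired with the discrete action id.
action_space = spaces.Tuple((spaces.Discrete(3), from_arrays))

print(from_arrays.shape, from_scalars.shape)
```

Note that changing the bounds of the space alone does not change how the environment interprets the parameters; the step logic would have to be adapted as well.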
