JuliaReinforcementLearning / ReinforcementLearningEnvironments.jl
A one-stop package for different reinforcement learning environments
License: Other
JuliaReinforcementLearning/CommonRLInterface.jl#17 reminds me that we need a general grid-world example similar to https://github.com/maximecb/gym-minigrid (though it has been on my todo list for a really long time...).
That way, https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/issues/54 and Maze.jl could all rely on it.
I tried making an environment similar to OpenAI Gym's FrozenLake (code attached):
frozenlake.zip
Episodes terminate correctly when I run:
RLBase.test_runnable!(Env)
but not when I run:
run(RandomPolicy(action_space(Env)), Env, StopAfterEpisode(1))
What could be the reason for this?
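For what it's worth, one way to narrow this down (assuming the newer interface, where the run loop only ends an episode once is_terminated(env) returns true) is to step the env with random actions and watch the terminal flag:

# Hypothetical debugging loop: if this never exits, `run` cannot end the episode either.
let n = 0
    reset!(Env)
    while !is_terminated(Env)
        Env(rand(action_space(Env)))
        n += 1
        n > 10_000 && error("no termination after 10_000 steps")
    end
    println("terminated after $n steps")
end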
Hi, I was trying to sample random observations for the CartPole env. I specify the type to be Float32 during initialization, but the returned observations are Float64.
Here is an MWE:
julia> using ReinforcementLearningEnvironments, ReinforcementLearningBase
julia> env = CartPoleEnv(T=Float32)
CartPoleEnv{Float32}(gravity=9.8,masscart=1.0,masspole=0.1,totalmass=1.1,halflength=0.5,polemasslength=0.05,forcemag=10.0,tau=0.02,thetathreshold=0.20943952,xthreshold=2.4,max_steps=200)
julia> get_observation_space(env)
MultiContinuousSpace{Array{Float32,1}}(Float32[-4.8, -1.0f38, -0.41887903, -1.0f38], Float32[4.8, 1.0f38, 0.41887903, 1.0f38])
julia> rand(get_observation_space(env))
4-element Array{Float64,1}:
2.005482937920169
-9.852210595964816e37
0.09814665975717973
2.2353768849015113e37
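Until the underlying sampling respects the element type, a possible workaround is to draw the sample in the eltype of the bounds yourself. A minimal sketch, assuming the space stores its bounds in low/high fields (an assumption about the struct layout):

# Hypothetical workaround: sample uniformly between the bounds in their own eltype.
sample_in(space) = space.low .+ rand(eltype(space.low), length(space.low)) .* (space.high .- space.low)
sample_in(get_observation_space(env))  # should return an Array{Float32,1}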
A struct would be helpful for extracting specific fields from the observation result. See example usage here: https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/master/src/patches/ReinforcementLearningEnvironments.jl
┌ Warning: Package ReinforcementLearningEnvironments does not have PyCall in its dependencies:
│ - If you have ReinforcementLearningEnvironments checked out for development and have
│ added PyCall as a dependency but haven't updated your primary
│ environment's manifest file, try `Pkg.resolve()`.
│ - Otherwise you may need to report an issue with ReinforcementLearningEnvironments
└ Loading PyCall into ReinforcementLearningEnvironments from project dependency, future warnings for ReinforcementLearningEnvironments are suppressed.
Something's not quite right with how the optional dependency on OpenAI Gym is set up. PyCall is declared in [extras], so I'm not sure what the issue is. I'll make a PR if I find it.
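For context, the usual pattern here is Requires.jl: the gym glue code only loads after the user imports PyCall themselves. A minimal sketch of that pattern (the module body and include path are illustrative, not the package's actual layout):

# Sketch of conditional loading with Requires.jl; "gym.jl" is a hypothetical file name.
module ReinforcementLearningEnvironments

using Requires

function __init__()
    @require PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0" include("gym.jl")
end

end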
The DefaultStateStyleEnv wrapper seems to have no effect on the state() function. It changes the return value of DefaultStateStyle(E) properly, but state(E) later falls back to first(ss), i.e. the first state style.
How to reproduce:
using ReinforcementLearning
env = TicTacToeEnv()
s_env = state(env)
println(s_env)
E = DefaultStateStyleEnv{Observation{Int}()}(env) # error probably somewhere here
println("DefaultStateStyle(E) is ", DefaultStateStyle(E), " so it has changed")
s = state(E)
println("But state(E) return string: ", s)
s_E_int = state(E, Observation{Int}())
s_env_int = state(env, Observation{Int}())
println("These work properly: ", s_E_int, s_env_int)
returns:
...
...
...
DefaultStateStyle(E) is Observation{Int64}() so it has changed
But state(E) return string: ...
...
...
These work properly: 1 1
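A plausible fix, sketched below, would be to route the zero-argument state call through the wrapper's declared default style; this is a guess at the missing dispatch, not the actual library source:

# Hypothetical patch: make plain state(E) use the wrapper's DefaultStateStyle
# instead of falling back to the wrapped env's first state style.
RLBase.state(env::DefaultStateStyleEnv) = RLBase.state(env, RLBase.DefaultStateStyle(env))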
It doesn't seem that complicated to support the newly released OpenSpiel, which uses pybind11 for its Python wrapper.
Hello,
I'm trying to write a simple env that detects up/down moves in an array stored in a DataFrame. This is extremely easy in Python's gym, but I have a question: I don't understand how to return the observation. I have read the guide and the CartPole example, but I can't figure it out. Is RLBase.get_state(env::AcrobotEnv) equivalent to the observation returned by gym's step?
I also noticed that the README lists GymEnv | PyCall.jl, but there is no example of how to use it. (Can I just use a gym-wrapped environment here? Would it slow down performance?)
I also wonder why the action space is missing from this example: https://juliareinforcementlearning.org/blog/how_to_write_a_customized_environment/
EDIT: found in an example:
RLBase.observe(env::MazeEnv) =
(
reward = Float64(env.position == env.goal),
terminal = env.position == env.goal,
state = (env.position[2] - 1) * env.NX + env.position[1],
)
but I still don't understand how the state works, or why observe isn't defined in any env.
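For anyone hitting the same confusion: in newer versions of the interface, the single observe(env) named tuple was split into separate state, reward, and is_terminated queries, and state(env) plays the role of gym's returned observation. A minimal sketch of a custom env in that style (UpDownEnv, its fields, and its reward rule are all hypothetical, purely for illustration):

# A minimal sketch assuming the newer RLBase interface; all names and the
# reward logic are illustrative, not part of any real environment.
using ReinforcementLearningBase
const RLBase = ReinforcementLearningBase

mutable struct UpDownEnv <: RLBase.AbstractEnv
    data::Vector{Float64}
    t::Int
    action::Int   # last action: 1 = predict up, 2 = predict down
end

UpDownEnv(data) = UpDownEnv(data, 1, 1)

RLBase.action_space(::UpDownEnv) = Base.OneTo(2)
RLBase.state(env::UpDownEnv) = env.data[env.t]        # the "observation" in gym terms
RLBase.is_terminated(env::UpDownEnv) = env.t >= length(env.data)
RLBase.reset!(env::UpDownEnv) = (env.t = 1; env)

function RLBase.reward(env::UpDownEnv)
    env.t == 1 && return 0.0
    went_up = env.data[env.t] > env.data[env.t-1]
    (env.action == 1) == went_up ? 1.0 : -1.0
end

# Stepping: the env is a callable that consumes an action.
function (env::UpDownEnv)(action)
    env.action = action
    env.t += 1
end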
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers. Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix on this issue. I'll open a PR within a few hours, please be patient!
Making Observation mutable would allow hooks to modify observations on the fly, but I'm not sure that is a good design.
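To make the trade-off concrete, this is the kind of in-place edit a mutable observation would enable (the struct and the hook logic are hypothetical):

# Hypothetical: a hook rescales the reward on the fly. Assigning to the field
# is only legal because the struct is mutable, and every downstream consumer
# then silently sees the modified observation.
mutable struct Obs{T}
    state::Vector{T}
    reward::T
    terminal::Bool
end

scale_reward!(obs::Obs, k) = (obs.reward *= k; obs)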
This PR adds 1) Algorithmic, 2) Atari, and 3) Toy-text envs. I think the Atari part is already ported here (https://github.com/JuliaReinforcementLearning/ArcadeLearningEnvironment.jl). Maybe I should start with the Toy-text envs; they seem to be simpler than the algorithmic ones. Correct me if I am wrong @findmyway. Thank you :)
The rng in OpenSpielEnv is redundant for games without a chance node. Better to set it to nothing.
ReinforcementLearningEnvironments depends on:
[621f4979] + AbstractFFTs v0.5.0
[79e6a3ab] + Adapt v1.0.1
[7d9fca2a] + Arpack v0.4.0
[68821587] + Arpack_jll v3.5.0+2
[b99e7846] + BinaryProvider v0.5.8
[fa961155] + CEnum v0.2.0
[3895d2a7] + CUDAapi v3.1.0
[c5f51814] + CUDAdrv v6.0.0
[be33ccc6] + CUDAnative v2.10.2
[e66e0078] + CompilerSupportLibraries_jll v0.2.0+1
[3a865a2d] + CuArrays v1.7.2
[9a962f9c] + DataAPI v1.1.0
[864edb3b] + DataStructures v0.17.10
[31c24e10] + Distributions v0.22.5
[1a297f60] + FillArrays v0.8.5
[0c68f7d7] + GPUArrays v2.0.1
[28b8d3ca] + GR v0.47.0
[929cbde3] + LLVM v1.3.3
[1914dd2f] + MacroTools v0.5.4
[e1d29d7a] + Missings v0.4.3
[872c559c] + NNlib v0.6.6
[4536629a] + OpenBLAS_jll v0.3.7+6
[efe28fd5] + OpenSpecFun_jll v0.5.3+2
[bac558e1] + OrderedCollections v1.1.0
[90014a1f] + PDMats v0.9.11
[1fd47b50] + QuadGK v2.3.1
[e575027e] + ReinforcementLearningBase v0.6.5
[25e41dd2] + ReinforcementLearningEnvironments v0.2.3
[ae029012] + Requires v1.0.1
[79098fc4] + Rmath v0.6.1
[f50d1b31] + Rmath_jll v0.2.2+0
[a2af1166] + SortingAlgorithms v0.3.1
[276daf66] + SpecialFunctions v0.10.0
[2913bbd2] + StatsBase v0.32.2
[4c63d2b9] + StatsFuns v0.9.4
[a759f4b9] + TimerOutputs v0.5.3
It's not outrageous, but if the dependency on the GPU stuff / NNlib could be taken out, it would make my package lighter...
I'm thinking about making a bridge from POMDPs.jl to this package. Are there any expectations/requirements on what the output of observe is? Should it be an AbstractVector, for example?
Just created an interesting package for a tutorial:
https://github.com/JuliaReinforcementLearning/SnakeGames.jl
It can be used to test single player or zero-sum two player algorithms.
Inspired by https://agrishchenko.wixsite.com/snakesai/rules
Though all the environments here are written against ReinforcementLearningBase.jl, it's relatively easy to also support the minimal interfaces provided in CommonRLInterface. A sketch:
struct CommonEnvWrapper{T<:RLBase.AbstractEnv} <: CommonRL.AbstractCommonEnv
    env::T
end

# Note the ::Type argument: `convert(CommonRL.AbstractCommonEnv, env::...)` is
# not a valid method signature, since the first "argument" is a qualified type.
Base.convert(::Type{CommonRL.AbstractCommonEnv}, env::RLBase.AbstractEnv) = CommonEnvWrapper(env)

function CommonRL.step!(env::CommonEnvWrapper, action)
    env.env(action)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

function CommonRL.reset!(env::CommonEnvWrapper)
    RLBase.reset!(env.env)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

# Forward to the wrapped env, not the wrapper itself.
CommonRL.actions(env::CommonEnvWrapper) = RLBase.get_action_space(env.env)
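Hypothetical usage, to show the round trip (assumes an RLBase env such as CartPoleEnv is available):

# Wrap an RLBase env and drive it through the CommonRL API.
cenv = convert(CommonRL.AbstractCommonEnv, CartPoleEnv())
CommonRL.reset!(cenv)
s, r, done, obs = CommonRL.step!(cenv, rand(CommonRL.actions(cenv)))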
What do you think? @zsunberg
Hi,
I am experiencing an error while trying to train a basic DQN on MountainCarEnv with a CircularArrayBuffer. It seems that some functions are not found by Julia: update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray), _buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}), and _buffer_frame(cb::CircularArrayBuffer, i::Int).
I managed to run my code by manually adding those functions to my Julia file, copying them from ReinforcementLearningCore.jl/src/utils/base.jl:
# Map a logical frame index to its physical index in the circular buffer.
@inline function _buffer_frame(cb::CircularArrayBuffer, i::Int)
    n = capacity(cb)
    idx = cb.first + i - 1
    if idx > n
        idx - n
    else
        idx
    end
end

_buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}) = map(i -> _buffer_frame(cb, i), I)

# Write `data` into the most recently pushed frame of the buffer.
function RL.update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray) where {T,N}
    select_last_dim(cb.buffer, _buffer_frame(cb, cb.length)) .= data
    cb
end
The strange thing is that I do not experience this issue when I run the exact same code using CartPoleEnv.
I am having trouble understanding what is happening, and I hope this helps point out something to improve in the source code.
Some experiments listed in "List of built-in experiments" are not in the source folder https://github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl/tree/master/src/experiments, for example JuliaRL_VPG_Pendulum. Where can I find them?
See the gym implementation, and how we typically implement these classic examples here.
Should seed! be part of this interface to allow for easily reproducing exact trajectories? Or is there another standard way of seeding the environment's rng?
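One common pattern (an assumption about the constructors here, not a confirmed part of the interface) is to inject an explicitly seeded rng when the env is built, which makes trajectories replayable without a separate seed! method:

# Hypothetical: pass a seeded rng into the env so rollouts are reproducible.
using Random
env = CartPoleEnv(rng = MersenneTwister(123))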
I see the API has changed.
I am trying to use Float32 as the type for MountainCarEnv (basically setting the argument T = Float32) and I am having an issue.
It seems that CartPoleEnv, which is very similar to MountainCarEnv, works perfectly with different types, so the issue might come from a difference in how the two environments are defined: in CartPole, the parameters are directly converted to the chosen type, but I can't find a line doing the same in MountainCarEnv.
Am I the only one to get this? I hope this suggestion can help fix the issue!
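For reference, a sketch of the kind of constructor-time conversion CartPoleEnv does and MountainCarEnv appears to be missing; the struct and field names below are illustrative, not the actual source:

# Hypothetical parameter struct: every physical constant is converted to T,
# so the whole simulation stays in the requested precision.
struct CarParams{T}
    min_pos::T
    max_pos::T
    max_speed::T
    goal_pos::T
    gravity::T
end

CarParams(; T = Float64, min_pos = -1.2, max_pos = 0.6,
            max_speed = 0.07, goal_pos = 0.5, gravity = 0.0025) =
    CarParams{T}(min_pos, max_pos, max_speed, goal_pos, gravity)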