
ReinforcementLearningEnvironments.jl's Issues

DefaultStateStyleEnv doesn't work in TicTacToeEnv

The DefaultStateStyleEnv wrapper seems to have no effect on the state() function.
It changes the return value of DefaultStateStyle(E) properly, but state() later falls back to the first(ss) state style.

How to reproduce:

using ReinforcementLearning

env = TicTacToeEnv()

s_env = state(env)
println(s_env)

E = DefaultStateStyleEnv{Observation{Int}()}(env)  # error probably somewhere here
println("DefaultStateStyle(E) is ", DefaultStateStyle(E), " so it has changed")

s = state(E)
println("But state(E) return string: ", s)


s_E_int = state(E, Observation{Int}())
s_env_int = state(env, Observation{Int}())
println("These work properly: ", s_E_int, s_env_int)

returns:

...
...
...

DefaultStateStyle(E) is Observation{Int64}() so it has changed
But state(E) returns a string: ...
...
...

These work properly: 1 1
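
A possible fix (a sketch only; it assumes DefaultStateStyleEnv stores the style as its first type parameter, as the constructor above suggests, and that the wrapped env lives in a field named env) would be to make the unqualified state call dispatch on that parameter:

RLBase.state(env::DefaultStateStyleEnv{S}) where {S} =
    state(env.env, S)  # forward to the inner env with the declared default style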

Incorrect type when sampling from Spaces

Hi, I was trying to sample random observations for the CartPole env. I am specifying the type to be Float32 during initialization, but the returned observations are Float64.

Here is an MWE:

julia> using ReinforcementLearningEnvironments, ReinforcementLearningBase

julia> env = CartPoleEnv(T=Float32)
CartPoleEnv{Float32}(gravity=9.8,masscart=1.0,masspole=0.1,totalmass=1.1,halflength=0.5,polemasslength=0.05,forcemag=10.0,tau=0.02,thetathreshold=0.20943952,xthreshold=2.4,max_steps=200)

julia> get_observation_space(env)
MultiContinuousSpace{Array{Float32,1}}(Float32[-4.8, -1.0f38, -0.41887903, -1.0f38], Float32[4.8, 1.0f38, 0.41887903, 1.0f38])

julia> rand(get_observation_space(env))
4-element Array{Float64,1}:
  2.005482937920169
 -9.852210595964816e37
  0.09814665975717973
  2.2353768849015113e37
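
A plausible cause (an assumption, not verified against the package source) is a generic rand fallback that ignores the space's element type and samples Float64. A type-preserving method could look roughly like the sketch below, assuming the two positional fields printed above are low and high:

using Random

function Base.rand(rng::AbstractRNG, s::MultiContinuousSpace{<:AbstractArray{T}}) where {T}
    # sample uniformly in [low, high] while keeping the element type T
    s.low .+ rand(rng, T, size(s.low)) .* (s.high .- s.low)
end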

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Make `Observation` mutable?

Making Observation mutable would allow hooks to modify observations on the fly, but I'm not sure it is a good design to do so.
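
For discussion, a minimal sketch of what a mutable variant might look like (the field names are illustrative assumptions, not the actual definition in RLBase):

mutable struct MutableObservation{R,T,S}
    reward::R
    terminal::T
    state::S
end

# a hook could then rewrite fields in place, e.g. obs.reward += bonus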

Question about observation/states

Hello,
I'm trying to write a simple env that detects up/down movements of an array stored in a DataFrame.
It is extremely easy in python-gym, but I have a question:
I don't understand how to return the observation. I have seen the guide and the CartPole example, but I can't figure it out.
Is "RLBase.get_state(env::AcrobotEnv)" equivalent to the observation returned by gym's step?

I have also noticed that the README lists GymEnv | PyCall.jl,
but there is no example of how to use it. (Can I just use a gym-wrapped environment here? Would it slow down performance? See the sketch below.)
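
For reference, a hedged usage sketch (GymEnv comes from the README; everything else assumes a working PyCall/gym installation and the usual RLBase calls):

using ReinforcementLearningEnvironments, ReinforcementLearningBase

env = GymEnv("CartPole-v0")                # wraps the Python env through PyCall
reset!(env)
obs = observe(env)                         # observation after reset
env(rand(get_action_space(env)))           # apply a random action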

I also wonder why the action space is missing in this example: https://juliareinforcementlearning.org/blog/how_to_write_a_customized_environment/

EDIT: I found this in an example:

RLBase.observe(env::MazeEnv) = (
    reward = Float64(env.position == env.goal),
    terminal = env.position == env.goal,
    state = (env.position[2] - 1) * env.NX + env.position[1],
)
But I still don't understand how the state works, or why .observe isn't defined in any of the environments.
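
To make the pattern concrete, here is a self-contained toy sketch of an up/down-detection env in the same observe style (every name below is a local illustration, not part of the package):

# reward 1.0 whenever the series moved up since the previous step
mutable struct UpDownEnv
    data::Vector{Float64}
    t::Int
end

observe(env::UpDownEnv) = (
    reward = env.t > 1 && env.data[env.t] > env.data[env.t-1] ? 1.0 : 0.0,
    terminal = env.t >= length(env.data),
    state = env.t,                  # here the state is just the time index
)

step!(env::UpDownEnv) = (env.t += 1; env)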

Issue while using Float32 as type for the MountainCar env

I am trying to use Float32 as the type for MountainCarEnv (i.e., passing the argument T = Float32) and I am running into an issue.

It seems that CartPoleEnv, which is very similar to MountainCarEnv, works perfectly with different types, so the problem probably comes from differences in how the two environments are defined:

MountainCarEnv: https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/blob/master/src/environments/classic_control/mountain_car.jl#L67

CartPoleEnv: https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/blob/master/src/environments/classic_control/cartpole.jl#L51

In CartPole, the parameters are directly converted to the chosen type, but I can't find a line doing the same in MountainCarEnv.

Am I the only one seeing this? I hope this observation helps fix the issue!
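
For illustration, a minimal sketch of the conversion pattern cartpole.jl appears to use (the struct and field names below are invented for the example, not the package's actual definitions):

struct CarParams{T}
    goal_position::T
    max_speed::T
    force::T
end

# converting every keyword argument with T(...) keeps the whole env in type T
car_params(; T = Float64, goal_position = 0.5, max_speed = 0.07, force = 0.001) =
    CarParams{T}(T(goal_position), T(max_speed), T(force))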

Reduce dependencies

ReinforcementLearningEnvironments depends on

  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [7d9fca2a] + Arpack v0.4.0
  [68821587] + Arpack_jll v3.5.0+2
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2
  [e66e0078] + CompilerSupportLibraries_jll v0.2.0+1
  [3a865a2d] + CuArrays v1.7.2
  [9a962f9c] + DataAPI v1.1.0
  [864edb3b] + DataStructures v0.17.10
  [31c24e10] + Distributions v0.22.5
  [1a297f60] + FillArrays v0.8.5
  [0c68f7d7] + GPUArrays v2.0.1
  [28b8d3ca] + GR v0.47.0
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [e1d29d7a] + Missings v0.4.3
  [872c559c] + NNlib v0.6.6
  [4536629a] + OpenBLAS_jll v0.3.7+6
  [efe28fd5] + OpenSpecFun_jll v0.5.3+2
  [bac558e1] + OrderedCollections v1.1.0
  [90014a1f] + PDMats v0.9.11
  [1fd47b50] + QuadGK v2.3.1
  [e575027e] + ReinforcementLearningBase v0.6.5
  [25e41dd2] + ReinforcementLearningEnvironments v0.2.3
  [ae029012] + Requires v1.0.1
  [79098fc4] + Rmath v0.6.1
  [f50d1b31] + Rmath_jll v0.2.2+0
  [a2af1166] + SortingAlgorithms v0.3.1
  [276daf66] + SpecialFunctions v0.10.0
  [2913bbd2] + StatsBase v0.32.2
  [4c63d2b9] + StatsFuns v0.9.4
  [a759f4b9] + TimerOutputs v0.5.3
  plus various standard-library packages.

It's not outrageous, but if the dependency on the GPU stack (CuArrays/CUDAnative) and NNlib could be taken out, it would make my package lighter...
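
One common way to trim such a tree (a sketch only; whether it fits this package's layout is an open question) is to load GPU support lazily with Requires.jl, which is already in the list above. The CuArrays UUID is completed here from the [3a865a2d] prefix shown in the list; verify it before use:

using Requires

function __init__()
    # only pull in GPU-specific code when the user has CuArrays installed
    @require CuArrays = "3a865a2d-5b23-5a0f-bc46-62713ec82fae" include("cuda_support.jl")
end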

Add OpenSpiel

It doesn't seem that complicated to support the newly released OpenSpiel, which uses pybind11 for its Python wrapper.
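
A first experiment could go through PyCall directly (a hedged sketch; load_game, new_initial_state, legal_actions, and apply_action are OpenSpiel's Python API, and a working pyspiel install is assumed):

using PyCall

pyspiel = pyimport("pyspiel")
game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()
state.apply_action(first(state.legal_actions()))  # play one legal move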

Warning: Package ReinforcementLearningEnvironments does not have PyCall in its dependencies

┌ Warning: Package ReinforcementLearningEnvironments does not have PyCall in its dependencies:
│ - If you have ReinforcementLearningEnvironments checked out for development and have
│   added PyCall as a dependency but haven't updated your primary
│   environment's manifest file, try `Pkg.resolve()`.
│ - Otherwise you may need to report an issue with ReinforcementLearningEnvironments
└ Loading PyCall into ReinforcementLearningEnvironments from project dependency, future warnings for ReinforcementLearningEnvironments are suppressed.

Something's not quite right with how the optional dependency on OpenAI gym is set up. PyCall is declared in [extras], so I'm not sure what the issue is. I'll make a PR if I find it.

Expectations on observation type

I'm thinking about making a bridge from POMDPs.jl to this package. Are there any expectations or requirements on the output of observe? Should it be an AbstractVector, for example?

Error while training a basic DQN in MountainCarEnv

Hi,

I am experiencing an error while trying to train a basic DQN on MountainCarEnv with a CircularArrayBuffer. It seems that some functions are not found by Julia: update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray), _buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}), and _buffer_frame(cb::CircularArrayBuffer, i::Int).

I managed to run my code by manually adding those functions to my Julia file, copying them from ReinforcementLearningCore.jl/src/utils/base.jl:

@inline function _buffer_frame(cb::CircularArrayBuffer, i::Int)
    # translate the logical frame index i into a physical index,
    # wrapping around the end of the underlying buffer
    n = capacity(cb)
    idx = cb.first + i - 1
    if idx > n
        idx - n
    else
        idx
    end
end

# vectorized version: translate a whole vector of logical indices
_buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}) = map(i -> _buffer_frame(cb, i), I)

function RL.update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray) where {T,N}
    # overwrite the most recently pushed frame with `data`
    select_last_dim(cb.buffer, _buffer_frame(cb, cb.length)) .= data
    cb
end

The strange thing is that I do not experience this issue when I run the exact same code using CartPoleEnv.

I am having trouble understanding what is happening, and I hope this report helps point out something to improve in the source code.

Support CommonRLInterface

Though all the environments here are written against ReinforcementLearningBase.jl, it would be relatively easy to also support the minimal interface provided by CommonRLInterface:

struct CommonEnvWrapper{T<:RLBase.AbstractEnv} <: CommonRL.AbstractCommonEnv
    env::T
end

Base.convert(::Type{CommonRL.AbstractCommonEnv}, env::RLBase.AbstractEnv) = CommonEnvWrapper(env)

function CommonRL.step!(env::CommonEnvWrapper, action)
    env.env(action)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

function CommonRL.reset!(env::CommonEnvWrapper)
    RLBase.reset!(env.env)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

CommonRL.actions(env::CommonEnvWrapper) = RLBase.get_action_space(env.env)
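
A hypothetical usage of the wrapper, just to illustrate the intended flow (CartPoleEnv stands in for any RLBase environment):

cenv = convert(CommonRL.AbstractCommonEnv, CartPoleEnv())
CommonRL.reset!(cenv)
a = rand(CommonRL.actions(cenv))
s, r, done, obs = CommonRL.step!(cenv, a)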

What do you think? @zsunberg

Should `seed!` be part of the interface?

Should seed! be part of this interface to allow for easily reproducing exact trajectories? Or is there another standard way of seeding the environment's rng?
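
One convention that would work (an assumption for discussion, not anything the package currently promises) is for each env to carry its own rng field that seed! reseeds:

using Random

# assumes the env stores a dedicated rng, e.g. env.rng::MersenneTwister
function seed!(env, s)
    Random.seed!(env.rng, s)
    env
end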
