
ReinforcementLearningEnvironments.jl's Issues

DefaultStateStyleEnv doesn't work in TicTacToeEnv

The DefaultStateStyleEnv wrapper seems to have no effect on the state() function.
It changes the return value of DefaultStateStyle(E) properly, but state() later falls back to the first(ss) state style.

How to reproduce:

using ReinforcementLearning

env = TicTacToeEnv()

s_env = state(env)
println(s_env)

E = DefaultStateStyleEnv{Observation{Int}()}(env)  # error probably somewhere here
println("DefaultStateStyle(E) is ", DefaultStateStyle(E), " so it has changed")

s = state(E)
println("But state(E) return string: ", s)


s_E_int = state(E, Observation{Int}())
s_env_int = state(env, Observation{Int}())
println("These work properly: ", s_E_int, s_env_int)

returns:

...
...
...

DefaultStateStyle(E) is Observation{Int64}() so it has changed
But state(E) returns a string: ...
...
...

These work properly: 1 1
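
A possible fix (a sketch only; it assumes DefaultStateStyleEnv stores the style as its first type parameter, as the constructor above suggests, and that the wrapped env lives in a field named env) would be to make the unqualified state call dispatch on that parameter:

RLBase.state(env::DefaultStateStyleEnv{S}) where {S} =
    state(env.env, S)  # forward to the inner env with the declared default style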

Incorrect type when sampling from Spaces

Hi, I was trying to sample random observations for the CartPole env. I am specifying the type to be Float32 during initialization, but the returned observations are Float64.

Here is an MWE:

julia> using ReinforcementLearningEnvironments, ReinforcementLearningBase

julia> env = CartPoleEnv(T=Float32)
CartPoleEnv{Float32}(gravity=9.8,masscart=1.0,masspole=0.1,totalmass=1.1,halflength=0.5,polemasslength=0.05,forcemag=10.0,tau=0.02,thetathreshold=0.20943952,xthreshold=2.4,max_steps=200)

julia> get_observation_space(env)
MultiContinuousSpace{Array{Float32,1}}(Float32[-4.8, -1.0f38, -0.41887903, -1.0f38], Float32[4.8, 1.0f38, 0.41887903, 1.0f38])

julia> rand(get_observation_space(env))
4-element Array{Float64,1}:
  2.005482937920169
 -9.852210595964816e37
  0.09814665975717973
  2.2353768849015113e37
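
A plausible cause (an assumption, not verified against the package source) is a generic rand fallback that ignores the space's element type and samples Float64. A type-preserving method could look roughly like the sketch below, assuming the two positional fields printed above are low and high:

using Random

function Base.rand(rng::AbstractRNG, s::MultiContinuousSpace{<:AbstractArray{T}}) where {T}
    # sample uniformly in [low, high] while keeping the element type T
    s.low .+ rand(rng, T, size(s.low)) .* (s.high .- s.low)
end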

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Make `Observation` mutable?

Making Observation mutable would allow hooks to modify observations on the fly, but I'm not sure it is a good design to do so.
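
For discussion, a minimal sketch of what a mutable variant might look like (the field names are illustrative assumptions, not the actual definition in RLBase):

mutable struct MutableObservation{R,T,S}
    reward::R
    terminal::T
    state::S
end

# a hook could then rewrite fields in place, e.g. obs.reward += bonus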

Question about observation/states

Hello,
I'm trying to write a simple env that detects up/down movements of an array stored in a DataFrame.
It is extremely easy in python-gym, but I have a question:
I don't understand how to return the observation. I have seen the guide and the CartPole example, but I can't figure it out.
Is "RLBase.get_state(env::AcrobotEnv)" equivalent to the observation returned by gym's step?

I have also noticed that the README lists GymEnv | PyCall.jl,
but there is no example of how to use it. (Can I just use a gym-wrapped environment here? Would it slow down performance? See the sketch below.)
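
For reference, a hedged usage sketch (GymEnv comes from the README; everything else assumes a working PyCall/gym installation and the usual RLBase calls):

using ReinforcementLearningEnvironments, ReinforcementLearningBase

env = GymEnv("CartPole-v0")                # wraps the Python env through PyCall
reset!(env)
obs = observe(env)                         # observation after reset
env(rand(get_action_space(env)))           # apply a random action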

I also wonder why the action space is missing in this example: https://juliareinforcementlearning.org/blog/how_to_write_a_customized_environment/

EDIT: I found this in an example:

RLBase.observe(env::MazeEnv) = (
    reward = Float64(env.position == env.goal),
    terminal = env.position == env.goal,
    state = (env.position[2] - 1) * env.NX + env.position[1],
)
But I still don't understand how the state works, or why .observe isn't defined in any of the environments.
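
To make the pattern concrete, here is a self-contained toy sketch of an up/down-detection env in the same observe style (every name below is a local illustration, not part of the package):

# reward 1.0 whenever the series moved up since the previous step
mutable struct UpDownEnv
    data::Vector{Float64}
    t::Int
end

observe(env::UpDownEnv) = (
    reward = env.t > 1 && env.data[env.t] > env.data[env.t-1] ? 1.0 : 0.0,
    terminal = env.t >= length(env.data),
    state = env.t,                  # here the state is just the time index
)

step!(env::UpDownEnv) = (env.t += 1; env)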

Issue while using Float32 as type for the MountainCar env

I am trying to use Float32 as the type for MountainCarEnv (i.e., passing the argument T = Float32) and I am running into an issue.

It seems that CartPoleEnv, which is very similar to MountainCarEnv, works perfectly with different types, so the problem probably comes from differences in how the two environments are defined:

MountainCarEnv: https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/blob/master/src/environments/classic_control/mountain_car.jl#L67

CartPoleEnv: https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/blob/master/src/environments/classic_control/cartpole.jl#L51

In CartPole, the parameters are directly converted to the chosen type, but I can't find a line doing the same in MountainCarEnv.

Am I the only one seeing this? I hope this observation helps fix the issue!
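
For illustration, a minimal sketch of the conversion pattern cartpole.jl appears to use (the struct and field names below are invented for the example, not the package's actual definitions):

struct CarParams{T}
    goal_position::T
    max_speed::T
    force::T
end

# converting every keyword argument with T(...) keeps the whole env in type T
car_params(; T = Float64, goal_position = 0.5, max_speed = 0.07, force = 0.001) =
    CarParams{T}(T(goal_position), T(max_speed), T(force))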

Reduce dependencies

ReinforcementLearningEnvironments depends on

  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [7d9fca2a] + Arpack v0.4.0
  [68821587] + Arpack_jll v3.5.0+2
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2
  [e66e0078] + CompilerSupportLibraries_jll v0.2.0+1
  [3a865a2d] + CuArrays v1.7.2
  [9a962f9c] + DataAPI v1.1.0
  [864edb3b] + DataStructures v0.17.10
  [31c24e10] + Distributions v0.22.5
  [1a297f60] + FillArrays v0.8.5
  [0c68f7d7] + GPUArrays v2.0.1
  [28b8d3ca] + GR v0.47.0
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [e1d29d7a] + Missings v0.4.3
  [872c559c] + NNlib v0.6.6
  [4536629a] + OpenBLAS_jll v0.3.7+6
  [efe28fd5] + OpenSpecFun_jll v0.5.3+2
  [bac558e1] + OrderedCollections v1.1.0
  [90014a1f] + PDMats v0.9.11
  [1fd47b50] + QuadGK v2.3.1
  [e575027e] + ReinforcementLearningBase v0.6.5
  [25e41dd2] + ReinforcementLearningEnvironments v0.2.3
  [ae029012] + Requires v1.0.1
  [79098fc4] + Rmath v0.6.1
  [f50d1b31] + Rmath_jll v0.2.2+0
  [a2af1166] + SortingAlgorithms v0.3.1
  [276daf66] + SpecialFunctions v0.10.0
  [2913bbd2] + StatsBase v0.32.2
  [4c63d2b9] + StatsFuns v0.9.4
  [a759f4b9] + TimerOutputs v0.5.3
  plus various standard-library packages.

It's not outrageous, but if the dependency on the GPU stack (CuArrays/CUDAnative) and NNlib could be taken out, it would make my package lighter...
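
One common way to trim such a tree (a sketch only; whether it fits this package's layout is an open question) is to load GPU support lazily with Requires.jl, which is already in the list above. The CuArrays UUID is completed here from the [3a865a2d] prefix shown in the list; verify it before use:

using Requires

function __init__()
    # only pull in GPU-specific code when the user has CuArrays installed
    @require CuArrays = "3a865a2d-5b23-5a0f-bc46-62713ec82fae" include("cuda_support.jl")
end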

Add OpenSpiel

It doesn't seem that complicated to support the newly released OpenSpiel, which uses pybind11 for its Python wrapper.
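
A first experiment could go through PyCall directly (a hedged sketch; load_game, new_initial_state, legal_actions, and apply_action are OpenSpiel's Python API, and a working pyspiel install is assumed):

using PyCall

pyspiel = pyimport("pyspiel")
game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()
state.apply_action(first(state.legal_actions()))  # play one legal move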

Warning: Package ReinforcementLearningEnvironments does not have PyCall in its dependencies

┌ Warning: Package ReinforcementLearningEnvironments does not have PyCall in its dependencies:
│ - If you have ReinforcementLearningEnvironments checked out for development and have
│   added PyCall as a dependency but haven't updated your primary
│   environment's manifest file, try `Pkg.resolve()`.
│ - Otherwise you may need to report an issue with ReinforcementLearningEnvironments
└ Loading PyCall into ReinforcementLearningEnvironments from project dependency, future warnings for ReinforcementLearningEnvironments are suppressed.

Something's not quite right with how the optional dependency on OpenAI gym is set up. PyCall is declared in [extras], so I'm not sure what the issue is. I'll make a PR if I find it.

Expectations on observation type

I'm thinking about making a bridge from POMDPs.jl to this package. Are there any expectations or requirements on the output of observe? Should it be an AbstractVector, for example?

Error while training a basic DQN in MountainCarEnv

Hi,

I am experiencing an error while trying to train a basic DQN on MountainCarEnv with a CircularArrayBuffer. It seems that some functions are not found by Julia: update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray), _buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}), and _buffer_frame(cb::CircularArrayBuffer, i::Int).

I managed to run my code by manually adding those functions to my Julia file, copying them from ReinforcementLearningCore.jl/src/utils/base.jl:

@inline function _buffer_frame(cb::CircularArrayBuffer, i::Int)
    # translate the logical frame index i into a physical index,
    # wrapping around the end of the underlying buffer
    n = capacity(cb)
    idx = cb.first + i - 1
    if idx > n
        idx - n
    else
        idx
    end
end

# vectorized version: translate a whole vector of logical indices
_buffer_frame(cb::CircularArrayBuffer, I::Vector{Int}) = map(i -> _buffer_frame(cb, i), I)

function RL.update!(cb::CircularArrayBuffer{T,N}, data::AbstractArray) where {T,N}
    # overwrite the most recently pushed frame with `data`
    select_last_dim(cb.buffer, _buffer_frame(cb, cb.length)) .= data
    cb
end

The strange thing is that I do not experience this issue when I run the exact same code using CartPoleEnv.

I am having trouble understanding what is happening, and I hope this report helps point out something to improve in the source code.

Support CommonRLInterface

Though all the environments here are written against ReinforcementLearningBase.jl, it would be relatively easy to also support the minimal interface provided by CommonRLInterface:

struct CommonEnvWrapper{T<:RLBase.AbstractEnv} <: CommonRL.AbstractCommonEnv
    env::T
end

Base.convert(::Type{CommonRL.AbstractCommonEnv}, env::RLBase.AbstractEnv) = CommonEnvWrapper(env)

function CommonRL.step!(env::CommonEnvWrapper, action)
    env.env(action)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

function CommonRL.reset!(env::CommonEnvWrapper)
    RLBase.reset!(env.env)
    obs = RLBase.observe(env.env)
    RLBase.get_state(obs), RLBase.get_reward(obs), RLBase.get_terminal(obs), obs
end

CommonRL.actions(env::CommonEnvWrapper) = RLBase.get_action_space(env.env)
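
A hypothetical usage of the wrapper, just to illustrate the intended flow (CartPoleEnv stands in for any RLBase environment):

cenv = convert(CommonRL.AbstractCommonEnv, CartPoleEnv())
CommonRL.reset!(cenv)
a = rand(CommonRL.actions(cenv))
s, r, done, obs = CommonRL.step!(cenv, a)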

What do you think? @zsunberg

Should `seed!` be part of the interface?

Should seed! be part of this interface to allow for easily reproducing exact trajectories? Or is there another standard way of seeding the environment's rng?
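
One convention that would work (an assumption for discussion, not anything the package currently promises) is for each env to carry its own rng field that seed! reseeds:

using Random

# assumes the env stores a dedicated rng, e.g. env.rng::MersenneTwister
function seed!(env, s)
    Random.seed!(env.rng, s)
    env
end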
