Comments (2)

findmyway commented on June 30, 2024

Design

The original design in gym-minigrid leans heavily on OOP. Translating that code directly into Julia would make it ugly and hard to read, so I propose the following design.

Split an environment into three layers

Feature layer

A feature layer describes what's in each grid cell. For space efficiency, I'd recommend using a BitArray{3} of size (n_objects, height, width) to represent the grid. Usually, the number of distinct object types will not be too large.
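A minimal sketch of such a feature layer, assuming four hypothetical object types (`EMPTY`, `WALL`, `AGENT`, `GOAL`) and a helper `make_feature_layer` that is not part of any existing package:

```julia
# Hypothetical object indices along the first dimension (not taken from gym-minigrid).
const EMPTY, WALL, AGENT, GOAL = 1, 2, 3, 4
const N_OBJECTS = 4

# Build a (n_objects, height, width) feature layer with walls on the border.
function make_feature_layer(height::Int, width::Int)
    layer = falses(N_OBJECTS, height, width)   # BitArray{3}, one bit per entry
    layer[EMPTY, :, :] .= true
    for idx in CartesianIndices((height, width))
        i, j = Tuple(idx)
        if i == 1 || i == height || j == 1 || j == width
            layer[EMPTY, i, j] = false
            layer[WALL, i, j] = true
        end
    end
    return layer
end

layer = make_feature_layer(5, 6)
# Place the agent at cell (2, 2).
layer[EMPTY, 2, 2] = false
layer[AGENT, 2, 2] = true
```

Because each entry is a single bit, a query such as `layer[WALL, i, j]` stays cheap even for large grids.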

In gym-minigrid, objects may have several states (:open, :closed, :locked) as well as colors. To represent this kind of object, we need more bits in the one-hot representation.
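One possible way to spend those extra bits, sketched under the assumption that every (object, state) pair gets its own slice. The `SLICES` table, `slice_index`, and `set_door_state!` are hypothetical names used only for illustration:

```julia
# Hypothetical layout: each (object, state) pair gets its own slice in the first dimension.
const SLICES = [(:wall, :none), (:door, :open), (:door, :closed), (:door, :locked)]

# Look up the slice index of an (object, state) pair.
slice_index(object::Symbol, state::Symbol) = findfirst(==((object, state)), SLICES)

# Change a door's state by clearing its old slice and setting the new one.
function set_door_state!(layer::BitArray{3}, pos::CartesianIndex{2}, old::Symbol, new::Symbol)
    layer[slice_index(:door, old), pos] = false
    layer[slice_index(:door, new), pos] = true
    return layer
end
```

Colors could be handled the same way, at the cost of one slice per (object, state, color) combination.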

To support the partial view of an agent, we need to remember the current direction of the agent. One approach is to encode the current direction into the representation of the agent, so we need four bits to represent an agent. Another approach is to save the direction in the runtime logic layer as part of the state.
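The first approach could look roughly like this. The sketch assumes the agent occupies the first four slices of the feature layer, one per direction; the `DIRECTIONS` ordering and the `turn_right!` helper are hypothetical:

```julia
# Hypothetical layout: slices 1:4 hold the agent facing each direction.
const DIRECTIONS = [:right, :down, :left, :up]

# Index of the agent bit for a given direction.
agent_bit(dir::Symbol) = findfirst(==(dir), DIRECTIONS)

# Turn the agent at `pos` clockwise by clearing one bit and setting the next.
function turn_right!(grid::BitArray{3}, pos::CartesianIndex{2})
    current = findfirst(grid[1:4, pos])   # which direction bit is set, if any
    current === nothing && error("no agent at $pos")
    grid[current, pos] = false
    grid[mod1(current + 1, 4), pos] = true
    return grid
end
```

With the second approach, the agent needs only one bit and the direction becomes an ordinary field in the environment's state.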

Instead of using a BitArray{3} for the feature layer, an Array{Int8,3} can also be used to store (object_id, color, state) for each grid cell. This might be necessary if we want to stay aligned with the environments in gym-minigrid when reproducing results from published papers.
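A rough sketch of the dense alternative. The id tables below are placeholders chosen for illustration; gym-minigrid defines its own tables, which are not reproduced here, and `place!` is a hypothetical helper:

```julia
# Placeholder ids for illustration only; these are NOT gym-minigrid's actual values.
const OBJECT_IDS = Dict(:empty => Int8(0), :wall => Int8(1), :door => Int8(2))
const COLOR_IDS  = Dict(:none => Int8(0), :red => Int8(1), :blue => Int8(2))
const STATE_IDS  = Dict(:open => Int8(0), :closed => Int8(1), :locked => Int8(2))

# Dense layout: cells[:, i, j] == (object_id, color, state) for cell (i, j).
cells = zeros(Int8, 3, 4, 4)

# Write one (object, color, state) triple into the cell at `pos`.
function place!(cells::Array{Int8,3}, pos::CartesianIndex{2},
                object::Symbol, color::Symbol, state::Symbol)
    cells[:, pos] = [OBJECT_IDS[object], COLOR_IDS[color], STATE_IDS[state]]
    return cells
end

place!(cells, CartesianIndex(2, 3), :door, :red, :locked)
```

This layout can hold only one object per cell, which is the main trade-off against the BitArray{3} version.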

Runtime logic layer

Different environments have their own runtime logic. Ideally, each action leads to a modification of the feature layer.

The most difficult part when coding this is translating those modifications into bit operations. I'd suggest the following approach. Other suggestions are welcome.

# low level
set!(::CustomEnv, feature_layer, pos::CartesianIndex{2}, AGENT)
reset!(::CustomEnv, feature_layer, pos::CartesianIndex{2}, AGENT)
# other useful functions like `clear!`

# high level
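function move!(env::CustomEnv, feature_layer, obj, src::CartesianIndex{2}, dest::CartesianIndex{2})
    set!(env, feature_layer, dest, obj)   # mark obj as present at dest
    reset!(env, feature_layer, src, obj)  # clear obj from src
end
# other functions like `open_door!`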
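
The outline above can be filled in as a self-contained sketch, assuming the BitArray{3} feature layer described earlier. The empty `CustomEnv` struct and the `AGENT_IDX` constant are placeholders, not an existing API:

```julia
# Placeholder environment type and object index, used only for illustration.
struct CustomEnv end
const AGENT_IDX = 1

# Low level: flip the single bit for `obj` at `pos`.
function set!(::CustomEnv, feature_layer::BitArray{3}, pos::CartesianIndex{2}, obj::Int)
    feature_layer[obj, pos] = true
    return feature_layer
end

function reset!(::CustomEnv, feature_layer::BitArray{3}, pos::CartesianIndex{2}, obj::Int)
    feature_layer[obj, pos] = false
    return feature_layer
end

# High level: a move is a set! at the destination followed by a reset! at the source.
function move!(env::CustomEnv, feature_layer::BitArray{3}, obj::Int,
               src::CartesianIndex{2}, dest::CartesianIndex{2})
    feature_layer[obj, src] || error("object $obj is not at $src")
    src == dest && return feature_layer   # nothing to do, and avoids clearing the bit
    set!(env, feature_layer, dest, obj)
    reset!(env, feature_layer, src, obj)
    return feature_layer
end
```

Dispatching on the environment type leaves room for each environment to override the low-level functions, for example to reject moves into walls.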

Visualization layer

For visualization, I'd recommend using Makie.jl. Its biggest advantage is that interaction is very easy, which is useful for debugging and friendlier for human play.

Generally, most of the work in the visualization layer is deciding what to show in each grid cell (for example, if a key is in a box, you should show only the color of the box, not the key). In some cases, we may also take the state info into account.
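
The "what to show" decision can be sketched without any plotting code as a priority lookup per cell. The `DRAW_ORDER` table and both helper functions are hypothetical, and the actual drawing with Makie.jl is left out:

```julia
# Hypothetical object indices, listed from highest to lowest drawing priority.
const DRAW_ORDER = [(:agent, 1), (:box, 2), (:key, 3), (:empty, 4)]

# Return the name of the object that should be drawn for one cell.
function visible_object(layer::BitArray{3}, pos::CartesianIndex{2})
    for (name, idx) in DRAW_ORDER
        layer[idx, pos] && return name
    end
    return :empty
end

# Map every cell to the object a renderer would draw.
render_symbols(layer::BitArray{3}) =
    [visible_object(layer, CartesianIndex(i, j)) for i in axes(layer, 2), j in axes(layer, 3)]
```

A cell holding both a box and a key resolves to `:box`, matching the key-in-a-box example.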

Integration with ReinforcementLearningEnvironments.jl

This part should be independent of the implementation of the grid environments. We can have many wrappers for things like the maximum allowed steps, the reward score, and so on.
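
A minimal sketch of one such wrapper, assuming a made-up interface in which an environment only needs `step!` and `is_done`. This is not the ReinforcementLearningEnvironments.jl API, and `CountingEnv` is a toy stand-in for a grid environment:

```julia
# Made-up minimal interface: an environment is anything with `step!` and `is_done`.
# This is NOT the ReinforcementLearningEnvironments.jl API.
mutable struct CountingEnv
    position::Int
end
step!(env::CountingEnv, action::Int) = (env.position += action; env)
is_done(env::CountingEnv) = env.position >= 10

# Wrapper that ends an episode after `max_steps`, regardless of the inner environment.
mutable struct MaxStepsWrapper{E}
    env::E
    max_steps::Int
    steps::Int
end
MaxStepsWrapper(env, max_steps::Int) = MaxStepsWrapper(env, max_steps, 0)

function step!(w::MaxStepsWrapper, action)
    step!(w.env, action)
    w.steps += 1
    return w
end
is_done(w::MaxStepsWrapper) = w.steps >= w.max_steps || is_done(w.env)
```

Since the wrapper touches the inner environment only through `step!` and `is_done`, it does not need to know how the grid is stored.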

Generality

When implementing the environments from gym-minigrid, we should keep generality in mind, so that other environments like Maze and boxoban https://github.com/JuliaReinforcementLearning/ReinforcementLearningEnvironments.jl/issues/54 can reuse components in this package as much as possible.

Without loss of generality, we may also support 3-D or higher-dimensional environments and multi-agent environments. See also https://github.com/JuliaReinforcementLearning/SnakeGames.jl

Unfortunately, I won't have much time to work on this in the near future. Ping @sriram13m, @sriyash421 if you'd like to take on the challenge. I'm happy to help as much as I can.


sriyash421 commented on June 30, 2024

@findmyway I would be happy to take this up.


Add a general grid world environment about reinforcementlearningenvironments.jl HOT 2 CLOSED
