Comments (3)

JesseFarebro avatar JesseFarebro commented on July 28, 2024

There's currently no way to do this. You'd have to implement sticky actions yourself on top of the environment. Out of curiosity, why do you want to do this?

from atari-py.

arjung128 avatar arjung128 commented on July 28, 2024

I understand this may not be a native feature, but are you aware of any way I could modify the Gym code directly to do something like this?

I am trying to train a model to extract actions from sequences of frames. The offline RL datasets I have been using were generated with `repeat_action_probability = 0.25`, meaning that 25% of the ground-truth action labels in the datasets are garbage. So I want to regenerate the exact same data, but with knowledge of the action that was actually executed. Having the exact same data (frame by frame, not just the same protocol) together with the executed actions would let me use these standard offline RL datasets to benchmark my model.

JesseFarebro avatar JesseFarebro commented on July 28, 2024

So the short answer is yes, and you can do it easily with environment wrappers. Check out some of the built-in wrappers here: https://github.com/openai/gym/tree/master/gym/wrappers. You could just as easily implement this on top of Gym without wrappers: do the sticky-action sampling yourself when interacting with the environment.
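To make the "do the sampling yourself" route concrete, here is a minimal sketch. It is not part of Gym or atari-py: the class name `StickyActionRecorder` and the `chosen_action` / `executed_action` info keys are made up for illustration. It assumes the wrapped environment exposes Gym-style `reset()` and `step(action)`, returns the info dict as the last element of the step tuple, and has its own sticky actions turned off (`repeat_action_probability = 0`) so the repeat isn't applied twice.

```python
import random


class StickyActionRecorder:
    """Apply sticky actions outside the emulator and record what was executed.

    Hypothetical helper, not a Gym API. Assumes `env` has Gym-style
    reset()/step() and that its built-in sticky actions are disabled.
    """

    def __init__(self, env, repeat_action_probability=0.25, seed=None):
        self.env = env
        self.p = repeat_action_probability
        self.rng = random.Random(seed)  # private RNG so runs are reproducible
        self.last_action = None

    def reset(self, *args, **kwargs):
        # No previous action exists at the start of an episode.
        self.last_action = None
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        # With probability p, repeat the previously executed action
        # instead of the one the agent just chose.
        if self.last_action is not None and self.rng.random() < self.p:
            executed = self.last_action
        else:
            executed = action
        self.last_action = executed

        result = self.env.step(executed)

        # Copy the info dict (last element of the step tuple) and attach
        # both the agent's choice and the action actually sent to the env.
        info = dict(result[-1])
        info["chosen_action"] = action
        info["executed_action"] = executed
        return (*result[:-1], info)
```

Two caveats. This draws the sticky-action coin once per agent step, whereas the ALE applies it per emulator frame, so with frame skipping the two are not equivalent. It also uses its own random stream, so it will reproduce the same protocol as an existing dataset but not the same frames.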

I still don't fully understand your use case. If you have an offline RL dataset, the actions in the dataset are presumably the agent's chosen actions, not the actions the environment actually executed, so I don't see how 25% of your action labels are garbage. If sticky actions were enabled when the dataset was generated, then perhaps 25% of your transitions don't correspond to the ground-truth emulation of the recorded action, i.e., S_{t+1} is a stochastic function of S_t and A_t because of sticky actions, which are the only source of stochasticity in the ALE. Do you really want to remove this stochasticity? Generally, it should be treated as part of the environment dynamics.
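For reference, the distinction between chosen and executed actions can be written out as follows. The symbols $E_t$ (executed action), $f$ (the deterministic emulator step), and $\varsigma$ (the repeat probability, 0.25 here) are notation introduced for this sketch, and the model is simplified to one draw per agent step rather than one per emulator frame.

```latex
E_t =
\begin{cases}
E_{t-1} & \text{with probability } \varsigma, \\
A_t     & \text{with probability } 1 - \varsigma,
\end{cases}
\qquad
S_{t+1} = f(S_t, E_t)
```

The dataset stores $A_t$, while the emulator only ever sees $E_t$. Under this simplified model $P(E_t \neq A_t) \le \varsigma$, with equality only if the agent never chooses the same action as the one being repeated, so 25% is an upper bound on the fraction of mismatched labels rather than the exact figure.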

If you don't want any form of stochasticity, you could just generate the dataset with sticky actions disabled.
