Comments (8)
As far as I can tell, we reach this point in two cases:
- Adding an item to the buffer where the new item has an inconsistent structure compared to the already existing structure and
- updating an existing item with an item that has an inconsistent structure compared to the already existing structure.
I think the main question here is whether these operations should be allowed in the first place. In the RL context this probably only happens when the info dict changes its structure between steps (as happens in MoveToRightEnv in the tests). One could argue that the env is ill-defined in this case and that, instead of setting arbitrary default values, the env should be fixed?
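To make "inconsistent structure" concrete, here is a minimal sketch in plain Python. It does not use tianshou's Batch; the helper names `structure` and `same_structure` and the info keys are made up for illustration only.

```python
def structure(value):
    """Describe the nested key layout of a dict, ignoring the leaf values."""
    if isinstance(value, dict):
        return {key: structure(sub) for key, sub in value.items()}
    return None  # any non-dict value counts as a leaf


def same_structure(a, b):
    """True if both values have the same nested keys (hypothetical helper)."""
    return structure(a) == structure(b)


# Hypothetical info dicts from two consecutive steps of an env.
info_step_1 = {"dist": 3}
info_step_2 = {"dist": 2, "done_reason": "goal"}

keys_match_same_step = same_structure(info_step_1, {"dist": 7})
keys_match_across_steps = same_structure(info_step_1, info_step_2)
```

Here `keys_match_across_steps` is False because the second step adds a key. That is the situation in which a buffer would have to invent a default value for the earlier entries.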
from tianshou.
Regarding your suggestion to set values to NaN instead, assigning np.nan will fail on arrays whose dtype is not float (as is the case in the above mentioned test).
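A small NumPy-only sketch of that failure, assuming an integer array (the variable names are just for illustration):

```python
import numpy as np

# Integer arrays have no representation for NaN, so the assignment raises.
int_values = np.zeros(3, dtype=np.int64)
try:
    int_values[0] = np.nan
    nan_assignment_failed = False
except ValueError:
    nan_assignment_failed = True

# The same assignment works on a float array.
float_values = np.zeros(3, dtype=np.float64)
float_values[0] = np.nan
```

So NaN as a universal "missing" marker would only work if every affected array were float, or were converted to float first.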
from tianshou.
@maxhuettenrauch unfortunately, in the RL context this is bound to happen because some things might not be known at reset that are known at step, and thus will contain None entries. The prime example is the action, which is not available at reset, but some entries from the info might also be missing.
from tianshou.
This issue is strongly related to #1087
from tianshou.
But when would you add obs directly after reset to the buffer (where a Batch object is created) and only after that retrieve an action, call step, and append the rest to this entry?
from tianshou.
I thought it might happen in the collectors, but maybe I'm mistaken.
For sure I've seen this happen with info objects somewhere in collector tests - though there it is a bad implementation of the env.
Unfortunately Gymnasium doesn't force any interface on the info dicts, which are the main drivers of this problem. We can stop supporting such cases, or ask the user to specify explicitly what should happen then.
I am all for restricting the number of supported operations to decrease complexity and probability of errors :)
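As a rough sketch of what "specify explicitly" could look like: the user declares the full set of info keys with defaults up front, and anything outside that set is rejected instead of being silently filled in. `normalize_info` and the keys below are hypothetical and not part of tianshou or Gymnasium.

```python
def normalize_info(info, defaults):
    """Return info with every declared key present; reject undeclared keys.

    Hypothetical helper: `defaults` maps each allowed key to the value
    to use when the env did not provide it.
    """
    unknown = set(info) - set(defaults)
    if unknown:
        raise KeyError(f"unexpected info keys: {sorted(unknown)}")
    # Values from the env override the declared defaults.
    return {**defaults, **info}


# Assumed schema for illustration only.
info_defaults = {"dist": -1, "done_reason": ""}

info_at_reset = normalize_info({}, info_defaults)
info_at_step = normalize_info({"dist": 2}, info_defaults)
```

With something like this applied before data reaches the buffer, every entry has the same keys, and an env that emits an undeclared key fails loudly rather than triggering arbitrary defaults.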
from tianshou.
Yes, definitely happening in the collector tests, due to said bad env design. I'm gonna check some examples with standard mujoco envs.
from tianshou.
Related: Farama-Foundation/Gymnasium#540
from tianshou.