Comments (5)
The data points are stored in trajectory order, so obs[t+1] is the next state after obs[t]. However, MuJoCo environments do not return terminal=True when a trajectory ends due to a timeout. This induces non-Markovian dependencies, because the reward-to-go then depends on how many steps remain in the episode.
We have provided a sample loading function here: https://github.com/rail-berkeley/d4rl_evaluations/blob/master/bear/examples/bear_hdf5_d4rl.py#L18
(note the for loop, which distinguishes termination due to a timeout from termination due to a terminal flag).
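For readers who cannot open the link, here is a minimal sketch of that kind of loop. It is not the linked function itself. It assumes the dataset is a dict of NumPy arrays with the keys observations, actions, rewards and terminals, and that every episode either ends with terminals=True or is cut off after a fixed max_episode_steps. The name build_transitions is hypothetical.

```python
import numpy as np


def build_transitions(dataset, max_episode_steps):
    """Hypothetical helper: pair obs[t] with obs[t+1] without crossing timeouts.

    Assumes every episode either ends with terminals[t] == True or is cut
    off after exactly max_episode_steps steps.
    """
    obs = np.asarray(dataset["observations"])
    actions = np.asarray(dataset["actions"])
    rewards = np.asarray(dataset["rewards"])
    terminals = np.asarray(dataset["terminals"], dtype=bool)

    keep = []
    episode_step = 0
    for t in range(len(obs) - 1):
        done = terminals[t]
        timed_out = episode_step == max_episode_steps - 1
        if timed_out and not done:
            # obs[t+1] is the first state of the next episode, and there is
            # no terminal flag to mask the bootstrap target, so drop it.
            episode_step = 0
            continue
        keep.append(t)
        # A true terminal is kept: its flag marks the transition as final.
        episode_step = 0 if done else episode_step + 1

    keep = np.asarray(keep, dtype=np.int64)
    return {
        "observations": obs[keep],
        "actions": actions[keep],
        "next_observations": obs[keep + 1],
        "rewards": rewards[keep],
        "terminals": terminals[keep],
    }
```

With two 3-step episodes stored back to back and no terminal flags, this keeps the transitions starting at indices 0, 1, 3 and 4, and drops index 2, whose successor belongs to the second episode.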
from d4rl.
I am closing this issue for now, but let us know if there are any concerns.
We're adding a new function in #36
It seems that next_observations is not correctly aligned with observations. For example, when I load the hopper-expert-v0 dataset with "dataset = d4rl.qlearning_dataset(env)", I get:
In [42]: dataset['observations'][1306]
Out[42]:
array([ 1.1960315 , -0.12238141, -0.27723378, -0.1835278 , 0.7594525 ,
        3.751946 , -1.1062441 , 0.8567439 , -2.6299708 , 0.46845496,
       -2.8050947 ], dtype=float32)

In [43]: dataset['next_observations'][1306]
Out[43]:
array([ 1.2480359e+00, -5.7157112e-04, -2.6452148e-03, -3.2997034e-03,
       -2.6625939e-04, -1.1605565e-03, 1.4991794e-03, 4.3500989e-04,
       -4.7029392e-03, -6.1305630e-04, -1.0700644e-03], dtype=float32)

In [45]: dataset['observations'][308]
Out[45]:
array([ 1.2471584 , -0.15857491, -0.5438693 , 0.00728944, 0.743322 ,
        2.6159544 , -2.4281225 , -0.06683858, 0.5487718 , -0.1945495 ,
       -3.872534 ], dtype=float32)

In [46]: dataset['next_observations'][308]
Out[46]:
array([ 1.2462358e+00, -3.1487644e-03, 5.2377465e-04, -3.9951322e-03,
        2.3823448e-03, 3.3964110e-03, 1.5305470e-03, -1.3778923e-03,
        9.2762033e-04, 3.9863074e-04, 9.7913540e-04], dtype=float32)

In [47]: dataset['terminals'][308]
Out[47]: False
Both of these next_observations look like the start of a new episode (the first entry is around 1.25 and the rest are near 0), so they should not be paired with these observations.
I also checked the original dataset without using d4rl.qlearning_dataset. Indices 309, 1308, and 2307 appear to be the beginnings of new episodes.
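A rough way to run this kind of check over a whole dataset is sketched below. It assumes that an episode reset shows up as an unusually large jump between consecutive observations while terminals is False. The function name find_suspect_boundaries and the jump_threshold parameter are hypothetical, and a sensible threshold has to be chosen per environment.

```python
import numpy as np


def find_suspect_boundaries(observations, terminals, jump_threshold):
    """Hypothetical check: indices t where obs[t] -> obs[t+1] looks like a reset.

    Flags a transition when the Euclidean distance between consecutive
    observations exceeds jump_threshold and terminals[t] is False.
    """
    obs = np.asarray(observations, dtype=np.float64)
    terminals = np.asarray(terminals, dtype=bool)
    # Distance between each observation and the one stored right after it.
    jumps = np.linalg.norm(obs[1:] - obs[:-1], axis=1)
    suspect = (jumps > jump_threshold) & ~terminals[:-1]
    return np.flatnonzero(suspect)
```

Each returned index t marks a stored pair (obs[t], obs[t+1]) that probably crosses an episode boundary without a terminal flag.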
Hi @aviralkumar2907 @justinjfu, could you help me re-open this issue? I don't seem to be able to re-open it myself.