Comments (3)
Amortizing over a trajectory length of 1000 (the standard for gym), the medium dataset appears to have been collected with a policy scoring approximately 3470, which is very close to optimal. The expert dataset, on the other hand, appears to have been collected with a policy scoring about 1500, which looks more like a medium, sub-optimal policy.
I'm guessing the two datasets were swapped when they were uploaded?
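For reference, the amortization above is just the mean per-step reward multiplied by the episode length. A minimal sketch, assuming a fixed horizon of 1000 steps and using toy reward values rather than real D4RL data (the helper name `estimated_return` is hypothetical):

```python
import numpy as np

def estimated_return(rewards, horizon=1000):
    """Scale the mean per-step reward to a fixed episode length."""
    return float(np.mean(rewards)) * horizon

# Toy per-step rewards chosen for illustration only, not taken from the dataset.
toy_rewards = np.array([3.4, 3.5, 3.5, 3.48])
print(estimated_return(toy_rewards))
```

This only approximates the true episode return when episodes actually run for the full horizon.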
from d4rl.
Thanks - I swapped the two datasets. Delete the downloaded datasets (`rm ~/.d4rl/datasets/hopper*v1.hdf5`) and you can download them again.
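If you would rather do the cleanup from Python, something like the following should be equivalent to the `rm` command. This is a sketch only: the helper name `clear_cached_datasets` and the file names in the demo are hypothetical, and the demo runs against a temporary directory instead of the real cache.

```python
import tempfile
from pathlib import Path

def clear_cached_datasets(cache_dir, pattern="hopper*v1.hdf5"):
    """Delete files in cache_dir that match pattern; return the removed names."""
    removed = []
    for path in sorted(Path(cache_dir).expanduser().glob(pattern)):
        path.unlink()
        removed.append(path.name)
    return removed

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("hopper_medium-v1.hdf5", "hopper_expert-v1.hdf5",
                 "walker2d_medium-v1.hdf5"):
        (Path(tmp) / name).touch()
    removed = clear_cached_datasets(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(removed, remaining)
```

To apply it to the real cache, pass `"~/.d4rl/datasets"` as `cache_dir`.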
I have a similar question about hopper_medium_replay:
import gym
import d4rl  # importing d4rl registers its environments with gym
import numpy as np

env = gym.make('hopper-random-v1')
dataset = env.get_dataset()
print(np.mean(dataset['rewards'])) # 0.8286486

env = gym.make('hopper-medium-v1')
dataset = env.get_dataset()
print(np.mean(dataset['rewards'])) # 1.5018191

env = gym.make('hopper-expert-v1')
dataset = env.get_dataset()
print(np.mean(dataset['rewards'])) # 3.466414

env = gym.make('hopper-medium-replay-v1')
dataset = env.get_dataset()
print(np.mean(dataset['rewards'])) # 3.0534504202260146
I don't think it makes sense for medium replay to have a higher per-step reward than medium.
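One way to double-check this at the episode level, rather than per step, would be to sum rewards between episode boundaries. The sketch below assumes the dataset provides `terminals` and `timeouts` arrays aligned with `rewards` (an assumption about the dataset layout), and it uses toy arrays instead of the real data. The helper name `episode_returns` is hypothetical.

```python
import numpy as np

def episode_returns(rewards, terminals, timeouts):
    """Sum rewards between episode boundaries (terminal or timeout flags)."""
    rewards = np.asarray(rewards, dtype=float)
    done = np.asarray(terminals, dtype=bool) | np.asarray(timeouts, dtype=bool)
    ends = np.flatnonzero(done) + 1  # index just past each episode's last step
    chunks = np.split(rewards, ends)
    # A trailing unfinished episode is kept as a partial return; empty chunks are dropped.
    return np.array([chunk.sum() for chunk in chunks if chunk.size > 0])

# Toy data: a 3-step episode ending in a terminal, then a 6-step episode ending in a timeout.
rewards = [1, 1, 1, 3, 3, 3, 3, 3, 3]
terminals = [0, 0, 1, 0, 0, 0, 0, 0, 0]
timeouts = [0, 0, 0, 0, 0, 0, 0, 0, 1]
returns = episode_returns(rewards, terminals, timeouts)
print(returns, np.mean(rewards))
```

Because longer episodes contribute more steps, the per-step mean over a whole dataset weights them more heavily than a per-episode average does, so the two can tell different stories.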