Comments (5)
Please use the numbers in the NeurIPS version of the CQL paper (https://proceedings.neurips.cc/paper/2020/file/0d2b2061826a5df3221116a5085a6052-Paper.pdf) as the reference; these are the numbers in the table you mentioned. The original (older) version of the CQL paper matches the D4RL paper. We are in the process of fixing GitHub issues in D4RL and will report the updated numbers in the next update.
from d4rl.
Hi, CQL reported numbers from the first arXiv version of the D4RL paper, and those numbers (for BEAR) have since improved in the newer version of D4RL. We will update the baseline numbers in CQL, so the results in D4RL should be used as the reference. I think the difference is mainly in the BEAR numbers, which changed when we moved to a better BEAR implementation.
from d4rl.
Hi, I also cross-checked the CQL scores reported in D4RL (arXiv-v4).
1. The mismatches have not been fixed. @IcarusWizard @justinjfu @aviralkumar2907
2. Between Table 2 and Table 3 of D4RL, there are also a few mismatches for the same env.
For BC, 923 / 3234 ≈ 0.29, which matches the normalized score of 29.
For CQL, 2557 / 3234 ≈ 0.79, so the normalized score should be 79, not 58.
Task | Source | SAC | BC | CQL |
---|---|---|---|---|
hopper-medium | D4RL (arXiv-v4) Table 2, normalized score | 100 | 29.0 | 58 |
hopper-medium | D4RL (arXiv-v4) Table 3, un-normalized score | 3234.3 | 923.5 | 2557.3 |
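
The check above can be reproduced with a short script. This is only a minimal sketch: it assumes, as in my arithmetic, that the normalized score is the raw return divided by the SAC return and scaled to 100, and it ignores any random-policy offset the official normalization may apply. The function name `normalize_by_reference` is hypothetical and is not part of d4rl.

```python
def normalize_by_reference(raw_score: float, reference_score: float) -> float:
    """Express raw_score as a percentage of reference_score.

    Assumption: a plain ratio, with no random-policy baseline subtracted.
    """
    if reference_score == 0:
        raise ValueError("reference_score must be non-zero")
    return 100.0 * raw_score / reference_score


# Un-normalized hopper-medium scores from Table 3 of D4RL (arXiv-v4).
SAC_SCORE = 3234.3
RAW_SCORES = {"BC": 923.5, "CQL": 2557.3}

# Normalized scores as printed in Table 2 of D4RL (arXiv-v4).
REPORTED = {"BC": 29.0, "CQL": 58.0}

# Recompute each normalized score and compare it with the printed value.
recomputed = {
    name: normalize_by_reference(score, SAC_SCORE)
    for name, score in RAW_SCORES.items()
}

for name, value in recomputed.items():
    gap = value - REPORTED[name]
    print(f"{name}: recomputed {value:.1f}, reported {REPORTED[name]:.1f}, gap {gap:+.1f}")
```

Under this assumption the BC entry is consistent between the two tables, while the CQL entry is off by roughly 20 points.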
from d4rl.
Thanks for your response.
One more question: were the hyperparameters tuned per environment and data setting, or was a single set of hyperparameters used for all environments?
from d4rl.
Hi, I just cross-checked the CQL scores reported in the D4RL (arXiv-v4) and CQL (NeurIPS) papers, and there are a few mismatches.
Task | D4RL (arXiv-v4) | CQL (NeurIPS) |
---|---|---|
walker2d-medium | 79.2 | 74.5 |
hopper-medium | 58.0 | 86.6 |
walker2d-medium-replay | 26.7 | 32.6 |
I hope you can clarify which one should be used as the reference, especially for hopper-medium, since the difference is huge.
from d4rl.