
Comments (5)

aviralkumar2907 commented on July 17, 2024

The numbers in the NeurIPS version of the CQL paper (https://proceedings.neurips.cc/paper/2020/file/0d2b2061826a5df3221116a5085a6052-Paper.pdf) should be used as the reference; that is the table you mentioned. The original (older) version of the CQL paper matches the D4RL paper. We are in the process of fixing GitHub issues in D4RL and will report the updated numbers in the next update.

from d4rl.

aviralkumar2907 commented on July 17, 2024

Hi, CQL reported the numbers from the first arXiv version of the D4RL paper, some of which (for BEAR) have since improved in the newer version of D4RL. We will update the baseline numbers in CQL, so the results in D4RL should be used as the reference. I think the difference is mainly in the BEAR numbers, which changed when we moved to a better BEAR implementation.


yifan123 commented on July 17, 2024

Hi, I also cross-checked the CQL scores reported in D4RL (arXiv-v4).
1. The mismatches have not been fixed. @IcarusWizard @justinjfu @aviralkumar2907
2. In Table 2 and Table 3 of D4RL, there are also a few mismatches for the same env.
For BC, 923 / 3234 = 29%, which matches the reported 29.0.
For CQL, 2557 / 3234 = 79%, not the reported 58.

| Task | SAC | BC | CQL |
|---|---|---|---|
| hopper-medium, D4RL (arXiv-v4) Table 2, normalized score | 100 | 29.0 | 58 |
| hopper-medium, D4RL (arXiv-v4) Table 3, un-normalized score | 3234.3 | 923.5 | 2557.3 |
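The consistency check above can be reproduced with a few lines of Python. This is only a sketch of the commenter's arithmetic: the helper `normalized_score` is a hypothetical name, and it assumes the SAC return stands in for the expert reference and that the random-policy reference is 0, as the division in the comment implies. The benchmark's actual normalization may subtract a nonzero random-policy score, which is not given in this thread.

```python
def normalized_score(raw_score, expert_score, random_score=0.0):
    """Scale a raw return so random_score maps to 0 and expert_score to 100.

    random_score=0.0 is an assumption taken from the comment's simple ratio.
    """
    return 100.0 * (raw_score - random_score) / (expert_score - random_score)


# Un-normalized hopper-medium returns quoted from Table 3 in the comment.
SAC_RETURN = 3234.3
raw_returns = {"BC": 923.5, "CQL": 2557.3}

# Normalized scores reported in Table 2, for comparison.
reported = {"BC": 29.0, "CQL": 58.0}

for name, raw in raw_returns.items():
    computed = normalized_score(raw, SAC_RETURN)
    print(f"{name}: computed {computed:.1f}, reported {reported[name]:.1f}")
```

Under these assumptions the BC value rounds to the reported 29, while the CQL value comes out near 79 rather than 58, which is the mismatch being pointed out.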



rasoolfa commented on July 17, 2024

Thanks for your response.
One more question: were the hyperparameters tuned per environment and data setting, or was a single set of hyperparameters used for all environments?


IcarusWizard commented on July 17, 2024

Hi, I just cross-checked the CQL scores reported in the D4RL (arXiv-v4) and CQL (NeurIPS) papers, and there are a few mismatches.

| Task | D4RL (arXiv-v4) | CQL (NeurIPS) |
|---|---|---|
| walker2d-medium | 79.2 | 74.5 |
| hopper-medium | 58.0 | 86.6 |
| walker2d-medium-replay | 26.7 | 32.6 |

I hope you can clarify which one should be used as the reference, especially for hopper-medium, where the difference is huge.
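To put the size of each mismatch in numbers, here is a small sketch that computes the gap between the two papers for the three tasks listed above. The scores are copied from the table in this comment; the variable names are illustrative only.

```python
# CQL normalized scores as listed in the comment: (D4RL arXiv-v4, CQL NeurIPS).
scores = {
    "walker2d-medium": (79.2, 74.5),
    "hopper-medium": (58.0, 86.6),
    "walker2d-medium-replay": (26.7, 32.6),
}

# Signed gap: positive means the CQL (NeurIPS) paper reports the higher score.
gaps = {task: neurips - d4rl for task, (d4rl, neurips) in scores.items()}

for task, gap in gaps.items():
    print(f"{task}: {gap:+.1f}")

# Task with the largest absolute disagreement between the two papers.
largest = max(gaps, key=lambda task: abs(gaps[task]))
print("largest mismatch:", largest)
```

The hopper-medium gap is several times larger than the other two, which is why that task is singled out.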

