Comments (22)

Kallinteris-Andreas commented on June 8, 2024

No, the test is indeed not good enough to prove the point (and it does fail after the fix to the humanoid)

The analysis done in the issues proves it

I have tested this for millions of steps, and those observation elements stayed at zero throughout (the results are in the other issues)

from gymnasium-robotics.

Kallinteris-Andreas commented on June 8, 2024

@pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple

Figure_1
v4 is the existing Hopper-v4 model (blue)
v5 is my hand-ported XML model (orange)
v5_gen is the XML model auto-generated by mujoco/compile (green)

Notice that for the first 20k steps v4 and v5 overlap (so closely that the difference cannot be seen on the graph), and then floating-point errors slowly accumulate,
while v5_gen behaves differently from the beginning.

saran-t commented on June 8, 2024

Your problem there is float32. At float64 those two numbers are different (https://evanw.github.io/float-toy/).
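
A minimal sketch of that point, assuming NumPy is available. The two literals are the ones compared elsewhere in this thread; the variable names are only illustrative:

```python
import numpy as np

# The two values under discussion: a full-precision number and its
# short decimal form.
full = -0.19999999999999996
short = -0.2

# At float32 both literals round to the same representable value.
same_at_float32 = bool(np.float32(full) == np.float32(short))

# At float64 (double precision) they are two distinct numbers.
same_at_float64 = bool(np.float64(full) == np.float64(short))

print("equal at float32:", same_at_float32)
print("equal at float64:", same_at_float64)
```

So an equality check done at float32 cannot tell whether two float64 model parameters are really the same.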

Kallinteris-Andreas commented on June 8, 2024

Sure, where would the blog post be published? (farama.org?)

In the meantime the active changelog is here: #104

rodrigodelazcano commented on June 8, 2024

Can you elaborate on Proposal 1 and how it would work? I don't know whether you mean registering the MaMuJoCo environments with Gymnasium IDs.

Kallinteris-Andreas commented on June 8, 2024

I mean adding them to the suite of available environments.

pseudo-rnd-thoughts commented on June 8, 2024

I'm unconvinced by the test showing this to be an issue.
It is quite possible for a couple of the observation elements to be zero for the first N time steps.
Could you upgrade the tests to show that, from the reset observation and through the first 100 time steps, the same elements keep a fixed value? It doesn't have to be zero.
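
As a rough sketch of the kind of check meant here (not an existing test in the repository): the helper name `constant_elements` and the synthetic data are hypothetical, and a real test would collect the observations from the environment's reset and step calls instead.

```python
import numpy as np


def constant_elements(observations, atol=0.0):
    """Return indices of observation elements that never change.

    `observations` has shape (T, D): the reset observation followed by
    the observations from the next T - 1 steps.
    """
    obs = np.asarray(observations, dtype=np.float64)
    # An element counts as fixed if every step matches the reset value.
    fixed = np.all(np.abs(obs - obs[0]) <= atol, axis=0)
    return np.flatnonzero(fixed)


# Synthetic stand-in for 1 reset observation + 100 steps of a 5-element
# observation: columns 1 and 3 are held constant (one at zero, one not).
rng = np.random.default_rng(0)
demo = rng.normal(size=(101, 5))
demo[:, 1] = 0.0
demo[:, 3] = 2.5

print(constant_elements(demo))  # indices of the fixed columns: [1 3]
```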

If we make this change, we will need to train agents to ensure there is no performance regression as a result of the change.

It is a shame we can't do v4.1 as the environment number

pseudo-rnd-thoughts commented on June 8, 2024

Could this include new compiled versions of all of the models so they have the same format?

Kallinteris-Andreas commented on June 8, 2024

> Could this include new compiled versions of all of the models so they have the same format

I would rather not. I am not sure I can assert identical behavior, because the compiled versions add autolimits="true"
(and I see no benefit in it, other than aesthetics).

pseudo-rnd-thoughts commented on June 8, 2024

> @pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple
>
> Figure_1

Very interesting, thanks for doing that.
DeepMind never claimed that compile produces identical models, but this is very different, particularly in training. We should check whether this is expected.
One response could be that we have only run this once, and that over several runs the differences average out.

saran-t commented on June 8, 2024

> @pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple
>
> Figure_1 v4 is the existing Hopper-v4 model (blue) v5 is with my hand ported xml model (orange) v5_gen is the auto generated xml model from mujoco/compile (green)
>
> Notice that for the first 20k steps v4 and v5 overlap (close enough, that it can not be seen with the graph), and slowly the floating point errors accumulate while the v5_gen has different behavior from the beginning

Can you please provide us with the complete model for each of the three cases?

Kallinteris-Andreas commented on June 8, 2024

Hey @saran-t, I am currently doing more testing.
Here are the XML files:
v4: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper.xml
v5: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new.xml
auto-generated model (with mujoco/compile): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml

Kallinteris-Andreas commented on June 8, 2024

Here are the results of a test with 10 runs and 200k steps:
Figure_1
v4 is the existing Hopper-v4 model (blue)
v5 is my hand-ported XML model (orange)
v5_gen is the XML model auto-generated by mujoco/compile (green)

The shaded area shows the minimum and maximum across runs.

Notice that for the first 20k steps v4 and v5 overlap (so closely that the difference cannot be seen on the graph), and then floating-point errors slowly accumulate, while the curves stay in the same ballpark,
while v5_gen behaves differently from the beginning.
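
For reference, a minimal sketch of how a mean curve with a min/max band can be computed from several runs. This is not the script used for the figure; `summarize_runs` and the placeholder data are hypothetical:

```python
import numpy as np


def summarize_runs(returns):
    """Collapse a (n_runs, n_points) array into mean/min/max curves."""
    returns = np.asarray(returns, dtype=np.float64)
    return {
        "mean": returns.mean(axis=0),
        "min": returns.min(axis=0),
        "max": returns.max(axis=0),
    }


# Placeholder data: 10 runs, 5 evaluation points each (not real results).
rng = np.random.default_rng(0)
fake_returns = rng.uniform(0.0, 1.0, size=(10, 5))
curves = summarize_runs(fake_returns)
print(curves["mean"].shape)
```

With matplotlib, the band could then be drawn with `ax.plot(steps, curves["mean"])` and `ax.fill_between(steps, curves["min"], curves["max"], alpha=0.3)`.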

@saran-t

saran-t commented on June 8, 2024

Please try the auto-converted XML in https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=lH94cPjK4SKZ and let me know how that works.

Kallinteris-Andreas commented on June 8, 2024

@saran-t
This is identical to:
https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
(other than numbers being in full precision)

You can use a service like https://www.diffchecker.com/text-compare/ to compare them.
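
A local alternative, sketched with Python's standard `difflib` module. The helper `xml_diff` and the two tiny XML strings are hypothetical stand-ins for the real model files:

```python
import difflib


def xml_diff(text_a, text_b, name_a="a.xml", name_b="b.xml"):
    """Return unified-diff lines between two XML documents given as strings."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


# Hypothetical snippets that differ only in printed precision.
short_prec = '<mujoco>\n  <geom pos="0 0 -0.2"/>\n</mujoco>'
full_prec = '<mujoco>\n  <geom pos="0 0 -0.19999999999999996"/>\n</mujoco>'

for line in xml_diff(short_prec, full_prec):
    print(line)
```

An empty result means the two texts are identical line for line.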

saran-t commented on June 8, 2024

> (other than numbers being in full precision)

That is literally the whole point? The only difference that you're seeing is just the tail end of the precision. You can see at the bottom of the Colab that the auto-converted XMLs have bit-identical results.

To be clear, what I'm saying is you should just replace your existing XML with https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=s4DDElgo4_d6 (which is the absolute minimal diff that results in absolutely no change whatsoever under MuJoCo 2.3.3).
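
One way to see why the tail digits matter when a model is written out as text, as a small sketch in plain Python. The 8-significant-digit format is an arbitrary choice for illustration, not necessarily what any tool in this thread prints:

```python
value = -0.19999999999999996

# repr() emits enough digits to reconstruct the exact float64.
full_text = repr(value)
# A shortened print (8 significant digits, chosen arbitrarily here).
short_text = f"{value:.8g}"

# Parsing the full-precision text gives back the same number.
print(full_text, float(full_text) == value)
# Parsing the shortened text gives a neighbouring, different number.
print(short_text, float(short_text) == value)
```

A value that is rounded on output and parsed back is no longer bit-identical, and in a simulation that small difference can grow over many steps.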

Kallinteris-Andreas commented on June 8, 2024

@saran-t OK, I will test it.

I am under the impression that the only change was printing the numbers at full precision:

>>> import numpy
>>> numpy.array(-0.19999999999999996, dtype=numpy.float32) == numpy.array(-0.2, dtype=numpy.float32)
True

I will test your manually made XML.

Kallinteris-Andreas commented on June 8, 2024

Figure_1
v4 is the existing Hopper-v4 model (blue) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper.xml
v5 is my hand-ported XML model (orange) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new.xml
v5_gen is the XML model auto-generated by mujoco/compile (green) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
@saran-t's manually transcribed model (red) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_saran_t_trans.xml

Notice that for the first 20k steps v4 and v5 overlap (so closely that the difference cannot be seen on the graph), and then floating-point errors slowly accumulate, while the curves stay in the same ballpark,
while v5_gen and @saran-t's manually transcribed model behave differently from the beginning.

saran-t commented on June 8, 2024

Not really sure how that can be the case since I've verified that the model is literally identical (bit perfect) to the v4 XML...

Kallinteris-Andreas commented on June 8, 2024

I am double-checking, this weirds me out too.

Kallinteris-Andreas commented on June 8, 2024

I re-ran the benchmark (I had PyTorch-related issues).
Figure_1

(The other models behave differently; I am not showing them.)
As you can see, @saran-t's model is identical to hopper-v4.xml.
On top of that, I ran a test of 1000 episodes with 4000 simulation steps per episode (with random actions), and the behavior was 100% identical between both models (https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/test_hop.py).
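
The general shape of such a comparison, as a hedged sketch rather than the contents of test_hop.py. `ToyEnv` is a hypothetical deterministic stand-in that mimics the Gymnasium `reset(seed=...)` / `step(action)` return values; with real environments the actions would come from the environment's action space and the loop would also stop on termination or truncation:

```python
import numpy as np


class ToyEnv:
    """Hypothetical deterministic stand-in for a Gymnasium MuJoCo env."""

    def __init__(self, gain=0.5):
        self.gain = gain
        self.state = np.zeros(3)

    def reset(self, seed=None):
        self.state = np.zeros(3)
        return self.state.copy(), {}

    def step(self, action):
        self.state = self.state + self.gain * np.asarray(action)
        # (obs, reward, terminated, truncated, info)
        return self.state.copy(), 0.0, False, False, {}


def first_divergence(env_a, env_b, episodes=10, steps=50, seed=0):
    """Return (episode, step) of the first bitwise mismatch, or None."""
    rng = np.random.default_rng(seed)
    for ep in range(episodes):
        obs_a, _ = env_a.reset(seed=ep)
        obs_b, _ = env_b.reset(seed=ep)
        if obs_a.tobytes() != obs_b.tobytes():
            return ep, 0
        for t in range(1, steps + 1):
            # Feed the same random action to both environments.
            action = rng.uniform(-1.0, 1.0, size=3)
            obs_a, *_ = env_a.step(action)
            obs_b, *_ = env_b.step(action)
            # Compare raw bytes so that even last-bit differences count.
            if obs_a.tobytes() != obs_b.tobytes():
                return ep, t
    return None


same = first_divergence(ToyEnv(), ToyEnv())
perturbed = first_divergence(ToyEnv(0.5), ToyEnv(0.5001))
print(same)       # None: identical dynamics never diverge
print(perturbed)  # a slightly perturbed model diverges immediately
```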

@saran-t

  1. Can you please do the same with walker2d.xml (https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/mujoco/assets/walker2d.xml)? Never mind, I did it myself: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/walker2d_ported.xml
  2. In your expert opinion, does the model you made ALWAYS behave identically to the previous model on mujoco==2.3.3?
  3. In your expert opinion, does the model you made ALWAYS behave identically to the previous model on 2.1.3<=mujoco<=2.3.3?
  4. In your expert opinion, does the model you made ALWAYS behave identically to the previous model on 1.5<=mujoco-py?

Questions 2-4 are there to determine whether we can change the model for environment versions v2, v3, and v4.

pseudo-rnd-thoughts commented on June 8, 2024

When this is done, I think we should write a blog post, as that would be a better way to publish the changes and easier for users to cite than a PR.
Additionally, it would be easier to include all of the results there.
@Kallinteris-Andreas Would you be interested in this?
