Proposal 0 After the Gymnasium/envs/mujo

[Proposal] Mujoco-v5,about farama-foundation/gymnasium-robotics

Comments (22)

Kallinteris-Andreas commented on June 8, 2024 1

No, the test is indeed not good enough to prove the point (and it does fail after the fix to the humanoid)

The analysis done in the issues proves it

I have tested for millions of steps, and those observation elements stay at zero throughout (the details are in the other issues)

from gymnasium-robotics.

Kallinteris-Andreas commented on June 8, 2024 1

@pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple

v4 is the existing Hopper-v4 model (blue)
v5 uses my hand-ported XML model (orange)
v5_gen uses the XML model auto-generated by mujoco/compile (green)

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen on the graph), after which the floating point errors slowly accumulate,
while v5_gen behaves differently from the beginning
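
As a rough illustration of how a difference in the last bits of a float can grow over many steps, here is a minimal sketch. It uses the logistic map as a stand-in chaotic system; this is an assumption made for illustration and is not MuJoCo, Hopper, or the benchmark above. The function name `logistic_orbit` is hypothetical.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), a standard chaotic toy system."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ only in the last couple of float64 bits.
orbit_a = logistic_orbit(0.2, 200)
orbit_b = logistic_orbit(0.19999999999999996, 200)

# Per-step absolute gap between the two orbits.
gaps = [abs(a - b) for a, b in zip(orbit_a, orbit_b)]

# First step at which the gap would be visible on a plot.
first_visible = next(i for i, g in enumerate(gaps) if g > 1e-3)
print(f"initial gap: {gaps[0]:.1e}, first step with gap > 1e-3: {first_visible}")
```

The two orbits are indistinguishable at first and only separate visibly later, which is the same qualitative pattern as an early overlap followed by slow drift.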


saran-t commented on June 8, 2024 1

Your problem there is float32. At float64 those two numbers are different (https://evanw.github.io/float-toy/).


Kallinteris-Andreas commented on June 8, 2024 1

Sure, where would the blog post be published? (farama.org?)

In the meantime the active changelog is here: #104


rodrigodelazcano commented on June 8, 2024

Can you elaborate on Proposal 1 and how it would work? I don't know whether you mean registering the MaMuJoCo environments with Gymnasium IDs.


Kallinteris-Andreas commented on June 8, 2024

I mean adding them to the suite of available environments.


pseudo-rnd-thoughts commented on June 8, 2024

I'm unconvinced by the test showing this to be an issue.
It is quite possible for a couple of the observation elements to be zero for the first N time steps.
Could you upgrade the tests to show that, from the reset observation and over the first 100 time steps, the same elements keep a fixed value? It doesn't have to be zero.

If we make this change, we will need to train agents to ensure that it does not cause a performance regression.

It is a shame we can't use v4.1 as the environment version number.
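
A minimal sketch of the kind of check being asked for, under stated assumptions: the observations from the reset and the following steps have already been collected into an array of shape `(steps + 1, obs_dim)`. The helper name `fixed_value_elements` and the toy data are hypothetical; collecting real observations from an environment is left out.

```python
import numpy as np


def fixed_value_elements(observations):
    """Return indices of observation elements that never change.

    `observations` has shape (steps + 1, obs_dim); row 0 is the reset observation.
    """
    obs = np.asarray(observations)
    # An element counts as fixed if every row equals the reset observation there.
    fixed = np.all(obs == obs[0], axis=0)
    return np.flatnonzero(fixed)


# Toy stand-in for a reset observation plus 100 steps of a 6-element observation.
rng = np.random.default_rng(0)
toy_obs = rng.normal(size=(101, 6))
toy_obs[:, 2] = 0.0   # an element stuck at zero
toy_obs[:, 4] = 1.5   # an element stuck at a non-zero value

fixed_idx = fixed_value_elements(toy_obs)
print(fixed_idx)
```

The comparison is against the reset observation rather than against zero, so elements pinned at any constant value are reported.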


pseudo-rnd-thoughts commented on June 8, 2024

Could this include newly compiled versions of all of the models, so they have the same format?


Kallinteris-Andreas commented on June 8, 2024

Could this include new compiled versions of all of the models so they have the same format

I would rather not. I am not sure I can assert identical behavior, because the compiled models add autolimits="true"
(and I see no benefit in it, other than aesthetics).


pseudo-rnd-thoughts commented on June 8, 2024

@pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple

Very interesting, thanks for doing that.
DeepMind never claimed that compile produces identical models, but this is very different, particularly in training. We should check whether this is expected.
One response could be that we have only run this once, and that over several runs the differences average out.


saran-t commented on June 8, 2024

@pseudo-rnd-thoughts Reason as to why I do not trust the mujoco/comple

v4 is the existing Hopper-v4 model (blue)
v5 is with my hand ported xml model (orange)
v5_gen is the auto generated xml model from mujoco/compile (green)

Notice that for the first 20k steps v4 and v5 overlap (close enough, that it can not be seen with the graph), and slowly the floating point errors accumulate
while the v5_gen has different behavior from the beginning

Can you please provide us with the complete model for each of the three cases?


Kallinteris-Andreas commented on June 8, 2024

Hey @saran-t, I am currently doing more testing,
here are the xml files:
v4: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper.xml
v5: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new.xml
auto generated model (with mujoco/compile): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml


Kallinteris-Andreas commented on June 8, 2024

Here is a test of 10 runs with 200k steps each

v4 is the existing Hopper-v4 model (blue)
v5 uses my hand-ported XML model (orange)
v5_gen uses the XML model auto-generated by mujoco/compile (green)

The shaded area shows the minimum and maximum across runs

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen on the graph), after which the floating point errors slowly accumulate while the two stay in the same ballpark,
while v5_gen behaves differently from the beginning

@saran-t


saran-t commented on June 8, 2024

Please try the auto-converted XML in https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=lH94cPjK4SKZ and let me know how that works.


Kallinteris-Andreas commented on June 8, 2024

@saran-t
This is identical to:
https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
(other than numbers being in full prec)

you can use a service like https://www.diffchecker.com/text-compare/ to compare them


saran-t commented on June 8, 2024

(other than numbers being in full prec)

That is literally the whole point? The only difference that you're seeing is just the tail end of the precision. You can see at the bottom of the Colab that the auto-converted XMLs have bit-identical results.

To be clear, what I'm saying is you should just replace your existing XML with https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=s4DDElgo4_d6 (which is the absolute minimal diff that results in absolutely no change whatsoever under MuJoCo 2.3.3).


Kallinteris-Andreas commented on June 8, 2024

@saran-t OK, I will test it.

I was under the impression that the only change was printing my numbers at full precision:

>>> numpy.array(-0.19999999999999996, dtype=numpy.float32) == numpy.array(-0.2, dtype=numpy.float32)
True

I will test your manually made XML.
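
For reference, a small sketch comparing the same two literals at both precisions. It uses only standard NumPy scalar types; the variable names are chosen for illustration.

```python
import numpy as np

a, b = -0.19999999999999996, -0.2

# float32 keeps about 7 significant decimal digits, so both literals
# round to the same float32 value.
same_in_float32 = bool(np.float32(a) == np.float32(b))

# float64 keeps about 16 significant decimal digits, enough to tell them apart.
same_in_float64 = bool(np.float64(a) == np.float64(b))

print(same_in_float32, same_in_float64)
```

So an equality check at float32 cannot show that two XML files carry the same numbers once they are parsed at double precision.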


Kallinteris-Andreas commented on June 8, 2024

v4 is the existing Hopper-v4 model (blue) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper.xml
v5 uses my hand-ported XML model (orange) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new.xml
v5_gen uses the XML model auto-generated by mujoco/compile (green) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
@saran-t's manually transcribed model (red) https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_saran_t_trans.xml


saran-t commented on June 8, 2024

Not really sure how that can be the case since I've verified that the model is literally identical (bit perfect) to the v4 XML...


Kallinteris-Andreas commented on June 8, 2024

I am double-checking, this weirds me out too.


Kallinteris-Andreas commented on June 8, 2024

I re-ran the benchmark (I had PyTorch-related issues)

(The other models behave differently, so I am not showing them)
As you can see, @saran-t's model is identical to hopper-v4.xml
On top of that, I ran a test of 1000 episodes with 4000 simulation steps per episode (with random actions), and the behavior is 100% identical between both models (https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/test_hop.py).
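
A minimal sketch of an exact trajectory comparison, assuming the states of both models have already been recorded into arrays with one row per simulation step. This is not the linked test_hop.py; the helper name `first_divergence` and the toy data are hypothetical.

```python
import numpy as np


def first_divergence(traj_a, traj_b):
    """Return the first step at which two recorded trajectories differ, or None.

    The comparison is exact (no tolerance), so NaN entries count as a difference.
    """
    a = np.asarray(traj_a)
    b = np.asarray(traj_b)
    if a.shape != b.shape:
        raise ValueError("trajectories must have the same shape")
    # Flatten everything except the step axis, then flag steps with any mismatch.
    differs = np.any(a.reshape(len(a), -1) != b.reshape(len(b), -1), axis=1)
    steps = np.flatnonzero(differs)
    return int(steps[0]) if steps.size else None


# Toy stand-in for one 4000-step episode with an 11-element state.
rng = np.random.default_rng(0)
reference = rng.normal(size=(4000, 11))
perturbed = reference.copy()
perturbed[2500:, 3] += 1e-12  # tiny drift injected from step 2500 onward

identical_case = first_divergence(reference, reference.copy())
drift_case = first_divergence(reference, perturbed)
print(identical_case, drift_case)
```

Using exact equality rather than a tolerance such as `np.allclose` matches the claim being tested here, which is identical behavior rather than approximately equal behavior.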

@saran-t

~~Can you please do the same with walker2d.xml https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/mujoco/assets/walker2d.xml~~ Never mind I did it myself https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/walker2d_ported.xml
In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on mujoco==2.3.3?
In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on 2.1.3<=mujoco<=2.3.3?
In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on 1.5<=mujoco-py?

Questions 2-4 are there to find out whether we can change the model for environment versions v2, v3, and v4


pseudo-rnd-thoughts commented on June 8, 2024

When this is done, I think we should write a blog post, as it would be a better way than a PR to publish the changes and would give users something to cite.
Additionally, it would be easier to include all of the results there.
@Kallinteris-Andreas Would you be interested in this?

