Comments (4)

Kallinteris-Andreas commented on June 2, 2024

@alexdavey feel free to make a PR

from gymnasium-robotics.

Kallinteris-Andreas commented on June 2, 2024

Your test for sparse rewards is not correct: with random actions, the agent is very unlikely to reach the terminal state.

Does this also affect the PointEnv?

alexdavey commented on June 2, 2024

Thanks for taking a look at this issue.

  1. I agree that it is not a robust test in general; it is primarily a check to compare behavior pre/post commit (after the suggested changes, it does get a very small but non-zero reward). If you have suggestions for a more robust test I can implement that, but I think the issue is fairly clear from the step method of AntMaze:

    def step(self, action):
        ant_obs, _, _, _, info = self.ant_env.step(action)
        obs = self._get_obs(ant_obs)
        terminated = self.compute_terminated(obs["achieved_goal"], self.goal, info)
        truncated = self.compute_truncated(obs["achieved_goal"], self.goal, info)
        reward = self.compute_reward(obs["achieved_goal"], self.goal, info)
        if self.render_mode == "human":
            self.render()
        return obs, reward, terminated, truncated, info

and comparing to the PointMaze step:

    def step(self, action):
        obs, _, _, _, info = self.point_env.step(action)
        obs_dict = self._get_obs(obs)
        info["success"] = bool(
            np.linalg.norm(obs_dict["achieved_goal"] - self.goal) <= 0.45
        )
        reward = self.compute_reward(obs_dict["achieved_goal"], self.goal, info)
        terminated = self.compute_terminated(obs_dict["achieved_goal"], self.goal, info)
        truncated = self.compute_truncated(obs_dict["achieved_goal"], self.goal, info)
        return obs_dict, reward, terminated, truncated, info

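compute_terminated() is common to both, and it resets the goal location when the achieved goal is within 0.45 of it. AntMaze calls it before compute_reward(), so the reward is computed against the already-reset goal; PointMaze computes the reward first.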

This issue previously existed for PointMaze. I’m proposing to make the same fix as commit ace181e, but for AntMaze.
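To make the ordering problem concrete, here is a minimal toy sketch. It is not the gymnasium-robotics code: the ToyMaze class, its fixed replacement goal, and the two step_* helpers are hypothetical, and it assumes compute_terminated() moves the goal as a side effect whenever the agent is within 0.45 of it. It also gives a deterministic check that does not depend on random actions reaching the goal.

```python
import numpy as np


class ToyMaze:
    """Hypothetical stand-in for a maze env; only models the call order."""

    def __init__(self):
        self.goal = np.array([1.0, 1.0])
        # Assumption: the "reset" goal is a fixed far-away point.
        self._next_goal = np.array([5.0, 5.0])

    def compute_reward(self, achieved_goal, desired_goal):
        # Sparse reward: 1.0 only within 0.45 of the goal.
        return float(np.linalg.norm(achieved_goal - desired_goal) <= 0.45)

    def compute_terminated(self, achieved_goal, desired_goal):
        if np.linalg.norm(achieved_goal - desired_goal) <= 0.45:
            # Side effect: the goal is moved once it has been reached.
            self.goal = self._next_goal
        # Assumption: continuing-task style, so the episode never terminates.
        return False

    def step_terminated_first(self, achieved_goal):
        # AntMaze-style order: the goal may change before the reward is computed.
        terminated = self.compute_terminated(achieved_goal, self.goal)
        reward = self.compute_reward(achieved_goal, self.goal)
        return reward, terminated

    def step_reward_first(self, achieved_goal):
        # PointMaze-style order: the reward sees the goal that was reached.
        reward = self.compute_reward(achieved_goal, self.goal)
        terminated = self.compute_terminated(achieved_goal, self.goal)
        return reward, terminated


at_goal = np.array([1.0, 1.0])
reward_ant_order, _ = ToyMaze().step_terminated_first(at_goal)
reward_point_order, _ = ToyMaze().step_reward_first(at_goal)
print(reward_ant_order, reward_point_order)
```

Under these assumptions, the agent standing exactly on the goal gets no reward with the terminated-first order and the full sparse reward with the reward-first order.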

  2. The PointEnv/PointMaze envs are not affected by either of these issues, because of the commit referenced above and because maze_size_scaling = 1, so the missing factor does not change anything.

Kallinteris-Andreas commented on June 2, 2024

  1. I would say it is worrying that compute_terminated is not a pure function. It should be refactored into two functions: compute_terminated (which would then only compute whether the episode is terminated) and update_goal.
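One possible shape for that split, as a rough sketch only: the function signatures, the GOAL_RADIUS constant, the candidate_goals array, and the rng argument are all assumptions made for illustration and are not taken from the library.

```python
import numpy as np

GOAL_RADIUS = 0.45  # threshold quoted earlier in this thread


def compute_terminated(achieved_goal, desired_goal):
    """Pure check: depends only on its arguments and mutates nothing."""
    return bool(np.linalg.norm(achieved_goal - desired_goal) <= GOAL_RADIUS)


def update_goal(achieved_goal, current_goal, candidate_goals, rng):
    """Return a new goal if the current one was reached, else the current one.

    candidate_goals and rng are hypothetical inputs for this sketch.
    """
    if np.linalg.norm(achieved_goal - current_goal) <= GOAL_RADIUS:
        return candidate_goals[rng.integers(len(candidate_goals))]
    return current_goal


rng = np.random.default_rng(0)
goal = np.array([1.0, 1.0])
candidates = np.array([[3.0, 3.0], [4.0, 4.0]])
position = np.array([1.0, 1.2])

terminated = compute_terminated(position, goal)  # goal is left untouched
new_goal = update_goal(position, goal, candidates, rng)
```

With this separation, a step method could compute the reward and the termination flag in any order and only call update_goal afterwards, so the ordering problem discussed above could not reappear.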

These changes would require a new revision ("AntMaze_UMaze-v4").

@rodrigodelazcano
