Comments (5)

mfe7 commented on August 15, 2024

Hi @krishna-bala, this all looks nice and makes sense. I suggest changing the action passed to the env in trainA2C.py to a dictionary of actions, where each key is the id of the agent that should take that random action, e.g., action = {0: [spd, heading]}.

from gym-collision-avoidance.

mfe7 commented on August 15, 2024

I think that's essentially what we did here:

def run_episode(env, one_env):
    total_reward = 0
    step = 0
    done = False
    while not done:
        obs, rew, done, info = env.step([None])
        total_reward += rew[0]
        step += 1

krishna-bala commented on August 15, 2024

Hi @mfe7 -- I just noticed that mistake as well after comparing my code to example.py.

I modified my code to instantiate actions as a dict, which solves the error from self._take_action(actions,dt) in collision_avoidance_env.py. Now, an action list of [delta heading angle, speed] should get passed to agent.policy.external_action_to_action().

However, the VecEnv wrapper around the CollisionAvoidanceEnv (the env object returned by create_env()) calls the step() method from vec_env.py in the baselines library.

This vec_env.py step() method calls step_async() and step_wait() from dummy_vec_env.py (baselines library).

In the method step_wait(), the action dict gets converted to a list before calling env.step() (the method defined in collision_avoidance_env.py). See code below.

def step_wait(self):
        print('step_wait() method')
        for e in range(self.num_envs):
            action = self.actions[e]
            # if isinstance(self.envs[e].action_space, spaces.Discrete):
            #    action = int(action)
            print('action from step_wait() method: {}'.format(action))
            obs, self.buf_rews[e], self.buf_dones[e], self.buf_infos[e] = self.envs[e].step(action)
            if self.buf_dones[e]:
                obs = self.envs[e].reset()
            self._save_obs(e, obs)
        return (self._obs_from_buf(), np.copy(self.buf_rews), np.copy(self.buf_dones),
                self.buf_infos.copy())

The one_env (unwrapped) environment doesn't have this issue, where the action is converted from a dict to a list before env.step() is called. Should I nest a dict inside the actions dict when using the VecEnv wrapper, e.g. action = {0: {0: [spd, heading]}}?

Here is the random agent that works properly on the one_env environment.

import os
os.environ['GYM_CONFIG_CLASS'] = 'Train'

import gym
from gym_collision_avoidance.envs import Config
import gym_collision_avoidance.envs.test_cases as tc
from gym_collision_avoidance.experiments.src.env_utils import create_env
from stable_baselines.common.policies import MlpPolicy
#from stable_baselines.common import make_vec_env
from stable_baselines import A2C
from stable_baselines.common.env_checker import check_env


# env: a VecEnv wrapper around the CollisionAvoidanceEnv
# one_env: an actual CollisionAvoidanceEnv class (the unwrapped version of the first env in the VecEnv)

env, one_env = create_env()

# check_env(env, warn=True)
# model = A2C(MlpPolicy, env, verbose=1)
# model.learn(total_timesteps=1000)

# The reset method is called at the beginning of an episode
obs = one_env.reset()

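num_steps = 1000  # each loop iteration is one environment step, not one episode

for i in range(num_steps):
    # Key 0 is the id of the agent that takes the sampled action
    actions = {}
    actions[0] = one_env.action_space.sample()

    obs, reward, done, info = one_env.step(actions)
    if done:
        obs = one_env.reset()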

mfe7 commented on August 15, 2024

Hmm, could you send a list of dicts (e.g., actions = [{0: [spd, heading]}]) to env.step? The VecEnv wrapper you posted would then grab the first element of that list and send it to environment 0.
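To illustrate the difference, here is a minimal sketch that does not use baselines or gym at all. ToyEnv and ToyVecEnv are hypothetical stand-ins that copy only the actions[e] lookup from the step_wait() posted above, and the sketch assumes the wrapper keeps the actions object exactly as it was passed in. The values of spd and heading are placeholders.

```python
class ToyEnv:
    """Hypothetical stand-in for CollisionAvoidanceEnv: returns the action it receives."""

    def step(self, action):
        return action


class ToyVecEnv:
    """Toy wrapper that mirrors only the indexing done in the posted step_wait()."""

    def __init__(self, envs):
        self.envs = envs
        self.num_envs = len(envs)

    def step(self, actions):
        # Same lookup as step_wait(): environment e receives actions[e].
        return [self.envs[e].step(actions[e]) for e in range(self.num_envs)]


spd, heading = 1.0, 0.25  # placeholder values, not taken from the thread
vec_env = ToyVecEnv([ToyEnv()])

# Bare dict: actions[0] is a key lookup, so environment 0 only sees the inner list.
received_from_dict = vec_env.step({0: [spd, heading]})[0]

# List of dicts: actions[0] is the first list element, so environment 0 sees the whole dict.
received_from_list = vec_env.step([{0: [spd, heading]}])[0]

print(received_from_dict)  # [1.0, 0.25]
print(received_from_list)  # {0: [1.0, 0.25]}
```

Under the same assumption, the nested form {0: {0: [spd, heading]}} would also hand the inner dict to environment 0, but the list form matches the one-entry-per-environment convention that env.step([None]) already uses.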

krishna-bala commented on August 15, 2024

Yes that works! Thank you.
