
Comments (14)

rLoka commented on August 23, 2024

Hi Michael,

I also have a question concerning the safe_actions publisher. I am currently trying to get cadrl_node running for testing purposes, and for now I have matched all the input/output topics to my own setup.

The problem, as it seems, is an empty feasible_actions, which, if I understood the code correctly, should be set by a publisher on the ~safe_actions topic via the cbNNActions callback:

    def cbNNActions(self, msg):
        self.feasible_actions = msg

Since it is empty, calling cbComputeActionGA3C fails with an "Invalid Feasible Actions" message.

I presume a publisher that uses the NN needs to be written, but I am not sure how I would do that.

Any help is appreciated.

Thanks,
Karlo


mfe7 commented on August 23, 2024

Hi Karlo - I see what's going on. You should be able to run the ROS node completely without feasible_actions, but I forgot to fully remove it. Can you check out the no_feasible_actions branch I just added, and let me know if that removes the error?


mfe7 commented on August 23, 2024

Very cool @rLoka!! Thanks for sharing those videos. In the second clip, it still looks a bit jumpy (stop & go), which seems sorta weird.

I suspect this trained policy won't excel in those static obstacle fields, since it was mainly trained on dynamic agents (with a small percentage of static/non-cooperative agents around as well).


mfe7 commented on August 23, 2024

Good question. Currently we are not using this topic, so I don't have any code to provide. If you are operating in an open area, you can likely ignore the safe_actions topic/concept.

The idea was to have one set of possible actions (speed-heading pairs), which are first checked by an external static obstacle collision checking script (e.g. using ROS costmap_2d), and then the "safe" ones are sent to be evaluated by the policy with respect to dynamic obstacles. This is a rough way to handle static obstacles (walls) using a policy only trained on dynamic+round obstacles.
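To make that concrete, here is a minimal sketch of the filter-then-score pipeline (is_safe is a hypothetical stand-in for a costmap_2d-based check, the action values are illustrative, and predict_p is the network query method used later in this thread):

    # Illustrative discrete action set: (speed, heading change) pairs
    CANDIDATE_ACTIONS = [(1.0, 0.0), (1.0, 0.52), (1.0, -0.52),
                         (0.5, 0.0), (0.0, 0.0)]

    def is_safe(action, costmap):
        # Hypothetical static-obstacle check: e.g. forward-simulate the action
        # for a short horizon and test the swept footprint against the costmap.
        return True  # placeholder

    def choose_safe_action(obs, costmap, net):
        probs = net.predict_p(obs, None)[0]  # policy pdf over the action set
        safe_idx = [i for i, a in enumerate(CANDIDATE_ACTIONS)
                    if is_safe(a, costmap)]
        if not safe_idx:
            return (0.0, 0.0)  # nothing is safe: stop, regardless of policy
        best = max(safe_idx, key=lambda i: probs[i])  # best among safe actions
        return CANDIDATE_ACTIONS[best]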

In this implementation, the learned policy is sorta a delta function (~99% on the best action, <1% on other actions), so if the best action according to the policy is deemed "unsafe" by the static obstacle checker, it's not clear that the 2nd best action according to the policy has much meaning. Of course, if the static obstacle checker says to stop, you should stop regardless of what the policy says about that action.

An alternative approach might be to look at the values of the states after taking an action (e.g. V(s + a*dt)), or to learn a policy that's not as extreme (e.g. by adding more entropy weight).


njhetherington commented on August 23, 2024

Thanks for getting back to me.

We do have static obstacles in our environment, so I'm trying to write a collision checking script as you've described.

I'm new to reinforcement learning - could you please explain how I would check the state values for a given action?


mfe7 commented on August 23, 2024

Take a look at the Python notebook in this repo to see how to query the policy (probability of selecting each action) given a state vector (observation of other agents).

If you add the following line to the network.py script (at line 47), it should also compute the value function:

# Value head: dense layer producing a scalar V(s) per batch element
self.logits_v = tf.squeeze(
    tf.layers.dense(inputs=self.fc1, units=1, use_bias=True,
                    activation=None, name='logits_v'),
    axis=[1])

The network.py script is where the conversion from state to policy is defined. This additional line defines the value of the current state, so to compare values of the various actions under consideration, you'll have to propagate the current state forward under each action. In RL, there is an alternative form of value function that takes both a state and an action as input (e.g. Q-learning), but we didn't use that approach here.
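As a very rough sketch of that forward propagation (heavy assumptions throughout: the obs index below is a placeholder for wherever distance-to-goal lives in the state vector, and predict_v is a hypothetical method you would add alongside predict_p to evaluate the new logits_v head):

    import numpy as np

    def propagate(obs, action, dt=0.1):
        # Crude one-step propagation of the ego state; which obs entries
        # encode distance-to-goal etc. is network-specific.
        speed, heading = action
        next_obs = obs.copy()
        next_obs[0, 0] = max(0.0, next_obs[0, 0] - speed * dt * np.cos(heading))
        return next_obs

    def best_action_by_value(obs, actions, net, dt=0.1):
        # Assumes a predict_v(obs, None) method exposing logits_v; it does
        # not exist in the repo as written.
        values = [net.predict_v(propagate(obs, a, dt), None)[0]
                  for a in actions]
        return actions[int(np.argmax(values))]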

I haven't checked the quality of the learned value function, so I'm honestly not sure how good it'll be. With the particular learning algorithm we used in this paper, the value function was learned sorta as an auxiliary measurement to help train what we really wanted: the policy (see actor-critic methods).


rLoka commented on August 23, 2024

Hey Michael,

Thanks for being so prompt! Switching to the no_feasible_actions branch did fix the Invalid Feasible Actions error.

However, another issue emerged. When a new goal is given, via RViz for instance, cadrl_node outputs the following:

Not in NN mode
2

I managed to overcome the error by changing the cbGlobalGoal method at line 112 from:

self.operation_mode.mode = self.operation_mode.SPIN_IN_PLACE

to

self.operation_mode.mode = self.operation_mode.NN

Although no further errors occurred, nn_cmd_vel publishes only zero velocity values

---
linear: 
  x: 0.0
  y: 0.0
  z: 0.0
angular: 
  x: 0.0
  y: 0.0
  z: 0.0
---

and no markers (apart from simulated agents) are published either.

Am I doing something wrong?

Thanks,
Karlo


mfe7 commented on August 23, 2024

Hmm, could we try to localize whether the issue is with the ROS code or the network query? On line 496, the line predictions = self.nn.predict_p(obs, None)[0] is the actual network query. Could you check what obs and predictions are? The next few lines should turn the predictions (a policy pdf over actions) into a more readable (speed, heading) pair.
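For instance, something like this right after the query would dump both (rospy.loginfo is just one way to print them):

    # inside the cadrl_node callback, around line 496:
    predictions = self.nn.predict_p(obs, None)[0]  # actual network query
    rospy.loginfo("obs: %s", obs)                  # state vector fed to the net
    rospy.loginfo("predictions: %s", predictions)  # policy pdf over actions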

The SPIN_IN_PLACE code was supposed to make it so that, upon receiving a new goal, the robot stops and spins until it's roughly pointing toward the goal, then switches into NN mode and uses the network queries. This was useful when testing the robot going back and forth across a room, but may not be needed in your scenario.


rLoka commented on August 23, 2024

Here is the output:

obs: [[0.         5.59617619 1.53353307 1.2        0.5        0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.         0.         0.         0.
  0.         0.         0.        ]]


predictions: [9.1763228e-01 5.9054375e-02 1.0029992e-04 9.9890160e-05 9.9890131e-05
 1.4806822e-02 1.1613100e-03 1.0508312e-04 6.1284434e-03 5.9589912e-04
 2.1576905e-04]


best action index: 0

raw_action: [ 1.         -0.52359878]

action: [ 1.2        -0.52359878]

chosen action (rel angle) 1.2 -0.5235987755982988

Update

It looks like the if self.goal.header.stamp == rospy.Time(0) condition in the cbControl method was causing the zero twist. For whatever reason, the goal header stamp was always zero.

After removing that condition, I got it to work pretty decently. The only thing left to do is adjust the nn_cmd_vel output for my drive type (which is causing the robot to wiggle, as can be seen below).

[video: wiggle]
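(As an aside on the header-stamp workaround: instead of deleting the check, stamping the goal on the publisher side should also satisfy it. A minimal sketch, assuming the goal is published from a script - the topic name and pose values are placeholders:)

    import rospy
    from geometry_msgs.msg import PoseStamped

    rospy.init_node('goal_publisher')
    pub = rospy.Publisher('/move_base_simple/goal', PoseStamped, queue_size=1)

    goal = PoseStamped()
    goal.header.frame_id = 'map'          # placeholder frame
    goal.header.stamp = rospy.Time.now()  # nonzero stamp, so the check passes
    goal.pose.position.x = 2.0            # placeholder goal pose
    goal.pose.orientation.w = 1.0
    rospy.sleep(0.5)  # give the subscriber time to connect
    pub.publish(goal)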

Update 2

Using the Jackal simulator for testing seems to give the expected results (no wiggle):
[video: no wiggle]



rLoka commented on August 23, 2024

I updated my previous post with what I did to fix it. Thanks for the help - it seems to be working now without problems.


rLoka commented on August 23, 2024

Yeah, it is still not perfect, but at least I have it set up and running. I will investigate further whether the actual drive is causing these stop-and-go motions or whether it comes down to the trained policy.


DDBarBar commented on August 23, 2024

@mfe7 @rLoka
Hi - I still don't understand how to solve this problem. It's:

The problem, as it seems, is an empty feasible_actions, which, if I understood the code correctly, should be set by a publisher on the ~safe_actions topic via the cbNNActions callback:

def cbNNActions(self, msg):
    self.feasible_actions = msg

Since it is empty, calling cbComputeActionGA3C fails with an "Invalid Feasible Actions" message.

Could you tell me how to solve it in detail? I would greatly appreciate any guidance.

Thanks,
Zhangdi


mfe7 commented on August 23, 2024

@DDBarBar did you try switching branches as noted above? I tried to get rid of everything related to feasible_actions in that branch.

