Is there a way to find out the connection between actions and state fluents (SF)? Take the DBN of the grounded fluent robot-at(x4,y6) as an example: the next state depends on whether robot-at(x4,y5) ^ move-north holds, robot-at(x3,y6) ^ move-east holds, or the robot has already arrived at the goal. As a graphical representation I mean something like:
robot-at(x4,y5) + move-north --> robot-at(x4,y6)
robot-at(x3,y6) + move-east --> robot-at(x4,y6)
robot-at(x4,y6) --> robot-at(x4,y6)
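For this domain, those edges can be read directly off the third branch of the cpfs block in the file below: robot-at(?x,?y2) ^ move-north is a parent of robot-at'(?x,?y) exactly when the instance declares NORTH(?y2,?y), and likewise for the other three directions; the first branch contributes the self-loop once the goal is reached. A minimal Python sketch of that idea follows. It hard-codes the adjacency non-fluents from nf_navigation_inst_mdp__0 rather than parsing the RDDL file, and dbn_edges is my own helper name, so this is an illustration of the grounding, not a general tool.

# Adjacency non-fluents, hard-coded from nf_navigation_inst_mdp__0 below
# (a real tool would extract these from the grounded RDDL instance).
NORTH = {("y1","y2"), ("y2","y3"), ("y3","y4"), ("y4","y5"), ("y5","y6")}
EAST  = {("x1","x2"), ("x2","x3"), ("x3","x4")}
SOUTH = {(b, a) for (a, b) in NORTH}
WEST  = {(b, a) for (a, b) in EAST}

def dbn_edges(x, y):
    """Parents of robot-at'(x,y) in the DBN, read off branch 3 of the CPF:
    robot-at(?x,?y2) ^ move-north is a parent iff NORTH(?y2,?y) holds, etc."""
    for src, dst in NORTH:
        if dst == y:
            yield f"robot-at({x},{src})", "move-north"
    for src, dst in SOUTH:
        if dst == y:
            yield f"robot-at({x},{src})", "move-south"
    for src, dst in EAST:
        if dst == x:
            yield f"robot-at({src},{y})", "move-east"
    for src, dst in WEST:
        if dst == x:
            yield f"robot-at({src},{y})", "move-west"

for parent, action in dbn_edges("x4", "y6"):
    print(f"{parent} + {action} --> robot-at(x4,y6)")
# Branch 1 of the CPF (the absorbing goal) contributes the self-loop:
print("robot-at(x4,y6) --> robot-at(x4,y6)")

Running this prints exactly the three edges listed above.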
////////////////////////////////////////////////////////////////////
//
// Navigation MDP
//
// Author: Scott Sanner (ssanner [at] gmail.com)
//
// In a grid, a robot (R) must get to a goal (G). Every cell offers
// the robot a (different) chance of disappearing. The robot needs
// to choose a path which gets it to the goal most reliably within
// the finite horizon time.
//
// ***********************
// * 0 0 0 0 R *
// * .1 .3 .5 .7 .9 *
// * .1 .3 .5 .7 .9 *
// * .1 .3 .5 .7 .9 *
// * .1 .3 .5 .7 .9 *
// * 0 0 0 0 G *
// ***********************
//
// This is a good domain to test determinized planners because
// one can see here that the path using the .3 chance of failure
// leads to a most-likely single-step outcome of survival, but
// a poor 4-step chance of survival (.7^4 ~= .24), whereas the path
// using a .1 chance of failure is much safer.
//
// The domain generators for navigation have a flag to produce slightly
// obfuscated files to discourage hand-coded policies, but
// rddl.viz.NavigationDisplay can properly display these grids, e.g.,
//
// ./run rddl.sim.Simulator files/final_comp/rddl rddl.policy.RandomBoolPolicy
// navigation_inst_mdp__1 rddl.viz.NavigationDisplay
//
// (Movement was not made stochastic due to the lack of intermediate
// variables and synchronic arcs to support both the PPDDL and SPUDD
// translations.)
//
////////////////////////////////////////////////////////////////////
domain navigation_mdp {

    requirements = {
        // constrained-state,
        reward-deterministic
    };

    types {
        xpos : object;
        ypos : object;
    };
    pvariables {

        // Non-fluents: grid topology and per-cell disappearance probabilities
        NORTH(ypos, ypos) : {non-fluent, bool, default = false};
        SOUTH(ypos, ypos) : {non-fluent, bool, default = false};
        EAST(xpos, xpos)  : {non-fluent, bool, default = false};
        WEST(xpos, xpos)  : {non-fluent, bool, default = false};

        MIN-XPOS(xpos) : {non-fluent, bool, default = false};
        MAX-XPOS(xpos) : {non-fluent, bool, default = false};
        MIN-YPOS(ypos) : {non-fluent, bool, default = false};
        MAX-YPOS(ypos) : {non-fluent, bool, default = false};

        P(xpos, ypos)    : {non-fluent, real, default = 0.0};
        GOAL(xpos, ypos) : {non-fluent, bool, default = false};

        // Fluents
        robot-at(xpos, ypos) : {state-fluent, bool, default = false};

        // Actions
        move-north : {action-fluent, bool, default = false};
        move-south : {action-fluent, bool, default = false};
        move-east  : {action-fluent, bool, default = false};
        move-west  : {action-fluent, bool, default = false};
    };
    cpfs {

        robot-at'(?x,?y) =

            // Branch 1: the goal is absorbing; once reached, stay there.
            if ( GOAL(?x,?y) ^ robot-at(?x,?y) )
                then KronDelta(true)

            // Branch 2: this cell becomes empty, either because the goal
            // was reached (somewhere) or the robot moves out of (?x,?y).
            else if (( exists_{?x2 : xpos, ?y2 : ypos} [ GOAL(?x2,?y2) ^ robot-at(?x2,?y2) ] )
                     | ( move-north ^ exists_{?y2 : ypos} [ NORTH(?y,?y2) ^ robot-at(?x,?y) ] )
                     | ( move-south ^ exists_{?y2 : ypos} [ SOUTH(?y,?y2) ^ robot-at(?x,?y) ] )
                     | ( move-east  ^ exists_{?x2 : xpos} [ EAST(?x,?x2) ^ robot-at(?x,?y) ] )
                     | ( move-west  ^ exists_{?x2 : xpos} [ WEST(?x,?x2) ^ robot-at(?x,?y) ] ))
                then KronDelta(false)

            // Branch 3: the robot moves into (?x,?y) from an adjacent cell
            // and survives the new cell with probability 1 - P(?x,?y).
            else if (( move-north ^ exists_{?y2 : ypos} [ NORTH(?y2,?y) ^ robot-at(?x,?y2) ] )
                     | ( move-south ^ exists_{?y2 : ypos} [ SOUTH(?y2,?y) ^ robot-at(?x,?y2) ] )
                     | ( move-east  ^ exists_{?x2 : xpos} [ EAST(?x2,?x) ^ robot-at(?x2,?y) ] )
                     | ( move-west  ^ exists_{?x2 : xpos} [ WEST(?x2,?x) ^ robot-at(?x2,?y) ] ))
                then Bernoulli( 1.0 - P(?x,?y) )

            // Branch 4: otherwise the fluent persists.
            else KronDelta( robot-at(?x,?y) );
    };
    // 0 reward for reaching goal, -1 in all other cases
    reward = [sum_{?x : xpos, ?y : ypos} -(GOAL(?x,?y) ^ ~robot-at(?x,?y))];
}
non-fluents nf_navigation_inst_mdp__0 {

    domain = navigation_mdp;

    objects {
        xpos : {x1, x2, x3, x4};
        ypos : {y1, y2, y3, y4, y5, y6};
    };

    non-fluents {
        NORTH(y1,y2); SOUTH(y2,y1);
        NORTH(y2,y3); SOUTH(y3,y2);
        NORTH(y3,y4); SOUTH(y4,y3);
        NORTH(y4,y5); SOUTH(y5,y4);
        NORTH(y5,y6); SOUTH(y6,y5);

        EAST(x1,x2); WEST(x2,x1);
        EAST(x2,x3); WEST(x3,x2);
        EAST(x3,x4); WEST(x4,x3);

        MIN-XPOS(x1); MAX-XPOS(x4);
        MIN-YPOS(y1); MAX-YPOS(y6);

        GOAL(x4,y6);

        P(x1,y2) = 0.05619587033111525;
        P(x1,y3) = 0.04836146267135677;
        P(x1,y4) = 0.04224611192322229;
        P(x1,y5) = 0.027581761688785046;
        P(x2,y2) = 0.3290676024238835;
        P(x2,y3) = 0.31491139267692797;
        P(x2,y4) = 0.3367653188138302;
        P(x2,y5) = 0.33334257087090663;
        P(x3,y2) = 0.6136325686764752;
        P(x3,y3) = 0.6107738967872153;
        P(x3,y4) = 0.6151765929039653;
        P(x3,y5) = 0.6120255439526258;
        P(x4,y2) = 0.9491586061795277;
        P(x4,y3) = 0.9140435651595797;
        P(x4,y4) = 0.9300722126485798;
        P(x4,y5) = 0.920351061739235;
    };
}
instance navigation_inst_mdp__0 {
    domain = navigation_mdp;
    non-fluents = nf_navigation_inst_mdp__0;
    init-state {
        robot-at(x4,y1);
    };
    max-nondef-actions = 1;
    horizon = 40;
    discount = 1.0;
}
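To make the grounding concrete for robot-at'(x4,y6): the only adjacency facts pointing into that cell are NORTH(y5,y6) and EAST(x3,x4), so branch 3 of the CPF reduces to (move-north ^ robot-at(x4,y5)) | (move-east ^ robot-at(x3,y6)); and since P(x4,y6) is never set in the non-fluents block, its default 0.0 applies and Bernoulli(1.0 - 0.0) is deterministically true. Here is a hand-grounded sketch of that CPF as a transition probability (the function name and state encoding are mine, not part of RDDL):

P_X4_Y6 = 0.0  # P(x4,y6) keeps its default, so moving in always succeeds

def prob_robot_at_x4_y6_next(robot_at, action):
    """P(robot-at'(x4,y6) = true | state, action); robot_at maps (x,y)
    pairs to bools, action is one of the four move action-fluents."""
    # Branch 1: GOAL(x4,y6) ^ robot-at(x4,y6), the goal is absorbing.
    if robot_at[("x4", "y6")]:
        return 1.0
    # Branch 2 (KronDelta(false)) is vacuous here: robot-at(x4,y6) is
    # false, so neither "goal reached" nor "moved away" can fire.
    # Branch 3: step into (x4,y6) via NORTH(y5,y6) or EAST(x3,x4) and
    # survive the new cell with probability 1 - P(x4,y6).
    if action == "move-north" and robot_at[("x4", "y5")]:
        return 1.0 - P_X4_Y6
    if action == "move-east" and robot_at[("x3", "y6")]:
        return 1.0 - P_X4_Y6
    # Branch 4: persistence; robot-at(x4,y6) stays false.
    return 0.0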