julianje / bishop

Mental state inference from observable behavior

License: MIT License

bishop's Introduction

Bishop

About

Bishop, named after Washington Bishop, is a Python 3 package for modeling theory of mind. Given some observable behavior, Bishop uses Bayesian inference over a rational model of decision making and planning under uncertainty to infer the cost and reward functions that explain the agent's choices and actions.

Install and uninstall

python setup.py install
pip uninstall Bishop

Using Bishop

The main objects in Bishop are observers. Observers are agents with a theory of mind that can infer mental states from actions, predict an agent's future actions, and simulate behaviors.

Simulate agents:

from Bishop import *	
Observer = LoadObserver("P_TT_TRE") # Load a ToM agent that is observing the P_TT_TRE map.
R = Observer.SimulateAgents(Samples=100) # Have the observer 'imagine' the behavior of 100 random agents.
R.SaveCSV("MySamples.csv") # Save costs, rewards, actions, and state transitions as a CSV file.
R.Display() # Print everything

To see a list of available maps:

ShowAvailableMaps() # Print all maps
ShowAvailableMaps("Flag") # Print maps that contain the word "Flag"

Cost-reward inference given observable actions

From the terminal

$Bishop --help
$Bishop -m P_TT_TRE -sp 0 -a "R R" -s 5000 -o MySamples -v

This command loads the map from the P_TT_TRE file (in Bishop's library) and places an agent at location 0 who took two steps to the right. It then infers the cost and reward functions using 5000 samples and stores the output in "MySamples.p".

Inside python

Obs = LoadObserver("Tatik_T1_L1") # Load a ToM agent observing the Tatik_T1_L1 map.
# InferAgent takes a sequence of actions and runs mental-state inference on them.
# The actions must be given as a list of the following movements: 'U' (up), 'D' (down), 'L' (left), 'R' (right)
# 'UL' ("up-left"; northwest diagonal), 'UR' (northeast diagonal), 'DL' (southwest diagonal), and 'DR' (southeast diagonal).
Res = Obs.InferAgent(['UL'], Samples=100, Feedback=True) #UL (Up-Left) is a diagonal move
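
The action vocabulary listed in the comments above can be checked before a sequence is handed to InferAgent. The helper below is a hypothetical sketch and not part of Bishop: the name parse_actions is made up, and accepting a space-separated string (as in the terminal example's -a "R R") is an assumption about what is convenient, not a description of Bishop's own parsing.

```python
# Hypothetical helper, not part of Bishop: validates an action sequence
# against the eight movements listed in the comments above.
VALID_ACTIONS = {"U", "D", "L", "R", "UL", "UR", "DL", "DR"}


def parse_actions(actions):
    """Accept a list of actions or a space-separated string such as "R R"."""
    if isinstance(actions, str):
        actions = actions.split()
    unknown = [a for a in actions if a not in VALID_ACTIONS]
    if unknown:
        raise ValueError("Unknown actions: " + ", ".join(unknown))
    return list(actions)
```

The returned list could then be passed as the first argument of Obs.InferAgent.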

Observer.InferAgent returns a PosteriorContainer object. This object contains the mental-state and competence inferences, as well as functions to assess the quality of the inference. Here are some things you can do with it:

Res.Summary()
Res.Summary(human=False) # Or print it in csv-format
Res.AnalyzeConvergence() # Visually check if sampling converged
Res.PlotCostPosterior()
Res.PlotRewardPosterior()
Res.LongSummary() # Do everything above.
SaveSamples(Res, "MyResults") # Bishop is sampling based, so you can store the samples with their likelihoods
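
Because the results are stored as samples with their likelihoods, posterior summaries come down to likelihood-weighted averages over those samples. The sketch below illustrates the idea on made-up numbers in plain Python. It does not use Bishop; the function name, the sample values, and the assumption that likelihoods are kept in log space are all hypothetical.

```python
import math


def weighted_mean(samples, log_likelihoods):
    """Mean of the samples, weighted by their normalized likelihoods."""
    if not samples:
        raise ValueError("No samples to summarize.")
    top = max(log_likelihoods)
    if top == float("-inf"):
        raise ValueError("All likelihoods are zero.")
    # Subtract the maximum before exponentiating to avoid underflow.
    weights = [math.exp(ll - top) for ll in log_likelihoods]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, samples)) / total


# Made-up reward samples (as if drawn from a prior) and made-up log-likelihoods.
reward_samples = [1.0, 5.0, 9.0]
log_likes = [math.log(0.1), math.log(0.3), math.log(0.6)]
expected_reward = weighted_mean(reward_samples, log_likes)
```

Samples that explain the observed actions better pull the estimate toward themselves, which is why the high-likelihood sample dominates here.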

You can reload the samples and the observer model later with

Res = LoadSamples("MyResults.p")
Obs = LoadObserverFromPC(Res)

Creating a new map

Through configuration files

A map consists of two files: An ASCII description, and a .ini configuration file.

ASCII files begin with a map drawing, with each terrain type indicated numerically. After a line break, each terrain name is specified on its own line. These are the files for the "FlagSetup" map:

FlagSetup.ini

[MapParameters]
DiagonalTravel: True
MapName: Flag_Map
# Starting point can get overridden later with Observer.SetStartingPoint()
StartingPoint: 2
ExitState: 58

[Objects]
ObjectLocations: 41 49
ObjectTypes: 0 1
ObjectNames: LTreat RTreat
# If the two treats were the same type:
# ObjectTypes: 0 0
# ObjectNames: OnlyOneNameNeeded

[AgentParameters]
Method: Linear # Determines how costs are treated.
# If linear then costs are subtracted from rewards.
# If discount then costs are treated as future discounts over rewards.
# Prior over costs and rewards.
Prior: ScaledUniform
# Force terrain 0 to be always less costly than the rest?
Restrict: False
SoftmaxChoice = False
SoftmaxAction = False
# Softmax parameters
# actionTau = 0.01
# choiceTau = 0.01 
# When different than 0 prior becomes a mixture of the
# prior above with a peak in 0. The value determines the mass on that point.
RNull = 0.2
CNull = 0
# Parameters for priors. Meaning changes depending on the prior. See docstrings
CostParameters = 1
RewardParameters = 10
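
To see how a file like this breaks down into typed values, here is a minimal sketch using Python's standard configparser on a shortened copy of the configuration. It is an illustration only: Bishop's own loader may read the file differently, and the variable names are made up. Note that the file mixes ":" and "=" as delimiters, which configparser accepts by default, while the trailing comment on the Method line needs inline_comment_prefixes to be stripped.

```python
import configparser

# Shortened copy of FlagSetup.ini, used only for this illustration.
INI_TEXT = """
[MapParameters]
DiagonalTravel: True
MapName: Flag_Map
StartingPoint: 2
ExitState: 58

[Objects]
ObjectLocations: 41 49
ObjectTypes: 0 1
ObjectNames: LTreat RTreat

[AgentParameters]
Method: Linear # Determines how costs are treated.
Restrict: False
RNull = 0.2
"""

# inline_comment_prefixes makes "Linear # ..." parse as just "Linear".
parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
parser.read_string(INI_TEXT)

diagonal = parser.getboolean("MapParameters", "DiagonalTravel")
start = parser.getint("MapParameters", "StartingPoint")
locations = [int(x) for x in parser.get("Objects", "ObjectLocations").split()]
names = parser.get("Objects", "ObjectNames").split()
method = parser.get("AgentParameters", "Method")
r_null = parser.getfloat("AgentParameters", "RNull")
```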

FlagMap

0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222

LeftTerrainName
CenterTerrainName
RightTerrainName
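
The ASCII layout described above (a numeric drawing, a blank line, then one terrain name per line) can be read with a few lines of Python. This is a sketch under that reading of the format; parse_ascii_map is a hypothetical name and Bishop's own parser may handle the file differently.

```python
# Contents of the FlagMap file shown above.
MAP_TEXT = """0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222
0000011122222

LeftTerrainName
CenterTerrainName
RightTerrainName
"""


def parse_ascii_map(text):
    """Split the file into a grid of terrain codes and a list of terrain names."""
    drawing, _, names = text.strip().partition("\n\n")
    grid = [[int(ch) for ch in row] for row in drawing.splitlines()]
    terrain_names = names.split()
    return grid, terrain_names


grid, terrain_names = parse_ascii_map(MAP_TEXT)
```

Each terrain code in the drawing indexes into the list of names, so code 0 is LeftTerrainName, 1 is CenterTerrainName, and 2 is RightTerrainName.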

Building a map inside python

Map skeleton

To generate a simple grid-world with one terrain start with

MyMap = Map()
MyMap.BuildGridWorld(5,3,Diagonal=True)

This creates a 5 by 3 map that can be navigated diagonally. Terrain type is stored in MyMap.StateTypes. The first terrain has by default a value of 0. New terrains are added through squares:

MyMap.InsertSquare(2, 1, 2, 3, 1)

This adds a 2x3 square with its top-left corner positioned at (2,1). Both coordinates start at 1, and the y-axis is counted from top to bottom. The last argument (1) gives the terrain code. Inserting overlapping squares always overwrites the earlier terrain. You can then add terrain names:

MyMap.AddTerrainNames(["Water","Jungle"])

To see what your map looks like type

MyMap.PrintMap()
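
The coordinate convention is easier to see on a toy grid. The sketch below mimics the calls above with plain nested lists instead of Bishop's Map class. It assumes that BuildGridWorld(5,3) means 5 columns by 3 rows and that the square's arguments are ordered x, y, width, height, terrain; both are guesses from the description, and the function names are made up.

```python
def build_grid(width, height, terrain=0):
    """Grid of terrain codes stored as rows, top row first."""
    return [[terrain] * width for _ in range(height)]


def insert_square(grid, x, y, width, height, terrain):
    """Overwrite a width-by-height block; (x, y) is the 1-based top-left corner."""
    for row in range(y - 1, y - 1 + height):
        for col in range(x - 1, x - 1 + width):
            grid[row][col] = terrain


grid = build_grid(5, 3)
insert_square(grid, 2, 1, 2, 3, 1)  # Same arguments as the InsertSquare call above.
for row in grid:
    print("".join(str(cell) for cell in row))
```

A later call that overlaps the same cells would simply replace the codes written here, matching the overwrite behavior described above.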

Adding starting point, exit point, and objects

See docstrings for

MyMap.AddStartingPoint()
MyMap.AddExitState()
MyMap.InsertObjects()

Using the map

Once you have a map, create an agent and use both to create an observer:

MyAgent = Agent(MyMap, CostPrior, RewardPrior, CostPriorParameters, RewardPriorParameters)
MyObserver = Observer(MyMap, MyAgent)

See the docstring of Agent's constructor for the full list of parameters an agent can take and for more details.

bishop's Issues

Observer.InferAgent()

Add incomplete path support.

Incomplete path bug

from Bishop import *
O = LoadEnvironment("Tatik_T1_L1")
O.InferAgent([1],300)

New function to quickly print summary of samples

Have another function that just summarizes everything from a saved set of samples. Similar to AuxiliaryFunctions.LoadSamples()

Capacity check not testing if path is incomplete

In [3]: O = LoadObserver("Flag_Asym_TwoR")
No method. Setting to Linear.

Action space: ['L', 'R', 'U', 'D', 'UL', 'UR', 'DL', 'DR']
Targets: [[4, 3], [14, 3]]
Exit state: [9, 4]

Terrains: Center Left Right
Map labels: Exit state (E), starting point (S), LTreat(0), RTreat(1)

_0S_1
_E_

In [4]: R = O.Infe
O.InferAgent O.InferAgent_ImportanceSampling
O.InferAgentUsingPC O.InferAgent_MCMC

In [4]: R = O.InferAgent(['R','R','R','R'],500,1)

Progress | | 0.0%ERROR: Number of objects agent collected is outside the range specified in the map.
ERROR: Failed to compute likelihood. OBSERVER-001

PosteriorContainer.AnalyzeConvergence() bug

Res.LongSummary()
Results using 500 samples.

INFERRED REWARDS

Target A: 1.43754457737
Target B: 15.9720887129
Probability that R(A)>R(B): 0.0

GOAL PREDICTIONS

Probability that agent will get target A: 0.0
Probability that agent will get target B: 1.0

INFERRED COSTS

Mud: 0.023535290877
Jungle: 0.0991173686504
Water: 0.163322128548
['Mud', 'Jungle', 'Water']
[[ 1.00000000e+00 2.34846763e-65 2.34846763e-65]
[ 1.00000000e+00 1.00000000e+00 3.41713708e-05]
[ 1.00000000e+00 9.99965829e-01 1.00000000e+00]]
Traceback (most recent call last):
File "", line 1, in
File "Bishop/PosteriorContainer.py", line 33, in LongSummary
self.AnalyzeConvergence()
File "Bishop/PosteriorContainer.py", line 166, in AnalyzeConvergence
rangevals=range(0,self.Samples,jump)
TypeError: range() integer step argument expected, got float.

Res.AnalyzeConvergence()
Traceback (most recent call last):
File "", line 1, in
File "Bishop/PosteriorContainer.py", line 166, in AnalyzeConvergence
rangevals=range(0,self.Samples,jump)
TypeError: range() integer step argument expected, got float.
Res.AnalyzeConvergence(2)
Traceback (most recent call last):
File "", line 1, in
File "Bishop/PosteriorContainer.py", line 175, in AnalyzeConvergence
axarr[i,0].plot(xvals,ycostvals[:,i], 'b-')
File "/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 1373, in plot
for line in self._get_lines(_args, *_kwargs):
File "/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 303, in _grab_next_args
for seg in self._plot_args(remaining, kwargs):
File "/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 281, in _plot_args
x, y = self._xy_from_xy(x, y)
File "/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 223, in _xy_from_xy
raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
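
The first two tracebacks show range() receiving a float step. One possible fix is to coerce the step to a positive integer before building the index list. The function name and the default step below are hypothetical, since the code that computes jump is not shown in the report.

```python
def sample_checkpoints(n_samples, jump=None):
    """Sample indices at which to recompute expectations.

    jump is coerced to an integer of at least 1, so a float step
    no longer raises a TypeError in range().
    """
    if jump is None:
        jump = n_samples / 20  # Hypothetical default; note that this is a float.
    step = max(1, int(round(jump)))
    return list(range(0, n_samples, step))
```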

Allow goal predictions to remove conditioning

Observer.InferAgents() now conditions the goal predictions on the actions. But both predictions are useful. The unconditioned prediction tells you what the agent would do on another day, and the conditioned one tells you what they'll finish doing today. I've switched to the conditioned prediction for now, but switching between the two should be easier.

TestModel bug

Using FlagMap

Are maps getting copied?

It looks like Tatik_TX maps aren't transferred after installation

Planner.BuildCostFunction() missing documentation

Check if DeadState is vestigial code.

Check that diagonal off works

Test if turning diagonal travel off is working with the expanded MDP reward function

Maybe bug in prior refreshing?

Five simulations in a row exceeded the time limit; the following 995 all succeeded.

In [1]: run BuildInitialDistributions.py
Using a linear utility function (Add a Method in the AgentParameters block to change to 'Rate' utilities).
Setting restrict to false (i.e., uncertainty over which terrain is the easiest)
No organic markers. Treating all objects as dead. Add an Organic line to mark if some object types are agents (add probability of death).

Action space: ['L', 'R', 'U', 'D', 'UL', 'UR', 'DL', 'DR']
Targets: [[6, 1], [6, 5]]
Exit state: [7, 3]

Terrains: MainTerrain TopTerrain BottomTerrain Walls
Items: RedTreat YellowTreat
Map labels: Exit state (E), starting point (S), RedTreat(0), YellowTreat(1)

****0

S*****E

****1

Progress | | 1.1%ERROR: Simulation exceeded timelimit. PLANNER-009
ERROR: Simulation exceeded timelimit. PLANNER-009
ERROR: Simulation exceeded timelimit. PLANNER-009
ERROR: Simulation exceeded timelimit. PLANNER-009
ERROR: Simulation exceeded timelimit. PLANNER-009
ERROR: Simulation exceeded timelimit. PLANNER-009
Progress |████████████████████| 100.0%

ShowAvailableMaps() needs to search on folders recursively.
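
A recursive search could be built on os.walk. The sketch below is not Bishop's implementation: find_maps is a made-up name, and it assumes a map can be identified by its .ini file, following the two-file layout described in the README.

```python
import os


def find_maps(root, match=""):
    """Return the names of all .ini files under root, searching subfolders too."""
    found = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            stem, ext = os.path.splitext(name)
            # Keep only configuration files whose name contains the filter text.
            if ext == ".ini" and match in stem:
                found.append(stem)
    return sorted(found)
```

Calling find_maps(root, "Flag") would mirror ShowAvailableMaps("Flag"), but across nested folders.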

StartingPoint is absolute

bishop.Observer.SimulateAgents takes StartingPoint as the raw state number. It needs to get switched to cartesian coordinates.
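
The conversion itself is small. The sketch below assumes states are numbered from 0 in row-major order (left to right, then top to bottom) and that coordinates start at 1 with y counted from the top, as in the map-building section. Whether Bishop numbers states this way is an assumption, and both function names are hypothetical.

```python
def state_to_xy(state, width):
    """Convert a 0-based, row-major state number to 1-based (x, y)."""
    row, col = divmod(state, width)
    return col + 1, row + 1


def xy_to_state(x, y, width):
    """Convert 1-based (x, y) back to a 0-based, row-major state number."""
    return (y - 1) * width + (x - 1)
```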

Observer.SimulateAgents needs to print information about samples

Observer.InferAgent() bug

R = O.InferAgent(['UR','DR','UL','L','L','UR','UL','L','L','UR','UR'],1000,True)

Progress | | 0.0%
ERROR: New states do not align with new actions. PLANNER-012
ERROR: Failed to compute likelihood. OBSERVER-001

Null reward bug

If you force a reward source to be null the utility computation produces a constant cost function (!!??)

Remove Null mixture properties and add them to the Resample() method

Add tatik running script

Make a script that loads the tatik maps and the paths. Similar to the inference.py scripts in OM.

InferAgent time deadline

Have Observer.InferAgent() also run with a time limit or a convergence test.

InferAgent bug.

If the path doesn't align with the map's minimum and capacity parameters, the code breaks. Need to add code that makes sure the path is valid given the object collection constraints.

In [84]: O = LoadObserver("P_TT_TRE")

Action space: ['L', 'R', 'U', 'D', 'UL', 'UR', 'DL', 'DR']
Targets: [[4, 1], [4, 7]]
Exit state: [7, 4]

Terrains: Outside Inside
Map labels: Exit state (E), starting point (S), TopTreat(0), BottomTreat(1)

_0_

S*****E

_1_

In [86]: R = O.InferAgent(['R','R','R','R','R','R'],500,1)

Progress | | 0.2%---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 R = O.InferAgent(['R','R','R','R','R','R'],500,1)

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/Bishop-2.5.0-py2.7.egg/Bishop/Observer.pyc in InferAgent(self, ActionSequence, Samples, Feedback, Method)
180 ActionSequence = self.GetActionIDs(ActionSequence)
181 if Method == "Importance":
--> 182 return self.InferAgent_ImportanceSampling(ActionSequence, Samples, Feedback)
183 if Method == "MCMC":
184 return self.InferAgent_MCMC(ActionSequence, Samples, Feedback)

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/Bishop-2.5.0-py2.7.egg/Bishop/Observer.pyc in InferAgent_ImportanceSampling(self, ActionSequence, Samples, Feedback)
282 self.Plr.Prepare(self.Validate)
283 # Get log-likelihood
--> 284 LogLikelihoods[i] = self.Plr.Likelihood(ActionSequence)
285 # If anything went wrong just stope
286 if LogLikelihoods[i] is None:

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/Bishop-2.5.0-py2.7.egg/Bishop/Planner.pyc in Likelihood(self, ActionSequence)
407 # Find all action sequences that are consistent with the observations:
408 if Complete:
--> 409 goalindex = [self.goalindices.index(objectscollected)]
410 else:
411 # If goal is incomplete then select all plans

ValueError: [] is not in list

In [87]: O.Plr.goalindices
Out[87]: [[0], [1]]
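
A validity check of the kind requested could run before the likelihood computation for paths that are treated as complete. The sketch below is hypothetical: the function and argument names are made up, and it only mirrors the values visible in the session above (an empty list of collected objects against goal indices [[0], [1]]).

```python
def validate_collection(objects_collected, goal_indices, minimum, capacity):
    """Return None if a complete path is usable, otherwise a short error message.

    All names here are illustrative; Bishop's internals may differ.
    """
    count = len(objects_collected)
    if not minimum <= count <= capacity:
        return "Path collects %d objects; expected between %d and %d." % (
            count, minimum, capacity)
    if sorted(objects_collected) not in goal_indices:
        return "Collected objects %s match no known goal." % objects_collected
    return None
```

Returning a message instead of raising lets the caller report the problem once and stop sampling, rather than failing inside the likelihood computation.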

Ensure cost sample range is bounded when using discount method

Have Planner.Validate() generate some random cost samples and check that they're constrained to [0,1] when Planner.Method is set to discount.
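
Such a check might look like the sketch below. It takes any zero-argument function that returns a list of cost samples, so it does not depend on Bishop's classes. The function name and the two stand-in samplers are hypothetical; a real check would call the agent's own cost prior.

```python
import random


def costs_within_unit_interval(sample_costs, draws=1000):
    """Draw cost vectors repeatedly and report whether every value lies in [0, 1]."""
    for _ in range(draws):
        if any(c < 0.0 or c > 1.0 for c in sample_costs()):
            return False
    return True


# Stand-in samplers for illustration only.
rng = random.Random(0)


def bounded():
    return [rng.random() for _ in range(3)]


def unbounded():
    return [rng.uniform(0.0, 10.0) for _ in range(3)]
```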

Allow for observer to take a PosteriorContainer object and reuse the samples with the posteriors as the priors

PosteriorContainer.AnalyzeConvergence() bug

Obs = LoadObserver("Tatik_T1_L1")
Res = Obs.InferAgent(['UL'],100,True)
Res.AnalyzeConvergence()
In [5]: Res.AnalyzeConvergence()
Recomputing expected value after every sample
WARNING: All likelihoods are zero up to this point. POSTERIORCONTAINER-001
WARNING: All likelihoods are zero up to this point. POSTERIORCONTAINER-001
WARNING: All likelihoods are zero up to this point. POSTERIORCONTAINER-001

WARNING: All likelihoods are zero up to this point. POSTERIORCONTAINER-001

IndexError Traceback (most recent call last)
in ()
----> 1 Res.AnalyzeConvergence()

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/Bishop-2.5.0-py2.7.egg/Bishop/PosteriorContainer.pyc in AnalyzeConvergence(self, jump)
464 jump = 1
465 rangevals = range(0, self.Samples, jump)
--> 466 ycostvals = [self.GetExpectedCosts(i) for i in rangevals]
467 ycostvals = np.array(ycostvals)
468 yrewardvals = [self.GetExpectedRewards(i) for i in rangevals]

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/Bishop-2.5.0-py2.7.egg/Bishop/PosteriorContainer.pyc in GetExpectedCosts(self, limit)
184 a = self.CostSamples[0:(limit + 1), i]
185 b = NL
--> 186 res = sum([float(a[i]) * float(b[i]) for i in range(limit + 1)])
187 ExpectedCosts.append(res)
188 return ExpectedCosts

IndexError: list index out of range

Move AuxiliaryFunctions.SaveSamples() into PosteriorContainer.SaveSamples()

Observer object has no attribute 'M'

obs =bishop.LoadEnvironment('Tatik_T1_l1')
obs.GetSemantics()

PosteriorContainer to do

Add methods for saving and loading posteriorcontainer objects as csv files. DONE
Add functions for belief in qualitative differences (e.g., Probability that A is better than B). DONE
Add human-readable summary function. DONE
Have posterior container store more meta-data, such as map name, etc. DONE
Make PlotCostPosterior grid over a binwidth. DONE
Add print summary as csv line. DONE

LoadObserver on revision mode should ignore tau parameters when Softmax is off

Even when softmax is off for both actions and choices, LoadObserver still asks you to confirm tau values if it finds them in the .ini file.

TestModel bug

In [7]: O = LoadObserver("Flag_Asym_TwoR_DistB")

Action space: ['L', 'R', 'U', 'D', 'UL', 'UR', 'DL', 'DR']
Targets: [[1, 2], [12, 2]]
Exit state: [9, 3]

Terrains: Left Center Right
Map labels: Exit state (E), starting point (S), LTreat(0), RTreat(1)

0******S__1*
***_E_

In [8]: O.TestModel(20,100,0,1)
Simulating agents...
Progress |████████████████████| 100.0%

Running inference...
Inferring agent 20 |███████████████████ | 95.0%---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 O.TestModel(20,100,0,1)

/Users/julianjara-ettinger/Documents/Projects/Models/Bishop/Bishop/Observer.py in TestModel(self, Simulations, Samples, Return, Verbose)
69 sys.stdout.write("| " + str(Percentage) + "%")
70 sys.stdout.flush()
---> 71 Results = self.InferAgent(Agents.Actions[i], Samples)
72 InferredCosts[i] = Results.GetExpectedCosts()
73 InferredRewards[i] = Results.GetExpectedRewards()

/Users/julianjara-ettinger/Documents/Projects/Models/Bishop/Bishop/Observer.py in InferAgent(self, ActionSequence, Samples, Feedback, Method)
168 """
169 Compute a series of samples with their likelihoods.
--> 170
171 Args:
172 ActionSequence (list): Sequence of actions

/Users/julianjara-ettinger/Documents/Projects/Models/Bishop/Bishop/Observer.py in InferAgent_ImportanceSampling(self, ActionSequence, Samples, Feedback)
221 sys.stdout.write(" " * (20 - roundper))
222 sys.stdout.write("| " + str(Percentage) + "%")
--> 223 sys.stdout.flush()
224 # Propose a new sample
225 self.Plr.Agent.ResampleAgent()

/Users/julianjara-ettinger/Documents/Projects/Models/Bishop/Bishop/Planner.py in Likelihood(self, ActionSequence)
463 else:
464 LogLikelihoodTerms[i] = -sys.maxint - 1
--> 465 LogLikelihood = scipy.misc.logsumexp(LogLikelihoodTerms)
466 return LogLikelihood
467

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/scipy/misc/common.pyc in logsumexp(a, axis, b)
74 else:
75 a = rollaxis(a, axis)
---> 76 a_max = a.max(axis=0)
77 if b is not None:
78 b = asarray(b)

/Users/julianjara-ettinger/anaconda/lib/python2.7/site-packages/numpy/core/_methods.pyc in _amax(a, axis, out, keepdims)
24 # small reductions
25 def _amax(a, axis=None, out=None, keepdims=False):
---> 26 return umr_maximum(a, axis, None, out, keepdims)
27
28 def _amin(a, axis=None, out=None, keepdims=False):

ValueError: zero-size array to reduction operation maximum which has no identity
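
The last frames show logsumexp being called on an empty array. One way to guard against that is a log-sum-exp that handles empty input explicitly, sketched below in plain Python. Returning negative infinity (a likelihood of zero) for an empty input is an assumption about the desired behavior, and safe_logsumexp is a made-up name.

```python
import math


def safe_logsumexp(log_terms):
    """Compute log(sum(exp(t))); returns -inf for empty or all -inf input."""
    finite = [t for t in log_terms if t != float("-inf")]
    if not finite:
        return float("-inf")
    top = max(finite)
    # Factor out the largest term so the exponentials cannot overflow.
    return top + math.log(sum(math.exp(t - top) for t in finite))
```

Using float("-inf") for impossible terms also avoids the -sys.maxint - 1 placeholder seen in the traceback, which is unavailable in Python 3 because sys.maxint was removed.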