
pymdp's Introduction

A Python package for simulating Active Inference agents in Markov Decision Process environments. Please see our companion paper, published in the Journal of Open Source Software: "pymdp: A Python library for active inference in discrete state spaces", for an overview of the package and its motivation. For a more in-depth, tutorial-style introduction to the package and a mathematical overview of active inference in Markov Decision Processes, see the longer arXiv version of the paper.

This package is hosted on the infer-actively GitHub organization, which was built with the intention of hosting open-source active inference and free-energy-principle related software.

Most of the low-level mathematical operations are NumPy ports of their equivalent functions from the SPM implementation in MATLAB. We have benchmarked and validated most of these functions against their SPM counterparts.

Status

(Badges: build status, PyPI version, documentation status, DOI)

pymdp in action

Here's a visualization of pymdp agents in action. One of the defining features of active inference agents is the drive to maximize "epistemic value" (i.e. curiosity). Equipped with such a drive in environments with uncertain yet disclosable hidden structure, active inference agents can simultaneously learn about the environment and maximize reward.

The simulation below (see the associated notebook here) demonstrates what might be called "epistemic chaining," where an agent (here, analogized to a mouse seeking food) forages for a chain of cues, each of which discloses the location of the subsequent cue in the chain. The final cue (here, "Cue 2") reveals the location of a hidden reward. This is similar in spirit to the "behavior chaining" used in operant conditioning, except that here, each successive action in the behavioral sequence doesn't need to be learned through instrumental conditioning. Rather, active inference agents will naturally forage along the sequence of cues based on an intrinsic desire to disclose information. This ultimately leads the agent to the hidden reward source in as few moves as possible.

You can run the code behind simulating tasks like this one and others in the Examples section of the official documentation.


Cue 2 in Location 1, Reward on Top


Cue 2 in Location 3, Reward on Bottom

Quick-start: Installation and Usage

In order to use pymdp to build and develop active inference agents, we recommend installing it with the package installer pip, which will install pymdp locally along with its dependencies. This can also be done in a virtual environment (e.g. with venv).

When pip installing pymdp, use the package name inferactively-pymdp:

pip install inferactively-pymdp

Once in Python, you can then directly import pymdp, its sub-packages, and functions.

import pymdp
from pymdp import utils
from pymdp.agent import Agent

num_obs = [3, 5] # observation modality dimensions
num_states = [3, 2, 2] # hidden state factor dimensions
num_controls = [3, 1, 1] # control state factor dimensions
A_matrix = utils.random_A_matrix(num_obs, num_states) # create sensory likelihood (A matrix)
B_matrix = utils.random_B_matrix(num_states, num_controls) # create transition likelihood (B matrix)

C_vector = utils.obj_array_uniform(num_obs) # uniform preferences

# instantiate a quick agent using your A, B and C arrays
my_agent = Agent(A=A_matrix, B=B_matrix, C=C_vector)

# give the agent a random observation and get the optimized posterior beliefs

observation = [1, 4] # a list specifying the indices of the observation, for each observation modality

qs = my_agent.infer_states(observation) # get posterior over hidden states (a multi-factor belief)

# Do active inference

q_pi, neg_efe = my_agent.infer_policies() # get the posterior over policies and the (negative) expected free energy of each policy

action = my_agent.sample_action() # sample an action

# ... and so on ...
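To continue, you would feed the sampled action to your environment, collect the next observation, and repeat. Below is a minimal sketch of such a rollout loop; the horizon T and the randomly generated observations are placeholders for a real environment, not part of the package API.

import numpy as np

T = 10  # illustrative number of timesteps

for t in range(T):
    # in a real task, this observation would come from your environment, conditioned on the last action
    observation = [np.random.randint(dim) for dim in num_obs]

    qs = my_agent.infer_states(observation)    # update beliefs about hidden states
    q_pi, neg_efe = my_agent.infer_policies()  # evaluate policies via (negative) expected free energy
    action = my_agent.sample_action()          # sample the next action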

Getting started / introductory material

We recommend starting with the Installation/Usage section of the official documentation for the repository, which provides a series of useful pedagogical notebooks for introducing you to active inference and how to build agents in pymdp.

For new users of pymdp, we specifically recommend stepping through the following three Jupyter notebooks (which can also be run on Google Colab):

Special thanks to Beren Millidge and Daphne Demekas for their help in prototyping earlier versions of the Active Inference from Scratch tutorial, which were originally based on a grid world POMDP environment created by Alec Tschantz.

We also have (and are continuing to build) a series of notebooks that walk through active inference agents performing different types of tasks, such as the classic T-Maze environment and the newer Epistemic Chaining demo.

Contributing

This package is under active development. If you would like to contribute, please refer to this file.

If you would like to contribute to this repo, we recommend using venv and pip:

cd <path_to_repo_fork>
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
pip install -e ./ # This will install pymdp as a local dev package

You should then be able to run the tests locally with pytest:

pytest test

Citing pymdp

If you use pymdp in your work or research, please consider citing our paper (open access), published in the Journal of Open Source Software:

@article{Heins2022,
  doi = {10.21105/joss.04098},
  url = {https://doi.org/10.21105/joss.04098},
  year = {2022},
  publisher = {The Open Journal},
  volume = {7},
  number = {73},
  pages = {4098},
  author = {Conor Heins and Beren Millidge and Daphne Demekas and Brennan Klein and Karl Friston and Iain D. Couzin and Alexander Tschantz},
  title = {pymdp: A Python library for active inference in discrete state spaces},
  journal = {Journal of Open Source Software}
}

For a more in-depth, tutorial-style introduction to the package and a mathematical overview of active inference in Markov Decision Processes, you can also consult the longer arXiv version of the paper.

Authors

pymdp's People

Contributors

alec-tschantz, arun-niranjan, berenmillidge, conorheins, dependabot[bot], dimarkov, mahault, pitmonticone, ran-wei-verses, swauthier


pymdp's Issues

Sampling action when using deep policy

Hi, thank you for this package; I'm really enjoying learning about active inference.

I deeply appreciate the contributors to this package.

However, I have a question that came up when I tried to implement the explore-exploit task from this article (Smith et al., A step-by-step tutorial on active inference and its application to empirical data, https://doi.org/10.1016/j.jmp.2021.102632), which is already implemented in MATLAB and "pymdp".

I tried to run a loop for active inference with a deep policy (two timesteps), following the "complete recipe for active inference" written in the "pymdp" tutorial notebook, but I found that the "sample_action" method of the "Agent" class only samples the action from the first timestep of each policy (each policy has shape (2, 2), where the first dim is the number of timesteps and the second dim is the number of factors), using the "control.sample_policy" function as below:

(line 674-675, control.py)

for factor_i in range(num_factors):
     selected_policy[factor_i] = policies[policy_idx][0, factor_i]

My setting of the agent class was:

timepoints = [0,1,2]
agent = Agent(
    A = A_gm,
    B = B,
    C = C,
    D = D_gm,
    E = E,
    pA = pA,
    pD = pD,
    policies = policies,
    policy_len = policies[0].shape[0],
    inference_horizon = len(timepoints),  
    inference_algo="MMP",
    sampling_mode="full",
    modalities_to_learn=[1],
    use_BMA = True,
    policy_sep_prior = False,
)

In my view, to sample the action for the other timesteps in each policy, line 675 would be better changed to:

selected_policy[factor_i] = policies[policy_idx][timestep, factor_i]

If I haven't understood this package well, please let me know how to correct my understanding.

Thank you!

Habit learning and precision

Hello!
I couldn't find an implementation of habit learning (updating of the prior over policies, E) or precision (gamma) updating through a beta prior.
Am I missing it? And if not, is it planned for future releases?
Thanks!

Better error messages for MMP

Add better error/warning messages when you try to initialise an Agent() instance with the MMP inference algorithm (inference_algo="MMP"), but that choice doesn't make sense (e.g. inconsistent choices of policy_depth, etc.).

Polish tutorial 2

I really enjoyed this tutorial, overall. It shows the power of Agent. I think it's a little hard to read in some places, and I wanted to flag some of these issues:

  1. It's a non-standard bandit task, because there's a third action, HINT. It would be nice to indicate this in the intro - or to refer to a version of the task which includes the hint action, because there's no indication in the only reference, which is the Wikipedia link.
  2. Going from tutorial 1 to tutorial 2, there's an unexplained shift in the definition of A. In tutorial 1, it was a two-dimensional array p(o|s), but now it's a four-dimensional array. I realized a bit late in the process that this was all explained in the pymdp fundamentals part of the doc - a backlink would be nice.
  3. Generally I find using raw arrays to define states difficult to read. I presume that this is an artifact of the history of the package (Matlab cell arrays). It would be helpful to remind people of the definition of the axes as we go along. OTOH, I did find the diagrams very helpful.
  4. I found the last part about manipulating the reward very confusing. I had to go back to definitions to figure out that the last manipulation changed the loss into a reward (+5) - if I just look at the output, it looks like the agent always picks the arm resulting in a loss.

One thing that's not explained here is how the agent's inference mechanism differs from those of other, more well-known agents from the RL literature. Having read the first few chapters of Richard Sutton's book eons ago, I wondered whether the inference was equivalent to a finite-horizon dynamic programming solution, or similar in spirit to an agent that uses a UCB heuristic or Thompson sampling. If you could add a few references about this in the tutorial, it would be great.

Index Error

Hi
I came up with this message when running agent-demo (tmaze works fine):

[screenshot of the IndexError traceback]

State inference: Considerable performance difference with MATLAB implementation?

Hi all,

First off, many thanks for your work, it looks very promising!

I was comparing some simple agents between the current and the MATLAB implementation and noticed that, in terms of reversal learning, the pymdp version appears to adapt considerably more slowly. I've played around with a variety of setups and hyperparameters but the difference is quite significant.

Example setup: slot machine task without hints
2 actions: 'button' 0 and 'button' 1
A 'button 0 is better' context and a 'button 1 is better' context.

40 trials, with the hidden context switching after 20 trials. Here I plot the state posterior (black) of 100 agents starting with flat context priors, compared to the true but unknown state/context (red). Below I'll include the pymdp code. I'm assuming I'm using the package wrong, and would love to know my misunderstanding.

[figures: state posterior (black) of 100 agents vs. true context (red), for the MATLAB and pymdp implementations]

import pymdp
from pymdp import utils
from pymdp.agent import Agent
import numpy as np
import matplotlib.pyplot as plt

num_obs = [3, 3] # 3 Rewards, 3 choice observations
num_states = [3, 2] # 3 choice states, 2 hidden states
num_factors = len(num_states)

# Press one of two buttons
num_controls = [2, 1] 

A_shapes = [[o_dim] + num_states for o_dim in num_obs]

# initialize the A array to all 0's
A = utils.obj_array_zeros(A_shapes)

# reward probabilities
r1=0.9
r2=0.9

# Reward observations
# STATE 1     Start a0 a1
A[0][0,:,0] = [0, 1-r1, r2  ] # Positive reward
A[0][1,:,0] = [0, r1  , 1-r2] # Negative reward
A[0][2,:,0] = [1, 0   , 0   ] # Neutral (start state)
# STATE 2     Start a0 a1
A[0][0,:,1] = [0, r1  , 1-r2] # Positive
A[0][1,:,1] = [0, 1-r1, r2  ] # Negative
A[0][2,:,1] = [1, 0   , 0   ] # Neutral (start state)

# No uncertainty about choice observations
A[1][:,:,0] = np.eye(num_obs[1])
A[1][:,:,1] = np.eye(num_obs[1])


B_shapes = [[s_dim, s_dim, num_controls[f]] for f, s_dim in enumerate(num_states)]

B = utils.obj_array_zeros(B_shapes)

for i in range(2):
    B[0][0,:,i] = np.ones(3)
    
B[0][:,0,0] = [0, 1, 0] # action 0: Start  -> a0
B[0][:,0,1] = [0, 0, 1]  # action 1: Start  -> a1

B[1][:,:,0] = np.eye(num_states[1])

C = utils.obj_array_zeros(num_obs)
C[0] = np.array([1, -1, 0]) # Prefer rewards

D = utils.obj_array_uniform(num_states)
D[0] = np.array([1, 0, 0]) # Start in the 'start'-state

# ENVIRONMENT
class my_env():
    
    def __init__(self, A):
        
        self.A = A
        
    def step(self, action, state):
                
        obs = utils.sample(self.A[0][:, action[0].astype(int)+1, state])
            
        return [obs, action[0].astype(int)+1]

# SIMULATIONS
T = 40
alpha = 16
gamma = 16
AS = "deterministic"

for run in range(100):
    D2 = D.copy()

    model = Agent(A=A, B=B, C=C, D=D, policy_len=1, action_selection=AS)
    switches = [20,40,50]
    state = 0
    states = []
    pstate = []
    pact = []
    e = my_env(A)

    model.infer_states([2,0]) # 2 = neutral obs, 0 = start state

    for t in range(T):
#         if t > 0: 
#             D2[1] = model.qs[1]
#             model = Agent(A=A, B=B, C=C, D=D2, policy_len=1, action_selection=AS)

        if t in switches:
            state = 1 - state
        states.append(state)

        # Start position for the trial (I believe you don't use this in the tutorial, but it doesn't seem to matter much)
        model.infer_states([2,0]) # 2 = neutral reward, 0 = start state observation

        q_pi, neg_efe = model.infer_policies()

        action = model.sample_action()

        obs = e.step(action, state=state)
        model.infer_states(obs)

        # Save belief and output
        pstate.append(model.qs[1][0])
        pact.append(q_pi[0])

    plt.plot([1-s for s in pstate], label="p(s)", linewidth=3, alpha=0.1, color='k')
    
plt.plot([s*1.1-0.05 for s in states], label="s", color="r", linewidth=3)
plt.xlabel("trial")
plt.ylim([-0.05, 1.05])
plt.title("Python")

lack of time symmetry in run_mmp when creating messages

When going through mmp with @conorheins yesterday, I noticed this section:

                # past message
                if t == 0:
                    lnB_past = spm_log(prior[f])
                else:
                    past_msg = B[f][:, :, int(policy[t - 1, f])].dot(qs_seq[t - 1][f])
                    lnB_past = spm_log(past_msg)

                # future message
                if t >= future_cutoff:
                    lnB_future = qs_T[f]
                else:
                    future_msg = trans_B[f][:, :, int(policy[t, f])].dot(qs_seq[t + 1][f])
                    lnB_future = spm_log(future_msg)

I am not sure if the past_msg and future_msg lines are consistent in their use of t-1, t and t+1. I think this is worth checking against 1) the original SPM, to see what they do there, and 2) deciding what the correct behaviour should be.

Fix `pymdp.utils.obj_array_from_list`

Creating an object array from a list of numpy arrays seems to break if the component arrays of the list have a matching leading axis. For example

arrs = [np.zeros((3, 6)), np.zeros((3, 4, 5))]
obj_arr = np.array(arrs, dtype=object)

will throw the following error:

ValueError: could not broadcast input array from shape (3,6) into shape (3,)

whereas

arrs = [np.zeros((4, 6)), np.zeros((3, 4, 5))]
obj_arr = np.array(arrs, dtype=object)

works just fine.
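One possible fix (a sketch of the general approach, not necessarily the final implementation) is to pre-allocate the object array and fill it element by element, which sidesteps NumPy's broadcasting heuristics:

import numpy as np

def obj_array_from_list(list_input):
    """Convert a list of numpy arrays into a numpy object array, one element per input array."""
    obj_arr = np.empty(len(list_input), dtype=object)
    for i, arr in enumerate(list_input):
        obj_arr[i] = arr
    return obj_arr

arrs = [np.zeros((3, 6)), np.zeros((3, 4, 5))]
obj_arr = obj_array_from_list(arrs)  # works even when the leading axes match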

Full API documentation

Hi!

I think it would be useful to provide full API documentation for your package on your RTD website.
For example, I think the agent class and the envs subpackage are quite important, but I can't find API docs for them on the site.
I'd find it easier to follow if they were listed in the modules section of the site, or even if you wanted to provide a separate full API documentation section that would work too I think.
Thanks!

B matrix navigation problem

Hello,

Thanks so much with your work on bringing Active Inference to Python.

While going through your epistemic chaining demo, it appears there is a problem with the agent navigation. When I moved cue1 location from (2,0) to (0,1), the agent takes two moves downward and then tries to move "left" into the wall. The agent never recovers from this and doesn't seem to know to try a different direction.

I assume this is a problem with the B matrix, but I'm not smart enough to figure out whether this is some challenge in the agent class or in the rules set up during the demo itself for the actions (["UP", "DOWN", "LEFT", "RIGHT", "STAY"]).

Any help/advice would be greatly appreciated! Please see the output log from the agent movements below. The only change I make to your demo is in the my_env section, where I change cue1_loc to (0,1) - you'll see that once it completes the second action it tries to go LEFT... then STAY, then tries LEFT a few more times:

Action at time 0: DOWN
Grid location at time 0: (1, 0)
Reward at time 0: Null
Action at time 1: DOWN
Grid location at time 1: (2, 0)
Reward at time 1: Null
Action at time 2: LEFT
Grid location at time 2: (2, 0)
Reward at time 2: Null
Action at time 3: STAY
Grid location at time 3: (2, 0)
Reward at time 3: Null
Action at time 4: STAY
Grid location at time 4: (2, 0)
Reward at time 4: Null
Action at time 5: STAY
Grid location at time 5: (2, 0)
Reward at time 5: Null
Action at time 6: LEFT
Grid location at time 6: (2, 0)
Reward at time 6: Null
Action at time 7: LEFT
Grid location at time 7: (2, 0)
Reward at time 7: Null
Action at time 8: LEFT
Grid location at time 8: (2, 0)
Reward at time 8: Null
Action at time 9: LEFT
Grid location at time 9: (2, 0)
Reward at time 9: Null

Choice of data structures for distributions/matrices

I thought it'd be good to get a discussion going on how we represent distributions in pymdp.

Currently we have the categorical and dirichlet classes, which I believe we are trying to move away from as they're tricky to use and it means all of the algorithms which call them have to constantly check if their inputs are one of these classes or not.

It seems for now we're moving towards using numpy arrays of numpy arrays to shift data around. IIUC this is necessary because we often have varying sub-structures depending on the number of factors involved (e.g. we have multi-dimensional tensors representing an agent's priors, policy choices, etc.).

I previously considered whether we should write our own dataclass for representing what we need, but I think it's likely to end up the same way as the Dirichlet/Categorical classes. It would be a step forward, but still vulnerable to the same problems of being fragile to refactor and improve upon.

I suggest we use the third party Awkward Array to solve this problem for us. They seem to have a stable API, and we should be able to use any existing numpy methods with those arrays as long as they are not jagged (which we have to sort out anyway in the current implementation).

I think using this library will provide the following benefits:

  1. make the code more readable
  2. significant speedups when we have a much larger range of policies and states to contend with

There is always a risk of introducing third party dependencies, but I am of the opinion that if it's good enough for particle physics (tracks and decay events) at the scale CERN is dealing with, it is almost certainly good enough for us.

@conorheins @alec-tschantz what do you think? I could have a go at introducing it in a PR if you think this idea is worth investigating further

(tutorial video here is nice: https://www.youtube.com/watch?v=WlnUF3LRBj4)
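For concreteness, here is a toy sketch (my own illustration, not part of the proposal or the current API) of holding factor-wise beliefs with different cardinalities in an Awkward Array instead of a numpy object array:

import awkward as ak
import numpy as np

# posterior beliefs over two hidden state factors with different numbers of states
qs = ak.Array([[0.5, 0.5], [0.2, 0.3, 0.5]])

print(qs[1])               # beliefs over the 3-state factor
print(ak.sum(qs, axis=1))  # each factor's beliefs sum to 1
print(np.log(qs))          # numpy ufuncs dispatch elementwise to awkward arrays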

Remove redundant operation for normalization in case of 3-tensors

@OzanCatalVerses pointed out that norm_dist within the pymdp.utils module strangely treats the case of 3-tensors differently than other numbers of dimensions, see here and code below.

def norm_dist(dist):
    """ Normalizes a Categorical probability distribution (or set of them) assuming sufficient statistics are stored in leading dimension"""
    if dist.ndim == 3:
        new_dist = np.zeros_like(dist)
        for c in range(dist.shape[2]):
            new_dist[:, :, c] = np.divide(dist[:, :, c], dist[:, :, c].sum(axis=0))
        return new_dist
    else:
        return np.divide(dist, dist.sum(axis=0))

Will correct this to:

def norm_dist(dist):
    """ Normalizes a Categorical probability distribution (or set of them) assuming sufficient statistics are stored in leading dimension"""
    return np.divide(dist, dist.sum(axis=0))

Fix assertion checking whether `num_controls` is consistent with actions enumerated in policy space

In line 149 of agent.py, this assertion statement is used:

assert all([n_c == max_action for (n_c, max_action) in zip(self.num_controls, list(np.max(all_policies, axis =0)+1))]), "Maximum number of actions is not consistent with `num_controls`"

But sometimes certain actions will not be allowed (they will be pruned / absent in policies) and thus even when taking the maximum across all policies, you won't see the maximum action taken within a given control factor.

Therefore, @tverbele suggests to change this to

assert all([n_c >= max_action for (n_c, max_action) in zip(self.num_controls, list(np.max(all_policies, axis =0)+1))]), "Maximum number of actions is not consistent with `num_controls`"

Not sure the linting step in GitHub Actions is doing what it was expected to do...

.github/workflows/python-package.yml:

    - name: Lint with flake8
      run: |
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

While it is true that the code gets linted, this happens in the context of a GitHub docker image and is lost after the tests are run.

If you want auto-linting to happen before commits, there are a variety of ways to do it, but I suggest looking into pre-commit. If you set the expectation that code is linted before commit, this step can be updated so that it fails if the linting changes the files at all, because that would imply the files weren't linted prior to committing.

Nothing is perfect and there are ways around any such solution, but you can see how we're using pre-commit to make sure that static artifacts are built correctly before commits here: https://github.com/VersesTech/c4-verses-ai/blob/main/.pre-commit-config.yaml

transition matrix B form

Hey there,
First of all, I'm grateful for your work and your effort to make active inference more accessible for everyone. I'm quite new to this approach, so I started playing around with the provided notebooks.
I was wondering why the (controllable) transition matrix B (for the Agent class inside pymdp.agent) accepts direct state interactions only, i.e. it has to be of the form s^i_{t+1} = B(s^i_t, a) and not of a form with internal state interactions like s^i_{t+1} = B(s^i_t, s^j_t, a), where s^i, s^j refer to different hidden state factors. Am I wrong here? Because imho it shouldn't be a problem, since you can always define a world state \tilde{s}(s^i, s^j), but it makes your program more complicated.
Best,
Martin
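For reference, here is a minimal sketch (not from the original issue) of the workaround alluded to above: collapsing two interacting hidden state factors into a single "world state" factor, whose transition tensor can then encode arbitrary interactions. The factor sizes and the joint_transition rule are made up for illustration.

import numpy as np

n_i, n_j, n_actions = 3, 2, 2  # sizes of the two interacting factors and the number of actions
n_joint = n_i * n_j            # size of the combined "world state" factor

def joint_transition(s_i, s_j, a):
    # hypothetical interacting dynamics: factor j's update depends on factor i
    s_i_next = (s_i + a) % n_i
    s_j_next = (s_j + s_i) % n_j
    return s_i_next, s_j_next

B_joint = np.zeros((n_joint, n_joint, n_actions))
for a in range(n_actions):
    for s_i in range(n_i):
        for s_j in range(n_j):
            s = np.ravel_multi_index((s_i, s_j), (n_i, n_j))
            s_next = np.ravel_multi_index(joint_transition(s_i, s_j, a), (n_i, n_j))
            B_joint[s_next, s, a] = 1.0  # deterministic transition over the product space

Each column of B_joint sums to one, so it can serve as the B array for a single combined factor, at the cost of a state space that grows multiplicatively.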

README badges

I think it would be helpful to have the badges in the README link to something other than the picture of the badge.
E.g. the PyPI badge could link to https://pypi.org/project/inferactively-pymdp/
Further, I would suggest a badge linking to your Read the Docs site, and, if you have CI set up for your pytest tests, a badge for that would be very helpful.

A matrix stub bug

Hi,
Will this new method fix the error I get with A-Matrix and B-Matrix examples?


AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
----> 1 A = utils.convert_A_stub_to_ndarray(A_stub, model_labels)

~\pymdp\core\utils.py in convert_A_stub_to_ndarray(A_stub, model_labels)
    490     for g, modality_name in enumerate(model_labels['observations'].keys()):
    491         A[g] = A_stub.loc[modality_name].to_numpy().reshape(num_obs[g], *num_states)
--> 492         assert (A[g].sum(axis=0) == 1.0).all(), 'A matrix not normalized! Check your initialization....\n'
    493
    494     return A

AssertionError: A matrix not normalized! Check your initialization....

or unrelated?
Thanks

Originally posted by @osaaso3 in #26 (comment)

Sparse representation of likelihood tensors (`A` and `B` arrays)

  • allow users to specify an optional list of hidden state factor indices when passing in A and B arrays, that specify which hidden state factors correspond to the lagging dimensions of these tensors. The purpose of this feature is to speed up computation speed and limit memory consumption. It leverages the (often-satisfied) assumption in generative models that not all observation modalities depend on all hidden state factors. This means we don't have to include many of the lagging dimensions in the likelihood tensors that correspond to hidden state factors that the observation modality in question doesn't depend on. The same goes for the B arrays, which are already accordingly sparse due to the baked-in mean-field approximation of the posterior (statistical independence of hidden state factors in the prior).

    • I imagine these structures should be two lists of lists (one for A and one for B), structured as follows (see the sketch after this list):
      - A_factor_list, where A_factor_list[m] contains a list of the hidden state factor indices that correspond to the factors that observation modality m is conditionally dependent on.
      - B_factor_list, where B_factor_list[f] is the list of hidden state factor indices that are needed to predict the dynamics of factor f. For now we will assume control factors are still fully conditionally independent, such that a given B[f] sub-array only depends on the control state factor with index f, even though it may depend on other hidden state factors.
  • completing the above will also require building in the option of having additional dependencies in each factor-specific B array, on the other hidden state factors.
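As an illustration only (the argument names are part of the proposal above, not the current API), such factor lists might look like this for a model with two observation modalities and three hidden state factors:

num_obs = [3, 5]        # two observation modalities
num_states = [3, 2, 2]  # three hidden state factors

# modality 0 depends only on factor 0; modality 1 depends on factors 0 and 2
A_factor_list = [[0], [0, 2]]

# factor 0 depends only on itself; factor 1 depends on factors 0 and 1; factor 2 only on itself
B_factor_list = [[0], [0, 1], [2]]

# the sparse A[m] would then have shape (num_obs[m], *[num_states[f] for f in A_factor_list[m]]),
# e.g. A[1].shape == (5, 3, 2) instead of the dense (5, 3, 2, 2)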

! pip install inferactively-pymdp

Dear researchers

Unfortunately, this wonderful Python toolbox for active inference cannot be installed in Colab since Thursday evening, 23.3.23. What can be done?

I hope this can be fixed soon! Thank you in advance.

Best
Susanne Valavanis

einsum while calculating free energy

Hi guys, I have a question about calculating the free energy. When calculating the einsum, why do you only sum down to the first dim:

arg_list = [X, list(range(X.ndim))] + list(chain(*([x[xdim_i],[dims[xdim_i]]] for xdim_i in range(len(x))))) + [[0]]

and then use:

spm_dot(likelihood, qs)[0]

to fetch only the first level of s0 (supposing the hidden state s is factorized as s0, s1, ...)?

In my opinion, we should sum over the whole tensor by removing the last + [[0]], and then Y should just be a scalar.
Am I totally wrong? If I am, please correct me. Thanks a lot!

Bug when `num_controls` is passed as an argument to `Agent` constructor

Attribute error in agent.py at line 124:

self.control_fac_idx = [f for f in range(self.num_factors) if self.num_controls[f] > 1]

when num_controls is passed in as an argument to the agent.Agent() constructor. An else statement is needed after line 118:

self.num_controls = [self.B[f].shape[2] for f in range(self.num_factors)]

so that num_controls gets assigned to a property of self when it is provided as an argument.

Originally posted by @conorheins in #4 (reply in thread)
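A minimal sketch of the guard being suggested (the surrounding constructor code and argument handling are assumed, not quoted from the source):

# inside Agent.__init__, when setting self.num_controls
if num_controls is None:
    # infer the number of control states per factor from the B arrays
    self.num_controls = [self.B[f].shape[2] for f in range(self.num_factors)]
else:
    # use the user-supplied value so later accesses to self.num_controls don't fail
    self.num_controls = num_controls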

Typos in Tutorial 1

There are two instances which read THe, which should be changed to The

A heading is mislabeled:

  1. The prior over observations: the D vector or P(s)

Should read:

  1. The prior over hidden states: the D vector or P(s)

an error in the official documentation

Dear sir,
In the official documentation, in "Active Inference Demo: Epistemic Chaining",

Starting with the 0-th modality, the Location observation modality: A[0]

A[0] = np.tile(np.expand_dims(np.eye(num_grid_points), (-2, -1)), (1, 1, num_states[1], num_states[2]))

I run it but get the error "A matrix is not normalized (i.e. A.sum(axis = 0) must all be equal to 1.0)"; it looks like A is initialized but not normalized.
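For anyone hitting a similar error, here is a generic way to check and renormalize each modality of an object-array A before passing it to Agent (a plain-NumPy sketch, assuming every column of A[m] has nonzero mass; not a statement about what the demo itself does):

import numpy as np

for m in range(len(A)):
    col_sums = A[m].sum(axis=0)   # shape: the hidden-state (lagging) dimensions
    if not np.allclose(col_sums, 1.0):
        A[m] = A[m] / col_sums    # broadcasts over the leading (observation) axis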

Error when using `stochastic` mode action sampling in case that `num_controls[f] = 1`

Bug found by @SamGijsen:

Running "stochastic" action selection throws an error when using the current version. I believe it's because actions are sampled even if num_controls=1, in which case utils.sample() attempts to squeeze an array of size 1. Reverting to deterministic selection if num_controls==1 has fixed this for me.
(control.sample_action(), line 565: `if action_selection == 'deterministic' or num_controls[factor_i] == 1:`). Happy to submit a PR if that's useful; otherwise I believe it's just this line of code.

Originally posted by @SamGijsen in #82 (comment)
