
hoi's Introduction


HOI logo: https://github.com/brainets/hoi/blob/main/docs/_static/hoi-logo.png

Description

HOI (Higher Order Interactions) is a Python package to go beyond pairwise interactions by quantifying the statistical dependencies between two or more units using information-theoretical metrics. The package is built on top of Jax, allowing computations on CPU or GPU.

Installation

Dependencies

HOI requires:

  • Python (>= 3.8)
  • numpy (>=1.22)
  • scipy (>=1.9)
  • jax
  • pandas
  • scikit-learn
  • tqdm

User installation

To install Jax on GPU or CPU-only, please refer to Jax's documentation: https://jax.readthedocs.io/en/latest/installation.html

If you already have a working installation of NumPy, SciPy and Jax, the easiest way to install hoi is using pip:

pip install -U hoi

You can also install the latest version of the software directly from GitHub:

pip install git+https://github.com/brainets/hoi.git

For developers

For developers, you can install it in development mode with the following commands:

git clone https://github.com/brainets/hoi.git
cd hoi
pip install -e ".[full]"

The full installation of HOI includes additional packages to test the software and build the documentation:

  • pytest
  • pytest-cov
  • codecov
  • xarray
  • sphinx!=4.1.0
  • sphinx-gallery
  • pydata-sphinx-theme
  • sphinxcontrib-bibtex
  • numpydoc
  • matplotlib
  • flake8
  • pep8-naming
  • black

Help and Support

Documentation

Communication

For questions, please use the following link: https://github.com/brainets/hoi/discussions

Acknowledgments

HOI was mainly developed during the Google Summer of Code 2023 (https://summerofcode.withgoogle.com/archive/2023/projects/z6hGpvLS).

hoi's People

Contributors

aopy, brovelli, chrisferreyra13, dependabot[bot], dishie2498, etiennecmb, mattehub, thomasrobiglio


hoi's Issues

To do

HOI metrics

  • RSI / InfoTot / RedMMI / SynMMI: fix computations using the entropy formula
  • Oinfo: rename OinfoZeroLag to Oinfo (28920dc)
  • GradientOinfo: compute it using the Oinfo class (0b68f2b)
  • Replace data with x (48bb64f)

Features

  • HOIEstimator: allow providing the multiplets to compute (891e87f)
  • Simulations: check Matteo's function

GSoC - Remove neuro orientation and dependency to Frites

The current version of the code assumes brain data as inputs. However, HOI can be computed on any type of system: populations of neurons, interactions between molecules, psychological tests, etc. The goal of this first issue is to make the code more general and support other types of data. Also, here's the Matlab version of the HOI that we'll use for reference.

@Dishie2498

Code structure

Core functions

This folder will contain the low-level functions.

  • New folder hoi/core
  • New file hoi/core/__init__.py
  • New file hoi/core/it.py: low-level information-theoretical functions for computing entropy (e.g. ent_g; a sketch of such a function is given after this list)
  • New file hoi/core/combinatory.py: e.g. combinations
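
As an illustration only, here is a minimal sketch of what such a low-level entropy function could look like in JAX. The name ent_g comes from the list above, but the shape convention ((n_features, n_samples)) and the use of bits are assumptions, not the package's actual API:

# Minimal sketch of a Gaussian entropy estimator (illustrative only; the real
# ent_g in hoi/core/it.py may use a different signature and units).
import jax.numpy as jnp


def ent_g(x):
    """Entropy (in bits) of a multivariate Gaussian fitted to x.

    x : array of shape (n_features, n_samples) -- assumed convention.
    """
    n_features = x.shape[0]
    # sample covariance across the samples axis
    cov = jnp.atleast_2d(jnp.cov(x))
    # H = 0.5 * log2((2*pi*e)^d * det(cov))
    _, logdet = jnp.linalg.slogdet(cov)
    return 0.5 * (n_features * jnp.log(2 * jnp.pi * jnp.e) + logdet) / jnp.log(2)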

HOI functions

This folder will contain HOI measurements (zero-lag, task-related, lagged)

  • New folder hoi/metrics
  • New file hoi/metrics/__init__.py and hoi/metrics/oinfo_zerolag.py
  • In hoi/metrics/oinfo_zerolag.py, new function oinfo_zerolag; move the content of the function conn_oinfo_jax there (a minimal sketch of the zero-lag O-information follows this list)
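
For orientation, the zero-lag O-information of Rosas et al. (2019) can be written from entropies as Omega(X^n) = (n - 2) * H(X^n) + sum_j [H(X_j) - H(X^n without X_j)]. Below is a minimal sketch using the ent_g sketch above; it is not the conn_oinfo_jax implementation being moved here, just an assumed illustration with the same (n_features, n_samples) convention:

# Minimal sketch (assumption, not the function being moved here) of the
# zero-lag O-information computed from Gaussian entropies.
import jax.numpy as jnp


def oinfo_zerolag(x):
    """O-information of x with shape (n_features, n_samples)."""
    n = x.shape[0]
    h_all = ent_g(x)                                         # H(X^n)
    h_j = jnp.array([ent_g(x[[j], :]) for j in range(n)])    # H(X_j)
    h_mj = jnp.array(
        [ent_g(jnp.delete(x, j, axis=0)) for j in range(n)]  # H of X^n without X_j
    )
    return (n - 2) * h_all + jnp.sum(h_j - h_mj)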

Remove dependencies

  • Remove the dependency on Frites and xarray (also modify setup.py)
  • The input x will still be three-dimensional, but we will change the dimension names internally and in the documentation: n_epochs → n_samples; n_roi → n_features; n_times → n_variables

Documentation, unit-test and workflows

Hi @Dishie2498, @Mattehub, @brovelli. Please find below the steps we will go through until the end of the GSoC.

1. Documentation

1.1 Gallery of examples

  • @Dishie2498 is currently working on setting up the gallery of examples (#15)
  • @Mattehub and I will then fill the gallery with examples illustrating 1) the HOI metrics and 2) benchmarks showcasing JAX (performance claims)

1.2 Theoretical background and README

  • @Mattehub is going to write the Theoretical background required to understand HOI. This includes 1) explaining entropy and mutual information (gcmi, knn, kernel and binning) and 2) explaining the HOI metrics.
  • @Mattehub will also propose a high-level description of the repo that is going to be used in the README. This should include: 1) the overall goal of HOI, 2) how to install it and the required dependencies, as well as the sponsors (GSoC)

1.3 Logo

1.4 License

  • Add license (BSD 3-Clause)

2. Data simulation

  • @Mattehub made a function to simulate HOI. I'll review the code to integrate it inside hoi.simulation. Those simulations are then going to be used for the gallery of examples.
  • @Dishie2498 will then integrate it inside the documentation

3. Testing functionalities and workflows

3.1 Unit-testing

  • @Dishie2498 and I will write the unit tests (functional and smoke tests)

3.2 Workflows

Here are the workflows that are going to be required (@Dishie2498 and I):

  • GitHub Action to test code quality (example)
  • GitHub Action to send the repo to PyPI (example)
  • GitHub Action to test the functionalities on several Python versions (example). This action first runs the unit tests and then publishes the doc.
  • It would be great if we could add a dropdown menu in the doc to select the documentation associated with each release (see mne-python)
  • Add badges to the readme.rst and index.rst

Simulation Function

Hi @Mattehub, @aopy, @Dishie2498 and @brovelli,

@Mattehub wrote a function to simulate HOIs using multivariate Gaussians. @aopy agreed to improve the draft of the function and make the PR to integrate it inside the HOI package. @Dishie2498 and I will be in charge of reviewing the PR.

Main features implemented

  • Generate HOIs between triplets and quadruplets of nodes (output shape (n_samples, n_features))
  • HOIs can also occur with an external variable (a.k.a. "task-related")
  • The HOIs can be dynamic (input shape (n_samples, n_features, n_variables)). In that case, there's a bump in the HOIs.

Things that need to be done

  • We are targeting a broad audience, not only neuroscience. Consequently, we need to remove any reference to neuroscientific data (brain areas, trials, ROIs, etc.). We use the following convention for input and output shapes: n_samples, n_features, n_variables
  • The code needs to pass the Black and Flake8 coding styles
  • Output type should be NumPy arrays

Function definition

For the moment, let's start simple and focus on a single function to simulate HOIs, setting aside the task-related part. Here's a proposal for the function definition:

def simulate_hois_gauss(
    n_samples=200, n_features=4, n_variables=1, triplets=[(0, 1, 2), (1, 2, 3)],
    triplet_types=['redundancy', 'synergy']
):
    """Simulate higher-order interactions.

    n_samples: number of samples
    n_features: number of nodes
    n_variables: number of repetitions. This parameter can be used to
        simulate dynamic HOIs
    triplets: list of triplets of nodes linked by HOIs
    triplet_types: specify whether each triplet interaction should be
        redundant or synergistic. By default, the function generates
        redundant interactions between nodes (0, 1, 2) and synergistic
        interactions between nodes (1, 2, 3).
    """
    pass

A typical minimal use case:

from hoi.simulation import simulate_hois_gauss
from hoi.metrics import Oinfo
from hoi.utils import get_nbest_mult

# simulate hois
x = simulate_hois_gauss()

# compute hois
model = Oinfo()
hoi = model.fit(x)

# print the results and check that they correspond to the ground truth
df = get_nbest_mult(hoi, model=model)

Structure of tutorials and examples

Following the Diátaxis approach to documentation, I propose the following organization of tutorials and examples.

EXPLANATION -> theoretical explanations as already present here + references
(might require checking of the equations + adding some figures)

REFERENCE -> API documentation

TUTORIALS
The goal of this part is to guide the user through the structure of the important functions in the toolbox.

  • "Core information theoretical measures"
    Here the goal is to show how to compute entropy and mutual information. I think a simple case with entropy/MI of univariate Gaussians is ok (a small sketch of this check is given after this list).
  • "Higher-order information theoretical measures"
    Here we show, for a single metric (O-info?), how the hoi.metrics class works. Something very similar to the existing example with RSI.
    Create simple data, define the model, fit it, print the best multiplets, and plot the results.
  • "Simulation of HOI"

HOW-TO GUIDES -> Examples
This consists of a gallery of examples (one per metric?) showing the computation of the given metric on simple data (similar to what is done in the corresponding tutorial but with shorter descriptions).

  • Oinfo
  • InfoTopo
  • TC
  • DTC
  • Sinfo
  • GradientOinfo
  • RSI
  • RedundancyMMI
  • SynergyMMI
  • InfoTot

GSoC - Documentation v2

Hi @Dishie2498, I've included below the modifications for the v2 of the documentation. As a general rule of thumb, it would be great to follow the organization of frites:

1. First page

  • Small description of the package (example). Introduce the "HOI" acronym
  • Hyperlinks to navigation tabs (see 2. below)

2. Navigation tab

Here's the list of tabs the website should contain:

  • Installation: should describe how to install the software and should include the list of dependencies (example)
  • API reference : list of functions and classes of the package (see 2.1 below)
  • Examples : (example)

2.1 API reference

2.1.1 Overall structure of the API

├── API reference
│   ├── Metrics of HOI
│   │   ├── `OinfoZeroLag`
│   │   ├── `InfoTopo`
│   │   ├── `RSI`
│   │   ├── `RedundancyMMI`
│   │   ├── `SynergyMMI`
│   │   └── `InfoTot`
│   ├── Utility functions
│   │   ├── Preprocessing
│   │   │   ├── `digitize`
│   │   │   └── `normalize`
│   │   └── Postprocessing
│   │       ├── `landscape`
│   │       └── `get_nbest_mult`
│   ├── Plot HOI
│   │   └── `plot_landscape`
│   └── Low-level core functions
│       ├── Information-theoretical measures
│       │   ├── Measures of Entropy
│       │   │   ├── `entropy_gcmi`
│       │   │   ├── `entropy_bin`
│       │   │   ├── `entropy_knn`
│       │   │   ├── `entropy_kernel`
│       │   │   ├── `copnorm_nd`
│       │   │   ├── `prepare_for_entropy`
│       │   │   └── `get_entropy`
│       │   └── Measures of Mutual Information
│       │       └── `mi_entr_comb`
│       └── Combinatory
│           └── `combinations`

2.1.2 Modifications

  • Start the page with the List of classes and functions (example)
  • Change and reorder subheadings: hoi.metrics → Metrics of HOI; hoi.utils → Utility functions; hoi.plot → Plot HOI; hoi.core → Low-level core functions
  • When clicking on one module (e.g. hoi.metrics), the new page should start with the list of functions and classes inside this module (example)
  • Then, when clicking on a function or class of a module, it should open the description on a new page (example)

2.1.3 Scientific references

Using sphinxcontrib-bibtex we can include the link to scientific references (look at the bottom of the page of this example)

  • Add the references rosas2019oinfo to hoi.metrics.OinfoZeroLag and baudot2019infotopo to hoi.metrics.InfoTopo
  • On the website of the doc, references should be automatically listed (example)

In the refs.bib file:

@article{rosas2019oinfo,
	title = {Quantifying high-order interdependencies via multivariate extensions of the mutual information},
	volume = {100},
	url = {https://link.aps.org/doi/10.1103/PhysRevE.100.032305},
	doi = {10.1103/PhysRevE.100.032305},
	number = {3},
	urldate = {2022-12-17},
	journal = {Physical Review E},
	author = {Rosas, Fernando E. and Mediano, Pedro A. M. and Gastpar, Michael and Jensen, Henrik J.},
	month = sep,
	year = {2019},
	note = {Publisher: American Physical Society},
	pages = {032305}
}


@article{baudot2019infotopo,
	title = {Topological information data analysis},
	volume = {21},
	issn = {1099-4300},
	url = {https://www.mdpi.com/1099-4300/21/9/869},
	doi = {10.3390/e21090869},
	number = {9},
	journal = {Entropy. An International and Interdisciplinary Journal of Entropy and Information Studies},
	author = {Baudot, Pierre and Tapia, Monica and Bennequin, Daniel and Goaillard, Jean-Marc},
	year = {2019},
	note = {Number: 869},
}

Tensor vs. vmap computations

Goal

Comparison between the loop-free tensor implementation of the entropy and using jax.vmap to loop over the first axis (a rough sketch of both variants is given below).

Comparison

  • Which implementation is the fastest?
  • Which one takes less memory (GPU)?
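
A rough sketch of the two variants to compare (assumed setup, not the package's code): the Gaussian entropy of many multiplets computed either by vmapping a per-multiplet function over the first axis, or in one batched einsum expression.

# Assumed benchmarking setup: Gaussian entropy of many multiplets,
# vmap over the first axis vs. a single loop-free tensor expression.
import jax
import jax.numpy as jnp


def ent_g_single(x):
    """Gaussian entropy (bits) of one multiplet, x of shape (n_features, n_samples)."""
    cov = jnp.atleast_2d(jnp.cov(x))
    _, logdet = jnp.linalg.slogdet(cov)
    return 0.5 * (x.shape[0] * jnp.log(2 * jnp.pi * jnp.e) + logdet) / jnp.log(2)


# batch of multiplets: (n_multiplets, n_features, n_samples)
x = jax.random.normal(jax.random.PRNGKey(0), (512, 3, 1000))

# vmap version: map ent_g_single over the first axis
ent_vmap = jax.jit(jax.vmap(ent_g_single))


# tensor version: all covariances at once with einsum
@jax.jit
def ent_tensor(x):
    xc = x - x.mean(axis=-1, keepdims=True)
    cov = jnp.einsum("mis,mjs->mij", xc, xc) / (x.shape[-1] - 1)
    _, logdet = jnp.linalg.slogdet(cov)
    return 0.5 * (x.shape[1] * jnp.log(2 * jnp.pi * jnp.e) + logdet) / jnp.log(2)


# same results; speed and GPU memory can then be compared (e.g. %timeit, nvidia-smi)
print(jnp.allclose(ent_vmap(x), ent_tensor(x), atol=1e-4))

For fair timing on GPU, remember to call .block_until_ready() on the outputs before stopping the timer, since JAX dispatches asynchronously.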

Documentation v3

Hey @Dishie2498, please find below some improvements for the v3 of the doc:

1. Bug fixing

  • Add the line autosummary_generate = True in the conf.py like here
  • References are not listed because none are used in the code. You need to add a References section for OinfoZeroLag and InfoTopo and cite the references like here

2. Improvements

  • The documentation of classes and functions generated using autosummary should follow those templates
  • Can you create a banner that goes at the bottom of each page of the documentation? The banner should contain the logos of the INT, Aix-Marseille University and GSoC
  • In the Overview tab, can you create a file called ovw_theory.rst with the title Theoretical background? Later, @Mattehub will write some of the mathematical background about information theory and HOI. You'll also have to create an overview/index.rst file

Optimisations

  • vmap: on GPU, it works much better when vmapping over the last axis (see the in_axes sketch below)
  • pmap: there's a way of doing multi-processing (check)
  • Use iterators instead of arrays. But does it work with Jax?
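
A tiny sketch of how the mapped axis is chosen with jax.vmap (illustrative only; the shapes follow the n_samples / n_features / n_variables convention used in this repo, and any speed-up still has to be benchmarked):

# Illustrative only: choosing which axis jax.vmap maps over via in_axes.
import jax
import jax.numpy as jnp


def stat(x):
    # any per-slice reduction
    return jnp.mean(x**2)


x = jnp.ones((200, 4, 10))  # (n_samples, n_features, n_variables)

map_first = jax.vmap(stat, in_axes=0)   # maps over the first axis (n_samples)
map_last = jax.vmap(stat, in_axes=-1)   # maps over the last axis (n_variables)

print(map_first(x).shape, map_last(x).shape)  # (200,) (10,)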

GSoC - Implementation of the bootstrap

Python implementation of the bootstrap for selecting significant multiplets + confidence interval estimation. For reference, here's the Matlab version.

  • For the resampling, we should directly use scikit-learn. You can also look at a past implementation. The function defining the indices for resampling should be added to hoi/core/combinatory.py (a sketch is given after this list)
  • Compute significance, CI and comparison between orders
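
A possible sketch of the resampling indices generated with scikit-learn, as suggested above; the function name, shapes and the commented usage are assumptions, not the final implementation:

# Assumed sketch: bootstrap resampling indices generated with scikit-learn.
import numpy as np
from sklearn.utils import resample


def bootstrap_indices(n_samples, n_boots=1000, random_state=0):
    """Return an (n_boots, n_samples) array of resampling indices."""
    rng = np.random.RandomState(random_state)
    return np.stack([
        resample(np.arange(n_samples), replace=True, random_state=rng)
        for _ in range(n_boots)
    ])


# hypothetical use for a 95% confidence interval of a bootstrapped HOI estimate:
# idx = bootstrap_indices(n_samples=x.shape[0])
# hoi_boot = np.stack([model.fit(x[i]) for i in idx])
# ci_low, ci_high = np.percentile(hoi_boot, [2.5, 97.5], axis=0)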
