GIRAFFE: biologically informed inference of gene regulatory networks

A scalable, matrix factorization-based algorithm that jointly infers regulatory effects and transcription factor activities, developed as part of my Master's thesis.

Consider a setting with G genes, TF proteins (transcription factors), and n samples (e.g. individuals in a study, or cells). Given a gene expression matrix (of dimension G x n), a prior for the regulatory network (of dimension G x TF), and a protein-protein interaction network (of dimension TF x TF), GIRAFFE computes:

A matrix TFA of transcription factor activities, of dimension TF x n. Each entry describes the amount of protein available to regulate its target genes.
A regulatory network R of dimension G x TF. Its weights can be interpreted as the coefficients of a linear model with the transcription factor activities as covariates and gene expression as the target.
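The linear-model reading of R can be illustrated with a few lines of NumPy. This is only a sketch on random matrices, assuming the model has the form expression ≈ R · TFA with no intercept or noise term; the helper name `predict_expression` is hypothetical and is not part of the giraffe package.

```python
import numpy as np

def predict_expression(R, TFA):
    """Predict expression (G x n) from R (G x TF) and TFA (TF x n).

    Hypothetical helper for illustration; not part of giraffe.
    """
    if R.shape[1] != TFA.shape[0]:
        raise ValueError("R must have as many columns as TFA has rows")
    return R @ TFA

rng = np.random.default_rng(0)
G, TF, n = 100, 20, 10
R = rng.normal(size=(G, TF))   # stand-in for an inferred regulatory network
TFA = rng.random((TF, n))      # stand-in for inferred TF activities

expression_hat = predict_expression(R, TFA)

# Entry (g, s) is the weighted sum over TFs of R[g, t] * TFA[t, s]
g, s = 3, 4
manual = sum(R[g, t] * TFA[t, s] for t in range(TF))
```

Under this reading, row g of R holds the regression coefficients of gene g, and column s of TFA holds the covariates for sample s.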

Install

Clone the repository to your local disk:

git clone https://github.com/soelmicheletti/giraffe.git

Then install giraffe through pip:

cd giraffe
pip install -e .

Upon completion, you can load giraffe in your Python code with

import giraffe

Usage

import giraffe
import numpy as np

# Generate toy data
G = 100 # Genes
TF = 20 # Transcription factors (proteins)
n = 10 # Samples (e.g. individuals)

# expression of size (G, n); prior of size (G, TF); PPI of size (TF, TF)
expression = np.random.random((G, n))
prior = np.random.randint(0, 2, size = (G, TF))
ppi = np.random.randint(0, 2, size = (TF, TF))
# Make the PPI network symmetric (XOR with its transpose), then set the diagonal to 1
ppi ^= ppi.T
np.fill_diagonal(ppi, 1)

# Run GIRAFFE
giraffe_model = giraffe.Giraffe(expression, prior, ppi)

R_hat = giraffe_model.get_regulation() # Size (G, TF)
TFA_hat = giraffe_model.get_tfa() # Size (TF, n)
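Before calling giraffe.Giraffe, it can help to confirm that the three inputs agree on G, TF, and n. The check below is a minimal sketch; `check_inputs` is a hypothetical helper, not a giraffe function, and the symmetry test is an assumption based on the toy data above rather than a documented requirement.

```python
import numpy as np

def check_inputs(expression, prior, ppi):
    """Return (G, TF, n) if the input shapes are mutually consistent.

    Hypothetical helper for illustration; not part of giraffe.
    """
    expression, prior, ppi = (np.asarray(a) for a in (expression, prior, ppi))
    if any(a.ndim != 2 for a in (expression, prior, ppi)):
        raise ValueError("all inputs must be 2-dimensional")
    G, n = expression.shape
    if prior.shape[0] != G:
        raise ValueError("prior must have one row per gene")
    TF = prior.shape[1]
    if ppi.shape != (TF, TF):
        raise ValueError("ppi must be of shape (TF, TF)")
    # Assumption: the PPI network is undirected, hence symmetric
    if not np.array_equal(ppi, ppi.T):
        raise ValueError("ppi is expected to be symmetric")
    return G, TF, n

rng = np.random.default_rng(0)
toy_expression = rng.random((100, 10))
toy_prior = rng.integers(0, 2, size=(100, 20))
toy_ppi = rng.integers(0, 2, size=(20, 20))
toy_ppi = np.maximum(toy_ppi, toy_ppi.T)  # symmetrize
np.fill_diagonal(toy_ppi, 1)

dims = check_inputs(toy_expression, toy_prior, toy_ppi)
```

Failing early with a clear message is usually easier to debug than a shape error raised from inside the optimization.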

More details can be found in our Tutorial.

Structure of the repo

giraffe contains the source code of our algorithm.
notebooks/data contains the data used and generated in the experiments. Note that the version on GitHub does not contain all the data. Please download them from Zenodo.
The Jupyter notebooks in notebooks can be used to reproduce the experiments in the thesis.
We provide an introduction to computational methods for gene regulation on Medium, in the hope of helping researchers without a computational biology background.

Appreciation

Alexander Marx, Julia Vogt, and John Quackenbush for making this exchange possible.
Alexander Marx, Jonas Fischer, and Panagiotis Mandros for their supervision and invaluable guidance throughout this project.
Marouen Ben Guebila, Rebekka Burkholz, Chen Chen, Derrick DeConti, Dawn DeMeo, Viola Fanfani, Intekhab Hossain, Camila Lopes-Ramos, John Quackenbush, Enakshi Saha, Katherine Shutta, and Julia Vogt for thoughtful critiques and discussions.
