Git Product home page Git Product logo

adaptiveattractor's Introduction

Adaptive Attractor

An adaptive method for finding co-expression signatures in single-cell/bulk RNA-seq datasets.

Overview

An adaptive version of attractor algorithm.

cafr::findAttractor finds a converged attractor based on the seed gene provided. The findAttractor.adaptive gradually decreased the exponent parameter to maximize the strength of the Nth-ranked genes in the converged attractor.

The attractor algorithm was first proposed for identifying co-expression signatures from bulk expression values in samples [Ref.1]. A detailed description and application of the attractor algorithm on single-cell RNA-seq data can be found in [Ref.2]. Users can refer to our recent manuscript [Ref.3] for a detailed description and application of adaptive attractor algorithm.

Tutorials

Quick start

Please see Seed selection for more information about the seed.

source("./findAttractor.adaptive.r")
data <- as.matrix(data)  # a normalized expression matrix with genes in the rows, cells in the columns.
seed <- "LUM"
attr <- findAttractor.adaptive(data, seed)

Examples

R script for generating Supplementary Table 1 in our recent manuscript [link].

## install cafr package 
devtools::install_github("weiyi-bitw/cafr")

## load metadata
# Data website: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129363
metadata <- data.table::fread("./GSE129363/raw/GSE129363_Discovery_Cohort_CellAnnotation.txt")

## normalize the count matrix 
library("Seurat")
adata <- Read10X("./GSE129363/raw") #including barcodes.tsv, genes.tsv, matrix.mtx
adata <- CreateSeuratObject(counts = adata[, metadata$CellID])    
adata <- NormalizeData(adata)
data_all <- adata[["RNA"]]@data; rm(adata); gc()

## select one sample
sampleID <- "Ate-05-SAT"
data <- data_all[, metadata[which(metadata$SampleName == sampleID), ]$CellID]
  
## apply adaptive attractor algorithm
source("./findAttractor.adaptive.r")
data <- as.matrix(data)  #normalized matrix for each sample
seed <- "LUM"
attr <- findAttractor.adaptive(data, seed, exponent.max = 5, exponent.min = 2, step.large = 1, step.small = 0.1, dominantThreshold = 0.2, seedTopN = 50, compareNth = 10, verbose = FALSE)

## results
head(attr$attractor.final)
attr$final.exponent

Parameters

data An expression matrix with genes in the rows, samples in the columns.
seed Seed gene. Please see Seed selection for more information.
exponent.max The maximum exponent of the mutual information, used to create weight vector for metagenes.
exponent.min The minimum exponent of the mutual information, used to create weight vector for metagenes.
step.large Decreasing step of the first round of scanning.
step.small Decreasing step of the second round of scanning.
dominantThreshold Threshold of dominance.
seedTopN Threshold of divergence.
compareNth It will maxmize the weight of the top Nth gene.
verbose When TRUE, it will show the top 20 genes of the metagene in each iteration.
epsilon Threshold of convergence.
minimize Default FALSE,when minimize = TRUE, the algorithm will minize the difference of metagene[first.idx] and metagene[second.idx] instead of maxmizing the weight of the top Nth gene.
first.idx Default first.idx = 1, the first gene of the metagene in each iteration.
second.idx Default second.idx = 2, the second gene of the metagene in each iteration.

Seed selection

Users can choose some general markers as seed genes, such as:

Fibroblast Pericyte Macrophage Endothelial Mitotic
LUM RGS5 AIF1 PECAM1 TOP2A
DCN

If the dataset has many cells of one cell type (e.g. fibroblasts) and the attractor exponent (a) is fixed, then identical attractors will be found using different general markers for one cell type, like DCN, LUM, and COL1A1.

Demonstration

Here, we applied the attractor finding algorithm using the findAttractor function implemented in the cafr (v0.312) R package [Ref.1] with the general fibroblastic marker gene LUM and DCN as seed.

Description of original attractor algorithm

Detailed descriptions can be found in Ref.1.

The exponent (a) is fixed. For the analysis of UMI based (e.g. 10x) and full-length-based (e.g. Smart-seq2) datasets, we would suggest using a = 3 and a = 5, respectively.

## install cafr package 
devtools::install_github("weiyi-bitw/cafr")
cafr::findAttractor(data, vec, a = 3, epsilon = 1e-7)

adaptiveattractor's People

Contributors

lingyic avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.