Git Product home page Git Product logo

honourscodes's Introduction

Supplementary Codes

Supplementary codes for my Honours thesis.

Mixture models for gene distributions in single-cell RNA-seq data

  • Four mixture models:
    • Gamma-Normal (GN): gammaNormMix()
    • Constrained Gamma-Normal (cGN): gammaNormFix()
    • Gamma-Gamma (GG): gammaMix()
    • Constrained Gamma-Gamma (cGG): gammaMixFix()
  • Model selection:
    • Select the best model using BIC: modelSelect()

Runing the functions

Load the R packages

require(distr)
require(scales)
require(cvTools)
require(edgeR)

Download and read the mixtureModels.R file.

source("mixtureModels.R")

We start from a count matrix, where each row represents a gene, and each column represents a cell. Here, we simulate a count matrix from a negative binomial distribution with 20000 genes and 250 cells as an example.

nGene<-20000
nCell<-250
gene_mean <- 2^runif(nGene, -2, 8)
count <- matrix(rnbinom(nGene*nCell, mu=gene_mean, size=10), nrow=nGene)

Then we need to transform the raw count matrix into a log-scale counts per million mapped reads (CPM) matrix. Note that, one can also use log-scale FPKM, RPKM or TPM matrix or other normalisation output as the input of mixture model functions.

#Transform the raw count matrix
log2cpm<-log2(cpm(count)+1)

Fitting the GN and GG mixture models for Gene 1

GN_gene1<-gammaNormMix(log2cpm[1,])
GG_gene1<-gammaMix(log2cpm[1,])

Fitting the GN and GG for the whole dataset to get global parameters

Note that the fittings for the entire dataset need a few minutes...

GN_dataset<-gammaNormMix(log2cpm,plot=FALSE)
GG_dataset<-gammaMix(log2cpm,plot=FALSE)

Fitting the contrained mixture models (cGN, cGG) for Gene 1

cGN_gene1<-gammaNormFix(log2cpm[1,],alpha_fix=GN_dataset$alpha,beta_fix=GN_dataset$beta)
cGG_gene1<-gammaMixFix(log2cpm[1,],alpha_fix=GG_dataset$alpha1,beta_fix=GG_dataset$beta1)

Select the best model for Gene 1 using BIC


modelSelect(log2cpm[1,],alpha_fix=GG_dataset$alpha1,beta_fix=GG_dataset$beta1,
 	alpha_fix_norm=GN_dataset$alpha,beta_fix_norm =GN_dataset$beta,parameter=TRUE)
  

Author

Yingxin Lin

Acknowledgement

gammaNormMix() function is modified from the codes provided by Shila Ghazanfar.

honourscodes's People

Contributors

yingxinlin avatar hanijieunkim avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.