CNVassoc

Association analysis of CNVs and imputed SNPs incorporating uncertainty

Overview

CNVassoc is an R package that carries out analysis of common Copy Number Variants (CNVs) and imputed Single Nucleotide Polymorphisms (SNPs) in population-based studies.

It includes tools for estimating association under several study designs (case-control, cohort, etc.), using different types of response variable (class status, censored data, counts), adjusting for covariates, and considering various inheritance models.

Moreover, it is possible to perform epistasis studies with pairs of CNVs or imputed SNPs.

It has been optimized to make genome-wide association studies (GWAS) with hundreds of thousands of genetic variants (CNVs / imputed SNPs) feasible.

It also includes functions for inferring copy number (CNV genotype calling). Classes and methods for generic functions (print, summary, plot, anova, ...) are provided to facilitate the analysis.

An extensive manual describing all CNVassoc capabilities with real examples is available in the package vignette.


Package installation

Install the CNVassoc package from the GitHub repository by typing:

# install_github() comes from devtools; run install.packages("devtools") first if it is missing
devtools::install_github(repo = "isglobal-brge/CNVassoc")
library(CNVassoc)

Performing accurate association analyses of Copy Number Variants (CNVs)

  • Load example data:
data(dataMLPA)
  • Infer the number of copies from probe signal values
CNV <- cnv(x = dataMLPA$Gene2, threshold.0 = 0.01, mix.method = "mixdist")
  • Explore the signal and the inferred copy number
CNV
Inferred copy number variant by a quantitative signal
   Method: function mix {package: mixdist}  

-. Number of individuals: 651 
-. Copies 0, 1, 2 
-. Estimated means: 0, 0.2435, 0.4469 
-. Estimated variances: 0, 0.0041, 0.0095 
-. Estimated proportions: 0.1306, 0.4187, 0.4507 
-. Goodness-of-fit test: p-value= 0.4887659 


-. Note: number of classes has been selected using the best BIC
plot(CNV, case.control = factor(dataMLPA$casco, labels=c("controls", "cases")))

getQualityScore(CNV)
--Probability of good classification: 0.9081028 
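
Calling copy number from a quantitative signal leaves each individual with a probability for each copy-number class rather than a certain genotype. The following is a minimal sketch of that idea in base R, not CNVassoc's internal code: it applies Bayes' rule under a normal mixture, plugging in the estimates printed above for the 1-copy and 2-copy classes (the 0-copy class is left out because its estimated variance is 0). The helper name posterior_copies and the signal values are hypothetical.

```r
# Estimates copied from the printed output above (classes with 1 and 2 copies).
mu    <- c(0.2435, 0.4469)         # estimated means
sigma <- sqrt(c(0.0041, 0.0095))   # estimated standard deviations
prop  <- c(0.4187, 0.4507)         # estimated proportions

# Hypothetical helper: posterior probability of each class given a signal value,
# obtained with Bayes' rule under a two-component normal mixture.
posterior_copies <- function(x, mu, sigma, prop) {
  # Weighted densities: one row per signal value, one column per class.
  w <- sapply(seq_along(mu), function(k) prop[k] * dnorm(x, mu[k], sigma[k]))
  w <- matrix(w, nrow = length(x))
  post <- w / rowSums(w)           # normalize so each row sums to 1
  colnames(post) <- c("copies_1", "copies_2")
  post
}

signal <- c(0.20, 0.33, 0.50)      # made-up signal values
round(posterior_copies(signal, mu, sigma, prop), 3)
```

Signal values lying between the two class means receive intermediate probabilities, which is the kind of uncertainty the association model is meant to carry forward.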
  • Fit an association model assuming an additive effect and adjusting for a covariate
modadd <- CNVassoc(casco ~ CNV + cov, data = dataMLPA, model = "add")
summary(modadd)
Call:
CNVassoc(formula = casco ~ CNV + cov, data = dataMLPA, model = "add")

Deviance: 874.6909 
Number of parameters: 3 
Number of individuals: 651 

Coefficients:
            OR lower.lim upper.lim       SE     stat pvalue
trend  0.58634   0.45457   0.75631  0.12987 -4.11060  0.000
cov    0.88435   0.75597   1.03454  0.08003 -1.53566  0.125

(Dispersion parameter for  binomial  family taken to be  1 )


Covariance between coefficients:
          intercept CNVadd  cov    
intercept  0.6825   -0.0222 -0.0643
CNVadd               0.0169 -0.0001
cov                          0.0064
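
The columns of the coefficient table are related to one another in the way standard Wald inference on the log-odds scale would predict. The sketch below, which assumes that this is how the table was computed, recovers the statistic and the 95% limits of the "trend" row from its reported OR and SE using base R only.

```r
# Values copied from the "trend" row of the coefficient table above.
or <- 0.58634
se <- 0.12987                      # standard error on the log-odds scale

beta <- log(or)                    # log odds ratio
z    <- beta / se                  # Wald statistic
ci   <- exp(beta + c(-1, 1) * qnorm(0.975) * se)  # 95% limits for the OR
p    <- 2 * pnorm(-abs(z))         # two-sided p-value

round(c(stat = z, lower.lim = ci[1], upper.lim = ci[2]), 4)
p
```

Up to rounding of the printed OR and SE, the results agree with the stat, lower.lim and upper.lim columns, and the p-value is small enough to print as 0.000.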

Performing efficient association analyses of imputed SNPs

  • Import genotype probabilities from example data generated by the SNPTEST software, consisting of 500 cases and 500 controls on 200 imputed SNPs.
fileprobs <- system.file("exdata/SNPTEST.probs", package = "CNVassoc")
  • Fit an association model for each imputed SNP
resp <- rep(0:1, each = 500)
results <- fastCNVassoc(fileprobs, resp ~ 1, family = "binomial")
Reading .probs data...
Done! Took  0.31 seconds
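  • Adjust p-values for multiple testing and show the most significant SNPs first. Note that p.adjust() uses Holm's method by default; pass method = "fdr" for a false discovery rate adjustment.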
results$pvalue <- p.adjust(results$pvalue)
head(results[order(results$pvalue),])
  variant       beta         se    zscore pvalue iter
1       1 0.09876262 0.09356259 1.0555781      1    4
2       2 0.03171118 0.12907790 0.2456747      1    4
3       3 0.14015608 0.09325326 1.5029617      1    4
4       4 0.05239490 0.10868035 0.4821010      1    4
5       5 0.16669960 0.09632611 1.7305754      1    4
6       6 0.12066259 0.09179185 1.3145239      1    4
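
The difference between the default adjustment and an FDR adjustment can be seen on a small vector. This is a toy sketch with made-up p-values, not output from the example data; it uses only p.adjust() from base R.

```r
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)   # made-up raw p-values

holm <- p.adjust(p)                    # default method is "holm"
fdr  <- p.adjust(p, method = "fdr")    # Benjamini-Hochberg; "fdr" is an alias of "BH"

data.frame(raw = p, holm = holm, fdr = fdr)
```

Holm's method controls the family-wise error rate and is the more conservative of the two, so with many variants and modest effects its adjusted values can all reach 1. Storing the adjusted values in a separate column instead of overwriting the raw p-values keeps the original ordering available.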

References

Subirana I, Diaz-Uriarte R, Lucas G, Gonzalez JR. CNVassoc: Association analysis of CNV data using R. BMC Med Genomics. 2011 May 24;4:47. doi: 10.1186/1755-8794-4-47. PubMed PMID: 21609482; PubMed Central PMCID: PMC3121578.

Subirana I, González JR. Genetic association analysis and meta-analysis of imputed SNPs in longitudinal studies. Genet Epidemiol. 2013 Jul;37(5):465-77. doi: 10.1002/gepi.21719. Epub 2013 Apr 17. PubMed PMID: 23595425; PubMed Central PMCID: PMC4273087.

Subirana I, González JR. Interaction association analysis of imputed SNPs in case-control and follow-up studies. Genet Epidemiol. 2015 Mar;39(3):185-96. doi: 10.1002/gepi.21883. Epub 2015 Jan 22. PubMed PMID: 25613387.
