Association analysis of CNVs and imputed SNPs incorporating uncertainty
CNVassoc is an R package that carries out analysis of common Copy Number Variants (CNVs) and imputed Single Nucleotide Polymorphisms (SNPs) in population-based studies.
It includes tools for estimating association under a range of study designs (case-control, cohort, etc.), using several types of response variable (class status, censored data, counts), adjusting for covariates and considering various inheritance models.
Moreover, it is possible to perform epistasis studies with pairs of CNVs or imputed SNPs.
It has been optimized to make feasible the analysis of genome-wide association studies (GWAS) with hundreds of thousands of genetic variants (CNVs / imputed SNPs).
Also, it incorporates functions for inferring copy number (CNV genotype calling). Various classes and methods for generic functions (print, summary, plot, anova, ...) have been created to facilitate the analysis.
An extensive manual describing all CNVassoc capabilities with real examples is available in the package vignette.
Install the CNVassoc package from its GitHub repository by typing:
library(devtools)
devtools::install_github(repo = "isglobal-brge/CNVassoc")
library(CNVassoc)
- Load example data:
data(dataMLPA)
- Infer the number of copies from probe signal values:
CNV <- cnv(x = dataMLPA$Gene2, threshold.0 = 0.01, mix.method = "mixdist")
- Explore the signal and the inferred copy number:
CNV
Inferred copy number variant by a quantitative signal
Method: function mix {package: mixdist}
-. Number of individuals: 651
-. Copies 0, 1, 2
-. Estimated means: 0, 0.2435, 0.4469
-. Estimated variances: 0, 0.0041, 0.0095
-. Estimated proportions: 0.1306, 0.4187, 0.4507
-. Goodness-of-fit test: p-value= 0.4887659
-. Note: number of classes has been selected using the best BIC
plot(CNV, case.control = factor(dataMLPA$casco, labels=c("controls", "cases")))
getQualityScore(CNV)
--Probability of good classification: 0.9081028
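The estimates printed above can be turned into per-individual copy-number probabilities with Bayes' rule, which is the kind of uncertainty the association models later take into account. The following is only a minimal base-R sketch, not the package's internal code: it assumes Gaussian components for copies 1 and 2 with the means, variances and proportions shown above, leaves out the zero-copy class (whose variance is 0), and uses hypothetical signal values.

```r
# Estimates copied from the printed CNV object (copies 1 and 2 only)
means <- c(`1` = 0.2435, `2` = 0.4469)
vars  <- c(`1` = 0.0041, `2` = 0.0095)
props <- c(`1` = 0.4187, `2` = 0.4507)

# Posterior probability of each copy number given a signal value x
posterior <- function(x) {
  dens <- sapply(names(means), function(k) {
    props[[k]] * dnorm(x, mean = means[[k]], sd = sqrt(vars[[k]]))
  })
  dens <- matrix(dens, nrow = length(x), dimnames = list(NULL, names(means)))
  dens / rowSums(dens)  # normalise so each row sums to 1
}

signal <- c(0.15, 0.30, 0.50)  # hypothetical probe signals
post <- posterior(signal)
round(post, 3)
```

Signals close to one component mean get a probability near 1 for that copy number, while signals between the two means are assigned with more uncertainty.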
- Fit an association model assuming an additive effect and adjusting for a covariate:
modadd <- CNVassoc(casco ~ CNV + cov, data = dataMLPA, model = "add")
summary(modadd)
Call:
CNVassoc(formula = casco ~ CNV + cov, data = dataMLPA, model = "add")
Deviance: 874.6909
Number of parameters: 3
Number of individuals: 651
Coefficients:
           OR lower.lim upper.lim      SE     stat pvalue
trend 0.58634   0.45457   0.75631 0.12987 -4.11060  0.000
cov   0.88435   0.75597   1.03454 0.08003 -1.53566  0.125
(Dispersion parameter for binomial family taken to be 1 )
Covariance between coefficients:
          intercept  CNVadd     cov
intercept    0.6825 -0.0222 -0.0643
CNVadd               0.0169 -0.0001
cov                          0.0064
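The columns of the coefficient table are linked by the usual Wald relationships: the statistic is log(OR) / SE, the limits are exp(log(OR) ± z × SE), and the p-value is two-sided normal. As a small check in base R, assuming 95% limits (the confidence level is not stated in the output), the values copied from the table should be reproduced up to rounding:

```r
# Odds ratios and standard errors copied from the summary table
or <- c(trend = 0.58634, cov = 0.88435)
se <- c(trend = 0.12987, cov = 0.08003)

beta   <- log(or)             # coefficients on the log-odds scale
stat   <- beta / se           # Wald statistic
z      <- qnorm(0.975)        # assumed 95% confidence level
lower  <- exp(beta - z * se)  # lower limit of the OR
upper  <- exp(beta + z * se)  # upper limit of the OR
pvalue <- 2 * pnorm(-abs(stat))

round(cbind(OR = or, lower.lim = lower, upper.lim = upper,
            SE = se, stat = stat, pvalue = pvalue), 5)
```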
- Import genotype probabilities from example data in SNPTEST format, consisting of 500 cases and 500 controls with 200 imputed SNPs:
fileprobs <- system.file("exdata/SNPTEST.probs", package = "CNVassoc")
- Fit an association model for each imputed SNP
resp <- rep(0:1, each = 500)
results <- fastCNVassoc(fileprobs, resp ~ 1, family = "binomial")
Reading .probs data...
Done! Took 0.31 seconds
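To illustrate what "incorporating uncertainty" means in a fit like this one, the sketch below maximises a likelihood in which each individual's contribution is averaged over the three possible genotypes, weighted by the genotype probabilities. This is only a base-R illustration on simulated data, under the assumption of an additive logistic model; the data, the `negloglik` function and the use of `optim` are hypothetical and are not the optimised routine behind `fastCNVassoc`.

```r
set.seed(1)
n <- 200

# Simulated true genotypes (0, 1 or 2 copies of the allele)
g <- sample(0:2, n, replace = TRUE, prob = c(0.25, 0.5, 0.25))

# Hypothetical genotype probabilities: 0.9 on the true genotype,
# 0.05 on each of the other two, so every row sums to 1
w <- matrix(0.05, nrow = n, ncol = 3)
w[cbind(seq_len(n), g + 1)] <- 0.9

# Binary response simulated from an additive logistic model
y <- rbinom(n, 1, plogis(-0.5 + 0.7 * g))

# Negative log-likelihood: sum over individuals of
# -log( sum_g w[i, g] * P(y_i | genotype g) )
negloglik <- function(par) {
  eta <- par[1] + par[2] * matrix(0:2, nrow = n, ncol = 3, byrow = TRUE)
  p   <- plogis(eta)
  fy  <- matrix(dbinom(rep(y, 3), 1, as.vector(p)), nrow = n, ncol = 3)
  -sum(log(rowSums(w * fy)))
}

fit <- optim(c(0, 0), negloglik, method = "BFGS")
fit$par  # intercept and per-allele log odds ratio
```

When the genotype probabilities are all 0 or 1, this reduces to ordinary logistic regression on the genotype.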
- Adjust p-values for multiple testing (p.adjust uses the Holm method by default; pass method = "fdr" for FDR control), and show the most significant SNPs:
results$pvalue <- p.adjust(results$pvalue)
head(results[order(results$pvalue),])
  variant       beta         se    zscore pvalue iter
1       1 0.09876262 0.09356259 1.0555781      1    4
2       2 0.03171118 0.12907790 0.2456747      1    4
3       3 0.14015608 0.09325326 1.5029617      1    4
4       4 0.05239490 0.10868035 0.4821010      1    4
5       5 0.16669960 0.09632611 1.7305754      1    4
6       6 0.12066259 0.09179185 1.3145239      1    4
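The choice of adjustment method matters when reading a table like the one above. As a short base-R comparison on hypothetical p-values (not taken from the example data), `p.adjust` without a `method` argument applies the Holm correction, whereas `method = "fdr"` (an alias for `"BH"`) applies the Benjamini-Hochberg false discovery rate procedure, which is never more conservative than Holm:

```r
# Hypothetical raw p-values for five variants
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)

p_default <- p.adjust(p)                   # default method is "holm"
p_holm    <- p.adjust(p, method = "holm")
p_fdr     <- p.adjust(p, method = "fdr")   # same as method = "BH"

data.frame(raw = p, holm = p_holm, fdr = p_fdr)
```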
Subirana I, Diaz-Uriarte R, Lucas G, Gonzalez JR. CNVassoc: Association analysis of CNV data using R. BMC Med Genomics. 2011 May 24;4:47. doi: 10.1186/1755-8794-4-47. PubMed PMID: 21609482; PubMed Central PMCID: PMC3121578.
Subirana I, González JR. Genetic association analysis and meta-analysis of imputed SNPs in longitudinal studies. Genet Epidemiol. 2013 Jul;37(5):465-77. doi: 10.1002/gepi.21719. Epub 2013 Apr 17. PubMed PMID: 23595425; PubMed Central PMCID: PMC4273087.
Subirana I, González JR. Interaction association analysis of imputed SNPs in case-control and follow-up studies. Genet Epidemiol. 2015 Mar;39(3):185-96. doi: 10.1002/gepi.21883. Epub 2015 Jan 22. PubMed PMID: 25613387.