xinhe-lab / mirage
Mixture model based Rare variant Analysis on Genes
Home Page: https://xinhe-lab.github.io/mirage
I got this email from a user:
In my data analysis, I got the result by using annovar anova. I found that you annotate with several popular programs, including PolyPhen, CADD and SIFT. Should I do this annotation step before the mirage analysis? I read your mirage code, and the input data has four columns: (1) variant, (2) No. variant in cases, (3) No. variant in controls, (4) variant group index. But I do not know where these columns come from. Can you teach me?
@han16 we wrote on this page "Variant groups can be user defined, usually depending on its annotations." but that seems too vague to be helpful to users. Would it be correct to change it to the following?
"Variant groups can be user defined, usually depending on their annotations. For example, in Han et al (2019+), we label as group 2 those variants with PolyPhen 194 score greater than 0.957, CADD score in the top 10%, or SIFT score < 0.05; other variants are labelled group 1."
Is this correct?
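To make the proposed rule concrete, here is a minimal R sketch of how a group index could be assigned from annotation scores. The table `annot`, its column names (`polyphen`, `cadd`, `sift`) and the toy values are assumptions for illustration, and "top 10%" is read here as the 90th percentile of CADD within the data set; neither is confirmed as what MIRAGE or the paper uses.

```r
# Hypothetical annotation table: one row per variant (toy values only).
annot <- data.frame(
  variant  = c("v1", "v2", "v3", "v4"),
  polyphen = c(0.99, 0.10, 0.50, NA),
  cadd     = c(12.0, 30.0, 5.0, 8.0),
  sift     = c(0.20, 0.30, 0.01, 0.60)
)

# Assumption: "CADD score top 10%" means at or above the 90th percentile
# of the CADD scores in this data set.
cadd_cutoff <- unname(quantile(annot$cadd, probs = 0.9, na.rm = TRUE))

# A variant goes to group 2 if any one criterion is met.
# A missing score is treated as "criterion not met".
is_group2 <- (!is.na(annot$polyphen) & annot$polyphen > 0.957) |
             (!is.na(annot$cadd)     & annot$cadd >= cadd_cutoff) |
             (!is.na(annot$sift)     & annot$sift < 0.05)

annot$group.index <- ifelse(is_group2, 2L, 1L)
print(annot[, c("variant", "group.index")])
```

The resulting `group.index` column would then serve as the fourth column of the input data.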
Is this the correct way to obtain No.case and No.control?
#NO.case
#grep 0/1
count_case_01 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('0/1',x))))
rownames(count_case_01) <- tmp_vcf_case_data[,1]
colnames(count_case_01) <- "count_01"
#grep 1/2
count_case_12 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('1/2',x))))
rownames(count_case_12) <- tmp_vcf_case_data[,1]
colnames(count_case_12) <- "count_12"
#grep 1/1
count_case_11 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('1/1',x))))
rownames(count_case_11) <- tmp_vcf_case_data[,1]
colnames(count_case_11) <- "count_11"
#grep 2/2
count_case_22 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('2/2',x))))
rownames(count_case_22) <- tmp_vcf_case_data[,1]
colnames(count_case_22) <- "count_22"
#combine four data for case
count_case <- cbind(count_case_01,count_case_11,count_case_12,count_case_22)
count_case[,5] <- 2*rowSums(count_case[,2:4]) + 1*count_case[,1]
colnames(count_case)[5] <- "N.case"
#NO.control
#grep 0/1
count_contro_01 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('0/1',x))))
rownames(count_contro_01) <- tmp_vcf_control_data[,1]
colnames(count_contro_01) <- "count_01"
#grep 1/2
count_contro_12 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('1/2',x))))
rownames(count_contro_12) <- tmp_vcf_control_data[,1]
colnames(count_contro_12) <- "count_12"
#grep 1/1
count_contro_11 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('1/1',x))))
rownames(count_contro_11) <- tmp_vcf_control_data[,1]
colnames(count_contro_11) <- "count_11"
#grep 2/2
count_contro_22 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('2/2',x))))
rownames(count_contro_22) <- tmp_vcf_control_data[,1]
colnames(count_contro_22) <- "count_22"
#combine four data for control
count_cantro <- cbind(count_contro_01,count_contro_11,count_contro_12,count_contro_22)
count_cantro[,5] <- 2*rowSums(count_cantro[,2:4]) + 1*count_cantro[,1]
colnames(count_cantro)[5] <- "N.control"
Dear,
I am still confused about how to obtain No.case and No.control. You said that "(No.case, how many times the variant appears in cases, No.contr, how many times the variant appears in controls; you can compute these quantities from your data)". Which file can I get this information from? Can you give me an example? Thank you!
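As one possible answer to the question above, the sketch below counts alternate alleles per variant from a genotype matrix, separately for cases and controls. Everything in it is hypothetical: the toy matrix `geno` (rows are variants, columns are samples, entries are VCF-style genotype strings), the `is_case` labels, and the helper `count_alt`. It also assumes that "how many times the variant appears" means the number of non-reference alleles, so a heterozygote counts once and a homozygote twice.

```r
# Toy genotype matrix (hypothetical data): rows = variants, columns = samples.
geno <- matrix(
  c("0/1", "0/0", "1/1", "0/0",
    "0/0", "0/1", "0/0", "./.",
    "0/0", "0/0", "0/0", "0/1"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("var1", "var2", "var3"), c("s1", "s2", "s3", "s4"))
)

# Hypothetical phenotype labels: TRUE = case, FALSE = control.
is_case <- c(s1 = TRUE, s2 = TRUE, s3 = FALSE, s4 = FALSE)

# Number of non-reference alleles in one genotype string ("0/1", "1|1", "./.").
# Missing alleles (".") are not counted.
count_alt <- function(gt) {
  alleles <- strsplit(gt, "[/|]")[[1]]
  sum(!(alleles %in% c("0", ".")))
}

# Apply to every cell; vapply walks the matrix column by column,
# so matrix() with the same nrow restores the original layout.
alt <- matrix(vapply(geno, count_alt, numeric(1), USE.NAMES = FALSE),
              nrow = nrow(geno), dimnames = dimnames(geno))

case_cols <- is_case[colnames(alt)]
mirage_counts <- data.frame(
  variant  = rownames(alt),
  No.case  = unname(rowSums(alt[, case_cols, drop = FALSE])),
  No.contr = unname(rowSums(alt[, !case_cols, drop = FALSE]))
)
print(mirage_counts)
```

In practice the genotype strings would come from the sample columns of a VCF file (the part of each entry before the first `:`), and the case/control labels from a separate phenotype file.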
@han16 I was asked offline by @linnanqia, who has fixed a bug in her code and got what seem to be encouraging results (log(BF) about 5 for some genes, which seems to make sense). However, in our tutorial we didn't explain how results are interpreted; in particular, how multiple testing is performed -- how gene level posterior probability should be interpreted in terms of FDR, and what threshold to use.
Could you kindly update the tutorial adding a section on interpreting the results? Thanks!
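Until the tutorial covers this, one generic way to relate gene-level posterior probabilities to FDR is the Bayesian FDR: rank genes by posterior probability and, for each cutoff, average (1 - posterior probability) over the genes selected. The sketch below illustrates that general technique only; it is not confirmed to be the procedure the MIRAGE authors recommend. The function `bayesian_fdr`, the gene names, the probabilities and the 0.05 threshold are all hypothetical.

```r
# Hypothetical gene-level posterior probabilities (not real MIRAGE output).
post_prob <- c(geneA = 0.99, geneB = 0.95, geneC = 0.60, geneD = 0.10)

# Bayesian FDR: for the top-k genes ranked by posterior probability,
# the expected proportion of false discoveries is the mean of (1 - PP).
bayesian_fdr <- function(pp) {
  ord <- order(pp, decreasing = TRUE)
  fdr_sorted <- cumsum(1 - pp[ord]) / seq_along(ord)
  fdr <- numeric(length(pp))
  fdr[ord] <- fdr_sorted        # put values back in the input order
  names(fdr) <- names(pp)
  fdr
}

fdr <- bayesian_fdr(post_prob)
print(round(fdr, 4))

# Genes kept at an illustrative FDR threshold of 0.05.
selected <- names(fdr)[fdr <= 0.05]
print(selected)
```

With these toy numbers, a posterior probability of 0.95 on its own corresponds to a 5% chance of a false call for that gene, while the FDR for the selected set is the average of those chances across the set.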