Git Product home page Git Product logo

mirage's People

Contributors

gaow avatar han16 avatar

Stargazers

 avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

mirage's Issues

Annotation group documentation

I got this email from a user:

In my data analysis, I got the result by using annovar anova. I found that you annotate with several popular programs including PolyPhen, CADD and SIFT. In my data analysis, I will do this step before mirage analysis? I read your code about mirage, there are four columns( format of input data column 1:variant, 2,NO.variant in cases 3. No variant in control 4 variant group index). but I do not know where these columns come from? can you teach me?

@han16 we wrote on this page "Variant groups can be user defined, usually depending on its annotations." but it seems too vague to be helpful to users. Is it correct if we change it to the following:

"Variant groups can be user defined, usually depending on its annotations. For example, in Han et al (2019+), we label as group 2 those variants with PolyPhen 194 score greater than 0.957, CADD score top 10% or SIFT score < 0.05; other variants are labelled group 1"

Is this correct?

Is it true to obtain the NO.case and NO.control?

Is it true to obtain the NO.case and NO.control?
#NO.case
#grep 0/1
count_case_01 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('0/1',x))))
rownames(count_case_01) <- tmp_vcf_case_data[,1]
colnames(count_case_01) <- "count_01"
#grep 1/2
count_case_12 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('1/2',x))))
rownames(count_case_12) <- tmp_vcf_case_data[,1]
colnames(count_case_12) <- "count_12"

#grep 1/1

count_case_11 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('1/1',x))))
rownames(count_case_11) <- tmp_vcf_case_data[,1]
colnames(count_case_11) <- "count_11"

#grep 2/2
count_case_22 <- data.frame(apply(tmp_vcf_case_data,1,function(x) length(grep('2/2',x))))
rownames(count_case_22) <- tmp_vcf_case_data[,1]
colnames(count_case_22) <- "count_22"

#combine four data for case
count_case <- cbind(count_case_01,count_case_11,count_case_12,count_case_22)
count_case[,5] <- 2rowSums(count_case[,2:4])+1count_case[,1]
colnames(count_case)[5] <- "N.case"

#NO.control
#grep 0/1
count_contro_01 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('0/1',x))))
rownames(count_contro_01) <- tmp_vcf_control_data[,1]
colnames(count_contro_01) <- "count_01"

#grep 1/2
count_contro_12 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('1/2',x))))
rownames(count_contro_12) <- tmp_vcf_control_data[,1]
colnames(count_contro_12) <- "count_12"

#grep 1/1

count_contro_11 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('1/1',x))))
rownames(count_contro_11) <- tmp_vcf_control_data[,1]
colnames(count_contro_11) <- "count_11"

#grep 2/2
count_contro_22 <- data.frame(apply(tmp_vcf_control_data,1,function(x) length(grep('2/2',x))))
rownames(count_contro_22) <- tmp_vcf_control_data[,1]
colnames(count_contro_22) <- "count_22"
#combine four data for control
count_cantro <- cbind(count_contro_01,count_contro_11,count_contro_12,count_contro_22)
count_cantro[,5] <- 2rowSums(count_cantro[,2:4])+1count_cantro[,1]
colnames(count_cantro)[5] <- "N.control"

how to obtain the NO.case and NO.control

Dear,
I still felt confused about how to obtain the NO.case and NO.control. You said that " (No.case, how many times the variant appears in cases, No.contr, how many times the variant appears in controls โ€” you can compute these quantities from your data)". which file I can get this information. Can you give me an example?Thank you!

Example of gene level FDR & multiple testing in genome-wide scan

@han16 I was asked by @linnanqia offline who has fixed bug in her code and got what seems encouraging results (log(BF) about 5 for some genes that seems to make sense). However in our tutorial we didn't explain how results are interpreted; in particular, how multiple testing is performed -- how gene level posterior probability should be interpreted in terms of FDR, and what threshold to use.

Could you kindly update the tutorial adding a section on interpreting the results? Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.