
Comments (7)

goldenflaw commented on August 17, 2024

We did not solve this problem. A few immediate thoughts:

(1) Is it a problem per se that such a relationship exists for sQTL
mapping? The cutoffs are quite small: I'm wondering if it affects the set
of sQTL calls. I can't imagine it will affect it that much (the signal is
weak and most of the p-values are above the cutoffs anyway).

(2) Related to (1). To compute our FDR cutoffs, we used permutation
techniques which should control for this effect as this effect must be
independent from genotype label.

On Tue, Jul 5, 2016 at 10:22 AM, Boxiang Liu wrote:

Hello David and Yang,

This is not an issue per se but I would like to hear your thoughts and
opinions.

I mapped sQTLs with normalized leafcutter ratios, and the p-value
distribution looks non-uniform. In particular, there is a positive slope
towards one as shown in the figure below.

[image: image]
https://cloud.githubusercontent.com/assets/4122434/16593216/311b2bd8-4299-11e6-95ce-fb30b2555f21.png

I hypothesized that the positive slope is due to introns with low reads.
For instance, if a particular intron only has a few reads per sample, it
will produce low p-values regardless of the effect size. I therefore
plotted the mean p-value against the binned reads as shown below.

[image: image]
https://cloud.githubusercontent.com/assets/4122434/16593293/806fab50-4299-11e6-8a31-0a396cd8db6f.png

This figure shows that all bins except the last 3 have a median slightly
above 0.5. This means that unless I set a threshold at mean count >= e^9.72 =
16647.24 reads, I will get non-uniform p-value distributions. However, this
cutoff is unreasonably high.
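
The binning behind a plot like this can be sketched as follows. This is only an illustration, not the script that produced the figure: the function name, the natural-log bins of width 1, and the choice of the median as the per-bin summary are assumptions.

```python
import math
from statistics import median


def median_p_by_count_bin(mean_counts, pvalues, bin_width=1.0):
    """Median p-value per bin of log(mean read count).

    Returns {lower bin edge in log units: median p}, ordered by bin.
    """
    bins = {}
    for count, p in zip(mean_counts, pvalues):
        if count <= 0:
            continue  # log is undefined for zero counts
        edge = math.floor(math.log(count) / bin_width) * bin_width
        bins.setdefault(edge, []).append(p)
    return {edge: median(ps) for edge, ps in sorted(bins.items())}
```

Because the bin edges are in log units, an edge of 9.72 corresponds to `math.exp(9.72)` reads on the raw scale.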

Did you see similar things for your sQTL mappings? If so, how did you
solve the problem?

P.S. My normalization method is similar to the GTEx eQTL normalization:
I first quantile-normalize samples against each other and then normalize the
ratio of each intron to a Gaussian.
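
A minimal sketch of the two steps described in this P.S., for readers unfamiliar with them. It is an assumption about the general technique, not the exact code used here or by GTEx: the function names are hypothetical, ties are broken by position, and the rank offset of 0.5 is one common convention among several.

```python
from statistics import NormalDist


def quantile_normalize(columns):
    """Force every sample (one list per sample) onto a shared distribution.

    Each value is replaced by the mean, across samples, of the values that
    hold the same rank. Ties are broken by position, a simplification.
    """
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[i] for col in sorted_cols) / len(columns)
                  for i in range(n_rows)]
    normalized = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def inverse_normal_transform(values):
    """Map one intron's values across samples to standard normal quantiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n stays strictly inside (0, 1).
        out[idx] = NormalDist().inv_cdf((rank + 0.5) / n)
    return out
```

The first function works across samples and the second across one intron's row, so the transform only depends on ranks within each intron.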



from leafcutter.

boxiangliu commented on August 17, 2024

Thanks Yang.
Just to clarify: the mean count cutoffs are in log scale. Therefore they are larger than they seem.

I also think that permutation should solve the problem. How did you perform the permutations? There are ~110k introns with ~6M variants. My feeling is that it would take a very long time to get reasonable permutation p-values.


goldenflaw commented on August 17, 2024

You can use standard tools like matrix-eQTL.

Yang


js29 commented on August 17, 2024

FastQTL is an even faster way of doing permutations than MatrixEQTL.


boxiangliu commented on August 17, 2024

Thanks Yang. I am unclear on how you use Matrix eQTL to perform permutations. If one runs permutations to get a null distribution for each SNP-intron association, one needs ~10,000 or more permutations to get reasonably low p-values. Although Matrix eQTL is fast, this still sounds quite computationally intensive.

Thanks Jeremy. FastQTL is similar to Matrix eQTL in "nominal pass" mode. I do not want to use the "permutation mode" because that only outputs the top eSNP per intron, but I need all eSNPs. I am not aware of a way to make FastQTL output all eSNPs using permutation. Could I be missing something?


js29 commented on August 17, 2024

You run both modes of FastQTL. To get raw p-values for all SNPs, you use the nominal pass mode. The permutation pass just assesses where the effect size of the best SNP per gene falls among the permuted set, i.e. it answers "does this gene have a significant QTL at all". If you want to know which is the best SNP for the gene, you just want the nominal p-values.
What would you do with p values from all SNPs from all permutations anyway?
As you say, permutations are computationally expensive. That's just the name of the game.
A benefit of fastQTL is the beta approximation, which means that you can get away with 1000 or 10,000 permutations, but still get estimated P values of 1e-20, so you can discriminate cases with strong splice QTLs from those with weak sQTLs. If you run 1000 permutations with MatrixEQTL all you can say is "this was the smallest p value from 1000 permutations", i.e. p < 0.001.
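
The difference between the two approaches can be sketched in a few lines. This is a simplified illustration, not FastQTL's implementation: FastQTL, per its documentation, estimates both shape parameters of the beta distribution, whereas this sketch fixes the first at 1 (the exact form when the tests are independent) so that it needs only the standard library. The function names are hypothetical.

```python
import math
import random


def fit_beta_shape2(min_pvalues):
    """MLE of b for Beta(1, b) given per-permutation minimum p-values."""
    log_terms = [math.log1p(-p) for p in min_pvalues if p < 1.0]
    return -len(log_terms) / sum(log_terms)


def beta_adjusted_pvalue(best_nominal_p, shape2):
    """P(min p <= best_nominal_p) under Beta(1, shape2): 1 - (1 - p)^b."""
    if best_nominal_p >= 1.0:
        return 1.0
    return -math.expm1(shape2 * math.log1p(-best_nominal_p))


def empirical_adjusted_pvalue(best_nominal_p, min_pvalues):
    """Direct permutation estimate; it cannot go below 1 / (N + 1)."""
    hits = sum(1 for p in min_pvalues if p <= best_nominal_p)
    return (hits + 1) / (len(min_pvalues) + 1)
```

Given the smallest nominal p-value from each of 1000 permutations, the empirical estimate bottoms out at 1/1001, while the fitted distribution can be evaluated at any observed p-value, however small.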


davidaknowles commented on August 17, 2024

Closing this as it's more of a research question.
