enrichment of high p-values about leafcutter HOT 7 CLOSED

davidaknowles commented on August 17, 2024

enrichment of high p-values

from leafcutter.

Comments (7)

goldenflaw commented on August 17, 2024

We did not solve this problem. A few immediate thoughts:

(1) Is it a problem per se that such a relationship exists for sQTL
mapping? The cutoffs are quite small: I'm wondering if it affects the set
of sQTL calls. I can't imagine it will affect it that much (the signal is
weak and most of the p-values are above the cutoffs anyway).

(2) Related to (1). To compute our FDR cutoffs, we used permutation
techniques, which should control for this effect because the effect must be
independent of the genotype labels.
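
A minimal sketch of the idea in (2), for a single SNP-intron pair. This is an illustration only, not the pipeline used for the FDR cutoffs mentioned above: the function names, the use of absolute Pearson correlation as the test statistic, and the default of 1000 permutations are all assumptions. Shuffling the genotype labels breaks any real genotype-phenotype link while leaving the phenotype distribution (including any read-depth artifacts) untouched, so the permuted statistics form a null that already contains the artifact.

```python
import random


def pearson_r(x, y):
    """Plain Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(genotypes, phenotype, n_perm=1000, seed=0):
    """Empirical p-value from shuffling genotype labels (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(genotypes, phenotype))
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-phenotype pairing
        if abs(pearson_r(shuffled, phenotype)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of 0.
    return (hits + 1) / (n_perm + 1)


# Toy data: one phenotype tracks genotype exactly, one is pure noise.
genotypes = [0, 1, 2] * 10
linked = [float(g) for g in genotypes]
noise_rng = random.Random(42)
unlinked = [noise_rng.gauss(0.0, 1.0) for _ in genotypes]

p_linked = permutation_pvalue(genotypes, linked)
p_unlinked = permutation_pvalue(genotypes, unlinked)
```

With `n_perm` permutations the smallest p-value this can report is `1 / (n_perm + 1)`, which is why the number of permutations matters for cost.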

On Tue, Jul 5, 2016 at 10:22 AM, Boxiang Liu wrote:

Hello David and Yang,

This is not an issue per se but I would like to hear your thoughts and
opinions.

I mapped sQTLs with normalized leafcutter ratios, and the p-value
distribution looks non-uniform. In particular, there is a positive slope
towards one, as shown in the figure below.

[image: image]
https://cloud.githubusercontent.com/assets/4122434/16593216/311b2bd8-4299-11e6-95ce-fb30b2555f21.png

I hypothesized that the positive slope is due to introns with low read
counts. For instance, if a particular intron only has a few reads per
sample, it will produce high p-values regardless of the effect size. I
therefore plotted the mean p-value against the binned reads as shown below.

[image: image]
https://cloud.githubusercontent.com/assets/4122434/16593293/806fab50-4299-11e6-8a31-0a396cd8db6f.png

This figure shows that all bins except the last three have a median slightly
above 0.5. This means that unless I set a threshold at mean count >= e^9.72 =
16647.24 reads, I will get non-uniform p-value distributions. However, this
cutoff is unreasonably high.
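
The binning check described above can be sketched roughly as follows. This is not the script that produced the figure: `median_pvalue_by_bin`, its arguments, and the bin width of one natural-log unit are assumptions made for illustration, and the toy numbers are made up.

```python
import math
from statistics import median


def median_pvalue_by_bin(mean_counts, pvalues, bin_width=1.0):
    """Group introns by log(mean read count) and take the median p-value per bin.

    Returns a dict mapping each bin's lower edge (in log units) to its median.
    Introns with a non-positive mean count are skipped.
    """
    bins = {}
    for count, p in zip(mean_counts, pvalues):
        if count <= 0:
            continue
        key = math.floor(math.log(count) / bin_width)
        bins.setdefault(key, []).append(p)
    return {key * bin_width: median(ps) for key, ps in sorted(bins.items())}


# Toy example: four introns with their mean counts and nominal p-values.
bins = median_pvalue_by_bin([3.0, 5.0, 40.0, 20000.0], [0.7, 0.6, 0.55, 0.5])

# The cutoff quoted above, converted from log scale back to reads.
threshold = math.exp(9.72)  # roughly 16647 reads
```

Under a uniform null, every bin's median should sit near 0.5, so bins that drift above it point to the read-depth effect.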

Did you see similar things for your sQTL mappings? If so, how did you
solve the problem?

P.S. My normalization method is similar to the GTEx eQTL normalization:
first quantile-normalize samples against each other, then normalize the
ratio of each intron to a Gaussian.
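
A rough sketch of those two steps, using only the Python standard library. The function names, the `(rank + 0.5) / n` offset in the rank-based inverse normal transform, and the handling of ties (broken by input order) are assumptions; the GTEx pipeline may use different conventions.

```python
from statistics import NormalDist, mean


def quantile_normalize(matrix):
    """Give every sample (column) the same distribution.

    matrix[i][j] is the ratio for intron i in sample j.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean across samples at each rank.
    rank_means = [mean(col[r] for col in sorted_cols) for r in range(n_rows)]
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            out[i][j] = rank_means[rank]
    return out


def inverse_normal_transform(values):
    """Map one intron's values across samples onto standard normal quantiles."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return out


# Toy matrix: 3 introns (rows) by 2 samples (columns).
qn = quantile_normalize([[5, 4], [2, 1], [3, 6]])
z = inverse_normal_transform([0.3, 0.1, 0.2])
```

Because the second step depends only on ranks, introns with very few reads (and therefore many tied or near-tied ratios) still end up looking Gaussian, which hides how little information they carry.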



boxiangliu commented on August 17, 2024

Thanks, Yang.
Just to clarify: the mean count cutoffs are on a log scale, so they are larger than they seem.

I also think that permutation should solve the problem. How did you perform the permutations? There are ~110k introns and ~6M variants. My feeling is that it would take a very long time to get reasonable permutation p-values.


goldenflaw commented on August 17, 2024

You can use standard tools like matrix-eQTL.

Yang




js29 commented on August 17, 2024

FastQTL is an even faster way of doing permutations than MatrixEQTL.


boxiangliu commented on August 17, 2024

Thanks Yang. I am unclear on how you use Matrix eQTL to perform permutations. To get a null distribution for each SNP-intron association, one needs to run ~10,000 or more permutations to reach reasonably low p-values. Although Matrix eQTL is fast, that still sounds quite computationally intensive.

Thanks Jeremy. fastQTL is similar to Matrix eQTL in "nominal pass" mode. I do not want to use the "permutation mode" because it only outputs the top eSNP per intron, but I need all eSNPs. I am not aware of a way to make fastQTL output all eSNPs using permutation. Could I be missing something?


js29 commented on August 17, 2024

You run both modes of FastQTL. To get raw p values for all SNPs, you use nominal pass mode. The permutation pass is just to assess where the effect size of the best SNP per gene falls among the permuted set, i.e. to say "does this gene have a significant QTL at all". If you want to know which is the best SNP for the gene, you just want the nominal p values.
What would you do with p values from all SNPs from all permutations anyway?
As you say, permutations are computationally expensive. That's just the name of the game.
A benefit of fastQTL is the beta approximation, which means that you can get away with 1000 or 10,000 permutations, but still get estimated P values of 1e-20, so you can discriminate cases with strong splice QTLs from those with weak sQTLs. If you run 1000 permutations with MatrixEQTL all you can say is "this was the smallest p value from 1000 permutations", i.e. p < 0.001.
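
To illustrate why a Beta fit lets you go below `1 / (number of permutations)`, here is a simplified sketch. It is not FastQTL's code: as far as I understand, FastQTL fits both Beta shape parameters, whereas this sketch assumes the first shape is fixed at 1 and only estimates an effective number of independent tests `n_eff`, because Beta(1, n) has a closed-form CDF. All function names are hypothetical, and the permuted minimum p-values are simulated rather than taken from real data.

```python
import math
import random


def empirical_pvalue(observed_min_p, permuted_min_ps):
    """Plain permutation p-value; cannot go below 1 / (n_perm + 1)."""
    hits = sum(1 for p in permuted_min_ps if p <= observed_min_p)
    return (hits + 1) / (len(permuted_min_ps) + 1)


def beta_approx_pvalue(observed_min_p, permuted_min_ps):
    """Adjusted p-value from a Beta(1, n_eff) fit to permuted minimum p-values.

    Assumes every permuted minimum lies strictly between 0 and 1.
    """
    # Maximum-likelihood estimate of n for Beta(1, n): n = -m / sum(log(1 - p_i)).
    log_sum = sum(math.log1p(-p) for p in permuted_min_ps)
    n_eff = -len(permuted_min_ps) / log_sum
    # CDF of Beta(1, n) at x is 1 - (1 - x)^n, computed stably for tiny x.
    return -math.expm1(n_eff * math.log1p(-observed_min_p))


# Simulated null: 1000 permutations, each the minimum over 500 independent tests.
rng = random.Random(1)
permuted = [min(rng.random() for _ in range(500)) for _ in range(1000)]

observed = 1e-12  # best nominal p-value for a hypothetical strong sQTL
p_empirical = empirical_pvalue(observed, permuted)
p_beta = beta_approx_pvalue(observed, permuted)
```

Here the plain permutation estimate is stuck at `1 / 1001`, while the fitted tail gives something on the order of `500 * 1e-12`, so strong and weak sQTLs can still be ranked.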


davidaknowles commented on August 17, 2024

Closing this as it's more of a research question.

