Hi,
I would like to ask about some miRNA count discrepancies I noticed when switching between isomiRs versions. In the past I used isomiRs 1.10, and I have now switched to isomiRs 1.16.2. I am fully aware that there were many versions in between, and certainly many improvements that would affect the final miRNA counts, but the differences in quantified miRNA counts are large enough to worry me. For example, here are the first rows of the count table produced by isomiRs 1.10 (output of the function IsomirDataSeqFromFiles()):
METSEQ-T04 METSEQ-T05 METSEQ-T06 METSEQ-T07 METSEQ-T08 METSEQ-T09
"hsa-let-7a-2-3p" 16 3 22 1 18 1
"hsa-let-7a-3p" 444 436 807 474 564 835
"hsa-let-7a-5p" 250567 211944 536064 342337 309013 339820
"hsa-let-7b-3p" 94 164 531 272 104 217
"hsa-let-7b-5p" 59497 70568 198727 161083 47276 114909
"hsa-let-7c-3p" 17 39 83 47 22 14
"hsa-let-7c-5p" 8253 24841 91330 30526 7284 12840
"hsa-let-7d-3p" 276 279 937 452 483 805
"hsa-let-7d-5p" 8848 6314 17324 13667 16358 21746
and here by isomiRs 1.16.2:
METSEQ-T04 METSEQ-T05 METSEQ-T06 METSEQ-T07 METSEQ-T08 METSEQ-T09
"hsa-let-7a-2-3p" 8 2 10 0 15 1
"hsa-let-7a-3p" 346 353 601 349 476 636
"hsa-let-7a-5p" 223211 186839 468075 293262 275729 287835
"hsa-let-7b-3p" 59 88 345 144 67 123
"hsa-let-7b-5p" 44572 50948 147743 115957 33111 82647
"hsa-let-7c-3p" 14 35 64 38 18 12
"hsa-let-7c-5p" 6589 19765 73938 23852 5792 10195
"hsa-let-7d-3p" 224 220 749 276 366 623
"hsa-let-7d-5p" 7069 4855 13515 10135 12969 15318
The difference is really big: summed over all miRNAs for each sample, it reaches up to 1M counts, so I would like to know what is causing it. In issue #17, @svattathil mentioned that some noise filtering happens in isomiRs, but I could not find any information about it.
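To put numbers on the gap for the rows shown above, here is a minimal R sketch. It only covers the nine let-7 rows pasted in this post, so its totals are much smaller than the full-table difference mentioned above. The object names (`counts_1_10`, `counts_1_16`, `delta`) are my own and are not part of isomiRs.

```r
# Sample and miRNA labels copied from the tables above
samples <- c("METSEQ-T04", "METSEQ-T05", "METSEQ-T06",
             "METSEQ-T07", "METSEQ-T08", "METSEQ-T09")
mirnas <- c("hsa-let-7a-2-3p", "hsa-let-7a-3p", "hsa-let-7a-5p",
            "hsa-let-7b-3p", "hsa-let-7b-5p", "hsa-let-7c-3p",
            "hsa-let-7c-5p", "hsa-let-7d-3p", "hsa-let-7d-5p")

# Counts reported by isomiRs 1.10 (one row per miRNA)
counts_1_10 <- matrix(c(
  16, 3, 22, 1, 18, 1,
  444, 436, 807, 474, 564, 835,
  250567, 211944, 536064, 342337, 309013, 339820,
  94, 164, 531, 272, 104, 217,
  59497, 70568, 198727, 161083, 47276, 114909,
  17, 39, 83, 47, 22, 14,
  8253, 24841, 91330, 30526, 7284, 12840,
  276, 279, 937, 452, 483, 805,
  8848, 6314, 17324, 13667, 16358, 21746
), nrow = 9, byrow = TRUE, dimnames = list(mirnas, samples))

# Counts reported by isomiRs 1.16.2 for the same miRNAs and samples
counts_1_16 <- matrix(c(
  8, 2, 10, 0, 15, 1,
  346, 353, 601, 349, 476, 636,
  223211, 186839, 468075, 293262, 275729, 287835,
  59, 88, 345, 144, 67, 123,
  44572, 50948, 147743, 115957, 33111, 82647,
  14, 35, 64, 38, 18, 12,
  6589, 19765, 73938, 23852, 5792, 10195,
  224, 220, 749, 276, 366, 623,
  7069, 4855, 13515, 10135, 12969, 15318
), nrow = 9, byrow = TRUE, dimnames = list(mirnas, samples))

# Absolute loss per miRNA and sample (old minus new)
delta <- counts_1_10 - counts_1_16

# Loss per sample, absolute and as a percentage of the 1.10 counts
print(colSums(delta))
print(round(100 * colSums(delta) / colSums(counts_1_10), 1))
```

In this excerpt every entry of `delta` is zero or positive, i.e. 1.16.2 never reports more counts than 1.10 for these miRNAs.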
The problem might be related to the issue with non-functioning conda isomiRs packages (latest versions) that I already posted here. Specifically, I suspect the changed syntax of the dplyr package, because when running IsomirDataSeqFromFiles() in isomiRs 1.16.2 I get the following warnings (one for each input line, although the output table is still produced):
1: Problem with `mutate()` input `prop`.
ℹ Chi-squared approximation may be incorrect
ℹ Input `prop` is `list(tidy(prop.test(value, total, pctco, alternative = "greater")))`.
ℹ The error occurred in row 6399.
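For reference, the message text itself appears to come from `stats::prop.test()` rather than from dplyr: base R emits it whenever an expected count in the test is below 5. Below is a minimal sketch that triggers it outside isomiRs. The values for `x`, `n`, and `p` are made up for illustration. In the call from the warning, `value`, `total`, and `pctco` occupy those three positions, but I do not know their actual values.

```r
# Capture the warning message instead of letting it print
msg <- NULL
res <- withCallingHandlers(
  # Hypothetical inputs: 1 success out of 3 trials, tested against p = 0.1.
  # Expected counts are 0.3 and 2.7, both below 5.
  prop.test(x = 1, n = 3, p = 0.1, alternative = "greater"),
  warning = function(w) {
    msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(msg)          # the captured warning text
print(res$p.value)  # the test still returns a result despite the warning
```

If that is what happens here, the warning would be expected for low-count rows and would not by itself explain the change in counts, but I may be wrong about that.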
I am using your great package a lot now and otherwise it does exactly what I need, so thank you for your work on this software so far :) I will very much appreciate it if you can look into this issue.
All the input/output files are here.
Thank you!
Karolina