seppinho / mutserve Goto Github PK
View Code? Open in Web Editor NEWmtDNA Variant Caller
Home Page: https://mutserve.vercel.app
License: MIT License
mtDNA Variant Caller
Home Page: https://mutserve.vercel.app
License: MIT License
Does this file still exist? We do not have it in the Results-Tab currently. What about e.g. the coverage plots?
Would be helpful if we have the coverage per strand in the variant file (txt) as well
Hi,
I'm really excited to try some of your mtDNA tools with long PacBio amplicons. Since it's a new data type, I'm never sure whether I'm hitting an issue related to the data type or completely unrelated. I really appreciate any suggestions you have to help with this.
Does mutation-server
require the contig name rCRS
? It appears to be asking for a specific contig name, but I can't understand what is expected from the error message.
$ samtools idxstats bc1001.minimap2.aln.bam
NC_012920.1 16569 23070 0
* 0 0 0
$ head ../rCRS.fasta.fai
NC_012920.1 16569 57 70 71
$ $MUTATIONSERVER analyse-local --input . --reference ../rCRS.fasta --level 0.01 --outputRaw raw.txt --outputVar var.txt --baq true --baseQ 20 --mapQ 20 --alignQ 30 --indel true
Command [--input, ., --reference, ../rCRS.fasta, --level, 0.01, --outputRaw, raw.txt, --outputVar, var.txt, --baq, true, --baseQ, 20, --mapQ, 20, --alignQ, 30, --indel, true]
Mutation Server v1.1.6 -- Low-frequency Variant Detection
Division of Genetic Epidemiology - Medical University of Innsbruck
(c) Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer
Processing: bc1001.minimap2.aln.bam
htsjdk.samtools.SAMException: Unable to find entry for contig: rCRS
at htsjdk.samtools.reference.FastaSequenceIndex.getIndexEntry(FastaSequenceIndex.java:202)
at htsjdk.samtools.reference.IndexedFastaSequenceFile.getSubsequenceAt(IndexedFastaSequenceFile.java:217)
at genepi.mut.util.BaqAlt.calcBAQFromHMM(BaqAlt.java:669)
at genepi.mut.util.BaqAlt.baqRead(BaqAlt.java:928)
at genepi.mut.pileup.BamAnalyser.analyseRead(BamAnalyser.java:145)
at genepi.mut.pileup.PileupToolLocal.analyseReads(PileupToolLocal.java:208)
at genepi.mut.pileup.PileupToolLocal.run(PileupToolLocal.java:159)
at genepi.base.Tool.start(Tool.java:193)
at genepi.base.Toolbox.start(Toolbox.java:44)
at genepi.mut.Server.main(Server.java:38)
We focused on identifying heteroplasmy levels in nanopore sequencing data using Mutserve2 with default settings. We simulated various levels of heteroplasmy for one mitochondrial variant using bioinformatics. What we observed is that the proportion of called variants starts to decline at a heteroplasmy ratio between 0.93 and 0.97, with values mainly ranging between 0.93 and 0.94.
This suggests that the variant, which is present in more than 90% of reads, wasn't called in about 70 out of 999 attempts. Furthermore, this variant is in the .txt file but flagged as strand bias, so it's filtered out and doesn't appear in the final VCF file.
Do you have any ideas to explain this anomaly? Are there specific parameters that can be adjusted to address this issue?
Hi, I had a question about the correct output for the test data included in the Github. I downloaded test.cram from mutserve/test-data/mtdna/cram/input/, and also downloaded the latest version of mutserve, v2.0.0-rc12. I then ran the tool as follows:
./mutserve call test.cram --reference rCRS.fasta --output TEST.vcf.gz
The tool completed but in the resulting variant file, I only have 51 variants, several of which say "STRAND_BIAS". However, in the output file you have posted to the github under mutserve/test-data/mtdna/cram/output/variants.txt, there are 85 variants, none of which say "STRAND_BIAS" e.g. all are "PASS".
I was wondering why my output does not match yours; perhaps I'm running mutserve incorrectly? Thanks!
Hi,
The documentation link https://mitoverse.readthedocs.io/mutserve/mutserve/ gives 404 error.
Could you please provide an updated link?
Thanks!
I uploaded sampel samphir 4343. The annotation of aas is currently wrong. The psitions 594, 621 and 666 are currently reported to change protein sequence:
Lys>Glu
Trp>Arg
Arg>Trp
This is not correct. These posititions are the Type b positions and are synonymous changes
594 A>G: Gln4Gln
621 T>C: His13His
666 A>T Thr28Thr
Attention: At pos 666 also A>C changes exist, but also these make Thr>Thr.
UPDATE: Checked a few others. Are not correct too, e.g.
672 A>G: should be Arg>Arg
4925G>A : Should be Ala>Thr
584 C>T: Should be Pro>Leu
Content is the same (except for annotation)?
If a mixture model was generated, by providing the expected variants (homoplasmic as well as heteroplasmic mutations) in an own file, the precision, sensitivity and specificity could be emitted directly.
Buttons "share all data" and "share data" on results page (tables, on the right) do the same. Is this intended?
https://www.thermofisher.com/order/catalog/product/A30938
Overlapping reads mapped to larger genome than rCRS, needs remapping (basically modulo)
-use mix-ups to do that.
Indel: true
Write VCF: true
Exception in thread "main" java.lang.NumberFormatException: For input string: "311.1"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at genepi.io.table.reader.AbstractTableReader.getInteger(AbstractTableReader.java:49)
at genepi.io.table.reader.AbstractTableReader.getInteger(AbstractTableReader.java:29)
at genepi.mut.util.MutationServerReader.parse(MutationServerReader.java:38)
at genepi.mut.util.MutationServerReader.parse(MutationServerReader.java:18)
at genepi.mut.util.VcfWriter.createVCF(VcfWriter.java:40)
at genepi.mut.pileup.PileupToolLocal.run(PileupToolLocal.java:236)
at genepi.base.Tool.start(Tool.java:193)
at genepi.base.Toolbox.start(Toolbox.java:44)
at genepi.mut.Server.main(Server.java:42)
How it could be possible to optimize mutserve to analyse Nanopore reads? (long-reads)
thanks
INDEL ERROR: A Exception raise up when calling variant from some bam, tab txt produce success, but a blank vcf result. when move out calling deletions and insertions, vcf produce normally.
Error Detail:
Processing: .bam
Detected reference: rcrs
Exception in thread "main" java.lang.IllegalStateException: Allele in genotype * not in the variant context [C, CCCC]
at htsjdk.variant.variantcontext.VariantContext.validateGenotypes(VariantContext.java:1359)
at htsjdk.variant.variantcontext.VariantContext.validate(VariantContext.java:1297)
at htsjdk.variant.variantcontext.VariantContext.(VariantContext.java:400)
at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:494)
at htsjdk.variant.variantcontext.VariantContextBuilder.make(VariantContextBuilder.java:488)
at genepi.mut.util.VcfWriter.createVCF(VcfWriter.java:204)
at genepi.mut.pileup.PileupToolLocal.run(PileupToolLocal.java:239)
at genepi.base.Tool.start(Tool.java:193)
at genepi.base.Toolbox.start(Toolbox.java:44)
at genepi.mut.Server.main(Server.java:42)
Parameters used:
Command [--input, *.bam, --output, *.vcf, --reference, rCRS.fasta, --level, 0.01, --deletions, --insertions]
mtDNA Low-frequency Variant Detection v1.3.0
Division of Genetic Epidemiology - Medical University of Innsbruck
(c) Sebastian Schoenherr, Hansi Weissensteiner, Lukas Forer
rCRS shows up also in Reference sequences list for LPA
Hello,
I am trying to run mutserve using .cram files as from documentation seems it supports it:
Usage: mutserve call [--baq] [--deletions] [--help] [--insertions] [--no-ansi]
[--no-freq] [--version] [--write-fasta] [--write-raw]
[--alignQ=<alignQ>] [--baseQ=<baseQ>]
[--contig-name=<contig>] [--level=<level>] [--mapQ=<mapQ>]
[--mode=<mode>] --output=<output> --reference=<reference>
[--threads=<threads>] [<input>...]
Call homoplasmic and heteroplasmic positions.
**[<input>...] BAM/CRAM files**
--alignQ=<alignQ> Minimum Align Quality
Default: 30
--baq Enable BAQ
Default: false
--baseQ=<baseQ> Minimum Base Quality
Default: 20
--contig-name=<contig>
Specifify mtDNA contig name
Default: null
--deletions Call deletions (beta)
Default: false
--help
--insertions Call insertions (beta)
Default: false
--level=<level> Minimum Heteroplasmy Level
--mapQ=<mapQ> Minimum Map Quality
Default: 20
--mode=<mode> Specifify mutserve mode
Default: mtdna
--no-ansi Disable ANSI support
Default: false
--no-freq Use Frequency File
Default: false
--output=<output> "Output (txt or vcf)
--reference=<reference>
Reference
--threads=<threads> Number of threads
--version
--write-fasta Write fasta file
Default: false
--write-raw Write raw file
Default: false
This is my code:
mutserve call --reference rCRS.fa --output out_from_cram.vcf --threads 2 my_input.cram
But the program does not recognise the .cram file, and sees it as a sam file:
htsjdk.samtools.SAMException: Input file must be bam file, not sam file
Is there a way to solve this?
Might need to be updated?
see testfile
QS6LK:01421:01280 0
QS6LK:01421:01280 0
second should be 2048!!
Dear seppinho,
I am having trouble using mutserve call with whole mitogenome data. I get the following error when running the program:
File.bam[>-------------------------------------][00:00:00]
null[>-------------------------------------][00:00:00]
java.lang.IllegalArgumentException: Invalid reference index -1
at htsjdk.samtools.QueryInterval.<init>(QueryInterval.java:24)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:538)
at genepi.mut.tasks.VariantCallingTask.run(VariantCallingTask.java:129)
at lukfor.progress.tasks.Task.call(Task.java:39)
at lukfor.progress.tasks.Task.call(Task.java:8)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
java.lang.Exception: Invalid reference index -1
Variant Calling failed. Mutserve terminated.
This is my workflow:
samtools faidx $GENOME
bwa mem -t $THREADS -R $RG $GENOME $fq1 $fq2 > $sam
samtools view -@ $THREADS -b $sam > $bam
samtools sort -@ $THREADS $bam > $sorted_bam
samtools index $sorted_bam
$MUT/mutserve call $sorted_bam --baseQ 25 --level 0.05 --output $variants --write-fasta true --threads $THREADS --reference=$GENOME
What am I doing wrong here? Many thanks!
Kind regards
I am getting the following error message when trying to run the variant caller tool. I saw someone had a similar error in another thread, which was solved by removing spaces from the reference title #43. There are no spaces in the reference title in my sample, but I think the problem must be with the format of either the reference or the index file. Any other ideas about what could be causing the error?
call, --reference=sample72mtassembly.fasta, --output, sample72out.txt, --threads, 4, sample72_sorted_mappedreadsonly.bam]
โ sample72_sorted_mappedreadsonly.bam[>-------------------------------------][00:00:00]
java.lang.IllegalArgumentException: Invalid reference index -1
at htsjdk.samtools.QueryInterval.(QueryInterval.java:24)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:538)
at genepi.mut.tasks.VariantCallingTask.run(VariantCallingTask.java:129)
at lukfor.progress.tasks.Task.call(Task.java:39)
at lukfor.progress.tasks.Task.call(Task.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
java.lang.Exception: Invalid reference index -1
at genepi.mut.tasks.VariantCallingTask.run(VariantCallingTask.java:133)
at lukfor.progress.tasks.Task.call(Task.java:39)
at lukfor.progress.tasks.Task.call(Task.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Variant Calling failed. Mutserve terminated.
Would add COV or Coverage to ""Mean" & "SD"
Integrate Paired end validation step (Telefonat 19.12.2016).
First, I'm wondering if building and installing with maven enables the other steps of the pipeline for local use, specifically the align tool? Or is analyse-local the only locally available step?
Also I tried building with 'mvn install' and there were 5 failed tests resulting in a build failure:
Failed tests: AlignmentPETest(genepi.mut.steps.MutationServerTest)
AlignmentSETest(genepi.mut.steps.MutationServerTest)
SortTestPE(genepi.mut.steps.MutationServerTest)
SortTestSE(genepi.mut.steps.MutationServerTest)
PileupFromFastqTest(genepi.mut.steps.MutationServerTest)
I can run mutation-server locally using the provided test data (input fastq files). However, when I use my own fastq files, I get the following error message (below). Can you offer any assistance on what the problem might be?
Exception in thread "main" java.lang.NullPointerException
at genepi.mut.pileup.PileupToolLocal.run(PileupToolLocal.java:118)
at genepi.base.Tool.start(Tool.java:193)
at genepi.base.Toolbox.start(Toolbox.java:44)
at genepi.mut.Server.main(Server.java:42)
Hi, i got some porblems like below when i use the mutserve .
"htsjdk.samtools.SAMException: Unable to find entry for contig: rCRS"
My bam was mapped to the rCRS,but also get that error.
What should i do to solve that?
Looking forward to your reply!
in rc6 -> --write-raw emits wrong file - no newline after last entry per sample after position 16569:
S5.bam 16569 G G - G - 2617 2908 5525 0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 []S20.bam 1 G G - G - 1088 1361 2449 0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 []
S20.bam 2 A A G A - 1128 1364 2492 0 0.0 0.999113475177305 0.0 8.865248226950354E-4 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.999113475177305 1.0 8.865248226950354E-4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 []
Currently, all files must be clicked one by one.
Maybe, integrate also a possibility to donwlaod all TXT files at once (otherwise also also all large bam files would be donwloaded with the results - not every user might need to donwlaod the BAM files and might be happy with the txt files alone)
Command run:
java -jar mutation-server-1.1.11.jar analyse-local --input <my-sample>.mt.bam --output test-output/ --reference hs38DH.fa --level 0.01 --writeVcf
Input Folder: <my-sample>.mt.bam
Output Folder: /test-output
Detection limit: 0.01
Base Quality: 20
Map Quality: 20
Alignment Quality: 30
BAQ: true
Indel: false
Write VCF: true
Produces the following error:
Exception in thread "main" java.lang.OutOfMemoryError
at java.lang.AbstractStringBuilder.hugeCapacity(AbstractStringBuilder.java:161)
at java.lang.AbstractStringBuilder.newCapacity(AbstractStringBuilder.java:155)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:125)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
at java.lang.StringBuilder.append(StringBuilder.java:136)
at genepi.mut.util.ReferenceUtil.readInReference(ReferenceUtil.java:63)
at genepi.mut.pileup.BamAnalyser.<init>(BamAnalyser.java:79)
at genepi.mut.pileup.PileupToolLocal.run(PileupToolLocal.java:200)
at genepi.base.Tool.start(Tool.java:193)
at genepi.base.Toolbox.start(Toolbox.java:44)
at genepi.mut.Server.main(Server.java:42)
I have also increased the Java max heap size to 200Gb and received the same results.
Is there any intention to allow naming of jobs to ease identification of job contents when multiple jobs are submitted? I think, this might be helpful (also for mtDNA Server?).
vcfWriter.createVCF(variantPath, output, reference, "chrM", 16569, App.VERSION + ";" + App.COMMAND);
replace with reference.length and contig name
Hi! I am not sure why I haven't been able to find references for using mutserve on mouse data.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.