Comments (3)
Sorry for the late reply.
velocyto looks for the following standard attributes in the GTF file:
transcript_id, transcript_name, gene_id, gene_name, exon_number
This info should be standard in a GTF file formatted according to the GENCODE specification. Here you are missing transcript_name, gene_name and exon_number.
However, I see the issue here: those entities don't make sense for the ERCC spikes. I could easily catch this error and return some default, uninformative value like
transcript_id="NoId", transcript_name="NoName", gene_id="NoId", gene_name="NoName", exon_number="1"
However, I am afraid that this would sometimes lead users with incorrectly formatted GTF files to weird outputs.
I think the best solution is to still throw an error, but with a more informative error message. (Here the second-to-last line made it fairly clear that regex_trname was failing, but I can try to be even more explicit.)
For you, the best solution is to just add those entries to the GTF.
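If it helps, adding the entries could be sketched roughly like this. This is only an illustration, not part of velocyto: the function name `add_missing_attributes` is made up, and it assumes that reusing gene_id as gene_name, reusing transcript_id as transcript_name, and setting exon_number to "1" is acceptable for single-exon spike-in records.

```python
import re

# Matches `key "value";` as well as unquoted `key value;` attribute pairs.
ATTR_RE = re.compile(r'(\w+)\s+"?([^";]*)"?\s*;')


def add_missing_attributes(line: str) -> str:
    """Return a GTF line with gene_name, transcript_name and exon_number added if absent.

    Assumption: falling back to the ids and to exon number 1 is fine for
    single-exon entries such as spike-ins. Other lines are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line  # header comment or blank line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line  # not a standard 9-column GTF record, leave it alone

    attrs = fields[8].strip()
    found = dict(ATTR_RE.findall(attrs))

    additions = []
    if "gene_name" not in found and "gene_id" in found:
        additions.append(f'gene_name "{found["gene_id"]}";')
    if "transcript_name" not in found and "transcript_id" in found:
        additions.append(f'transcript_name "{found["transcript_id"]}";')
    if "exon_number" not in found and fields[2] == "exon":
        additions.append('exon_number "1";')

    if not additions:
        return line
    if not attrs.endswith(";"):
        attrs += ";"
    fields[8] = attrs + " " + " ".join(additions)
    return "\t".join(fields) + newline
```

You would run every line of the spike-in part of the GTF through this function and write the result to a new file, so that the patching is explicit and visible rather than hidden inside the tool.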
from velocyto.py.
Note that velocyto will now be a little more forgiving: if transcript_name or exon_number is not specified, no error will be thrown.
Velocyto now also prints the line of the GTF that caused the error, helping the user debug. I still think that too much "secret patching" of the GTF data, where the user is not aware of what is going on, should be avoided, because it might generate anomalies that are difficult to trace back.
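To make the idea concrete, here is a minimal sketch of that kind of behaviour. It is not velocyto's actual code: `parse_attributes` and `GtfAttributeError` are hypothetical names, and treating gene_id, transcript_id and gene_name as the required set (with transcript_name and exon_number optional) is my assumption based on the discussion above.

```python
import re

# Matches `key "value";` as well as unquoted `key value;` attribute pairs.
ATTR_RE = re.compile(r'(\w+)\s+"?([^";]*)"?\s*;')

# Assumed split between required and optional attributes (hypothetical).
REQUIRED = ("gene_id", "transcript_id", "gene_name")
OPTIONAL = ("transcript_name", "exon_number")


class GtfAttributeError(ValueError):
    """Raised when a GTF record lacks an attribute that cannot be skipped."""


def parse_attributes(line: str, lineno: int) -> dict:
    """Parse the attribute column of one GTF record.

    Fails loudly, quoting the offending line, instead of silently
    substituting placeholder values for required attributes.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise GtfAttributeError(
            f"line {lineno}: expected 9 tab-separated columns, "
            f"got {len(fields)}:\n{line.rstrip()}"
        )
    found = dict(ATTR_RE.findall(fields[8]))
    missing = [key for key in REQUIRED if key not in found]
    if missing:
        raise GtfAttributeError(
            f"line {lineno}: missing required attribute(s) "
            f"{', '.join(missing)} in GTF entry:\n{line.rstrip()}"
        )
    # Optional attributes are reported as None rather than invented.
    for key in OPTIONAL:
        found.setdefault(key, None)
    return found
```

The point of the design is that the user sees exactly which line and which attribute caused the failure and can fix the file, while nothing is patched behind their back.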