Comments (12)

dewyman commented on June 8, 2024

Yes, absolutely! My colleague and I have put together a utility to extract splice junctions/exon positions as well as the transcripts that contain them. I'm going to run some tests on it and finalize the details, and then once I'm comfortable all is going as intended, I'll let you know so you can try it out!

from talon.

fairliereese commented on June 8, 2024

Hey, we just fixed the slow runtime. It should run MUCH faster now, and you should be able to use full GTFs. Let us know if it's working for you!

dewyman commented on June 8, 2024

Hi! So to clarify, are you attempting to look at cases where you have novel splice junctions in a known gene, and then see how far they are away from known junctions? I don't think we have an existing formal utility that outputs the splice junctions, but it wouldn't be too difficult to make one!

gm-nyc commented on June 8, 2024

Yes! I am trying to quantify the distance from the reference splice junctions to the novel junctions I'm seeing in my samples, which have splicing aberrations. The mis-splicing can be transcriptome-wide so I am trying to generate an exon-based matrix for my cells. Does that make sense?
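
The distance calculation described here can be sketched as follows. This is only an illustration, not part of TALON: the function name `nearest_distance` and the example coordinates are hypothetical, and it assumes the known splice-site positions have already been collected per chromosome and strand into a sorted list.

```python
from bisect import bisect_left


def nearest_distance(pos, known_sorted):
    """Signed distance (pos - known) to the closest known splice-site position.

    known_sorted must be a sorted list of positions from the same chromosome
    and strand as pos. Ties are resolved toward the lower coordinate.
    """
    if not known_sorted:
        raise ValueError("no known splice sites supplied")
    i = bisect_left(known_sorted, pos)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = known_sorted[max(i - 1, 0):i + 1]
    best = min(candidates, key=lambda k: abs(pos - k))
    return pos - best


# Made-up reference positions, for illustration only.
known_donors = sorted([1000, 5000, 9000])
print(nearest_distance(5012, known_donors))  # 12
print(nearest_distance(990, known_donors))   # -10
```

A distance of 0 means the junction end matches a reference site exactly; collecting the nonzero values per exon is one possible starting point for an exon-based matrix.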

iam2b commented on June 8, 2024

Hi, I am very happy to have found this tool for my analysis. I tried my first run today and hit a problem. The error message was "SAM transcript xxx lacks an MD tag". My samples are direct RNA Nanopore reads mapped with Minimap2.
By the way, will you develop a tool like MISO or rMATS to help detect changes in alternative splicing?

dewyman commented on June 8, 2024

Hi iam2b,
You should be able to fix this issue by running Minimap2 with the --MD flag (see issue #45). Currently we are not in the business of developing our own downstream alt splicing tool, but you might consider trying this one https://bioconductor.org/packages/release/bioc/html/IsoformSwitchAnalyzeR.html. The developer has added support for TALON abundance files.
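
As a rough sketch of the fix, the Minimap2 call with `--MD` could be assembled like this. The helper name `minimap2_md_command` and the file names are placeholders, and `-ax splice` is assumed here as the usual preset for spliced long-read alignment; adjust it to whatever options you already use.

```python
import shlex


def minimap2_md_command(reference_fasta, reads_fastq):
    """Build a minimap2 command line that writes MD tags to the SAM output."""
    # --MD asks minimap2 to emit the MD tag that TALON complained about.
    return ["minimap2", "-ax", "splice", "--MD", reference_fasta, reads_fastq]


cmd = minimap2_md_command("genome.fa", "reads.fastq")  # placeholder file names
print(shlex.join(cmd))
# To actually run it and capture the SAM output:
#   import subprocess
#   with open("aligned.sam", "w") as out:
#       subprocess.run(cmd, stdout=out, check=True)
```

For direct RNA reads, the Minimap2 README also suggests adding `-uf -k14` to the splice preset; whether that suits your data is worth checking against the Minimap2 documentation.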

iam2b commented on June 8, 2024

Thank you very much. I have solved this problem.
Merry Christmas!

gm-nyc commented on June 8, 2024

Hi dewyman,

I wanted to clarify my question a little. The reason I was asking for the distance from the canonical splice junction is that I am trying to identify (and quantify) alternate 3' and 5' splice site usage and thought that the positional information for each junction would be useful since it could be compared with the reference. Thanks for your help and any thoughts/suggestions would be welcome! Hope you're having a good new year.
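
One way to flag candidate alternate 5'/3' splice sites from positional information is sketched below. It is not a TALON feature: `classify_junction` is a hypothetical helper, and it assumes introns are given as (start, end) pairs with start < end in genomic coordinates, all from the same chromosome and strand. A junction that shares one end with a reference intron but not the other is only a candidate, since it could also come from a different exon altogether.

```python
def classify_junction(start, end, strand, known_introns):
    """Label an intron relative to a set of reference (start, end) introns."""
    if (start, end) in known_introns:
        return "known"
    start_known = start in {s for s, _ in known_introns}
    end_known = end in {e for _, e in known_introns}
    if start_known and end_known:
        return "novel combination of known sites"
    if not start_known and not end_known:
        return "both sites novel"
    # Exactly one end is novel. On the + strand the lower coordinate is the
    # donor (5' site) and the higher one the acceptor (3' site); on the
    # - strand the roles are swapped.
    novel_is_higher_coord = start_known
    if strand == "+":
        return "alt 3' site" if novel_is_higher_coord else "alt 5' site"
    return "alt 5' site" if novel_is_higher_coord else "alt 3' site"


# Made-up reference introns, for illustration only.
reference = {(100, 200), (300, 400)}
print(classify_junction(100, 210, "+", reference))  # alt 3' site
print(classify_junction(100, 210, "-", reference))  # alt 5' site
```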

dewyman commented on June 8, 2024

Hi! Don't worry, your question makes total sense. We've been working on a utility to help address it. It's technically complete and has passed our tests, but it runs slowly, so we were hoping to do a bit more work to speed it up. In the meantime though, you're welcome to try it out:

usage: talon_get_sjs [-h] [--gtf GTF] [--db DB] [--ref REF_GTF] [--mode MODE]
                     [--outprefix OUTPREFIX]

Extracts the locations, novelty, and transcript assignments of exons/introns
in a TALON database or GTF file. All positions are 1-based.

optional arguments:
  -h, --help            show this help message and exit
  --gtf GTF             TALON GTF file from which to extract exons/introns
  --db DB               TALON database from which to extract exons/introns
  --ref REF_GTF         GTF reference file (ie GENCODE). Will be used to label
                        novelty.
  --mode MODE           Choices are 'intron' or 'exon' (default is 'intron').
                        Determines whether to include introns or exons in the
                        output
  --outprefix OUTPREFIX
                        Prefix for output file

As a side note, when you run this script in 'intron' mode, the start/end positions currently include the exon base that flanks the intron on each side.
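
If intron-only coordinates are needed, the flanking exon bases can be trimmed after the fact. A minimal sketch, assuming the start/end values have already been read from the output as 1-based integers (the helper name `intron_only_coords` is made up for this example):

```python
def intron_only_coords(start, end):
    """Drop the flanking exon base on each side of a 1-based intron span."""
    lo, hi = sorted((start, end))
    if hi - lo < 2:
        # Two adjacent exon bases leave no intronic base between them.
        raise ValueError("span too short to contain an intron")
    return lo + 1, hi - 1


print(intron_only_coords(100, 201))  # (101, 200)
```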

Another approach you might try for extracting splice junctions from a TALON GTF file would be to use the TranscriptClean utility described here. Outputs from this script follow the STAR splice junction output format, which is described in the STAR manual (section 4.4) here.

I hope this helps, but don't hesitate to reach out if you have more questions!
Best,
Dana

gm-nyc commented on June 8, 2024

Hi Dana,

Thanks for your help. I am trying to run this script and it's either extremely slow or getting stuck. I subsetted my GTF by chromosome and took the smallest one (chrM in my case, with 48 total lines in the GTF) and the script is still running. Is that the expected speed or do you think there is another issue?

Here is the command I'm running:

~/talon/talon-4.4.2/python/bin/talon_get_sjs --gtf ${file} --ref ~/gencode.v31.annotation.gtf --mode intron --outprefix intron

dewyman commented on June 8, 2024

Thanks for letting us know. We'll look into it some more.

dewyman commented on June 8, 2024

It's taking so long with your current command because your --ref file is the entire annotation. If you want to run just chrM, consider subsetting the reference GTF as well.
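
Subsetting the reference could look something like the sketch below. The function name `subset_gtf` and the file paths in the usage note are placeholders; it assumes a standard tab-separated GTF whose first column is the chromosome name and whose header lines start with `#`.

```python
def subset_gtf(in_path, out_path, chrom):
    """Copy header lines and the records for one chromosome to a new GTF.

    Returns the number of feature records written.
    """
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep GTF header/comment lines as they are
            elif line.split("\t", 1)[0] == chrom:
                dst.write(line)
                kept += 1
    return kept
```

For example, `subset_gtf("gencode.v31.annotation.gtf", "gencode.v31.chrM.gtf", "chrM")` would write a chrM-only reference that can then be passed to --ref.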
