Comments (15)

nawrockie commented on September 28, 2024

@cimendes : I added miniscripts/annotate-tbl2gff.pl, a 'miniscript' that converts v-annotate.pl .tbl output files to GFF3 format. The script is currently in the develop branch and will be included in the next released version. For now, the version in the develop branch should work fine as a standalone conversion script.

This GFF format is not meant to be used for GenBank submissions. Use the .vadr.pass.tbl and .vadr.fail.tbl files for that.

Do

perl annotate-tbl2gff.pl -h

to see information on usage and options.

Please let me know if there are any problems with the script or feature requests.

from vadr.

nawrockie commented on September 28, 2024

@taltman : I plan to include this in the next version (next version after 1.1.1).

nawrockie commented on September 28, 2024

Hi @taltman : Sorry GFF output didn't make it into v1.1.1. Could you please share your script with me? I'm curious how you handled the Parent field in the attributes column. Thanks!

taltman commented on September 28, 2024

Hi @nawrockie , sorry for the delayed response.

I hacked together a quick script to do this conversion. I only "validated" it as much as the GFF annotations displayed sensibly in JBrowse. I didn't run it through any GFF validators to see whether it generates compliant GFF files. HTH!

https://bitbucket.org/tomeraltman/darth/src/master/src/tbl2gff.awk

In gearing up for submitting our novel CoV genomes to EBI, I will be verifying that the generated GFF is compliant, so hopefully I'll have an improved version soon.

nawrockie commented on September 28, 2024

Great, thanks @taltman ! If you do end up improving it, please let me know.

zhaoxvwahaha commented on September 28, 2024

Hi @nawrockie , can VADR output GFF3 or GBK files now?

nawrockie commented on September 28, 2024

@zhaoxvwahaha : Unfortunately, not yet. Development of other features has taken priority. Are you able to use the script kindly shared by Tomer Altman above?

https://bitbucket.org/tomeraltman/darth/src/master/src/tbl2gff.awk

Zjianglin commented on September 28, 2024

Hi @nawrockie, it seems that tbl2gff does not handle fuzzy positions correctly when determining the ORF strand.
For example, the lines

<3437   4116    mat_peptide
                       product NS2a

will be converted to MW164737 vadr mat_peptide 4116 <3437 . - . ID=ftr-8;Name=NS2a; in the resulting GFF file.

I tried modifying the script as follows:

$1 && $2 {
        # Copy the coordinates and strip the fuzzy-position markers ('<', '>')
        # so that they compare numerically. gsub() returns the number of
        # substitutions made and leaves a string without markers unchanged.
        begx = $1
        endx = $2
        gsub("[><]", "", begx)
        gsub("[><]", "", endx)
        # start and end keep the original fields, markers included.
        if ( int(begx) < int(endx) ) {
                start  = $1
                end    = $2
                strand = "+"
        } else {
                start  = $2
                end    = $1
                strand = "-"
        }
        ftr_key = $3
        ++ftr_id
#print start, end, strand, ftr_key, ftr_id
}

Is this modification correct?
By the way, is there a Python library that can parse GFF3 files and handle fuzzy positions? BCBio (https://github.com/chapmanb/bcbb) failed to read records with fuzzy positions.
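
For comparison, the strand logic in the awk block above can be sketched in Python. This is only an illustration, not part of tbl2gff or VADR: the function name orient is hypothetical, and unlike the awk version it returns the coordinates as integers with the '<' and '>' markers removed and with the smaller coordinate first.

```python
import re

# Matches the fuzzy-position markers used in .tbl coordinate fields.
_FUZZY = re.compile(r"[<>]")


def orient(beg_field, end_field):
    """Return (start, end, strand) for the first two fields of a .tbl line.

    Hypothetical helper: the markers are stripped before the numeric
    comparison, and the smaller coordinate is always returned first.
    """
    beg = int(_FUZZY.sub("", beg_field))
    end = int(_FUZZY.sub("", end_field))
    if beg < end:
        return beg, end, "+"
    return end, beg, "-"


print(orient("<3437", "4116"))
print(orient("4116", "<3437"))
```

With the NS2a example, the first call reports the plus strand because 3437 is compared with 4116 as a number rather than as the string "<3437".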

nawrockie commented on September 28, 2024

@Zjianglin : I'm not sure about your modification of tbl2gff; you might try asking Tomer Altman, who wrote that code (https://bitbucket.org/tomeraltman).

I don't know of any Python library that can handle the fuzzy positions.

My suggestion would be either to modify tbl2gff so that it does not output the '>' and '<' characters, or to write a simple script that strips them out as an extra step after you've created the GFF file. You could also write a script that strips them out of the .tbl file that vadr creates before running tbl2gff, or try parsing the .ftr output table that vadr creates (https://github.com/ncbi/vadr/blob/master/documentation/formats.md#ftr), but note that the .ftr table does not have coordinate positions 'trimmed' due to Ns, as the .tbl file does.
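
The "extra step" option could look roughly like the following Python sketch. It assumes the GFF file is tab-delimited with start and end in columns 4 and 5; the function name strip_fuzzy_markers is hypothetical. It only removes the markers and does not reorder the coordinates or change the strand.

```python
import re


def strip_fuzzy_markers(gff_line):
    """Remove '<' and '>' from the start and end columns of one GFF line.

    Hypothetical helper. Comment/directive lines and lines with fewer
    than five tab-separated columns are returned unchanged.
    """
    if gff_line.startswith("#"):
        return gff_line
    newline = "\n" if gff_line.endswith("\n") else ""
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) < 5:
        return gff_line
    cols[3] = re.sub(r"[<>]", "", cols[3])  # start column
    cols[4] = re.sub(r"[<>]", "", cols[4])  # end column
    return "\t".join(cols) + newline


example = "MW164737\tvadr\tmat_peptide\t4116\t<3437\t.\t-\t.\tID=ftr-8;Name=NS2a;\n"
print(strip_fuzzy_markers(example), end="")
```

To clean a whole file, the function would be applied to each line in turn and the results written to a new file.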

Zjianglin commented on September 28, 2024

Hi @nawrockie , thanks for your reply and suggestions. I will manually check the genomes with fuzzy positions and strip them out. Thank you again.

Zjianglin commented on September 28, 2024

Hi @taltman , could you please check my modification of [tbl2gff](https://bitbucket.org/tomeraltman/darth/src/master/src/tbl2gff.awk)? The original script does not seem to handle fuzzy positions.

kapsakcj commented on September 28, 2024

+1 for this request

I've had a couple of requests to visualize VADR outputs in IGV, specifically as a GFF3 file.

cimendes commented on September 28, 2024

Is this feature still on the roadmap for VADR development?

nawrockie commented on September 28, 2024

@cimendes : Sorry for the long delay on this requested feature. I'm working on it now and will post another update by the end of next week.

cimendes commented on September 28, 2024

Thank you for the update! That is wonderful news!
