Comments (4)

thackl commented on May 26, 2024

@xvtyzn just wanted to say thanks again for reporting this issue. read_gff3() now checks by default and, if warranted, sorts the exon/CDS coordinates of each gene in a file by their start coordinates.
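A minimal sketch of what such a check-then-sort step could look like in base R. This is not the gggenomes implementation: sort_if_unsorted is a hypothetical helper, and the gene names and coordinates are made up for illustration.

```r
# Hypothetical helper, not the gggenomes code: reorder the rows of one gene's
# exon table only when its start coordinates are out of order.
sort_if_unsorted <- function(feats) {
  if (is.unsorted(feats$start)) {
    feats <- feats[order(feats$start), , drop = FALSE]
  }
  feats
}

# Made-up exons: gene1 is out of order, gene2 is already sorted
exons <- data.frame(
  feat_id = c("gene1", "gene1", "gene1", "gene2", "gene2"),
  start   = c(100L, 900L, 400L, 50L, 300L),
  end     = c(200L, 950L, 500L, 120L, 380L)
)

# Apply the check separately to each gene, then recombine the pieces
fixed <- do.call(rbind, lapply(split(exons, exons$feat_id), sort_if_unsorted))
rownames(fixed) <- NULL
print(fixed)
```

Checking with is.unsorted() first means files that are already in order are passed through untouched.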

Also, your data made me aware of an issue with reading GFF files produced by Augustus. The CDS annotations in those files don't follow the proper GFF format and are therefore not read or displayed correctly. You probably noticed that as well. I might try to find a workaround for that in the future. See #65 for more info.

Cheers
Thomas

from gggenomes.

thackl commented on May 26, 2024

Hi Keigo,

thanks a lot for bringing this to our attention. The intron/exon support is still quite experimental and getting feedback like this really helps! And great job on spotting the issue in the code.

Your fix is a nice one! Ideally, though, read_gff() should not get the order of exons wrong in the first place, i.e. it should read them as they are given in the GFF file. Then re-sorting would not be necessary. I'm currently on holiday and will look into this once I'm back!

xvtyzn commented on May 26, 2024

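Hi,

I found the cause of this error.
It comes from the mrna_exon_introns step in the read_gff3 function. In the exon table below, the rows are ordered alphabetically by exon_id (exon1, exon10, ..., exon14, exon2, ...) rather than by start coordinate.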

mrna_exon_introns <- filter(x, type == "exon") %>%
        select(exon_id = feat_id, start, end, feat_id = parent_ids) %>%
        unchop(feat_id) %>%
        group_by(feat_id) %>%
        summarize(introns = list(coords2introns(start, end)))
# A tibble: 14 x 4
# Groups:   feat_id [1]
   exon_id                     start     end feat_id           
   <chr>                       <int>   <int> <chr>             
 1 aten_0.1.m1.802.m1.exon1  9916243 9919080 aten_0.1.m1.802.m1
 2 aten_0.1.m1.802.m1.exon10 9930781 9931032 aten_0.1.m1.802.m1
 3 aten_0.1.m1.802.m1.exon11 9932680 9932805 aten_0.1.m1.802.m1
 4 aten_0.1.m1.802.m1.exon12 9935662 9935778 aten_0.1.m1.802.m1
 5 aten_0.1.m1.802.m1.exon13 9935983 9936081 aten_0.1.m1.802.m1
 6 aten_0.1.m1.802.m1.exon14 9938879 9939091 aten_0.1.m1.802.m1
 7 aten_0.1.m1.802.m1.exon2  9920151 9920639 aten_0.1.m1.802.m1
 8 aten_0.1.m1.802.m1.exon3  9921529 9921776 aten_0.1.m1.802.m1
 9 aten_0.1.m1.802.m1.exon4  9922434 9922598 aten_0.1.m1.802.m1
10 aten_0.1.m1.802.m1.exon5  9924117 9924315 aten_0.1.m1.802.m1
11 aten_0.1.m1.802.m1.exon6  9924592 9924801 aten_0.1.m1.802.m1
12 aten_0.1.m1.802.m1.exon7  9927854 9927979 aten_0.1.m1.802.m1
13 aten_0.1.m1.802.m1.exon8  9928552 9928674 aten_0.1.m1.802.m1
14 aten_0.1.m1.802.m1.exon9  9930577 9930693 aten_0.1.m1.802.m1

[Screenshot 2021-07-22 14 59 50]

The problem can be solved by sorting this exon tibble by its start column with the arrange() function.

mrna_exon_introns <- filter(x, type == "exon") %>%
        select(exon_id = feat_id, start, end, feat_id = parent_ids) %>%
        unchop(feat_id) %>%
        group_by(feat_id) %>%
        arrange(start) %>%
        summarize(introns = list(coords2introns(start, end)))
# A tibble: 14 x 4
# Groups:   feat_id [1]
   exon_id                     start     end feat_id           
   <chr>                       <int>   <int> <chr>             
 1 aten_0.1.m1.802.m1.exon1  9916243 9919080 aten_0.1.m1.802.m1
 2 aten_0.1.m1.802.m1.exon2  9920151 9920639 aten_0.1.m1.802.m1
 3 aten_0.1.m1.802.m1.exon3  9921529 9921776 aten_0.1.m1.802.m1
 4 aten_0.1.m1.802.m1.exon4  9922434 9922598 aten_0.1.m1.802.m1
 5 aten_0.1.m1.802.m1.exon5  9924117 9924315 aten_0.1.m1.802.m1
 6 aten_0.1.m1.802.m1.exon6  9924592 9924801 aten_0.1.m1.802.m1
 7 aten_0.1.m1.802.m1.exon7  9927854 9927979 aten_0.1.m1.802.m1
 8 aten_0.1.m1.802.m1.exon8  9928552 9928674 aten_0.1.m1.802.m1
 9 aten_0.1.m1.802.m1.exon9  9930577 9930693 aten_0.1.m1.802.m1
10 aten_0.1.m1.802.m1.exon10 9930781 9931032 aten_0.1.m1.802.m1
11 aten_0.1.m1.802.m1.exon11 9932680 9932805 aten_0.1.m1.802.m1
12 aten_0.1.m1.802.m1.exon12 9935662 9935778 aten_0.1.m1.802.m1
13 aten_0.1.m1.802.m1.exon13 9935983 9936081 aten_0.1.m1.802.m1
14 aten_0.1.m1.802.m1.exon14 9938879 9939091 aten_0.1.m1.802.m1

[Screenshot 2021-07-22 14 57 28]
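To illustrate why the row order matters, here is a small base R sketch using three exons from the tables above (exon1, exon10, exon2). The body of coords2introns() is not shown in this thread, so introns_between is a hypothetical stand-in that assumes each intron runs from one exon's end + 1 to the next exon's start - 1.

```r
# Hypothetical stand-in for coords2introns(); the real gggenomes helper may differ.
# Assumes an intron spans from one exon's end + 1 to the next exon's start - 1.
introns_between <- function(start, end) {
  n <- length(start)
  if (n < 2) {
    return(data.frame(start = integer(0), end = integer(0)))
  }
  data.frame(start = end[-n] + 1L, end = start[-1] - 1L)
}

# Three exons from the table above, in the order they were read
exons <- data.frame(
  exon_id = c("exon1", "exon10", "exon2"),
  start   = c(9916243L, 9930781L, 9920151L),
  end     = c(9919080L, 9931032L, 9920639L)
)

# Introns computed from the rows as read
unsorted <- introns_between(exons$start, exons$end)

# Introns computed after ordering the rows by start coordinate
sorted_exons <- exons[order(exons$start), ]
sorted <- introns_between(sorted_exons$start, sorted_exons$end)

# An intron whose end lies before its start signals out-of-order exons
print(unsorted)
print(sorted)
```

With the rows as read, the second intron would start at 9931033 and end at 9920150, which is not a valid interval. After sorting by start, both introns have an end after their start.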

Thanks!

Keigo

xvtyzn commented on May 26, 2024

Thank you very much!

I really like the gggenomes package, and I'm happy to help test it. Have a nice holiday!
