Comments (4)
@xvtyzn just wanted to say thanks again for reporting this issue. read_gff3() now by default checks and, if warranted, sorts the exon/CDS coordinates of each gene in a file by start coordinate.
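A rough sketch of what such a check-then-sort step could look like with dplyr is shown below. This is not the actual read_gff3() code: the helper name sort_exons_if_needed() and the toy data are assumptions made purely for illustration.

```r
library(dplyr)

# Hypothetical helper, not part of gggenomes: if the start coordinates of at
# least one feat_id group are out of order, reorder all rows by feat_id and
# then by start; otherwise return the input unchanged.
sort_exons_if_needed <- function(x) {
  needs_sort <- x %>%
    group_by(feat_id) %>%
    summarize(unsorted = is.unsorted(start)) %>%
    pull(unsorted) %>%
    any()
  if (needs_sort) arrange(x, feat_id, start) else x
}

# Toy data (made up): gene g1 has its exons out of coordinate order
feats <- tibble(
  feat_id = c("g1", "g1", "g1", "g2", "g2"),
  start   = c(100L, 900L, 400L, 50L, 300L),
  end     = c(200L, 950L, 500L, 120L, 380L)
)

sorted_feats <- sort_exons_if_needed(feats)
sorted_feats$start
```

Checking first keeps files that are already in coordinate order untouched, so the sort only runs when it is needed.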
Also, your data made me aware of an issue with reading GFF files produced by Augustus. The CDS annotations in those files don't follow the proper GFF format and are therefore not read or displayed properly; you probably noticed that as well. I might try to find a workaround for that in the future. See #65 for more info.
Cheers
Thomas
from gggenomes.
Hi Keigo,
thanks a lot for bringing this to our attention. The intron/exon support is still quite experimental and getting feedback like this really helps! And great job on spotting the issue in the code.
Your fix is a nice one! Ideally, though, read_gff() should not get the order of exons wrong in the first place, i.e. it should read them in the order they are given in the gff file. Then re-sorting would not be necessary. I'm currently on holiday. I'll look into this once I'm back!
Hi
I found the cause of this error.
It comes from the mrna_exon_introns step in the read_gff3 function:
mrna_exon_introns <- filter(x, type == "exon") %>%
  select(exon_id = feat_id, start, end, feat_id = parent_ids) %>%
  unchop(feat_id) %>%
  group_by(feat_id) %>%
  summarize(introns = list(coords2introns(start, end)))
# A tibble: 14 x 4
# Groups:   feat_id [1]
   exon_id                     start     end feat_id
   <chr>                       <int>   <int> <chr>
 1 aten_0.1.m1.802.m1.exon1  9916243 9919080 aten_0.1.m1.802.m1
 2 aten_0.1.m1.802.m1.exon10 9930781 9931032 aten_0.1.m1.802.m1
 3 aten_0.1.m1.802.m1.exon11 9932680 9932805 aten_0.1.m1.802.m1
 4 aten_0.1.m1.802.m1.exon12 9935662 9935778 aten_0.1.m1.802.m1
 5 aten_0.1.m1.802.m1.exon13 9935983 9936081 aten_0.1.m1.802.m1
 6 aten_0.1.m1.802.m1.exon14 9938879 9939091 aten_0.1.m1.802.m1
 7 aten_0.1.m1.802.m1.exon2  9920151 9920639 aten_0.1.m1.802.m1
 8 aten_0.1.m1.802.m1.exon3  9921529 9921776 aten_0.1.m1.802.m1
 9 aten_0.1.m1.802.m1.exon4  9922434 9922598 aten_0.1.m1.802.m1
10 aten_0.1.m1.802.m1.exon5  9924117 9924315 aten_0.1.m1.802.m1
11 aten_0.1.m1.802.m1.exon6  9924592 9924801 aten_0.1.m1.802.m1
12 aten_0.1.m1.802.m1.exon7  9927854 9927979 aten_0.1.m1.802.m1
13 aten_0.1.m1.802.m1.exon8  9928552 9928674 aten_0.1.m1.802.m1
14 aten_0.1.m1.802.m1.exon9  9930577 9930693 aten_0.1.m1.802.m1
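The exons in this tibble are ordered alphabetically by exon_id (exon1, exon10, exon11, ...) rather than by position. The problem can be solved by sorting the tibble by its start column with arrange() before the introns are computed: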
mrna_exon_introns <- filter(x, type == "exon") %>%
  select(exon_id = feat_id, start, end, feat_id = parent_ids) %>%
  unchop(feat_id) %>%
  group_by(feat_id) %>%
  arrange(start) %>%
  summarize(introns = list(coords2introns(start, end)))
# A tibble: 14 x 4
# Groups:   feat_id [1]
   exon_id                     start     end feat_id
   <chr>                       <int>   <int> <chr>
 1 aten_0.1.m1.802.m1.exon1  9916243 9919080 aten_0.1.m1.802.m1
 2 aten_0.1.m1.802.m1.exon2  9920151 9920639 aten_0.1.m1.802.m1
 3 aten_0.1.m1.802.m1.exon3  9921529 9921776 aten_0.1.m1.802.m1
 4 aten_0.1.m1.802.m1.exon4  9922434 9922598 aten_0.1.m1.802.m1
 5 aten_0.1.m1.802.m1.exon5  9924117 9924315 aten_0.1.m1.802.m1
 6 aten_0.1.m1.802.m1.exon6  9924592 9924801 aten_0.1.m1.802.m1
 7 aten_0.1.m1.802.m1.exon7  9927854 9927979 aten_0.1.m1.802.m1
 8 aten_0.1.m1.802.m1.exon8  9928552 9928674 aten_0.1.m1.802.m1
 9 aten_0.1.m1.802.m1.exon9  9930577 9930693 aten_0.1.m1.802.m1
10 aten_0.1.m1.802.m1.exon10 9930781 9931032 aten_0.1.m1.802.m1
11 aten_0.1.m1.802.m1.exon11 9932680 9932805 aten_0.1.m1.802.m1
12 aten_0.1.m1.802.m1.exon12 9935662 9935778 aten_0.1.m1.802.m1
13 aten_0.1.m1.802.m1.exon13 9935983 9936081 aten_0.1.m1.802.m1
14 aten_0.1.m1.802.m1.exon14 9938879 9939091 aten_0.1.m1.802.m1
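To make the effect of row order concrete, here is a minimal sketch using four of the exons listed above. introns_from_exons() is a hypothetical stand-in for the internal coords2introns(), whose implementation is not shown in this thread; it simply assumes that each intron spans the gap between one exon's end and the next exon's start, in row order.

```r
library(dplyr)

# Hypothetical stand-in for coords2introns(): assumes introns are the gaps
# between consecutive exons, taken in the order the rows arrive.
introns_from_exons <- function(start, end) {
  n <- length(start)
  if (n < 2) {
    return(tibble(intron_start = integer(0), intron_end = integer(0)))
  }
  tibble(
    intron_start = end[-n] + 1L,
    intron_end   = start[-1] - 1L
  )
}

# Four exons from the tables above, in exon_id order
exons <- tibble(
  exon_id = c("exon1", "exon10", "exon2", "exon3"),
  start   = c(9916243L, 9930781L, 9920151L, 9921529L),
  end     = c(9919080L, 9931032L, 9920639L, 9921776L),
  feat_id = "aten_0.1.m1.802.m1"
)

# ID order: the gap after exon10 runs backwards to exon2
unsorted <- introns_from_exons(exons$start, exons$end)

# Coordinate order: every gap runs forwards
sorted_exons <- arrange(exons, feat_id, start)
sorted <- introns_from_exons(sorted_exons$start, sorted_exons$end)

any(unsorted$intron_start > unsorted$intron_end)  # TRUE
any(sorted$intron_start > sorted$intron_end)      # FALSE
```

Under this assumption, the unsorted input yields an "intron" whose start lies after its end, which would explain the broken display, while the sorted input yields only well-formed gaps.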
Thanks!
Keigo
Thank you very much!
I really like the gggenomes package, and I'm willing to help test it. Have a nice holiday!