Git Product home page Git Product logo

Comments (8)

thackl avatar thackl commented on June 20, 2024

Interesting case. Do you have some (dummy) data for a reproducible example that I can play with?

from gggenomes.

trnpl avatar trnpl commented on June 20, 2024

Here is a file with info on the sequences and genes. Don't think links are really necessary here for this particular case.

seqsandgenes.zip

from gggenomes.

thackl avatar thackl commented on June 20, 2024

How about something like this.

library(gggenomes)

s0 <- read_csv("seqs.csv")
g0 <- read_csv("genes.csv")
g0

# full loci
p0 <- gggenomes(g0, s0) +
  geom_seq() +
  geom_gene(aes(fill=str_sub(feature_id, 1,3)))

# magic zoom in on "default" which mean all genes
p1 <- p0 |> focus(.expand = 1e3) 
p1

# decorate sequence breaks
geom_break <- function(mapping_start = NULL, mapping_end = NULL,
    data_start = seqs(start > 1), data_end=seqs(end < length),
    label="//", size=4, hjust=.75, family="bold", stat="identity",
    na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...){

  aes_start <- aes(x=x, y=y)
  aes_start <- gggenomes:::aes_intersect(mapping_start, aes_start)
  
  aes_end <- aes(x=xend, y=y)
  aes_end <- gggenomes:::aes_intersect(mapping_end, aes_end)

  list(
    geom_text(aes_start, data=data_start, label=label, size=size, hjust=hjust,
      family=family, ...),
    geom_text(aes_end, data=data_end, label=label, size=size, hjust=1-hjust,
      family=family, ...)
  )
}

p2 <- p1 + geom_break(label="/")
p2

library(patchwork)
p0 + p1 + p2 + plot_layout(guides="collect") & theme(legend.position = "bottom")
ggsave("break-at-long-non-coding.png", width=10, height=5)

break-at-long-non-coding

from gggenomes.

thackl avatar thackl commented on June 20, 2024

oh, and in case you want to flip things :)

p3 <- p0 |> 
  focus(.expand=1e3) |> 
  flip(where(~any(str_sub(.x$feature_id,1,3) == "hea" & .x$strand == "-")), .bin_track = genes)
p3

image

from gggenomes.

trnpl avatar trnpl commented on June 20, 2024

Thanks! This I think is workable for my purposes :)

from gggenomes.

trnpl avatar trnpl commented on June 20, 2024

Wait -- one more question actually. Is there a way to print the amount of nucleotides excluded in the breaks?

from gggenomes.

thackl avatar thackl commented on June 20, 2024

Two options:

  1. locate() does the same as focus, but instead of cutting seqs in the end, it adds a track with the coordinates of the things that focus would cut out. From that table, you can get exact coordinates an compute what was excluded (see code below)
p4 <- p0 |> 
  locate(.expand = 1e3) +
  geom_feat(position = position_nudge(y=-.3))
p4$data$feats$loci

# A tibble: 30 × 15
       y     x  xend bin_id   seq_id start   end     i locus_score locus_id locus_length
   <int> <dbl> <dbl> <chr>    <chr>  <dbl> <dbl> <int>       <int> <glue>          <dbl>
 1    13     0 21286 Dano_sm… Dano_…     1 21286     1           7 Dano_sm…        22101
 2    13 37943 53399 Dano_sm… Dano_… 37944 53399     2           3 Dano_sm…        15456
 3    14     0 20276 Dpal_sm… Dpal_…     1 20276     1           7 Dpal_sm…        21108
 4    14 35778 51267 Dpal_sm… Dpal_… 35779 51267     2           3 Dpal_sm…        15489
 5    12   323 17521 Dpan_sm… Dpan_…   324 17521     1           3 Dpan_sm…        17198
 6    12 33051 54735 Dpan_sm… Dpan_… 33052 54735     2           6 Dpan_sm…        22462
 7    15     0 24170 dana_sm… dana_…     1 24170     1           8 dana_sm…        24985
 8    15 38128 53604 dana_sm… dana_… 38129 53604     2           3 dana_sm…        15476
 9    11     0 16792 datr_sm… datr_…     1 16792     1           5 datr_sm…        17663
10    11 29803 45264 datr_sm… datr_… 29804 45264     2           3 datr_sm…        15461

image

  1. You can manually set the coordinates to focus() on via focus(.loci=<tibble(seq_id, start, end)>). With that you would know the sizes of gaps ahead of time.
# first 0-23k and 35-55k for first two seqs
loci <- tribble(
  ~seq_id, ~start, ~end,
  "dana_smallsynteny", 1, 23e3,
  "dana_smallsynteny", 35e3,55e3,
  "Dano_smallsynteny", 1, 23e3,
  "Dano_smallsynteny", 35e3,55e3
)

p5 <- p0 |> focus(.loci=loci) + geom_seq_label()
p5

image

from gggenomes.

trnpl avatar trnpl commented on June 20, 2024

Extremely helpful, thank you so much!

from gggenomes.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.