Dear gggenomes users, I have 6 bacterial genomes of ~3MB each, and I


Focus on different region of different genomes (gggenomes, open issue)

LemoAlex commented on September 26, 2024

Focus on different region of different genomes


Comments (4)

thackl commented on September 26, 2024

Hi Alexandre,

you are on the right track. Two ways to do this.

You can add columns start and end to your all_seqs.

all_seqs <- mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...))

Alternatively, you can provide a table with locus coordinates for all your sequences (contigs) and pass it to focus(), like this:

loci <- tribble(
~seq_id, ~start, ~end,
"genome1", 50000, 110000, # probably more realistic would be genomeA_contig1 or so
"genome2", 500000,  560000,
# ...
)

gggenomes(...) |> focus(.loci=loci)
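A minimal sketch of building such a loci table when every region has the same width, so that only the start positions need to be typed. The seq_id values "g1" to "g3" and the 60 kb width are illustrative assumptions, not values taken from a real dataset; the seq_id entries must match the IDs in your own sequence table.

```r
library(tibble)
library(dplyr)

# Assumed region width (60 kb), used for every locus
region_width <- 60000

# Hypothetical sequence IDs and start positions, for illustration only
loci <- tribble(
  ~seq_id, ~start,
  "g1",      50000,
  "g2",     500000,
  "g3",    1000000
) |>
  mutate(end = start + region_width)  # derive end from start

loci
```

The resulting table has the seq_id, start and end columns used in the tribble() example above and could be handed to focus() in the same way.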

Hope that helps


LemoAlex commented on September 26, 2024

Thank you for the answer!

I did try the first method, because the second one, with many genes in the region, might be too tedious.

So:

all_seqs <- mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...))

But I get the error message

Error in UseMethod("mutate") : 
  no applicable method for 'mutate' applied to an object of class "c('double', 'numeric')"

When I look at my all_seqs file, it looks like this:

  file_id  seq_id seq_desc  length
1 6genomes g1     NA       3836532
2 6genomes g2     NA       3937483
3 6genomes g3     NA       3750370
4 6genomes g4     NA       3995103
5 6genomes g5     NA       4006609
6 6genomes g6     NA       3980852

I thought about cutting the fasta files directly and loading only the fractions of the genomes I am interested in, but I am afraid it will then be messy to format the gff files according to the positions.

Cheers,

Alexandre


LemoAlex commented on September 26, 2024

Hello again,

Sorry for the previous post, I fixed the issue by simply changing the line to:

all_seqs %>% mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...))

and it worked as I now have two extra columns in the all_seqs object.
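For readers hitting the same error, here is a small sketch of why the first attempt failed and the piped version works. mutate() expects the table as its first argument; without it, dplyr tries to dispatch on the numeric vector instead. The toy table below reuses three IDs and lengths from the listing above and is only a stand-in for the real all_seqs. Note also that the result has to be assigned to a name if it is to be reused later.

```r
library(dplyr)
library(tibble)

# Toy stand-in for all_seqs (three rows only, for illustration)
all_seqs <- tibble(
  file_id = "6genomes",
  seq_id  = c("g1", "g2", "g3"),
  length  = c(3836532, 3937483, 3750370)
)

# No table supplied: mutate() has nothing to dispatch on but the numeric vector
res <- try(mutate(start = c(50000, 500000, 1000000)), silent = TRUE)
inherits(res, "try-error")

# Table first (here via the pipe), and the result assigned to a new object
all_loci <- all_seqs |>
  mutate(
    start = c(50000, 500000, 1000000),
    end   = start + 60000
  )

all_loci
```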

However, how can I now tell gggenomes to plot only those regions rather than the entire genomes?

Cheers,

Alexandre


thackl commented on September 26, 2024

You need to pass all_seqs, with the added start and end columns, as seqs to gggenomes.

all_seqs <- read_seqs("6genomes.fa", .id = "file_id")
all_genes <- read_gff("genomes.gff")

# full genomes
plot1 <- gggenomes(genes = all_genes, seqs = all_seqs)
plot1

# add start/end so that only these regions are drawn
all_loci <- all_seqs %>% mutate(start=c(50000,500000,1000000, ...), end=c(110000,560000,1060000, ...))

plot2 <- gggenomes(genes = all_genes, seqs = all_loci)
plot2
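A self-contained sketch of the same idea using the emale_seqs and emale_genes example tables that ship with gggenomes, so it can be tried without the 6genomes files. The 5000 to 15000 window is an arbitrary choice for illustration, and the geom_seq() and geom_gene() layers are added here as an assumption about what one would want drawn; they are not part of the reply above.

```r
library(gggenomes)
library(dplyr)

# Example data bundled with gggenomes; add the same start/end to every sequence
zoom_seqs <- emale_seqs |>
  mutate(start = 5000, end = 15000)  # arbitrary illustrative window

# Sequences carrying start/end columns are passed as seqs
p <- gggenomes(genes = emale_genes, seqs = zoom_seqs) +
  geom_seq() +   # draw the sequence backbone
  geom_gene()    # draw the gene models
p
```

With real data, start and end would usually differ per genome, as in the mutate() call above.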
