pjotrp / bioruby-alignment
TODO: one-line summary of your gem
License: MIT License
Bio-alignment is really helpful.
Is there a quick, built-in way to get an alignment object back after slicing?
Say I have a sequence alignment object, seqalign, and I call seqalign[start,stop].
It returns an array of columns.
What I really need is an alignment object with the sequence ids preserved.
A workaround is to create a new alignment object from the columns, but I
think this is tedious.
What do you think?
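For illustration, the kind of slicing asked for here could look roughly like the sketch below. It is plain Ruby and does not use the bio-alignment API: Row and slice_columns are hypothetical names, and the second argument is treated as a length, as in Ruby's String#[start, length], which is an assumption about what seqalign[start,stop] is meant to do.

```ruby
# Hypothetical stand-in for an alignment row: an id plus an aligned sequence.
Row = Struct.new(:id, :seq)

# Hypothetical helper: cut the same column range out of every row and
# return new rows, so each slice keeps the id of its source sequence.
def slice_columns(rows, start, length)
  rows.map { |row| Row.new(row.id, row.seq[start, length]) }
end

rows = [
  Row.new('seq1', 'atgcatgcaaaa'),
  Row.new('seq2', 'atg---tcaaaa')
]

sliced = slice_columns(rows, 3, 4)
sliced.each { |row| puts "#{row.id}\t#{row.seq}" }
```

Slicing row by row, rather than column by column, is what keeps the ids attached without a separate rebuild step.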
Hi pjotrp,
I was trying to get some of your examples to work and ran into an issue. I used this file, 'short_trial.aln.fasta':
>43617/31050-31061
accaatatcgcc
>46P47B1/31054-31065
gccaatatcgcc
>7169/31137-31148
gccaatatcgcc
>B496/31053-31064
gccaatatcgcc
>B502/31118-31129
accaatatcgcc
>B503/31055-31066
gccaatatcgcc
>B507/31052-31063
accaatatcgcc
>B508/31057-31068
accaatatcgcc
>B583/31081-31092
accaatatcgcc
>B615/31109-31120
accaatatcgcc
>B616/31296-31307
gccaatattatt
>B617/31057-31068
accaatatcgcc
>B618/31052-31063
accaatatcgcc
>B619/31062-31073
gccaatatcgcc
Then I ran these commands in irb:
require 'bio-alignment'
require 'bigbio' # for the Fasta reader
include Bio::BioAlignment # Namespace
aln = Alignment.new
fasta = FastaReader.new('short_trial.aln.fasta')
fasta.each do |rec|
  aln.sequences << rec
end
aln[3][0]
And at this point I get the error:
irb(main):012:0> aln[3][0]
NoMethodError: undefined method `[]' for #<FastaRecord:0x0000000251f3e0>
from (irb):12
from /usr/bin/irb:12:in
Am I messing this up somehow?
~josh
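The error message says that the FastaRecord object itself does not respond to []. The plain Ruby sketch below reproduces that situation with stand-in classes and shows two possible ways around it. PlainRecord and IndexableRecord are hypothetical names, not classes from bigbio or bio-alignment, and the sketch assumes the record exposes its sequence as a string through a seq reader.

```ruby
# Hypothetical stand-in for a FASTA record: it holds an id and a sequence
# string but does not define [] itself.
class PlainRecord
  attr_reader :id, :seq

  def initialize(id, seq)
    @id = id
    @seq = seq
  end
end

# Hypothetical wrapper that forwards indexing to the sequence string.
class IndexableRecord < PlainRecord
  def [](index)
    seq[index]
  end
end

# First four records of short_trial.aln.fasta.
rows = [
  ['43617/31050-31061', 'accaatatcgcc'],
  ['46P47B1/31054-31065', 'gccaatatcgcc'],
  ['7169/31137-31148', 'gccaatatcgcc'],
  ['B496/31053-31064', 'gccaatatcgcc']
]

plain = rows.map { |id, seq| PlainRecord.new(id, seq) }
indexable = rows.map { |id, seq| IndexableRecord.new(id, seq) }

first_via_seq = plain[3].seq[0]   # index the sequence string explicitly
first_via_wrapper = indexable[3][0] # or index a record that defines []
```

Calling plain[3][0] in this sketch raises the same kind of NoMethodError as in the report, because PlainRecord has no [] method.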
With bio-alignment (0.0.7), the following code:
require 'bio' # BioRuby
require 'bio-alignment'
require 'bio-alignment/bioruby' # make Bio::Sequence enumerable
include Bio::BioAlignment
aln = Alignment.new
aln << Bio::Sequence::NA.new("atgcatgcaaaa")
aln << Bio::Sequence::NA.new("atg---tcaaaa")
returns an error:
undefined method `<<' for :Bio::BioAlignment::Alignment (NoMethodError)
This seems to be the error returned by all the examples in the README.
I got this error when running under Ruby 2.0.0:
`block in codon_al': undefined method `<<' for #<Bio::BioAlignment::Alignment:0x007f8c642d9248 @sequences=[]> (NoMethodError)
Description from UCSC:
The multiple alignment format stores a series of multiple alignments in a format that is easy to parse and relatively easy to read. This format stores multiple alignments at the DNA level between entire genomes. Previously used formats are suitable for multiple alignments of single proteins or regions of DNA without rearrangements, but would require considerable extension to cope with genomic issues such as forward and reverse strand directions, multiple pieces to the alignment, and so forth.
Spec:
http://genome.ucsc.edu/FAQ/FAQformat#format5
Available software:
http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/
http://compgen.bscb.cornell.edu/phast/
Sample data:
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/multiz46way/maf/
BioPython:
http://biopython.org/wiki/Multiple_Alignment_Format
They have an interesting indexing system.
MAF is used by:
https://github.com/mlin/PhyloCSF/wiki
UCSC
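To give an idea of how little parsing the format needs, here is a minimal sketch of a MAF reader in plain Ruby. It follows the "a" (alignment block) and "s" (sequence) line layout described in the UCSC spec linked above and ignores every other line type. MafSequence and parse_maf_blocks are hypothetical names, not part of bio-alignment, and the sample records are invented for the example.

```ruby
# Hypothetical record for one "s" line of a MAF alignment block.
MafSequence = Struct.new(:src, :start, :size, :strand, :src_size, :text)

# Split MAF text into alignment blocks. Each block starts with an "a" line
# and ends at a blank line; only "s" (sequence) lines are kept.
def parse_maf_blocks(text)
  blocks = []
  current = nil
  text.each_line do |line|
    fields = line.split
    case fields.first
    when 'a'
      current = []
      blocks << current
    when 's'
      next if current.nil? # ignore "s" lines outside a block
      _, src, start, size, strand, src_size, seq = fields
      current << MafSequence.new(src, start.to_i, size.to_i, strand,
                                 src_size.to_i, seq)
    when nil
      current = nil # a blank line closes the block
    end
  end
  blocks
end

# Invented sample data, for illustration only.
sample = <<~MAF
  ##maf version=1
  a score=10.0
  s speciesA.chr1 100 12 + 5000 accaatatcgcc
  s speciesB.chr2 200 9 - 6000 acc---atcgcc

  a score=5.0
  s speciesA.chr1 300 4 + 5000 atgc
MAF

blocks = parse_maf_blocks(sample)
blocks.each_with_index do |block, i|
  puts "block #{i}: #{block.map(&:src).join(', ')}"
end
```

Strand and source size are kept on each record because, as the description notes, those are the genome-level details that single-protein formats do not carry.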
Should bio-alignment support a method to generate consensus sequences? There are various schools of thought on what a consensus residue is. A simple, naive approach is to return the mode, i.e. the most frequent character at a given position. But if you have an alignment of 2 sequences with a variant position, you have to choose one of the two residues as the consensus; the same applies to any alignment with a 50/50 split of bases at a particular position. I bet there are several approaches to this problem. Is this something that bio-alignment can support?
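As a starting point for discussion, the naive mode-based approach could be sketched as below. This is plain Ruby, not an existing bio-alignment method: naive_consensus is a hypothetical name, and the tie-breaking rule (pick the residue that sorts first) is an arbitrary assumption chosen only to make the result deterministic.

```ruby
# Hypothetical helper: majority-vote consensus over equal-length sequences.
# Ties are resolved in favour of the residue that sorts first, which is an
# arbitrary but deterministic choice.
def naive_consensus(sequences)
  # Turn rows into columns; transpose raises IndexError if lengths differ.
  columns = sequences.map(&:chars).transpose
  columns.map do |column|
    counts = column.each_with_object(Hash.new(0)) { |residue, h| h[residue] += 1 }
    best = counts.values.max
    counts.select { |_, n| n == best }.keys.min
  end.join
end

seqs = %w[accaatatcgcc gccaatatcgcc gccaatattatt]
consensus = naive_consensus(seqs)
tie = naive_consensus(%w[a g]) # a 50/50 column, resolved by sort order
puts consensus
puts tie
```

Other tie-breaking policies, such as emitting an IUPAC ambiguity code or a placeholder character, would slot in at the line that currently calls keys.min.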
The GBlocks routine is often used, but it is not open source. This
is a feature request for a reimplementation of GBlocks. Some links:
Open sourcing request by Debian: http://lists.debian.org/debian-med/2011/02/msg00008.html
Binary download of GBlocks: http://molevol.cmima.csic.es/castresana/Gblocks.html
Documentation: http://molevol.cmima.csic.es/castresana/Gblocks/Gblocks_documentation.html
It is quite a simple routine, and a reimplementation would be easy to validate against existing outcomes.
See
https://github.com/pjotrp/bioruby-alignment/blob/master/features/edit/gblocks.feature