sina's Introduction

SINA - reference based multiple sequence alignment

SINA aligns nucleotide sequences to match a pre-existing MSA using a graph-based alignment algorithm similar to PoA. The graph approach allows SINA to incorporate information from many reference sequences without blurring highly variable regions. While pure NAST implementations depend heavily on finding a good match in the reference database, SINA can align sequences that are relatively distant from the references with good quality, and it yields a robust result for query sequences with many close references.

Features

  • Speed. Aligning 100,000 full length rRNA against the SILVA NR takes 40 minutes on a mid-sized 2018 desktop computer. Aligning 1,000,000 V4 amplicons takes about 60 minutes.
  • Accuracy. SINA is used to build the SILVA SSU and LSU rRNA databases.
  • Classification. SINA includes an LCA based classification module.
  • ARB. SINA is able to directly read and write ARB format files such as distributed by the SILVA project.

Online Version

An online version for submitting small batches of sequences is made available by the SILVA project as part of their ACT: Alignment, Classification and Tree Service. In addition to SINA's alignment and classification stages, ACT allows directly building phylogenetic trees with RAxML or FastTree from your sequences and (optionally) additional sequences chosen using SINA's add-neighbors feature.

Installing SINA

The preferred way to install SINA locally is via Bioconda. If you have a working Bioconda installation, just run:

conda create -n sina sina
conda activate sina

Alternatively, self-contained images are available at https://github.com/epruesse/SINA/releases. Choose the most recent tar.gz appropriate for your operating system and unpack:

tar xf sina-1.7.3-dev-dev-linux.tar.gz
cd sina-1.7.3-dev-dev
./sina

Documentation

The full documentation is available at https://sina.readthedocs.io.

The algorithm is explained in the paper:

Elmar Pruesse, Jörg Peplies, Frank Oliver Glöckner; SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics 2012; 28 (14): 1823-1829. doi:10.1093/bioinformatics/bts252

sina's Issues

Conda environment broken

Hi,

When I install via conda and run 'sina' I receive the following error:

sina: symbol lookup error: ~/anaconda3/envs/sina/bin/../lib/libsina.so.0: undefined symbol: _ZN5boost15program_options3argE

Thanks

double free (concurrency bug)

From https://circleci.com/gh/epruesse/SINA/461

*** glibc detected *** /root/build/src/.libs/lt-sina: double free or corruption (out): 0x00007fb558819390 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x75e5e)[0x7fb569847e5e]
/lib64/libc.so.6(+0x78cf0)[0x7fb56984acf0]
/root/build/src/.libs/libsina.so.0(_ZNSt6vectorIN4sina15aligned_compactINS0_10base_iupacEEESaIS3_EEaSERKS5_+0x151)[0x7fb56a301a11]
/root/build/src/.libs/libsina.so.0(_ZN4sina9query_arb7getCseqERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1bd)[0x7fb56a36b0cd]
/root/build/src/.libs/libsina.so.0(_ZN4sina11kmer_search4impl4findERKNS_14annotated_cseqERSt6vectorINS_6search11result_itemESaIS7_EEj+0x89b)[0x7fb56a3de39b]
/root/build/src/.libs/libsina.so.0(_ZN4sina11kmer_search4findERKNS_14annotated_cseqERSt6vectorINS_6search11result_itemESaIS6_EEj+0x17)[0x7fb56a3dea47]
/root/build/src/.libs/libsina.so.0(_ZN4sina9famfinder4impl5matchERSt6vectorINS_6search11result_itemESaIS4_EERKNS_14annotated_cseqE+0x429)[0x7fb56a352469]
/root/build/src/.libs/libsina.so.0(_ZN4sina9famfinder4implclENS_4trayE+0xac0)[0x7fb56a358e20]
/root/build/src/.libs/libsina.so.0(_ZN4sina9famfinderclERKNS_4trayE+0x54)[0x7fb56a3595a4]
/root/build/src/.libs/lt-sina(+0x48a8b)[0x559994757a8b]
/root/build/src/.libs/lt-sina(+0x88135)[0x559994797135]
/root/build/src/.libs/lt-sina(+0x8829a)[0x55999479729a]
/root/miniconda/lib/libtbb.so.2(+0x294a9)[0x7fb56a7304a9]
/root/miniconda/lib/libtbb.so.2(+0x22af8)[0x7fb56a729af8]
/root/miniconda/lib/libtbb.so.2(+0x21384)[0x7fb56a728384]
/root/miniconda/lib/libtbb.so.2(+0x1d1e4)[0x7fb56a7241e4]
/root/miniconda/lib/libtbb.so.2(+0x1d45a)[0x7fb56a72445a]
/lib64/libpthread.so.0(+0x7aa1)[0x7fb5695bcaa1]
/lib64/libc.so.6(clone+0x6d)[0x7fb5698bac4d]
[...]
FAIL: tests/accuracy.test 9 - realign msc 0.7

conda install sina installs v1.3.5

The documentation here and in the bioconda link suggests that
conda install sina will install version 1.4.0, but it instead installs 1.3.5:

    package                    |            build
    ---------------------------|-----------------
    python-3.7.1               |       h0371630_3        36.4 MB
    intel-openmp-2019.1        |              144         885 KB
    arb-bio-tools-6.0.6        |       h5901010_5         651 KB  bioconda
    ncurses-6.1                |       he6710b0_1         958 KB
    zlib-1.2.11                |       h7b6447c_3         120 KB
    setuptools-40.6.2          |           py37_0         603 KB
    mkl_random-1.0.1           |   py37h4414c95_1         372 KB
    numpy-base-1.15.4          |   py37h81de0dd_0         4.2 MB
    numpy-1.15.4               |   py37h1d66e8a_0          35 KB
    openssl-1.1.1a             |       h7b6447c_0         5.0 MB
    py-boost-1.67.0            |   py37h04863e7_4         302 KB
    certifi-2018.10.15         |           py37_0         138 KB
    libarbdb-6.0.6             |       h5901010_5         327 KB  bioconda
    pip-18.1                   |           py37_0         1.7 MB
    boost-1.67.0               |           py37_4          11 KB
    gettext-0.19.8.1           |       hd7bead4_3         3.7 MB
    sina-1.3.5                 |       h4ef8376_2         2.2 MB  bioconda
    wheel-0.32.3               |           py37_0          35 KB
    mkl_fft-1.0.6              |   py37h7dd41cf_0         150 KB
    ------------------------------------------------------------
                                           Total:        57.7 MB

The following NEW packages will be INSTALLED:

    arb-bio-tools:   6.0.6-h5901010_5        bioconda
    blas:            1.0-mkl                         
    boost:           1.67.0-py37_4                   
    bzip2:           1.0.6-h14c3975_5                
    ca-certificates: 2018.03.07-0                    
    certifi:         2018.10.15-py37_0               
    gettext:         0.19.8.1-hd7bead4_3             
    glib:            2.56.2-hd408876_0               
    icu:             58.2-h9c2bf20_1                 
    intel-openmp:    2019.1-144                      
    libarbdb:        6.0.6-h5901010_5        bioconda
    libboost:        1.67.0-h46d08c1_4               
    libedit:         3.1.20170329-h6b74fdf_2         
    libffi:          3.2.1-hd88cf55_4                
    libgcc-ng:       8.2.0-hdf63c60_1                
    libgfortran-ng:  7.3.0-hdf63c60_0                
    libstdcxx-ng:    8.2.0-hdf63c60_1                
    mkl:             2018.0.3-1                      
    mkl_fft:         1.0.6-py37h7dd41cf_0            
    mkl_random:      1.0.1-py37h4414c95_1            
    ncurses:         6.1-he6710b0_1                  
    numpy:           1.15.4-py37h1d66e8a_0           
    numpy-base:      1.15.4-py37h81de0dd_0           
    openssl:         1.1.1a-h7b6447c_0               
    pcre:            8.42-h439df22_0                 
    pip:             18.1-py37_0                     
    py-boost:        1.67.0-py37h04863e7_4           
    python:          3.7.1-h0371630_3                
    readline:        7.0-h7b6447c_5                  
    setuptools:      40.6.2-py37_0                   
    sina:            1.3.5-h4ef8376_2        bioconda
    sqlite:          3.25.3-h7b6447c_0               
    tk:              8.6.8-hbc83047_0                
    wheel:           0.32.3-py37_0                   
    xz:              5.2.4-h14c3975_4                
    zlib:            1.2.11-h7b6447c_3               

Proceed ([y]/n)? y


Downloading and Extracting Packages
python-3.7.1         | 36.4 MB   | ######################################################################## | 100% 
intel-openmp-2019.1  | 885 KB    | ######################################################################## | 100% 
arb-bio-tools-6.0.6  | 651 KB    | ######################################################################## | 100% 
ncurses-6.1          | 958 KB    | ######################################################################## | 100% 
zlib-1.2.11          | 120 KB    | ######################################################################## | 100% 
setuptools-40.6.2    | 603 KB    | ######################################################################## | 100% 
mkl_random-1.0.1     | 372 KB    | ######################################################################## | 100% 
numpy-base-1.15.4    | 4.2 MB    | ######################################################################## | 100% 
numpy-1.15.4         | 35 KB     | ######################################################################## | 100% 
openssl-1.1.1a       | 5.0 MB    | ######################################################################## | 100% 
py-boost-1.67.0      | 302 KB    | ######################################################################## | 100% 
certifi-2018.10.15   | 138 KB    | ######################################################################## | 100% 
libarbdb-6.0.6       | 327 KB    | ######################################################################## | 100% 
pip-18.1             | 1.7 MB    | ######################################################################## | 100% 
boost-1.67.0         | 11 KB     | ######################################################################## | 100% 
gettext-0.19.8.1     | 3.7 MB    | ######################################################################## | 100% 
sina-1.3.5           | 2.2 MB    | ######################################################################## | 100% 
wheel-0.32.3         | 35 KB     | ######################################################################## | 100% 
mkl_fft-1.0.6        | 150 KB    | ######################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Support FastQ

  • parse (simple)
  • weight alignment using phred score (not simple)
  • deal with paired end properly
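
As an illustration of the "parse (simple)" item, here is a minimal Python sketch of a FASTQ reader. It is not SINA code. It assumes strict four-line records and Phred+33 quality encoding; the function name `parse_fastq` and the `offset` parameter are hypothetical.

```python
def parse_fastq(lines, offset=33):
    """Yield (name, sequence, phred_scores) from an iterable of FASTQ lines.

    Assumes four-line records and Phred+33 encoding (both are assumptions
    of this sketch, not statements about SINA).
    """
    it = (line.rstrip("\r\n") for line in lines)
    for header in it:
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if (not header.startswith("@") or plus is None or qual is None
                or not plus.startswith("+") or len(seq) != len(qual)):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, [ord(c) - offset for c in qual]


records = list(parse_fastq(["@read1\n", "ACGT\n", "+\n", "II5!\n"]))
print(records)
```

Weighting the alignment by these scores and handling paired ends are the harder parts and are not covered here.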

command find_family fails

Dear Elmar,

I am trying the parallel version of sina (1.4.0) with a set of ~22,000 full 16S sequences but after 1d of running I start to get a lot of warnings like:

AISC_SET_ERROR: CONNECTION PROBLEMS
Unable to execute find_family command on pt-server
Retring...

and eventually, an abort message:
No retries left; aborting.

How can I solve the problem?
The command I am running is:
sina -i derep.fasta -o derep_aln.fasta --meta-fmt csv --db silva/ARB/SILVA_132_SSURef_12_12_17_opt.arb --search --search-min-sim 0.865 --search-db silva/ARB/SILVA_132_SSURef_12_12_17_opt.arb --lca-fields tax_slv --num-pts 8

I installed sina using conda in a Ubuntu 16.04.1 environment. The computer has 28 cores and 512Gb of RAM.
Thanks for your help!

ARBHOME not set (w/o wrapper script)

When not using the wrapper script, e.g. installation via Bioconda, the ARBHOME environment variable may not be set causing arb_pt_server to fail.

Encode the path in the binary and try to find ARB at startup.
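
The lookup described above could look roughly like the following Python sketch. It is only an illustration of the idea, not the SINA implementation. The `<prefix>/lib/arb` layout is an assumption taken from the conda paths that appear in logs elsewhere on this page, and `find_arbhome` is a hypothetical name.

```python
from pathlib import Path


def find_arbhome(environ, executable):
    """Return the ARB directory to use, or None if it cannot be located.

    An explicit ARBHOME wins; otherwise guess <prefix>/lib/arb relative
    to the executable at <prefix>/bin (assumed layout).
    """
    configured = environ.get("ARBHOME")
    if configured:
        return Path(configured)
    prefix = Path(executable).resolve().parent.parent
    candidate = prefix / "lib" / "arb"
    return candidate if candidate.is_dir() else None


# Example: an explicit setting is returned unchanged.
print(find_arbhome({"ARBHOME": "/opt/arb"}, "/usr/bin/sina"))
```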

GPL software linking against non-free library

Hi,
I wanted to package SINA for Debian since it is needed by some QIIME2 module. However, I realised that it links against non-free software (libARB). Do you see any chance to either replace this with a free alternative or talk to the ARB authors about freeing at least this part of their code? (I have talked to the ARB authors several times and know that it might be hard to free the whole software, but maybe it's possible in parts?)
Kind regards, Andreas.

Improve classifier

The LCA classifier has obvious shortcomings as it does not take into account the spread in a search result.
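
To make the shortcoming concrete, here is a generic lowest-common-ancestor sketch with a quorum, written in Python. It is not SINA's classifier. The semicolon-separated lineage format and the names `lca` and `quorum` are assumptions for illustration. Every search hit counts equally, so the similarity spread among the hits has no influence on the result.

```python
from collections import Counter


def lca(paths, quorum=1.0):
    """Return the deepest lineage prefix shared by at least `quorum` of the hits."""
    lineages = [p.strip(";").split(";") for p in paths if p]
    if not lineages:
        return ""
    result = []
    while True:
        depth = len(result)
        # Count the next rank among hits that still agree with the prefix so far.
        names = Counter(
            parts[depth]
            for parts in lineages
            if len(parts) > depth and parts[:depth] == result
        )
        if not names:
            break
        name, count = names.most_common(1)[0]
        if count / len(lineages) < quorum:
            break
        result.append(name)
    return ";".join(result)


hits = [
    "Bacteria;Proteobacteria;Gammaproteobacteria",
    "Bacteria;Proteobacteria;Alphaproteobacteria",
    "Bacteria;Firmicutes;Bacilli",
]
print(lca(hits))       # strict LCA
print(lca(hits, 0.6))  # a 60% quorum tolerates the single outlier
```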

Tab characters in sequence should be ignored

Tab characters are currently considered invalid and cause discard of the sequence containing them. They should be considered whitespace and ignored when processing the sequences instead.
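
The requested behavior can be sketched in a few lines of Python. This is an illustration only, not the SINA reader; `clean_sequence` is a hypothetical name.

```python
def clean_sequence(raw):
    """Remove all whitespace (spaces, tabs, line breaks) from raw sequence text."""
    # str.split() with no arguments splits on any run of whitespace.
    return "".join(raw.split())


print(clean_sequence("ACGU\tACGU \nGG"))
```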

Search stage not parallelized

If the search DB and the reference DB are the same, and the engine is PT, the search should use the same PT servers. Otherwise it needs its own.

Installing SINA from source on ubuntu 18.04

Hi!

I have successfully installed SINA 1.4.0 through conda, but I would like to install SINA from the GitHub source tree to get the newest development version. However, I get the error below during ./configure. I have followed the instructions on the wiki and installed ARB6 with apt install arb arb-common arb-doc libarb libarb-dev, as well as all libboost packages, so I don't understand why these are not found. I can see that, for example, libARBDB is already in /usr/lib/arb/lib/libARBDB.so, so why can't it be found?

configure: error: Required libraries found missing:  
  ARB libraries (libARBDB)
  ARB PROBE library (PROBE_COM/client.a)
  ARB HELIX library (SL/HELIX/HELIX.a)

Thanks in advance
Kasper

activation of search leads to boost exception

When the --search flag is included an exception from boost is thrown.

For example the command:
sina -i sequence.fasta -o sequence..aligned.fasta --db SILVA_132.arb --search-db SILVA_132.arb --search

Will throw:
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_any_cast> >'
what(): boost::bad_any_cast: failed conversion using boost::any_cast
Aborted (core dumped)

The command was run on Ubuntu 18.04 and 16.04 using conda installation or the precompiled distributions with the same results.

Option to filter gap-only columns from output alignment

Hey Elmar,

It would be really useful if SINA could simplify its output alignment a bit, by removing columns that contain only gap characters. Currently I'm using the QIIME1 script "filter_alignment.py" to do this.

Thanks!

-JLD
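
Until such an option exists, the filtering step itself is small. The following Python sketch is not part of SINA. It assumes all sequences have equal aligned length and that `-` and `.` are the gap characters; `drop_gap_only_columns` is a hypothetical name.

```python
def drop_gap_only_columns(seqs, gap_chars="-."):
    """Remove alignment columns in which every sequence has a gap character."""
    if not seqs:
        return []
    width = len(seqs[0])
    if any(len(s) != width for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    # Keep a column if at least one sequence has a non-gap character there.
    keep = [i for i in range(width) if any(s[i] not in gap_chars for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


print(drop_gap_only_columns(["A--C.", "A-GC."]))
```

Note that removing columns changes the alignment coordinates, so the result no longer lines up with the reference alignment positions.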

Align reduces family if --realign

Realign requires that no reference sequences containing the query be used. This is checked in align right now, which means that the reference set gets smaller. It should be done in famfinder - probably actually as "leave-query-out".
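
A "leave-query-out" filter could be sketched as below. This Python example only illustrates the idea and is not the famfinder code. The containment test on ungapped, upper-cased sequences and the name `leave_query_out` are assumptions.

```python
def ungapped(seq, gap_chars="-."):
    """Return the sequence in upper case with gap characters removed."""
    return "".join(c for c in seq.upper() if c not in gap_chars)


def leave_query_out(query, references):
    """Drop (name, sequence) references whose ungapped sequence contains the query."""
    q = ungapped(query)
    return [(name, seq) for name, seq in references if q not in ungapped(seq)]


refs = [("ref_a", "GGAC-GU..A"), ("ref_b", "GGGGCCCC")]
print(leave_query_out("acgu", refs))
```

Filtering at this stage, before the family is truncated to its final size, would keep the reference set from shrinking later in the aligner.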

align_ident_slv field blank if not 100% identity

Hi!

I have used SINA 1.4.0 to align and assign taxonomy to some full length 16S sequences. When inspecting the output, the field align_ident_slv in the output CSV file only shows sequences with an identity of 100, everything else is just blank. I ran the same data on SINA 1.3.5 last week and all fields in the column had a value. This is the command I ran:

sina -i fssu.fa --intype fasta -o output.fa \
  --meta-fmt csv nearest nuc achieved_idty name \
  -r refdatabases/SILVA_132_SSURef_NR99_13_12_17_opt.arb --search \
  --search-db refdatabases/SILVA_132_SSURef_NR99_13_12_17_opt.arb \
  --lca-fields tax_slv --search-min-sim 0.5 --search-max-result 1 \
  --num-pts $((THREADS / 2)) --threads $THREADS

I run Ubuntu 18.04 LTS on a Dell 7920 with 40 threads + 128GB RAM.

Thanks in advance
Kasper

Improve CSV output

Hi Elmar,

Would be nice to be able to choose the information in taxonomy search output!

Thanx,
Veljo

Write README.md

Begin converting docs by writing an informative intro page.

Improve logging

Have a look at boost log. The current implementation is loud and inflexible and causes issues on clusters as it fails if stderr/stdout are not open.

SINA documentation

Hello Dr Pruesse,

Is it just me, or do the options not show up in the online documentation? And thank you so much for the software! (Previously I simply used the online aligner; only recently did I try the SINA CLI.)

Looping files through SINA?

I encountered a problem running SINA via command line and was wondering if anyone might be able to suggest a solution?

I am able to run a single fasta file fine with no problems, for example:

sina -i file1.fasta -o file1.output.fasta \
  --meta-fmt csv \
  --ptdb SSURef_NR99_132_SILVA_13_12_17_opt.arb \
  --search --search-db SSURef_NR99_132_SILVA_13_12_17_opt.arb --lca-fields tax_slv

However, I have many files that I’d like to run, so I created a loop as follows:

for i in *.fasta
do
  sina -i $i -o $i.output.fasta \
    --meta-fmt csv \
    --ptdb SSURef_NR99_132_SILVA_13_12_17_opt.arb \
    --search --search-db SSURef_NR99_132_SILVA_13_12_17_opt.arb --lca-fields tax_slv
done

When I do this it seems to progress as expected up until alignment of the 16th sequence, at which point it aborts with the following error message:

Time for alignment phase: 41.081814s
Terminating PT server…

ARB_PT_SERVER: received shutdown message

I tried using --search-all within the loop and that worked fine, but was too slow. I’d like to run the loop with the PT server, so any suggestions would be much appreciated!
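
For reference, the same batch run can be driven from Python, which makes it easy to stop at the first failing file and see its exit status. This sketch uses only the options quoted in this post; it does not address the PT server shutdown itself, and the function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_sina_command(fasta, db):
    """Build the argument list for one input file, using the flags from the post."""
    fasta = Path(fasta)
    out = fasta.with_name(fasta.name + ".output.fasta")  # mirrors $i.output.fasta
    return [
        "sina", "-i", str(fasta), "-o", str(out),
        "--meta-fmt", "csv",
        "--ptdb", db,
        "--search", "--search-db", db,
        "--lca-fields", "tax_slv",
    ]


def run_all(directory, db):
    """Run sina on every *.fasta file in a directory, stopping at the first failure."""
    for fasta in sorted(Path(directory).glob("*.fasta")):
        if fasta.name.endswith(".output.fasta"):
            continue  # skip outputs of an earlier run, which also match *.fasta
        result = subprocess.run(build_sina_command(fasta, db))
        if result.returncode != 0:
            print(f"{fasta}: sina returned {result.returncode}", file=sys.stderr)
            break


# Usage (assumes sina is on PATH):
# run_all(".", "SSURef_NR99_132_SILVA_13_12_17_opt.arb")
```

The skip for `.output.fasta` matters because the output names chosen in the post also end in `.fasta`, so a second run of the loop would otherwise feed earlier results back in as inputs.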

Problem regarding sina

I am working with sina, and this is the command I used. It has been running for a very long time and still has not completed. Is it running OK?

(sina) pkd@pkd-HP-406-G1-MT:~$ sina -i ~/Sina/amplicons_seeds.fasta --intype fasta -o sina_out --outtype fasta --search --meta-fmt csv --overhang remove --insertion forbid --filter none --fs-kmer-no-fast --fs-kmer-len 10 --fs-req 2 --fs-req-full 1 --fs-min 40 --fs-max 40 --fs-weight 1 --fs-full-len 350 --fs-msc 0.7 --match-score 1 --mismatch-score -1 --pen-gap 5 --pen-gapext 2 --search-cover query --search-iupac optimistic --search-min-sim 0.9 --turn all --lca-quorum 0.7 --search-db ~/Sina/sina_ssu.arb --db ~/Sina/sina_ssu.arb --lca-fields tax_slv
[2019-02-09 13:54:08.891] [log] [info] Loglevel set to warning
13:54:08 [SINA] This is SINA 1.4.0.

0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|


ARB: Loading '/home/pkd/Sina/sina_ssu.arb.index.arb'
ARB: no FastLoad File '/home/pkd/Sina/sina_ssu.arb.index.ARM' found => loading entire DB
ARB: Loading '/home/pkd/Sina/sina_ssu.arb.index.arb' done

Progress: Remove unused database entries
...................................................................... [100.0%] used: 5s
[done]
Database contains 38792 species
Progress: Preparing sequence data
...................................................................... [100.0%] used: 4s
[done]

All species contain data in alignment 'ali_16s'.
[ptserver '-build_clean' took 10s]
sum of above: 0 b
overall alloc: 0 b
sum of above: 0 b
overall alloc: 0 b
ARB: Loading '/home/pkd/Sina/sina_ssu.arb.index.arb'
ARB: Opening FastLoad File '/home/pkd/Sina/sina_ssu.arb.index.ARM' ...
ARB: Loading '/home/pkd/Sina/sina_ssu.arb.index.arb' done

Building PT-Server for alignment 'ali_16s'...
Database contains 38792 species
Progress: Checking data
...................................................................... [100.0%] used: 0s
[done]
Failed to write to '/home/pkd/anaconda3/envs/sina/lib/arb/lib/pts/ptserver.log'
Visible memory: 11.7 Gb
Restricting used memory (by internal default '90%') to 10.5 Gb
Note: Setting envar ARB_MEMORY will override that restriction (percentage or absolute memsize)
Memory available for build: 10.4 Gb
Estimated memory usage for 1 pass: 752 Mb
Overall bases: 54.1 Mbp
Max. partition size: 54.1 Mbp (=100.0%)
Progress: Build index in 1 passes
{pass 1/1}............................................................ [ 25.0%] left: 33s
...................................................................... [ 50.0%] left: 25s
...................................................................... [ 75.0%] left: 13s
...................................................................... [100.0%] used: 1m0s
[done]
PT_SERVER database "/home/pkd/Sina/sina_ssu.arb.index.arb" has been created.
Failed to write to '/home/pkd/anaconda3/envs/sina/lib/arb/lib/pts/ptserver.log'
[ptserver '-build' took 1m2s]
arb_message: PT_SERVER database "/home/pkd/Sina/sina_ssu.arb.index.arb" has been created.
blocksize: 9 allocated: 304 kb [~ 0%] <33.8 kBlocks [PT1_EMPTY_NODE_SIZE]
blocksize: 10 allocated: 515 kb [~ 0%] <51.5 kBlocks
blocksize: 11 allocated: 759 kb [~ 0%] <69.0 kBlocks [PT1_MIN_CHAIN_ENTRY_SIZE]
blocksize: 12 allocated: 735 kb [~ 0%] <61.3 kBlocks
blocksize: 13 allocated: 650 kb [~ 0%] <50.0 kBlocks
blocksize: 14 allocated: 630 kb [~ 0%] <45.0 kBlocks
blocksize: 15 allocated: 56.2 Mb [11%] <3.75 MBlocks [PT1_EMPTY_LEAF_SIZE]
blocksize: 16 allocated: 532 kb [~ 0%] <33.3 kBlocks
blocksize: 17 allocated: 208 Mb [41%] <12.3 MBlocks [PT1_NODE_WITHSONS_SIZE(1)]
blocksize: 18 allocated: 477 kb [~ 0%] <26.6 kBlocks
blocksize: 19 allocated: 451 kb [~ 0%] <23.8 kBlocks [PT1_CHAIN_SHORT_HEAD_SIZE]
blocksize: 20 allocated: 415 kb [~ 0%] <20.8 kBlocks
blocksize: 21 allocated: 389 kb [~ 0%] <18.6 kBlocks [PT1_CHAIN_LONG_HEAD_SIZE]
blocksize: 22 allocated: 330 kb [~ 0%] <15.1 kBlocks
blocksize: 23 allocated: 259 kb [~ 0%] <11.3 kBlocks [PT1_MAX_CHAIN_ENTRY_SIZE]
blocksize: 24 allocated: 78.0 kb [~ 0%] <3.25 kBlocks
blocksize: 25 allocated: 69.7 Mb [14%] <2.79 MBlocks [PT1_NODE_WITHSONS_SIZE(2)]
blocksize: 26 allocated: 221 kb [~ 0%] <8.51 kBlocks
blocksize: 27 allocated: 385 kb [~ 0%] <14.3 kBlocks
blocksize: 28 allocated: 539 kb [~ 0%] <19.3 kBlocks
blocksize: 29 allocated: 14.6 kb [~ 0%] <~ 512 Blocks
blocksize: 30 allocated: 750 kb [~ 0%] <25.1 kBlocks
blocksize: 31 allocated: 465 kb [~ 0%] <15.1 kBlocks
blocksize: 32 allocated: 200 kb [~ 0%] <6.25 kBlocks
blocksize: 33 allocated: 26.0 Mb [~ 5%] <~ 807 kBlocks [PT1_NODE_WITHSONS_SIZE(3)]
blocksize: 34 allocated: 8.51 kb [~ 0%] <~ 256 Blocks
blocksize: 35 allocated: 403 kb [~ 0%] <11.6 kBlocks
blocksize: 36 allocated: 630 kb [~ 0%] <17.6 kBlocks
blocksize: 37 allocated: 204 kb [~ 0%] <5.50 kBlocks
blocksize: 38 allocated: 266 kb [~ 0%] <7.00 kBlocks
blocksize: 39 allocated: 9.76 kb [~ 0%] <~ 256 Blocks
blocksize: 40 allocated: 520 kb [~ 0%] <13.1 kBlocks
blocksize: 41 allocated: 113 Mb [22%] <2.76 MBlocks [PT1_NODE_WITHSONS_SIZE(4)]
blocksize: 42 allocated: 273 kb [~ 0%] <6.50 kBlocks
blocksize: 43 allocated: 151 kb [~ 0%] <3.50 kBlocks
blocksize: 44 allocated: 11.1 kb [~ 0%] <~ 256 Blocks
blocksize: 45 allocated: 304 kb [~ 0%] <6.75 kBlocks
blocksize: 46 allocated: 552 kb [~ 0%] <12.1 kBlocks
blocksize: 47 allocated: 317 kb [~ 0%] <6.75 kBlocks
blocksize: 48 allocated: 252 kb [~ 0%] <5.25 kBlocks
blocksize: 49 allocated: 3.62 Mb [~ 1%] <75.5 kBlocks [PT1_NODE_WITHSONS_SIZE(5)]
blocksize: 50 allocated: 113 kb [~ 0%] <2.25 kBlocks
blocksize: 51 allocated: 179 kb [~ 0%] <3.50 kBlocks
blocksize: 52 allocated: 312 kb [~ 0%] <6.00 kBlocks
blocksize: 53 allocated: 199 kb [~ 0%] <3.75 kBlocks
blocksize: 54 allocated: 13.6 kb [~ 0%] <~ 256 Blocks
blocksize: 55 allocated: 206 kb [~ 0%] <3.75 kBlocks
blocksize: 56 allocated: 98.0 kb [~ 0%] <1.75 kBlocks
blocksize: 57 allocated: 941 kb [~ 0%] <16.6 kBlocks [PT1_NODE_WITHSONS_SIZE(6)]
blocksize: 58 allocated: 479 kb [~ 0%] <8.26 kBlocks
blocksize: 60 allocated: 450 kb [~ 0%] <7.50 kBlocks
blocksize: 61 allocated: 290 kb [~ 0%] <4.75 kBlocks
blocksize: 62 allocated: 93.0 kb [~ 0%] <1.50 kBlocks
blocksize: 63 allocated: 63.0 kb [~ 0%] <1.00 kBlocks
blocksize: 65 allocated: 146 kb [~ 0%] <2.25 kBlocks
blocksize: 66 allocated: 215 kb [~ 0%] <3.25 kBlocks
blocksize: 67 allocated: 251 kb [~ 0%] <3.75 kBlocks
blocksize: 68 allocated: 136 kb [~ 0%] <2.00 kBlocks
blocksize: 70 allocated: 140 kb [~ 0%] <2.00 kBlocks
blocksize: 71 allocated: 142 kb [~ 0%] <2.00 kBlocks
blocksize: 72 allocated: 180 kb [~ 0%] <2.50 kBlocks
blocksize: 73 allocated: 383 kb [~ 0%] <5.25 kBlocks
blocksize: 75 allocated: 206 kb [~ 0%] <2.75 kBlocks
blocksize: 76 allocated: 209 kb [~ 0%] <2.75 kBlocks
blocksize: 77 allocated: 366 kb [~ 0%] <4.75 kBlocks
blocksize: 78 allocated: 176 kb [~ 0%] <2.25 kBlocks
blocksize: 80 allocated: 80.0 kb [~ 0%] <1.00 kBlocks
blocksize: 81 allocated: 40.5 kb [~ 0%] <~ 512 Blocks
blocksize: 82 allocated: 82.0 kb [~ 0%] <1.00 kBlocks
blocksize: 83 allocated: 166 kb [~ 0%] <2.00 kBlocks
blocksize: 85 allocated: 234 kb [~ 0%] <2.75 kBlocks
blocksize: 86 allocated: 172 kb [~ 0%] <2.00 kBlocks
blocksize: 87 allocated: 65.3 kb [~ 0%] <~ 768 Blocks
blocksize: 88 allocated: 88.0 kb [~ 0%] <1.00 kBlocks
blocksize: 90 allocated: 158 kb [~ 0%] <1.75 kBlocks
blocksize: 91 allocated: 159 kb [~ 0%] <1.75 kBlocks
blocksize: 92 allocated: 322 kb [~ 0%] <3.50 kBlocks
blocksize: 93 allocated: 186 kb [~ 0%] <2.00 kBlocks
blocksize: 95 allocated: 143 kb [~ 0%] <1.50 kBlocks
blocksize: 96 allocated: 240 kb [~ 0%] <2.50 kBlocks
blocksize: 97 allocated: 291 kb [~ 0%] <3.00 kBlocks
blocksize: 98 allocated: 221 kb [~ 0%] <2.25 kBlocks
blocksize: 100 allocated: 75.0 kb [~ 0%] <~ 768 Blocks
blocksize: 101 allocated: 50.5 kb [~ 0%] <~ 512 Blocks
blocksize: 102 allocated: 51.0 kb [~ 0%] <~ 512 Blocks
blocksize: 103 allocated: 77.3 kb [~ 0%] <~ 768 Blocks
blocksize: 105 allocated: 158 kb [~ 0%] <1.50 kBlocks
blocksize: 106 allocated: 79.5 kb [~ 0%] <~ 768 Blocks
blocksize: 107 allocated: 161 kb [~ 0%] <1.50 kBlocks
blocksize: 108 allocated: 189 kb [~ 0%] <1.75 kBlocks
blocksize: 110 allocated: 110 kb [~ 0%] <1.00 kBlocks
blocksize: 111 allocated: 83.3 kb [~ 0%] <~ 768 Blocks
blocksize: 112 allocated: 56.0 kb [~ 0%] <~ 512 Blocks
blocksize: 113 allocated: 84.8 kb [~ 0%] <~ 768 Blocks
blocksize: 115 allocated: 173 kb [~ 0%] <1.50 kBlocks
blocksize: 116 allocated: 290 kb [~ 0%] <2.50 kBlocks
blocksize: 117 allocated: 263 kb [~ 0%] <2.25 kBlocks
blocksize: 118 allocated: 88.5 kb [~ 0%] <~ 768 Blocks
blocksize: 120 allocated: 90.0 kb [~ 0%] <~ 768 Blocks
blocksize: 121 allocated: 182 kb [~ 0%] <1.50 kBlocks
blocksize: 122 allocated: 275 kb [~ 0%] <2.25 kBlocks
blocksize: 123 allocated: 246 kb [~ 0%] <2.00 kBlocks
blocksize: 125 allocated: 93.8 kb [~ 0%] <~ 768 Blocks
blocksize: 126 allocated: 63.0 kb [~ 0%] <~ 512 Blocks
blocksize: 127 allocated: 63.5 kb [~ 0%] <~ 512 Blocks
blocksize: 128 allocated: 64.0 kb [~ 0%] <~ 512 Blocks
blocksize: 130 allocated: 65.0 kb [~ 0%] <~ 512 Blocks
blocksize: 131 allocated: 32.8 kb [~ 0%] <~ 256 Blocks
blocksize: 132 allocated: 99.0 kb [~ 0%] <~ 768 Blocks
blocksize: 133 allocated: 99.8 kb [~ 0%] <~ 768 Blocks
blocksize: 135 allocated: 169 kb [~ 0%] <1.25 kBlocks
blocksize: 136 allocated: 170 kb [~ 0%] <1.25 kBlocks
blocksize: 137 allocated: 103 kb [~ 0%] <~ 768 Blocks
blocksize: 138 allocated: 69.0 kb [~ 0%] <~ 512 Blocks
blocksize: 140 allocated: 140 kb [~ 0%] <1.00 kBlocks
blocksize: 141 allocated: 70.5 kb [~ 0%] <~ 512 Blocks
blocksize: 142 allocated: 107 kb [~ 0%] <~ 768 Blocks
blocksize: 143 allocated: 71.5 kb [~ 0%] <~ 512 Blocks
blocksize: 145 allocated: 145 kb [~ 0%] <1.00 kBlocks
blocksize: 146 allocated: 256 kb [~ 0%] <1.75 kBlocks
blocksize: 147 allocated: 257 kb [~ 0%] <1.75 kBlocks
blocksize: 148 allocated: 148 kb [~ 0%] <1.00 kBlocks
blocksize: 150 allocated: 37.5 kb [~ 0%] <~ 256 Blocks
blocksize: 151 allocated: 75.5 kb [~ 0%] <~ 512 Blocks
blocksize: 152 allocated: 152 kb [~ 0%] <~1.00
sum of above: 511 Mb
overall alloc: 511 Mb
sum of above: 0 b
overall alloc: 0 b

Control edge gap weights

It would be nice to be able to weight gaps at the edges differently than in the middle. MUSCLE, for example, weights them at half the cost by default (IIRC).

This is complicated by the fact that the graph might have different trailing end lengths.

Fix cseq length

See unit test:

ASSERTION FAILURE:
- file   : cseq_test.cpp
- line   : 132
- message: check c.getWidth() == aligned.size() has failed [75 != 76]            

Improve test coverage

  • rw_fasta:
    • --meta-fmt [none|header|csv|comment]
    • fail if --fasta-idx and piped
    • in fasta with stuff before first >
    • in fasta with windows line ends
    • in fasta with description in sequence
    • in fasta with comment parameters
    • in fasta with broken chars
    • can't open out fasta
    • can't open out fasta.csv
    • sequence below idty threshold
    • line length
  • align:
    • --overhang attach|remove|edge
    • --lowercase none|original|unaligned
    • --insertion shift|forbid|remove
    • copy alignment from longer template
    • --use-subst-matrix
    • --debug-graph
    • weights
    • --fs-no-graph
    • --write-used-rels
  • query_arb:
    • write to ARB file with empty alignment (expect fail)
    • copy sequence
    • set mark
  • cseq_comparator
  • various score types
  • sina:
    • show conf
    • intype / outtype none|auto|arb|fasta

Add dependencies for SINA build/run to conda

Non zero exit status

Since we moved to a new server running Ubuntu 16.04, I got the sina binary from here (1.3.1-a5). The program runs fine and produces output, but when I integrate it into Snakemake, Snakemake complains that it returns a non-zero exit status 6 and therefore removes the output files. Any idea why it doesn't give a 0 exit code?
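
One way to narrow this down is to run the same command outside Snakemake and look at what the process actually reports. The Python sketch below only describes a return code and does not diagnose the cause; `describe_exit` is a hypothetical helper. In `subprocess`, a negative return code means the child was terminated by a signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable description."""
    if returncode == 0:
        return "success"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = str(-returncode)
        return f"killed by signal {name}"
    return f"exited with status {returncode}"


# Stand-in child process that exits with status 6; replace the argument
# list with the real sina command line to inspect an actual run.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(6)"])
print(describe_exit(result.returncode))
```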
