BEETL: Burrows-Wheeler Extended Tool Library

Copyright (c) 2011-2014 Illumina, Inc.

This software is covered by the "BSD 2-Clause License" (see accompanying LICENSE file)
and any user of this software or source file is bound by the terms therein.


#### Documentation in: 
doc/BEETL.md

#### Citations

*** BWT *** 

  Markus J. Bauer, Anthony J. Cox, Giovanna Rosone: 
  Lightweight BWT Construction for Very Large String Collections. 
  Combinatorial Pattern Matching - 22nd Annual Symposium, CPM 2011, 
  Lecture Notes in Computer Science 6661: 219-231
  doi: 10.1007/978-3-642-21458-5_20, Springer 2011.

  Markus J. Bauer, Anthony J. Cox, Giovanna Rosone: 
  Lightweight algorithms for constructing and inverting the BWT of string collections. 
  Theoretical Computer Science 483: 134-148 (2013). 
  doi: 10.1016/j.tcs.2012.02.002.


*** BWT and LCP *** 

  Markus J. Bauer, Anthony J. Cox, Giovanna Rosone and Marinella Sciortino: 
  Lightweight LCP Construction for Next-Generation Sequencing Datasets. WABI 2012. 
  Lecture Notes in Bioinformatics. Volume 7534, pp 326-337, 2012. Springer Berlin / Heidelberg.
  doi: 10.1007/978-3-642-33122-0_26.

*** FASTA compressor *** 

  Anthony J. Cox, Markus J. Bauer, Tobias Jakobi, and Giovanna Rosone. 
  Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform. 
  Bioinformatics. 28(11): 1415-1419, 2012. 
  doi: 10.1093/bioinformatics/bts173.

*** FASTQ compressor ***

  Lilian Janin, Giovanna Rosone, and Anthony J. Cox: 
  Adaptive reference-free compression of sequence quality scores. 
  Bioinformatics. 30(1): 24-30, 2014.
  doi: 10.1093/bioinformatics/btt257.


*** Release notes ***

Version 1.1.0 (24th February 2015)

- Beetl-fastq: New --mode=restore option to restore the original fastq data. It is known to be very slow, as our beetl-unbwt algorithm hasn't received any love for years.
- Beetl-fastq now uses the output directory for intermediate files, to prevent read-only issues.
- New BWT Index format, allowing run lengths longer than 2^32.
- Additional tab-separated output file for metaBEETL.


Version 1.0.2 (19th September 2014)

- Beetl-fastq: Can now run multiple instances simultaneously in same directory
+ Thanks to Stephen Sammut


Version 1.0.1 (15th September 2014)

- Beetl-extend: Fixed bug which was preventing output files from being written
- Beetl-fastq: changed "zcat file.gz" to "cat file.gz|zcat" for mac compatibility
+ Thanks to Stephen Sammut for the bug report


Version 1.0 (29th August 2014)

- BEETL is now released under the BSD 2-Clause license, enjoy!
- New "bwt_v3" Run Length Encoding format for BWT files, set as default (and described in doc/FileFormats.md)


Version 0.10.0 (17th June 2014)

- New beetl-fastq tool to convert FASTQ files into a set of smaller and searchable files (including quality scores and read ids)
- Added RLE53 class (run length encoding: 3 bits for base, 5 bits for run length), but RLE44 is still our default BWT format
- Beetl-convert handles BWT44 -> BWT53 conversions
- Integrated LCP code with main BCR, making it faster
- Beetl-extend can now rebuild the whole reads' bases with --propagate-sequence (instead of stopping at the $ signs)
- Replaced compile-time PROPAGATE_SEQUENCE with runtime --propagate-sequence in beetl-compare
- Beetl-compare output now correctly goes into the directory specified by --output
- Beetl-search/beetl-extend --use-shm stores pre-parsed index files in shared memory for faster tool startup when doing many small searches
+ Thanks to Shaun Jackman and Rayan Chikhi for compilation patches
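
The "3 bits for base, 5 bits for run length" idea behind RLE53 can be illustrated with a minimal sketch. This is not BEETL's actual implementation: the symbol table, the placement of the two fields within the byte, and the convention of storing run lengths 1..32 as 0..31 are all assumptions made for illustration, and the function names are hypothetical.

```python
# Hypothetical sketch of a "5 bits run length + 3 bits base" byte encoding.
# Alphabet order and bit placement are assumptions, not BEETL's on-disk layout.
ALPHABET = "$ACGNT"  # assumed symbol table; 3 bits allow up to 8 symbols
MAX_RUN = 32         # 5 bits store run lengths 1..32 as 0..31 (assumed)


def rle53_encode(text: str) -> bytes:
    """Encode text as one byte per run: high 5 bits = run - 1, low 3 bits = base."""
    out = bytearray()
    i = 0
    while i < len(text):
        symbol = text[i]
        run = 1
        # Extend the run while the symbol repeats and the 5-bit field has room.
        while i + run < len(text) and text[i + run] == symbol and run < MAX_RUN:
            run += 1
        code = ALPHABET.index(symbol)
        out.append(((run - 1) << 3) | code)
        i += run
    return bytes(out)


def rle53_decode(data: bytes) -> str:
    """Invert rle53_encode: expand each byte back into its run of symbols."""
    return "".join(ALPHABET[b & 0b111] * ((b >> 3) + 1) for b in data)
```

Under these assumptions a run longer than 32 simply spills into further bytes, which is the trade-off against a 4+4 split: shorter runs per byte in exchange for room for more distinct symbols.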


Version 0.9.0 (9th December 2013)

- New K-mer search and propagation tools (see example with beetl-search/beetl-extend in doc/BEETL.md)
- MetaBeetl filtering changed a bit, in preparation for further improvements
- MetaBeetl database construction scripts and instructions updated
- Better distribution packaging, with "make distcheck"


Version 0.8.0 (29th October 2013)

- New BWT-based tumour-normal filter (see example in doc/BEETL.md)
- Beetl-compare modes: 'split' renamed to 'splice'
- Beetl-compare modes: new 'tumour-normal'
- 16-way parallel backtracker/beetl-compare
- MetaBEETL is now able to run without sequence propagation (which is faster)
- Upgraded requirement: autoconf 2.65 (improved regression testing)


Version 0.7.0 (18th September 2013)

- New beetl-correct tool
- Updated regression tests
- Detection of required libraries and OpenMP in configure script


Version 0.6.0 (27th August 2013)

- Important: Harmonised file encodings: you may need to rebuild your BWT files, or just convert bwt-B00 from BWT_ASCII to BWT_RLE with beetl-convert (bwt-B00 used to be always ASCII-encoded; it now follows the same encoding as the other piles)
- Now requires GCC with C++11 support (gcc 4.6 or above)
- New Markdown documentation style
- Faster beetl-compare and metaBEETL
- Lossless quality compression scripts added, even though they are not using BWT
- New features:
  - BWT creation with reverse-complemented reads (--add-rev-comp)
  - BWT creation with multiple reads per sequence (--sequence-length)
- Support for BCL and BCL.gz in beetl-convert


Version 0.5.0 (10th June 2013)

- Faster metaBEETL
  - Pruning of BKPT to reduce the number of MTAXA tests and output lines
  - New beetl-compare --subset option for distributed computing


Version 0.4.0 (10th April 2013)

- Robustness improvements


Version 0.3.0 (7th April 2013)

- Metagenomics 'META-BEETL' code added to main tree


Version 0.2.0 (19th March 2013)

- Longest Common Prefix computation integrated under --generate-lcp in beetl-bwt
- New beetl-unbwt command line interface


Version 0.1.0 (28th February 2013)

- New command line interface
- Support for FASTA, FASTQ, raw SEQ and cycle-by-cycle files as input
- Distinct intermediate and final formats
- Support for Huffman encoding as intermediate or final format
- Support for intermediate "incremental run-length-encoded" format
- Automatic detection of format
- Performance prediction for automatic use of fastest algorithm and options


Version 0.0.2 (25th June 2012)


Version 0.0.1 (18th November 2011)

This contains initial implementations of the BCR and BCRext algorithms
as described in our CPM paper.


*** Contributors ***

Markus J. Bauer, Illumina UK
Anthony J. Cox, Illumina UK (project lead)
Tobias Jakobi, University of Bielefeld
Giovanna Rosone, University of Palermo
Ole Schulz-Trieglaff, Illumina UK
Lilian Janin, Illumina UK
