
quorum's People

Contributors

gmarcais


quorum's Issues

Compiling on OS X

Hi, I'm running into the following issue when compiling v1.1.1 on Mac OS X (10.12).

In file included from src/error_correct_reads.cc:31:
./include/gzip_stream.hpp:6:10: fatal error: 'ext/stdio_filebuf.h' file not found
#include <ext/stdio_filebuf.h>
         ^
1 error generated.

It looks like this has been a problem since v1.0.0: see https://github.com/Homebrew/homebrew-science/issues/2551.

FTBFS when LTO is used

quorum fails to build from source (FTBFS) in Debian when LTO is enabled. You can see the bug report here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1030954

/bin/bash ./libtool  --tag=CXX   --mode=link g++ -Wall -g -O2 -std=c++0x -Werror -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -fdebug-prefix-map=/<<PKGBUILDDIR>>=/usr/src/quorum-1.1.1-7 -DHAVE_NUMERIC_LIMITS128  -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -Wl,-z,relro -Wl,-z,now -o quorum_error_correct_reads src/error_correct_reads.o src/err_log.o -ljellyfish-2.0 -lpthread  -lrt -lpthread 
libtool: link: g++ -Wall -g -O2 -std=c++0x -Werror -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security -fdebug-prefix-map=/<<PKGBUILDDIR>>=/usr/src/quorum-1.1.1-7 -DHAVE_NUMERIC_LIMITS128 -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -Wl,-z -Wl,relro -Wl,-z -Wl,now -o quorum_error_correct_reads src/error_correct_reads.o src/err_log.o  /usr/lib/x86_64-linux-gnu/libjellyfish-2.0.so -lrt -lpthread
In member function '__ct ',
    inlined from '__ct_base .constprop' at ./include/jflib/multiplexed_io.hpp:82:18:
./include/jflib/pool.hpp:36:26: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
   36 |       size_(size), elts_(new T[size]), B2A(size, elts_), A2B(size, elts_)
      |                          ^
/usr/include/c++/12/new: In member function '__ct_base .constprop':
/usr/include/c++/12/new:128:26: note: in a call to allocation function 'operator new []' declared here
  128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
      |                          ^
lto1: all warnings being treated as errors
make[3]: *** [/tmp/ccDYJbhe.mk:5: /tmp/cclUyevp.ltrans1.ltrans.o] Error 1
make[3]: *** Waiting for unfinished jobs....
lto-wrapper: fatal error: make returned 2 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:897: quorum_error_correct_reads] Error 1
make[2]: Leaving directory '/<<PKGBUILDDIR>>'
make[1]: *** [Makefile:707: all] Error 2
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
dh_auto_build: error: make -j1 returned exit code 2
make: *** [debian/rules:12: binary] Error 25
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2

Could quorum-1.1.1 use memory in terabyte range?

Hi,
I would like quorum to use 3 TB of memory on our server. I ended up with several core dumps or, at best, errors on the command line. The "-s 3T" suffix is not recognized; it is probably untested.

$ quorum -t 104 -s 2.5T -k 31 myfile.fastq
Invalid size '2.5T'. It must be a number, maybe followed by a suffix (like k, M, G for thousand, million and billion).
$

I have 323,975,447 Illumina sequences in the file (paired-end, interleaved; some are probably singletons).

$ quorum -t 104 -s 2500G -k 31 myfile.fastq
terminate called after throwing an instance of 'jellyfish::large_hash::array_base<jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>, unsigned long, atomic::gcc, jellyfish::large_hash::array<jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>, unsigned long, atomic::gcc, allocators::mmap> >::ErrorAllocation'
  what():  Failed to allocate 9000000000000 bytes of memory
Creating the mer database failed. Most likely the size passed to the -s switch is too small. at /apps/gentoo/usr/bin/quorum line 143.
$

Is the memory specified on the command line multiplied by the number of threads? I do not understand where the 9 TB comes from.

The core dump says it was generated by quorum_create_database -s 2500G -m 31 -t 104 -q 38 -b 7 -o

(gdb) bt full
#0  0x00002aaaab776124 in raise () from /apps/gentoo/lib64/libc.so.6
No symbol table info available.
#1  0x00002aaaab77758a in abort () from /apps/gentoo/lib64/libc.so.6
No symbol table info available.
#2  0x00002aaaab1e2ecd in __gnu_cxx::__verbose_terminate_handler() () at /apps/gentoo/var/tmp/portage/sys-devel/gcc-5.4.0-r3/work/gcc-5.4.0/libstdc++-v3/libsupc++/vterminate.cc:95
No locals.
#3  0x00002aaaab1e0d06 in __cxxabiv1::__terminate(void (*)()) () at /apps/gentoo/var/tmp/portage/sys-devel/gcc-5.4.0-r3/work/gcc-5.4.0/libstdc++-v3/libsupc++/eh_terminate.cc:47
No locals.
#4  0x00002aaaab1e0d51 in std::terminate() () at /apps/gentoo/var/tmp/portage/sys-devel/gcc-5.4.0-r3/work/gcc-5.4.0/libstdc++-v3/libsupc++/eh_terminate.cc:57
No locals.
#5  0x00002aaaab1e0f68 in __cxa_throw () at /apps/gentoo/var/tmp/portage/sys-devel/gcc-5.4.0-r3/work/gcc-5.4.0/libstdc++-v3/libsupc++/eh_throw.cc:87
No locals.
#6  0x0000000000406a7b in main () at /apps/gentoo/usr/include/jellyfish/large_hash_array.hpp:180
        args = {size_arg = 2500000000000, size_given = true, mer_arg = 31, mer_given = true, bits_arg = 7, bits_given = true, min_qual_value_arg = 38, min_qual_value_given = true, min_qual_char_arg = {<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >> = {static npos = 
    18446744073709551615, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x616c98 ""}, _M_string_length = 0, {_M_local_buf = '\000' <repeats 15 times>, _M_allocated_capacity = 0}}, <No data fields>}, min_qual_char_given = false, 
          threads_arg = 104, threads_given = true, output_arg = 0x7fffffffd6ec "quorum_corrected_mer_database.jf", output_given = true, reprobe_arg = 126, reprobe_given = false, reads_arg = {<std::_Vector_base<char const*, std::allocator<char const*> >> = {
              _M_impl = {<std::allocator<char const*>> = {<__gnu_cxx::new_allocator<char const*>> = {<No data fields>}, <No data fields>}, _M_start = 0x628fa0, _M_finish = 0x628fa8, _M_end_of_storage = 0x628fa8}}, <No data fields>}}
        std::__ioinit = {static _S_refcount = 10, static _S_synced_with_stdio = true}
        jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>::k_ = 31
(gdb)

Thank you,

Error: Failed to open output file

I am using a script that has worked very well in the past (error-correcting sequence reads for ~200 samples). I'm now trying to process about 90 samples, and it failed for every one with the same kind of error:

Error: Failed to open output file 'quorum_output/Acidomeria-cinctipes-CMF-0941_ALL_READS_mer_database.jf'.
Usage: create_database_cmdline [options] reads:path+
Use --help for more information
Creating the mer database failed. Most likely the size passed to the -s switch is too small. at /apps/masurca/2.3.2/bin/quorum line 143.
Wed Aug 28 07:40:12 EDT 2019

I've always used the default -s value. Given the error message, I've increased it to 50M, 100M, 1000M, and 10000M, but I keep getting the same error across all samples. I'm starting to suspect the cause is something other than -s, but I'm not sure what it could be.

Determining -s Parameter

Greetings,

I read in the README.md that to estimate the size of the Jellyfish hash (the -s parameter), one should use the following calculation:

(G + k * n) / 0.8

I'm working with human data and finding that the resulting number can be quite large. Assuming G is about 3,200,000,000 bp for humans, that leads to the following calculation for my data, with the default k (24) and a FASTQ with 1.2 billion reads:

(3,200,000,000 + (24 * 1,200,000,000)) / 0.8 = 40,000,000,000

Does this calculation look correct? (I'm testing it now and can let you know the results.)
