mateidavid / fast5
A C++ header-only library for reading Oxford Nanopore Fast5 files
License: MIT License
It looks like the header files are currently laid out as:
fast5.hpp
hdf5_tools.hpp
fast5_version.hpp
logger.hpp
Huffman_Packer.hpp
Bit_Packer.hpp
In the Debian package, we have it as:
fast5.hpp
fast5/hdf5_tools.hpp
fast5/fast5_version.hpp
fast5/logger.hpp
fast5/Huffman_Packer.hpp
fast5/Bit_Packer.hpp
Would you accept a PR to make that change here? It would also help to separate the fast5 header files from those just needed by the examples.
nanopolish 0.9.0 no longer builds with fast5 0.6.4, so would you mind tagging a new release to go along with it?
When I run the tool f5ls on a fast5 file I get the following error.
...
eventdetection/events/size=12393
(mean=92.6346, stdv=2.68832, start=28195744, length=7)
basecall(0)/group_list=1D_000
basecall(0)/seq_size=6997
Bus error
Similar behaviour is observed with f5-full and with any other tool (such as nanopolish) that uses the fast5 library to access events in fast5 files. The system on which I get the error is as follows.
Processor: ARMv7 Processor rev 3
Operating system: Ubuntu 16.04.3 LTS
The output from gdb and backtrace is attached herewith.
gdb_out.txt
The bus error seems to originate inside the HDF functions. However, I do not think this is a bug in the HDF library, because the h5dump tool provided by HDF outputs the following without any issues.
h5dump.txt
Can you shed some light on this to fix the issue?
Hi!
I am very interested in this library, but I cannot make it work: if I compile and run the example a.cpp on a sample fast5 file, I obtain the error:
file_version=1
HDF5-DIAG: Error detected in HDF5 (1.9.220) thread 0:
#000: H5T.c line 2028 in H5Tis_variable_str(): not a datatype
major: Invalid arguments to routine
minor: Inappropriate type
terminate called after throwing an instance of 'hdf5_tools::Exception'
what(): /Analyses/Basecall_2D_000/version: error in H5Tis_variable_str
Aborted (core dumped)
Do you have any idea what is going wrong here? It looks like a format error, but I don't understand where it comes from. Thank you!
Hi, I'm unable to build the python wrapper from git master branch.
Here is the build log:
(.venv) user@myhost ~/Desktop/fast5/python $ pip install cython
Collecting cython
Downloading Cython-0.29.21-cp38-cp38-manylinux1_x86_64.whl (1.9 MB)
|████████████████████████████████| 1.9 MB 633 kB/s
Installing collected packages: cython
Successfully installed cython-0.29.21
WARNING: You are using pip version 20.1.1; however, version 20.2.2 is available.
You should consider upgrading via the '/home/user/Desktop/fast5/python/.venv/bin/python3 -m pip install --upgrade pip' command.
(.venv) user@myhost ~/Desktop/fast5/python $ make develop
/home/user/Desktop/fast5/python/.venv/bin/python setup.py develop
Compiling fast5/fast5.pyx because it changed.
[1/1] Cythonizing fast5/fast5.pyx
/home/user/Desktop/fast5/python/.venv/lib/python3.8/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /home/user/Desktop/fast5/python/fast5/fast5.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
running develop
running egg_info
creating fast5.egg-info
writing fast5.egg-info/PKG-INFO
writing dependency_links to fast5.egg-info/dependency_links.txt
writing top-level names to fast5.egg-info/top_level.txt
writing manifest file 'fast5.egg-info/SOURCES.txt'
reading manifest file 'fast5.egg-info/SOURCES.txt'
writing manifest file 'fast5.egg-info/SOURCES.txt'
running build_ext
building 'fast5' extension
creating build
creating build/temp.linux-x86_64-3.8
creating build/temp.linux-x86_64-3.8/fast5
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -march=x86-64 -mtune=generic -O3 -pipe -fno-plt -fno-semantic-interposition -march=x86-64 -mtune=generic -O3 -pipe -fno-plt -march=x86-64 -mtune=generic -O3 -pipe -fno-plt -fPIC -I../include -I../src -I/home/user/Desktop/fast5/python/.venv/include -I/usr/include/python3.8 -c fast5/fast5.cpp -o build/temp.linux-x86_64-3.8/fast5/fast5.o -std=c++11 -Wall -Wextra -Wpedantic
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp: In member function ‘std::vector<std::__cxx11::basic_string<char> > hdf5_tools::File::get_attr_list(const string&) const’:
../include/fast5/hdf5_tools.hpp:2109:60: error: no matching function for call to ‘hdf5_tools::detail::Util::wrap(herr_t (&)(hid_t, H5O_info2_t*, unsigned int), hid_t&, H5O_info2_t*)’
2109 | detail::Util::wrap(H5Oget_info, id_holder.id, &info);
| ^
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp:243:5: note: candidate: ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...)’
243 | wrap(Function && f, Args && ...args)
| ^~~~
../include/fast5/hdf5_tools.hpp:243:5: note: template argument deduction/substitution failed:
../include/fast5/hdf5_tools.hpp: In substitution of ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...) [with Function = int (&)(long int, H5O_info2_t*, unsigned int); Args = {long int&, H5O_info2_t*}]’:
../include/fast5/hdf5_tools.hpp:2109:60: required from here
../include/fast5/hdf5_tools.hpp:243:5: error: no type named ‘type’ in ‘class std::result_of<int (&(long int&, H5O_info2_t*))(long int, H5O_info2_t*, unsigned int)>’
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp: In member function ‘bool hdf5_tools::File::path_exists(const string&) const’:
../include/fast5/hdf5_tools.hpp:2389:68: error: no matching function for call to ‘hdf5_tools::detail::Util::wrap(herr_t (&)(hid_t, H5O_info2_t*, unsigned int), hid_t&, H5O_info2_t*)’
2389 | detail::Util::wrap(H5Oget_info, o_id_holder.id, &o_info);
| ^
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp:243:5: note: candidate: ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...)’
243 | wrap(Function && f, Args && ...args)
| ^~~~
../include/fast5/hdf5_tools.hpp:243:5: note: template argument deduction/substitution failed:
../include/fast5/hdf5_tools.hpp: In substitution of ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...) [with Function = int (&)(long int, H5O_info2_t*, unsigned int); Args = {long int&, H5O_info2_t*}]’:
../include/fast5/hdf5_tools.hpp:2389:68: required from here
../include/fast5/hdf5_tools.hpp:243:5: error: no type named ‘type’ in ‘class std::result_of<int (&(long int&, H5O_info2_t*))(long int, H5O_info2_t*, unsigned int)>’
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp: In member function ‘bool hdf5_tools::File::check_object_type(const string&, H5O_type_t) const’:
../include/fast5/hdf5_tools.hpp:2410:64: error: no matching function for call to ‘hdf5_tools::detail::Util::wrap(herr_t (&)(hid_t, H5O_info2_t*, unsigned int), hid_t&, H5O_info2_t*)’
2410 | detail::Util::wrap(H5Oget_info, o_id_holder.id, &o_info);
| ^
In file included from ../include/fast5.hpp:27,
from fast5/fast5.cpp:680:
../include/fast5/hdf5_tools.hpp:243:5: note: candidate: ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...)’
243 | wrap(Function && f, Args && ...args)
| ^~~~
../include/fast5/hdf5_tools.hpp:243:5: note: template argument deduction/substitution failed:
../include/fast5/hdf5_tools.hpp: In substitution of ‘template<class Function, class ... Args> static typename std::result_of<_Functor(_ArgTypes ...)>::type hdf5_tools::detail::Util::wrap(Function&&, Args&& ...) [with Function = int (&)(long int, H5O_info2_t*, unsigned int); Args = {long int&, H5O_info2_t*}]’:
../include/fast5/hdf5_tools.hpp:2410:64: required from here
../include/fast5/hdf5_tools.hpp:243:5: error: no type named ‘type’ in ‘class std::result_of<int (&(long int&, H5O_info2_t*))(long int, H5O_info2_t*, unsigned int)>’
error: command 'gcc' failed with exit status 1
make: *** [Makefile:38: develop] Error 1
Program versions:
Archlinux
gcc version 10.2.0
Python 3.8.5
hdf5 1.12.0
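The log above shows `H5Oget_info` resolving to a function with three parameters, `(hid_t, H5O_info2_t*, unsigned int)`, while the call sites pass only two arguments, so the wrapper's return type cannot be computed and the template drops out of overload resolution. The sketch below reproduces that mechanism with stand-in names (`id_type`, `Info2`, `get_info_v3`, `wrap` and `query_fields` are all hypothetical and not the real HDF5 or fast5 code). It uses a `decltype` return type instead of `std::result_of`, which fails in the same way on an argument-count mismatch.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for hid_t, H5O_info2_t and a three-argument
// H5Oget_info as seen in the compiler output.
using id_type = long;
struct Info2 { unsigned fields_seen = 0; };

inline int get_info_v3(id_type /*id*/, Info2* out, unsigned fields)
{
    out->fields_seen = fields;
    return 0; // non-negative return value signals success
}

// Forwarding wrapper: the return type depends on the call expression being
// valid, so a call with too few arguments yields "no matching function".
template <typename Function, typename... Args>
auto wrap(Function&& f, Args&&... args)
    -> decltype(std::forward<Function>(f)(std::forward<Args>(args)...))
{
    return std::forward<Function>(f)(std::forward<Args>(args)...);
}

// Returns the fields value recorded by the callee, or -1 on failure.
inline int query_fields(id_type id, unsigned fields)
{
    Info2 info;
    // wrap(get_info_v3, id, &info);  // two arguments: rejected, as in the log
    const int status = wrap(get_info_v3, id, &info, fields);
    return status < 0 ? -1 : static_cast<int>(info.fields_seen);
}
```

In real code against HDF5 1.12, the analogous change would be to pass a fields flag such as `H5O_INFO_BASIC` as the third argument, or to build with the 1.10 API compatibility setting (`-DH5_USE_110_API`). Which of these suits fast5 is not settled in this thread.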
Would it be possible to tag a version of this project at commit 7198123 or later? I would like to create bioconda packages for fast5 and nanopolish.
Hey guys.
Great work. Super useful project.
Thought I would let you know I have a Singularity container built on SingularityHub for the use of f5pack. I had some minor difficulties trying to install f5pack locally so thought others might appreciate having a container. Info on the recipe to build and where you can download a pre-built version are here.
Cheers
Hi,
I can confirm that nanocall builds nicely when using the latest Git commits. However, to build a Debian package from nanocall I need to link against the Debian packaged version of fast5 (0.5.6), which matches your latest release tag. When compiling against this one I get:
...
cd /home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/obj-x86_64-linux-gnu/nanocall && /usr/bin/c++ -g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -std=c++11 -pthread -Wall -Wextra -pedantic -fmax-errors=1 -O3 -DNDEBUG -DDISABLE_ASSERTS -isystem /usr/include/hdf5/serial -I/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/obj-x86_64-linux-gnu -I/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/builtin_models -I/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/tclap/include -I/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/nanocall -I/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/version -I/usr/include/hpptools -o CMakeFiles/nanocall.dir/nanocall.cpp.o -c /home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/nanocall/nanocall.cpp
/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/nanocall/nanocall.cpp: In function ‘void init_reads(const Pore_Model_Dict_Type&, const std::liststd::basic_string&, std::deque<Fast5_Summary<float, 6u> >&)’:
/home/tillea/debian-maintain/alioth/debian-med_git/build-area/nanocall-0.6.14/src/nanocall/nanocall.cpp:263:68: error: no matching function for call to ‘Fast5_Summary<float, 6u>::Fast5_Summary(const std::basic_string&, const Pore_Model_Dict_Type&, TCLAP::SwitchArg&)’
Fast5_Summary_Type s(f, models, opts::double_strand_scaling);
^
compilation terminated due to -fmax-errors=1.
...
which somehow smells like fast5 Git has more advanced features which are used by nanocall. If you could tag a fast5 release which has all those needed features this would be very helpful.
Kind regards
Andreas.
I experienced this error:
./submods/fast5/include/fast5/hdf5_tools.hpp:1369:70: error: no match for 'operator[]' (operand types are '__gnu_cxx::__alloc_traits<std::allocator<std::array<char, 1> >, std::array<char, 1> >::value_type' {aka 'std::array<char, 1>'} and 'int')
1369 | reinterpret_cast<std::string &>(out).assign(&char_buff[0][0], reader_base.dspace_size);
| ^
./submods/fast5/include/fast5/hdf5_tools.hpp: In static member function 'static void hdf5_tools::File::copy_attribute(const hdf5_tools::File&, const hdf5_tools::File&, const std::string&, const std::string&)':
./submods/fast5/include/fast5/hdf5_tools.hpp:2301:33: error: no match for 'operator[]' (operand types are '__gnu_cxx::__alloc_traits<std::allocator<std::array<char, 1> >, std::array<char, 1> >::value_type' {aka 'std::array<char, 1>'} and 'int')
2301 | tmp_v[i][0] = tmp[i];
| ^
Compilation of the same code base worked ok on ubuntu 20.04 gcc 9.4.0.
(I will try now gcc 9.4.0 on mac os, assuming I manage to install it)
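One possible cause, offered as an assumption because nothing in this thread confirms it: `std::array` is only forward-declared where the error occurs, because `<array>` is not included directly and the standard library in use does not pull it in through other headers. The sketch below uses names modeled on the error text (`tmp`, `tmp_v`); the function `round_trip` is hypothetical. It compiles once `<array>` is included explicitly.

```cpp
#include <array>    // the include assumed to be missing in the failing build
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Copy a string into a vector of one-character arrays and back again,
// using the same tmp_v[i][0] indexing that the compiler rejected.
inline std::string round_trip(const std::string& tmp)
{
    std::vector<std::array<char, 1>> tmp_v(tmp.size());
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        tmp_v[i][0] = tmp[i];
    }
    std::string out;
    out.reserve(tmp_v.size());
    for (std::size_t i = 0; i < tmp_v.size(); ++i) {
        out.push_back(tmp_v[i][0]);
    }
    return out;
}
```

If this is the cause, adding `#include <array>` near the top of the affected header would be the corresponding change.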
fast5/src/hdf5_tools.hpp:201: multiple definition of `hdf5_tools::detail::get_path_name(std::string const&)'
fast5/src/hdf5_tools.hpp:556: multiple definition of `hdf5_tools::addr_exists(int, std::string const&)'
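"Multiple definition" link errors that point into a header usually mean a non-template function is defined there without `inline`, and the header is included from more than one translation unit. Assuming that is what happens here (the source at those lines is not shown), a minimal sketch of the usual remedy follows. The namespace name and the function body are illustrative only.

```cpp
#include <cassert>
#include <string>

namespace hdf5_tools_sketch {   // illustrative namespace, not the library's
namespace detail {

// `inline` allows every translation unit that includes this header to carry
// its own copy of the definition without a clash at link time.
inline std::string get_path_name(const std::string& full_name)
{
    // Illustrative body: keep everything up to and including the last '/'.
    const std::string::size_type pos = full_name.find_last_of('/');
    return pos == std::string::npos ? std::string()
                                    : full_name.substr(0, pos + 1);
}

} // namespace detail
} // namespace hdf5_tools_sketch
```

Function templates and functions defined inside a class body are implicitly inline, so only free functions and out-of-class member definitions in a header need the keyword.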
Hi,
I am enabling ppc64le build support on Travis, but on ppc64le the build fails with the error below:
"
Reading state information...
E: Unable to locate package docker-ce
The command "sudo apt-get install -y -o Dpkg::Options::="--force-confnew" docker-ce" failed and exited with 100 during .
Your build has been stopped."
The full log can be tracked here: https://travis-ci.com/github/sanjaymsh/fast5/jobs/396264179
It seems that the image named in https://github.com/sanjaymsh/fast5/blob/master/.travis.Dockerfile.in#L1
("FROM buildpack-deps:jessie") does not have multiarch support.
I checked it here: https://hub.docker.com/_/buildpack-deps?tab=tags&page=1
If I am right, could we switch to another version of the image that is available?
Please have a look at it.
Thanks!
Hi David,
I am trying to put together a fast5 package for Gentoo Linux, but it fails to compile with gcc 6.3.0 here:
>>> Emerging (2 of 3) sci-libs/fast5-9999::science
>>> Unpacking source...
* Fetching https://github.com/mateidavid/fast5.git ...
git fetch https://github.com/mateidavid/fast5.git +HEAD:refs/git-r3/HEAD
git symbolic-ref refs/git-r3/sci-libs/fast5/0/__main__ refs/git-r3/HEAD
* Checking out https://github.com/mateidavid/fast5.git to /apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999 ...
git checkout --quiet refs/git-r3/HEAD
GIT update -->
repository: https://github.com/mateidavid/fast5.git
at the commit: 8a48fd7d70d64225ac349135dcf5734b9f452125
>>> Source unpacked in /apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work
>>> Preparing source in /apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999 ...
>>> Source prepared.
>>> Configuring source in /apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999 ...
>>> Source configured.
>>> Compiling source in /apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999 ...
make -j25 -C python develop-user HDF5_DIR=/apps/gentoo/usr HDF5_LIB_DIR=/apps/gentoo/usr/lib64
make: Entering directory '/apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999/python'
/apps/gentoo/usr/bin/python setup.py develop --user
Compiling fast5/fast5.pyx because it changed.
[1/1] Cythonizing fast5/fast5.pyx
running develop
running egg_info
creating fast5.egg-info
writing fast5.egg-info/PKG-INFO
writing top-level names to fast5.egg-info/top_level.txt
writing dependency_links to fast5.egg-info/dependency_links.txt
writing manifest file 'fast5.egg-info/SOURCES.txt'
reading manifest file 'fast5.egg-info/SOURCES.txt'
writing manifest file 'fast5.egg-info/SOURCES.txt'
running build_ext
building 'fast5' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/fast5
x86_64-pc-linux-gnu-g++ -pthread -O2 -pipe -O2 -pipe -march=native -ftree-vectorize -fPIC -I../include -I../src -I/apps/gentoo/usr/include/python2.7 -c fast5/fast5.cpp -o build/temp.linux-x86_64-2.7/fast5/fast5.o -std=c++11 -Wall -Wextra -Wpedantic -isystem /apps/gentoo/usr/include
In file included from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/ext/string_conversions.h:41:0,
from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/basic_string.h:5402,
from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/string:52,
from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/locale_classes.h:40,
from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/bits/ios_base.h:41,
from /apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/ios:42,
from fast5/fast5.cpp:536:
/apps/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.0/include/g++-v6/cstdlib:75:25: fatal error: stdlib.h: No such file or directory
#include_next <stdlib.h>
^
compilation terminated.
error: command 'x86_64-pc-linux-gnu-g++' failed with exit status 1
make: *** [Makefile:41: develop-user] Error 1
make: Leaving directory '/apps/gentoo/var/tmp/portage/sci-libs/fast5-9999/work/fast5-9999/python'
It works with gcc-5.4.0.
When trying to build nanopolish with the latest release here, I get
src/nanopolish_squiggle_read.cpp:84:23: error: 'class fast5::File' has no member named 'have_context_tags_params'; did you mean 'have_channel_id_params'?
if(this->f_p->have_context_tags_params()) {
but it looks like you added this member in the current development version. Would you mind bumping the version number for this?
I've installed hdf5 1.8.16 (32-bit) and built f5dump using MSVS 2015, 32-bit console application. When I run it on any ONT fast5 file I get the following written to the console:
file_version=1
sampling_rate=3012
have_sequences_group=0
have_raw_samples=0
have_eventdetection_group=1
HDF5-DIAG: Error detected in HDF5 (1.8.16) thread 0:
#000: C:\autotest\HDF518ReleaseRWDITAR\src\H5A.c line 642 in H5Aread(): unable to read attribute
major: Attribute
minor: Read failed
#1: C:\autotest\HDF518ReleaseRWDITAR\src\H5Aint.c line 641 in H5A_read(): unable to convert between src and dst datatypes
major: Attribute
minor: Feature is unsupported
#2: C:\autotest\HDF518ReleaseRWDITAR\src\H5T.c line 4548 in H5T_path_find(): no appropriate function for conversion path
major: Datatype
minor: Unable to initialize object
hdf5 error: /Analyses/EventDetection_000/version: error in H5Aread
Any idea what I may have done wrong?
I was looking to package nanopolish for Debian, but it depends on this project, which doesn't have any licensing information. We cannot redistribute it without knowing your licensing terms and if they comply with the Debian Free Software Guidelines.
Hi,
While trying fast5 with hdf5-1.10 I encountered unexpected failures reading valid HDF5 files.
I've eventually found out that some ids aren't of the correct type. Please see the attached patch.
Thanks.
hid_t.txt
Hi, the blog mentions that f5pack achieves 10x compression, but on our runs (r9.4, albacore 2) we only achieve ~40% compression. Is this expected? Are there any other ways of improving fast5 compression?
~/src/fast5/python/bin/f5pack --archive -R -o f5pack/$d reads/$d
I am getting the following error while running my binary:
/home/hariss/hari/fast5_api/fast5/include/fast5/hdf5_tools.hpp:1994: void hdf5_tools::File::read(const string&, Data_Storage&, Args&& ...) const [with Data_Storage = std::vector; Args = {}; std::string = std::basic_string]: Assertion `is_open()' failed.
Aborted (core dumped)
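The assertion text indicates that `read()` was called on a file object for which `is_open()` returned false, for example because opening failed or was never attempted. The sketch below checks the open state before reading. `Sketch_File`, `open_checked` and `opens_cleanly` are hypothetical stand-ins; only the name `is_open()` comes from the assertion message.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a file wrapper that exposes is_open().
class Sketch_File {
public:
    void open(const std::string& path) { stream_.open(path, std::ios::binary); }
    bool is_open() const { return stream_.is_open(); }
private:
    std::ifstream stream_;
};

// Report a failed open as an exception instead of letting a later read
// trip an assertion.
inline void open_checked(Sketch_File& f, const std::string& path)
{
    f.open(path);
    if (!f.is_open()) {
        throw std::runtime_error("cannot open file: " + path);
    }
}

// Convenience check: true only if the path could be opened.
inline bool opens_cleanly(const std::string& path)
{
    Sketch_File f;
    try {
        open_checked(f, path);
    } catch (const std::runtime_error&) {
        return false;
    }
    return true;
}
```

Checking the path and the open state first turns the abort into an error message that names the file that could not be opened.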