Comments (18)
I'm running it now, but I have a cVCF with 85 samples, so it takes several hours to split. Unfortunately, TileDB-VCF still can't handle cVCF ;-)
from tiledb-vcf.
@sauloal Thanks for reporting this. It looks like the segfault happens in htslib. Can you tell us which version of htslib you have installed? It looks like you are using conda, so can you paste the output of conda list?
from tiledb-vcf.
Thanks for the quick reply.
htslib 1.11 hd3b49d5_2 bioconda
Please find below the quite large list.
$ conda list
# packages in environment at /home/saulo/anaconda3:
#
# Name Version Build Channel
_anaconda_depends 2020.07 py37_0
_ipyw_jlab_nb_ext_conf 0.1.0 py37_0
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 1_llvm conda-forge
abseil-cpp 20200225.2 he1b5a44_2 conda-forge
adagio 0.2.2 pypi_0 pypi
aiofiles 0.5.0 pypi_0 pypi
alabaster 0.7.12 py_0 conda-forge
anaconda custom py37_1
anaconda-client 1.7.2 py_0 conda-forge
anaconda-navigator 1.9.12 py37_1
anaconda-project 0.9.1 pyhd8ed1ab_0 conda-forge
aniso8601 7.0.0 pypi_0 pypi
antlr4-python3-runtime 4.9.1 pypi_0 pypi
anyio 2.0.2 py37h89c1867_4 conda-forge
appdirs 1.4.4 pyh9f0ad1d_0 conda-forge
argh 0.26.2 pyh9f0ad1d_1002 conda-forge
argon2-cffi 20.1.0 py37h5e8e339_2 conda-forge
arrow-cpp 1.0.1 py37h1234567_1_cpu conda-forge
asciitree 0.3.3 py_2 conda-forge
asn1crypto 1.4.0 pyh9f0ad1d_0 conda-forge
astroid 2.4.2 py37hc8dfbb8_1 conda-forge
astropy 4.2 py37h5e8e339_1 conda-forge
async-exit-stack 1.0.1 pypi_0 pypi
async_generator 1.10 py_0 conda-forge
atomicwrites 1.4.0 pyh9f0ad1d_0 conda-forge
attrs 20.3.0 pyhd3deb0d_0 conda-forge
autopep8 1.4.4 py_0
aws-sdk-cpp 1.7.164 hba45d7a_2 conda-forge
babel 2.9.0 pyhd3deb0d_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.1 py_0 conda-forge
backports.shutil_get_terminal_size 1.0.0 py_3 conda-forge
bcolz 1.2.1 py37hb3f55d8_1001 conda-forge
bcrypt 3.1.7 pypi_0 pypi
beautifulsoup4 4.9.3 pyhb0f4dca_0 conda-forge
bitarray 1.6.3 py37h5e8e339_0 conda-forge
bkcharts 0.2 py37_0
blas 2.3 openblas conda-forge
bleach 3.3.0 pyh44b312d_0 conda-forge
blosc 1.21.0 h9c3ff4c_0 conda-forge
bokeh 2.2.3 py37h89c1867_0 conda-forge
boto 2.49.0 py_0 conda-forge
bottleneck 1.3.2 py37h902c9e0_3 conda-forge
brotli 1.0.9 h9c3ff4c_4 conda-forge
brotlipy 0.7.0 py37h5e8e339_1001 conda-forge
brunsli 0.1 h9c3ff4c_0 conda-forge
brython 3.8.9 pypi_0 pypi
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2020.12.5 ha878542_0 conda-forge
cached-property 1.5.1 py_0 conda-forge
cairo 1.16.0 hcf35c78_1003 conda-forge
certifi 2020.12.5 py37h89c1867_1 conda-forge
cffi 1.14.4 py37h11fe52a_0 conda-forge
chardet 4.0.0 py37h89c1867_1 conda-forge
charls 2.2.0 h9c3ff4c_0 conda-forge
ciso8601 2.1.3 pypi_0 pypi
click 7.1.2 pyh9f0ad1d_0 conda-forge
cloudpickle 1.6.0 py_0 conda-forge
clyent 1.2.2 py_1 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
conda 4.9.2 py37h89c1867_0 conda-forge
conda-build 3.19.2 py37hc8dfbb8_2 conda-forge
conda-env 2.6.0 1 conda-forge
conda-package-handling 1.7.2 py37hb5d75c8_0 conda-forge
conda-verify 3.1.1 py37hc8dfbb8_1001 conda-forge
contextlib2 0.6.0.post1 py_0 conda-forge
coverage 5.4 pypi_0 pypi
cryptography 3.3.1 py37h7f0c10b_1 conda-forge
curl 7.71.1 he644dc0_8 conda-forge
cycler 0.10.0 py_2 conda-forge
cython 0.29.21 py37hcd2ae1e_2 conda-forge
cytoolz 0.11.0 py37h5e8e339_3 conda-forge
dask 2021.1.1 pyhd8ed1ab_0 conda-forge
dask-core 2021.1.1 pyhd8ed1ab_0 conda-forge
dask-glm 0.2.0 pypi_0 pypi
dask-ml 1.8.0 pypi_0 pypi
dask-sql 0.3.1.dev9+g0bb554d dev_0 <develop>
databases 0.3.2 pypi_0 pypi
dateparser 0.7.6 pypi_0 pypi
dbus 1.13.6 he372182_0 conda-forge
decorator 4.4.2 py_0 conda-forge
defusedxml 0.6.0 py_0 conda-forge
dialite 0.5.3 pypi_0 pypi
diff-match-patch 20200713 pyh9f0ad1d_0 conda-forge
distributed 2021.1.1 py37h89c1867_0 conda-forge
dnspython 2.0.0 pypi_0 pypi
docutils 0.16 py37h89c1867_3 conda-forge
ecdsa 0.15 pypi_0 pypi
email-validator 1.1.1 pypi_0 pypi
entrypoints 0.3 pyhd8ed1ab_1003 conda-forge
et_xmlfile 1.0.1 py_1001 conda-forge
expat 2.2.10 h9c3ff4c_0 conda-forge
fastapi 0.63.0 pypi_0 pypi
fastcache 1.1.0 py37h5e8e339_2 conda-forge
fasteners 0.14.1 py_3 conda-forge
filelock 3.0.12 pyh9f0ad1d_0 conda-forge
flake8 3.7.9 py37hc8dfbb8_1 conda-forge
flask 1.1.2 pyh9f0ad1d_0 conda-forge
flexx 0.8.1 pypi_0 pypi
fontconfig 2.13.1 hba837de_1004 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fribidi 1.0.10 h36c2ea0_0 conda-forge
fs 2.4.12 pypi_0 pypi
fsspec 0.8.5 pyhd8ed1ab_0 conda-forge
fugue 0.5.0 pypi_0 pypi
future 0.18.2 py37h89c1867_3 conda-forge
get_terminal_size 1.0.0 haa9412d_0
gettext 0.19.8.1 hf34092f_1004 conda-forge
gevent 21.1.2 py37h5e8e339_0 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
giflib 5.2.1 h36c2ea0_2 conda-forge
glib 2.58.3 py37he00f558_1004 conda-forge
glob2 0.7 py_0 conda-forge
glog 0.4.0 h49b9bf7_3 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gmpy2 2.1.0b1 py37hcb968a4_1 conda-forge
graphene 2.1.8 pypi_0 pypi
graphite2 1.3.13 h58526e2_1001 conda-forge
graphql-core 2.3.2 pypi_0 pypi
graphql-relay 2.0.1 pypi_0 pypi
greenlet 0.4.17 py37h5e8e339_2 conda-forge
grpc-cpp 1.30.2 heedbac9_0 conda-forge
gst-plugins-base 1.14.5 h0935bb2_2 conda-forge
gstreamer 1.14.5 h36ae1b5_2 conda-forge
h11 0.9.0 pypi_0 pypi
h5py 3.1.0 nompi_py37h1e651dc_100 conda-forge
harfbuzz 2.4.0 h9f30f68_3 conda-forge
hdf5 1.10.6 nompi_h7c3c948_1111 conda-forge
heapdict 1.0.1 py_0 conda-forge
helpdev 0.7.1 pyhd8ed1ab_0 conda-forge
html5lib 1.1 pyh9f0ad1d_0 conda-forge
htslib 1.11 hd3b49d5_2 bioconda
httptools 0.1.1 pypi_0 pypi
icu 64.2 he1b5a44_1 conda-forge
idna 2.8 pypi_0 pypi
imagecodecs 2021.1.11 py37h70f1e17_0 conda-forge
imageio 2.9.0 py_0 conda-forge
imagesize 1.2.0 py_0 conda-forge
importlib-metadata 3.4.0 py37h89c1867_0 conda-forge
importlib_metadata 3.4.0 hd8ed1ab_0 conda-forge
iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge
intake 0.6.0 pypi_0 pypi
intel-openmp 2020.2 254
intervaltree 3.0.2 py_0 conda-forge
ipykernel 5.3.2 py37h43977f1_0 conda-forge
ipython 7.16.1 py37h43977f1_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.6.3 pyhd3deb0d_0 conda-forge
isort 5.7.0 pyhd8ed1ab_0 conda-forge
itsdangerous 1.1.0 py_0 conda-forge
jbig 2.1 h516909a_2002 conda-forge
jdcal 1.4.1 py_0 conda-forge
jedi 0.15.2 py37_0 conda-forge
jeepney 0.6.0 pyhd8ed1ab_0 conda-forge
jinja2 2.11.3 pyh44b312d_0 conda-forge
joblib 1.0.0 pyhd8ed1ab_0 conda-forge
jpeg 9d h36c2ea0_0 conda-forge
jpype1 1.2.1 pypi_0 pypi
json5 0.9.5 pyh9f0ad1d_0 conda-forge
jsonschema 3.2.0 py_2 conda-forge
jupyter 1.0.0 py37h89c1867_6 conda-forge
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_console 6.2.0 py_0 conda-forge
jupyter_core 4.7.1 py37h89c1867_0 conda-forge
jupyter_server 1.2.2 py37h89c1867_1 conda-forge
jupyterlab 3.0.7 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_server 2.1.3 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 1.0.0 pyhd8ed1ab_1 conda-forge
jxrlib 1.1 h7f98852_2 conda-forge
keyring 22.0.1 py37h89c1867_0 conda-forge
kiwisolver 1.3.1 py37h2527ec5_1 conda-forge
krb5 1.17.2 h926e7f8_0 conda-forge
lazy-object-proxy 1.4.3 py37h8f50634_2 conda-forge
lcms2 2.11 hcbb858e_1 conda-forge
ld_impl_linux-64 2.35.1 hea4e1c9_2 conda-forge
lerc 2.2.1 h9c3ff4c_0 conda-forge
libaec 1.0.4 h9c3ff4c_1 conda-forge
libarchive 3.5.1 h899b81a_0 conda-forge
libblas 3.9.0 3_openblas conda-forge
libcblas 3.9.0 3_openblas conda-forge
libclang 9.0.1 default_hde54327_0 conda-forge
libcurl 7.71.1 hcdd3856_8 conda-forge
libdeflate 1.7 h7f98852_5 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libevent 2.1.10 hcdb4288_3 conda-forge
libffi 3.2.1 he1b5a44_1007 conda-forge
libgcc 7.2.0 h69d50b8_2 conda-forge
libgcc-ng 9.3.0 h2828fa1_18 conda-forge
libgfortran-ng 7.5.0 h14aa051_18 conda-forge
libgfortran4 7.5.0 h14aa051_18 conda-forge
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 3_openblas conda-forge
liblapacke 3.9.0 3_openblas conda-forge
liblief 0.10.1 he1b5a44_2 conda-forge
libllvm10 10.0.1 he513fc3_3 conda-forge
libllvm9 9.0.1 hf817b99_2 conda-forge
libnghttp2 1.43.0 h812cca2_0 conda-forge
libopenblas 0.3.12 pthreads_hb3c22a3_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libprotobuf 3.12.4 h8b12597_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libspatialindex 1.9.3 he1b5a44_3 conda-forge
libssh2 1.9.0 hab1572f_5 conda-forge
libstdcxx-ng 9.3.0 h6de172a_18 conda-forge
libthrift 0.13.0 hbe8ec66_6 conda-forge
libtiff 4.2.0 hdc55705_0 conda-forge
libtiledbvcf 0.7.2 hbab4e3b_0 tiledb
libtool 2.4.6 h58526e2_1007 conda-forge
libutf8proc 2.6.1 h7f98852_0 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libuv 1.40.0 h7f98852_0 conda-forge
libwebp-base 1.2.0 h7f98852_0 conda-forge
libxcb 1.13 h7f98852_1003 conda-forge
libxkbcommon 0.10.0 he1b5a44_0 conda-forge
libxml2 2.9.10 hee79883_0 conda-forge
libxslt 1.1.33 h31b3aaa_0 conda-forge
libzopfli 1.0.3 h9c3ff4c_0 conda-forge
llvm-openmp 11.0.1 h4bd325d_0 conda-forge
llvmlite 0.35.0 py37h9d7f4d0_1 conda-forge
locket 0.2.0 py_2 conda-forge
lxml 4.6.2 py37h77fd288_1 conda-forge
lz4-c 1.9.2 he1b5a44_3 conda-forge
lzo 2.10 h516909a_1000 conda-forge
markupsafe 1.1.1 py37h5e8e339_3 conda-forge
matplotlib 3.3.4 py37h89c1867_0 conda-forge
matplotlib-base 3.3.4 py37h0c9df89_0 conda-forge
mccabe 0.6.1 py_1 conda-forge
mistune 0.8.4 py37h5e8e339_1003 conda-forge
mkl 2020.4 h726a3e6_304 conda-forge
mkl-service 2.3.0 py37h8f50634_2 conda-forge
mkl_fft 1.2.0 py37h161383b_1 conda-forge
mkl_random 1.2.0 py37h9fdb41a_1 conda-forge
mock 4.0.3 py37h89c1867_1 conda-forge
monotonic 1.5 py_0 conda-forge
more-itertools 8.6.0 pyhd8ed1ab_0 conda-forge
mpc 1.1.0 h04dde30_1009 conda-forge
mpfr 4.0.2 he80fd80_1 conda-forge
mpmath 1.1.0 py_0 conda-forge
msgpack-asgi 1.0.0 pypi_0 pypi
msgpack-python 1.0.2 py37h2527ec5_1 conda-forge
multipledispatch 0.6.0 py_0 conda-forge
navigator-updater 0.2.1 py37_0
nbclassic 0.2.6 pyhd8ed1ab_0 conda-forge
nbclient 0.5.1 py_0 conda-forge
nbconvert 6.0.7 py37h89c1867_3 conda-forge
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
ncurses 6.2 h58526e2_4 conda-forge
nest-asyncio 1.4.3 pyhd8ed1ab_0 conda-forge
networkx 2.5 py_0 conda-forge
nltk 3.4.4 py_0 conda-forge
nose 1.3.7 py_1006 conda-forge
notebook 6.2.0 py37h89c1867_0 conda-forge
nspr 4.29 h9c3ff4c_1 conda-forge
nss 3.61 hb5efdd6_0 conda-forge
numba 0.52.0 py37hdc94413_0 conda-forge
numcodecs 0.7.3 py37hcd2ae1e_0 conda-forge
numexpr 2.7.2 py37hdc94413_0 conda-forge
numpy 1.19.4 pypi_0 pypi
numpy-base 1.17.0 py37h2f8d375_0 r
numpydoc 1.1.0 py_1 conda-forge
olefile 0.46 pyh9f0ad1d_1 conda-forge
opencv-python 3.4.11.45 pypi_0 pypi
openjpeg 2.4.0 hf7af979_0 conda-forge
openpyxl 3.0.6 pyhd8ed1ab_0 conda-forge
openssl 1.1.1i h7f98852_0 conda-forge
orjson 3.3.0 pypi_0 pypi
packaging 20.8 pyhd3deb0d_0 conda-forge
pandas 1.1.5 pypi_0 pypi
pandoc 2.11.4 h7f98852_0 conda-forge
pandocfilters 1.4.2 py_1 conda-forge
pango 1.42.4 h7062337_4 conda-forge
paramiko 2.7.2 pypi_0 pypi
parquet-cpp 1.5.1 2 conda-forge
parso 0.5.2 py_0
partd 1.1.0 py_0 conda-forge
passlib 1.7.2 pypi_0 pypi
patchelf 0.11 he1b5a44_0 conda-forge
path 15.1.0 py37h89c1867_0 conda-forge
path.py 12.5.0 0 conda-forge
pathlib2 2.3.5 py37h89c1867_3 conda-forge
pathtools 0.1.2 py_1 conda-forge
patsy 0.5.1 py_0 conda-forge
pcre 8.44 he1b5a44_0 conda-forge
pep8 1.7.1 py_0 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 8.1.0 py37he6b4880_1 conda-forge
pip 21.0.1 pyhd8ed1ab_0 conda-forge
pixman 0.38.0 h516909a_1003 conda-forge
pkginfo 1.7.0 pyhd8ed1ab_0 conda-forge
pluggy 0.13.1 py37h89c1867_4 conda-forge
ply 3.11 py_1 conda-forge
pomegranate 0.13.3 py37hc928c03_1 conda-forge
pooch 1.3.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.9.0 pyhd3deb0d_0 conda-forge
promise 2.3 pypi_0 pypi
prompt-toolkit 3.0.14 pyha770c72_0 conda-forge
prompt_toolkit 3.0.14 hd8ed1ab_0 conda-forge
pscript 0.7.4 pypi_0 pypi
psutil 5.8.0 py37h5e8e339_1 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
py 1.10.0 pyhd3deb0d_0 conda-forge
py-lief 0.10.1 py37hb892b2f_2 conda-forge
pyarrow 3.0.0 pypi_0 pypi
pybind11 2.6.2 py37h2527ec5_0 conda-forge
pybind11-global 2.6.2 py37h2527ec5_0 conda-forge
pycodestyle 2.5.0 py37_0
pycosat 0.6.3 py37h5e8e339_1006 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pycrypto 2.6.1 py37hb5d75c8_1005 conda-forge
pycryptodome 3.9.9 pypi_0 pypi
pycryptodomex 3.9.9 pypi_0 pypi
pycurl 7.43.0.6 py37h88a64d2_1 conda-forge
pydantic 1.6.1 pypi_0 pypi
pydocstyle 5.1.1 py_0 conda-forge
pyerfa 1.7.1.1 py37h5e8e339_2 conda-forge
pyflakes 2.1.1 py37_0
pygments 2.7.4 pyhd8ed1ab_0 conda-forge
pylint 2.6.0 py37hc8dfbb8_1 conda-forge
pynacl 1.4.0 pypi_0 pypi
pyodbc 4.0.30 py37hcd2ae1e_1 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyqt 5.12.3 py37h8685d9f_3 conda-forge
pyqt5-sip 4.19.18 pypi_0 pypi
pyqtchart 5.12 pypi_0 pypi
pyqtwebengine 5.12.1 pypi_0 pypi
pyrsistent 0.17.3 py37h5e8e339_2 conda-forge
pysmi 0.3.4 pypi_0 pypi
pysnmp 4.4.6 pypi_0 pypi
pysocks 1.7.1 py37h89c1867_3 conda-forge
pytables 3.6.1 py37h0c4f3e0_3 conda-forge
pytest 6.2.2 pypi_0 pypi
pytest-cov 2.11.1 pypi_0 pypi
python 3.7.6 cpython_h8356626_6 conda-forge
python-dateutil 2.8.1 py_0 conda-forge
python-jose 3.1.0 pypi_0 pypi
python-jsonrpc-server 0.3.4 pyh9f0ad1d_1 conda-forge
python-language-server 0.31.10 py37hc8dfbb8_0 conda-forge
python-libarchive-c 2.9 py37h89c1867_2 conda-forge
python-multipart 0.0.5 pypi_0 pypi
python_abi 3.7 1_cp37m conda-forge
pytz 2021.1 pyhd8ed1ab_0 conda-forge
pywavelets 1.1.1 py37h902c9e0_3 conda-forge
pyxdg 0.26 py_0 conda-forge
pyyaml 5.4.1 py37h5e8e339_0 conda-forge
pyzmq 22.0.1 py37h499b945_0 conda-forge
qdarkstyle 2.8.1 pyhd8ed1ab_2 conda-forge
qpd 0.2.5 pypi_0 pypi
qt 5.12.5 hd8c4c69_1 conda-forge
qtawesome 1.0.2 pyhd8ed1ab_0 conda-forge
qtconsole 5.0.2 pyhd8ed1ab_0 conda-forge
qtpy 1.9.0 py_0 conda-forge
re2 2020.08.01 he1b5a44_1 conda-forge
readline 8.0 he28a2e2_2 conda-forge
regex 2020.10.15 pypi_0 pypi
requests 2.21.0 pypi_0 pypi
ripgrep 12.1.1 h516909a_1 conda-forge
rope 0.18.0 pyh9f0ad1d_0 conda-forge
rtree 0.9.7 py37h0b55af0_1 conda-forge
ruamel_yaml 0.15.80 py37h5e8e339_1004 conda-forge
rx 1.6.1 pypi_0 pypi
samtools 1.7 1 bioconda
scikit-allel 1.3.2 py37h9fdb41a_0 conda-forge
scikit-image 0.18.1 py37hdc94413_0 conda-forge
scikit-learn 0.24.1 py37h69acf81_0 conda-forge
scipy 1.5.3 py37h8911b10_0 conda-forge
seaborn 0.11.1 hd8ed1ab_1 conda-forge
seaborn-base 0.11.1 pyhd8ed1ab_1 conda-forge
secretstorage 3.3.0 py37h89c1867_0 conda-forge
send2trash 1.5.0 py_0 conda-forge
setuptools 49.6.0 py37h89c1867_3 conda-forge
simplegeneric 0.8.1 py_1 conda-forge
singledispatch 3.4.0.3 pyh9f0ad1d_1001 conda-forge
sip 4.19.24 py37hcd2ae1e_3 conda-forge
six 1.15.0 pyh9f0ad1d_0 conda-forge
snappy 1.1.8 he1b5a44_3 conda-forge
sniffio 1.2.0 py37h89c1867_1 conda-forge
snowballstemmer 2.1.0 pyhd8ed1ab_0 conda-forge
sortedcollections 2.1.0 pyhd8ed1ab_0 conda-forge
sortedcontainers 2.3.0 pyhd8ed1ab_0 conda-forge
soupsieve 2.0.1 py_1 conda-forge
sphinx 3.4.3 pyhd8ed1ab_0 conda-forge
sphinxcontrib 1.0 py37_1
sphinxcontrib-applehelp 1.0.2 py_0 conda-forge
sphinxcontrib-devhelp 1.0.2 py_0 conda-forge
sphinxcontrib-htmlhelp 1.0.3 py_0 conda-forge
sphinxcontrib-jsmath 1.0.1 py_0 conda-forge
sphinxcontrib-qthelp 1.0.3 py_0 conda-forge
sphinxcontrib-serializinghtml 1.1.4 py_0 conda-forge
sphinxcontrib-websupport 1.2.4 pyh9f0ad1d_0 conda-forge
spyder 4.1.3 py37hc8dfbb8_0 conda-forge
spyder-kernels 1.9.1 py37hc8dfbb8_1 conda-forge
sqlalchemy 1.3.23 py37h5e8e339_0 conda-forge
sqlite 3.34.0 h74cdb3f_0 conda-forge
starlette 0.13.6 pypi_0 pypi
statsmodels 0.12.2 py37h902c9e0_0 conda-forge
sympy 1.7.1 py37h89c1867_1 conda-forge
tbb 2020.2 h4bd325d_3 conda-forge
tblib 1.6.0 py_0 conda-forge
terminado 0.9.2 py37h89c1867_0 conda-forge
testpath 0.4.4 py_0 conda-forge
threadpoolctl 2.1.0 pyh5ca1d4c_0 conda-forge
thrift-compiler 0.13.0 hbe8ec66_6 conda-forge
thrift-cpp 0.13.0 6 conda-forge
tifffile 2021.2.1 pyhd8ed1ab_0 conda-forge
tiledb 0.8.0 pypi_0 pypi
tiledbvcf-py 0.7.2 py37h93de243_0 tiledb
tk 8.6.10 h21135ba_1 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
toolz 0.11.1 py_0 conda-forge
tornado 6.1 py37h5e8e339_1 conda-forge
tqdm 4.56.0 pyhd8ed1ab_0 conda-forge
traitlets 5.0.5 py_0 conda-forge
triad 0.5.1 pypi_0 pypi
typed-ast 1.4.2 py37h5e8e339_0 conda-forge
typing_extensions 3.7.4.3 py_0 conda-forge
tzlocal 2.1 pypi_0 pypi
ujson 3.0.0 pypi_0 pypi
unicodecsv 0.14.1 py_1 conda-forge
unixodbc 2.3.9 h0e019cf_0 conda-forge
urllib3 1.24.3 pypi_0 pypi
uvicorn 0.11.8 pypi_0 pypi
uvloop 0.14.0 pypi_0 pypi
vcftools 0.1.16 he513fc3_4 bioconda
watchdog 1.0.2 py37h89c1867_1 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
webruntime 0.5.8 pypi_0 pypi
websockets 8.1 pypi_0 pypi
werkzeug 1.0.1 pyh9f0ad1d_0 conda-forge
wheel 0.36.2 pyhd3deb0d_0 conda-forge
widgetsnbextension 3.5.1 py37h89c1867_4 conda-forge
wrapt 1.11.2 py37h8f50634_1 conda-forge
wurlitzer 2.0.1 py37h89c1867_1 conda-forge
xlrd 2.0.1 pyhd8ed1ab_3 conda-forge
xlsxwriter 1.3.7 pyh9f0ad1d_0 conda-forge
xlwt 1.3.0 py_1 conda-forge
xmltodict 0.12.0 py_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h516909a_0 conda-forge
xorg-libsm 1.2.3 h84519dc_1000 conda-forge
xorg-libx11 1.6.12 h516909a_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
yaml 0.2.5 h516909a_0 conda-forge
yapf 0.30.0 pyh9f0ad1d_0 conda-forge
zarr 2.6.1 pyhd8ed1ab_0 conda-forge
zeromq 4.3.4 h9c3ff4c_0 conda-forge
zfp 0.5.5 h9c3ff4c_4 conda-forge
zict 2.0.0 py_0 conda-forge
zipp 3.4.0 py_0 conda-forge
zlib 1.2.11 h516909a_1010 conda-forge
zope 1.0 py37_1
zope.event 4.5.0 pyh9f0ad1d_0 conda-forge
zope.interface 5.2.0 py37h5e8e339_1 conda-forge
zstd 1.4.8 hdf46e1d_0 conda-forge
from tiledb-vcf.
@sauloal Thank you for the information. At this time I'm not able to reproduce the crash directly. My suspicion is that something in your VCF files' headers is causing it, which might be related to the warning you get from htslib in the stats and export: [W::bcf_hdr_check_sanity] GL should be declared as Number=G. Is there any way you could share an example VCF file that produces this error so we can debug it further? If you can't share the VCF publicly, you can email it to us at [email protected] and we are happy to take a look.
Without a reproduction, another quick test would be to downgrade htslib from 1.11 to 1.10:
conda install -c bioconda htslib==1.10
from tiledb-vcf.
Can't install
$ mamba install -f --no-deps -c conda-forge -c bioconda -c tiledb htslib==1.10
Problem: package libtiledbvcf-0.8.0-hbab4e3b_0 requires htslib >=1.11,<1.12.0a0, but none of the providers can be installed
UnsatisfiableError: The following specifications were found to be incompatible with each other:
Output in format: Requested package -> Available versions
Package libgcc-ng conflicts for:
htslib==1.10 -> openssl[version='>=1.1.1a,<1.1.2a'] -> libgcc-ng[version='>=7.2.0|>=9.3.0']
htslib==1.10 -> libgcc-ng[version='>=7.3.0']
Package openssl conflicts for:
htslib==1.10 -> openssl[version='>=1.1.1a,<1.1.2a']
htslib==1.10 -> libcurl[version='>=7.64.1,<8.0a0'] -> openssl[version='>=1.1.1b,<1.1.2a|>=1.1.1c,<1.1.2a|>=1.1.1d,<1.1.2a|>=1.1.1g,<1.1.2a']
Package zlib conflicts for:
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0']
htslib==1.10 -> zlib[version='>=1.2.11,<1.3.0a0']
from tiledb-vcf.
@sauloal If it's not possible to share one of your VCF files, could you check what the VCF format version number is? It might also help to take a look at the header data for one of the files, or at least the line defining the GL field.
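One way to pull out the relevant header line is a small script. The following is only a sketch: the function name and the lookup table are made up for illustration, and it checks only GL and PL, whose expected Number=G comes from the htslib warning and from the PL declaration in this thread.

```python
import re

# Expected Number for per-genotype likelihood fields (assumption: only GL and
# PL are checked here; extend the table for other fields as needed).
EXPECTED_NUMBER = {"GL": "G", "PL": "G"}

FORMAT_RE = re.compile(r"^##FORMAT=<ID=([^,]+),Number=([^,]+),")


def check_format_numbers(header_lines):
    """Return (field, declared, expected) for each mismatching FORMAT line."""
    problems = []
    for line in header_lines:
        match = FORMAT_RE.match(line)
        if not match:
            continue  # not a FORMAT declaration
        field, declared = match.groups()
        expected = EXPECTED_NUMBER.get(field)
        if expected is not None and declared != expected:
            problems.append((field, declared, expected))
    return problems


header = [
    '##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">',
    '##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">',
]
print(check_format_numbers(header))  # [('GL', '3', 'G')]
```

The header lines can come from bcftools view -h on one of the files.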
from tiledb-vcf.
Here is the header of the file.
What is strange is that exporting to TSV works without a problem.
##fileformat=VCFv4.1
##samtoolsVersion=0.1.14 (r933:170)
##INFO=<ID=CI95,Number=2,Type=Float,Description="Equal-tail Bayesian credible interval of the site allele frequency at the 95% level">
##INFO=<ID=RP,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##SnpEffCmd="SnpEff -no-upstream -no-downstream -ud 0 -csvStats Slyc2.40 /home/assembly/tomato150/reseq/mapped/Heinz/RF_104_SZAXPI008751-74.vcf.gz "
##samtoolsVersion=0.1.18 (r982:295)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##SnpEffVersion="3.2 (build 2013-05-23), by Pablo Cingolani"
##SnpEffCmd="SnpEff -no-upstream -no-downstream -ud 0 -csvStats Slyc2.40 /home/assembly/tomato150/reseq/mapped/Heinz/RF_105_SZAXPI009358-45.vcf.gz "
##INFO=<ID=EFF,Number=.,Type=String,Description="Predicted effects for this variant.Format: 'Effect ( Effect_Impact | Functional_Class | Codon_Change | Amino_Acid_change| Amino_Acid_length | Gene_Name | Transcript_BioType | Gene_Coding | Transcript_ID | Exon | GenotypeNum [ | ERRORS | WARNINGS ] )'">
##INFO=<ID=SF,Number=.,Type=String,Description="Source File (index to sourceFiles, f when filtered)">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##SnpEffVersion="4.3t (build 2017-11-24 10:18), by Pablo Cingolani"
##SnpEffCmd="SnpEff S_lycopersicum_v2.50 /vcf-data/merge.vcf.gz "
##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008746-45 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009284-57 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009285-62 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009286-74 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009287-75 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009288-79 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009289-84 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009290-87 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009291-88 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009292-89 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009293-90 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009294-93 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009295-94 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009296-95 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009297-102 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009298-108
/ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009299-109 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009300-113 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009301-123 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009302-129 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009303-133 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009304-136 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009305-140 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009306-142 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009307-158 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009308-166 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009309-169 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009310-62 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009311-74 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009312-75 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009313-79 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009314-84 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009315-87
/ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009316-88 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008747-46 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009317-89 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009318-90 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009319-93 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009320-94 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009321-95 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009322-102 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009323-108 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009324-109 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008748-47 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009326-113 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009327-123 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009328-129 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009329-133 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009330-136 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009331-140
/ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009332-142 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009333-158 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009334-166 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009359-46 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009335-169 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009336-14 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009337-15 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009338-16-2 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009339-17-2 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009340-18 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009341-19 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009342-21 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009343-22-2 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009344-23 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008749-56 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009345-24 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008752-75 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009346-25 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008753-79 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009347-26 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009348-27 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009349-30 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009350-31 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009351-32 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009352-35 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009325-56 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009353-36 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008750-57 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009354-37 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009355-39 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009356-41 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009357-44 /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008751-74 /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009358-45
SL2.50ch00 280 . A C 30.8 . AC1=2;AC=2;AF1=1;AN=2;DP4=0,0,2,0;DP=17;EFF=INTERGENIC(MODIFIER||||||||||1);FQ=-33;MQ=60;SF=50;VDB=0.0198;ANN=C|intergenic_region|MODIFIER|CHR_START-Solyc00g005000.2|CHR_START-gene:Solyc00g005000.2|intergenic_region|CHR_START-gene:Solyc00g005000.2|||n.280A>C|||||| GT:GQ:DP:PL . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . 1/1:10:2:62,6,0 .
. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . .
from tiledb-vcf.
Thanks! It seems like this might be the same issue discussed here. Could you try running bcftools reheader as suggested and then re-ingesting to see if that fixes the export?
If you have vcftools installed, you could also run vcf-validator to make sure there are no other issues with the files.
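For the reheader step, the header edit itself can be scripted. This is a minimal sketch, assuming the only change needed is the GL declaration; fix_gl_number and the file names in the comments are hypothetical.

```python
import re

# Matches the FORMAT/GL declaration and captures the text around its Number value.
GL_RE = re.compile(r"^(##FORMAT=<ID=GL,Number=)[^,]+(,.*)$")


def fix_gl_number(header_text):
    """Rewrite the FORMAT/GL line to Number=G; all other lines pass through unchanged."""
    fixed = [GL_RE.sub(r"\g<1>G\g<2>", line) for line in header_text.splitlines()]
    return "\n".join(fixed) + "\n"


original = '##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">\n'
print(fix_gl_number(original), end="")

# Possible workflow (file names are placeholders):
#   bcftools view -h sample.vcf.gz > header.txt
#   ... run fix_gl_number on header.txt and write fixed_header.txt ...
#   bcftools reheader -h fixed_header.txt -o sample.fixed.vcf.gz sample.vcf.gz
```

Only the declaration changes; the records themselves are not touched, so it is worth re-running the validator afterwards.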
from tiledb-vcf.
I had to split my cVCF into single-sample files, so I get a lot of AN/AC errors, but otherwise the file is fine.
Leading or trailing space in attr_key-attr_value pairs is discouraged:
[Description] [Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ]
INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">
The header tag 'reference' not present. (Not required but highly recommended.)
SL2.50ch00:3235 .. AN is 118, should be 2
SL2.50ch00:3235 .. AC is 118, should be 2
SL2.50ch00:4314 .. AN is 128, should be 2
SL2.50ch00:4314 .. AC is 128, should be 2
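The AN/AC messages are expected after splitting: the INFO values were computed over all samples in the cVCF, while the validator recounts them from the single sample that remains. A rough sketch of that recount follows; allele_counts is a made-up helper, and the 59-sample case is a hypothetical illustration, not taken from the file.

```python
import re


def allele_counts(genotypes, n_alt):
    """Return (AN, AC) for one site from GT strings such as '0/1', '1|1' or './.'.

    AN counts every called allele; AC holds one count per ALT allele (1..n_alt).
    """
    an = 0
    ac = [0] * n_alt
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue  # missing call, not counted
            an += 1
            index = int(allele)
            if index > 0:
                ac[index - 1] += 1
    return an, ac


# One homozygous-ALT sample, as in a single-sample file:
print(allele_counts(["1/1"], 1))       # (2, [2])
# Hypothetical merged site with 59 homozygous-ALT calls:
print(allele_counts(["1/1"] * 59, 1))  # (118, [118])
```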
from tiledb-vcf.
Any luck with bcftools reheader?
from tiledb-vcf.
So, in summary: it solved the problem. I recoded the file with bcftools and fixed the header manually. Not really high-throughput, but it works.
Now I've added just a few files and it half works (at least it gives a different error).
tiledbvcf export --uri 150_debug_cli --verbose --output-format z --sample-names SZAXPI008746-45,SZAXPI009284-57,SZAXPI009285-62,SZAXPI009286-74 --regions SL2.50ch00:1-100000 --output-dir 150_debug_cli_query
Sorted 1 regions in 4.62e-05 seconds.
Allocating 11 fields (17 buffers) of size 63161283 bytes (60.2353MB)
Initialized TileDB query with 1 start_pos ranges,4 samples for contig SL2.50ch00 (contig batch 1/1, sample batch 1/1).
Processed 44 cells in 0.0005062 sec. Reported 44 cells.
[E::bcf_fmt_array] Unexpected type 0
real 0m3.985s
user 0m0.331s
sys 0m0.815s
And it extracts 1 file instead of 4.
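A quick way to see which samples were not written is to compare the requested sample names with the contents of the output directory. This is only a sketch: it assumes each sample is exported to a file named `<sample>.vcf.gz` inside `--output-dir`, which should be checked against the actual output, and `missing_exports` is a hypothetical helper name.

```python
from pathlib import Path


def missing_exports(output_dir, sample_names, suffix=".vcf.gz"):
    """Return the requested samples with no exported file in output_dir.

    Assumption: one file per sample, named <sample><suffix>.
    """
    out = Path(output_dir)
    return [s for s in sample_names if not (out / f"{s}{suffix}").is_file()]
```

For the command above, `missing_exports("150_debug_cli_query", ["SZAXPI008746-45", "SZAXPI009284-57", "SZAXPI009285-62", "SZAXPI009286-74"])` would list the samples whose export never completed.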
from tiledb-vcf.
Thanks for the update. This looks like another VCF format error coming from htslib. If you’re able to share a couple of your files we’d be happy to help track down the issue.
from tiledb-vcf.
So, I've finished adding all 84 samples.
I've consolidated the database and vacuumed it (it took 6 hours to do so).
CONSOLIDATING FRAGMENT META
+ tiledbvcf utils consolidate fragment_meta --uri 150_debug_cli
real 0m4.862s
user 0m0.329s
sys 0m1.496s
+ echo 'CONSOLIDATING FRAGMENT'
CONSOLIDATING FRAGMENT
+ tiledbvcf utils consolidate fragments --uri 150_debug_cli
real 357m27.628s
user 438m14.372s
sys 243m45.570s
+ echo 'VACCUM FRAGMENT META'
VACCUM FRAGMENT META
+ tiledbvcf utils vaccum fragment_meta --uri 150_debug_cli
real 0m1.039s
user 0m0.161s
sys 0m0.117s
+ echo 'VACCUM FRAGMENT'
VACCUM FRAGMENT
+ tiledbvcf utils vaccum fragments --uri 150_debug_cli
real 0m26.796s
user 0m0.452s
sys 0m21.811s
Still the same result:
tiledbvcf export --uri 150_debug_cli --verbose --output-format z --sample-names SZAXPI008746-45,SZAXPI009284-57,SZAXPI009285-62,SZAXPI009286-74 --regions SL2.50ch00:1-500000 --output-dir 150_debug_cli_query
Sorted 1 regions in 3.73e-05 seconds.
Allocating 11 fields (17 buffers) of size 63161283 bytes (60.2353MB)
Initialized TileDB query with 1 start_pos ranges,4 samples for contig SL2.50ch00 (contig batch 1/1, sample batch 1/1).
Processed 95 cells in 0.0003633 sec. Reported 95 cells.
Processed 92 cells in 0.0003643 sec. Reported 92 cells.
Processed 61 cells in 0.0005126 sec. Reported 61 cells.
Processed 97 cells in 0.0005054 sec. Reported 97 cells.
[E::bcf_fmt_array] Unexpected type 0
real 0m21.077s
user 0m1.040s
sys 0m0.312s
Each file ranges from 50 to 500 MB compressed. How can I send you, let's say, 5 of them?
Regards
from tiledb-vcf.
The plot thickens.
Exporting to BCF and TSV works. Just VCF crashes.
from tiledb-vcf.
@sauloal thank you for the continued information. TileDB-VCF relies on htslib for both the BCF and VCF export. We build the in-memory record structure, then pass it to htslib, which writes it to the file in the proper format. TSV export is handled entirely inside TileDB-VCF. It seems that once we get a sample of your VCF files, we'll be able to track down the exact cause and push a fix into htslib to prevent the segfault, and potentially also make some adjustments on our side to help this export succeed.
Each file ranges from 50 to 500 MB compressed. How can I send you, let's say, 5 of them?
If you can upload them to Google Drive or Dropbox, that would work. You can email us at [email protected] with private links. If that isn't an option, we can also give you temporary access to an FTPS site where you could upload them. Lastly, we can also provide a shared S3 bucket where you can upload, if you are an AWS user. Please let us know which you prefer.
I've consolidated the database and vacuumed it (it took 6 hours to do so).
One note here: you don't need to consolidate the fragments. Consolidating the fragment metadata is an important step to reduce the overhead when opening the array. Consolidating the fragments themselves is not needed, and this time-consuming step can be avoided for your testing. Even in general with TileDB-VCF arrays, you should not need to consolidate the fragments in most use cases. TileDB efficiently prunes the fragments that do not intersect a query, so having a large number does not harm the read performance in most cases.
from tiledb-vcf.
@aaronwolen @Shelnutt2
Thanks for your message. I've sent an email with the data.
Regarding the consolidation, I'm investigating using TileDB for a large deployment, so I want to test its speed and reliability. It was mostly curiosity, plus the expectation that I would need to run it after inserting large amounts of data.
I've also noticed that consolidating the fragment metadata reduced the insertion time massively, so I've made my script always do that after each insertion. After that, insertion time remained constant.
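Roughly, that loop looks like the sketch below. It pairs each `tiledbvcf store` call with the `consolidate fragment_meta` command shown earlier in this thread and deliberately leaves out fragment consolidation, following the advice above. The helper name `ingest_commands` and the file names are made up for illustration; the function only builds the command lines and does not run them.

```python
def ingest_commands(uri, vcf_files):
    """Build one store + fragment-metadata consolidation pair per input file."""
    steps = []
    for vcf in vcf_files:
        # Ingest a single-sample file into the dataset.
        steps.append(["tiledbvcf", "store", "--uri", uri, vcf])
        # Keep array-open overhead low so later insertions stay fast.
        steps.append(
            ["tiledbvcf", "utils", "consolidate", "fragment_meta", "--uri", uri]
        )
    return steps
```

Each entry can then be executed in order, for example with `subprocess.run(argv, check=True)`.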
from tiledb-vcf.
@sauloal We've identified the issue and adjusted TileDB-VCF to avoid the problem in htslib. @aaronwolen and I have validated the fix against your sample data. We are wrapping up a few other open pull requests now and will look to cut a release with the fix tomorrow morning. We'll let you know as soon as the conda package is available.
Fix: #263
from tiledb-vcf.
I can confirm it is working and exporting successfully.
Thanks for the great work!
from tiledb-vcf.