Comments (18)

sauloal commented on June 9, 2024

I'm running it now, but I have a cVCF with 85 samples, so it takes several hours to split. Unfortunately, TileDB-VCF still can't handle cVCFs ;-)

from tiledb-vcf.

Shelnutt2 commented on June 9, 2024

@sauloal Thanks for reporting this. It looks like the segfault happens in htslib. Can you tell us which version of htslib you have installed? It looks like you are using conda, so can you paste the output of conda list?

sauloal commented on June 9, 2024

Thanks for the quick reply.

htslib 1.11 hd3b49d5_2 bioconda

The full list is quite long; please find it below.

$ conda list
# packages in environment at /home/saulo/anaconda3:
#
# Name                    Version                   Build  Channel
_anaconda_depends         2020.07                  py37_0
_ipyw_jlab_nb_ext_conf    0.1.0                    py37_0
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20200225.2           he1b5a44_2    conda-forge
adagio                    0.2.2                    pypi_0    pypi
aiofiles                  0.5.0                    pypi_0    pypi
alabaster                 0.7.12                     py_0    conda-forge
anaconda                  custom                   py37_1
anaconda-client           1.7.2                      py_0    conda-forge
anaconda-navigator        1.9.12                   py37_1
anaconda-project          0.9.1              pyhd8ed1ab_0    conda-forge
aniso8601                 7.0.0                    pypi_0    pypi
antlr4-python3-runtime    4.9.1                    pypi_0    pypi
anyio                     2.0.2            py37h89c1867_4    conda-forge
appdirs                   1.4.4              pyh9f0ad1d_0    conda-forge
argh                      0.26.2          pyh9f0ad1d_1002    conda-forge
argon2-cffi               20.1.0           py37h5e8e339_2    conda-forge
arrow-cpp                 1.0.1           py37h1234567_1_cpu    conda-forge
asciitree                 0.3.3                      py_2    conda-forge
asn1crypto                1.4.0              pyh9f0ad1d_0    conda-forge
astroid                   2.4.2            py37hc8dfbb8_1    conda-forge
astropy                   4.2              py37h5e8e339_1    conda-forge
async-exit-stack          1.0.1                    pypi_0    pypi
async_generator           1.10                       py_0    conda-forge
atomicwrites              1.4.0              pyh9f0ad1d_0    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
autopep8                  1.4.4                      py_0
aws-sdk-cpp               1.7.164              hba45d7a_2    conda-forge
babel                     2.9.0              pyhd3deb0d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
backports.shutil_get_terminal_size 1.0.0                      py_3    conda-forge
bcolz                     1.2.1           py37hb3f55d8_1001    conda-forge
bcrypt                    3.1.7                    pypi_0    pypi
beautifulsoup4            4.9.3              pyhb0f4dca_0    conda-forge
bitarray                  1.6.3            py37h5e8e339_0    conda-forge
bkcharts                  0.2                      py37_0
blas                      2.3                    openblas    conda-forge
bleach                    3.3.0              pyh44b312d_0    conda-forge
blosc                     1.21.0               h9c3ff4c_0    conda-forge
bokeh                     2.2.3            py37h89c1867_0    conda-forge
boto                      2.49.0                     py_0    conda-forge
bottleneck                1.3.2            py37h902c9e0_3    conda-forge
brotli                    1.0.9                h9c3ff4c_4    conda-forge
brotlipy                  0.7.0           py37h5e8e339_1001    conda-forge
brunsli                   0.1                  h9c3ff4c_0    conda-forge
brython                   3.8.9                    pypi_0    pypi
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h36c2ea0_0    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cached-property           1.5.1                      py_0    conda-forge
cairo                     1.16.0            hcf35c78_1003    conda-forge
certifi                   2020.12.5        py37h89c1867_1    conda-forge
cffi                      1.14.4           py37h11fe52a_0    conda-forge
chardet                   4.0.0            py37h89c1867_1    conda-forge
charls                    2.2.0                h9c3ff4c_0    conda-forge
ciso8601                  2.1.3                    pypi_0    pypi
click                     7.1.2              pyh9f0ad1d_0    conda-forge
cloudpickle               1.6.0                      py_0    conda-forge
clyent                    1.2.2                      py_1    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
conda                     4.9.2            py37h89c1867_0    conda-forge
conda-build               3.19.2           py37hc8dfbb8_2    conda-forge
conda-env                 2.6.0                         1    conda-forge
conda-package-handling    1.7.2            py37hb5d75c8_0    conda-forge
conda-verify              3.1.1           py37hc8dfbb8_1001    conda-forge
contextlib2               0.6.0.post1                py_0    conda-forge
coverage                  5.4                      pypi_0    pypi
cryptography              3.3.1            py37h7f0c10b_1    conda-forge
curl                      7.71.1               he644dc0_8    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.21          py37hcd2ae1e_2    conda-forge
cytoolz                   0.11.0           py37h5e8e339_3    conda-forge
dask                      2021.1.1           pyhd8ed1ab_0    conda-forge
dask-core                 2021.1.1           pyhd8ed1ab_0    conda-forge
dask-glm                  0.2.0                    pypi_0    pypi
dask-ml                   1.8.0                    pypi_0    pypi
dask-sql                  0.3.1.dev9+g0bb554d           dev_0    <develop>
databases                 0.3.2                    pypi_0    pypi
dateparser                0.7.6                    pypi_0    pypi
dbus                      1.13.6               he372182_0    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
dialite                   0.5.3                    pypi_0    pypi
diff-match-patch          20200713           pyh9f0ad1d_0    conda-forge
distributed               2021.1.1         py37h89c1867_0    conda-forge
dnspython                 2.0.0                    pypi_0    pypi
docutils                  0.16             py37h89c1867_3    conda-forge
ecdsa                     0.15                     pypi_0    pypi
email-validator           1.1.1                    pypi_0    pypi
entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
et_xmlfile                1.0.1                   py_1001    conda-forge
expat                     2.2.10               h9c3ff4c_0    conda-forge
fastapi                   0.63.0                   pypi_0    pypi
fastcache                 1.1.0            py37h5e8e339_2    conda-forge
fasteners                 0.14.1                     py_3    conda-forge
filelock                  3.0.12             pyh9f0ad1d_0    conda-forge
flake8                    3.7.9            py37hc8dfbb8_1    conda-forge
flask                     1.1.2              pyh9f0ad1d_0    conda-forge
flexx                     0.8.1                    pypi_0    pypi
fontconfig                2.13.1            hba837de_1004    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
fs                        2.4.12                   pypi_0    pypi
fsspec                    0.8.5              pyhd8ed1ab_0    conda-forge
fugue                     0.5.0                    pypi_0    pypi
future                    0.18.2           py37h89c1867_3    conda-forge
get_terminal_size         1.0.0                haa9412d_0
gettext                   0.19.8.1          hf34092f_1004    conda-forge
gevent                    21.1.2           py37h5e8e339_0    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
giflib                    5.2.1                h36c2ea0_2    conda-forge
glib                      2.58.3          py37he00f558_1004    conda-forge
glob2                     0.7                        py_0    conda-forge
glog                      0.4.0                h49b9bf7_3    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gmpy2                     2.1.0b1          py37hcb968a4_1    conda-forge
graphene                  2.1.8                    pypi_0    pypi
graphite2                 1.3.13            h58526e2_1001    conda-forge
graphql-core              2.3.2                    pypi_0    pypi
graphql-relay             2.0.1                    pypi_0    pypi
greenlet                  0.4.17           py37h5e8e339_2    conda-forge
grpc-cpp                  1.30.2               heedbac9_0    conda-forge
gst-plugins-base          1.14.5               h0935bb2_2    conda-forge
gstreamer                 1.14.5               h36ae1b5_2    conda-forge
h11                       0.9.0                    pypi_0    pypi
h5py                      3.1.0           nompi_py37h1e651dc_100    conda-forge
harfbuzz                  2.4.0                h9f30f68_3    conda-forge
hdf5                      1.10.6          nompi_h7c3c948_1111    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
helpdev                   0.7.1              pyhd8ed1ab_0    conda-forge
html5lib                  1.1                pyh9f0ad1d_0    conda-forge
htslib                    1.11                 hd3b49d5_2    bioconda
httptools                 0.1.1                    pypi_0    pypi
icu                       64.2                 he1b5a44_1    conda-forge
idna                      2.8                      pypi_0    pypi
imagecodecs               2021.1.11        py37h70f1e17_0    conda-forge
imageio                   2.9.0                      py_0    conda-forge
imagesize                 1.2.0                      py_0    conda-forge
importlib-metadata        3.4.0            py37h89c1867_0    conda-forge
importlib_metadata        3.4.0                hd8ed1ab_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
intake                    0.6.0                    pypi_0    pypi
intel-openmp              2020.2                      254
intervaltree              3.0.2                      py_0    conda-forge
ipykernel                 5.3.2            py37h43977f1_0    conda-forge
ipython                   7.16.1           py37h43977f1_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.6.3              pyhd3deb0d_0    conda-forge
isort                     5.7.0              pyhd8ed1ab_0    conda-forge
itsdangerous              1.1.0                      py_0    conda-forge
jbig                      2.1               h516909a_2002    conda-forge
jdcal                     1.4.1                      py_0    conda-forge
jedi                      0.15.2                   py37_0    conda-forge
jeepney                   0.6.0              pyhd8ed1ab_0    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
joblib                    1.0.0              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
jpype1                    1.2.1                    pypi_0    pypi
json5                     0.9.5              pyh9f0ad1d_0    conda-forge
jsonschema                3.2.0                      py_2    conda-forge
jupyter                   1.0.0            py37h89c1867_6    conda-forge
jupyter_client            6.1.11             pyhd8ed1ab_1    conda-forge
jupyter_console           6.2.0                      py_0    conda-forge
jupyter_core              4.7.1            py37h89c1867_0    conda-forge
jupyter_server            1.2.2            py37h89c1867_1    conda-forge
jupyterlab                3.0.7              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_server         2.1.3              pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
jxrlib                    1.1                  h7f98852_2    conda-forge
keyring                   22.0.1           py37h89c1867_0    conda-forge
kiwisolver                1.3.1            py37h2527ec5_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lazy-object-proxy         1.4.3            py37h8f50634_2    conda-forge
lcms2                     2.11                 hcbb858e_1    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libaec                    1.0.4                h9c3ff4c_1    conda-forge
libarchive                3.5.1                h899b81a_0    conda-forge
libblas                   3.9.0                3_openblas    conda-forge
libcblas                  3.9.0                3_openblas    conda-forge
libclang                  9.0.1           default_hde54327_0    conda-forge
libcurl                   7.71.1               hcdd3856_8    conda-forge
libdeflate                1.7                  h7f98852_5    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               hcdb4288_3    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc                    7.2.0                h69d50b8_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgfortran-ng            7.5.0               h14aa051_18    conda-forge
libgfortran4              7.5.0               h14aa051_18    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                3_openblas    conda-forge
liblapacke                3.9.0                3_openblas    conda-forge
liblief                   0.10.1               he1b5a44_2    conda-forge
libllvm10                 10.0.1               he513fc3_3    conda-forge
libllvm9                  9.0.1                hf817b99_2    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libopenblas               0.3.12          pthreads_hb3c22a3_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libprotobuf               3.12.4               h8b12597_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libspatialindex           1.9.3                he1b5a44_3    conda-forge
libssh2                   1.9.0                hab1572f_5    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
libthrift                 0.13.0               hbe8ec66_6    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libtiledbvcf              0.7.2                hbab4e3b_0    tiledb
libtool                   2.4.6             h58526e2_1007    conda-forge
libutf8proc               2.6.1                h7f98852_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libuv                     1.40.0               h7f98852_0    conda-forge
libwebp-base              1.2.0                h7f98852_0    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxkbcommon              0.10.0               he1b5a44_0    conda-forge
libxml2                   2.9.10               hee79883_0    conda-forge
libxslt                   1.1.33               h31b3aaa_0    conda-forge
libzopfli                 1.0.3                h9c3ff4c_0    conda-forge
llvm-openmp               11.0.1               h4bd325d_0    conda-forge
llvmlite                  0.35.0           py37h9d7f4d0_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
lxml                      4.6.2            py37h77fd288_1    conda-forge
lz4-c                     1.9.2                he1b5a44_3    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
markupsafe                1.1.1            py37h5e8e339_3    conda-forge
matplotlib                3.3.4            py37h89c1867_0    conda-forge
matplotlib-base           3.3.4            py37h0c9df89_0    conda-forge
mccabe                    0.6.1                      py_1    conda-forge
mistune                   0.8.4           py37h5e8e339_1003    conda-forge
mkl                       2020.4             h726a3e6_304    conda-forge
mkl-service               2.3.0            py37h8f50634_2    conda-forge
mkl_fft                   1.2.0            py37h161383b_1    conda-forge
mkl_random                1.2.0            py37h9fdb41a_1    conda-forge
mock                      4.0.3            py37h89c1867_1    conda-forge
monotonic                 1.5                        py_0    conda-forge
more-itertools            8.6.0              pyhd8ed1ab_0    conda-forge
mpc                       1.1.0             h04dde30_1009    conda-forge
mpfr                      4.0.2                he80fd80_1    conda-forge
mpmath                    1.1.0                      py_0    conda-forge
msgpack-asgi              1.0.0                    pypi_0    pypi
msgpack-python            1.0.2            py37h2527ec5_1    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
navigator-updater         0.2.1                    py37_0
nbclassic                 0.2.6              pyhd8ed1ab_0    conda-forge
nbclient                  0.5.1                      py_0    conda-forge
nbconvert                 6.0.7            py37h89c1867_3    conda-forge
nbformat                  5.1.2              pyhd8ed1ab_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nest-asyncio              1.4.3              pyhd8ed1ab_0    conda-forge
networkx                  2.5                        py_0    conda-forge
nltk                      3.4.4                      py_0    conda-forge
nose                      1.3.7                   py_1006    conda-forge
notebook                  6.2.0            py37h89c1867_0    conda-forge
nspr                      4.29                 h9c3ff4c_1    conda-forge
nss                       3.61                 hb5efdd6_0    conda-forge
numba                     0.52.0           py37hdc94413_0    conda-forge
numcodecs                 0.7.3            py37hcd2ae1e_0    conda-forge
numexpr                   2.7.2            py37hdc94413_0    conda-forge
numpy                     1.19.4                   pypi_0    pypi
numpy-base                1.17.0           py37h2f8d375_0    r
numpydoc                  1.1.0                      py_1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
opencv-python             3.4.11.45                pypi_0    pypi
openjpeg                  2.4.0                hf7af979_0    conda-forge
openpyxl                  3.0.6              pyhd8ed1ab_0    conda-forge
openssl                   1.1.1i               h7f98852_0    conda-forge
orjson                    3.3.0                    pypi_0    pypi
packaging                 20.8               pyhd3deb0d_0    conda-forge
pandas                    1.1.5                    pypi_0    pypi
pandoc                    2.11.4               h7f98852_0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
pango                     1.42.4               h7062337_4    conda-forge
paramiko                  2.7.2                    pypi_0    pypi
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.5.2                      py_0
partd                     1.1.0                      py_0    conda-forge
passlib                   1.7.2                    pypi_0    pypi
patchelf                  0.11                 he1b5a44_0    conda-forge
path                      15.1.0           py37h89c1867_0    conda-forge
path.py                   12.5.0                        0    conda-forge
pathlib2                  2.3.5            py37h89c1867_3    conda-forge
pathtools                 0.1.2                      py_1    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pep8                      1.7.1                      py_0    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.1.0            py37he6b4880_1    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
pixman                    0.38.0            h516909a_1003    conda-forge
pkginfo                   1.7.0              pyhd8ed1ab_0    conda-forge
pluggy                    0.13.1           py37h89c1867_4    conda-forge
ply                       3.11                       py_1    conda-forge
pomegranate               0.13.3           py37hc928c03_1    conda-forge
pooch                     1.3.0              pyhd8ed1ab_0    conda-forge
prometheus_client         0.9.0              pyhd3deb0d_0    conda-forge
promise                   2.3                      pypi_0    pypi
prompt-toolkit            3.0.14             pyha770c72_0    conda-forge
prompt_toolkit            3.0.14               hd8ed1ab_0    conda-forge
pscript                   0.7.4                    pypi_0    pypi
psutil                    5.8.0            py37h5e8e339_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
py-lief                   0.10.1           py37hb892b2f_2    conda-forge
pyarrow                   3.0.0                    pypi_0    pypi
pybind11                  2.6.2            py37h2527ec5_0    conda-forge
pybind11-global           2.6.2            py37h2527ec5_0    conda-forge
pycodestyle               2.5.0                    py37_0
pycosat                   0.6.3           py37h5e8e339_1006    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pycrypto                  2.6.1           py37hb5d75c8_1005    conda-forge
pycryptodome              3.9.9                    pypi_0    pypi
pycryptodomex             3.9.9                    pypi_0    pypi
pycurl                    7.43.0.6         py37h88a64d2_1    conda-forge
pydantic                  1.6.1                    pypi_0    pypi
pydocstyle                5.1.1                      py_0    conda-forge
pyerfa                    1.7.1.1          py37h5e8e339_2    conda-forge
pyflakes                  2.1.1                    py37_0
pygments                  2.7.4              pyhd8ed1ab_0    conda-forge
pylint                    2.6.0            py37hc8dfbb8_1    conda-forge
pynacl                    1.4.0                    pypi_0    pypi
pyodbc                    4.0.30           py37hcd2ae1e_1    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.12.3           py37h8685d9f_3    conda-forge
pyqt5-sip                 4.19.18                  pypi_0    pypi
pyqtchart                 5.12                     pypi_0    pypi
pyqtwebengine             5.12.1                   pypi_0    pypi
pyrsistent                0.17.3           py37h5e8e339_2    conda-forge
pysmi                     0.3.4                    pypi_0    pypi
pysnmp                    4.4.6                    pypi_0    pypi
pysocks                   1.7.1            py37h89c1867_3    conda-forge
pytables                  3.6.1            py37h0c4f3e0_3    conda-forge
pytest                    6.2.2                    pypi_0    pypi
pytest-cov                2.11.1                   pypi_0    pypi
python                    3.7.6           cpython_h8356626_6    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-jose               3.1.0                    pypi_0    pypi
python-jsonrpc-server     0.3.4              pyh9f0ad1d_1    conda-forge
python-language-server    0.31.10          py37hc8dfbb8_0    conda-forge
python-libarchive-c       2.9              py37h89c1867_2    conda-forge
python-multipart          0.0.5                    pypi_0    pypi
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pywavelets                1.1.1            py37h902c9e0_3    conda-forge
pyxdg                     0.26                       py_0    conda-forge
pyyaml                    5.4.1            py37h5e8e339_0    conda-forge
pyzmq                     22.0.1           py37h499b945_0    conda-forge
qdarkstyle                2.8.1              pyhd8ed1ab_2    conda-forge
qpd                       0.2.5                    pypi_0    pypi
qt                        5.12.5               hd8c4c69_1    conda-forge
qtawesome                 1.0.2              pyhd8ed1ab_0    conda-forge
qtconsole                 5.0.2              pyhd8ed1ab_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
re2                       2020.08.01           he1b5a44_1    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
regex                     2020.10.15               pypi_0    pypi
requests                  2.21.0                   pypi_0    pypi
ripgrep                   12.1.1               h516909a_1    conda-forge
rope                      0.18.0             pyh9f0ad1d_0    conda-forge
rtree                     0.9.7            py37h0b55af0_1    conda-forge
ruamel_yaml               0.15.80         py37h5e8e339_1004    conda-forge
rx                        1.6.1                    pypi_0    pypi
samtools                  1.7                           1    bioconda
scikit-allel              1.3.2            py37h9fdb41a_0    conda-forge
scikit-image              0.18.1           py37hdc94413_0    conda-forge
scikit-learn              0.24.1           py37h69acf81_0    conda-forge
scipy                     1.5.3            py37h8911b10_0    conda-forge
seaborn                   0.11.1               hd8ed1ab_1    conda-forge
seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
secretstorage             3.3.0            py37h89c1867_0    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setuptools                49.6.0           py37h89c1867_3    conda-forge
simplegeneric             0.8.1                      py_1    conda-forge
singledispatch            3.4.0.3         pyh9f0ad1d_1001    conda-forge
sip                       4.19.24          py37hcd2ae1e_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
sniffio                   1.2.0            py37h89c1867_1    conda-forge
snowballstemmer           2.1.0              pyhd8ed1ab_0    conda-forge
sortedcollections         2.1.0              pyhd8ed1ab_0    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.0.1                      py_1    conda-forge
sphinx                    3.4.3              pyhd8ed1ab_0    conda-forge
sphinxcontrib             1.0                      py37_1
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    1.0.3                      py_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.4                      py_0    conda-forge
sphinxcontrib-websupport  1.2.4              pyh9f0ad1d_0    conda-forge
spyder                    4.1.3            py37hc8dfbb8_0    conda-forge
spyder-kernels            1.9.1            py37hc8dfbb8_1    conda-forge
sqlalchemy                1.3.23           py37h5e8e339_0    conda-forge
sqlite                    3.34.0               h74cdb3f_0    conda-forge
starlette                 0.13.6                   pypi_0    pypi
statsmodels               0.12.2           py37h902c9e0_0    conda-forge
sympy                     1.7.1            py37h89c1867_1    conda-forge
tbb                       2020.2               h4bd325d_3    conda-forge
tblib                     1.6.0                      py_0    conda-forge
terminado                 0.9.2            py37h89c1867_0    conda-forge
testpath                  0.4.4                      py_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
thrift-compiler           0.13.0               hbe8ec66_6    conda-forge
thrift-cpp                0.13.0                        6    conda-forge
tifffile                  2021.2.1           pyhd8ed1ab_0    conda-forge
tiledb                    0.8.0                    pypi_0    pypi
tiledbvcf-py              0.7.2            py37h93de243_0    tiledb
tk                        8.6.10               h21135ba_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
toolz                     0.11.1                     py_0    conda-forge
tornado                   6.1              py37h5e8e339_1    conda-forge
tqdm                      4.56.0             pyhd8ed1ab_0    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
triad                     0.5.1                    pypi_0    pypi
typed-ast                 1.4.2            py37h5e8e339_0    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
tzlocal                   2.1                      pypi_0    pypi
ujson                     3.0.0                    pypi_0    pypi
unicodecsv                0.14.1                     py_1    conda-forge
unixodbc                  2.3.9                h0e019cf_0    conda-forge
urllib3                   1.24.3                   pypi_0    pypi
uvicorn                   0.11.8                   pypi_0    pypi
uvloop                    0.14.0                   pypi_0    pypi
vcftools                  0.1.16               he513fc3_4    bioconda
watchdog                  1.0.2            py37h89c1867_1    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
webruntime                0.5.8                    pypi_0    pypi
websockets                8.1                      pypi_0    pypi
werkzeug                  1.0.1              pyh9f0ad1d_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
widgetsnbextension        3.5.1            py37h89c1867_4    conda-forge
wrapt                     1.11.2           py37h8f50634_1    conda-forge
wurlitzer                 2.0.1            py37h89c1867_1    conda-forge
xlrd                      2.0.1              pyhd8ed1ab_3    conda-forge
xlsxwriter                1.3.7              pyh9f0ad1d_0    conda-forge
xlwt                      1.3.0                      py_1    conda-forge
xmltodict                 0.12.0                     py_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.12               h516909a_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
yapf                      0.30.0             pyh9f0ad1d_0    conda-forge
zarr                      2.6.1              pyhd8ed1ab_0    conda-forge
zeromq                    4.3.4                h9c3ff4c_0    conda-forge
zfp                       0.5.5                h9c3ff4c_4    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.4.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zope                      1.0                      py37_1
zope.event                4.5.0              pyh9f0ad1d_0    conda-forge
zope.interface            5.2.0            py37h5e8e339_1    conda-forge
zstd                      1.4.8                hdf46e1d_0    conda-forge

from tiledb-vcf.

Shelnutt2 avatar Shelnutt2 commented on June 9, 2024

@sauloal Thank you for the information. At this time I'm not able to reproduce the crash directly. My suspicion is that something in your VCF file headers is triggering it, which might be related to the warning you get from htslib in the stats and export: [W::bcf_hdr_check_sanity] GL should be declared as Number=G. Is there any way you could share an example VCF file that produces this error so we can debug it further? If you can't share the VCF publicly, you can email us at [email protected] and we'd be happy to take a look.

Without a reproduction, another quick test would be to downgrade htslib from 1.11 to 1.10:
conda install -c bioconda htslib==1.10

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

I can't install it:

$ mamba install -f --no-deps -c conda-forge -c bioconda -c tiledb htslib==1.10

Problem: package libtiledbvcf-0.8.0-hbab4e3b_0 requires htslib >=1.11,<1.12.0a0, but none of the providers can be installed
UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package libgcc-ng conflicts for:
htslib==1.10 -> openssl[version='>=1.1.1a,<1.1.2a'] -> libgcc-ng[version='>=7.2.0|>=9.3.0']
htslib==1.10 -> libgcc-ng[version='>=7.3.0']

Package openssl conflicts for:
htslib==1.10 -> openssl[version='>=1.1.1a,<1.1.2a']
htslib==1.10 -> libcurl[version='>=7.64.1,<8.0a0'] -> openssl[version='>=1.1.1b,<1.1.2a|>=1.1.1c,<1.1.2a|>=1.1.1d,<1.1.2a|>=1.1.1g,<1.1.2a']

Package zlib conflicts for:
python=3.8 -> zlib[version='>=1.2.11,<1.3.0a0']
htslib==1.10 -> zlib[version='>=1.2.11,<1.3.0a0']

from tiledb-vcf.

aaronwolen avatar aaronwolen commented on June 9, 2024

@sauloal if it's not possible to share one of your VCF files could you check to see what the VCF format version number is? It also might help to take a look at the header data for one of the files, or at least the line defining the GL field.
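
As a minimal sketch of that check, the snippet below reads only the header of a plain or gzip/bgzip-compressed VCF and reports the ##fileformat value and the ##FORMAT definition of GL. The function names `read_vcf_header` and `summarize_header` are hypothetical helpers written for this illustration, not part of TileDB-VCF or htslib.

```python
import gzip


def read_vcf_header(path):
    """Return the header lines (those starting with '#') of a VCF file.

    Files ending in .gz are opened with gzip, which also reads bgzip output.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    header = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first data record reached, the header is over
            header.append(line.rstrip("\n"))
    return header


def summarize_header(header, field="GL"):
    """Return (fileformat value, ##FORMAT line for `field`), None if absent."""
    fileformat = None
    definition = None
    for line in header:
        if line.startswith("##fileformat="):
            fileformat = line.split("=", 1)[1]
        elif line.startswith(f"##FORMAT=<ID={field},"):
            definition = line
    return fileformat, definition
```

Calling `summarize_header(read_vcf_header("sample.vcf.gz"))` (with a placeholder file name) would return the two values asked for above without loading any records.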

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

Here is the header of the file.

What is strange is that exporting to TSV works without a problem.

##fileformat=VCFv4.1
##samtoolsVersion=0.1.14 (r933:170)
##INFO=<ID=CI95,Number=2,Type=Float,Description="Equal-tail Bayesian credible interval of the site allele frequency at the 95% level">
##INFO=<ID=RP,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##SnpEffCmd="SnpEff  -no-upstream -no-downstream -ud 0 -csvStats Slyc2.40 /home/assembly/tomato150/reseq/mapped/Heinz/RF_104_SZAXPI008751-74.vcf.gz "
##samtoolsVersion=0.1.18 (r982:295)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##SnpEffVersion="3.2 (build 2013-05-23), by Pablo Cingolani"
##SnpEffCmd="SnpEff  -no-upstream -no-downstream -ud 0 -csvStats Slyc2.40 /home/assembly/tomato150/reseq/mapped/Heinz/RF_105_SZAXPI009358-45.vcf.gz "
##INFO=<ID=EFF,Number=.,Type=String,Description="Predicted effects for this variant.Format: 'Effect ( Effect_Impact | Functional_Class | Codon_Change | Amino_Acid_change| Amino_Acid_length | Gene_Name | Transcript_BioType | Gene_Coding | Transcript_ID | Exon  | GenotypeNum [ | ERRORS | WARNINGS ] )'">
##INFO=<ID=SF,Number=.,Type=String,Description="Source File (index to sourceFiles, f when filtered)">
##INFO=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##SnpEffVersion="4.3t (build 2017-11-24 10:18), by Pablo Cingolani"
##SnpEffCmd="SnpEff  S_lycopersicum_v2.50 /vcf-data/merge.vcf.gz "
##INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">
##INFO=<ID=LOF,Number=.,Type=String,Description="Predicted loss of function effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
##INFO=<ID=NMD,Number=.,Type=String,Description="Predicted nonsense mediated decay effects for this variant. Format: 'Gene_Name | Gene_ID | Number_of_transcripts_in_gene | Percent_of_transcripts_affected'">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008746-45  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009284-57        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009285-62        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009286-74        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009287-75        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009288-79        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009289-84        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009290-87        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009291-88        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009292-89        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009293-90        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009294-93        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009295-94        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009296-95        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009297-102       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009298-108
       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009299-109       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009300-113       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009301-123       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009302-129       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009303-133       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009304-136       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009305-140       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009306-142       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009307-158       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009308-166       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009309-169       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009310-62        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009311-74        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009312-75        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009313-79        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009314-84        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009315-87
        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009316-88        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008747-46  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009317-89        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009318-90        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009319-93        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009320-94        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009321-95        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009322-102       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009323-108       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009324-109       /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008748-47  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009326-113       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009327-123       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009328-129       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009329-133       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009330-136       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009331-140
       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009332-142       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009333-158       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009334-166       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009359-46        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009335-169       /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009336-14        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009337-15        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009338-16-2      /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009339-17-2      /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009340-18        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009341-19        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009342-21        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009343-22-2      /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009344-23        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008749-56  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009345-24        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008752-75  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009346-25        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008753-79  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009347-26        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009348-27        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009349-30        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009350-31        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009351-32        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009352-35        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009325-56        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009353-36        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008750-57  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009354-37        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009355-39        /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009356-41        
/ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009357-44        /panfs/ANIMAL/group001/minjiumeng/tomato_reseq/SZAXPI008751-74  /ifshk5/PC_PA_EU/PMO/Tomato_reseq/01.BWA/SZAXPI009358-45
SL2.50ch00      280     .       A       C       30.8    .       AC1=2;AC=2;AF1=1;AN=2;DP4=0,0,2,0;DP=17;EFF=INTERGENIC(MODIFIER||||||||||1);FQ=-33;MQ=60;SF=50;VDB=0.0198;ANN=C|intergenic_region|MODIFIER|CHR_START-Solyc00g005000.2|CHR_START-gene:Solyc00g005000.2|intergenic_region|CHR_START-gene:Solyc00g005000.2|||n.280A>C||||||        GT:GQ:DP:PL     .       .       .       .       .       .       .       .       .       .       .       .       .       .       .
       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .
       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       1/1:10:2:62,6,0 .
       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .       .
       .       .       .       .       .       .       .       .       .       .       .       .       .

from tiledb-vcf.

aaronwolen avatar aaronwolen commented on June 9, 2024

Thanks! It seems like this might be the same issue discussed here. Could you try running bcftools reheader as suggested and then re-ingesting to see if that fixes the export?

If you have vcftools installed, you could also run vcf-validator to make sure there are no other issues with the files.
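
For illustration, here is a rough sketch of how the header fix could be scripted, assuming the only problem is that GL is declared with Number=3 where htslib expects Number=G. `fix_gl_number` is a hypothetical helper; it only rewrites header text and leaves every other line untouched.

```python
import re

# Matches the ##FORMAT definition of GL and captures what surrounds the Number value.
GL_LINE = re.compile(r"^(##FORMAT=<ID=GL,Number=)[^,]+(,.*)$")


def fix_gl_number(header_text):
    """Return header_text with the GL FORMAT line declared as Number=G."""
    fixed = [GL_LINE.sub(r"\g<1>G\g<2>", line) for line in header_text.splitlines()]
    return "\n".join(fixed) + "\n"
```

The header can be dumped with `bcftools view -h`, passed through this function, and the result saved to a file that is handed to `bcftools reheader -h`. This is a sketch under those assumptions and has not been validated against the files in this thread.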

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

I had to split my cVCF into single-sample files, so I get a lot of AN/AC errors, but otherwise the file is fine.

Leading or trailing space in attr_key-attr_value pairs is discouraged:
        [Description] [Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ]
        INFO=<ID=ANN,Number=.,Type=String,Description="Functional annotations: 'Allele | Annotation | Annotation_Impact | Gene_Name | Gene_ID | Feature_Type | Feature_ID | Transcript_BioType | Rank | HGVS.c | HGVS.p | cDNA.pos / cDNA.length | CDS.pos / CDS.length | AA.pos / AA.length | Distance | ERRORS / WARNINGS / INFO' ">
The header tag 'reference' not present. (Not required but highly recommended.)
SL2.50ch00:3235 .. AN is 118, should be 2
SL2.50ch00:3235 .. AC is 118, should be 2
SL2.50ch00:4314 .. AN is 128, should be 2
SL2.50ch00:4314 .. AC is 128, should be 2
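
The AN/AC messages are what one would expect when INFO values computed over the whole cohort are carried into a single-sample file unchanged. A minimal sketch of how the counts can be derived from the genotypes that remain, with `allele_counts` as a hypothetical helper:

```python
def allele_counts(genotypes, n_alt):
    """Return (AC, AN) for one record.

    genotypes: GT strings such as '0/1', '1|1', './.' or '.'
    n_alt:     number of ALT alleles in the record
    AC is a list with one count per ALT allele; AN counts all called alleles.
    """
    ac = [0] * n_alt
    an = 0
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele in (".", ""):
                continue  # missing allele: not counted in AN or AC
            an += 1
            index = int(allele)
            if index > 0:
                ac[index - 1] += 1
    return ac, an
```

For a single sample called 1/1 this gives AC=[2] and AN=2, which matches the "should be 2" in the validator output. bcftools also ships a fill-tags plugin that can recompute these tags, which is probably more practical for large files.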

from tiledb-vcf.

aaronwolen avatar aaronwolen commented on June 9, 2024

Any luck with bcftools reheader?

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

In summary, it solved the problem: I re-encoded the file with bcftools and fixed the header manually. Not really high-throughput, but it works.

Now I've added just a few files and it half works (at least it gives a different error).

tiledbvcf export --uri 150_debug_cli --verbose --output-format z --sample-names SZAXPI008746-45,SZAXPI009284-57,SZAXPI009285-62,SZAXPI009286-74 --regions SL2.50ch00:1-100000 --output-dir 150_debug_cli_query
Sorted 1 regions in 4.62e-05 seconds.
Allocating 11 fields (17 buffers) of size 63161283 bytes (60.2353MB)
Initialized TileDB query with 1 start_pos ranges,4 samples for contig SL2.50ch00 (contig batch 1/1, sample batch 1/1).
Processed 44 cells in 0.0005062 sec. Reported 44 cells.
[E::bcf_fmt_array] Unexpected type 0

real    0m3.985s
user    0m0.331s
sys     0m0.815s

And it extracts 1 file instead of 4.

from tiledb-vcf.

aaronwolen avatar aaronwolen commented on June 9, 2024

Thanks for the update. This looks like another VCF format error coming from htslib. If you’re able to share a couple of your files we’d be happy to help track down the issue.

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

So, I've finished adding all 84 samples.

I've consolidated the database and vacuumed it (it took 6 hours).

CONSOLIDATING FRAGMENT META
+ tiledbvcf utils consolidate fragment_meta --uri 150_debug_cli

real    0m4.862s
user    0m0.329s
sys     0m1.496s
+ echo 'CONSOLIDATING FRAGMENT'
CONSOLIDATING FRAGMENT
+ tiledbvcf utils consolidate fragments --uri 150_debug_cli


real    357m27.628s
user    438m14.372s
sys     243m45.570s
+ echo 'VACCUM FRAGMENT META'
VACCUM FRAGMENT META
+ tiledbvcf utils vaccum fragment_meta --uri 150_debug_cli

real    0m1.039s
user    0m0.161s
sys     0m0.117s
+ echo 'VACCUM FRAGMENT'
VACCUM FRAGMENT
+ tiledbvcf utils vaccum fragments --uri 150_debug_cli

real    0m26.796s
user    0m0.452s
sys     0m21.811s

Still the same result:

tiledbvcf export --uri 150_debug_cli --verbose --output-format z --sample-names SZAXPI008746-45,SZAXPI009284-57,SZAXPI009285-62,SZAXPI009286-74 --regions SL2.50ch00:1-500000 --output-dir 150_debug_cli_query
Sorted 1 regions in 3.73e-05 seconds.
Allocating 11 fields (17 buffers) of size 63161283 bytes (60.2353MB)
Initialized TileDB query with 1 start_pos ranges,4 samples for contig SL2.50ch00 (contig batch 1/1, sample batch 1/1).
Processed 95 cells in 0.0003633 sec. Reported 95 cells.
Processed 92 cells in 0.0003643 sec. Reported 92 cells.
Processed 61 cells in 0.0005126 sec. Reported 61 cells.
Processed 97 cells in 0.0005054 sec. Reported 97 cells.
[E::bcf_fmt_array] Unexpected type 0

real    0m21.077s
user    0m1.040s
sys     0m0.312s

Each file ranges from 50 to 500 MB compressed. How can I send you, let's say, 5 of them?

Regards

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

The plot thickens.
Exporting to BCF and TSV works. Just VCF crashes.

from tiledb-vcf.

Shelnutt2 avatar Shelnutt2 commented on June 9, 2024

@sauloal thank you for the continued information. TileDB-VCF relies on htslib for both the BCF and VCF export: we build the in-memory record structure, then pass it to htslib to write it to the file in the proper format. TSV export is handled entirely inside TileDB-VCF. Once we get a sample of your VCF files, we should be able to track down the exact cause, push a fix into htslib to prevent the segfault, and potentially also make some adjustments on our side to help this export succeed.

Each file ranges from 50 to 500 MB compressed. How can I send you, let's say, 5 of them?

If you can upload them to Google Drive or Dropbox, that would work; you can email us at [email protected] with private links. If that isn't an option, we can also give you temporary access to an FTPS site where you could upload them. Lastly, if you are an AWS user, we can provide a shared S3 bucket to upload to. Please let us know which you prefer.

I've consolidated the database and vacuumed it (it took 6 hours).

One note here: you don't need to consolidate the fragments. Consolidating the fragment metadata is an important step to reduce the overhead when opening the array. Consolidating the fragments themselves is not needed, and you can skip this time-consuming step for your testing. Even in general with TileDB-VCF arrays, you should not need to consolidate the fragments in most use cases. TileDB efficiently prunes the fragments that do not intersect a query, so having a large number of them does not harm read performance in most cases.
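
As a rough sketch of that advice, a post-ingestion step could run only the fragment-metadata consolidation and skip the fragments themselves. The command line is the one shown earlier in this thread; the wrapper functions `fragment_meta_cmd` and `run` are hypothetical, and the dry-run default is only there so the command can be inspected before anything is executed.

```python
import shlex
import subprocess


def fragment_meta_cmd(uri):
    """Build the CLI call that consolidates fragment metadata only."""
    return ["tiledbvcf", "utils", "consolidate", "fragment_meta", "--uri", uri]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return None
    # Raises CalledProcessError if the CLI exits with a non-zero status.
    return subprocess.run(cmd, check=True)
```

For example, `run(fragment_meta_cmd("150_debug_cli"), dry_run=False)` would be called after each batch of samples is stored.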

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

@aaronwolen @Shelnutt2
Thanks for your message. I've sent an email with the data.

Regarding the consolidation, I'm investigating using TileDB for a large deployment, so I want to test its speed and reliability. It was mostly curiosity, plus the expectation that I'll need to run it after inserting large amounts of data.

I've also noticed that consolidating the fragment metadata reduced the insertion time massively, so I've made my script always do that after each insertion. After that, the insertion time remained constant.

from tiledb-vcf.

Shelnutt2 avatar Shelnutt2 commented on June 9, 2024

@sauloal We've identified the issue and adjusted TileDB-VCF to avoid the problem in htslib. @aaronwolen and I have validated the fix against your sample data. We are wrapping up a few other open pull requests now and will look to cut a release with the fix tomorrow morning. We'll let you know as soon as the conda package is available.

Fix: #263

from tiledb-vcf.

sauloal avatar sauloal commented on June 9, 2024

@aaronwolen @Shelnutt2

I can confirm it is working and exporting successfully.

Thanks for the great work!

from tiledb-vcf.
