google / deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.

License: BSD 3-Clause "New" or "Revised" License


deepvariant's Introduction

See the release announcements blog for updates.

DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.

DeepVariant supports germline variant-calling in diploid organisms.

Please also note:

  • For somatic data or any other samples where the genotypes go beyond two copies of DNA, DeepVariant will not work out of the box because the only genotypes supported are hom-alt, het, and hom-ref.
  • The models included with DeepVariant are only trained on human data. For other organisms, see the blog post on non-human variant-calling for some possible pitfalls and how to handle them.

DeepTrio

DeepTrio is a deep learning-based trio variant caller built on top of DeepVariant. DeepTrio extends DeepVariant's functionality, allowing it to utilize the power of neural networks to predict genomic variants in trios or duos. See this page for more details and instructions on how to run DeepTrio.

DeepTrio supports germline variant-calling in diploid organisms; see the DeepTrio documentation for the supported input data types.

Please also note:

  • All DeepTrio models were trained on human data.
  • It is possible to use DeepTrio with only two samples (a child and one parent).
  • The external tool GLnexus is used to merge output VCFs.

How to run DeepVariant

We recommend using our Docker solution. The command will look like this:

BIN_VERSION="1.6.0"
docker run \
  -v "YOUR_INPUT_DIR":"/input" \
  -v "YOUR_OUTPUT_DIR":"/output" \
  google/deepvariant:"${BIN_VERSION}" \
  /opt/deepvariant/bin/run_deepvariant \
  --model_type=WGS \ **Replace this string with exactly one of the following: [WGS,WES,PACBIO,ONT_R104,HYBRID_PACBIO_ILLUMINA]**
  --ref=/input/YOUR_REF \
  --reads=/input/YOUR_BAM \
  --output_vcf=/output/YOUR_OUTPUT_VCF \
  --output_gvcf=/output/YOUR_OUTPUT_GVCF \
  --num_shards=$(nproc) \ **This uses all available cores to run make_examples. Feel free to change.**
  --logging_dir=/output/logs \ **Optional. This saves the log output for each stage separately.**
  --haploid_contigs="chrX,chrY" \ **Optional. Heterozygous variants in these contigs will be re-genotyped as the most likely of reference or homozygous alternates. For a sample with karyotype XY, set this to "chrX,chrY" for GRCh38 and "X,Y" for GRCh37. For a sample with karyotype XX, this flag should not be used.**
  --par_regions_bed="/input/GRCh3X_par.bed" \ **Optional. If --haploid_contigs is set, this provides PAR regions to be excluded from genotype adjustment. Download links for these files are given below.**
  --dry_run=false **Default is false. If set to true, commands are printed but not executed.**

For details on X,Y support, please see DeepVariant haploid support and the case study in DeepVariant X, Y case study. You can download the PAR bed files from here: GRCh38_par.bed, GRCh37_par.bed.

To see all flags you can use, run: docker run google/deepvariant:"${BIN_VERSION}"

If you're using GPUs, or want to use Singularity instead, see Quick Start for more details or see all the setup options available.


How to cite

If you're using DeepVariant in your work, please cite:

A universal SNP and small-indel variant caller using deep neural networks. Nature Biotechnology 36, 983–987 (2018).
Ryan Poplin, Pi-Chuan Chang, David Alexander, Scott Schwartz, Thomas Colthurst, Alexander Ku, Dan Newburger, Jojo Dijamco, Nam Nguyen, Pegah T. Afshar, Sam S. Gross, Lizzie Dorfman, Cory Y. McLean, and Mark A. DePristo.
doi: https://doi.org/10.1038/nbt.4235

Additionally, if you are generating multi-sample calls using our DeepVariant and GLnexus Best Practices, please cite:

Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics (2021).
Taedong Yun, Helen Li, Pi-Chuan Chang, Michael F. Lin, Andrew Carroll, and Cory Y. McLean.
doi: https://doi.org/10.1093/bioinformatics/btaa1081

Why Use DeepVariant?

  • High accuracy - DeepVariant won the 2020 PrecisionFDA Truth Challenge V2 for All Benchmark Regions in the ONT, PacBio, and Multiple Technologies categories, and the 2016 PrecisionFDA Truth Challenge for best SNP performance. DeepVariant maintains high accuracy across data from different sequencing technologies, prep methods, and species. At lower coverage, DeepVariant makes an especially large difference. See metrics for the latest accuracy numbers on each of the sequencing types.
  • Flexibility - Out-of-the-box use for PCR-positive samples and low quality sequencing runs, and easy adjustments for different sequencing technologies and non-human species.
  • Ease of use - No filtering is needed beyond setting your preferred minimum quality threshold.
  • Cost effectiveness - With a single non-preemptible n1-standard-16 machine on Google Cloud, it costs ~$11.8 to call a 30x whole genome and ~$0.89 to call an exome. With preemptible pricing, the cost is $2.84 for a 30x whole genome and $0.21 for whole exome (not considering preemption).
  • Speed - See metrics for the runtime of all supported datatypes on a 64-core CPU-only machine. Multiple options for acceleration exist.
  • Usage options - DeepVariant can be run via Docker or binaries, either on-premises or in the cloud, with support for hardware accelerators like GPUs and TPUs.

(1): Time estimates do not include mapping.
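The ease-of-use point above boils down to a single QUAL cutoff on the output VCF. Here is a minimal sketch of such a filter with awk, run against a tiny inline VCF; the threshold of 20 and the two records are purely illustrative, and on real output you would more likely use a dedicated tool such as bcftools:

```shell
# Build a two-record VCF on stdin, then keep header lines plus records
# whose QUAL (column 6) is at least 20.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr1\t100\t.\tA\tG\t35.2\tPASS\t.\nchr1\t200\t.\tC\tT\t4.1\tPASS\t.\n' |
  awk -F'\t' '/^#/ || $6 >= 20'
```

The record with QUAL 4.1 is dropped; the header lines and the QUAL 35.2 record pass through.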

How DeepVariant works

Stages in DeepVariant

For more information on the pileup images and how to read them, please see the "Looking through DeepVariant's Eyes" blog post.

DeepVariant relies on Nucleus, a library of Python and C++ code for reading and writing data in common genomics file formats (like SAM and VCF) designed for painless integration with the TensorFlow machine learning framework. Nucleus was built with DeepVariant in mind and open-sourced separately so it can be used by anyone in the genomics research community for other projects. See this blog post on Using Nucleus and TensorFlow for DNA Sequencing Error Correction.

DeepVariant Setup

Prerequisites

  • Unix-like operating system (cannot run on Windows)
  • Python 3.8

Official Solutions

Below are the official solutions provided by the Genomics team in Google Health.

  • Docker: This is the recommended method.
  • Build from source: DeepVariant comes with scripts to build it on Ubuntu 20.04. To build and run on other Unix-based systems, you will need to modify these scripts.
  • Prebuilt binaries: Available at gs://deepvariant/. These are compiled to use SSE4 and AVX instructions, so you will need a CPU (such as Intel Sandy Bridge) that supports them. You can check the /proc/cpuinfo file on your computer, which lists these features under "flags".
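The /proc/cpuinfo check can be scripted. A small sketch, shown here against a simulated "flags" line so the snippet runs anywhere; on an actual Linux machine you would point the same grep at /proc/cpuinfo:

```shell
# On a real Linux machine:
#   grep -o -w -e sse4_1 -e sse4_2 -e avx /proc/cpuinfo | sort -u
# Simulated with a sample "flags" line for portability (-w prevents a
# false match of "avx" inside "avx2"):
printf 'flags\t\t: fpu mmx sse sse2 sse4_1 sse4_2 avx avx2\n' |
  grep -o -w -e sse4_1 -e sse4_2 -e avx | sort -u
```

If all three flags print (avx, sse4_1, sse4_2), the prebuilt binaries should be usable on that CPU.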

Contribution Guidelines

Please open a pull request if you wish to contribute to DeepVariant. Note that we have not set up the infrastructure to merge pull requests externally. If you agree, we will test and submit the changes internally and mention your contributions in our release notes. We apologize for any inconvenience.

If you have any difficulty using DeepVariant, feel free to open an issue. If you have general questions not specific to DeepVariant, we recommend that you post on a community discussion forum such as BioStars.

License

BSD-3-Clause license

Acknowledgements

DeepVariant happily makes use of many open source packages.

We thank all of the developers and contributors to these packages for their work.

Disclaimer

This is not an official Google product.

NOTE: the content of this research code repository (i) is not intended to be a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

deepvariant's People

Contributors

akiraly1, akolesnikov, arostamianfar, cmclean, danielecook, gamazeps, gunjanbaid, jblespiau, kishwarshafin, marianattestad, msamman, nmousavi, pichuan, rpoplin, ryi06, scott7z, sgoe1, tedyun, thomascolthurst, xunjieli, z3nabi


deepvariant's Issues

How do I control the number of threads for call_variants?

I'm trying to build a scatter-gather implementation of make_examples -> call_variants -> postprocess_variants without using GNU parallel. Since multiple shards of call_variants, even with num_readers set to 1, increase the system load well beyond the number of cores, I'd like to limit this to a 1:1 ratio where one shard produces a system load of 1. Is this possible?

This is essentially what my pipeline looks like today: https://github.com/oskarvid/wdl_deepvariant/blob/master/deepvar-simple-SG.wdl
I say essentially because I've made insignificant changes, like adding --num_readers, for example.

One way would be to combine all tfrecord files into one, because WDL cannot use your method of passing all output files from make_examples as input files for call_variants (inputFile@#shards.gz doesn't compute for WDL), and WDL's normal way of handling multiple input files, i.e. "--examples ${sep=" --examples " InputFile}", doesn't work either, since call_variants only takes the last "--examples" as input when there are many "--examples" flags in the command.

Regarding combining the tfrecord files before they're used as input for call_variants, I'm not familiar enough with tensorflow to know if it's at all possible, and a quick google search didn't return anything fruitful. Is it possible to combine many tfrecord files into one?

Is it easier to try to limit the number of threads per process instead of trying to combine the tfrecord files? Or is there a third method that solves this problem better?
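On the combining question: gzip streams concatenate into a valid multi-member gzip file, and TFRecord records are self-delimiting, so byte-level concatenation of the .tfrecord.gz shards is one approach worth trying. This is a sketch, not an officially supported path (the helper name concat_gzip_shards is made up here), and you should verify that your TensorFlow version's GZIP record reader accepts multi-member streams before relying on it:

```python
import gzip
import os
import shutil
import tempfile

def concat_gzip_shards(shard_paths, out_path):
    """Byte-concatenate gzip files; the result is a multi-member gzip stream."""
    with open(out_path, "wb") as out:
        for path in shard_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)

# Demo with plain bytes standing in for serialized tf.Example records.
tmp = tempfile.mkdtemp()
shards = []
for i, payload in enumerate([b"record-a", b"record-b"]):
    path = os.path.join(tmp, f"shard{i}.gz")
    with gzip.open(path, "wb") as f:
        f.write(payload)
    shards.append(path)

combined = os.path.join(tmp, "combined.gz")
concat_gzip_shards(shards, combined)
with gzip.open(combined, "rb") as f:
    print(f.read())  # b'record-arecord-b'
```

Python's gzip module reads the concatenated members back as one stream, which is what makes the byte-level merge work.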

And thanks for a great tool!

The TF examples has image/format 'None'

Hi,

when I run DeepVariant using 2 shards for make_examples, call_variants fails.
The error is:

ValueError: The TF examples in shardedExamples/[email protected] has image/format 'None' (expected 'raw') which means you might need to rerun make_examples to genenerate the examples again.

One confusing thing: the error only shows up when running with my own BAM file (mapped to hg19), but when I run DeepVariant with the example BAM file provided by Google, this does not happen.

I would appreciate any kind of help.

Thanks a lot,
Luisa

Deal with multiple samples

Hi,
I have a couple dozen samples (BAM files) for a phylogenetic analysis. Should I treat each sample individually and merge the VCF files afterward, or is there something like SAMtools mpileup to merge the BAM files before feeding them to the DeepVariant workflow?

Thanks for answering this question

ImportError: libcrypto.so.1.0.0

The code in "Download binaries, models, and test data" of the DV quick start guide ran successfully. However, running make_examples using the quickstart-testdata failed with the following error:

attila-ThinkS:~/tools/deepvariant$ python bin/make_examples.zip \
>   --mode calling   \
>   --ref "${REF}"   \
>   --reads "${BAM}" \
>   --regions "chr20:10,000,000-10,010,000" \
>   --examples "${OUTPUT_DIR}/examples.tfrecord.gz"
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_pbJgd2/runfiles/genomics/deepvariant/make_examples.py", line 45, in <module>
    from deepvariant import variant_caller
  File "/tmp/Bazel.runfiles_pbJgd2/runfiles/genomics/deepvariant/variant_caller.py", line 50, in <module>
    from deepvariant.python import variant_calling
ImportError: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory

Documentation: BAM file requirements / recommendations

I may have missed it, but I could not find documentation describing recommended bam file processing prior to running DeepVariant. In particular:

  1. Should the bam file be sorted?
  2. Should duplicate reads be marked?
  3. Is local realignment around indels recommended?

Question about channels

Hi

I have some ideas about the channels in your model, but I don't know which files to modify. Could you give me some advice?

Why the software history was not kept?

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like deepvariant, deleted the software history during the transition to open-source.

8b84eab

Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask deepvariant developers the following four brief questions:

  1. Why did you decide not to keep the software history?
  2. Did the core developers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  3. Did the newcomers face any kind of problems when trying to refer to the old history? If so, how did they solve them?
  4. How did the lack of history impact software evolution? Did it place any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

release v0.4.1 failing to compile on ubuntu 16.04

Hi,

This is more of a support question, but I wasn't sure where else to get help. I'm trying to build and test deepvariant inside a docker image. I know that there is already an image published to Google Cloud, but for my purposes I prefer to build my own image. My Dockerfile looks like this:

FROM ubuntu:16.04

RUN set -ex \
  && buildDependencies=' \
    ca-certificates \
    curl \
    wget \
    git \
    apt-transport-https \
    xz-utils \
    bzip2 \
    make \
  ' \
  && apt-get update \
  && apt-get install -y --no-install-recommends $buildDependencies \
  # gsutil
  && wget https://storage.googleapis.com/pub/gsutil.tar.gz \
  && tar xfz gsutil.tar.gz -C $HOME && rm gsutil.tar.gz \
  && export PATH=$PATH:$HOME/gsutil \
  # deepvariant
  && git clone https://github.com/google/deepvariant.git \
  && cd deepvariant \
  && git checkout v0.4.1 \
  && ./build-prereq.sh \
  && ./build_and_test.sh

The build_and_test.sh script fails with these errors:

+ ./build_and_test.sh
+ source settings.sh
++ export TF_CUDA_CLANG=0
++ TF_CUDA_CLANG=0
++ export TF_ENABLE_XLA=0
++ TF_ENABLE_XLA=0
++ export TF_NEED_CUDA=0
++ TF_NEED_CUDA=0
++ export TF_NEED_GCP=1
++ TF_NEED_GCP=1
++ export TF_NEED_GDR=0
++ TF_NEED_GDR=0
++ export TF_NEED_HDFS=0
++ TF_NEED_HDFS=0
++ export TF_NEED_JEMALLOC=0
++ TF_NEED_JEMALLOC=0
++ export TF_NEED_MKL=0
++ TF_NEED_MKL=0
++ export TF_NEED_MPI=0
++ TF_NEED_MPI=0
++ export TF_NEED_OPENCL=0
++ TF_NEED_OPENCL=0
++ export TF_NEED_OPENCL_SYCL=0
++ TF_NEED_OPENCL_SYCL=0
++ export TF_NEED_S3=0
++ TF_NEED_S3=0
++ export TF_NEED_VERBS=0
++ TF_NEED_VERBS=0
++ export TF_CUDA_VERSION=8.0
++ TF_CUDA_VERSION=8.0
++ export CUDA_TOOLKIT_PATH=/usr/local/cuda
++ CUDA_TOOLKIT_PATH=/usr/local/cuda
++ export TF_CUDNN_VERSION=6
++ TF_CUDNN_VERSION=6
++ export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
++ CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
++ export DEEPVARIANT_BUCKET=gs://deepvariant
++ DEEPVARIANT_BUCKET=gs://deepvariant
++ export DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ export DV_GPU_BUILD=0
++ DV_GPU_BUILD=0
++ export DV_USE_GCP_OPTIMIZED_TF_WHL=1
++ DV_USE_GCP_OPTIMIZED_TF_WHL=1
++ export GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ export GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ export DV_TF_NIGHTLY_BUILD=0
++ DV_TF_NIGHTLY_BUILD=0
++ export DV_INSTALL_GPU_DRIVERS=0
++ DV_INSTALL_GPU_DRIVERS=0
+++ which python
++ export PYTHON_BIN_PATH=/usr/bin/python
++ PYTHON_BIN_PATH=/usr/bin/python
++ export USE_DEFAULT_PYTHON_LIB_PATH=1
++ USE_DEFAULT_PYTHON_LIB_PATH=1
++ export 'DV_COPT_FLAGS=--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ DV_COPT_FLAGS='--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ export DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2
++ DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2
+ [[ 0 = \1 ]]
+ bazel test -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3 deepvariant/...
..................
(09:27:04) INFO: Current date is 2017-12-21
(09:27:04) Loading: 
(09:27:04) Loading: 0 packages loaded
(09:27:05) Loading: 0 packages loaded
(09:27:06) Loading: 7 packages loaded
    currently loading: deepvariant/core/genomics ... (6 packages)
(09:27:07) Loading: 10 packages loaded
    currently loading: deepvariant/core/genomics ... (3 packages)
(09:27:08) Loading: 10 packages loaded
    currently loading: deepvariant/core/genomics ... (3 packages)
(09:27:09) Analyzing: 242 targets (15 packages loaded)
(09:27:11) Analyzing: 242 targets (16 packages loaded)
(09:27:12) Analyzing: 242 targets (18 packages loaded)
(09:27:14) Analyzing: 242 targets (31 packages loaded)
(09:27:14) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:96:1: First argument of 'load' must be a label and start with either '//', ':', or '@'. Use --incompatible_load_argument_is_label=false to temporarily disable this check.
(09:27:14) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:98:1: name 're2_test' is not defined (did you mean 'ios_test'?)
(09:27:14) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:100:1: name 're2_test' is not defined (did you mean 'ios_test'?)
[the same "name 're2_test' is not defined" error repeats for BUILD lines 102 through 151]
(09:27:15) Analyzing: 242 targets (37 packages loaded)
(09:27:17) Analyzing: 242 targets (45 packages loaded)
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/bitmap256.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/bitstate.cc' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
[the same "contains an error and its package is in error" message repeats for each remaining re2/ and util/ source file referenced by '@com_googlesource_code_re2//:re2']
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:util/util.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/filtered_re2.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/re2.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/set.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/.cache/bazel/_bazel_root/501e9c7e600bb5ec9e98458625ea98f0/external/com_googlesource_code_re2/BUILD:11:1: Target '@com_googlesource_code_re2//:re2/stringpiece.h' contains an error and its package is in error and referenced by '@com_googlesource_code_re2//:re2'
(09:27:18) ERROR: /opt/app/deepvariant/deepvariant/testing/BUILD:19:1: Target '@com_googlesource_code_re2//:re2' contains an error and its package is in error and referenced by '//deepvariant/testing:gunit_extras'
(09:27:18) ERROR: Analysis of target '//deepvariant/testing:gunit_extras_test' failed; build aborted: Loading failed
(09:27:18) INFO: Elapsed time: 14.618s
(09:27:18) FAILED: Build did NOT complete successfully (48 packages loaded)
(09:27:18) ERROR: Couldn't start the build. Unable to run tests

Could anyone shed some light on this issue? Interestingly, this was working a few days ago, though possibly on a different host. Could it be hardware-dependent?
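One generic thing to try when Bazel reports that an external package "is in error" is discarding the fetched external repositories and re-fetching them, since a stale or corrupted cache under ~/.cache/bazel can produce exactly this kind of cascade. This is a hedged suggestion, not a confirmed fix for the re2 errors above:

```shell
# Wipe the build outputs and fetched external repositories, then re-fetch.
# (Whether this resolves the re2 errors above is an assumption.)
bazel clean --expunge
bazel fetch //...
```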

Tensorflow .whl is not installing during build

Issue

When running the build-prereq shell script, I'm getting an error when the TensorFlow install begins.

Error message

Installing Google Cloud Platform optimized CPU-only TensorFlow wheel
Copying gs://deepvariant/packages/tensorflow/tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl...
- [1 files][ 41.1 MiB/ 41.1 MiB]    1.0 MiB/s                                   
Operation completed over 1 objects/41.1 MiB.                                     
tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.

Debugging efforts

After browsing around a bit, I discovered that this issue was solved for some users by installing the .whl separately. So I downloaded the .whl from the GCloud bucket and executed sudo python2.7 pip install <name of .whl file> in the terminal. It ran, only to tell me ".dist-info directory not found".

I think this might be due to an inconsistency in the packages installed through the build-prereq.sh script: I can see that all the packages it installed (e.g. numpy) are for Python 3.5, but the TensorFlow wheel it's trying to fetch is cp27 (Python 2.7).
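The "not a supported wheel on this platform" message comes from pip comparing the tags embedded in the wheel's filename against the running interpreter. A minimal sketch of reading those tags (the wheel_tags helper is hypothetical, not part of any package):

```python
def wheel_tags(filename):
    # Wheel filenames follow: name-version(-build)?-pythontag-abitag-platformtag.whl
    stem = filename[:-len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag

tags = wheel_tags("tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl")
print(tags)  # ('cp27', 'none', 'linux_x86_64')
```

cp27 means CPython 2.7 only, which matches the diagnosis above: if pip resolves to a Python 3.5 interpreter, this wheel can never install.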

Not sure about where to go from here, would love some assistance :).

System details

OS: Ubuntu 16.04 LTS
Python interpreters: Default with Ubuntu (2.7 and 3.5.2)
DeepVariant version: installed today from the main repo, so probably r0.4.1

Thank you

Cannot install DeepVariant

I have tried to install DeepVariant, but failed. First I could not install the Google Cloud SDK, and then I could not install the software either. I am in China. Are there any solutions to this problem? I hope you can give some help. Thank you.

I downloaded a versioned archive of the Cloud SDK. When running the installer, a module import error pops up:

  File "xxxxx/install.py", line 8, in <module>
    import bootstrapping
  File "xxxxx/install.py", line 9, in <module>
    import setup
  File "xxxxx/install.py", line 38, in <module>
    from googlecloudsdk.core.util import platforms
ImportError: No module named googlecloudsdk.core.util

I also tried apt-get install, but due to network restrictions I cannot install it with an online command.

Does it build with Python3?

On Ubuntu 16.04 LTS, when I tried to build it from source with Python 3.6.2, it failed to compile bazel-out/k8-py3-opt/genfiles/deepvariant/core/python/hts_verbose.cc with the error hts_verbose.cc:134:143: error: 'Py_InitModule3' was not declared in this scope. After some investigation, it seems to be an incompatibility with Python 3.

A fuller stack trace is below:

(14:05:26) ERROR: xx/git/deepvariant/deepvariant/core/python/BUILD:174:1: C++ compilation of rule '//deepvariant/core/python:hts_verbose_cclib' failed (Exit 1): gcc failed: error executing command
(cd xx/.cache/bazel/xx/7e4d04a878642732d9b8bb40a634229e/execroot/genomics &&
exec env -
PWD=/proc/self/cwd
PYTHON_BIN_PATH=xx/anaconda/envs/Python36/bin/python
PYTHON_LIB_PATH=xx/anaconda/envs/Python36/lib/python3.6/site-packages
TF_NEED_CUDA=0
TF_NEED_OPENCL_SYCL=0
/usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections -fdata-sections -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK -Wno-maybe-uninitialized -Wno-unused-function -msse4.1 -msse4.2 -mavx -O3 '-std=c++0x' -MD -MF bazel-out/k8-py3-opt/bin/deepvariant/core/python/objs/hts_verbose_cclib/deepvariant/core/python/hts_verbose.d '-frandom-seed=bazel-out/k8-py3-opt/bin/deepvariant/core/python/objs/hts_verbose_cclib/deepvariant/core/python/hts_verbose.o' -iquote . -iquote bazel-out/k8-py3-opt/genfiles -iquote external/htslib -iquote bazel-out/k8-py3-opt/genfiles/external/htslib -iquote external/bazel_tools -iquote bazel-out/k8-py3-opt/genfiles/external/bazel_tools -iquote external/clif -iquote bazel-out/k8-py3-opt/genfiles/external/clif -iquote external/local_config_python -iquote bazel-out/k8-py3-opt/genfiles/external/local_config_python -iquote external/protobuf_archive -iquote bazel-out/k8-py3-opt/genfiles/external/protobuf_archive -isystem external/htslib/htslib/htslib_1_6 -isystem bazel-out/k8-py3-opt/genfiles/external/htslib/htslib/htslib_1_6 -isystem external/htslib -isystem bazel-out/k8-py3-opt/genfiles/external/htslib -isystem external/bazel_tools/tools/cpp/gcc3 -isystem external/local_config_python/python_include -isystem bazel-out/k8-py3-opt/genfiles/external/local_config_python/python_include -isystem external/protobuf_archive/src -isystem bazel-out/k8-py3-opt/genfiles/external/protobuf_archive/src '-std=c++11' -fno-canonical-system-headers -Wno-builtin-macro-redefined '-D__DATE="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -c bazel-out/k8-py3-opt/genfiles/deepvariant/core/python/hts_verbose.cc -o bazel-out/k8-py3-opt/bin/deepvariant/core/python/_objs/hts_verbose_cclib/deepvariant/core/python/hts_verbose.o)
bazel-out/k8-py3-opt/genfiles/deepvariant/core/python/hts_verbose.cc: In function 'PyObject* deepvariant_core_python_hts__verbose_clifwrap::Init()':
bazel-out/k8-py3-opt/genfiles/deepvariant/core/python/hts_verbose.cc:134:143: error: 'Py_InitModule3' was not declared in this scope
PyObject* module = Py_InitModule3("deepvariant.core.python.hts_verbose", Methods, "CLIF-generated module for deepvariant/core/hts_verbose.h");
^
(14:05:26) INFO: Elapsed time: 4.048s, Critical Path: 1.26s
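For context, this error is the classic Python 2 vs. 3 C-API split: Py_InitModule3 exists only in Python 2, while Python 3 extension modules are created via PyModuleDef and PyModule_Create. A hedged illustrative fragment (the module name and docstring are examples; this is not the actual CLIF-generated code):

```c
/* Illustrative sketch of the two module-init APIs. */
#include <Python.h>

static PyMethodDef Methods[] = {{NULL, NULL, 0, NULL}};

#if PY_MAJOR_VERSION >= 3
/* Python 3: describe the module in a PyModuleDef and create it. */
static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT, "hts_verbose", "example docstring", -1, Methods,
};
PyMODINIT_FUNC PyInit_hts_verbose(void) { return PyModule_Create(&moduledef); }
#else
/* Python 2: Py_InitModule3 registers the module directly. */
PyMODINIT_FUNC inithts_verbose(void) {
  Py_InitModule3("hts_verbose", Methods, "example docstring");
}
#endif
```

So a code generator emitting Py_InitModule3 unconditionally will fail to compile against Python 3 headers, which matches the error above.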

error in make_examples extract_sample_name_from_reads

I'm running make_examples locally from the Docker container. It runs without errors on the NA12878_S1.chr20.10_10p1mb.bam dataset, but fails with my own BAM file; see the stack trace below. My BAM file is too large to attach here. What do you recommend? Also, what are the allowed values for --logging_level? Please advise.

# ./opt/deepvariant/bin/make_examples --logging_level DEBUG --mode calling --ref /dv2/reference/CFSAN000189.fasta --reads /dv2/samples/CFSAN000211/reads.sorted.bam --examples output.examples.tfrecord
WARNING: Logging before flag parsing goes to stderr.
I1228 21:10:23.407845 140668049200896 client.py:1004] Timeout attempting to reach GCE metadata service.
W1228 21:10:23.408325 140668049200896 htslib_gcp_oauth.py:88] GCP credentials not found; only local files and public gs:// URIs will be accessible from htslib
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_9tjOWl/runfiles/genomics/deepvariant/make_examples.py", line 1015, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/tmp/Bazel.runfiles_9tjOWl/runfiles/genomics/deepvariant/make_examples.py", line 969, in main
    options = default_options(add_flags=True, flags=FLAGS)
  File "/tmp/Bazel.runfiles_9tjOWl/runfiles/genomics/deepvariant/make_examples.py", line 207, in default_options
    sample_name = extract_sample_name_from_reads(flags.reads)
  File "/tmp/Bazel.runfiles_9tjOWl/runfiles/genomics/deepvariant/make_examples.py", line 406, in extract_sample_name_from_reads
    raise ValueError('Expected a single sample, found {}'.format(samples))
ValueError: Expected a single sample, found set([])
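The ValueError means make_examples found no SM (sample) tags in the BAM's @RG header lines. A minimal sketch of the same check, run against header text such as the output of `samtools view -H reads.sorted.bam` (the helper below is illustrative, not DeepVariant's actual code):

```python
def sample_names(header_text):
    # Collect the SM: values from every @RG line of a SAM/BAM text header.
    samples = set()
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            for field in line.split("\t")[1:]:
                if field.startswith("SM:"):
                    samples.add(field[3:])
    return samples

with_sm = "@HD\tVN:1.6\n@RG\tID:rg1\tSM:CFSAN000211"
without_sm = "@HD\tVN:1.6\n@RG\tID:rg1"
print(sample_names(with_sm))     # {'CFSAN000211'}
print(sample_names(without_sm))  # set()
```

If the BAM really lacks SM tags, one hedged fix is to add a read group with samtools (e.g. `samtools addreplacerg -r '@RG\tID:rg1\tSM:mysample' ...`) and re-index, or to supply the sample name via a make_examples flag if your version exposes one.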

Error running example with hg19 genome

Hi,

I am getting an error at postprocess variants.

The command executed is:

/opt/deepvariant/bin/postprocess_variants     --ref "hg19.fa.gz"     --infile call_variants_output.tfrecord     --outfile "NA12878_S1.chr20.10_10p1mb.bam.vcf"

And the output is:

  2018-03-06 11:34:21.456020: I deepvariant/postprocess_variants.cc:87] Read from: call_variants_output.tfrecord
  2018-03-06 11:34:21.457925: I deepvariant/postprocess_variants.cc:96] Done reading: call_variants_output.tfrecord. #entries in single_site_calls = 289
  2018-03-06 11:34:21.457943: I deepvariant/postprocess_variants.cc:100] Total #entries in single_site_calls = 289
  2018-03-06 11:34:21.457949: I deepvariant/postprocess_variants.cc:102] Start SortSingleSiteCalls
  2018-03-06 11:34:21.457957: F deepvariant/core/utils.cc:84] Check failed: pos_in_fasta != contig_name_to_pos_in_fasta.end() Reference name chr20 not in contig info.

Any idea why I can't run the example with a different genome?
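The check failure says the calls reference contig "chr20" but the reference's .fai index contains no contig of that name; hg19-style references name it "chr20" while b37-style ones use "20". Contig names are the first column of the .fai, so a naming mismatch is easy to spot. A hedged sketch (the index values below are made up for illustration):

```python
def fai_contigs(fai_text):
    # Each .fai line is: NAME  LENGTH  OFFSET  LINEBASES  LINEWIDTH
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]

fai = "20\t63025520\t52\t60\t61\nMT\t16569\t64000000\t60\t61"
print(fai_contigs(fai))  # ['20', 'MT']
```

Compare this list against the @SQ lines in the BAM header (`samtools view -H`); the BAM and FASTA must agree on contig naming for postprocess_variants to succeed.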

Recommendations for index (.fai) and read alignment (BAM) files

Dear developers, this project sounds awesome, and thanks for opening it up to the whole community. I have a de novo assembly of a bacterial strain, against which I would like to align another assembly representing different strains, and then hopefully use your pipeline. What recommendations do you have for generating those .fai and BAM files (tools, settings, etc.)?
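For a typical short-read workflow, a hedged sketch of producing those files with samtools and bwa might look like the following (file names are placeholders):

```shell
# Index the reference: writes strainA.fasta.fai alongside the FASTA.
samtools faidx strainA.fasta

# Build the bwa index, align paired-end reads, sort, and index the BAM.
bwa index strainA.fasta
bwa mem strainA.fasta reads_1.fastq reads_2.fastq \
  | samtools sort -o aligned.sorted.bam -
samtools index aligned.sorted.bam
```

Note that aligning one assembly against another produces very different pileups from aligning reads, so the results may not match what DeepVariant's models were trained on.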

@clif not found!

Hi,

When I was running ./build_and_test.sh, it printed:
`+ source settings.sh
++ export TF_CUDA_CLANG=0
++ TF_CUDA_CLANG=0
++ export TF_ENABLE_XLA=0
++ TF_ENABLE_XLA=0
++ export TF_NEED_CUDA=0
++ TF_NEED_CUDA=0
++ export TF_NEED_GCP=1
++ TF_NEED_GCP=1
++ export TF_NEED_GDR=0
++ TF_NEED_GDR=0
++ export TF_NEED_HDFS=0
++ TF_NEED_HDFS=0
++ export TF_NEED_JEMALLOC=0
++ TF_NEED_JEMALLOC=0
++ export TF_NEED_MKL=0
++ TF_NEED_MKL=0
++ export TF_NEED_MPI=0
++ TF_NEED_MPI=0
++ export TF_NEED_OPENCL=0
++ TF_NEED_OPENCL=0
++ export TF_NEED_OPENCL_SYCL=0
++ TF_NEED_OPENCL_SYCL=0
++ export TF_NEED_S3=0
++ TF_NEED_S3=0
++ export TF_NEED_VERBS=0
++ TF_NEED_VERBS=0
++ export TF_CUDA_VERSION=8.0
++ TF_CUDA_VERSION=8.0
++ export CUDA_TOOLKIT_PATH=/usr/local/cuda
++ CUDA_TOOLKIT_PATH=/usr/local/cuda
++ export TF_CUDNN_VERSION=6
++ TF_CUDNN_VERSION=6
++ export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
++ CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
++ export DEEPVARIANT_BUCKET=gs://deepvariant
++ DEEPVARIANT_BUCKET=gs://deepvariant
++ export DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ DV_PACKAGE_BUCKET_PATH=gs://deepvariant/packages
++ export DV_GPU_BUILD=0
++ DV_GPU_BUILD=0
++ export DV_USE_GCP_OPTIMIZED_TF_WHL=1
++ DV_USE_GCP_OPTIMIZED_TF_WHL=1
++ export GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ GCP_OPTIMIZED_TF_WHL_FILENAME=tensorflow-1.4.1.deepvariant_gcp-cp27-none-linux_x86_64.whl
++ export GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ GCP_OPTIMIZED_TF_WHL_PATH=gs://deepvariant/packages/tensorflow
++ export DV_TF_NIGHTLY_BUILD=0
++ DV_TF_NIGHTLY_BUILD=0
++ export DV_INSTALL_GPU_DRIVERS=0
++ DV_INSTALL_GPU_DRIVERS=0
+++ which python
++ export PYTHON_BIN_PATH=/home/huangl/publib/bin/python
++ PYTHON_BIN_PATH=/home/huangl/publib/bin/python
++ export USE_DEFAULT_PYTHON_LIB_PATH=1
++ USE_DEFAULT_PYTHON_LIB_PATH=1
++ export 'DV_COPT_FLAGS=--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ DV_COPT_FLAGS='--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ export DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2
++ DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2

+ [[ 0 = \1 ]]
+ bazel test -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3 deepvariant/...
    (08:09:38) INFO: Current date is 2017-12-08
    (08:09:38) WARNING: /home/huangl/.cache/bazel/_bazel_huangl/008c6ca154d923f28d39cff9fad40a7f/external/org_tensorflow/tensorflow/core/BUILD:1806:1: in includes attribute of cc_library rule @org_tensorflow//tensorflow/core:framework_headers_lib: '../../../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'external/org_tensorflow/tensorflow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /home/huangl/.cache/bazel/_bazel_huangl/008c6ca154d923f28d39cff9fad40a7f/external/org_tensorflow/tensorflow/tensorflow.bzl:1100:30
    (08:09:38) INFO: Analysed 241 targets (0 packages loaded).
    (08:09:38) INFO: Found 185 targets and 56 test targets...
    (08:09:38) ERROR: missing input file '@clif//:clif/bin/pyclif_proto'
    (08:09:38) ERROR: /home/huangl/biotools/deepvariant/deepvariant/core/protos/BUILD:32:1: //deepvariant/core/protos:core_pyclif_clif_rule: missing input file '@clif//:clif/bin/pyclif_proto'
    (08:09:38) ERROR: /home/huangl/biotools/deepvariant/deepvariant/core/protos/BUILD:32:1 1 input file(s) do not exist
    (08:09:38) INFO: Elapsed time: 0.334s, Critical Path: 0.00s
    (08:09:38) FAILED: Build did NOT complete successfully
    //deepvariant:allelecounter_test NO STATUS
    //deepvariant:call_variants_test NO STATUS
    //deepvariant:data_providers_test NO STATUS
    //deepvariant:make_examples_test NO STATUS
    //deepvariant:model_eval_test NO STATUS
    //deepvariant:model_train_test NO STATUS
    //deepvariant:modeling_test NO STATUS
    //deepvariant:pileup_image_test NO STATUS
    //deepvariant:postprocess_variants_lib_test NO STATUS
    //deepvariant:postprocess_variants_test NO STATUS
    //deepvariant:tf_utils_test NO STATUS
    //deepvariant:utils_test NO STATUS
    //deepvariant:variant_caller_test NO STATUS
    //deepvariant:variant_calling_test NO STATUS
    //deepvariant:variant_labeler_test NO STATUS
    //deepvariant/core:cigar_test NO STATUS
    //deepvariant/core:cpp_math_test NO STATUS
    //deepvariant/core:cpp_utils_test NO STATUS
    //deepvariant/core:errors_test NO STATUS
    //deepvariant/core:genomics_io_gcs_test NO STATUS
    //deepvariant/core:genomics_io_noplugin_test NO STATUS
    //deepvariant/core:genomics_io_test NO STATUS
    //deepvariant/core:hts_test NO STATUS
    //deepvariant/core:hts_verbose_test NO STATUS
    //deepvariant/core:io_utils_test NO STATUS
    //deepvariant/core:py_math_test NO STATUS
    //deepvariant/core:py_utils_test NO STATUS
    //deepvariant/core:ranges_test NO STATUS
    //deepvariant/core:reader_base_test NO STATUS
    //deepvariant/core:reference_fai_test NO STATUS
    //deepvariant/core:sam_reader_test NO STATUS
    //deepvariant/core:samplers_test NO STATUS
    //deepvariant/core:variantutils_test NO STATUS
    //deepvariant/core:vcf_reader_test NO STATUS
    //deepvariant/core:vcf_writer_test NO STATUS
    //deepvariant/core/python:hts_verbose_test NO STATUS
    //deepvariant/core/python:math_wrap_test NO STATUS
    //deepvariant/core/python:reference_wrap_test NO STATUS
    //deepvariant/core/python:sam_reader_wrap_test NO STATUS
    //deepvariant/core/python:vcf_reader_wrap_test NO STATUS
    //deepvariant/core/python:vcf_writer_wrap_test NO STATUS
    //deepvariant/environment_tests:env_smoke_test NO STATUS
    //deepvariant/environment_tests:protobuf_implementation_test NO STATUS
    //deepvariant/python:allelecounter_wrap_test NO STATUS
    //deepvariant/python:variant_calling_wrap_test NO STATUS
    //deepvariant/realigner:aligner_test NO STATUS
    //deepvariant/realigner:realigner_test NO STATUS
    //deepvariant/realigner:ssw_test NO STATUS
    //deepvariant/realigner:window_selector_test NO STATUS
    //deepvariant/realigner/python:debruijn_graph_wrap_test NO STATUS
    //deepvariant/realigner/python:ssw_misc_test NO STATUS
    //deepvariant/realigner/python:ssw_wrap_test NO STATUS
    //deepvariant/testing:gunit_extras_test NO STATUS
    //deepvariant/vendor:statusor_test NO STATUS
    //deepvariant/vendor:timer_test NO STATUS
    //deepvariant/vendor/python:statusor_examples_test NO STATUS
    `
    How do I fix this? I am not a root user.

Thanks

Documentation: Incorrect hyperlinks in deepvariant-docker.md

Documentation on how to use DeepVariant without Google Cloud

Hi,

All documentation linked from the README.md mentions the Google Cloud platform, but I've found no indication of how to use DeepVariant on a local computer only. Is Google Cloud mandatory (if so, it would be great to mention that right at the beginning of the documentation)? Or would it be possible to add a quickstart for non-Google-Cloud usage?

Thanks,

Expand default exclude contigs for GRCh38

Looking through the code, it appears the default contigs for exclusion are based on hs37d5. Could this exclude list be expanded to include the appropriate GRCh38 contigs?

Also, why is the mitochondrial contig on the default exclusion list? Is there a technical/algorithmic issue with running DeepVariant on the MT contig?

exclude_contigs=[
          # The two canonical names for the contig representing the human
          # mitochondrial sequence.
          'chrM',
          'MT',
          # From hs37d5.
          # (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/README_human_reference_20110707)  # pylint:disable=line-too-long
          'GL000207.1',
          'GL000226.1',
          'GL000229.1',
          'GL000231.1',
          'GL000210.1',
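For context, a hedged sketch of how such an exclusion list ends up being applied (the helper name is illustrative, not DeepVariant's actual code): contigs on the list are simply dropped before candidate generation, so extending GRCh38 support would only mean adding the appropriate GRCh38 alt/decoy names to the same list.

```python
# Subset of the defaults shown above.
exclude_contigs = {'chrM', 'MT', 'GL000207.1', 'GL000226.1', 'GL000229.1',
                   'GL000231.1', 'GL000210.1'}

def contigs_to_process(all_contigs, excluded=exclude_contigs):
    # Keep only the contigs that are not explicitly excluded.
    return [c for c in all_contigs if c not in excluded]

print(contigs_to_process(['chr1', 'chrM', 'GL000207.1', 'chr20']))
# ['chr1', 'chr20']
```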

Improving pre-built DeepVariant binaries for conda packages

Hi all;
Thanks for all the help getting an initial conda package in place for DeepVariant (#9) through bioconda.

I wanted to follow up with some suggestions that would help make the pre-built binaries more portable as part of this process, in order of helpfulness for portability:

An alternative to points 1 and 3 is making it easier to build DeepVariant as part of the conda build process. The major blocker here is the clif dependency which is difficult to build and the pre-built binaries require unpacking into /usr. If we could make this relocatable and easier to install globally we could build with portable binaries and adjustable numpy as part of the bioconda preparation process.

Thanks again for all the help.

Generalized performance analysis between the versions

Hi Mark (@depristo),

Sorry, I meant to put this together a while ago - regarding #27 (comment) - but got a bit swamped with a research deadline. In any case, this is purely for intellectual curiosity and discussion. Regarding the first point, where differences in allocated CPUs might be the cause for the timing, that could be remedied by specifying a minimal CPU requirement, as noted here:

https://cloud.google.com/compute/docs/instances/specify-min-cpu-platform

So to control for variability in the test, the two options are either: a) set the --min-cpu-platform setting to the maximum available ("Intel Sandy Bridge"), or b) keep requesting and canceling instances until the desired one is allocated, perform all tests on it, and thus ensure consistency.
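For reproducibility, a hedged sketch of how a call-graph CPU profile like the ones further down can be collected with Linux perf (the sampling frequency and command line are illustrative, not necessarily what was used here):

```shell
# Record CPU samples with call graphs for the target process, then render
# the hierarchical report to stdout.
perf record -F 99 -g -- python make_examples.py <args>
perf report --stdio
```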

As a quick inspection of CPU-cycle utilization, I ran a performance analysis of 0.4 and 0.5.1 on make_examples - since that is where the initial discrepancy showed up - and there seem to be some slight increases in 0.5.1, which might cumulatively affect things. In any case, below is the top of the call graph of percent utilization by method (per version):

DV 0.4

# Samples: 186K of event 'cpu-clock'
# Event count (approx.): 46604750000
#
# Children,    Self,Command      ,Shared Object                  ,Symbol                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
 50.33% , 8.80%  ,python       ,python2.7                      ,[.] PyEval_EvalFrameEx
            |          
            |--42.49%--PyEval_EvalFrameEx
            |          |          
            |          |--30.79%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
            |          |          |          
            |          |           --30.34%--StripedSmithWaterman::Aligner::Align
            |          |                     |          
            |          |                     |--27.87%--ssw_align
            |          |                     |          |          
            |          |                     |          |--14.65%--sw_sse2_word
            |          |                     |          |          
            |          |                     |          |--8.32%--sw_sse2_byte
            |          |                     |          |          
            |          |                     |          |--2.91%--banded_sw
            |          |                     |          |          
            |          |                     |           --1.19%--__memcpy_sse2_unaligned
            |          |                     |          
            |          |                      --1.36%--ssw_init
            |          |                                |          
            |          |                                 --0.89%--qP_byte
            |          |          
            |          |--3.30%--deepvariant_realigner_python_debruijn__graph_clifwrap::wrapBuild_as_build
            |          |          |          
            |          |           --3.04%--learning::genomics::deepvariant::DeBruijnGraph::Build
            |          |                     |          
            |          |                      --2.73%--learning::genomics::deepvariant::DeBruijnGraph::DeBruijnGraph
            |          |                                |          
            |          |                                 --2.41%--learning::genomics::deepvariant::DeBruijnGraph::AddEdgesForRead
            |          |                                           |          
            |          |                                            --1.75%--learning::genomics::deepvariant::DeBruijnGraph::AddEdge
            |          |                                                      |          
            |          |                                                       --1.46%--learning::genomics::deepvariant::DeBruijnGraph::EnsureVertex
            |          |                                                                 |          
            |          |                                                                  --0.50%--std::_Hashtable<tensorflow::StringPiece, std::pair<tensorflow::StringPiece const, void*>, std::allocator<std::pair<tensorflow::StringPiece const, void*> >, std::__detail::_Select1st, std::equal_to<tensorflow::StringPiece>, tensorflow::StringPieceHasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node
            |          |          
            |          |--3.05%--google::protobuf::python::cmessage::GetAttr
            |          |          |          
            |          |          |--1.07%--google::protobuf::python::cmessage::InternalGetScalar
            |          |          |          
            |          |           --0.63%--google::protobuf::Descriptor::FindFieldByName
            |          |          
            |          |--0.70%--deepvariant_python_allelecounter_clifwrap::pyAlleleCounter::wrapAdd_as_add
            |          |          
            |          |--0.59%--google::protobuf::python::cmessage::DeepCopy
            |          |          |          
            |          |           --0.58%--google::protobuf::python::cmessage::MergeFrom
            |          |                     |          
            |          |                      --0.57%--google::protobuf::Message::MergeFrom
            |          |                                |          
            |          |                                 --0.54%--google::protobuf::internal::ReflectionOps::Merge
            |          |          
            |           --0.59%--deepvariant_core_python_sam__reader_clifwrap::pySamIterable::wrapNext
            |          
            |--1.84%--0x903b40
            |          |          
            |           --1.71%--PyEval_EvalFrameEx
            |          
            |--1.06%--0x905d60
            |          |          
            |           --1.06%--PyEval_EvalFrameEx
            |          
            |--0.76%--0x8fecc0
            |          PyEval_EvalFrameEx
            |          
            |--0.59%--0x905200
            |          |          
            |           --0.59%--PyEval_EvalFrameEx
            |          
             --0.51%--0x9060a0
                       |          
                        --0.51%--PyEval_EvalFrameEx

 32.95% , 0.00%  ,python       ,[unknown]                      ,[.] 0x00000000009060a0
            |
            ---0x9060a0
               |          
                --32.10%--PyEval_EvalFrameEx
                          |          
                           --30.79%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                                     |          
                                      --30.34%--StripedSmithWaterman::Aligner::Align
                                                |          
                                                |--27.87%--ssw_align
                                                |          |          
                                                |          |--14.65%--sw_sse2_word
                                                |          |          
                                                |          |--8.32%--sw_sse2_byte
                                                |          |          
                                                |          |--2.91%--banded_sw
                                                |          |          
                                                |           --1.19%--__memcpy_sse2_unaligned
                                                |          
                                                 --1.36%--ssw_init
                                                           |          
                                                            --0.89%--qP_byte

 30.81% , 0.07%  ,python       ,libssw_cclib.so                ,[.] deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
            |          
             --30.74%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       |          
                        --30.34%--StripedSmithWaterman::Aligner::Align
                                  |          
                                  |--27.87%--ssw_align
                                  |          |          
                                  |          |--14.65%--sw_sse2_word
                                  |          |          
                                  |          |--8.32%--sw_sse2_byte
                                  |          |          
                                  |          |--2.91%--banded_sw
                                  |          |          
                                  |           --1.19%--__memcpy_sse2_unaligned
                                  |          
                                   --1.36%--ssw_init
                                             |          
                                              --0.89%--qP_byte

 30.36% , 0.04%  ,python       ,libssw_cpp.so                  ,[.] StripedSmithWaterman::Aligner::Align
            |          
             --30.32%--StripedSmithWaterman::Aligner::Align
                       |          
                       |--27.87%--ssw_align
                       |          |          
                       |          |--14.65%--sw_sse2_word
                       |          |          
                       |          |--8.32%--sw_sse2_byte
                       |          |          
                       |          |--2.91%--banded_sw
                       |          |          
                       |           --1.19%--__memcpy_sse2_unaligned
                       |          
                        --1.36%--ssw_init
                                  |          
                                   --0.89%--qP_byte

 27.87% , 0.05%  ,python       ,libssw.so                      ,[.] ssw_align
            |          
             --27.82%--ssw_align
                       |          
                       |--14.65%--sw_sse2_word
                       |          
                       |--8.32%--sw_sse2_byte
                       |          
                       |--2.91%--banded_sw
                       |          
                        --1.19%--__memcpy_sse2_unaligned

 14.65% , 14.62% ,python       ,libssw.so                      ,[.] sw_sse2_word
            |          
             --14.62%--0x9060a0
                       PyEval_EvalFrameEx
                       deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       StripedSmithWaterman::Aligner::Align
                       |          
                        --14.62%--ssw_align
                                  sw_sse2_word

 8.32%  , 8.31%  ,python       ,libssw.so                      ,[.] sw_sse2_byte
            |          
             --8.31%--0x9060a0
                       PyEval_EvalFrameEx
                       deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       StripedSmithWaterman::Aligner::Align
                       |          
                        --8.30%--ssw_align
                                  sw_sse2_byte

 4.51%  , 0.00%  ,python       ,[unknown]                      ,[.] 0x00000000009063e0
            |
            ---0x9063e0
               |          
                --3.66%--PyEval_EvalFrameEx
                          |          
                           --3.30%--deepvariant_realigner_python_debruijn__graph_clifwrap::wrapBuild_as_build
                                     |          
                                      --3.04%--learning::genomics::deepvariant::DeBruijnGraph::Build
                                                |          
                                                 --2.73%--learning::genomics::deepvariant::DeBruijnGraph::DeBruijnGraph
                                                           |          
                                                            --2.41%--learning::genomics::deepvariant::DeBruijnGraph::AddEdgesForRead
                                                                      |          
                                                                       --1.75%--learning::genomics::deepvariant::DeBruijnGraph::AddEdge
                                                                                 |          
                                                                                  --1.46%--learning::genomics::deepvariant::DeBruijnGraph::EnsureVertex
                                                                                            |          
                                                                                             --0.50%--std::_Hashtable<tensorflow::StringPiece, std::pair<tensorflow::StringPiece const, void*>, std::allocator<std::pair<tensorflow::StringPiece const, void*> >, std::__detail::_Select1st, std::equal_to<tensorflow::StringPiece>, tensorflow::StringPieceHasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node

DV 0.5.1

# Samples: 152K of event 'cpu-clock'
# Event count (approx.): 38010500000
#
# Children,    Self,Command      ,Shared Object                   ,Symbol                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
 51.45% , 9.13%  ,python       ,python2.7                       ,[.] PyEval_EvalFrameEx
            |          
            |--43.33%--PyEval_EvalFrameEx
            |          |          
            |          |--31.12%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
            |          |          |          
            |          |           --30.63%--StripedSmithWaterman::Aligner::Align
            |          |                     |          
            |          |                     |--28.27%--ssw_align
            |          |                     |          |          
            |          |                     |          |--14.88%--sw_sse2_word
            |          |                     |          |          
            |          |                     |          |--8.45%--sw_sse2_byte
            |          |                     |          |          
            |          |                     |          |--2.89%--banded_sw
            |          |                     |          |          
            |          |                     |           --1.19%--__memcpy_sse2_unaligned
            |          |                     |          
            |          |                      --1.38%--ssw_init
            |          |                                |          
            |          |                                 --0.92%--qP_byte
            |          |          
            |          |--3.57%--deepvariant_realigner_python_debruijn__graph_clifwrap::wrapBuild_as_build
            |          |          |          
            |          |           --3.32%--learning::genomics::deepvariant::DeBruijnGraph::Build
            |          |                     |          
            |          |                      --3.02%--learning::genomics::deepvariant::DeBruijnGraph::DeBruijnGraph
            |          |                                |          
            |          |                                 --2.63%--learning::genomics::deepvariant::DeBruijnGraph::AddEdgesForRead
            |          |                                           |          
            |          |                                            --1.89%--learning::genomics::deepvariant::DeBruijnGraph::AddEdge
            |          |                                                      |          
            |          |                                                       --1.60%--learning::genomics::deepvariant::DeBruijnGraph::EnsureVertex
            |          |                                                                 |          
            |          |                                                                  --0.56%--std::_Hashtable<tensorflow::StringPiece, std::pair<tensorflow::StringPiece const, void*>, std::allocator<std::pair<tensorflow::StringPiece const, void*> >, std::__detail::_Select1st, std::equal_to<tensorflow::StringPiece>, tensorflow::StringPieceHasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node
            |          |          
            |          |--3.16%--google::protobuf::python::cmessage::GetAttr
            |          |          |          
            |          |          |--1.12%--google::protobuf::python::cmessage::InternalGetScalar
            |          |          |          
            |          |           --0.58%--google::protobuf::Descriptor::FindFieldByName
            |          |          
            |          |--0.70%--deepvariant_python_allelecounter_clifwrap::pyAlleleCounter::wrapAdd_as_add
            |          |          
            |          |--0.62%--deepvariant_core_python_sam__reader_clifwrap::pySamIterable::wrapNext
            |          |          
            |           --0.57%--google::protobuf::python::cmessage::DeepCopy
            |                     |          
            |                      --0.56%--google::protobuf::python::cmessage::MergeFrom
            |                                |          
            |                                 --0.56%--google::protobuf::Message::MergeFrom
            |                                           |          
            |                                            --0.52%--google::protobuf::internal::ReflectionOps::Merge
            |          
            |--1.92%--0x903b40
            |          |          
            |           --1.78%--PyEval_EvalFrameEx
            |          
            |--1.09%--0x905d60
            |          |          
            |           --1.09%--PyEval_EvalFrameEx
            |          
            |--0.78%--0x8fecc0
            |          PyEval_EvalFrameEx
            |          
            |--0.62%--0x905200
            |          |          
            |           --0.62%--PyEval_EvalFrameEx
            |          
             --0.54%--0x9060a0
                       |          
                        --0.54%--PyEval_EvalFrameEx

 33.23% , 0.00%  ,python       ,[unknown]                       ,[.] 0x00000000009060a0
            |
            ---0x9060a0
               |          
                --32.46%--PyEval_EvalFrameEx
                          |          
                           --31.12%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                                     |          
                                      --30.63%--StripedSmithWaterman::Aligner::Align
                                                |          
                                                |--28.27%--ssw_align
                                                |          |          
                                                |          |--14.88%--sw_sse2_word
                                                |          |          
                                                |          |--8.45%--sw_sse2_byte
                                                |          |          
                                                |          |--2.89%--banded_sw
                                                |          |          
                                                |           --1.19%--__memcpy_sse2_unaligned
                                                |          
                                                 --1.38%--ssw_init
                                                           |          
                                                            --0.92%--qP_byte

 31.13% , 0.08%  ,python       ,libssw_cclib.so                 ,[.] deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
            |          
             --31.05%--deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       |          
                        --30.63%--StripedSmithWaterman::Aligner::Align
                                  |          
                                  |--28.27%--ssw_align
                                  |          |          
                                  |          |--14.88%--sw_sse2_word
                                  |          |          
                                  |          |--8.45%--sw_sse2_byte
                                  |          |          
                                  |          |--2.89%--banded_sw
                                  |          |          
                                  |           --1.19%--__memcpy_sse2_unaligned
                                  |          
                                   --1.38%--ssw_init
                                             |          
                                              --0.92%--qP_byte

 30.64% , 0.03%  ,python       ,libssw_cpp.so                   ,[.] StripedSmithWaterman::Aligner::Align
            |          
             --30.61%--StripedSmithWaterman::Aligner::Align
                       |          
                       |--28.27%--ssw_align
                       |          |          
                       |          |--14.88%--sw_sse2_word
                       |          |          
                       |          |--8.45%--sw_sse2_byte
                       |          |          
                       |          |--2.89%--banded_sw
                       |          |          
                       |           --1.19%--__memcpy_sse2_unaligned
                       |          
                        --1.38%--ssw_init
                                  |          
                                   --0.92%--qP_byte

 28.27% , 0.04%  ,python       ,libssw.so                       ,[.] ssw_align
            |          
             --28.23%--ssw_align
                       |          
                       |--14.88%--sw_sse2_word
                       |          
                       |--8.45%--sw_sse2_byte
                       |          
                       |--2.89%--banded_sw
                       |          
                        --1.19%--__memcpy_sse2_unaligned

 14.88% , 14.86% ,python       ,libssw.so                       ,[.] sw_sse2_word
            |          
             --14.86%--0x9060a0
                       PyEval_EvalFrameEx
                       deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       StripedSmithWaterman::Aligner::Align
                       |          
                        --14.86%--ssw_align
                                  sw_sse2_word

 8.45%  , 8.43%  ,python       ,libssw.so                       ,[.] sw_sse2_byte
            |          
             --8.43%--0x9060a0
                       PyEval_EvalFrameEx
                       deepvariant_realigner_python_ssw_clifwrap::pyAligner::wrapAlign_as_align
                       StripedSmithWaterman::Aligner::Align
                       |          
                        --8.43%--ssw_align
                                  sw_sse2_byte

 4.72%  , 0.00%  ,python       ,[unknown]                       ,[.] 0x00000000009063e0
            |
            ---0x9063e0
               |          
                --3.94%--PyEval_EvalFrameEx
                          |          
                           --3.57%--deepvariant_realigner_python_debruijn__graph_clifwrap::wrapBuild_as_build
                                     |          
                                      --3.32%--learning::genomics::deepvariant::DeBruijnGraph::Build
                                                |          
                                                 --3.02%--learning::genomics::deepvariant::DeBruijnGraph::DeBruijnGraph
                                                           |          
                                                            --2.63%--learning::genomics::deepvariant::DeBruijnGraph::AddEdgesForRead
                                                                      |          
                                                                       --1.89%--learning::genomics::deepvariant::DeBruijnGraph::AddEdge
                                                                                 |          
                                                                                  --1.60%--learning::genomics::deepvariant::DeBruijnGraph::EnsureVertex
                                                                                            |          
                                                                                             --0.56%--std::_Hashtable<tensorflow::StringPiece, std::pair<tensorflow::StringPiece const, void*>, std::allocator<std::pair<tensorflow::StringPiece const, void*> >, std::__detail::_Select1st, std::equal_to<tensorflow::StringPiece>, tensorflow::StringPieceHasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node

To do this properly would require running the tests on different datasets and on different CPUs in the same Cloud environment, with different distributed scenarios, which would be cost-prohibitive for me.

Hope it helps and have a great weekend!
Paul

Runtime issues while using docker image

Trying to run but come across the following error

docker run -it -v $PWD/input:/dv2/input -v $PWD/models:/dv2/models \
  gcr.io/deepvariant-docker/deepvariant:$IMAGE_VERSION

Unable to find image 'gcr.io/deepvariant-docker/deepvariant:0.4.0' locally
0.4.0: Pulling from deepvariant-docker/deepvariant
Digest: sha256:72d3bd936dfbfbb707e648d7e6f0f8fb4318eb115aad0bfde9b43ff05fef8f19
Status: Downloaded newer image for gcr.io/deepvariant-docker/deepvariant:0.4.0
root@720aed86585e:/#
root@720aed86585e:/# ./opt/deepvariant/bin/make_examples \
  --mode calling \
  --ref /dv2/input/ucsc.hg19.chr20.unittest.fasta.gz \
  --reads /dv2/input/NA12878_S1.chr20.10_10p1mb.bam \
  --examples output.examples.tfrecord \
  --regions "chr20:10,000,000-10,010,000"

Traceback (most recent call last):
File "/tmp/Bazel.runfiles_PMLPk5/runfiles/genomics/deepvariant/make_examples.py", line 1015, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/tmp/Bazel.runfiles_PMLPk5/runfiles/genomics/deepvariant/make_examples.py", line 966, in main
htslib_gcp_oauth.init()
File "/tmp/Bazel.runfiles_PMLPk5/runfiles/genomics/deepvariant/core/htslib_gcp_oauth.py", line 79, in init
token = cloud_utils.oauth2_token()
File "/tmp/Bazel.runfiles_PMLPk5/runfiles/genomics/deepvariant/core/cloud_utils.py", line 58, in oauth2_token
credentials = oauth2_client.GoogleCredentials.get_application_default()
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 1271, in get_application_default
return GoogleCredentials._get_implicit_credentials()
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 1256, in _get_implicit_credentials
credentials = checker()
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 1187, in _implicit_credentials_from_gce
if not _in_gce_environment():
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 1042, in _in_gce_environment
if NO_GCE_CHECK != 'True' and _detect_gce_environment():
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 999, in _detect_gce_environment
http, _GCE_METADATA_URI, headers=_GCE_HEADERS)
File "/usr/local/lib/python2.7/dist-packages/oauth2client/transport.py", line 282, in request
connection_type=connection_type)
File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1659, in request
(response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1399, in _request
(response, content) = self._conn_request(conn, request_uri, method, body, headers)
File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1355, in _conn_request
response = conn.getresponse()
File "/usr/lib/python2.7/httplib.py", line 1123, in getresponse
raise ResponseNotReady()
httplib.ResponseNotReady
root@720aed86585e:/#

Build and test works, binaries do not

I'm unfamiliar with the Bazel build environment. After a successful build with tensorflow-gpu, all tests passed, but on attempting to run a binary from bazel-bin I see:

2018-02-05 11:14:37.628020: I tensorflow/core/platform/s3/aws_logging.cc:53] Initializing Curl library
Traceback (most recent call last):
File "/home2/bradBuild/deepvariant/bazel-bin/deepvariant/make_examples.runfiles/genomics/deepvariant/make_examples.py", line 1105, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 118, in run
argv = flags.FLAGS(_sys.argv if argv is None else argv, known_only=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/flags.py", line 112, in __call__
return self.__dict__['__wrapped'].__call__(*args, **kwargs)
TypeError: __call__() got an unexpected keyword argument 'known_only'

A quick start guide issue

I was trying to follow the quick start guide. While running the run-prereq.sh file, I got
========== Load config settings.
========== [Sun Jan 28 13:47:50 EST 2018] Stage 'Misc setup' starting
========== [Sun Jan 28 13:47:50 EST 2018] Stage 'Update package list' starting
sudo: apt-get: command not found
Then I realized it is because I am running it on a Mac. Is there a quick fix for this problem?

conda build

Building DeepVariant from source seems very hard. If you could put DeepVariant on Anaconda Cloud, it would be much easier for us to install.

Retrain model on GCP

Hi,

I have some ideas about the pileup-image-creation step and really want to try them out. Since I don't have GPU resources, I followed the Getting Started with GCP guide.

Q1: How could I retrain my modified forked version of DeepVariant using gcloud?

Should I change the last parameter in this command?
gcloud beta compute instances create "${USER}-deepvariant-quickstart"

Q2: How could I prepare the training data?

You mentioned running make_examples in training mode to get the data in the Training DeepVariant models guide. How do I provide the --confident_regions and --truth_variants arguments? These two terms are too biological for me to understand completely.
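For readers following the same guide, a training-mode make_examples invocation might look roughly like the sketch below. All file names are hypothetical placeholders; the flags themselves (--mode, --ref, --reads, --examples, --truth_variants, --confident_regions) are the ones named in the question.

```shell
# A rough sketch; every file name below is a hypothetical placeholder.
# --truth_variants: VCF of known-good variant calls for the training
#   sample (for human data, typically a GIAB truth set).
# --confident_regions: BED file of regions where that truth set is
#   considered reliable, so labels are only drawn from those regions.
python make_examples.py \
  --mode training \
  --ref reference.fasta \
  --reads sample.bam \
  --truth_variants truth.vcf.gz \
  --confident_regions confident.bed \
  --examples training.examples.tfrecord
```

The truth VCF supplies the labels for each candidate variant, while the confident-regions BED fences off the parts of the genome where those labels can be trusted.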

I would really appreciate any help you can provide.

Thank you

include_next "loophole"

I tried to build DeepVariant on a local Ubuntu server with GCP support turned off.
So far I am stuck on an error from build_and_test.sh:

(13:58:12) ERROR: /root/deepvariant/deepvariant/core/python/BUILD:174:1: CLIF wrapping deepvariant/core/python/hts_verbose.clif failed (Exit 4): pyclif failed: error executing command (cd /root/.cache/bazel/_bazel_root/8422bf851bfac3671a35809acde131a7/execroot/genomics && \ exec env - \ bazel-out/host/bin/external/clif/pyclif --modname deepvariant.core.python.hts_verbose -c bazel-out/k8-opt/genfiles/deepvariant/core/python/hts_verbose.cc -g bazel-out/k8-opt/genfiles/deepvariant/core/python/hts_verbose.h -i bazel-out/k8-opt/genfiles/deepvariant/core/python/hts_verbose_init.cc --prepend /root/opt/clif/python/types.h -Iexternal/protobuf_archive -Ibazel-out/k8-opt/genfiles -Ibazel-out/k8-opt/genfiles/external/local_config_python -Iexternal/htslib -Ibazel-out/k8-opt/genfiles/external/htslib -I. -Iexternal/bazel_tools -Ibazel-out/k8-opt/genfiles/external/bazel_tools -Iexternal/htslib/htslib/htslib_1_6 -Ibazel-out/k8-opt/genfiles/external/htslib/htslib/htslib_1_6 -Iexternal/bazel_tools/tools/cpp/gcc3 -Iexternal/clif -Ibazel-out/k8-opt/genfiles/external/clif -Iexternal/local_config_python -Ibazel-out/k8-opt/genfiles/external/protobuf_archive -Iexternal/local_config_python/python_include -Ibazel-out/k8-opt/genfiles/external/local_config_python/python_include -Iexternal/protobuf_archive/src -Ibazel-out/k8-opt/genfiles/external/protobuf_archive/src '-f-Iexternal/protobuf_archive -Ibazel-out/k8-opt/genfiles -Ibazel-out/k8-opt/genfiles/external/local_config_python -Iexternal/htslib -Ibazel-out/k8-opt/genfiles/external/htslib -I. -Iexternal/bazel_tools -Ibazel-out/k8-opt/genfiles/external/bazel_tools -Iexternal/htslib/htslib/htslib_1_6 -Ibazel-out/k8-opt/genfiles/external/htslib/htslib/htslib_1_6 -Iexternal/bazel_tools/tools/cpp/gcc3 -Iexternal/clif -Ibazel-out/k8-opt/genfiles/external/clif -Iexternal/local_config_python -Ibazel-out/k8-opt/genfiles/external/protobuf_archive -Iexternal/local_config_python/python_include -Ibazel-out/k8-opt/genfiles/external/local_config_python/python_include -Iexternal/protobuf_archive/src -Ibazel-out/k8-opt/genfiles/external/protobuf_archive/src -std=c++11' deepvariant/core/python/hts_verbose.clif)
_BackendError: Matcher failed with status 1
In file included from /dev/stdin:1:
In file included from /root/opt/clif/python/types.h:27:
In file included from bazel-out/k8-opt/genfiles/external/local_config_python/python_include/Python.h:19:
/usr/include/limits.h:123:16: fatal error: 'limits.h' file not found
# include_next <limits.h>

Before that, I adjusted the "/deepvariant/third_party/clif.bzl" file to prepend a CLIF header that the same build command had complained about.

clif.bzl: "--prepend", "/root/opt/clif/python/types.h"

The system is a fresh Ubuntu 16.04:
Linux AnnoSpark 4.4.0-103-generic #126-Ubuntu SMP Mon Dec 4 16:23:28 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

My gcc include paths appear as follows:
$gcc -xc -E -v -

/usr/lib/gcc/x86_64-linux-gnu/4.8/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed
/usr/include/x86_64-linux-gnu
/usr/include

CLIF build commit: 6c6d894a112d978bd5abfcab1052c60c5ee365a9

Any help or direction is deeply appreciated.
Dan

import error for GLIBCXX_3.4.21

When I run make_examples.zip, the following error shows up... Please fix it for me ;-)

ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /tmp/Bazel.runfiles_YSzuwd/runfiles/protobuf_archive/python/google/protob
uf/pyext/_message.so)

Shared library from the Exome Case Study requires AVX support -- not all users might have this

Hi,

I was going through the Exome Case Study and noticed that I was not getting any TFRecord-formatted files from the "Run make_examples" step. I then dug deeper, and I'm listing my debugging steps here in case they help others. The gist is that the shared libraries inside the zip files (from the Google Storage location) are built with AVX support, which not every CPU provides. It would be great if they were compiled for a bare-minimum set of CPU features, to guarantee they work on most users' machines. In any case, below is my analysis:

$ PYTHONPATH=. /usr/bin/python deepvariant/make_examples.py --mode calling --ref /home/paul/exome-case-study/input/data/hs37d5.fa.gz --reads /home/paul/exome-case-study/input/data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.bam --examples /home/paul/exome-case-study/output/[email protected] --regions /home/paul/exome-case-study/input/data/refseq.coding_exons.b37.extended50.bed --task 0
Illegal instruction (core dumped)
$
$ mkdir make-examples && cd make-examples
$ unzip ~/exome-case-study/input/bin/make_examples.zip
$ cd runfiles/genomics
$
$ PYTHONPATH=. /usr/bin/python deepvariant/make_examples.py --mode calling --ref /home/paul/exome-case-study/input/data/hs37d5.fa.gz --reads /home/paul/exome-case-study/input/data/151002_7001448_0359_AC7F6GANXX_Sample_HG002-EEogPU_v02-KIT-Av5_AGATGTAC_L008.posiSrt.markDup.bam --examples /home/paul/exome-case-study/output/[email protected] --regions /home/paul/exome-case-study/input/data/refseq.coding_exons.b37.extended50.bed --task 0
Illegal instruction (core dumped)
$

After digging a bit deeper, I noticed that loading the pileup_image_native module was causing this issue. I was curious and looked at the assembly instructions:

$ gdb -ex r --args python -c "from deepvariant.python import pileup_image_native"
Program received signal SIGILL, Illegal instruction.
0x00007ffff5d308b4 in google::protobuf::DescriptorPool::Tables::Tables() ()
   from /home/paul/make-examples/runfiles/genomics/deepvariant/python/../../_solib_k8/libexternal_Sprotobuf_Uarchive_Slibprotobuf.so
(gdb) disassemble $pc,$pc+32
Dump of assembler code from 0x7ffff5d308b4 to 0x7ffff5d308d4:
=> 0x00007ffff5d308b4 <_ZN6google8protobuf14DescriptorPool6TablesC2Ev+676>:     vpxor  %xmm0,%xmm0,%xmm0
   0x00007ffff5d308b8 <_ZN6google8protobuf14DescriptorPool6TablesC2Ev+680>:     lea    0x1b0(%rbx),%rax
   0x00007ffff5d308bf <_ZN6google8protobuf14DescriptorPool6TablesC2Ev+687>:     movl   $0x0,0x1b0(%rbx)
   0x00007ffff5d308c9 <_ZN6google8protobuf14DescriptorPool6TablesC2Ev+697>:     movq   $0x0,0x1b8(%rbx)
End of assembler dump.
(gdb)

I noticed the vpxor instruction, which made me wonder whether my CPU supports AVX, so I proceeded as follows:

$ grep flags /proc/cpuinfo
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm
$

This confirmed for me that I don't have AVX support. It would be great if the libraries shipped with the case studies, which many users will learn from, were compiled for a bare-minimum set of CPU features. I think it would make it easier for many users to adopt this nice pipeline.
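The flags check above can be wrapped in a tiny script to run before trying the prebuilt binaries. This is only a sketch: it greps a /proc/cpuinfo-style flags line for a standalone "avx" token, nothing more.

```shell
#!/bin/sh
# Report whether a /proc/cpuinfo-style flags line advertises AVX.
# Pass the flags as one string; on Linux you could feed it
# "$(grep -m1 '^flags' /proc/cpuinfo)".
has_avx() {
  # -w matches "avx" as a whole word, so "avx2" alone does not count.
  if echo "$1" | grep -qw avx; then
    echo "AVX supported"
  else
    echo "no AVX"
  fi
}

has_avx "fpu vme sse sse2 ssse3 sse4_1 sse4_2 popcnt aes"  # -> no AVX
has_avx "fpu sse sse2 avx avx2 fma"                        # -> AVX supported
```

A flags line like the one quoted above (sse4_2, popcnt, aes, but no avx) would print "no AVX", matching the SIGILL observed on vpxor.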

Thanks,
Paul

Building from source failed tests that import `random`

On Ubuntu 16.04 LTS, I built it with Python 2.7 and TensorFlow 1.4.1. It failed all tests that import random (e.g. deepvariant/deepvariant/core/cloud_utils_test.py). It turned out that random.py imports math, and it mistakenly imports the math.py in /deepvariant/deepvariant/core/ instead of the one from the standard library. Is there a fix?
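The shadowing described above can be reproduced outside DeepVariant. Any directory that lands early on sys.path (e.g. via PYTHONPATH, or the test file's own directory when a test is run in place) is searched before the standard library. A minimal sketch, using a throwaway json.py as the stand-in for core/math.py:

```shell
#!/bin/sh
# Demonstrate stdlib shadowing: a same-named module in a directory that
# precedes the standard library on sys.path wins the import.
tmp=$(mktemp -d)
echo "SHADOWED = True" > "$tmp/json.py"
# Prints the json.py inside $tmp instead of the stdlib json package.
PYTHONPATH="$tmp" python3 -c "import json; print(json.__file__)"
rm -rf "$tmp"
```

In Python 2, adding `from __future__ import absolute_import` to the package's modules helps against implicit relative imports, but for sys.path shadowing like this the usual fixes are renaming the clashing module or running the tests from the repository root rather than from inside the package directory.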

build_and_test.sh fail

Hi, I get a failure when I try to run build_and_test.sh. Will this failure affect normal usage?
image

The log info is:
image

make_examples ERROR

Hello!

image
The picture shows the error; how can I solve this problem?
Thank you!

Evaluation regions for HG002

I notice that the numbers for exome precision and recall are given based on RefSeq. There isn't anything wrong with this approach.

If it is of use, (https://www.nature.com/articles/sdata201625) indicates that the HG002 exome was generated with Agilent SureSelect. We've taken the SureSelect v5 BED (agilent_sureselect_human_all_exon_v5_b37_targets.bed) and intersected it with the GIAB confident regions for our exome evaluations.

*edited: link apparently expired, see comment below for the full capture regions.

Docker run failed: command failed: /tmp/ggp-494856422: line 16: type: gsutil: not found\ndebconf

Dear All,

I am trying to run gcloud alpha genomics but have recurrently encountered the same issues with authentication and docker run.

The bash file for Deep Variant and error logs are below:
BASH file https://storage.googleapis.com/wgs-test-shan/test_samples/deepVariant.sh
YAML file https://storage.googleapis.com/wgs-test-shan/test_samples/deepvariant_wes_pipeline.yaml
LOG file https://storage.googleapis.com/wgs-test-shan/test_samples/runner_logs/ENjW7s2JLBjf3aql19nvyv8BIKeM6-b_FyoPcHJvZHVjdGlvblF1ZXVl-stderr.log

I have contacted the Cloud support center and received the suggestions below. However, this did not fix the problem. What is your suggestion?
https://enterprise.google.com/supportcenter/managecases#Case/001f200001TaEgT/U-14552728

Thank you.
I will appreciate your help.
Best,
Shan

Limit number of CPUs used by DeepVariant

I am using DeepVariant on a local machine with Docker. When I start DeepVariant it uses all the CPU cores of my machine. Since I want to test DeepVariant without time limits while still using the machine for other tasks, I am curious whether it is possible to limit the number of threads or CPUs that DeepVariant uses.
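One knob that is independent of DeepVariant itself: Docker can cap a container's CPU usage. A sketch (the image name and inner command are placeholders, not taken from the docs; the command is assembled and echoed rather than run, since the inputs are fabricated):

```shell
# Docker resource flags that apply regardless of what runs inside:
#   --cpus         limits total CPU time (e.g. 4 cores' worth)
#   --cpuset-cpus  pins the container to specific cores (e.g. "0-3")
NCPUS=4
CMD="docker run --cpus=${NCPUS} gcr.io/deepvariant-docker/deepvariant <deepvariant command>"
echo "${CMD}"
```

With the container capped this way, the processes inside still see all cores but are throttled to the requested share, which is usually enough to keep the machine responsive for other work.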

train_model.zip error

When I run train_model.zip, I get an error:
[screenshot]
Here is my command:

python /leostore/software/deepvariant/bazel-bin/deepvariant/make_examples.zip --dataset_config_pbtxt "/leostore/analysis/development/liteng/deepvariant_test/test_train.config.txt" --start_from_checkpoint inception_v3.ckpt

build_and_test.sh fails on Ubuntu 16.04

The errors (in part):

  • [[ 0 = \1 ]]
  • bazel test -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3 deepvariant/...
    (00:49:22) INFO: Current date is 2018-01-27
    (00:49:22) Loading:
    (00:49:22) Loading: 0 packages loaded
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:96:
    1: First argument of 'load' must be a label and start with either '//', ':', or '@'. Use --incompatible_load_argument_is_label=fals
    e to temporarily disable this check.
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:98:
    1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:100
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:102
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:104
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:106
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:108
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)
    (00:49:22) ERROR: /home//.cache/bazel/_bazel_ravi/74e2f34442216df8489f404815744088/external/com_googlesource_code_re2/BUILD:110
    :1: name 're2_test' is not defined (did you mean 'ios_test'?)

make_examples fails to process data within docker container

I've set up DeepVariant via gsutil as described in https://github.com/google/deepvariant/blob/r0.4/docs/deepvariant-docker.md

When I run the first step, make_examples, inside the docker container as indicated in the reference, it completes without any complaints and terminates after about a second. However, it does not create an output file, does not print any error, and does not even complain if I provide invalid input file names.

The downstream tools call_variants and postprocess_variants behave similarly: they run without error but accept any input arguments (including invalid ones) and fail to create any output.

Any help would be appreciated.

deepvariant does not build from source

I am following the instructions under:
https://github.com/google/deepvariant/blob/r0.4/docs/deepvariant-build-test.md

  1. start GCE image : Ubuntu 16.04 with 100GB

git clone https://github.com/google/deepvariant
cd deepvariant
./build-prereq.sh
./build_and_test.sh

...
++ export DV_INSTALL_GPU_DRIVERS=0
++ DV_INSTALL_GPU_DRIVERS=0
+++ which python
++ export PYTHON_BIN_PATH=/usr/bin/python
++ PYTHON_BIN_PATH=/usr/bin/python
++ export USE_DEFAULT_PYTHON_LIB_PATH=1
++ USE_DEFAULT_PYTHON_LIB_PATH=1
++ export 'DV_COPT_FLAGS=--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ DV_COPT_FLAGS='--copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3'
++ export DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2
++ DV_TENSORFLOW_GIT_SHA=ab0fcaceda001825654424bf18e8a8e0f8d39df2
+ [[ 0 = \1 ]]
+ bazel test -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3 deepvariant/...
(17:54:59) INFO: Current date is 2017-12-22
(17:55:18) ERROR: /home/<mypath>/0fcc5a420905d68918d80793ee59fab4/external/com_goo
glesource_code_re2/BUILD:96:1: First argument of 'load' must be a label and start
 with either '//', ':', or '@'. Us
e --incompatible_load_argument_is_label=false to temporarily disable this check.

...

(17:55:26) ERROR: Analysis of target '//deepvariant/testing:gunit_extras' failed; build aborted: Loading failed
(17:55:26) INFO: Elapsed time: 27.289s
(17:55:26) FAILED: Build did NOT complete successfully (50 packages loaded)
(17:55:26) ERROR: Couldn't start the build. Unable to run tests
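The Bazel error message suggests its own temporary escape hatch for the re2 `load` incompatibility. Appended to the test invocation from build_and_test.sh it would look like the sketch below (the command is echoed rather than run, since no Bazel workspace is assumed here); the longer-term fix is usually to use the Bazel version the release's build docs expect.

```shell
# Same invocation build_and_test.sh runs, plus the flag the error
# message itself proposes as a temporary workaround:
CMD="bazel test -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-O3 \
  --incompatible_load_argument_is_label=false deepvariant/..."
echo "${CMD}"
```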

Unable to run make_examples.zip when using a virtual environment

I would rather not install the DeepVariant dependencies into my global Python environment. When the Python dependencies are installed into a virtual environment, make_examples.zip cannot find TensorFlow. The steps below show the error. Any suggestions?

$ mkvirtualenv -p /usr/bin/python2.7  DeepVariant.2.7
(DeepVariant.2.7) $ cd bin; bash run-prereq.sh; cd -

(DeepVariant.2.7) $ python bin/make_examples.zip   --mode calling     --ref "${REF}"     --reads "${BAM}"   --regions "chr20:10,000,000-10,010,000"   --examples "${OUTPUT_DIR}/examples.tfrecord.gz"
Traceback (most recent call last):
  File "/tmp/Bazel.runfiles_r1oZvM/runfiles/genomics/deepvariant/make_examples.py", line 38, in <module>
    import tensorflow as tf
ImportError: No module named tensorflow
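The zip bundles only DeepVariant's own code; it imports TensorFlow from whichever interpreter launches it, so TensorFlow has to be installed inside the active virtualenv. A sketch (env path arbitrary; shown with python3/venv for illustration even though the report uses Python 2.7, and the install/rerun lines are commented out because the exact TensorFlow pin comes from run-prereq.sh):

```shell
# Create and activate an environment, then confirm which interpreter is
# active; TensorFlow must be pip-installed into this same environment.
python3 -m venv "${TMPDIR:-/tmp}/dv_env"
. "${TMPDIR:-/tmp}/dv_env/bin/activate"
python -c 'import sys; print(sys.prefix)'   # should print the env path
# python -m pip install tensorflow==<version pinned by run-prereq.sh>
# python bin/make_examples.zip ...          # rerun from this activated shell
```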

Is it possible to post-process a de novo assembly using DeepVariant?

The blog post on DeepVariant mentions:

a deep learning technology to reconstruct the true genome sequence from HTS sequencer data with significantly greater accuracy than previous classical methods

How could I reconstruct the true genome sequence? Would an example of this be using something like GATK with the generated VCF file to correct the assembly?

E.g. https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_fasta_FastaAlternateReferenceMaker.php
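For what it's worth, bcftools offers an equivalent of that GATK walker. A sketch (file names are placeholders and the command is echoed rather than executed, since no data is assumed here; this is one possible route, not something the DeepVariant docs prescribe):

```shell
# bcftools consensus fills the same role as GATK's
# FastaAlternateReferenceMaker: it applies VCF calls to a FASTA reference.
# The VCF must be bgzip-compressed and indexed first (bgzip + tabix).
CMD="bcftools consensus -f assembly.fa calls.vcf.gz > corrected.fa"
echo "${CMD}"
```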

Binaries are not compatible with the models

I tried running DeepVariant-0.5.0+cl-183695032 on the quickstart dataset with the following command:

python call_variants.zip --dataset_config_pbtxt a.pbtxt --checkpoint DeepVariant-inception_v3-0.5.0+cl-182548131.data-wgs_standard/model.ckpt.index

However it exits with the following error:

KeyError: 'InceptionV3/Logits/Conv2d_1c_1x1/weights'
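One thing worth double-checking, independent of any binary/model version mismatch: TensorFlow checkpoints are addressed by their shared prefix, not by the .index file. A sketch using the path from the report (the run line is commented out since the data isn't present here):

```shell
# Given model.ckpt.index / model.ckpt.meta / model.ckpt.data-*, the value
# passed to --checkpoint should be the shared prefix, with no extension:
CKPT="DeepVariant-inception_v3-0.5.0+cl-182548131.data-wgs_standard/model.ckpt"
echo "${CKPT}"
# python call_variants.zip --dataset_config_pbtxt a.pbtxt --checkpoint "${CKPT}"
```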
