
bamgineer's Introduction

Bamgineer

Introduces simulated allele-specific copy number variants into exome and targeted sequence data sets

Author

Soroush Samadian

Maintainer

Suluxan Mohanraj [email protected]

Description

Bamgineer is a tool for introducing user-defined, haplotype-phased, allele-specific copy number variations (CNVs) into an existing Binary Alignment Map (BAM) file, with demonstrated applicability to simulating somatic cancer CNVs in phased whole-genome sequencing datasets. It does this by introducing new read pairs sampled from existing reads, thereby retaining biases of the original data such as local coverage, strand bias, and insert size. As input, Bamgineer requires a BAM file and a list of non-overlapping genomic coordinates at which to introduce allele-specific gains and losses. We implemented parallelization of the Bamgineer algorithm for both standalone and high-performance computing cluster environments, significantly improving the scalability of the algorithm. Bamgineer has been extensively tested on phased, whole-genome sequencing samples.
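
Because the CNV coordinates are required to be non-overlapping, it can help to check a BED file before starting a long run. The following is a minimal sketch, not part of Bamgineer: the function name `find_overlaps` is hypothetical, and it assumes only that the first three whitespace-separated columns are chromosome, start, and end, with half-open intervals as in standard BED.

```python
from collections import defaultdict


def find_overlaps(bed_lines):
    """Return (chrom, earlier_interval, later_interval) for each overlap found.

    Only the first three columns (chrom, start, end) are read; extra columns
    are ignored. Intervals are treated as half-open, so [100, 200) and
    [200, 300) do not overlap.
    """
    by_chrom = defaultdict(list)
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and anything without three columns.
        if len(fields) < 3 or line.startswith("#"):
            continue
        by_chrom[fields[0]].append((int(fields[1]), int(fields[2])))

    overlaps = []
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        furthest = None  # interval with the largest end seen so far
        for start, end in intervals:
            if furthest is not None and start < furthest[1]:
                overlaps.append((chrom, furthest, (start, end)))
            if furthest is None or end > furthest[1]:
                furthest = (start, end)
    return overlaps
```

An empty result means no two intervals on the same chromosome overlap; any returned tuple names the chromosome and the two offending intervals.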

Contact

If you have any questions about the package, please feel free to email Suluxan at [email protected].

Running example Bamgineer workflow with Docker

Please see bamgineer/docs/input_preparation for preparing your own files

CHR21 bam files for NA12878 10X can be found here:

https://drive.google.com/file/d/1km9gupGi7W6aUE9XqBsiamGnwTpDrGpZ/view?usp=sharing

Tested with Docker version 17.05.0-ce, build 89658be

docker pull suluxan/bamgineer-v2
git clone https://github.com/pughlab/bamgineer.git
cd bamgineer/docker-example
# download google drive file (link above) and move into this directory
tar xjf splitbams.tar.bz2 
# start of bamgineer command
docker run --rm \
-v $(pwd):/src \
-it suluxan/bamgineer-v2 \
-config /src/inputs/config.cfg \
-splitbamdir src/splitbams \
-cnv_bed /src/inputs/cnv.bed \
-vcf src/inputs/normal_het.vcf \
-exons src/inputs/exons.bed \
-outbam tumour.bam \
-results src/outputs \
-cancertype LUAC1 

Running without Docker:

    usage: python bamgineer/src/simulate.py 
  
    arguments:
       -config CONFIG_FILE,        configuration file including paths to tools/executables
       -splitdir SPLIT_BAMS_DIR,   input bam split by chromosomes
       -cnv_bed CNV_BED_FILE,      bed file containing non-overlapping CNVs following template
       -vcf FILTERED_VCF_FILE,     phased vcf file with indels removed
       -exons COORDINATES_BED,     bed file of whole genome coordinates or exons
       -outbam OUTPUT_BAM,         bam file name for output
       -results RESULTS_DIR,       output directory for final simulated bam results
       -cancertype CANCER_TYPE,    cancer type/acronym (OPTIONAL)
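
For scripted runs, the argument list above can be assembled programmatically. This is a sketch under stated assumptions rather than an official wrapper: the function name `build_simulate_command` is hypothetical, and the split-BAM flag is left configurable because the Docker example uses `-splitbamdir` while the usage text lists `-splitdir`.

```python
import shlex


def build_simulate_command(config, split_dir, cnv_bed, vcf, exons, outbam,
                           results, cancertype=None,
                           script="bamgineer/src/simulate.py",
                           split_flag="-splitbamdir"):
    """Build the argument list for simulate.py from the documented flags.

    split_flag is an assumption: pass "-splitdir" instead if that is what
    your copy of simulate.py accepts.
    """
    cmd = [
        "python", script,
        "-config", config,
        split_flag, split_dir,
        "-cnv_bed", cnv_bed,
        "-vcf", vcf,
        "-exons", exons,
        "-outbam", outbam,
        "-results", results,
    ]
    # -cancertype is documented as optional, so add it only when given.
    if cancertype is not None:
        cmd += ["-cancertype", cancertype]
    return cmd


def as_shell_string(cmd):
    """Render the argument list as a single, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in cmd)
```

The returned list can be passed directly to `subprocess.run`, or printed with `as_shell_string` for inspection before running.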

Prerequisites

Note: many dependencies are outdated and will be updated as Bamgineer development continues. The Docker image and Docker workflow outlined above are the easiest way to run the tool.

General NGS tools

Samtools (version 1.2)
Bedtools
VCFtools
BamUtil

Python packages

pysam (version 0.8.4)
Note: the latest version of pysam (0.9.0) is not backward compatible with Samtools 1.2
PyVCF
pybedtools

bamgineer's People

Contributors

quevedor2, soroushsamadian, suluxan

bamgineer's Issues

Error in running run_example1.sh

Hi,
I am getting an error while running run_example1.sh with the example data set. I recreated exons.bed as explained in the input_preparation.md file and modified the config file with the appropriate paths.

File structure:

|-- inputs
| |-- cnv.bed
| |-- config.cfg
| |-- config_ORI.cfg
| |-- exons.bed
| |-- exons_ORI.bed
| |-- normal_het.vcf
| `-- normal_het.vcf.gz
|-- outputs
| |-- cnv_dir
| | `-- gain.bed
| |-- finalbams
| |-- haplotypedir
| | |-- bedtool.log
| | |-- chr21_exons_in_roigain.bed
| | |-- chr21_het_snpgain.bed
| | |-- exons_in_roigain.bed
| | `-- het_snpgain.bed
| |-- logs
| | `-- debug.log
| |-- phasedvcfdir
| | |-- hap1_het_filtered.bed
| | |-- hap1_het_filtered.log
| | |-- hap1_het_filtered.recode.vcf
| | |-- hap1_het.vcf
| | |-- hap2_het_filtered.bed
| | |-- hap2_het_filtered.log
| | |-- hap2_het_filtered.recode.vcf
| | |-- hap2_het.vcf
| | |-- normal_het_phased.log
| | |-- normal_het_phased.vcf.gz
| | |-- normal_het_phased.warnings
| | `-- PHASED.BED
| `-- tmpbams
|-- scripts
| |-- beagle.log
| `-- run_example1.sh
|-- splitbams
| |-- chr21.bam
| |-- chr21.bam.bai
| |-- chr21.byname.bam
| |-- chr22.bam
| |-- chr22.bam.bai
| `-- chr22.byname.bam

Here is the output:

/opt/installers/bamgineer/examples/outputs
___ phasing vcf file ___
beagle.09Nov15.d2a.jar
Copyright (C) 2014-2015 Brian L. Browning
Enter "java -jar beagle.jar" for a summary of command line arguments.
Start time: 01:04 PM IST on 25 Apr 2018

Command line: java -Xmx3641m -jar beagle.jar
gt=/opt/installers/bamgineer/examples/inputs/normal_het.vcf.gz
out=/opt/installers/bamgineer/examples/outputs/phasedvcfdir/normal_het_phased

No genetic map is specified: using 1 cM = 1 Mb

reference samples: 0
target samples: 1

Window 1 [ chr21:9414112-48119669 ]
target markers: 10225

Starting burn-in iterations

Window=1 Iteration=1
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=2
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=3
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=4
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=5
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=6
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=7
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=8
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=9
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Window=1 Iteration=10
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

Starting phasing iterations

Window=1 Iteration=11
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

states/marker: 1.0

Window=1 Iteration=12
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

states/marker: 1.0

Window=1 Iteration=13
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

states/marker: 1.0

Window=1 Iteration=14
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

states/marker: 1.0

Window=1 Iteration=15
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 2.000 mean count/edge: 1

states/marker: 1.0

Window 2 [ chr22:16066867-51239065 ]
target markers: 3455

Starting burn-in iterations

Window=2 Iteration=1
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=2
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=3
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=4
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=5
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=6
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=7
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=8
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=9
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Window=2 Iteration=10
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

Starting phasing iterations

Window=2 Iteration=11
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

states/marker: 1.0

Window=2 Iteration=12
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

states/marker: 1.0

Window=2 Iteration=13
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

states/marker: 1.0

Window=2 Iteration=14
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

states/marker: 1.0

Window=2 Iteration=15
Time for building model: 0 seconds
Time for sampling (singles): 0 seconds
DAG statistics
mean edges/level: 2 max edges/level: 2
mean edges/node: 1.999 mean count/edge: 1

states/marker: 1.0

Number of markers: 13680
Total time for building model: 1 second
Total time for sampling: 1 second
Total run time: 2 seconds

End time: 01:04 PM IST on 25 Apr 2018
beagle.09Nov15.d2a.jar finished

VCFtools - v0.1.12a
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
--vcf /opt/installers/bamgineer/examples/outputs/phasedvcfdir/hap1_het.vcf
--thin 50
--out /opt/installers/bamgineer/examples/outputs/phasedvcfdir/hap1_het_filtered
--recode

After filtering, kept 0 out of 0 Individuals
Outputting VCF file...
After filtering, kept 5324 out of a possible 6773 Sites
Run Time = 0.00 seconds

VCFtools - v0.1.12a
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
--vcf /opt/installers/bamgineer/examples/outputs/phasedvcfdir/hap2_het.vcf
--thin 50
--out /opt/installers/bamgineer/examples/outputs/phasedvcfdir/hap2_het_filtered
--recode

After filtering, kept 0 out of 0 Individuals
Outputting VCF file...
After filtering, kept 5424 out of a possible 6905 Sites
Run Time = 0.00 seconds
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "/opt/installers/bamgineer/examples/outputs/tmpbams/chr21_r
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/opt/installers/bamgineer/src/helpers/handlers.py", line 76, in receive
    record = self.queue.get(True, self.polltime)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
TypeError: ('__init__() takes exactly 2 arguments (1 given)', <class 'pysam.utils.SamtoolsError'>, ())
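
The samtools messages in the log above report that the BAM end-of-file marker is absent. A quick way to test a BAM file for truncation is to look for the 28-byte empty BGZF block that the SAM/BAM specification places at the end of every complete file. This is a minimal diagnostic sketch, not part of Bamgineer; the function name `has_bgzf_eof` is hypothetical.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file,
# as defined in the SAM/BAM format specification.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file at path ends with the BGZF EOF marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the trailing bytes; the rest of the file is not parsed.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read(len(BGZF_EOF)) == BGZF_EOF
```

A `False` result for a file in the temporary BAM directory would be consistent with an intermediate step having been interrupted or having written an empty file, although the check alone cannot say which step failed.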

How to simulate homozygous deletion

Hi suluxan, may I know the best way to simulate a homozygous deletion?
Should the entry be written as shown below in the CNV interest BED file?

chr1 123 124 0

Missing chromosomes?

Hi there, thank you for this initiative, it is vital for the community. I have run into a problem. I tried to simulate a very simple situation (loss of chr3 & chr8, gain of chr4 & chr9). Rudimentary logR analysis shows the events have been engineered into the target BAM. However, chr10 and chr15-22 also appear to have been lost ...

image

All ideas welcome!

Thread error running bamgineer

I am getting an error when running the bamgineer tool. It seems to be related to the multiprocessing module. I also tried an older version of the multiprocessing module (0.70.4, as suggested on online forums for this Python error, which seems to be common), but still had no luck getting bamgineer to work. Could you suggest a solution?
Please find the error log below:

___ generating phased bed ___
___ filtering bed file columns for amp4AABB47974300_tmp2.bed ___
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/mnt/DataDisk/NGS_tools/bamgineer/src/helpers/handlers.py", line 76, in receive
    record = self.queue.get(True, self.polltime)
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
TypeError: __init__() takes exactly 2 arguments (1 given)
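
One plausible reading of the TypeError above, which the source does not confirm, is that a worker process raised an exception whose constructor requires an argument, and the exception could not be rebuilt after being sent through the multiprocessing queue. The sketch below reproduces that general failure mode in Python 3 with two hypothetical exception classes; it mimics how pickle rebuilds an exception (by calling the class with the arguments returned from `__reduce__`) and is not Bamgineer or pysam code.

```python
class BrokenError(Exception):
    """Hypothetical: requires two arguments but forwards only one."""

    def __init__(self, message, code):
        super().__init__(message)  # self.args becomes (message,)
        self.code = code


class RebuildableError(Exception):
    """Hypothetical: forwards every required argument to Exception."""

    def __init__(self, message, code):
        super().__init__(message, code)  # self.args keeps both values
        self.code = code


def can_rebuild(exc):
    """Try to recreate exc the way unpickling would, from __reduce__()."""
    reduced = exc.__reduce__()
    factory, args = reduced[0], reduced[1]
    try:
        factory(*args)
    except TypeError:
        # The constructor was called with fewer arguments than it needs.
        return False
    return True
```

If this is what is happening, the TypeError hides the real problem: the original exception raised in the worker (for example a failed samtools call) is the error worth tracking down, and the earlier lines of the log are the place to look.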

Demo codes and files?

  1. bamgineer/demo_codes/wes_example.sh is a blank file

  2. I was also wondering if there are any demo files to try out bamgineer with? The directory bamgineer/tcga_experiments/inputs/ only contains bed files of genes. As per the scripts used in "bamgineer/tcga_experiments/scripts/generation/T1", I can not find any of the files mentioned: "inputs/vcfs/normal_varscan.vcf", "inputs/bams/Normal.bam" or "inputs/beds/exons.bed"

Documentation

Hi,
It would be great to have more detailed documentation, in particular:

  • Format of bed file with various CNVs, how one should format it for hom/het duplications, deletions, etc.
  • How one can model specific VAFs for inserted mutations?
  • Does one need to supply only BAM files split by @SQ, or do these files need to be sorted by name, or does the tool require both types of files (chr#.bam and chr#.byname.bam)?

Since the software is fairly resource demanding, figuring out these details by trial and error is quite time consuming.
Thanks!
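
The example directory listing in the run_example1.sh issue shows three files per chromosome in splitbams: chr#.bam, chr#.bam.bai, and chr#.byname.bam. Assuming that layout is what the tool expects (the source does not state this explicitly), a small pre-flight check can report which files are absent before a long run. The function name `missing_split_files` and the suffix list are illustrative assumptions.

```python
import os

# Suffixes taken from the example splitbams listing; whether all three are
# strictly required is an assumption, not something the README states.
EXPECTED_SUFFIXES = (".bam", ".bam.bai", ".byname.bam")


def missing_split_files(split_dir, chromosomes):
    """Return the expected per-chromosome file names absent from split_dir."""
    present = set(os.listdir(split_dir))
    missing = []
    for chrom in chromosomes:
        for suffix in EXPECTED_SUFFIXES:
            name = chrom + suffix
            if name not in present:
                missing.append(name)
    return missing
```

Calling `missing_split_files("splitbams", ["chr21", "chr22"])` returns an empty list when every expected file is present.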

Param "--configFile"

Hi there,
I came across your CNV simulation algorithm in bioRxiv. I'm realising I might be an early adopter, but the tool seems to do exactly what I'm looking for, so I wanted to give it a try.
Trying to use it, I noticed the required parameter "-c/--configFile". I can guess what it does (specifying paths to common tools such as samtools), however, I could not find an example of an actual config file compatible with your code. Could you perhaps provide me with an example that I can adapt to my environment?

Thanks and best wishes
Simon

Failed execution for demo data

I am trying to run the demo example on my local computer. The first attempt ran for two days and then failed. For the second attempt, I used a smaller region:
chr21 47000000 47500000 AAB 3
And got the following:
docker-example % docker run --rm \
-v $(pwd):/src \
-it suluxan/bamgineer-v2 \
-config /src/inputs/config.cfg \
-splitbamdir src/splitbams \
-cnv_bed /src/inputs/cnv.new.bed \
-vcf src/inputs/normal_het.vcf \
-exons src/inputs/exons.bed \
-outbam tumour.bam \
-results src/outputs \
-cancertype LUAC1
('OPTIONS:', Namespace(cancerType='LUAC1', chrList=None, cnvBed='/src/inputs/cnv.new.bed', configfile='/src/inputs/config.cfg', ctDNA=False, exons='src/inputs/exons.bed', inbamFile=None, outBamFile='tumour.bam', outputDir='src/outputs', phase=False, singleXY=False, splitbams='src/splitbams', vcf='src/inputs/normal_het.vcf'))
src/outputs
___ generating phased bed ___
___ filtering bed file columns for gainAAB47000000_tmp2.bed ___
___ extracting roi bams ___
___ splitting original bam into hap1 and hap2 ___
___ re-pairing hap1 bam reads ___
___ removing repaired duplicates ___
finding positions of the duplicate reads in the file...
sorting 49824 end pairs... done in 10 ms
sorting 626441 single ends (among them 622276 unmatched pairs)... done in 26 ms
collecting virtual offsets of duplicate reads... done in 5 ms
found 10279 duplicates, sorting the list... done in 0 ms
collected list of positions in 0 min 5 sec
removing duplicates...
total time elapsed: 0 min 8 sec
___ re-pairing hap2 bam reads ___
___ removing repaired duplicates ___
finding positions of the duplicate reads in the file...
sorting 49383 end pairs... done in 11 ms
sorting 620508 single ends (among them 616064 unmatched pairs)... done in 28 ms
collecting virtual offsets of duplicate reads... done in 6 ms
found 10196 duplicates, sorting the list... done in 0 ms
collected list of positions in 0 min 5 sec
removing duplicates...
total time elapsed: 0 min 8 sec
sambamba-merge: Error reading BGZF block starting from offset 15187326: wrong BGZF magic
sambamba-merge: Error reading BGZF block starting from offset 15167255: wrong BGZF magic
___ removing hap1 merged normal duplicates ___
finding positions of the duplicate reads in the file...
sorting 84909 end pairs... done in 11 ms
sorting 573252 single ends (among them 572403 unmatched pairs)... done in 22 ms
collecting virtual offsets of duplicate reads... done in 7 ms
found 18344 duplicates, sorting the list... done in 0 ms
collected list of positions in 0 min 5 sec
removing duplicates...
total time elapsed: 0 min 8 sec
___ removing hap2 merged normal duplicates ___
finding positions of the duplicate reads in the file...
sorting 85143 end pairs... done in 10 ms
sorting 569723 single ends (among them 568810 unmatched pairs)... done in 28 ms
collecting virtual offsets of duplicate reads... done in 6 ms
found 18035 duplicates, sorting the list... done in 1 ms
collected list of positions in 0 min 5 sec
removing duplicates...
total time elapsed: 0 min 8 sec
___ extracting non-roi bams ___
samtools: writing to standard output failed: Broken pipe
samtools: error closing standard output: -1
Killed

Example failed

I installed bamgineer and all its dependencies on macOS (v10.13.4). I get the following error when I run the example:

sed: 1: "outputs/phasedvcfdir/ha ...": invalid command code o
sed: 1: "outputs/phasedvcfdir/ha ...": invalid command code o
awk: syntax error at source line 1
 context is
	($1 ~ "chr"){print $0 >> $1 >>>  "_exons_in_roigain" <<< 
awk: illegal statement at source line 1
awk: illegal statement at source line 1
sh: outputs/haplotypedir/gain_tmp.bed: No such file or directory
sh: outputs/haplotypedir/het_snpgain.bed: No such file or directory
Traceback (most recent call last):
  File "../../src/simulate.py", line 69, in <module>
    main(args)
  File "../../src/simulate.py", line 39, in main
    run_pipeline(results_path)
  File "/Users/vamin/projects/CNV/bamgineer/src/methods.py", line 587, in run_pipeline
    initialize_pipeline(phase_path, haplotype_path, cnv_path)
  File "/Users/vamin/projects/CNV/bamgineer/src/methods.py", line 98, in initialize_pipeline
    splitBed(hetsnpbed, '_het_snp' + str(event))
  File "/Users/vamin/projects/CNV/bamgineer/src/utils.py", line 296, in splitBed
    os.chdir(path)
OSError: [Errno 2] No such file or directory: 'outputs/haplotypedir'
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/anaconda3/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/anaconda3/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/vamin/projects/CNV/bamgineer/src/helpers/handlers.py", line 76, in receive
    record = self.queue.get(True, self.polltime)
  File "/anaconda3/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
EOFError

Here is the full output

outputs
 ___ phasing vcf file ___ 
beagle.09Nov15.d2a.jar
Copyright (C) 2014-2015 Brian L. Browning
Enter "java -jar beagle.jar" for a summary of command line arguments.
Start time: 01:25 PM EDT on 16 Apr 2018

Command line: java -Xmx3641m -jar beagle.jar
  gt=normal_het.vcf.gz
  out=outputs/phasedvcfdir/normal_het_phased

No genetic map is specified: using 1 cM = 1 Mb

reference samples:       0
target samples:          1

Window 1 [ chr21:9414112-48119669 ]
target markers:      10225

Starting burn-in iterations

Window=1 Iteration=1
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=2
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=3
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=4
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=5
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=6
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=7
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=8
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=9
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Window=1 Iteration=10
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

Starting phasing iterations

Window=1 Iteration=11
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

states/marker:    1.0

Window=1 Iteration=12
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

states/marker:    1.0

Window=1 Iteration=13
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

states/marker:    1.0

Window=1 Iteration=14
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

states/marker:    1.0

Window=1 Iteration=15
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  2.000  mean count/edge: 1

states/marker:    1.0

Window 2 [ chr22:16066867-51239065 ]
target markers:       3455

Starting burn-in iterations

Window=2 Iteration=1
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=2
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=3
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=4
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=5
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=6
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=7
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=8
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=9
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Window=2 Iteration=10
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

Starting phasing iterations

Window=2 Iteration=11
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

states/marker:    1.0

Window=2 Iteration=12
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

states/marker:    1.0

Window=2 Iteration=13
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

states/marker:    1.0

Window=2 Iteration=14
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

states/marker:    1.0

Window=2 Iteration=15
Time for building model:         0 seconds
Time for sampling (singles):     0 seconds
DAG statistics
mean edges/level: 2      max edges/level: 2
mean edges/node:  1.999  mean count/edge: 1

states/marker:    1.0

Number of markers:               13680
Total time for building model: 1 second
Total time for sampling:       1 second
Total run time:                1 second

End time: 01:25 PM EDT on 16 Apr 2018
beagle.09Nov15.d2a.jar finished

VCFtools - 0.1.15
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
	--vcf outputs/phasedvcfdir/hap1_het.vcf
	--thin 50
	--out outputs/phasedvcfdir/hap1_het_filtered
	--recode

After filtering, kept 0 out of 0 Individuals
Outputting VCF file...
After filtering, kept 5324 out of a possible 6773 Sites
Run Time = 0.00 seconds

VCFtools - 0.1.15
(C) Adam Auton and Anthony Marcketta 2009

Parameters as interpreted:
	--vcf outputs/phasedvcfdir/hap2_het.vcf
	--thin 50
	--out outputs/phasedvcfdir/hap2_het_filtered
	--recode

After filtering, kept 0 out of 0 Individuals
Outputting VCF file...
After filtering, kept 5424 out of a possible 6905 Sites
Run Time = 0.00 seconds
sed: 1: "outputs/phasedvcfdir/ha ...": invalid command code o
sed: 1: "outputs/phasedvcfdir/ha ...": invalid command code o
awk: syntax error at source line 1
 context is
	($1 ~ "chr"){print $0 >> $1 >>>  "_exons_in_roigain" <<< 
awk: illegal statement at source line 1
awk: illegal statement at source line 1
sh: outputs/haplotypedir/gain_tmp.bed: No such file or directory
sh: outputs/haplotypedir/het_snpgain.bed: No such file or directory
Traceback (most recent call last):
  File "../../src/simulate.py", line 69, in <module>
    main(args)
  File "../../src/simulate.py", line 39, in main
    run_pipeline(results_path)
  File "/Users/vamin/projects/CNV/bamgineer/src/methods.py", line 587, in run_pipeline
    initialize_pipeline(phase_path, haplotype_path, cnv_path)
  File "/Users/vamin/projects/CNV/bamgineer/src/methods.py", line 98, in initialize_pipeline
    splitBed(hetsnpbed, '_het_snp' + str(event))
  File "/Users/vamin/projects/CNV/bamgineer/src/utils.py", line 296, in splitBed
    os.chdir(path)
OSError: [Errno 2] No such file or directory: 'outputs/haplotypedir'
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/anaconda3/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/anaconda3/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/Users/vamin/projects/CNV/bamgineer/src/helpers/handlers.py", line 76, in receive
    record = self.queue.get(True, self.polltime)
  File "/anaconda3/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
EOFError
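
The awk fragment in the error above appears to append each BED line to a file named after its first column, which the macOS awk rejects. As a hedged, platform-independent alternative, the same per-chromosome split can be sketched in Python. The function name `split_bed_by_chrom` is hypothetical, and the output suffix is left to the caller because the full awk command is truncated in the log.

```python
import os


def split_bed_by_chrom(bed_path, out_dir, suffix):
    """Append each line of bed_path to <out_dir>/<chrom><suffix>.

    Mirrors the awk condition ($1 ~ "chr"): only lines whose first
    whitespace-separated field contains "chr" are written.
    Returns the sorted list of output paths that were written to.
    """
    handles = {}
    try:
        with open(bed_path) as bed:
            for line in bed:
                fields = line.split()
                if not fields or "chr" not in fields[0]:
                    continue
                out_path = os.path.join(out_dir, fields[0] + suffix)
                if out_path not in handles:
                    # Append mode matches the ">>" redirection in awk.
                    handles[out_path] = open(out_path, "a")
                handles[out_path].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)
```

For example, a suffix such as `"_exons_in_roigain.bed"` would produce file names like the chr21_exons_in_roigain.bed seen in the example output listing, though that naming is inferred from the listing rather than from the tool's source.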

Runnable source code?

Hi,

is there any runnable version of bamgineer to be found?

The one that is hosted here clearly cannot work, since e.g. it does not even process the input bam files from the arguments parser and several called methods e.g. params.GetInputBam() have never been defined anywhere.

A question

How to do allele specific analysis by using bisulfite sequencing data?
