pipeline's People

Contributors

charlesylin, dpolaski, godloved, huihuifan, jdimatteo, semenko, vsoch, youngcomputation

pipeline's Issues

global name 'maketrans' is not defined

Hello,

When running
python bamToGFF.py -e 200 -r -m 1 -b myBam.bam -i myGFF.gff -o myOutput.gff
I get a

NameError: global name 'maketrans' is not defined

Can you please check?
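
For reference, in Python 2 maketrans lives in the string module and has to be imported before use, while in Python 3 it is a static method on str. The sketch below shows a portable lookup. It is only an illustration of that difference: the helper reverse_complement is hypothetical and is not taken from bamToGFF.py, and whether a missing import is the actual cause of the error above is an assumption.

```python
# Portable lookup of maketrans. Assumption: the script was written for
# Python 2, where maketrans must be imported from the string module.
try:
    from string import maketrans  # Python 2
except ImportError:
    maketrans = str.maketrans  # Python 3


def reverse_complement(sequence):
    """Hypothetical helper (not from bamToGFF.py) that uses maketrans."""
    table = maketrans("ACGTacgt", "TGCAtgca")
    # Complement each base, then reverse the string.
    return sequence.translate(table)[::-1]
```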

testing notifications

please disregard this issue, John is just testing if he gets an email notification when others create issues

procedure to add a reference genome

Dear all,

I'd really like to thank you for this amazing tool combo.

I was trying to use the rose pipeline script but unfortunately I found out that there is no support for hg38 genome.

I wanted to double-check a possible workaround for adding the genome myself. Besides adding the option in the Python script, I think I need to add a couple of files to the annotation/ folder:

  • hg38.chrom.sizes
  • hg19_refseq.ucsc (this is just the GTF file from UCSC hgTable's tool with gene annotation, right? Are there any particular filters or parameters I need to know of?)
  • ucsc_chromSize.txt (should I just append hg38.chrom.sizes to this file?)

Am I missing something else?

thank you very much!
Matteo

bamliquidator_regions should better handle a zero region file

Currently a zero region file results in errors like the following:

HDF5-DIAG: Error detected in HDF5 (1.8.4-patch1) thread 140106874668864:
  #000: ../../../src/H5S.c line 1335 in H5Screate_simple(): zero sized dimension for non-unlimited dimension
    major: Invalid arguments to routine
    minor: Bad value

We should probably display a user-friendly error and return a non-zero (error) exit code. Another option is to display a warning and return a zero (success) exit code. Or we could just quietly return a zero (success) exit code with no warning and assume the user intentionally used a zero region file.

Also, test similar error handling with bamliquidator_bins.
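
A minimal sketch of the first option (friendly error plus non-zero exit code), done as a check on the Python side before the regions executable is called. The function names are hypothetical, and treating lines that start with "#" as comments is an assumption about the region file format.

```python
import sys


def count_regions(region_file_path):
    """Count lines that are neither blank nor comments.

    Assumption: comment lines start with '#'.
    """
    count = 0
    with open(region_file_path) as region_file:
        for line in region_file:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


def check_region_file(region_file_path):
    """Return a process exit code: 0 if regions are present, 1 otherwise."""
    if count_regions(region_file_path) == 0:
        sys.stderr.write(
            "ERROR: region file %s contains no regions\n" % region_file_path)
        return 1
    return 0
```

A caller could pass the result to sys.exit() so that the HDF5 error is never reached for an empty file.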

bamliquidator should support outputting files into an existing directory

As requested by Charles, bamliquidator should support outputting into an already existing directory even if that results in overwriting files. Currently if the directory exists it just aborts.

Test Plan:

  1. Use existing empty directory
    1. mkdir test_existing
    2. bamliquidator_batch.py -o test_existing some.bam
    3. verify run completes with no error nor warnings
  2. Overwrite prior run
    1. verify that directories test_overwrite and test_comparison don't exist
    2. bamliquidator_batch.py -f -o test_overwrite first.bam
    3. bamliquidator_batch.py -f -o test_overwrite second.bam
    4. bamliquidator_batch.py -f -o test_comparison second.bam
    5. verify that test_overwrite and test_comparison contents exactly match (this verifies that all results from the first run were completely overwritten)
    6. repeat with a region file and -m
  3. Regression: new directory created if not already existing
    1. verify directory "test_regression" doesn't exist
    2. bamliquidator_batch.py -o test_regression some.bam
    3. verify test_regression directory created and run completes with no error nor warnings
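
The "contents exactly match" check in step 2.5 could be scripted roughly as follows. This is a sketch using only the standard library; the function names are hypothetical. It compares files byte for byte, which may be too strict if any output file embeds something run-specific such as a timestamp (whether that happens here is not known).

```python
import filecmp
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    paths = set()
    for directory, _subdirs, file_names in os.walk(root):
        for file_name in file_names:
            full_path = os.path.join(directory, file_name)
            paths.add(os.path.relpath(full_path, root))
    return paths


def directories_match(first, second):
    """True if both trees hold the same files with identical bytes."""
    first_files = relative_files(first)
    if first_files != relative_files(second):
        return False
    # shallow=False forces a content comparison instead of an os.stat check.
    return all(
        filecmp.cmp(os.path.join(first, path),
                    os.path.join(second, path),
                    shallow=False)
        for path in first_files)
```

For the test plan above, directories_match("test_overwrite", "test_comparison") would be expected to return True.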

This change may now cause some confusing behavior if some files are overwritten but some aren't. This may happen when a prior run has different output arguments (e.g. if the prior run included matrix.gff and the subsequent run did not) or different chromosomes than a subsequent run. I'm not going to try and do anything complicated to handle this -- I'll just trust the user to be aware of what they are doing when re-using a prior run's output directory.

enhance reporting of parsing errors

e.g. if a region file has a bad start/stop column an error like the following will be generated, which isn't very clear.

we should ideally report the region line and the column that couldn't be parsed. other possible parsing errors could occur on the python to C++ interface as well. any fix applied to region liquidation should also be applied to bin liquidation.

jd-mba:pipeline jdimatteo$ ./bamliquidator_batch.py -r test.gff -o test.liquid test.bam
Liquidating test.bam (file 1 of 1)
ERROR   Unhandled exception: bad lexical cast: source type value could not be interpreted as target
Liquidation completed: 0.012737 seconds
Traceback (most recent call last):
  File "./bamliquidator_batch.py", line 508, in <module>
    main()
  File "./bamliquidator_batch.py", line 488, in main
    not args.quiet)
  File "./bamliquidator_batch.py", line 312, in __init__
    self.batch(extension, sense)
  File "./bamliquidator_batch.py", line 203, in batch
    raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: /Users/jdimatteo/DanaFarber/pipeline/bamliquidator_internal/bamliquidator_regions failed with exit code 4
jd-mba:pipeline jdimatteo$ 
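
A rough sketch of the kind of message this asks for, written in Python for illustration (the real parsing happens in the C++ executables). It assumes a tab-separated GFF line with start and stop in columns 4 and 5; the exception class and function name are hypothetical.

```python
class RegionParseError(ValueError):
    """Raised when a region line cannot be parsed."""


def parse_gff_region(line, line_number):
    """Return (chromosome, start, stop) or raise a descriptive error.

    Assumption: tab-separated GFF with start/stop in columns 4 and 5.
    """
    columns = line.rstrip("\n").split("\t")
    if len(columns) < 5:
        raise RegionParseError(
            "line %d: expected at least 5 tab-separated columns, found %d"
            % (line_number, len(columns)))
    parsed = {}
    for name, index in (("start", 3), ("stop", 4)):
        try:
            parsed[name] = int(columns[index])
        except ValueError:
            # Report the 1-based column number and the offending text.
            raise RegionParseError(
                "line %d, column %d (%s): cannot parse %r as an integer"
                % (line_number, index + 1, name, columns[index]))
    return columns[0], parsed["start"], parsed["stop"]
```

With this, a bad start value on line 7 would produce a message naming "line 7, column 4 (start)" instead of "bad lexical cast".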

Update Install Instructions when bamliquidator 1.2 released

update install instructions to remove pip bamliquidatorbatch and note that bokeh must be installed separately if needed: sudo pip install bokeh==0.4.4 "openpyxl>=1.6.1,<2.0.0"
remove pip bamliquidatorbatch package so that it can no longer be installed

bamliquidator fails when bam files processed in current directory

I think this is because the cell type is an empty string which probably breaks some logic:

jdimatteo@ip-172-31-43-98:$ bamliquidator_batch tmp/04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
Liquidating tmp/04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam (file 1 of 1)
Liquidation completed: 22.162895 seconds, 29059326 reads, 1.308493 millions of reads per second
Cell Types: tmp
Normalizing and calculating percentiles for cell type tmp
Indexing normalized counts
Plotting
-- skipping plotting chrM because not enough bins (only 1)
-- skipping plotting chrM because not enough bins (only 1)
Summarizing
Post liquidation processing took 2.762700 seconds
jdimatteo@ip-172-31-43-98:$ mv tmp/04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam* .
jdimatteo@ip-172-31-43-98:$ rm -rf output/
jdimatteo@ip-172-31-43-98:$ bamliquidator_batch 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
Liquidating 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam (file 1 of 1)
Liquidation completed: 22.146732 seconds, 29059326 reads, 1.309448 millions of reads per second
Cell Types:
Normalizing and calculating percentiles for cell type
Indexing normalized counts
Plotting
-- skipping plotting chr1 because not enough bins (only 0)
-- skipping plotting chr2 because not enough bins (only 0)
-- skipping plotting chr3 because not enough bins (only 0)
-- skipping plotting chr4 because not enough bins (only 0)
-- skipping plotting chr5 because not enough bins (only 0)
-- skipping plotting chr6 because not enough bins (only 0)
-- skipping plotting chr7 because not enough bins (only 0)
-- skipping plotting chr8 because not enough bins (only 0)
-- skipping plotting chr9 because not enough bins (only 0)
-- skipping plotting chr10 because not enough bins (only 0)
-- skipping plotting chr11 because not enough bins (only 0)
-- skipping plotting chr12 because not enough bins (only 0)
-- skipping plotting chr13 because not enough bins (only 0)
-- skipping plotting chr14 because not enough bins (only 0)
-- skipping plotting chr15 because not enough bins (only 0)
-- skipping plotting chr16 because not enough bins (only 0)
-- skipping plotting chr17 because not enough bins (only 0)
-- skipping plotting chr18 because not enough bins (only 0)
-- skipping plotting chr19 because not enough bins (only 0)
-- skipping plotting chr20 because not enough bins (only 0)
-- skipping plotting chr21 because not enough bins (only 0)
-- skipping plotting chr22 because not enough bins (only 0)
-- skipping plotting chrX because not enough bins (only 0)
-- skipping plotting chrY because not enough bins (only 0)
-- skipping plotting chrM because not enough bins (only 0)
-- skipping plotting chr1 because not enough bins (only 0)
-- skipping plotting chr2 because not enough bins (only 0)
-- skipping plotting chr3 because not enough bins (only 0)
-- skipping plotting chr4 because not enough bins (only 0)
-- skipping plotting chr5 because not enough bins (only 0)
-- skipping plotting chr6 because not enough bins (only 0)
-- skipping plotting chr7 because not enough bins (only 0)
-- skipping plotting chr8 because not enough bins (only 0)
-- skipping plotting chr9 because not enough bins (only 0)
-- skipping plotting chr10 because not enough bins (only 0)
-- skipping plotting chr11 because not enough bins (only 0)
-- skipping plotting chr12 because not enough bins (only 0)
-- skipping plotting chr13 because not enough bins (only 0)
-- skipping plotting chr14 because not enough bins (only 0)
-- skipping plotting chr15 because not enough bins (only 0)
-- skipping plotting chr16 because not enough bins (only 0)
-- skipping plotting chr17 because not enough bins (only 0)
-- skipping plotting chr18 because not enough bins (only 0)
-- skipping plotting chr19 because not enough bins (only 0)
-- skipping plotting chr20 because not enough bins (only 0)
-- skipping plotting chr21 because not enough bins (only 0)
-- skipping plotting chr22 because not enough bins (only 0)
-- skipping plotting chrX because not enough bins (only 0)
-- skipping plotting chrY because not enough bins (only 0)
-- skipping plotting chrM because not enough bins (only 0)
Summarizing
Traceback (most recent call last):
File "/usr/local/bin/bamliquidator_batch", line 9, in <module>
load_entry_point('BamLiquidatorBatch==0.9.3', 'console_scripts', 'bamliquidator_batch')()
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 440, in main
not args.quiet)
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 223, in __init__
self.batch(extension, sense)
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 199, in batch
self.normalize()
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 250, in normalize
nps.normalize_plot_and_summarize(counts_file, self.output_directory, self.bin_size, self.skip_plot)
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/normalize_plot_and_summarize.py", line 377, in normalize_plot_and_summarize
populate_summary(summary, normalized_counts, chromosome)
File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/normalize_plot_and_summarize.py", line 327, in populate_summary
summary.row["avg_cell_type_percentile"] = summed_cell_type_percentiles_by_bin[bin_number] / len(cell_types)
ZeroDivisionError: float division by zero
jdimatteo@ip-172-31-43-98:~$
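
Judging from the logs above, the cell type looks like the name of the directory given in the bam path ("tmp" for tmp/..., empty for a bare file name). A possible workaround, sketched under that assumption, is to resolve the path first and fall back to a placeholder when no directory name is available. The function name and the "unknown" placeholder are hypothetical.

```python
import os


def cell_type_for(bam_file_path):
    """Derive a non-empty cell type from a bam file path.

    Assumption: the cell type is the name of the directory that
    contains the bam file.
    """
    parent_directory = os.path.dirname(os.path.abspath(bam_file_path))
    # basename is empty when the file sits directly under the root.
    return os.path.basename(parent_directory) or "unknown"
```

Note that resolving the path changes what is reported for relative paths such as ../some.bam: the real directory name is used rather than "..".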

provide option for a different normalization method

The current normalization method doesn't work well when comparing normalized counts across different bams with different read lengths. We probably shouldn't change it for legacy compatibility reasons, but we may want to provide a way to override it with a more robust option.

This probably isn't very important, since bamliquidator_batch usually serves as a low level tool used by higher level frameworks, and those higher levels could probably trivially calculate a different normalization value.

Installation issue: -lbam is error, but hdf5/serial isn't a thing at all?

Installing on CentOS 7.4 in a py2.7 virtenv. I've installed samtools 0.1.19.

I've added the path to the samtools source to CPLUS_INCLUDE_PATH, because what it needs is sam.h; that got me past the first error. (/source/samtools/ because #include <samtools/sam.h>)

Now I run make and I see the below. There are two issues. Seemingly it can't find bam.h:

/usr/bin/ld: cannot find -lbam

Which is in the samtools dir....so I now add the full path to CPLUS_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/source/samtools:/source/samtools/samtools/

But that doesn't work with the same error. Now I also note that the hdf5 details are explicitly linked. I have no
/usr/lib/x86_64 - this being CentOS, I'm not surprised.

But I can't find anything related to hdf5-serial anywhere. There are allusions to it online in tickets (usually when it's missing), but I can't see anything about where or how I might find it.

I have hdf5 and hdf5-devel installed (version 1.8.12) but I see no mention of hdf5 and serial anywhere. Where can I find that to install it?

(c2c4865)[root@head-node bamliquidator_internal]# make
g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I/usr/include/hdf5/serial -I/usr/local/include -c bamliquidator.m.cpp
g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I/usr/include/hdf5/serial -I/usr/local/include -pthread -c bamliquidator.cpp
bamliquidator.cpp: In function ‘std::vector<double> liquidate(const samfile_t*, const bam_index_t*, const string&, unsigned int, unsigned int, char, unsigned int, unsigned int)’:
bamliquidator.cpp:202:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   for(int i=0; i<spnum; i++)
                  ^
bamliquidator.cpp:213:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     for(int i=0; i<spnum; i++)
                    ^
bamliquidator.cpp:215:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if(item.start > stopArr[i]) continue;
                                ^
bamliquidator.cpp:216:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       if(item.stop < startArr[i]) break;
                                ^
g++ -O3 -g -Wall -L/usr/lib/x86_64-linux-gnu/hdf5/serial -o bamliquidator bamliquidator.o bamliquidator.m.o -lbam -lz -lpthread 
/usr/bin/ld: cannot find -lbam
collect2: error: ld returned 1 exit status
make: *** [bamliquidator] Error 1

bamliquidator rounding error

Request from Charles:

substitute the chunk in bamliquidator.cpp around line 200 with this:

/* fetch bed items for a region and compute density                                                                                 
only deal with coord, so use generic item                                                                                           
*/
int startArr[spnum], stopArr[spnum];
int pieceLength = (int)(stop-start) / spnum;

for(i=0; i<spnum; i++)
{
        startArr[i] = (int)(start + pieceLength*i);
        stopArr[i] = (int)(start + pieceLength*(i+1));
}
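
To make the arithmetic in the requested chunk easier to check, here it is transcribed to Python. This is only an illustration of the integer division, assuming start <= stop and spnum > 0 (for non-negative values Python's // matches C's int division). The function name is hypothetical.

```python
def piece_bounds(start, stop, spnum):
    """Mirror the startArr/stopArr computation from the C chunk."""
    piece_length = (stop - start) // spnum
    starts = [start + piece_length * i for i in range(spnum)]
    stops = [start + piece_length * (i + 1) for i in range(spnum)]
    return starts, stops
```

For example, piece_bounds(0, 10, 3) gives starts [0, 3, 6] and stops [3, 6, 9]: every piece has the same integer length, and the remainder at the end of the region (here the last base) is not covered by any piece.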

Sequence Data File Hot Spot Analysis

Find the number of data files that correspond to each bin of the genome. Start with the 1 million base pair bins Charles defined, and possibly repeat with smaller bins.

There are hundreds of data files with sequence data, each a couple GB in size. Run bamliquidator on all the data files for each bin to find the number of data files that correspond to each bin. Store the results in a table and generate a heat map to visualize the results. Benchmark how long it takes for the analysis to run for a given sequence length.

Many bioinformaticians have anecdotally observed a seeming hot spot tendency, where certain areas of the genome tend to have more sequence data associated with them. No one has done a large scale analysis before to confirm whether or not this is really true. The null hypothesis is that there are no hot spots.

networkScatter.R not found

I'm trying to run coltron and I'm running into the error:
Fatal error: cannot open file 'networkScatter.R': No such file or directory
I've looked for this file, but can't seem to find it. I was wondering if maybe I was just missing something?

Operation Timed out on ROSE2

An unusual error is clogging ROSE2 during bam mapping. The log file reads:

MAPPING TO THE FOLLOWING BAMS: (edited out paths per Berkley's request)

OPERATION TIMED OUT. FILE matrix.txt NOT FOUND

Has anyone seen this error before, and solved it?
I'm running ROSE2 on a cluster node with --mem=121g --cpus-per-task=4

follow up on poor bokeh 0.5 performance

On a t2-micro ec2 instance, post liquidation processing takes over 10x longer with bokeh 0.5 compared with bokeh 0.4.4.

First I should just update the bamliquidator pip package to use bokeh 0.4.4.

Secondly I should communicate with the bokeh team to either fix bokeh or change how I'm using bokeh.


with bokeh 0.5:

ubuntu@ip-172-31-28-230:~$ rm -rf output && bamliquidator_batch 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
Liquidating ../04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam (file 1 of 1, 15:11:01)
Liquidation completed: 99.480256 seconds, 29059326 reads, 0.291515 millions of reads per second
Cell Types: ..
Normalizing and calculating percentiles for cell type ..
Indexing normalized counts
Plotting
-- skipping plotting chrM because not enough bins (only 1)
-- skipping plotting chrM because not enough bins (only 1)
Summarizing
Post liquidation processing took 29.339265 seconds
ubuntu@ip-172-31-28-230:~$ 

with bokeh 0.4.4 (sudo pip install -I bokeh==0.4.4):

ubuntu@ip-172-31-28-230:~$ rm -rf output && bamliquidator_batch 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
Liquidating 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam (file 1 of 1, 16:55:47)
Liquidation completed: 31.431156 seconds, 29059326 reads, 0.922651 millions of reads per second
Cell Types: ..
Normalizing and calculating percentiles for cell type ..
Indexing normalized counts
Plotting
-- skipping plotting chrM because not enough bins (only 1)
-- skipping plotting chrM because not enough bins (only 1)
Summarizing
Post liquidation processing took 2.600450 seconds
ubuntu@ip-172-31-28-230:~$

Error in bamToGFF.py

Hi,
I am trying to use bamToGFF.py but when run it I get the following error:

mapping to GFF and making a matrix with fixed bin number
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
bamToGFF.py in <module>()
    424
    425 if __name__ == "__main__":
--> 426     main()

bamToGFF.py in main()
    407         elif options.matrix:
    408             print('mapping to GFF and making a matrix with fixed bin number')
--> 409             newGFF = mapBamToGFF(bamFile,gffFile,options.sense,options.unique,int(options.extension),options.floor,options.density,options.rpm,25,None,options.matrix,False,options.jxn)
    410
    411         else:

bamToGFF.py in mapBamToGFF(bamFile, gff, sense, unique, extension, floor, density, rpm, binSize, clusterGram, matrix, raw, includeJxnReads)
    110         unique = False
    111         if rpm:
--> 112             MMR= round(float(bam.getTotalReads('mapped'))/1000000,4)
    113         else:
    114             MMR = 1

TypeError: float() argument must be a string or a number

Command used:

module unload python
module load python/2.7
python bamToGFF.py -b ${BAM_FILE} -i ${GFF_FILE} -o ${OUTPUT} -e 400 -m 25 -r 

Files to reproduce this error can be found here

Can you please help me fix this error?

Thanks,
Gunjan

Unhandled exception: pthread_attr_setstacksize: Invalid argument

I've compiled and installed bamliquidator, but my initial tests are giving an ominous error:

$ bamliquidator_batch.py bam_links
Liquidating bam_links/D1JF9ACXX_MN23.bam (file 1 of 2)
ERROR Unhandled exception: pthread_attr_setstacksize: Invalid argument
Liquidation completed: 0.022893 seconds, 163940971 reads, 7120.021162 millions of reads per second
Traceback (most recent call last):
File "/usr/local/apps/bamliquidator/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 532, in <module>
main()
File "/usr/local/apps/bamliquidator/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 503, in main
not args.quiet, args.number_of_threads, args.black_list)
File "/usr/local/apps/bamliquidator/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 256, in __init__
self.batch(extension, sense)
File "/usr/local/apps/bamliquidator/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 205, in batch
raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: /usr/local/apps/bamliquidator/pipeline/bamliquidator_internal/bamliquidator_bins failed with exit code 4

bamliquidator_region cannot handle non-canonical chromosome IDs

Lewyn reported a problem where a region with a 20 character chromosome name with "random" in the name (e.g. chr1_12345678_random) resulted in several problems:

  1. the region failed to be liquidated
  2. instead of aborting or entering a value like zero or nan, a random value was recorded
  3. the chromosome name was truncated and concatenated with the region in the warning output

The random value item (2) was already reported in bug issue #11 and fixed with merge #12 in May.

The other two items, however, still occur with the latest version and need to be fixed: the region should be liquidated, and any error reporting should neither truncate the name nor concatenate it with another field. We should add automated tests and fix this bug.

travis tests failing

Have the travis tests ever passed? Who is maintaining them?

Right now just seems like noise, e.g. "All checks have failed" posted on #72 . These test failures have nothing to do with the changes.

bamliquidator_batch ValueError: cannot set WRITEABLE flag to True of this array

When trying to run bamliquidator_batch on Ubuntu 18.04, I get the following error:

Traceback (most recent call last):
File "/usr/bin/bamliquidator_batch", line 11, in <module>
load_entry_point('BamLiquidatorBatch==1.3.8', 'console_scripts', 'bamliquidator_batch')()
File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 503, in main
not args.quiet, args.number_of_threads, args.black_list)
File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 254, in __init__
include_cpp_warnings_in_stderr, counts_file_path, number_of_threads)
File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 133, in __init__
self.bam_file_paths = bam_file_paths_with_no_file_entries(file_names, self.bam_file_paths)
File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 61, in bam_file_paths_with_no_file_entries
if basename(bam_file_path) not in file_names:
File "/home/raphaelb/.local/lib/python2.7/site-packages/six.py", line 566, in next
return type(self).next(self)
File "/usr/lib/python2.7/dist-packages/tables/vlarray.py", line 624, in next
self.listarr = self.read(self._startb, self._stopb, self._step)
File "/usr/lib/python2.7/dist-packages/tables/vlarray.py", line 811, in read
listarr = self._read_array(start, stop, step)
File "tables/hdf5extension.pyx", line 2106, in tables.hdf5extension.VLArray._read_array
ValueError: cannot set WRITEABLE flag to True of this array
Closing remaining open files:output/counts.h5...done

I installed bamliquidator with apt-get. The output of the commands specified in the troubleshooting sections are:

bamliquidator_batch 1.3.8
Version: 1.3.8-0ppa1~bionic
/usr/bin/bamliquidator_batch
/usr/bin/bamliquidator_bins
/usr/bin/bamliquidator_regions

pdf loading time

@jdimatteo @bradnerComputation

I forgot to tell you, but if you leave the hockey-stick running in your browser for a minute or so, the PDFs begin to load quite quickly. I'm going to guess that this is because there is "residual" javascript still loading in the browser, so it doesn't respond very quickly when you request a PDF load.

I'm not sure if you have ideas around this? If not I can just set a loading bar so the page won't be active until it loads.

Pypi packages are dropping python2 support

While trying to install on Centos7 and installing the required python packages:
sudo pip install bokeh==0.9.3 "openpyxl>=1.6.1,<2.0.0" tables unittest2 scipy
the installation failed. tables, numpy, scipy, and pandas no longer support python2. Others will likely follow.

bamliquidator_batch (calling bamliquidator_regions) fails on valid .bed file

Not sure what's up here. Going to try to debug more.

semenko@nucleosome:~/git/pipeline/bamliquidator_internal/bamliquidatorbatch$ ./bamliquidator_batch.py -r ~/bed-analysis-quick-trash/peak-union-with-100bp-window.bed -e 200 -o /ramcache/test-baml/ ~/bed-analysis-quick-trash/raw-data/
Liquidating ~/bed-analysis-quick-trash/raw-data/gf-gd/GF6-TCR-GD_S10.bt2.srt.rmdup.bam (file 1 of 23)
ERROR   Unhandled exception: error parsing /[snip]/bed-analysis-quick-trash/peak-union-with-100bp-window.bed
Liquidation completed: 0.014897 seconds
Traceback (most recent call last):
  File "./bamliquidator_batch.py", line 508, in <module>
    main()
  File "./bamliquidator_batch.py", line 488, in main
    not args.quiet)
  File "./bamliquidator_batch.py", line 312, in __init__
    self.batch(extension, sense)
  File "./bamliquidator_batch.py", line 203, in batch
    raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: /home/semenko/git/pipeline/bamliquidator_internal/bamliquidator_regions failed with exit code 4

Support arbitrary bin sizes in hot spot analysis

Enhance work done in #1 to support 10K, 1K, and other arbitrary bin sizes. (Currently the bin sizes are hard coded to 100K). The bin size should be specified as a single variable or be given as an argument to the bin counting script, so that it is easy to change the bin size in the future.

The run time for the 100K bin analysis wasn't particularly fast, so other changes should probably be made at the same time:

  • call bamliquidator.c function in a C/C++ loop to remove overhead of bash
  • use hdf5 instead of mysql

Test Plan

  1. Compare with prior version
    1. run bin counting with 100K bin size for 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
    2. verify results exactly match counts from prior version in #1 . Either randomly check several counts or write a script to check every count.
  2. Verify smaller bins add up to larger bins
    1. run bin counting with 10K bin sizes for 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
    2. verify that for some randomly selected 100K bins that the 10 corresponding 10K bins add up to exactly the same value
  3. Verify smaller bin graphs appear visually similar to larger bins
    1. Run bin counting with 1K bin sizes 04032013_D1L57ACXX_4.TTAGGC.hg18.bwt.sorted.bam
    2. Visually compare graphs from 1K bins and 100K bins. They should be similar, with 1K showing more detail.
    3. If possible, find an area that appears different between the 1K and 100K bin graphs, and confirm that it is accurate (e.g. by leveraging some known biology background or opening the bam file in a viewer like Tablet)
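
Step 2 of the test plan (smaller bins adding up to larger bins) could be automated along these lines. The sketch assumes the counts for one chromosome have already been read into plain Python lists in bin order; how they are loaded from the counts storage is left out, and the function names are hypothetical.

```python
def aggregate_bins(small_counts, factor):
    """Sum each consecutive group of `factor` small bins into one large bin."""
    return [sum(small_counts[i:i + factor])
            for i in range(0, len(small_counts), factor)]


def mismatched_bins(small_counts, large_counts, factor=10):
    """Return indexes of large bins whose count differs from the summed small bins."""
    aggregated = aggregate_bins(small_counts, factor)
    if len(aggregated) != len(large_counts):
        raise ValueError("bin lists do not cover the same range")
    return [index
            for index, (summed, expected) in enumerate(zip(aggregated, large_counts))
            if summed != expected]
```

An empty result from mismatched_bins(counts_10k, counts_100k) would mean every 100K bin equals the sum of its ten 10K bins, which checks all bins rather than a random sample.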

bamliquidator_regions error

Hi,
I'm trying to use ROSE2_main.py script on my data. But at one point the script is returning the following error.

Liquidation completed: 0.117097 seconds
Traceback (most recent call last):
  File "/scripts/tools/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 532, in <module>
    main()
  File "/scripts/tools/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 512, in main
    not args.quiet, args.number_of_threads)
  File "/scripts/tools/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 314, in __init__
    self.batch(extension, sense)
  File "/scripts/tools/pipeline/bamliquidator_internal/bamliquidatorbatch/bamliquidator_batch.py", line 205, in batch
    raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: /scripts/tools/pipeline/bamliquidator_internal/bamliquidator_regions failed with exit code -6

I tried the bamliquidator_batch.py script separately, and it also returns the same error. It would be really helpful if you could suggest some fixes for this issue.

Thank you.

bamliquidator_batch.py --match_bamToGFF garbage in matrix.gff

As reported by Charles, bamliquidator_batch.py --match_bamToGFF results in garbage in matrix.gff when bamliquidator can't find data for a region.

The original example to reproduce (/raider/temp/bamliquidator_batch_testing/testRun.sh) no longer appears to work since the region file is no longer present.

I'll post steps to reproduce the error soon.

bamliquidator doesn't handle long file names

See 6/18/2014 email from Charles for example leading to KeyError at:

File "/usr/local/lib/python2.7/dist-packages/bamliquidatorbatch/normalize_plot_and_summarize.py", line 183, in normalize

    total_count = file_to_count[file_name]

installation on MAC

Hello,

thank you for the wonderful tools you made!
I tried installing bamliquidator on MAC OS X10.10 but I kept getting the error below. Any advice would be deeply appreciated.

MacBook-Pro-2:bamliquidator_internal nizarjacquesbahlis$ make
g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -c bamliquidator.m.cpp
g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -pthread -c bamliquidator.cpp
g++ -O3 -g -Wall -o bamliquidator bamliquidator.o bamliquidator.m.o -lbam -lz -lpthread
Undefined symbols for architecture x86_64:
"_bam_destroy1", referenced from:
_bam_fetch in libbam.a(bam.o)
_sampileup in libbam.a(sam.o)
"_bam_hdr_destroy", referenced from:
_samclose in libbam.a(sam.o)
"_bam_init1", referenced from:
_bam_fetch in libbam.a(bam.o)
_sampileup in libbam.a(sam.o)
"_bam_name2id", referenced from:
_bam_parse_region in libbam.a(bam_aux.o)
"_bam_plp_destroy", referenced from:
_bam_plbuf_destroy in libbam.a(bam_plbuf.o)
"_bam_plp_init", referenced from:
_bam_plbuf_init in libbam.a(bam_plbuf.o)
"_bam_plp_next", referenced from:
_bam_plbuf_push in libbam.a(bam_plbuf.o)
"_bam_plp_push", referenced from:
_bam_plbuf_push in libbam.a(bam_plbuf.o)
"_bam_plp_reset", referenced from:
_bam_plbuf_reset in libbam.a(bam_plbuf.o)
"_bam_read1", referenced from:
_bam_fetch in libbam.a(bam.o)
"_bgzf_mt", referenced from:
_samthreads in libbam.a(sam.o)
"_fai_build", referenced from:
_samfaipath in libbam.a(sam.o)
"_hts_close", referenced from:
_samclose in libbam.a(sam.o)
"_hts_get_format", referenced from:
_samthreads in libbam.a(sam.o)
_samopen in libbam.a(sam.o)
"_hts_idx_destroy", referenced from:
liquidate(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, unsigned int, char, unsigned int, unsigned int) in bamliquidator.o
"_hts_idx_load", referenced from:
liquidate(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, unsigned int, char, unsigned int, unsigned int) in bamliquidator.o
"_hts_itr_destroy", referenced from:
_bam_fetch in libbam.a(bam.o)
"_hts_itr_next", referenced from:
_bam_fetch in libbam.a(bam.o)
"_hts_open", referenced from:
_samopen in libbam.a(sam.o)
"_hts_parse_reg", referenced from:
_bam_parse_region in libbam.a(bam_aux.o)
"_hts_set_fai_filename", referenced from:
_samopen in libbam.a(sam.o)
"_hts_verbose", referenced from:
_samopen in libbam.a(sam.o)
_samfaipath in libbam.a(sam.o)
"_sam_format1", referenced from:
_bam_format1 in libbam.a(bam.o)
_bam_view1 in libbam.a(bam.o)
"_sam_hdr_read", referenced from:
_samopen in libbam.a(sam.o)
"_sam_hdr_write", referenced from:
_samopen in libbam.a(sam.o)
"_sam_itr_queryi", referenced from:
_bam_fetch in libbam.a(bam.o)
"_sam_read1", referenced from:
_sampileup in libbam.a(sam.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [bamliquidator] Error 1

Questions about installation and output

The installation seems to have succeeded. Running make gives a couple of warnings like

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (columns.size() > name_column)

but otherwise finishes without error, and re-running make gives "nothing to be done for all".

When I then run python bamliquidatorbatch/test.py, I get a list of errors, but it also seems to finish OK:

(c2c4865)[root@vmpr-res-utils bamliquidator_internal]# python bamliquidatorbatch/test.py 
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
/config/binaries/bamliquidator/c2c4865/pipeline/bamliquidator_internal/bamliquidatorbatch/normalize_plot_and_summarize.py:245: RuntimeWarning: invalid value encountered in true_divide
  percentiles = (stats.rankdata(normalized_count_list) - 1) / (len(normalized_count_list)-1) * 100
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: bb196b99-4a58-4101-9e42-fa6e85dcf111
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: 0077eb96-20e6-42d8-a1e8-374225a7b234
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: a3f49d52-d8f1-46e1-a917-6354df750de3
.[samopen] SAM header is present: 1 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: f2820684-62a3-4545-aaa9-f26aaffb4853
.[samopen] SAM header is present: 1 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: f9a53df4-88ef-451e-9d3e-c4a3a7273bc0
.[samopen] SAM header is present: 2 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: 71a41d72-cd89-4910-9b59-be73ce4b3784
.[samopen] SAM header is present: 3 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: e4151f76-757a-470e-8477-b0da65e657b3
.[samopen] SAM header is present: 3 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: 9f0e6d71-a4d8-4e71-adc4-efbf6fdc5337
.[samopen] SAM header is present: 1 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: d5ef915b-02cf-4018-900f-c848789f892e
.[samopen] SAM header is present: 1 sequences.
ERROR   Bin size cannot be zero
.[samopen] SAM header is present: 1 sequences.
ERROR:/config/binaries/bamliquidator/c2c4865/lib/python2.7/site-packages/bokeh/validation/check.pyc:W-1002 (EMPTY_LAYOUT): Layout has no children: VBox, ViewModel:VBox, ref _id: 28b0a72b-b8c5-464a-880e-32c0248d5e2d
.[samopen] SAM header is present: 1 sequences.
WARNING No valid regions detected in /tmp/blt_JBCZN1/empty.gff
WARNING No valid regions detected in /tmp/blt_JBCZN1/empty.gff
.[samopen] SAM header is present: 1 sequences.
WARNING Excluding invalid region on line 1: bam file key 1 chr1 region1 60 -> 70 . 0
WARNING No valid regions detected in /tmp/blt_tc4K9_/single.gff
.[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
ERROR   Unhandled exception: Not enough columns parsing line 1 '' of /tmp/blt_kEPPRo/single.bed
ERROR   Unhandled exception: Not enough columns parsing line 1 'chr1' of /tmp/blt_kEPPRo/single.bed
ERROR   Unhandled exception: Not enough columns parsing line 1 'chr1    1' of /tmp/blt_kEPPRo/single.bed
.[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
WARNING Truncated region on line 1 from 'rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr' to 'rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr'
.[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
WARNING Truncated region on line 1 from 'rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr' to 'rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr'
.[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
.[samopen] SAM header is present: 1 sequences.
WARNING Excluding invalid region on line 1: bam file key 1 chr10 region1 60 -> 70 . 0
WARNING No valid regions detected in /tmp/blt_oG3vu5/single.gff
.
----------------------------------------------------------------------
Ran 22 tests in 2.447s

OK
Closing remaining open files:/tmp/blt_ByyONM/output/counts.h5...done

Is that what I should be expecting?

add version number to counts.h5 file

Store the bamliquidator_batch.py version in the counts.h5 file. This will make it easier to determine the format of the bamliquidator_batch tables.

Wrong binsize calculation accumulates offset errors with increasing bin numbers

There is a crucial bug in the bin size calculation in the bamliquidator.cpp file.
Since the tool needs a GFF file as input, the length of a region is "end-start+1" and not "end-start" as in BED format. When requesting 60 bins for a region, the 60th bin is already offset by 60 bp, so the very end of the region is missed. Further, a region's end position would then be start+binsize-1 if you stick with the GFF format. However, I did not check the rest of the code. If it uses half-open intervals, start+binsize would of course be correct.

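The difference between the two length formulas can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in bamliquidator.cpp: the function names are hypothetical, and the second function only shows what happens if a BED-style length is applied to 1-based, inclusive GFF coordinates.

```python
def bin_edges_gff(start, end, n_bins):
    """Split a 1-based, inclusive GFF region [start, end] into n_bins bins.

    Returns a list of (bin_start, bin_end) pairs, also 1-based and inclusive.
    """
    length = end - start + 1  # inclusive coordinates: both endpoints count
    if n_bins <= 0 or n_bins > length:
        raise ValueError("n_bins must be between 1 and the region length")
    edges = []
    for i in range(n_bins):
        # Integer arithmetic spreads any remainder across the bins instead of
        # dropping it from the end of the region.
        lo = start + (i * length) // n_bins
        hi = start + ((i + 1) * length) // n_bins - 1
        edges.append((lo, hi))
    return edges


def bin_edges_off_by_one(start, end, n_bins):
    """Same split, but with the length taken as end - start (the BED formula)."""
    length = end - start  # one base short for inclusive coordinates
    size = length // n_bins
    return [(start + i * size, start + (i + 1) * size - 1) for i in range(n_bins)]
```

For a region from 1 to 600 split into 60 bins, the inclusive version ends its last bin at (591, 600), while the BED-style length ends it at (532, 540), which is 60 bases short of the region end.
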
real examples with data and output

I am an administrator installing bamliquidator on behalf of a user, so I don't have enough background information to test the installation once it's completed. It would be great if you could provide some sample data and some sample commands in the readme that will produce expected output. That way an admin like myself can install this software and test it without knowing a lot about omics etc. Thanks for considering this request!

Installation failure in MacOS

Hi, I am trying to install bamliquidator on macOS Mojave 10.14.6, but encountered the following:

(base) breastadms-MBP:bamliquidator_internal xiaoyongfu$ brew install jdimatteo/science/[email protected]
==> Tapping jdimatteo/science
Cloning into '/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science'...
remote: Enumerating objects: 35046, done.
remote: Total 35046 (delta 0), reused 0 (delta 0), pack-reused 35046
Receiving objects: 100% (35046/35046), 9.24 MiB | 11.81 MiB/s, done.
Resolving deltas: 100% (22278/22278), done.
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/des.rb:15

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/oce.rb
unknown or unsupported macOS version: :snow_leopard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/beetl.rb
beetl: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/fastml.rb
unknown or unsupported macOS version: :mavericks
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/trinity.rb:27

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/libfolia.rb
libfolia: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/siril.rb
siril: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/ome-xml.rb
ome-xml: uninitialized constant #Class:0x00007feda900c7a8::MinimumMacOSRequirement
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/astral.rb:14

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/stacks.rb
stacks: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/ds9.rb
unknown or unsupported macOS version: :lion
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/dynare.rb
dynare: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/scram.rb
scram: "cxx14" is not a recognized standard
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/bbtools.rb:16

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/nexusformat.rb
nexusformat: Calling BuildOptions#cxx11? is disabled! There is no replacement.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core):
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/nexusformat.rb:22

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/itensor.rb
itensor: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/timbl.rb
timbl: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/wopr.rb
wopr: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/nixio.rb
nixio: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/trilinos.rb
trilinos: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/geda-gaf.rb
geda-gaf: undefined method `devel' for #Class:0x00007fedd98cdd08
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/adol-c.rb
adol-c: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/discovardenovo.rb
discovardenovo: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/bitseq.rb
bitseq: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/galfit.rb
unknown or unsupported macOS version: :leopard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/ome-files.rb
ome-files: uninitialized constant #Class:0x00007feda915a088::MinimumMacOSRequirement
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/alembic.rb
alembic: "cxx11" is not a recognized standard
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/paxtools.rb:7

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/ticcutils.rb
ticcutils: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/rna-star.rb
rna-star: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/elemental.rb
elemental: undefined method `devel' for #<Class:0x00007fedd98dc790>
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/opengrm-ngram.rb
opengrm-ngram: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/osgearth.rb
osgearth: uninitialized constant #<Class:0x00007feda90d14b8>::MinimumMacOSRequirement
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/sailfish.rb
sailfish: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/discovar.rb
discovar: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/flexbar.rb
flexbar: "cxx14" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/ome-common.rb
ome-common: uninitialized constant #<Class:0x00007feda932af20>::MinimumMacOSRequirement
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/openni.rb
openni: undefined method `devel' for #Class:0x00007feda93b3c08
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/radx.rb
radx: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/dealii.rb
dealii: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/vigra.rb
vigra: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/quaff.rb
quaff: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/insighttoolkit.rb
insighttoolkit: Calling BuildOptions#cxx11? is disabled! There is no replacement.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core):
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/insighttoolkit.rb:22

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/mlpack.rb
mlpack: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/opengrm-thrax.rb
opengrm-thrax: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/symengine.rb
symengine: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/analysis.rb
analysis: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/mantaflow.rb
mantaflow: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/butterflow.rb
unknown or unsupported macOS version: :mavericks
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/dgtal.rb
dgtal: "cxx11" is not a recognized standard
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/cytoscape.rb:17

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/fplll.rb
fplll: "cxx11" is not a recognized standard
Warning: Calling depends_on :java is deprecated! Use "depends_on "openjdk@11", "depends_on "openjdk@8" or "depends_on "openjdk" instead.
Please report this issue to the jdimatteo/science tap (not Homebrew/brew or Homebrew/core), or even better, submit a PR to fix it:
/usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/simpleitk.rb:33

Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/tamarin-prover.rb
unknown or unsupported macOS version: :mountain_lion
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/lsd.rb
lsd: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/arrayfire.rb
unknown or unsupported macOS version: :mavericks
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/sdsl-lite.rb
sdsl-lite: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/spaced.rb
spaced: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/openfst.rb
openfst: "cxx11" is not a recognized standard
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/jdimatteo/homebrew-science/Formula/dssp.rb
dssp: "cxx11" is not a recognized standard
Error: Cannot tap jdimatteo/science: invalid syntax in tap!

Thanks for your help!
Xiaoyong

Document/Test/Script bamliquidator_batch install with no admin rights

Probably the way to go would be to run bamliquidator/bamliquidator_bins/bamliquidator_regions from a script that sets LD_LIBRARY_PATH to point to a local directory containing all the .so dependencies. The Python side should be easy, since pip install --user bamliquidatorbatch should work without sudo access.

Here are the current dependencies:

jdimatteo@ubuntu:~/pipeline/bamliquidator_internal$ ldd bamliquidator
    linux-vdso.so.1 =>  (0x00007fff7a924000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f05a5431000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f05a5214000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f05a4f13000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f05a4cfd000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f05a493d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f05a5652000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f05a4640000)
jdimatteo@ubuntu:~/pipeline/bamliquidator_internal$ ldd bamliquidator_bins
    linux-vdso.so.1 =>  (0x00007fff72df2000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f7f773e0000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f7f771c3000)
    libhdf5.so.6 => /usr/lib/libhdf5.so.6 (0x00007f7f76c27000)
    libhdf5_hl.so.6 => /usr/lib/libhdf5_hl.so.6 (0x00007f7f769f5000)
    libtcmalloc_minimal.so.0 => /usr/lib/libtcmalloc_minimal.so.0 (0x00007f7f767a9000)
    libtbb.so.2 => /usr/lib/libtbb.so.2 (0x00007f7f7657b000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f7f7627b000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f7f75f7f000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f7f75d68000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7f759a8000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f7f77601000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f7f757a4000)

(bamliquidator_regions has same dependencies as bamliquidator_bins.)
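
The wrapper idea described above could look roughly like this. It is only a sketch under assumptions: the function names are hypothetical, the local library directory is whatever directory the .so files were copied into, and nothing here is part of bamliquidator itself.

```python
import os


def env_with_local_libs(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir first on LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so the local copies win over any system-wide libraries.
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    return env


def run_with_local_libs(executable, args, lib_dir):
    """Replace the current process with executable, using the local libraries."""
    os.execve(executable, [executable] + list(args), env_with_local_libs(lib_dir))
```

A launcher script would call run_with_local_libs with the path to bamliquidator_bins (or bamliquidator_regions), the user's arguments, and the local library directory. As the IRC discussion below notes, glibc itself is better left to the system.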

Related discussion on freenode #ubuntu:

01:40 < jj995> I created a ppa for my app, but some people want to install it without admin rights -- is that possible?
01:49 < ki7mt> jj995, if you build the .deb file, and they can install it in their user dir's, should be ok
01:50 < jj995> ki7mt: the .deb wouldn't contain any of the runtime requirements, like libtbb.so.2
01:50 < ki7mt> jj995, shared libs is a different issue, are they your libs or package libs
01:51 < jj995> ki7mt: package libs (libbam-dev, libhdf5-serial-dev, libboost-dev, libgoogle-perftools-dev, samtools, libtbb-dev)
01:51 < ki7mt> jj995, You could always write a shell wrapper or something and export the libdir and have the DL to the local install 
               dir
01:53 < ki7mt> jj995, It gets tricky with non-free or proprietary stuff, especially if distributing it.
01:54 < jj995> ki7mt: I guess I could write a script to download all the .deb files from http://packages.ubuntu.com/, install them into 
               a local directory, and then run my app with that directory on the search path
01:54 < jj995> it is all OSS dependencies, but I thought even that would be tricky to distribute
01:55 < jj995> but I guess if I just write a script to download/install debs to a local directory from http://packages.ubuntu.com/ I 
               don't need to worry about any distribution license
01:55 < ki7mt> jj995, I work on a couple apps that have similar issues, Lic to the guy who builds it, but can redist the binary .. so 
               that what we do, dl it, use a script, and export the location.

I don't think it is a good idea to try to statically link everything; related discussion on freenode #C++-general:

00:58 < jj995> is it possible to statically link most libraries on linux so that I can distribute a program with no prereqs?  I've read 
               various things saying statically linking is "dead" with gcc on linux.  Would I fare better with clang?
01:01 < SamB> jj995: I don't really think it has anything to do with GCC
01:02 < SamB> anyway, you'll want to link dynamically to glibc; glibc barely supports being statically linked at this point ...
01:07 < jj995> SamB: ok, I guess I could just copy a bunch of .so and supply a script to add them to the search path if I really 
               want to have no prereq
01:08 < jj995> maybe I'll try that, thanks
01:08 < SamB> that WOULD at least permit the user to substitute a copy of libc that actually works on their system, certainly
01:26 < jj995> can I distribute things like /lib/x86_64-linux-gnu/libgcc_s.so.1 with my application?  do I need to include a license or 
               something?
01:26 < jj995> my app uses 13 different .so -- do I need to carefully check the license for each one?
01:33 < SamB> jj995: you do
01:34 < SamB> you may end up obligated to distribute a lot of source code if you do that
01:36 < SamB> (static linking would make things WORSE, not better, though)

bamliquidator_batch -m option for multiple files should add columns

Currently the header row with column names is repeated for each bam file, e.g.

jd-mba:pipeline jdimatteo$ time bamliquidator_batch.py -r ../copied_from_tod/HG19_ENRICHED_MM1S_CTCF_-1000_+1000.gff -m -o two links/med/
Liquidating links/med/20130221_629_hg19.sorted.bam (file 1 of 2)
Liquidation completed: 10.203140 seconds
Liquidating links/med/med_copy2.bam (file 2 of 2)
Liquidation completed: 10.936342 seconds
Normalizing
Post liquidation processing took 1.972099 seconds
Writing bamToGff style matrix.gff file
Writing matrix.gff took 0.410482 seconds

real    0m24.414s
user    1m17.872s
sys 0m1.941s
jd-mba:pipeline jdimatteo$ grep GENE_ID two/matrix.gff 
GENE_ID locusLine   bin_1_20130221_629_hg19.sorted.bam
GENE_ID locusLine   bin_1_med_copy2.bam
jd-mba:pipeline jdimatteo$ 

Instead, a new column should be added for each bam.

Also, the file name should be matrix.txt not matrix.gff.
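
The requested layout, one header row with one count column per bam, can be sketched like this. It is an illustration only: merge_bam_columns and its input shape are hypothetical, not the bamliquidator_batch implementation, and it assumes a single bin per region as in the example above.

```python
def merge_bam_columns(per_bam_rows):
    """Combine per-bam rows into one table with one count column per bam.

    per_bam_rows maps a bam file name to a list of
    (gene_id, locus_line, count) tuples, all in the same region order.
    """
    bam_names = list(per_bam_rows)
    header = ["GENE_ID", "locusLine"] + ["bin_1_" + name for name in bam_names]
    table = [header]
    # zip walks the per-bam row lists in lockstep, one region at a time.
    for rows in zip(*(per_bam_rows[name] for name in bam_names)):
        gene_id, locus_line, _ = rows[0]
        if any((r[0], r[1]) != (gene_id, locus_line) for r in rows):
            raise ValueError("bam files disagree on region order")
        table.append([gene_id, locus_line] + [r[2] for r in rows])
    return table
```

With the two bam files from the run above, the header would become a single row: GENE_ID, locusLine, bin_1_20130221_629_hg19.sorted.bam, bin_1_med_copy2.bam.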

bamliquidator: Installation issues

Hi,

I have been trying to install bamliquidator. I added CPATH and LIBRARY_PATH as described here and added samtools to the CPATH directory. But when I run make, it gives the following error:

g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -c bamliquidator.m.cpp
In file included from bamliquidator.h:4:0,
                 from bamliquidator.m.cpp:7:
/usr/local/include/samtools/sam.h:28:24: fatal error: htslib/sam.h: No such file or directory
 #include "htslib/sam.h"
                        ^
compilation terminated.
makefile:96: recipe for target 'bamliquidator.m.o' failed
make: *** [bamliquidator.m.o] Error 1

I would appreciate your clarification on this issue.

Thank you.
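
The error above says that samtools/sam.h was found but the htslib/sam.h it includes was not. A quick way to check which directories on CPATH actually provide a header is sketched below. The helper names are hypothetical, and the sketch only inspects CPATH; it does not look at the compiler's built-in include directories or any -I flags.

```python
import os


def find_header(header, include_dirs):
    """Return the first directory in include_dirs that contains header, else None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


def cpath_dirs(environ=None):
    """Split the CPATH environment variable into its directories."""
    environ = os.environ if environ is None else environ
    return [d for d in environ.get("CPATH", "").split(os.pathsep) if d]
```

Calling find_header("htslib/sam.h", cpath_dirs()) returning None would suggest that the htslib headers are not reachable from any CPATH directory, even though the samtools headers are.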

bamliquidator crashes after failed absurdly large memory allocation

With a specific bam, bamliquidator crashes with a bin size of 30,000 but works fine for a bin size of 40,000.

jdm@tod:~/Gunk/tmp_large_alloc$ bamliquidator_batch -b 30000 test.bam
Liquidating test.bam (file 1 of 1)
tcmalloc: large alloc 18446744071562067968 bytes == (nil) @
Liquidation completed: 0.583112 seconds, 24334613 reads, 41.158474 millions of reads per second
Traceback (most recent call last):
  File "/usr/bin/bamliquidator_batch", line 9, in <module>
    load_entry_point('BamLiquidatorBatch==1.2.0', 'console_scripts', 'bamliquidator_batch')()
  File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 503, in main
    not args.quiet, args.number_of_threads, args.black_list)
  File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 256, in __init__   
    self.batch(extension, sense)
  File "/usr/lib/python2.7/dist-packages/bamliquidatorbatch/bamliquidator_batch.py", line 205, in batch
    raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: bamliquidator_bins failed with exit code -11
jdm@tod:~/Gunk/tmp_large_alloc$ rm -rf output/
jdm@tod:~/Gunk/tmp_large_alloc$ bamliquidator_batch -b 40000 test.bam
Liquidating test.bam (file 1 of 1)
tcmalloc: large alloc 1073741824 bytes == 0x2ff82000 @
Liquidation completed: 5.087825 seconds, 24334613 reads, 4.717143 millions of reads per second
Cell Types: -
Normalizing and calculating percentiles for cell type -
Indexing normalized counts
Plotting
-- skipping plotting chrM because not enough bins (only 1)
-- skipping plotting chrM because not enough bins (only 1)
Summarizing
Post liquidation processing took 7.320600 seconds
jdm@tod:~/Gunk/tmp_large_alloc$ 
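
A note on the failing allocation size: 18446744071562067968 equals 2^64 - 2^31, which is the value a 32-bit signed -2^31 takes when it is converted to an unsigned 64-bit size. That is consistent with a 32-bit signed overflow somewhere in the size computation, but it does not prove one, and the logs above do not confirm the cause. The helper names below are hypothetical and only reproduce the arithmetic.

```python
def as_unsigned_64(value):
    """Reinterpret a signed integer as an unsigned 64-bit value (two's complement)."""
    return value % (1 << 64)


def wrap_signed_32(value):
    """Wrap an integer into the signed 32-bit range, as two's-complement hardware does."""
    return (value + (1 << 31)) % (1 << 32) - (1 << 31)


# A size that reaches 2**31 wraps to -2**31 in a 32-bit signed integer ...
wrapped = wrap_signed_32(1 << 31)
# ... and shows up as a huge request once it is treated as a 64-bit size.
requested = as_unsigned_64(wrapped)
```

Here requested is 18446744071562067968, the size in the failing tcmalloc message, while the successful run with a bin size of 40,000 allocates 1073741824 bytes, which is 2^30.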

MAC installation problem

Hi,

Thanks a lot for this impressively fast tool. I am trying to install it on a Mac, but unfortunately I am facing the following problem:

g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE  -c bamliquidator.m.cpp
g++ -std=c++0x -O3 -g -Wall -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE  -pthread -c bamliquidator.cpp
g++ -O3 -g -Wall -o bamliquidator bamliquidator.o bamliquidator.m.o -lbam -lz -lpthread
ld: library not found for -lbam
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [bamliquidator] Error 1

I followed the instructions for Mac here, but I am not able to get around the problem. I also tried the instructions from #53.

One more issue: I couldn't install samtools as suggested on the wiki page, so I installed it on my own and symlinked it to /usr/local/include.

Any help would be greatly appreciated.

Thank you.
