Git Product home page Git Product logo

bam-readcount's Issues

CIGAR error

I am receiving errors from using readcount on a bam file aligned with bwa mem:

2017-03-31 11:26:05 bam_readcount STDERR ERROR (CigarRoller.cpp:127): Parsing CIGAR - invalid character found with parameter * and 0

Any ideas what may be wrong?

Inconsistent output when run with or without bed file for regions

I am currently trying to use the bam-readcount program to look at allelic frequencies and distributions at specific locations. When running the command with a bed file (tab-separated), I am not seeing the same insertion and/or deletion events as when ran with the specific regions from a piped echo command.

For example:

echo -e 17\t7570000\t7571000 | bam-readcount -w 0 -q 1 -b 20 -l /dev/stdin -f $reference $inBAM

&

bam-readcount -w 0 -q 1 -b 20 -l $bedFile -f $reference $inBAM

The difference is that sometimes the the deletion/insertion event/s are not listed at all in the output file which used a bed file as the region input. I found that running the command in a loop, parsing each line in the bed file and passing that through the first example command works as intended, but running the bottom example command where the bed file is the input regions does not work as intended.

Problem on Intallaion of bam_readcount

I need to install on a server RHEL 5.5 bam_readcount.also conda not works.
What can I do? I have a cmake 2.

cmake --version
cmake version 2.6-patch 4

Missing LICENSE file

Hi Dave,
would you please include a LICENSE file so I could create a package of your tool in Gentoo Linux? ;-) Thank you.

read count limit 8000

Is there a workaround changing the read count limit?
-d doesn't work (at least without specifying a region - which I can't do as this leads to segfaults)

~/bam-readcount/build/bin/bam-readcount -q 30 -d 10000000 ~/IgG_RbpjChIP_chr7.bam > IgG_Rbpj_chr7_maxcount10mil.txt

chr7 19049388 N 7997 =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 A:7052:60.00:35.90:60.00:79:6973:0.00:0.02:0.00:0:0.00:0.00:0.00 C:2:60.00:36.00:60.00:0:2:0.00:0.02:0.00:0:0.00:0.00:0.00 G:939:59.96:35.81:59.96:12:927:0.00:0.00:0.00:0:0.00:0.00:0.00 T:3:60.00:21.33:60.00:0:3:0.00:0.03:0.00:0:0.00:0.00:0.00 N:1:60.00:2.00:60.00:0:1:0.00:0.08:0.00:0:0.00:0.00:0.00

According to bamCoverage this region is covered by 6 million reads...

boost not listed as a dependency but required for building

Error message:
CMake Error at /usr/share/cmake-2.8/Modules/FindBoost.cmake:1194 (message):
Unable to find the requested Boost libraries.

Should be added to list of dependencies (ideally including which Debian/Redhat packages to install)

bam-readcount doesn't count indels in total reads passing mapq

This is an example from my data (splitted to ease reading):

chr17   7578391 T       619     =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  
A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  
C:4:70.25:24.75:70.25:1:3:0.61:0.05:33.00:1:0.66:97.25:0.52     
G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00     
T:615:78.87:36.11:78.87:374:241:0.55:0.02:1.84:374:0.62:123.98:0.51     
N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  
-CA:1:75.00:0.00:75.00:0:1:1.00:0.05:34.00:0:-nan:111.00:0.50  
+C:1:85.00:0.00:85.00:0:1:0.43:0.01:0.00:0:-nan:137.00:0.20        
-C:471:77.42:0.00:77.42:296:175:0.55:0.02:1.07:296:0.61:123.79:0.51

In this specific case, the mutation found was "-C". However the counts for "-C" or "+C" are not included in the total reads above the mapping threshold next to the reference base. This is important if one needs to calculate percentages and similar data, because calculations will be even largely incorrect.

Is this by design, or an oversight?

a question of how to use bam-readcount on insertions?

Hi there,

I am doing a test about the function of VarScan (v2.4.3) fpfilter, but I am confused how to make the readcount region for insertions. For SNVs, the regions look like "chr pos pos", and the regions of deletions should be "chr pos+1 pos+1". So, should I also make the insertion region like "chr pos+1 pos+1" and run bam-readcout with -i ?

Thanks so much and wait for your reply

Rui

[bam_plp_destroy] memory leak: 1. Continue anyway.

Hello,

I used bam_readcount to analyse 6 of my samples. I get this error for 3/6 samples. Was wandering how i can solve this error.

I tried to set different quality parameters to prevent this error without success. I found that bam_readcount fails for the 3 of the biggest bam (3, 5, 6):

`
[jflucier@ip29-mp2 vl_samples]$ ll -h *.bam
-rw-rw-r-- 1 jflucier jacques 55M Jul 26 09:54 1.90.mka.bam
-rw-rw-r-- 1 jflucier jacques 77M Jul 26 09:58 2.90.mka.bam
-rw-rw-r-- 1 jflucier jacques 99M Jul 26 10:02 3.90.mka.bam
-rw-rw-r-- 1 jflucier jacques 76M Jul 26 10:05 4.90.mka.bam
-rw-rw-r-- 1 jflucier jacques 89M Jul 26 10:09 5.90.mka.bam
-rw-rw-r-- 1 jflucier jacques 90M Jul 26 10:13 6.90.mka.bam

`

Here are the command I am running:

`
[jflucier@ip29-mp2 vl_samples]$ ./programs/bam-readcount/bin/bam-readcount -f vl_ref.fa .3.90.mka.bam > 3.90.mka.bam_readcount.tsv
Minimum mapping quality is set to 0
[bam_pileup_core] the input is not sorted (reads out of order)
[bam_plp_destroy] memory leak: 1. Continue anyway.
[jflucier@ip29-mp2 vl_samples]$ ./programs/bam-readcount/bin/bam-readcount -w 1 -b 30 -f vl_ref.fa 3.90.mka.bam > 3.90.mka.bam_readcount.b30.tsv
Minimum mapping quality is set to 0
[bam_pileup_core] the input is not sorted (reads out of order)
[bam_plp_destroy] memory leak: 1. Continue anyway.

`

thank for your help,

JF

does not compile on OSX 10.9.2 with Samtools 0.1.19-44428cd

[ 50%] Building CXX object CMakeFiles/gtest.dir/src/gtest-all.cc.o
In file included from /code/bam-readcount/vendor/src/gtest160/src/gtest-all.cc:39:
In file included from /code/bam-readcount/vendor/src/gtest160/include/gtest/gtest.h:57:
In file included from /code/bam-readcount/vendor/src/gtest160/include/gtest/internal/gtest-internal.h:40:
/code/bam-readcount/vendor/src/gtest160/include/gtest/internal/gtest-port.h:499:13: fatal error:
'tr1/tuple' file not found

include <tr1/tuple> // NOLINT

        ^

1 error generated.
make[5]: *** [CMakeFiles/gtest.dir/src/gtest-all.cc.o] Error 1
make[4]: *** [CMakeFiles/gtest.dir/all] Error 2
make[3]: *** [all] Error 2
make[2]: *** [vendor/src/gtest160-stamp/gtest160-build] Error 2
make[1]: *** [CMakeFiles/gtest160.dir/all] Error 2
make: *** [all] Error 2

speed improvements

Hi... thank you for this tool ... it is very helpful

But... I was wondering... is there a way we could maybe speed this up by using threads.

I have been trying to use it to calculate quite a few regions and it is very slow.

I could split by regions into several files and run them in parallel but if there was a way of maybe doing that directly on the software by using threads that would be much better.

I have a server with more than enough memory and cpu ... so it would be great if we could specify this

Filteriing paired end data

Hi

I'm using bamreadcount output as input for pVACseq.
The thing is that I'm using paired end data from DNA and RNA, and I warnings like:

WARNING: In read XXXXXXXXXXXXXXXXXX: Couldn't find single-end mapping quality. Check to see if the SM tag is in BAM.
I'm filtering using the parameter -b 20.

I'm not sure how this is affecting the filtering. I'm guessing the filtering by quality position is not being affected by this.
But, What would happen if I was filtering instead by -b but by -q?

Thanks for your time.

Many single-end mapping quality warnings

BWA-MEM 0.7.4 does not output single-end mapping qualities. Running bam-readcount results in millions (one per read) of warnings. Please emit one warning and suppress all following warnings.

Couldn't grab single-end mapping quality for read M01171:3:000000000-A3E2L:1:1101:10683:2958. Check to see if SM tag is in BAM

Feature Request: Statistics per strand

It would be really useful if the statistics from read-counts were split by strand, for example the count of As on Fwd and Rev strands, and the mean base quality on each strand. This would be really useful for enrichment data, where there may be a stand bias.
Let me know if you want some more use cases.
Thanks,

Stewart

detailed format to calculate "avg_sum_mismatch_qualities" ?

HI,
I wanna how to calculate "avg_sum_mismatch_qualities".
can you give some details ?
for SNV ,I guess ,
avg_smq = the sum of all mismatch base qualities / the sum of mismatch bases.
but for deletion and insertion ,how to calculate that ???
I guess both deletion bases and insertion bases are seen to be mismatch bases .

Best

List of base pair quality

Is there a way to output for each position the list of the bp quality for all reads instead of the average?

Thank you

Parallel make fails

make -j4 fails. make -j1 builds fine.

❯❯❯ make -j4
…
/usr/local/Cellar/cmake/3.2.2/bin/cmake -E cmake_progress_report /tmp/bam-readcount20150612-1951-1wqpaid/bam-readcount-0.7.4/build/CMakeFiles  3 4 5 6 7 8 9 10
[100%] Built target boost-1.55
make: *** [all] Error 2

Support for CRAM

It would be great to be able to run bam-readcounts on CRAM as well as bam files. Samtools has full support for reading and writing cram so adding support shouldn't be too difficult.
Would this be possible?

Deletion counts behaviour

Hello!
I have a question regarding how deletions are counted in bam-readcount. Let me show you an example.

I have these two variants:

1	85610808	.	CAGA	C	893.73	.	AC=1;AF=0.5;AN=2;BaseQRankSum=-0.674;ClippingRankSum=0;DP=981;ExcessHet=3.0103;FS=9.835;MLEAC=1;MLEAF=0.5;MQ=25.78;MQRankSum=1.866;QD=1.04;ReadPosRankSum=0.665;SOR=0.186	GT:AD:DP:GQ:PL	0/1:778,80:858:99:931,0,32168
1	88266278	rs387873782	TCTGGGAGGGCTTGCTCCGGGGGCAGTGTGTCCTGTTCTTGTGCAGCCCCTG	T	12561.7	.	AC=1;AF=0.5;AN=2;BaseQRankSum=2.115;ClippingRankSum=0;DB;DP=639;ExcessHet=3.0103;FS=2.461;MLEAC=1;MLEAF=0.5;MQ=60;MQRankSum=0;QD=20.59;ReadPosRankSum=0.864;SOR=0.514	GT:AD:DP:GQ:PL	0/1:285,325:610:99:12599,0,10870

I want to see these two deletions in the bam-readcount output:

1	85610809	A	1482	=:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	A:1398:16.98:35.64:0.02:482:916:0.47:0.01:11.16:482:0.38:74.87:0.50	C:1:20.00:15.00:0.00:0:1:0.24:0.04:40.00:0:0.00:75.00:0.88	G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	-AGA:83:19.57:0.00:0.00:43:40:0.65:0.04:0.67:43:0.44:75.00:0.46
1	88266279	C	384	=:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	C:383:60.00:37.11:0.31:178:205:0.27:0.00:9.55:178:0.30:65.91:0.40	G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00	T:1:60.00:7.00:0.00:0:1:0.75:0.03:18.00:0:0.00:75.00:0.63	N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

So for the first deletion, I can see the deletion in bam-readcount, but not for the second. If I visualize the aligments of these two positions on IGV:
1_85610808
1_88266279

Both seem to be true deletions. My question is, why only the first one is reported as a deletion in bam-readcount? Is it due to the length of the deletion? Maybe due to the spanning reads supporting the deletion in the first variant but not in the second one?

Thank you in advance,

--site-list option is not working

Hi,

I git clone and build from the latest source file. I tried the following command to limit the search to a collection of sites, but bam-readcount simply stops without output.

./bam-readcount -w 0 -f GRCh37.primary_assembly.genome.fa --site-list loci.txt Aligned.sortedByCoord.out.bam
$ head loci.txt
chr1:13418-13418
chr1:13649-13649
chr1:324822-324822
chr1:897462-897462
chr1:1580738-1580738

Does not compile with samtools 0.1.19

As reported on Biostar.

$ make
...
/home/dev/src/samtools/libbam.a(bgzf.o): In function `bgzf_mt':
/home/dev/src/samtools/bgzf.c:445: undefined reference to `pthread_create'
/home/dev/src/samtools/libbam.a(bgzf.o): In function `mt_destroy':
/home/dev/src/samtools/bgzf.c:458: undefined reference to `pthread_join'
collect2: error: ld returned 1 exit status
make[2]: *** [bin/bam-readcount] Error 1
make[1]: *** [build/src/exe/bam-readcount/CMakeFiles/bam-readcount.dir/all] Error 2
make: *** [all] Error 2

`make install` ignores CMAKE_INSTALL_PREFIX

It tries to install to /usr/bin even though I've set CMAKE_INSTALL_PREFIX.

==> cmake .. -DCMAKE_C_FLAGS_RELEASE= -DCMAKE_CXX_FLAGS_RELEASE= -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/bam-readcount/0.7.4 -DCMAKE_BUILD_TYPE=Release -DCMAKE_FIND_FRAMEWORK=LAST -DCMAKE_VERBOSE_MAKEFILE=ON -Wno-dev
…
==> make install
…
/usr/local/Cellar/cmake/3.2.2/bin/cmake -P cmake_install.cmake
-- Install configuration: "Release"
-- Installing: /usr/bin/bam-readcount
CMake Error at build/src/exe/bam-readcount/cmake_install.cmake:31 (file):
  file INSTALL cannot copy file
  "/tmp/bam-readcount20150612-7802-1a5e9cy/bam-readcount-0.7.4/build/bin/bam-readcount"
  to "/usr/bin/bam-readcount".

compiling error

cmake 2.8.4

xierf@bio-build1:~/Yang_L/bam_readcount/compile> make
Extracting boost from /share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/vendor/boost-1.55-bamrc.tar.gz
Boost build log can be found at /share/home/xierf/Yang_L/bam_readcount/compile/vendor/boost-src/build.log
-- Configuring done
-- Generating done
-- Build files have been written to: /share/home/xierf/Yang_L/bam_readcount/compile
Scanning dependencies of target boost-1.55
[ 3%] Creating directories for 'boost-1.55'
[ 6%] Performing download step (verify and extract) for 'boost-1.55'
-- verifying file...
file='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/vendor/boost-1.55-bamrc.tar.gz'
-- verifying file... warning: did not verify file - no URL_MD5 checksum argument? corrupt file?
-- extracting...
src='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/vendor/boost-1.55-bamrc.tar.gz'
dst='/share/home/xierf/Yang_L/bam_readcount/compile/vendor/boost-src'
-- extracting... [tar xfz]
-- extracting... [analysis]
-- extracting... [rename]
-- extracting... [clean up]
-- extracting... done
[ 9%] No patch step for 'boost-1.55'
[ 12%] No update step for 'boost-1.55'
[ 15%] Performing configure step for 'boost-1.55'
Building Boost.Build engine with toolset gcc... tools/build/v2/engine/bin.linuxx86_64/b2
Detecting Python version... 3.5
Detecting Python root... File "", line 1
import sys; print sys.prefix
^
SyntaxError: invalid syntax

Unicode/ICU support for Boost.Regex?... not found.
Generating Boost.Build configuration in project-config.jam...

Bootstrapping is done. To build, run:

./b2

To adjust configuration, edit 'project-config.jam'.
Further information:

[ 18%] Performing build step for 'boost-1.55'
Building boost, build log is /share/home/xierf/Yang_L/bam_readcount/compile/vendor/boost-src/build.log
[ 21%] Performing install step for 'boost-1.55'
[ 25%] Completed 'boost-1.55'
[ 25%] Built target boost-1.55
Scanning dependencies of target samtools-lib
[ 28%] Creating directories for 'samtools-lib'
[ 31%] Performing download step (verify and extract) for 'samtools-lib'
-- verifying file...
file='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/vendor/samtools-0.1.19.tar.gz'
-- verifying file... warning: did not verify file - no URL_MD5 checksum argument? corrupt file?
-- extracting...
src='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/vendor/samtools-0.1.19.tar.gz'
dst='/share/home/xierf/Yang_L/bam_readcount/compile/vendor/samtools'
-- extracting... [tar xfz]
-- extracting... [analysis]
-- extracting... [rename]
-- extracting... [clean up]
-- extracting... done
[ 34%] Performing patch step for 'samtools-lib'
missing header for unified diff at line 3 of patch
patching file sam_header.c
[ 37%] No update step for 'samtools-lib'
[ 40%] Performing configure step for 'samtools-lib'
Building samtools, build log at /share/home/xierf/Yang_L/bam_readcount/compile/vendor/samtools/build.log
[ 43%] Performing build step for 'samtools-lib'
[ 46%] Performing install step for 'samtools-lib'
[ 50%] Completed 'samtools-lib'
[ 50%] Built target samtools-lib
Scanning dependencies of target __bc_predepends
[ 50%] Built target __bc_predepends
Scanning dependencies of target gtest160
[ 53%] Creating directories for 'gtest160'
[ 56%] Performing download step (verify and extract) for 'gtest160'
-- verifying file...
file='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/build-common/vendor/gtest-1.6.0.tar.gz'
-- verifying file... warning: did not verify file - no URL_MD5 checksum argument? corrupt file?
-- extracting...
src='/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/build-common/vendor/gtest-1.6.0.tar.gz'
dst='/share/home/xierf/Yang_L/bam_readcount/compile/vendor/src/gtest160'
-- extracting... [tar xfz]
-- extracting... [analysis]
-- extracting... [rename]
-- extracting... [clean up]
-- extracting... done
[ 59%] No patch step for 'gtest160'
[ 62%] No update step for 'gtest160'
[ 65%] Performing configure step for 'gtest160'
-- The CXX compiler identification is GNU
-- The C compiler identification is GNU
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Found PythonInterp: /share/home/xierf/Yang_L/conda3/bin/python (found version "3.5.1 :: Continuum Analytics, Inc.")
-- Looking for include files CMAKE_HAVE_PTHREAD_H
-- Looking for include files CMAKE_HAVE_PTHREAD_H - found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /share/home/xierf/Yang_L/bam_readcount/compile/vendor/gtest160-build
[ 68%] Performing build step for 'gtest160'
-- Configuring done
-- Generating done
-- Build files have been written to: /share/home/xierf/Yang_L/bam_readcount/compile/vendor/gtest160-build
Scanning dependencies of target gtest
[ 50%] Building CXX object CMakeFiles/gtest.dir/src/gtest-all.cc.o
Linking CXX static library libgtest.a
[ 50%] Built target gtest
Scanning dependencies of target gtest_main
[100%] Building CXX object CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
Linking CXX static library libgtest_main.a
[100%] Built target gtest_main
[ 71%] No install step for 'gtest160'
[ 75%] Completed 'gtest160'
[ 75%] Built target gtest160
Scanning dependencies of target bamrc
[ 78%] Building CXX object build/src/lib/bamrc/CMakeFiles/bamrc.dir/BasicStat.cpp.o
/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/src/lib/bamrc/BasicStat.cpp: In member function ‘void BasicStat::process_read(const bam_pileup1_t*)’:
/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/src/lib/bamrc/BasicStat.cpp:62: 错误:调用重载的‘abs(int)’有歧义
/usr/include/c++/4.3/cmath:99: 附注:备选为: double std::abs(double)
/usr/include/c++/4.3/cmath:103: 附注: float std::abs(float)
/usr/include/c++/4.3/cmath:107: 附注: long double std::abs(long double)
/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/src/lib/bamrc/BasicStat.cpp:65: 错误:调用重载的‘abs(int)’有歧义
/usr/include/c++/4.3/cmath:99: 附注:备选为: double std::abs(double)
/usr/include/c++/4.3/cmath:103: 附注: float std::abs(float)
/usr/include/c++/4.3/cmath:107: 附注: long double std::abs(long double)
/share/home/xierf/Yang_L/bam_readcount/bam-readcount-0.8.0/src/lib/bamrc/BasicStat.cpp:66: 错误:调用重载的‘abs(int)’有歧义
/usr/include/c++/4.3/cmath:99: 附注:备选为: double std::abs(double)
/usr/include/c++/4.3/cmath:103: 附注: float std::abs(float)
/usr/include/c++/4.3/cmath:107: 附注: long double std::abs(long double)
make[2]: *** [build/src/lib/bamrc/CMakeFiles/bamrc.dir/BasicStat.cpp.o] 错误 1
make[1]: *** [build/src/lib/bamrc/CMakeFiles/bamrc.dir/all] 错误 2
make: *** [all] 错误 2
xierf@bio-build1:~/Yang_L/bam_readcount/compile>

Different alignment of indels in DNA and RNA

Hi,

This might be off-topic and not within your domain, but want to ask anyways.

I work with a mouse dataset. We use BWA for alignment of DNA seq, and STAR for RNA seq. I have an insertion of A: CA>CAA. BWA seem to prefer having the insertion at position 2, and STAR at position 3. Both are equally right (!) - but when I use bam-readcount to find info on variants in DNA seq and RNA seq, I run into trouble and I find my variants (sometimes) at different locations.

I am not sure how to workaround this. I do not align the data myself, our core facility does that. Do you have any suggestions on what to do here?

(Issue previously raised at https://github.com/griffithlab/pVAC-Seq/issues/293)

Thanks,
Nils

Update bioconda deployment to Python3.7

Hi,

We at GenomeNext enjoy using Anaconda to help manage packages, however I have noticed during an attempt to upgrade some of our systems to Python3.7 that bam-readcount interferes with our upgrade process. It looks like it has been built for Python 3.5 and 3.6 but not 3.7, the new Anaconda standard deployment version. Maybe this could be a relatively easy update to make?

index not found for gatk output

Hi,

If the index is created by samtools, the name of the index is sample.sorted.bam.bai
if it is created by gatk, it is sample.sorted.bai.

bam-readcount complains for the gatk bai file

Minimum mapping quality is set to 0
Expect library: Pa25T2-T in BAM
[bam_index_load] fail to load BAM index.
BAM indexing file is not available.

it should be a quick fix.

Thank you,

Tommy

VarScan fpfilter returning no results

Moving this from comment on other issue

Hi,

I am using bam-readcount (0.7.4) , as part of varscan somatic pipeline, but while applying fpfilter, getting error saying "failed because no readcounts were returned"... What could be the reason
I am not able to figure out, what could be the reason for this...

LOG:
740 variants in input file
0 had a bam-readcount result
0 had reads1>=2
0 passed filters
740 failed filters
740 failed because no readcounts were returned
0 failed minimim variant count < 3
0 failed minimum variant freq < 0.05
0 failed minimum strandedness < 0.0
0 failed minimum reference readpos < 0.2
0 failed minimum variant readpos < 0.15
0 failed minimum reference dist3 < 0.2
0 failed minimum variant dist3 < 0.15
0 failed maximum reference MMQS > 50
0 failed maximum variant MMQS > 100
0 failed maximum MMQS diff (var - ref) > 50
0 failed maximum mapqual diff (ref - var) > 10
0 failed minimim ref mapqual < 20
0 failed minimim var mapqual < 30
0 failed minimim ref basequal < 15
0 failed minimim var basequal < 30
0 failed maximum RL diff (ref - var) > 0.05

Thanks,
Vt,

A error report about bam-readcount

HI,
I meet this error :
“WARNING: In read DHDC08P1_0264:2:2204:2494:140080#CTTGTA: Couldn't find single-end mapping quality. Check to see if the SM tag is in BAM”
what happened ? anything wrong in your software ? And I got the result ,but I do not know this error whether will affect the result ???

bam-readcount

when I use bam-readcount ,it always show "Expect library: lib in BAM", what is it mean ?

Does not report results for some 1-position regions

Hello,

I gave as an additional input a list of regions, each of those made by just one position (therefore start=end). I see the output does not include all those positions; I checked a missing one and I saw that there were no alignments mapping it. If this is the issue, I guess it should be still present in the output but with depth=0.

Thank you!

Non -ACGTN counts

Hi there,
I'm trying to extract relevant coverage information using bam-readcount tool. And I have realised that sometimes there are this kind of rows:

1 267356 A 1 =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 -ATA:1:27.00:0.00:0.00:0:1:0.22:0.03:0.00:0:0.00:101.00:0.11

What does it mean? In 267356 position I have one read with TA insertion? In this case, why is not simply the A counter increased by 1 and that's all?
I hope the question is more or less clear.
Thank you in advance,

bam-readcount does not install on centos 5

I am trying to compile bam-readcount on this centos 5 system using cmake 3.0.2. "cmake" and "make deps" all went through. However "make" failed with the following errors:

Linking CXX executable ../../../../bin/bam-readcount
../../../../vendor/boost/lib/libboost_program_options.a(options_description.o): In function boost::program_options::options_description::print(std::basic_ostream<char, std::char_traits<char> >&) const': options_description.cpp:(.text+0x2a8b): undefined reference tostd::basic_ostream<char, std::char_traits >& std::_ostream_insert<char, std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const, long)'
options_description.cpp:(.text+0x2aa8): undefined reference to std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const_, long)' options_description.cpp:(.text+0x2abf): undefined reference to std::basic_ostream<char, std::char_traits >& std::__ostream_insert<char, std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const*, long)'
options_description.cpp:(.text+0x2ad3): undefined reference tostd::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const_, long)' options_description.cpp:(.text+0x2d03): undefined reference to std::basic_ostream<char, std::char_traits >& std::_ostream_insert<char, std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const, long)'
../../../../vendor/boost/lib/libboost_program_options.a(options_description.o):options_description.cpp:(.text+0x2fc4): more undefined references to `std::basic_ostream<char, std::char_traits >& std::_ostream_insert<char, std::char_traits >(std::basic_ostream<char, std::char_traits >&, char const, long)' follow
collect2: ld returned 1 exit status
make[2]: *_* [bin/bam-readcount] Error 1
make[1]: *** [build/src/exe/bam-readcount/CMakeFiles/bam-readcount.dir/all] Error 2
make: *** [all] Error 2

As it seems to be the issue of Boost, I tried to link Boost 1.55 which is installed system wide by adding the following lines to CMakeLists.txt and remove boost-1.55 from dependencies.

set(BOOST_INCLUDE "/risapps/rhel5/boost/1.55.0/include")
find_package(Boost REQUIRED)
include_directories(${BOOST_INCLUDE})

It failed at "cmake" again saying "include could not find load file: /usr/lib64/boost/Boost.cmake".

Please advise.

no output when running with -l varscan data

I try to identify false positive varscan2 (v2.4.3) somatic calls

I adapted the command listed in the Koboldt et al, Curr Protoc Bioinfo 2013 p8 (PMID 25553206) and it does not output.

bam-readcount -q 1 -b 20 \
 -f GRCh38.fa \
 -l varscan_somatic_mpileup_normal-tumor.snp \
 tumor.bam > tumour.snp.counts

I am confused as the paper says we should use a varscan.variant input for -l while the bam-readcount suggests a bed format chr start end.

my varscan.variant data looks like

chrom   position        ref     var     normal_reads1   normal_reads2   normal_var_freq normal_gt       tumor_reads1    tumor_reads2    tumor_var_freq  tumor_gt        somatic_status  variant_p_value somatic_p_value tumor_reads1_plus       tumor_reads1_minus      tumor_reads2_plus     tumor_reads2_minus      normal_reads1_plus      normal_reads1_minus     normal_reads2_plus      normal_reads2_minus
chr1    12195   T       C       17      4       19.05%  T       9       6       40%     Y       Germline        4.740417747336426E-4    0.15725806451612526     9       0       6       0       14      3       4       0
chr1    12198   G       C       17      4       19.05%  G       10      4       28.57%  S       Germline        0.002493108593043604    0.3976640711902104      10      0       4       0       14      3       4       0

Should I make a BED file from this varscan format like below?

chr1    12194    12195
chr1    12197    12198
...

and what about indel calls which are more difficult to translate to BED due to their variable length?

Thanks in advance
Stephane

bam-readcount

Hello,everbody:
Recently I used bam-readcount to calculate the number of A,C,G,T at each site from bam format file.
Because the depth is extremely high(>100000),when i ran bam-readcount,i used -d ,but i still got wrong result.In other words,-d change nothing. Did somebody meet the same question?
Would anybody give me some advice?

best,
Zhan

Feature Request: Specify Flags to Include/Exclude

Specifically, this has come up in the context of including duplicate reads, but it would be a good enhancement to allow the user to specify read flags to exclude/include as can be done with samtools view

segmentation faults on OSX 10.9.2 with Samtools 0.1.19-44428cd

Hi- This doesn't seem to work correctly. These are Illumina Platinum Genome bams downloaded from Basespace that work with samtools mpileup:

samtools mpileup NA12888_S1.bam -r chr1:10291-10291

[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
chr1 10291 N 50 tT$c$TTTcttccc-1ncTt$gcgccccccTCC+1TccCCtccgccccccCcccccc^8c^>C @8B?8<BG=HGCDG@A;BDB8B50GGGB5IE895899;505D2<20070@

counting position 10005-10005 works:

bam-readcount -f hg19.fa NA12888_S1.bam chr1:10005-10005
Minimum mapping quality is set to 0
chr1 10005 c 265 =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 C:265:7.65:34.62:4.98:258:7:0.06:0.02:24.99:258:0.95:95.32:0.8G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00 N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

~300bp away seg faults

bam-readcount -f hg19.fa NA12888_S1.bam chr1:10291-10291
Minimum mapping quality is set to 0
Segmentation fault: 11

Ideas?

make error

I also have a error presentation.HELP!!

...
CMakeFiles/bam-readcount.dir/bamreadcount.cpp.o:(.rodata._ZTVN5boost15program_options11typed_valueIicEE[vtable for boost::program_options::typed_value<int, char>]+0x38): undefined reference to boost::program_options::value_semantic_codecvt_helper<char>::parse(boost::any&, std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, bool) const' CMakeFiles/bam-readcount.dir/bamreadcount.cpp.o:(.rodata._ZTVN5boost15program_options11typed_valueISt6vectorISsSaISsEEcEE[vtable for boost::program_options::typed_value<std::vector<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, char>]+0x38): undefined reference to boost::program_options::value_semantic_codecvt_helper::parse(boost::any&, std::vector<std::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::basic_string<char, std::char_traits, std::allocator > > > const&, bool) const'
collect2: ld returned 1 exit status
make[2]: *** [bin/bam-readcount] Error 1
make[1]: *** [build/src/exe/bam-readcount/CMakeFiles/bam-readcount.dir/all] Error 2
make: *** [all] Error 2

Compilation error with abs function in BasicStat.cpp

I just compile your tool in a GNU/Linux distro, using GCC 6.1.1 20160802, but there is a problem with the "BasicStat.cpp" file, specifically, with the "abs" function used at the line 70. The error was:

BasicStat.cpp:69:86: error: call of overloaded ‘abs(float)’ is ambiguous
sum_event_location += 1.0 - abs((float)(base->qpos - left_clip) - read_center)/read_center;

I don't knwo if it is the right way but I have solved it including "cmat" header and changing from "abs" to "fabs".

Hope that helps.

print-individual-mapq not working

Running with other options works great, but -D or --print-individual-mapq with either of 1, true, yes or a few other things throws an error:

bam-readcount -w 10 -f ${fasta}.fa -D true ${sample}.bam ${region}

Minimum mapping quality is set to 0
libc++abi.dylib: terminating with uncaught exception of type char const*
Abort trap: 6

This is on OS X 10.14 in a conda environment.
Many thanks, Felix

max depth still 8000 with -d set up

I installed by git clone and it runs well. I found that max count per position is still 8000 even I set up -d equals to 1000,000. A bug or wrong command line? Thanks.

min-base-quality not giving expected output (also on biostars)

I would like to limit the base-quality-score to phred 13 (similar to default of samtools mpileup (1.3.1)), using bam-readcount (0.8.0). Despite various command modifications, I'm not getting the expected output.

Default:

bam-readcount -f human_g1k_v37.hotspot.fasta final.bam -l F5DNA.txt --max-warnings 0

1       169519049       C       8085    =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:21:60.00:20.05:0.00:15:6:0.20:0.01:33.00:15:0.85:124.00:0.62  C:39:60.00:17.72:1.54:24:15:0.18:0.01:21.10:24:0.85:123.92:0.54 G:8:60.00:11.38:0.00:1:7:0.07:0.02:37.50:1:0.85:124.00:0.13     T:8014:60.00:26.69:0.04:4033:3981:0.15:0.01:35.90:4033:0.85:123.97:0.44 N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  -C:3:60.00:0.00:0.00:2:1:0.18:0.01:9.33:2:0.86:124.00:0.58

min-base-quality 13:

bam-readcount -f human_g1k_v37.hotspot.fasta final.bam -l F5DNA.txt --min-base-quality 13 --max-warnings 0

1       169519049       C       8067    =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:21:60.00:20.05:0.00:15:6:0.20:0.01:33.00:15:0.85:124.00:0.62  C:27:60.00:20.70:2.22:14:13:0.16:0.01:21.04:14:0.85:123.96:0.45 G:2:60.00:24.50:0.00:1:1:0.15:0.01:24.50:1:0.85:124.00:0.44     T:8014:60.00:26.69:0.04:4033:3981:0.15:0.01:35.90:4033:0.85:123.97:0.44 N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  -C:3:60.00:0.00:0.00:2:1:0.18:0.01:9.33:2:0.86:124.00:0.58

Mpileup: (-Q0, not showing the variants)

 samtools mpileup -f human_g1k_v37.hotspot.fasta -d 1000000 -A -l F5DNA.txt -Q0 final.bam 
1       169519049       C       8085

Mpileup: (default of -Q13, not showing the variants

samtools mpileup -f human_g1k_v37.hotspot.fasta -d 1000000 -A -l F5DNA.txt -Q13 final.bam 
1       169519049       C       4072  

Looking at IGV, the readcounts at this position are 8085, which matches both mpileup and bam-readcount output when minimum base quality is 0.
Any idea why bam-readcounts does not appear to filter the same when the minimum base quality is set to 13?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.