cnvnator's People

Contributors

abyzov, arpanda, bintriz, dnil, hyphaltip, indraniel, joelmartin, mikedacre, mkarpiarz, nathanweeks, shobanasekar, suvakov, zamaudio

cnvnator's Issues

Can't find all histograms for '1'.

Hi Alexej,

I have some hg19-aligned samples and am running CNVnator version 0.3.2.

When I run the -stat step, I get the error above: Can't find all histograms for '1'.

Here are the commands and output (truncated to remove chr2-chr21) that I'm using:

cnvnator -root sample.root -tree ../bam/sample.sorted.nodup.bam -unique
Parsing file ../bam/sample.sorted.nodup.bam ...
Allocating memory ...
Done.
Filling and saving tree for 'chr1' ...
...
Filling and saving tree for 'chr22' ...
Filling and saving tree for 'chrX' ...
Filling and saving tree for 'chrY' ...
Filling and saving tree for 'chrM' ...
Writing histograms ... 
Total of 13819118 reads were placed.
cnvnator -root sample.root -his 6000 -d /mnt/lustre/references/hg19/
Allocating memory ...
Done.
Calculating histograms with bin size of 6000 for 'chr1' ...
Making directory bin_6000 ...
Making GC histogram for 'chr1' ...
Done.
...
Calculating histograms with bin size of 6000 for 'chr22' ...
Making GC histogram for 'chr22' ...
Done.
Calculating histograms with bin size of 6000 for 'chrX' ...
Making GC histogram for 'chrX' ...
Done.
Calculating histograms with bin size of 6000 for 'chrY' ...
Making GC histogram for 'chrY' ...
Done.
Calculating histograms with bin size of 6000 for 'chrM' ...
Making GC histogram for 'chrM' ...
Done.
cnvnator -root sample.root -stat 6000
Making statistics for chr1 ...
...
Making statistics for chr22 ...
Making statistics for chrX ...
Making statistics for chrY ...
Making statistics for chrM ...
Average RD per bin (1-22) is 28.2875 +- 7.9401 (before GC correction)
Average RD per bin (X,Y)  is 14.4064 +- 5.43518 (before GC correction)
Correcting counts by GC-content for 'chr1' ...
Correcting counts by GC-content for 'chr2' ...
Correcting counts by GC-content for 'chr3' ...
Correcting counts by GC-content for 'chr4' ...
Correcting counts by GC-content for 'chr5' ...
Correcting counts by GC-content for 'chr6' ...
Correcting counts by GC-content for 'chr7' ...
Correcting counts by GC-content for 'chr8' ...
Correcting counts by GC-content for 'chr9' ...
Correcting counts by GC-content for 'chr10' ...
Correcting counts by GC-content for 'chr11' ...
Zero value of GC average.
Bin 16073 with center 9.6435e+07 is not corrected.
Correcting counts by GC-content for 'chr12' ...
Correcting counts by GC-content for 'chr13' ...
Zero value of GC average.
Bin 19071 with center 1.14423e+08 is not corrected.
Correcting counts by GC-content for 'chr14' ...
Correcting counts by GC-content for 'chr15' ...
Correcting counts by GC-content for 'chr16' ...
Correcting counts by GC-content for 'chr17' ...
Correcting counts by GC-content for 'chr18' ...
Correcting counts by GC-content for 'chr19' ...
Correcting counts by GC-content for 'chr20' ...
Correcting counts by GC-content for 'chr21' ...
Correcting counts by GC-content for 'chr22' ...
Correcting counts by GC-content for 'chrX' ...
Correcting counts by GC-content for 'chrY' ...
Correcting counts by GC-content for 'chrM' ...
Making statistics for chr1 after GC correction ...
...
Making statistics for chr22 after GC correction ...
Making statistics for chrX after GC correction ...
Making statistics for chrY after GC correction ...
Making statistics for chrM after GC correction ...
Average RD per bin (1-22) is 27.2265 +- 7.3848 (after GC correction)
Average RD per bin (X,Y)  is 12.221 +- 5.19508 (after GC correction)
Can't find all histograms for 'chr1'

My bam header is as follows:

@HD	VN:1.0	SO:coordinate
@SQ	SN:chr1	LN:249250621
@SQ	SN:chr2	LN:243199373
@SQ	SN:chr3	LN:198022430
@SQ	SN:chr4	LN:191154276
@SQ	SN:chr5	LN:180915260
@SQ	SN:chr6	LN:171115067
@SQ	SN:chr7	LN:159138663
@SQ	SN:chr8	LN:146364022
@SQ	SN:chr9	LN:141213431
@SQ	SN:chr10	LN:135534747
@SQ	SN:chr11	LN:135006516
@SQ	SN:chr12	LN:133851895
@SQ	SN:chr13	LN:115169878
@SQ	SN:chr14	LN:107349540
@SQ	SN:chr15	LN:102531392
@SQ	SN:chr16	LN:90354753
@SQ	SN:chr17	LN:81195210
@SQ	SN:chr18	LN:78077248
@SQ	SN:chr19	LN:59128983
@SQ	SN:chr20	LN:63025520
@SQ	SN:chr21	LN:48129895
@SQ	SN:chr22	LN:51304566
@SQ	SN:chrX	LN:155270560
@SQ	SN:chrY	LN:59373566
@SQ	SN:chrM	LN:16571
@RG	ID:1	PL:ILLUMINA	PU:barcode	LB:pairedend	SM:sample

My chromosomes directory contains chr1.fa ... chrM.fa.
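
A quick way to double-check that the header names and the per-chromosome FASTA files line up is sketched below in Python. It is only an illustration: the function name missing_fasta is hypothetical, the header text is assumed to come from something like `samtools view -H`, and the "<name>.fa" naming is taken from the description above.

```python
def missing_fasta(header_text, fasta_names):
    """Return @SQ sequence names that have no '<name>.fa' entry in fasta_names."""
    available = set(fasta_names)
    missing = []
    for line in header_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "@SQ":
            continue  # only @SQ lines describe reference sequences
        for field in fields[1:]:
            if field.startswith("SN:"):
                name = field[3:]
                if name + ".fa" not in available:
                    missing.append(name)
    return missing


# Shortened header in the same layout as the one shown above.
header = "@HD\tVN:1.0\tSO:coordinate\n@SQ\tSN:chr1\tLN:249250621\n@SQ\tSN:chrM\tLN:16571\n"
print(missing_fasta(header, ["chr1.fa", "chrM.fa"]))  # names match
print(missing_fasta(header, ["1.fa", "MT.fa"]))       # names differ
```

An empty list means every sequence in the header has a matching file name; it says nothing about whether the file contents are valid.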

I also have this same sample aligned with GRCh37 as the reference and get the same error (passing the correctly named chromosome fasta files).

Looking at the source code I can see that the error is generated when the signal, partition, or distribution names aren't set correctly (source lines 890 & 1186). But I'm not 100% clear why this should be the case.
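
As a side note on the "Bin ... with center ..." lines in the -stat log: the reported numbers are consistent with 1-based bins whose center is (bin - 0.5) times the bin size. The Python sketch below shows that arithmetic. The relationship is inferred from the log output, not taken from the CNVnator source, and the function name bin_center is hypothetical.

```python
def bin_center(bin_index, bin_size):
    """Center of a 1-based bin, assuming center = (bin - 0.5) * bin size."""
    return (bin_index - 0.5) * bin_size


# Bins reported as "not corrected" in the log, with a bin size of 6000.
print(bin_center(16073, 6000))  # chr11 bin
print(bin_center(19071, 6000))  # chr13 bin
```

This makes it easy to look up which genomic positions the uncorrected bins correspond to.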

Help diagnosing CNVnator error

I am trying to run CNVnator to obtain a file with normalized read depth values for SNP locations. I installed ROOT and CNVnator as instructed and am attempting to run the first step (extracting read mapping from the BAM file). The command I used is:

cnvnator -root HR1204.root -genome /home/blumhagenr/Chromosomes/chr1.fa -chrom chr1 -tree reduced_filtered_HR1204.bam

However, I get this as output and am having trouble diagnosing the error message. Can you help me determine what may be the cause and how best to proceed?

Error: cannot open file "iostream" (tmpfile):2:
*** Interpreter error recovered ***
Parsing file reduced_filtered_HR1204.bam ...
Allocating memory ...
Done.
Filling and saving tree for 'chr1' ...
Fatal in TVirtualStreamerInfo::Factory: Cannot find the plugin handler for TVirtualStreamerInfo! However $ROOTSYS/etc/plugins/TVirtualStreamerInfo is accessible, Check the content of this directory!
aborting
Error in TUnixSystem::StackTrace script /etc/root/gdb-backtrace.sh is missing
Aborted

Using multi-cores for computation?

I have 16 cores on my server and 192GB of RAM. In a test run I see that CNVnator uses only one core. Is it possible to set up CNVnator so that it uses multiple cores and more RAM to speed up the calculation?

I may also try the OpenMP option, but still prefer running on one host first.

Lots of deletions

Hi,

I have low-coverage WGS data (~1x) with many sequencing gaps, i.e. fragments where no reads are found. CNVnator classifies these as deletions, and I get considerably more deletions compared to other CNV detection algorithms. Other tools just discard these regions where no reads are found. Is there any way to get rid of this feature? I am using the default parameters.
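
One possible post-processing step, which is not a CNVnator option, is to drop deletion calls whose normalized read depth is essentially zero, since those correspond to regions with no reads. The Python sketch below assumes the whitespace-separated call format (type, coordinates, size, normalized RD, four e-values, q0). The 0.05 cutoff, the function name, and the example calls are all made up for illustration, and true homozygous deletions would be discarded as well.

```python
def drop_zero_depth_deletions(lines, min_rd=0.05):
    """Keep every call except deletions with normalized RD below min_rd."""
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        cnv_type, normalized_rd = fields[0], float(fields[3])
        if cnv_type == "deletion" and normalized_rd < min_rd:
            continue  # likely a region with no reads at all
        kept.append(line)
    return kept


# Hypothetical calls, not real data.
calls = [
    "deletion\tchr1:1-60000\t60000\t0\t2.6e-12\t0\t2.7e-12\t0\t-1",
    "deletion\tchr1:254201-254900\t700\t0.53\t0.01\t0.0005\t1\t1\t0.1",
    "duplication\tchr1:300001-310000\t10000\t1.6\t1e-9\t0\t1e-8\t0\t0.0",
]
for call in drop_zero_depth_deletions(calls):
    print(call)
```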

CNVnator Genotype

Dear all,

I want to run a population-scale CNV association study. I ran CNVnator and obtained the raw CNV regions for each sample. For each sample, I removed CNVs that overlap genome gap regions and CNVs with a q0 value over 0.5. To define the population CNV regions, I required CNVs to have a reciprocal overlap of more than 90% of their lengths. This gave me a database of locations, which I then used to genotype each sample. However, when I compared the RD values, I found a large difference between the RD reported by the call step and the RD reported by the genotype step, even when the boundaries differ only slightly. I suspect that the RD in the call output is normalized to 1, whereas the RD in the genotype output is based on two copies. Is this right?
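
For readers unfamiliar with the reciprocal-overlap criterion described here, a minimal Python sketch follows. It is an illustrative helper, not part of CNVnator; the function names and example intervals are hypothetical, and coordinates are treated as 1-based and inclusive.

```python
def reciprocal_overlap(start_a, end_a, start_b, end_b):
    """Smaller of the two overlap fractions for 1-based inclusive intervals."""
    overlap = min(end_a, end_b) - max(start_a, start_b) + 1
    if overlap <= 0:
        return 0.0
    len_a = end_a - start_a + 1
    len_b = end_b - start_b + 1
    return min(overlap / len_a, overlap / len_b)


def same_cnv(a, b, threshold=0.9):
    """True when each interval is covered by the other for more than `threshold`."""
    return reciprocal_overlap(*a, *b) > threshold


print(same_cnv((1001, 2000), (1051, 2050)))  # 950 bp shared out of 1000 bp each
print(same_cnv((1001, 2000), (1501, 2500)))  # 500 bp shared out of 1000 bp each
```

Taking the minimum of the two fractions is what makes the test reciprocal: a small CNV nested inside a much larger one does not pass.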

Any suggestions will be appreciated!

Best

ERROR: cnvnator --his

Hello,

When I try to generate a histogram ($ ./cnvnator -root out.root -his 100), I get the following message for all contigs:

Allocating memory ...
Can't determine length for 'chr17'.
No reference genome specified.
Done.
Calculating histograms with bin size of 100 for 'chr17' ..

My commands:
1) ./cnvnator -root NA12878-17.root -chrom chr17 -tree NA12878.chrom17.SLX.maq.SRP000032.2009_07.bam
2) ./cnvnator -genome ./hg19/chr17.fa -root NA12878-17.root -chrom chr17 -his 100

I'm not sure what the problem is. My input is .bam files. Your help would be appreciated!

Can I use a BAM file that contains a single chromosome? If I want to analyze only a single chromosome, what should I do? Could you please give me some advice?

Thanks,
Madeline

Histogram error

Hello,

When I try to generate a histogram ($ ./cnvnator -root out.root -his 100), I get the following message for all contigs:

Calculating histograms with bin size of 100 for 'GL000192.1' ...
Making GC histogram for 'GL000192.1' ...
Can't open file with chromosome sequence.
No chromosome/contig information parsed.
Sequence length (0) is different from expectation (547496) for 'GL000192.1'.
Doing nothing!

I'm not sure what the problem is. My input is .bam files. Your help would be appreciated!

Thanks,
Madeline

contig autocasing breaks non human discovery

Version 0.3 is much easier to run on non-human samples, but some genomes have contigs named like
Chr01
which cnvnator changes to 'chr01' when it tries to load the FASTA file for that contig.
An option to disable this would be very useful; the workarounds are very troublesome.

CNVnatorv0.3.2, root 6.04.10, Ubuntu 14.04, and make

I have successfully installed ROOT version 6.04.10 and the CNVnator version of samtools, but when I compile the CNVnator code with make I keep getting the following error. The Ubuntu system is running BioLinux.

Compiling with parallel (OpenMP) support
g++ -O3 -std=c++11 -DCNVNATOR_VERSION="v0.3.2" -fopenmp -I/include -Isamtools -Isamtools/htslib-1.2.1 -c cnvnator.cpp -o obj/cnvnator.o
In file included from cnvnator.cpp:14:0:
HisMaker.hh:11:20: fatal error: TFrame.h: No such file or directory
#include <TFrame.h>
^
compilation terminated.
make: *** [obj/cnvnator.o] Error 1

I get the same with make OMP=no

I realize that this has previously been associated with the ROOTSYS and LD_LIBRARY_PATH settings, but even when these are correct I get the same error.

TFrame.h is present in the ROOT directory at /data/software/root-6.04.10/graf2d/graf/inc

The set command gives the following (partial output):

GID=1014
GNOME_KEYRING_CONTROL=/run/user/1002/keyring-xVuzel
GNOME_KEYRING_PID=119570
GPG_AGENT_INFO=/run/user/1002/keyring-xVuzel/gpg:0:1
GTK_IM_MODULE=ibus
GTK_MODULES=overlay-scrollbar
HISTCHARS='!^#'
HISTCMD=723
HISTFILE=/home/philbup/.zsh_history
HISTSIZE=2000
HOME=/home/philbup
HOST=MDHS-NIX-028
IFS=' '
KEYBOARD_HACK=''
KEYTIMEOUT=40
LANG=en_AU.UTF-8
LANGUAGE=en_AU:en
LD_LIBRARY_PATH=/data/software/root-6.04.10/lib
LIBPATH=/data/software/root-6.04.10/lib
LINENO=377
LINES=51
LISTMAX=100
LOGCHECK=60
LOGNAME=philbup
MACHTYPE=x86_64
MAIL=/var/mail/philbup
MAILCHECK=60
MAILPATH=''
MANPATH=/data/software/root-6.04.10/man/man1:/man1:/man1:/man1:/data/software/root/man:/usr/local/man:/usr/local/share/man:/usr/share/man
MATE_DESKTOP_SESSION_ID=this-is-deprecated
MODULE_PATH=/usr/lib/x86_64-linux-gnu/zsh/5.0.2
NULLCMD=cat

Do I need to change something in the Makefile, or could this be a problem with the zsh shell?

Explanation of genotype output

I issue the following command to get genotypes and it gives some warnings, along with some numerical assignments:

cnvnator -root cnvnator_merged_unique.root -genotype 100 -ngc
>4:174602340-174608711
Can't find directory 'bin_1000'.
Can't find directory 'bin_1000'.
Genotype 4:174602340-174608711 cnvnator_merged_unique.root 2.48289 -1

First, are the warnings about 'bin_1000' of concern and is there a solution to this issue? My main question is what do the numbers mean? It is not clear if this is the expected output or what it means exactly.

What I would like is to extract per sample copy number estimates from the called CNV region. Is this possible?

Thanks.

"Maximum buffer size exceeded"

Hi,
I used CNVnator v0.3 to call CNVs in our WGS data (7x). In the second step, "CREATING A HISTOGRAM", I encountered the message "Maximum buffer size exceeded". Does it affect the results? How can I deal with it? Thanks.

Here is the output:
[root@maize-1 bamtest]# cnvnator -genome /workdir/xz235/maize3.fa -root total.root -chrom 1 2 3 4 5 6 7 8 9 10 -his 500 -d /workdir/xz235/refSeqbyChr/
Warning in UnknownClass::SetDisplay: DISPLAY not set, setting it to xx.xxx.xx.xxx:0.0
Allocating memory ...
Done.
Calculating histograms with bin size of 500 for '1' ...
Making GC histogram for '1' ...
Maximum buffer size exceeded.

CNVnator make error

Dear all,

I have successfully installed the ROOT package; however, when I try to make CNVnator, I still get some errors:

/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h: At global scope:
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:38:24: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
TFitResultPtr(const std::shared_ptr<TFitResult> & p);
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:38:29: error: ISO C++ forbids declaration of ‘parameter’ with no type [-fpermissive]
TFitResultPtr(const std::shared_ptr<TFitResult> & p);
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:38:39: error: expected ‘,’ or ‘...’ before ‘<’ token
TFitResultPtr(const std::shared_ptr<TFitResult> & p);
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:38:4: error: ‘TFitResultPtr::TFitResultPtr(int)’ cannot be overloaded
TFitResultPtr(const std::shared_ptr<TFitResult> & p);
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:36:4: error: with ‘TFitResultPtr::TFitResultPtr(int)’
TFitResultPtr(int status = -1): fStatus(status), fPointer(0) {};
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:59:4: error: ‘shared_ptr’ in namespace ‘std’ does not name a type
std::shared_ptr<TFitResult> fPointer; //! Smart Pointer to TFitResult class
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h: In constructor ‘TFitResultPtr::TFitResultPtr(int)’:
/gpfs01/home/jingjing/software/root-6.04.02/include/TFitResultPtr.h:36:53: error: class ‘TFitResultPtr’ does not have any field named ‘fPointer’
TFitResultPtr(int status = -1): fStatus(status), fPointer(0) {};
^
In file included from /gpfs01/home/jingjing/software/root-6.04.02/include/TFormula.h:26:0,
from /gpfs01/home/jingjing/software/root-6.04.02/include/TF1.h:29,
from HisMaker.hh:23,
from cnvnator.cpp:8:
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h: At global scope:
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:44:10: error: expected nested-name-specifier before ‘EReturnType’
using EReturnType = TInterpreter::EReturnType;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:44:10: error: using-declaration for non-member at class scope
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:44:22: error: expected ‘;’ before ‘=’ token
using EReturnType = TInterpreter::EReturnType;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:44:22: error: expected unqualified-id before ‘=’ token
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:47:17: error: ‘EReturnType’ does not name a type
static const EReturnType kLong = TInterpreter::EReturnType::kLong;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:48:17: error: ‘EReturnType’ does not name a type
static const EReturnType kDouble = TInterpreter::EReturnType::kDouble;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:49:17: error: ‘EReturnType’ does not name a type
static const EReturnType kString = TInterpreter::EReturnType::kString;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:50:17: error: ‘EReturnType’ does not name a type
static const EReturnType kOther = TInterpreter::EReturnType::kOther;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:51:17: error: ‘EReturnType’ does not name a type
static const EReturnType kNoReturnType = TInterpreter::EReturnType::kNoReturnType;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:53:17: error: ‘EReturnType’ does not name a type
static const EReturnType kNone = TInterpreter::EReturnType::kNoReturnType;
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:66:4: error: ‘EReturnType’ does not name a type
EReturnType fRetType; //method return type
^
In file included from /gpfs01/home/jingjing/software/root-6.04.02/include/TFormula.h:26:0,
from /gpfs01/home/jingjing/software/root-6.04.02/include/TF1.h:29,
from HisMaker.hh:23,
from cnvnator.cpp:8:
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:98:4: error: ‘EReturnType’ does not name a type
EReturnType ReturnType();
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:108:22: warning: variadic templates only available with -std=c++11 or -std=gnu++11 [enabled by default]
template <typename... T> void SetParams(const T&... params) {
^
/gpfs01/home/jingjing/software/root-6.04.02/include/TMethodCall.h:108:56: warning: variadic templates only available with -std=c++11 or -std=gnu++11 [enabled by default]
template <typename... T> void SetParams(const T&... params) {
^
make: *** [obj/cnvnator.o] Error 1

I am not sure whether this is a problem with ROOT or with CNVnator.

[jingjing@headnode src]$ echo $LD_LIBRARY_PATH
/apps/platform/lsf/9.1/linux2.6-glibc2.3-x86_64/lib:/gpfs01/software/general/libgd-2.1.0/lib::/gpfs01/home/jingjing/software/root-6.04.02//lib:/gpfs01/home/jingjing/software/root-6.04.02/lib
[jingjing@headnode src]$ echo $ROOTSYS
/gpfs01/home/jingjing/software/root-6.04.02
[jingjing@headnode src]$ which root
~/software/root-6.04.02/bin/root

Can anyone give me some suggestions?

Jingjing

No chromosome/contig description given.

When I run CNVnator 0.3.3, there is always an error:

./cnvnator -genome ~/ref/Homo_sapiens_assembly19.fasta -root ~/test/out.root -tree ~/data/test.BAM

Parsing file ~/test.BAM ...
No chromosome/contig description given.
No reference genome specified. Aborting parsing.
Writing histograms ...

I have set the reference file. Do I need to set other chromosome/contig files?

Thanks!

Break *** Segmentation violation

I got errors below running cnvnator/0.3.2/gcc/4.8.5 on

Linux ulam 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

The input file is a scaffold region extracted from a large file that contains 3K scaffolds.
The BAM header still seems to contain all the scaffold names. Could that be the issue? If I keep only the current scaffold, LGUN_random_Scaffold1085, in the header, the BAM file is no longer valid.
Could you please check whether anything is wrong here?

cnvnator -root LGUN_random_Scaffold1085.root -chrom LGUN_random_Scaffold1085  -tree ../reheader_sorted_merged_B12G2.bam
Parsing file ../reheader_sorted_merged_B12G2.bam ...

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
    from libstdcxx.v6.printers import register_libstdcxx_printers
#0  0x00002b09e4dd4b4c in __libc_waitpid (pid=12909, stat_loc=stat_loc@entry=0x7ffd0335c520, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:31
#1  0x00002b09e4d5a2e2 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2  0x00002b09e21a9eb7 in TUnixSystem::StackTrace() () from /opt/local/root/5.34.18/gcc/4.6.3//lib/root/libCore.so
#3  0x00002b09e21ac7a3 in TUnixSystem::DispatchSignals(ESignals) () from /opt/local/root/5.34.18/gcc/4.6.3//lib/root/libCore.so
#4  <signal handler called>
#5  0x0000000000427e12 in AliParser::numChrom() ()
#6  0x0000000000424f30 in HisMaker::produceTrees(std::string*, int, std::string*, int, bool) ()
#7  0x0000000000408b72 in main ()
===========================================================


The lines below might hint at the cause of the crash.
If they do not help you then please submit a bug report at
http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x0000000000427e12 in AliParser::numChrom() ()
#6  0x0000000000424f30 in HisMaker::produceTrees(std::string*, int, std::string*, int, bool) ()
#7  0x0000000000408b72 in main ()
===========================================================

Calculating histograms error

Dear all,
CNVnator is a very popular tool for calling CNVs. However, an error in the histogram calculation step has bothered me for a long time. My commands are as follows:

cnvnator -unique -root output.root -tree test.sort.dedup.realn.bam (succeed)
cnvnator -root ouput.root -his 200 -d /ref/sep/ (failed)

and the error log:

Total of 1534005368 reads were placed.
Allocating memory ...
Done.
Calculating histograms with bin size of 200 for '1' ...
Making directory bin_200 ...
Making GC histogram for '1' ...
SysError in TFile::Seek: cannot seek to position -1634381530 in file ERR194147.root, retpos=-1 (Invalid argument)
SysError in TFile::Seek: cannot seek to position 8419450657427287862 in file ERR194147.root, retpos=-1 (Invalid argument)
Error in TFile::ReadBuffer: error reading all requested bytes from file ERR194147.root, got 8522541 of 67108895
Error in TKey::ReadFile: Failed to read data.
SysError in TFile::Seek: cannot seek to position -216073498781733252 in file ERR194147.root, retpos=-1 (Invalid argument)
Done.

The version I used is v0.3.2. I hope you can help me; any suggestion will be appreciated.

Best wishes.

0 reads placed result

Hi,

I would love to use CNVnator but am having trouble with the first step of usage.

When executing the following command:
CNVnator_v0.3.2/src/cnvnator -root out.root -tree test_merged.bam.sorted

I get a result that 0 reads were placed.

Specifically, the output is:
Parsing file test_merged.bam.sorted ...
No chromosome/contig description given.
No reference genome specified. Aborting parsing.
Writing histograms ...
Total of 0 reads were placed.

Any suggestions would be greatly appreciated as to why this occurs.

many thanks,

JSteenwyk

Test issues

Hi,
I would like to add a new functionality. Also, I'm attaching an image of compilation results.

[Image: screen shot 2014-07-09 at 6 55 50 pm]

Alexej Abyzov

Analyze multiple samples together or separated?

Hi Abyzov, I am testing whether CNVnator is suitable for shallow sequencing data. We sequence samples in batches of 6-12 on an Ion Proton. The average coverage is 1X-4X.

I am not clear on how I should run CNVnator from the first step:

Should I run like this?
cnvnator -genome hg19 -root batch1.root -tree sample1.bam sample2.bam sample3.bam ... sample12.bam

or like this?

cnvnator -genome hg19 -root sample1.root -tree sample1.bam
...
cnvnator -genome hg19 -root sample12.root -tree sample12.bam

cnvnator2vcf output problem

Hello all,

I followed the steps to generate SV call files like the following, which look good.

deletion	20:1-60000	60000	0	2.65621e-12	0	2.7478e-12	0	-1
deletion	20:254201-254900	700	0.529152	807.762	0.000491034	1	1	1
deletion	20:259401-260400	1000	0.300038	5.70349	2.27101e-11	1	1	1
deletion	20:278101-278600	500	0.314873	5696.86	0.000254785	1	1	1
deletion	20:304501-309600	5100	0.514826	2.18747e-08	2.72107e+09	0.00531416	2.77891e+09	1

After getting this, I wanted to convert it to a VCF file using cnvnator2vcf. But in the VCF file, the contents of the REF and ALT columns are missing, so I cannot use IGV to view it.

How can I solve this problem? Thanks in advance!

##fileformat=VCFv4.0
##fileDate=20161019
##reference=1000GenomesPhase1-GRCh37
##source=CNVnator
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=IMPRECISE,Number=0,Type=Flag,Description="Imprecise structural variation">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=natorRD,Number=1,Type=Float,Description="Normalized RD">
##INFO=<ID=natorP1,Number=1,Type=Float,Description="p-val by t-test">
##INFO=<ID=natorP2,Number=1,Type=Float,Description="p-val by Gaussian tail">
##INFO=<ID=natorP3,Number=1,Type=Float,Description="p-val by t-test (middle)">
##INFO=<ID=natorP4,Number=1,Type=Float,Description="p-val by Gaussian tail (middle)">
##INFO=<ID=natorQ0,Number=1,Type=Float,Description="Fraction of reads with 0 mapping quality">
##INFO=<ID=SAMPLES,Number=.,Type=String,Description="Sample genotyped to have the variant">
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=DUP,Description="Duplication">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
20	1	CNVnator_1	.	<DEL>	.	PASS	END=60000;SVTYPE=DEL;SVLEN=-60000;IMPRECISE;natorRD=0;natorP1=2.65621e-12;natorP2=0;natorP3=2.7478e-12;natorP4=0;natorQ0=-1
20	254201	CNVnator_2	.	<DEL>	.	PASS	END=254900;SVTYPE=DEL;SVLEN=-700;IMPRECISE;natorRD=0.529152;natorP1=807.762;natorP2=0.000491034;natorP3=1;natorP4=1;natorQ0=1
20	259401	CNVnator_3	.	<DEL>	.	PASS	END=260400;SVTYPE=DEL;SVLEN=-1000;IMPRECISE;natorRD=0.300038;natorP1=5.70349;natorP2=2.27101e-11;natorP3=1;natorP4=1;natorQ0=1
20	278101	CNVnator_4	.	<DEL>	.	PASS	END=278600;SVTYPE=DEL;SVLEN=-500;IMPRECISE;natorRD=0.314873;natorP1=5696.86;natorP2=0.000254785;natorP3=1;natorP4=1;natorQ0=1
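
The correspondence between the call lines and the VCF records in this post can be sketched in Python. This only illustrates the field mapping visible in the two listings; it is not the cnvnator2vcf script itself. The function name call_to_vcf is hypothetical, and the positive SVLEN for duplications is an assumption, since only deletions appear in the listings.

```python
def call_to_vcf(line, index):
    """Convert one CNVnator call line into a tab-separated VCF record."""
    cnv_type, region, size, rd, p1, p2, p3, p4, q0 = line.split()
    chrom, _, span = region.rpartition(":")
    start, _, end = span.partition("-")
    is_deletion = cnv_type == "deletion"
    svtype = "DEL" if is_deletion else "DUP"
    # Assumption: deletions get a negative length, duplications a positive one.
    svlen = ("-" if is_deletion else "") + size
    info = ";".join([
        f"END={end}", f"SVTYPE={svtype}", f"SVLEN={svlen}", "IMPRECISE",
        f"natorRD={rd}", f"natorP1={p1}", f"natorP2={p2}",
        f"natorP3={p3}", f"natorP4={p4}", f"natorQ0={q0}",
    ])
    # REF is "." and ALT is a symbolic allele, as in the records shown here.
    return "\t".join(
        [chrom, start, f"CNVnator_{index}", ".", f"<{svtype}>", ".", "PASS", info]
    )


call = "deletion\t20:254201-254900\t700\t0.529152\t807.762\t0.000491034\t1\t1\t1"
print(call_to_vcf(call, 2))
```

Values are passed through as strings so that numbers such as 2.65621e-12 keep the formatting they have in the call file.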

bam_tview_curses.c:30:20: fatal error: curses.h: No such file or directory

lidanqing@lidanqing-X450VC:~/software/CNVnator_v0.3.2/src/samtools$ make
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_index.o bam_index.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_plcmd.o bam_plcmd.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o sam_view.o sam_view.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_cat.o bam_cat.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_md.o bam_md.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_reheader.o bam_reheader.c
bam_reheader.c: In function ‘bam_reheader’:
bam_reheader.c:36:19: warning: variable ‘old’ set but not used [-Wunused-but-set-variable]
bam_header_t old;
^
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_sort.o bam_sort.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bedidx.o bedidx.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o kprobaln.o kprobaln.c
kprobaln.c: In function ‘kpa_glocal’:
kprobaln.c:78:21: warning: variable ‘is_diff’ set but not used [-Wunused-but-set-variable]
int bw, bw2, i, k, is_diff = 0, is_backward = 1, Pr;
^
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_rmdup.o bam_rmdup.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_rmdupse.o bam_rmdupse.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_mate.o bam_mate.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_stat.o bam_stat.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_color.o bam_color.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bamtk.o bamtk.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam2bcf.o bam2bcf.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam2bcf_indel.o bam2bcf_indel.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o errmod.o errmod.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o sample.o sample.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o cut_target.o cut_target.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o phase.o phase.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam2depth.o bam2depth.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o padding.o padding.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bedcov.o bedcov.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bamshuf.o bamshuf.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o faidx.o faidx.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o stats.o stats.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o stats_isize.o stats_isize.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_flags.o bam_flags.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_split.o bam_split.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_tview.o bam_tview.c
gcc -g -Wall -O2 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE -D_CURSES_LIB=1 -I. -Ihtslib-1.2.1 -c -o bam_tview_curses.o bam_tview_curses.c
bam_tview_curses.c:30:20: fatal error: curses.h: No such file or directory
compilation terminated.
Makefile:116: recipe for target 'bam_tview_curses.o' failed
make: *** [bam_tview_curses.o] Error

CNVnator reads in CRAM

Is CNVnator able to use CRAM files as input? In our local testing it does not seem to have that capability. Is this a feature you might consider adding?
thanks
Jim Eldred
McDonnell Genome Institute

Compilation problem

Hello,

I compiled and installed ROOT (v6.06.06) without problems and set the ROOTSYS variable to the appropriate path. I downloaded the latest version of CNVnator (v0.3.3) via git. When I try to compile it, I get a list of errors that I can't solve.
They all look like this:

HisMaker.cpp:(.text+0x4300): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'

Could you help me, please?

I am using g++ 5.3.1

Cheers

Does CNVnator require a specific version of ROOT?

Dear @bintriz and @abyzov,

A short question regarding CNVnator's dependency on ROOT.
Are the latest versions of ROOT (>= 6.0) compatible with CNVnator or is a specific version of ROOT required?

I'm asking because I'm trying to compile CNVnator with ROOT 6.04.00 and I run into a problem where the library « libCint.so » (the -lCint flag) is not found. My understanding is that this library is no longer available in ROOT 6.04.00 (but maybe I just did not install ROOT correctly).

Thank you and best regards,
Robin

g++ -O3 -std=c++11 -DCNVNATOR_VERSION="v0.3.1" -fopenmp -o cnvnator obj/cnvnator.o obj/HisMaker.o obj/AliParser.o obj/Genotyper.o obj/Interval.o obj/Genome.o /software/UHTS/Analysis/samtools/1.2/lib64/libbam.a -lz -L/software/Development/root/6.04.00-minimal/lib/root//lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lMatrix -lPhysics -lMathCore -lThread -lGui
/software/bin/ld: cannot find -lCint

non-human genome RD signal issues

Hello,

I am currently running CNVnator on a fish genome. The programme seems to run fine, but when it computes the RD statistics it prints the "Fit data is empty" message. It then carries on and appears to work, but does not correct for GC content.
Is there something I am doing wrong? Is it because I am using a non-human genome?
Any comments would be helpful. Thank you.

(Where I have written "etc...", the programme ran through all the chromosomes.)

Parsing file fish.bam ...
Allocating memory ...
Done.
Filling and saving tree for 'chr1' ...
etc....
Writing histograms ...
Total of 127933939 reads were placed.
histogram
Allocating memory ...
Done.
Calculating histograms with bin size of 200 for 'chr1' ...
Making directory bin_200 ...
Making GC histogram for 'chr1' ...
etc...
Making statistics for chr1 ...
etc...
Average RD per bin (1-22) is 11.5863 +- 6.77279 (before GC correction)
Warning in <Fit>: Fit data is empty
Warning in <Fit>: Fit data is empty
Average RD per bin (X,Y) is 0 +- 0 (before GC correction)
Correcting counts by GC-content for 'chr1' ...
Correcting counts by GC-content for 'chr2' ...
Zero value of GC average.
Bin 36874 with center 7.3747e+06 is not corrected.
Zero value of GC average.
Bin 309564 with center 6.19127e+07 is not corrected.
Correcting counts by GC-content for 'chr3' ...
Correcting counts by GC-content for 'chr4' ...
Correcting counts by GC-content for 'chr5' ...
Zero value of GC average.
etc..
Making statistics for chr1 after GC correction ...
etc...
Average RD per bin (1-22) is 10.8082 +- 6.0048 (after GC correction)
Warning in <Fit>: Fit data is empty
Warning in <Fit>: Fit data is empty
Average RD per bin (X,Y) is 0 +- 0 (after GC correction)
partition
Partitioning RD signal for 'chr1' with bin size of 200 ...
Average RD per bin is 10.8082 +- 6.0048
Bin band is 2
Bin band is 3
Bin band is 4
Bin band is 5
Bin band is 6
etc...

Add test cases

It would be really useful if CNVnator included a number of small test cases, which could be run quickly and help troubleshoot installation problems.

How to compile

To compile I had to:

  • install cern-root and yeppp, and make sure both are available
  • git clone
  • edit the Makefile to point to the samtools source code (not to the samtools I have installed) because access to libbam.a is needed:
#SAMDIR = samtools
SAMDIR = /config/source/samtools/samtools-1.3.1
  • make returned an error
/usr/bin/ld: /config/source/samtools/samtools-1.3.1/htslib-1.3.1/libhts.a(plugin.o): undefined reference to symbol 'dlclose@@GLIBC_2.2.5'
/usr/bin/ld: note: 'dlclose@@GLIBC_2.2.5' is defined in DSO /lib64/libdl.so.2 so try adding it to the linker command line
/lib64/libdl.so.2: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
  • so I compiled with: LIBS=-ldl make (thanks to this post) and it was successful.

Do I need all the bam files?

Hello,
Recently, I used CNVnator to predict CNV calls in hundreds of individuals of Populus tomentosa (a plant). To compare CNVs among individuals, we merged all overlapping calls into unique CNVRs using BEDTools. In total, 5300 CNVRs were detected in the P. tomentosa population. To compare gene copy number (CN) between individuals, we want to use the "-genotype" option to infer the average CN across the length of each gene, but I do not know which data or files this procedure needs. I only have the reference genome sequence and the per-individual CNV output files (so the chromosome names, coordinates, and lengths of the genes and CNVRs are known). To run "-genotype", do I need all the BAM files containing the original reads of each resequenced individual? In addition, could you tell me the exact command for "-genotype"? Is it "./cnvnator -root file.root -genotype bin_size (or coordinates of genes) [-ngc]"?
Thank you very much for taking time to read my issue. I look forward to your reply at your earliest convenience.

Best wishes to you.
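
For readers preparing region lists for the genotype step: another report in this tracker shows the `-genotype` prompt accepting regions written as `chr1:1-1000`. Below is a minimal sketch, assuming the merged CNVRs are stored as BED records (0-based, half-open, as BEDTools writes them) and that the prompt expects 1-based inclusive coordinates. The function name `bed_to_regions` and the example records are hypothetical.

```python
def bed_to_regions(bed_lines):
    """Convert BED records (0-based, half-open) into 'chrom:start-end'
    strings with 1-based inclusive coordinates."""
    regions = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return regions


if __name__ == "__main__":
    # Hypothetical CNVR records, for illustration only.
    example = ["Chr01\t999\t5000\tCNVR_1", "Chr02\t0\t1200\tCNVR_2"]
    for region in bed_to_regions(example):
        print(region)
```

The resulting strings can then be typed or piped into the genotype prompt, one region per line.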

Parallelization in partition step?

I am trying to understand what parallelization is available in the partition step. I am using someone's wrapper to an older version of cnvnator, and they extract each chromosome to a separate root file and run partition in parallel, then serially merge it back to the original root file. Is this still necessary, or have you added inherent parallelization to the partition step?

CNVnator genotype

I used the following commands to get genotypes,

./cnvnator -root sample.root -genome hg19 -chrom chr1 -unique -tree chr1.bam
./cnvnator -root sample.root -genome hg19 -chrom chr1 -d ./b37 -his 100
./cnvnator -root sample.root -chrom chr1 -stat 100
./cnvnator -root sample.root -chrom chr1 -partition 100
./cnvnator -root sample.root -d ./b37 -genotype 100

In step 5, when the ">" prompt appeared, I typed "chr1:1-1000". The output was:
Can't find directory 'bin_1000'.
Can't find directory 'bin_1000'.
Genotype chr1:1-1000 sample.root -1 -1

Why is the copy number for my data "-1"? I tried many data sets and the output is always the same.
I can use the same sample.root to call CNVs successfully.

Thanks.

CNVnator not recognising GRCh37 aligned samples

Hi,

When I run CNVnator with -genome GRCh37 and pass the file path to the chromosome directory containing 1.fa (etc) it still tries to find 'chr1' etc as the chromosomes.

Exact command:

cnvnator -root sample.root -genome GRCh37 -his 10000 -d /mnt/lustre/references/speedseq/annotations/cnvnator_chroms

Error messages:

Can't open file /mnt/lustre/references/speedseq/annotations/cnvnator_chroms/chr1.fa
No chromosome/contig information parsed
Read sequence is of different length from expectation

I'm using v0.3 FYI.
Any ideas why this might be happening?

Thanks,

SB
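
The error above complains about a missing `chr1.fa` while the directory holds `1.fa`, so a quick check of which per-chromosome FASTA files are actually present can narrow things down. A minimal sketch, assuming one `<name>.fa` file per chromosome is expected in the directory passed with `-d`; the helper name `missing_fasta_files` is hypothetical, and the temporary directory only stands in for a real reference directory.

```python
import os
import tempfile


def missing_fasta_files(fasta_dir, chrom_names, extension=".fa"):
    """Return the chromosome names with no '<name><extension>' file in fasta_dir."""
    return [
        name
        for name in chrom_names
        if not os.path.isfile(os.path.join(fasta_dir, name + extension))
    ]


if __name__ == "__main__":
    # Build a throwaway directory that mimics a reference with '1.fa', '2.fa'.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("1", "2"):
            open(os.path.join(tmp, name + ".fa"), "w").close()
        print("missing with 'chr' prefix:", missing_fasta_files(tmp, ["chr1", "chr2"]))
        print("missing without prefix:", missing_fasta_files(tmp, ["1", "2"]))
```

If the names stored in the .root file carry a `chr` prefix and the FASTA files do not (or the reverse), the mismatch shows up immediately in the first list.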

CNVnator with control bam file

Hi,
Is it possible to run CNVnator with two BAM files, one from a tumor sample and the other from a control sample?

CNVnator result filtering

Dear Alexej Abyzov,
I am using CNVnator v0.3.2 for CNV calling. After it finished I got a set of raw calls. How can I filter them?
I also have a question about the definition of duplication and deletion: is normalized_RD > 1 defined as a duplication, and normalized_RD < 1 as a deletion? If so, I got results like this:
deletion chr1:122613801-122624200 10400 2.22569 15649.7 2.86171e+09 12154.1 2.8635e+09 0.627206
deletion chr1:122763901-122777200 13300 1.46077 18102.9 2.85913e+09 10231.4 2.86091e+09 0.520318
deletion chr1:122872301-122876900 4600 1.55713 117590 2.86689e+09 81973.3 2.86868e+09 0.399678
deletion chr1:123010201-123013300 3100 1.47059 231779 2.86823e+09 154720 2.86898e+09 0.371232
deletion chr1:123089801-123091700 1900 1.28593 528143 2.8693e+09 1 1 0.582763
deletion chr1:123274601-123278000 3400 1.0499 393230 2.86796e+09 358119 2.86975e+09 0.517699
deletion chr1:123341601-123352500 10900 1.34957 39799.6 2.86127e+09 25483.9 2.86305e+09 0.47585
deletion chr1:123946601-123951500 4900 3.25128 36370 2.86662e+09 37502.8 2.86841e+09 0.422769
deletion chr1:124144801-124148100 3300 1.28625 230407 2.86805e+09 58904 2.86984e+09 0.656344
deletion chr1:124716101-124719700 3600 2.19913 58635 2.86778e+09 49175.7 2.86957e+09 0.393118
Is this normal?
Looking forward to your reply!
Cheers,
Gerde
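
To make the question concrete, the rows above can be checked mechanically. This is a minimal sketch, assuming the nine whitespace-separated columns are CNV type, coordinates, size, normalized_RD, e-val1 to e-val4, and q0 (the layout described in the CNVnator README), and assuming normalized_RD is scaled so that 1 corresponds to the genome-wide average. The names `Call`, `parse_call`, and `is_inconsistent` are hypothetical.

```python
from collections import namedtuple

Call = namedtuple(
    "Call",
    "cnv_type chrom start end size normalized_rd e_val1 e_val2 e_val3 e_val4 q0",
)


def parse_call(line):
    """Parse one line of call output into a Call record."""
    fields = line.split()
    cnv_type, coords = fields[0], fields[1]
    # Split on the last ':' so contig names containing ':' still work.
    chrom, span = coords.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    size = int(float(fields[2]))
    values = [float(x) for x in fields[3:9]]
    return Call(cnv_type, chrom, start, end, size, *values)


def is_inconsistent(call):
    """Flag calls whose type disagrees with normalized RD relative to 1."""
    if call.cnv_type == "deletion":
        return call.normalized_rd >= 1.0
    if call.cnv_type == "duplication":
        return call.normalized_rd <= 1.0
    return False


if __name__ == "__main__":
    row = ("deletion chr1:122613801-122624200 10400 2.22569 "
           "15649.7 2.86171e+09 12154.1 2.8635e+09 0.627206")
    call = parse_call(row)
    print(call.cnv_type, call.normalized_rd, is_inconsistent(call))
```

Under those assumptions every row quoted above would be flagged, since each is labelled a deletion with normalized_RD above 1.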

How to set bin size

As the title says.
The best bin size for calling CNVs may depend on the sequencing depth, the genome size, and other factors. How can I quickly determine a bin size that is close to the optimum?
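
One practical way to compare candidate bin sizes is to look at the "Average RD per bin ... is MEAN +- SD" lines that the `-stat` step prints (examples appear in other reports on this page) and compute the mean-to-SD ratio for each bin size. A minimal sketch follows; the function name `rd_ratios` is hypothetical. As far as I recall, the CNVnator README suggests aiming for a ratio of roughly 4 to 5, but treat that target as an assumption and check it against the README for your version.

```python
import re

# Matches lines such as:
# Average RD per bin (1-22) is 11.5863 +- 6.77279 (before GC correction)
NUMBER = r"(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
RD_LINE = re.compile(
    r"Average RD per bin \(([^)]*)\)\s+is\s+"
    + NUMBER + r"\s*\+-\s*" + NUMBER
    + r"\s*\((before|after) GC correction\)"
)


def rd_ratios(log_text):
    """Return (label, stage, mean, sd, mean/sd) for each RD summary line.

    The ratio is None when the reported SD is zero.
    """
    results = []
    for match in RD_LINE.finditer(log_text):
        label, stage = match.group(1), match.group(4)
        mean, sd = float(match.group(2)), float(match.group(3))
        ratio = mean / sd if sd > 0 else None
        results.append((label, stage, mean, sd, ratio))
    return results


if __name__ == "__main__":
    log = (
        "Average RD per bin (1-22) is 11.5863 +- 6.77279 (before GC correction)\n"
        "Average RD per bin (X,Y) is 0 +- 0 (before GC correction)\n"
    )
    for label, stage, mean, sd, ratio in rd_ratios(log):
        print(label, stage, mean, sd, ratio)
```

Running the histogram and statistics steps for a few bin sizes and comparing the ratios gives a quick, data-driven starting point before partitioning and calling.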

FitConfig stderr(maybe root problems) in cnvnator

I used CNVnator 0.3.2 to call CNVs.

Here are the commands:
genome=ucsc.hg19
chrom="chr1 chr2 ... chrX chrY chrM"

cnvnator -root $rootname.root -chrom $chrom -genome $genome -tree $bam 2>err
cnvnator -root $rootname.root -chrom $chrom -his $bin_size -d /var/data/ 2>>err
cnvnator -root $rootname.root -chrom $chrom -stat $bin_size 2>>err
cnvnator -root $rootname.root -chrom $chrom -partition $bin_size  2>>err
cnvnator -root $rootname.root -chrom $chrom -call $bin_size  >$outputcnv 2>>err

The lines in the output file are normal except for the following:

Number of free parameters from FitConfig = 3
Number of free parameters from Minimizer = 2
Number of free parameters from FitConfig = 3
Number of free parameters from Minimizer = 2


Compilation error wrt Samtools

Using gcc 4.8.5, ROOT 6.06.8, yeppp 1.0, and samtools 1.3.1 (which I copied into place from elsewhere), I get the following. There is more, but it's all along the same lines.

Which version of samtools are you using, and are there any plans to upgrade to the newer samtools?

[root@vmpr-res-utils CNVnator]# make
Compiling with parallel (OpenMP) support
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.3\" -fopenmp -I/config/binaries/cern-root/6.06.8/include      -Isamtools -Isamtools/htslib-1.3.1 -I/config/binaries/yeppp/1.0.0/lib -DUSE_YEPPP -c cnvnator.cpp -o obj/cnvnator.o
In file included from samtools/sam.h:29:0,
                 from AliParser.hh:10,
                 from cnvnator.cpp:13:
samtools/bam.h:73:9: error: ‘bam_hdr_t’ does not name a type
 typedef bam_hdr_t bam_header_t;
         ^
samtools/bam.h:119:9: error: ‘hts_itr_t’ does not name a type
 typedef hts_itr_t *bam_iter_t;
         ^
samtools/bam.h:200:13: error: ‘samFile’ does not name a type
     typedef samFile *tamFile;
             ^
samtools/bam.h:207:19: error: ‘tamFile’ does not name a type
     static inline tamFile samtools_sam_open(const char *fn) { return sam_open(fn, "r"); }
                   ^
samtools/bam.h:234:5: error: ‘bam_header_t’ does not name a type
     bam_header_t *sam_header_read2(const char *fn_list);
     ^
samtools/bam.h:241:19: error: ‘bam_header_t’ does not name a type
     static inline bam_header_t *sam_header_read(tamFile fp) { return sam_hdr_read(fp); }
                   ^
samtools/bam.h:244:45: error: ‘bam_header_t’ does not name a type
     static inline int32_t bam_get_tid(const bam_header_t *header, const char *seq_name) { return bam_name2id((bam_header_t *)header, seq_name); }
                                             ^
samtools/bam.h:244:59: error: ISO C++ forbids declaration of ‘header’ with no type [-fpermissive]
     static inline int32_t bam_get_tid(const bam_header_t *header, const char *seq_name) { return bam_name2id((bam_header_t *)header, seq_name); }
                                                           ^
samtools/bam.h: In function ‘int32_t bam_get_tid(const int*, const char*)’:
samtools/bam.h:244:111: error: ‘bam_header_t’ was not declared in this scope
     static inline int32_t bam_get_tid(const bam_header_t *header, const char *seq_name) { return bam_name2id((bam_header_t *)header, seq_name); }
                                                                                                               ^
samtools/bam.h:244:125: error: expected primary-expression before ‘)’ token
     static inline int32_t bam_get_tid(const bam_header_t *header, const char *seq_name) { return bam_name2id((bam_header_t *)header, seq_name); }

Compilation error

Dear @bintriz ,

I've noticed that src/samtools directory is not included in the release, thus I got an error:

g++ -O3 -DCNVNATOR_VERSION=\"v0.3.1\" -fopenmp -o cnvnator obj/cnvnator.o obj/HisMaker.o obj/AliParser.o obj/Genotyper.o obj/Interval.o obj/Genome.o samtools/libbam.a -lz -L/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lMatrix -lPhysics -lMathCore -lThread -lGui
g++: error: samtools/libbam.a: No such file or directory

Is there a fix for this? A workaround could be that I compile samtools source code and manually move libbam.a to cnvnator, but it doesn't sound like an elegant solution.

Best wishes,
Fengyuan

Problem running -merge roots

Hello

I am having a problem running the merge function of CNVnator. I split the analysis into several runs because I am working on a non-model genome with >5000 scaffolds. The previous runs, with 100 scaffolds each, were successful, but I was not sure whether the output of those separate runs would be the same as running everything at once from a merged root file (the alignment file is the same for each run).

chroms=$(cat ../input_all); merges=$(ls -1 roots/root_* | tr '\n' ' '); /opt/CNVnator/CNVnator-master/cnvnator -root out_all.root -chrom $chroms -merge $merges

It breaks with the following error -

*** Break *** segmentation violation
 
===========================================================
There was a crash.
This is the entire stack trace of all threads:
 ===========================================================
 #0  0x00000033304ac2ce in __libc_waitpid (pid=<optimized out>, stat_loc=0x7ffff77afb10, options=0) at ../sysdeps/unix/sysv/linux/waitpid.c:32
 #1  0x000000333044107e in do_system (line=0x2022e78 "/usr/share/root/gdb-backtrace.sh 29566 1>&2") at ../sysdeps/posix/system.c:149
 #2  0x000000352f63cb0a in TUnixSystem::StackTrace (this=0x1c53530) at /usr/src/debug/root-5.28.00h/core/unix/src/TUnixSystem.cxx:2253
 #3  0x000000352f63f283 in TUnixSystem::DispatchSignals (this=0x1c53530, sig=kSigSegmentationViolation) at /usr/src/debug/root-5.28.00h/core/unix/src/TUnixSystem.cxx:1157
 #4  <signal handler called>
 #5  size (this=0x7ffff794e3d8) at /usr/src/debug/gcc-4.6.3-20120306/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:711
 #6  std::string::assign (this=0x7ffff794e3d8, __s=0x7ffff795bc4e "Lvir_1901", __n=9) at /usr/src/debug/gcc-4.6.3-20120306/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:264
 #7  0x00000000004083a8 in main ()
 ===========================================================
 
 
 The lines below might hint at the cause of the crash.
 If they do not help you then please submit a bug report at
 http://root.cern.ch/bugs. Please post the ENTIRE stack trace
 from above as an attachment in addition to anything else
 that might help us fixing this issue.
 ===========================================================
 #5  size (this=0x7ffff794e3d8) at /usr/src/debug/gcc-4.6.3-20120306/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.h:711
 #6  std::string::assign (this=0x7ffff794e3d8, __s=0x7ffff795bc4e "Lvir_1901", __n=9) at /usr/src/debug/gcc-4.6.3-20120306/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/basic_string.tcc:264
 #7  0x00000000004083a8 in main ()
 ===========================================================

I have seen the issue of "segmentation violation" before, but increasing the number (N_CHROM_MAX) does not change anything. Can you please let me know what the problem might be?

P.S. The ROOT version I am using is quite old (5.28, to be exact). I did not update it because CNVnator doesn't state which version to use. Might this be the reason?

Compiling CNVnator 0.3.2. Error when linking: undefined reference to `TString::TString

Hi,
I'm trying to compile CNVnator with root-6.06 but there is a problem when linking. Here is the makefile trace:

$ g++ --version
g++ (GCC) 5.2.0

$ echo ${ROOTSYS}
/ccc/products/root-6.06.06/default
$ make SAMDIR=/ccc/work/cont007/fg0019/lindenbp/packages/samtools HTSDIR=/ccc/work/cont007/fg0019/lindenbp/packages/htslib  -B
Compiling with parallel (OpenMP) support
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c cnvnator.cpp -o obj/cnvnator.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c EXOnator.cpp -o obj/EXOnator.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c HisMaker.cpp -o obj/HisMaker.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c AliParser.cpp -o obj/AliParser.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c Genotyper.cpp -o obj/Genotyper.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c Interval.cpp -o obj/Interval.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -I/ccc/products/root-6.06.06/default/include      -I/ccc/work/cont007/fg0019/lindenbp/packages/samtools -I/ccc/work/cont007/fg0019/lindenbp/packages/htslib -c Genome.cpp -o obj/Genome.o
g++ -O3 -std=c++11 -DCNVNATOR_VERSION=\"v0.3.2\" -fopenmp -o cnvnator obj/cnvnator.o obj/EXOnator.o obj/HisMaker.o obj/AliParser.o obj/Genotyper.o obj/Interval.o obj/Genome.o /ccc/work/cont007/fg0019/lindenbp/packages/samtools/libbam.a /ccc/work/cont007/fg0019/lindenbp/packages/htslib/libhts.a -lz -L/ccc/products/root-6.06.06/default/lib      -lCore -lRIO -lHist -lGraf -lGpad -lTree -lMathCore
obj/HisMaker.o: In function `HisMaker::HisMaker(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, Genome*)':
HisMaker.cpp:(.text+0x4280): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::pe_for_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, double, double)':
HisMaker.cpp:(.text+0x810a): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::HisMaker(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, bool, Genome*)':
HisMaker.cpp:(.text+0xa86a): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::generateView(TString, int, int, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int)':
HisMaker.cpp:(.text+0xb903): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
HisMaker.cpp:(.text+0xbd5e): undefined reference to `TString::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
HisMaker.cpp:(.text+0xc723): undefined reference to `TString::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::view(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool, bool)':
HisMaker.cpp:(.text+0xd4b2): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::genotype(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool, bool)':
HisMaker.cpp:(.text+0xda50): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::eval(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool, bool)':
HisMaker.cpp:(.text+0x11da1): undefined reference to `TString::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::produceHistogramsNew(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int)':
HisMaker.cpp:(.text+0x1334b): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::mergeTrees(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int)':
HisMaker.cpp:(.text+0x15c71): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o: In function `HisMaker::callSVs(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, int, bool, bool, bool)':
HisMaker.cpp:(.text+0x168e7): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
HisMaker.cpp:(.text+0x1696f): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
HisMaker.cpp:(.text+0x16a06): undefined reference to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
obj/HisMaker.o:HisMaker.cpp:(.text+0x16a94): more undefined references to `TString::TString(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)' follow
collect2: error: ld returned 1 exit status
Makefile:59: recipe for target 'cnvnator' failed

Can you please help me?
thanks.

A tiny question

Hi abyzov

Just a really tiny question. I am currently working on a non-human genome analysis, and I have recently been using CNVnator to identify CNVs in my samples. But when I convert my CNV files into VCF files, the header contains "##reference=1000GenomesPhase3_decoy-GRCh37". It still shows the human reference genome.
So now I am not sure whether CNVnator is able to call CNVs in a non-human genome (although I am sure some people are doing this). Why does the VCF file still show the human reference genome, even though I used a non-human reference? And how should I use CNVnator for CNV calling in a non-human genome? Is it different from what is described in the README file?
Thank you!

Nan
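
If the only concern is the label in the converted VCF, the header line can be rewritten after conversion. A minimal sketch, assuming the `##reference=` line is just a descriptive header that the conversion step writes with a fixed default (an assumption based on the symptom described, not on the converter's source). The function name `set_vcf_reference` and the assembly name in the example are hypothetical.

```python
def set_vcf_reference(vcf_lines, reference_name):
    """Yield VCF lines with the value of the '##reference=' header replaced.

    All other lines, including the calls themselves, pass through unchanged.
    """
    for line in vcf_lines:
        if line.startswith("##reference="):
            newline = "\n" if line.endswith("\n") else ""
            yield "##reference=" + reference_name + newline
        else:
            yield line


if __name__ == "__main__":
    header = [
        "##fileformat=VCFv4.1\n",
        "##reference=1000GenomesPhase3_decoy-GRCh37\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    ]
    # 'my_species_assembly_v1' is a placeholder name.
    for out_line in set_vcf_reference(header, "my_species_assembly_v1"):
        print(out_line, end="")
```

This changes only the metadata line; it says nothing about whether the calls themselves were made against the intended reference.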

whole scaffold is deleted

Hello,
I ran CNVnator on an organism with over 50,000 scaffolds, and for some scaffolds the result reports that the whole scaffold is deleted. Is this because CNVnator compares the coverage of each scaffold to a background calculated from the whole genome?
thanks

Clarification on evalue1

Hi,

I'm a bit confused about the range of e-val1 values I'm getting. For the calls, I was expecting e-values below 1, and many below e.g. 0.05. However, I see a lot of calls (~20%) with e-val1 > 100 and q0 < 0.5.

Is this normal?

I'm asking because I want to remove low-confidence calls and thought I would keep only calls with q0 < 0.5 and e-val1 < 0.01. But then I see all these calls with {e-val1 > 100, q0 < 0.5}, so I'm wondering whether I should really remove all of them.

Thanks in advance,

Jean
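
The filter proposed above can be written down directly. This is a minimal sketch, assuming the usual nine-column call output with e-val1 in the fifth column and q0 in the ninth; the thresholds 0.01 and 0.5 are the ones suggested in the question, not recommended values, and the function name `keep_call` is hypothetical. Note that an e-value is an expected count rather than a probability, so values above 1 are possible in principle.

```python
def keep_call(line, max_e_val1=0.01, max_q0=0.5):
    """Return True if a call line passes both the e-val1 and q0 thresholds.

    Columns (whitespace-separated): type, coordinates, size, normalized_RD,
    e-val1, e-val2, e-val3, e-val4, q0.
    """
    fields = line.split()
    e_val1 = float(fields[4])
    q0 = float(fields[8])
    # A negative q0 is treated as "not available" and fails the filter.
    return e_val1 < max_e_val1 and 0.0 <= q0 < max_q0


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [
        "deletion chr1:1-1000 1000 0.1 1e-05 2e-05 3e-05 4e-05 0.1",
        "deletion chr1:5001-9000 4000 0.6 150.0 2e+03 3e+02 4e+03 0.2",
    ]
    for row in rows:
        print(keep_call(row), row)
```

Counting how many calls each threshold removes on its own is a cheap way to see which of the two criteria is doing most of the filtering.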

-stat log

Hi Abyzov,

During the "CALCULATING STATISTICS" step on a genome made up of scaffolds,
it seems that CNVnator skips part of the GC correction because of the chromosome names (Average RD per bin (1-22) is 4.50973 +- 2.58553 (before GC correction)). But the CNV calling results seem fine.

Is this a big problem for CNV calling? What should I do so that GC correction is performed correctly?

Thank you
Lizhong

Below is the log:

Standard output log:

Making statistics for scaffold10 ...
Average RD per bin (1-22) is 4.50973 +- 2.58553 (before GC correction)
Average RD per bin (X,Y)  is 0 +- 0 (before GC correction)
Correcting counts by GC-content for 'scaffold10' ...
Making statistics for scaffold10 after GC correction ...
Average RD per bin (1-22) is 4.3648 +- 2.50379 (after GC correction)
Average RD per bin (X,Y)  is 0 +- 0 (after GC correction)

Standard error log:

Warning in <Fit>: Fit data is empty
Warning in <Fit>: Fit data is empty
Zero value of GC average.
Bin 10800 with center 1.07995e+06 is not corrected.
Zero value of GC average.
Bin 30449 with center 3.04485e+06 is not corrected.
Zero value of GC average.
Bin 45603 with center 4.56025e+06 is not corrected.
Zero value of GC average.
...
...
Zero value of GC average.
Bin 114211 with center 1.1421e+07 is not corrected.
Zero value of GC average.
Bin 116606 with center 1.16606e+07 is not corrected.
Zero value of GC average.
Bin 116799 with center 1.16798e+07 is not corrected.
Warning in <Fit>: Fit data is empty
Warning in <Fit>: Fit data is empty

Which versions work?

Hi,

Could someone please post which combinations of CNVnator and ROOT versions work together? I've tried to follow the instructions given here, on the CERN ROOT website, and in various versions of http://seqanswers.com/forums/showthread.php?t=16665, but so far without success. I've tried recent versions as well as ROOT versions from around the time of the publication.

The closest I could get to an installation was using

#!/usr/bin/env bash

CNVNATORVERSION=0.3
ROOTVERSION=5.30.06
INSTALLDIR=$HOME

# ROOT
ROOTPATH=${INSTALLDIR}/root
cd ${INSTALLDIR}
wget -nc https://root.cern.ch/download/root_v${ROOTVERSION}.source.tar.gz
tar -zxvf root_v${ROOTVERSION}.source.tar.gz
cd ${ROOTPATH}
source ${ROOTPATH}/bin/thisroot.csh
./configure
make


#CNVnator
echo "Getting CNVnator..."
cd ${INSTALLDIR}
wget -nc http://sv.gersteinlab.org/cnvnator/CNVnator_v${CNVNATORVERSION}.zip
unzip -o ${INSTALLDIR}/CNVnator_v${CNVNATORVERSION}.zip
SRCDIR=${INSTALLDIR}/CNVnator_v${CNVNATORVERSION}/src

echo "Making samtools..."
cd ${SRCDIR}/samtools
make

printf "\nExporting variables...\n\n"
ROOTSYS=${ROOTPATH}
export ROOTSYS
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${ROOTSYS}/lib
export LD_LIBRARY_PATH
source ${ROOTPATH}/bin/thisroot.sh
printf "\nROOTSYS is ${ROOTSYS}\n"
printf "\nLD_LIBRARY_PATH is ${LD_LIBRARY_PATH}\n\n"

printf "\nMaking CNVnator...\n\n"
cd ${SRCDIR}
make
echo ${ROOTSYS}

I had to remove -lPhysics -lGraph3D from the Makefile (which doesn't sound like it would cause problems) and manually link to pthread. At least that compiled, but I got a segfault for any BAM file I tried.

Any help would be greatly appreciated.

Questions on CNV calling results

Hello,

I am a newcomer to bioinformatics. While using CNVnator v0.3.2 for a project, I ran into some questions when trying to make sense of the CNV calling results.

  1. In many papers, people filter their raw CNV calls using p=0.05 as a cut-off, but the output I'm seeing contains four e-values instead of p-values. Are the e-values converted from the p-values (which are calculated from the t-test)? Or can the e-values and p-values be treated interchangeably?
  2. For e-val2, what is meant by "the region to be in the tail of Gaussian distribution"? Can I interpret this value as the significance of the call being a CNV?
  3. For e-val3 and e-val4, what is meant by "for the middle of CNV", and what is the purpose of looking specifically at the middle of the CNV?
    Thank you in advance for comments and help!

genotype almost all homozygous

Hello,

I am currently trying to distinguish between heterozygous and homozygous deletions for my individuals.
After filtering CNVs by q0 and pval1, I am running genotype on all deletion calls.
If I understand the paper correctly (notably around figure 2), a deletion is heterozygous if its genotype is around 1.5 and homozygous if it is around 0.
From my data I expect a larger number of heterozygous loci, but well over 95% of my genotyped calls are homozygous (with almost all genotype values between 0 and 0.02).

Is this something you have encountered before? Or am I interpreting the results wrong?

Any comments or suggestions would be greatly appreciated.

Thank you for your help,
Alicia

Cling error

I am getting an error about cling and missing libraries when running CNVnator v0.3.2 using the following bash script:

source /path/to/root-6.06.02/bin/thisroot.sh
/path/to/CNVnator_v0.3.2/src/cnvnator -root out.root -tree bamfile.bam

This is the full error message:

ERROR in cling::CIFactory::createCI(): cannot extract standard library include paths!
Invoking:
    echo | LC_ALL=C c++  -pipe -m64 -Wall -W -Woverloaded-virtual -fsigned-char -fPIC -pthread -std=c++11 -Wno-deprecated-declarations -Wno-comment -Wno-unused-parameter -Wno-maybe-uninitialized -Wno-unused-but-set-variable -Wno-missing-field-initializers  -fPIC -fvisibi
results in
with exit code 256
input_line_1:1:10: fatal error: 'new' file not found
#include <new>
         ^
input_line_3:38:10: fatal error: 'string' file not found
#include <string>
         ^
IncrementalExecutor::executeFunction: symbol '_ZN5cling7runtime6gClingE' unresolved while linking [cling interface function]!
You are probably missing the definition of cling::runtime::gCling
Maybe you need to load the corresponding shared library?

I am running this on CentOS 7 with ROOT 6.06.02. Both CNVnator and ROOT were built from source. Any idea how to solve this?

error

Hi, I tried installing CNVnator and ended up with the following error. I am new to this kind of analysis and am trying to use CNVnator for CNV detection. Please help me fix this.

g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c cnvnator.cpp -o obj/cnvnator.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c HisMaker.cpp -o obj/HisMaker.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c AliParser.cpp -o obj/AliParser.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c Genotyper.cpp -o obj/Genotyper.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c Interval.cpp -o obj/Interval.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -I/wrk/../CNV_dir/CNVnator_v0.3/root//include -Isamtools -c Genome.cpp -o obj/Genome.o
g++ -m64 -DCNVNATOR_VERSION="v0.3" -o cnvnator obj/cnvnator.o obj/HisMaker.o obj/AliParser.o obj/Genotyper.o obj/Interval.o obj/Genome.o samtools/libbam.a -lz -L/wrk/../CNV_dir/CNVnator_v0.3/root//lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lMatrix -lPhysics -lMathCore -lThread -lGui
/wrk/../CNV_dir/CNVnator_v0.3/root//lib/libCore.so: undefined reference to `std::__throw_out_of_range_fmt(char const*, ...)@GLIBCXX_3.4.20'
collect2: error: ld returned 1 exit status
make: *** [cnvnator] Error 1

Best regards,
SKM
