yaqiangcao / cLoops
Accurate and flexible loops calling tool for 3D genomic data.
Home Page: https://yaqiangcao.github.io/cLoops/
License: MIT License
Hello, I used Juicer to preprocess my data and make .hic files. I am trying several loop-calling tools to get a sense of which loops in my dataset are the most confident. I know that the input files for cLoops are .bedpe files; I'm wondering if you have a pipeline from Juicer to cLoops.
If you don't, I'm wondering how you pre-process the FASTQ files before running cLoops.
Thank you so much,
Yonatan
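In case it helps others asking the same thing, here is a rough sketch of one possible Juicer-to-cLoops path: converting Juicer's merged_nodups.txt to BEDPE. This is not an official pipeline; it assumes Juicer's documented long format (str1 chr1 pos1 frag1 str2 chr2 pos2 frag2 ...), and since merged_nodups does not store mapped lengths, it approximates each read's interval with a fixed read length (READ_LEN is an assumption you must adjust).

```python
"""Hypothetical sketch: convert Juicer's merged_nodups.txt to 10-column
BEDPE for cLoops. Assumes the long format (str1 chr1 pos1 frag1 str2 chr2
pos2 frag2 ...) where strand is 0 (forward) or 16 (reverse) and pos is the
5' end of the read."""

READ_LEN = 100  # assumption: set this to your actual sequencing read length

def to_interval(strand, pos):
    # forward read: interval extends downstream of the 5' end
    if int(strand) == 0:
        return pos, pos + READ_LEN, "+"
    # reverse read: interval extends upstream of the 5' end
    return max(0, pos - READ_LEN), pos, "-"

def convert(lines):
    out = []
    for i, line in enumerate(lines):
        f = line.split()
        c1, p1, c2, p2 = f[1], int(f[2]), f[5], int(f[6])
        a_start, a_end, a_strand = to_interval(f[0], p1)
        b_start, b_end, b_strand = to_interval(f[4], p2)
        out.append("\t".join(map(str, [c1, a_start, a_end, c2, b_start, b_end,
                                       "pet_%d" % i, ".", a_strand, b_strand])))
    return out

# toy demo with a fabricated pair: forward read at 1000, reverse read at 5000
demo = convert(["0 chr1 1000 5 16 chr1 5000 9 60"])
print(demo[0])
```

The output lines can then be gzipped and fed to cLoops with -f.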
Dear Yaqiang,
I've fed cLoops a HiChIP bedpe file from a hicpro allvalidpairs file.
commands I used: cLoops -f "input files" -o hichip -m 4 -j -s -w -p -1. Then jd2washU,
Now I have the resulting file processed by tabix and was able to visualize it on WashU. The thing is, what I was able to visualize is a TAD-like interaction density ("pic 1"), but not loops. Is there a way to obtain/visualize a loops file? Thanks in advance.
-Yussuf
Hi, Yaqiang
I found that hicpro2bedpe is not working properly, as the code misunderstands the meaning of the 11th and 12th columns. They are not the read lengths but the mapping qualities. Please check this thread: https://groups.google.com/g/hic-pro/c/qhHNbQVJfYY.
Best,
Kun
It seems like in a recent commit the threshold to distinguish inter-ligation and self-ligation PETs was set to 0 regardless of the input data. Is this the intended behavior?
Line 284 in 3ea8b4c
Hi when I run the conda env create line this happens:
"conda env create --name cLoops --file cLoops_env.yaml
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
This seems to be an issue of certain package versions not being available for osx-64 vs. linux-64. Please help.
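One workaround sketch (an assumption, not a confirmed fix): ResolvePackageNotFound on osx-64 is often caused by exact linux-64 build strings pinned in the environment file. Stripping the build field (name=version=build to name=version) lets conda re-solve for the current platform. The snippet below does that with a simple regex over the yaml text.

```python
"""Hypothetical workaround: drop exact build pins from a conda env yaml so
the solver can pick platform-appropriate builds. Assumes dependencies are
listed as "- name=version=build"."""

import re

def relax_pins(yaml_text):
    # turn "  - numpy=1.14.3=py27h28100ab_1" into "  - numpy=1.14.3";
    # lines with only "name=version" are left untouched
    return re.sub(r"^(\s*-\s*[\w.-]+=[\w.]+)=\S+$", r"\1",
                  yaml_text, flags=re.M)

example = "dependencies:\n  - numpy=1.14.3=py27h28100ab_1\n  - scipy=1.1.0\n"
print(relax_pins(example))
```

After relaxing the pins, re-run conda env create --name cLoops --file cLoops_env.yaml; some linux-only packages may still need manual substitution.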
Everything seems to work nicely with typically sized Hi-C datasets, but when attempting to run on something larger (e.g., ~4e9 contacts genome-wide) with -eps 5000,10000 -minPts 50,100 -hic, the following sort of issue pops up:
Clustering chr8 and chr8 finished. Estimated 43365022 self-ligation reads and 5506751 inter-ligation reads
Traceback (most recent call last):
File "/local/anaconda3/envs/cloops/bin/cLoops", line 8, in <module>
sys.exit(main())
File "/local/anaconda3/envs/cloops/lib/python2.7/site-packages/cLoops/pipe.py", line 352, in main
hic, op.washU, op.juice, op.cut, op.plot, op.max_cut)
File "/local/anaconda3/envs/cloops/lib/python2.7/site-packages/cLoops/pipe.py", line 250, in pipe
dataI_2, dataS_2, dis_2, dss_2 = runDBSCAN(cfs, ep, m, cut, cpu)
File "/local/anaconda3/envs/cloops/lib/python2.7/site-packages/cLoops/pipe.py", line 118, in runDBSCAN
for f in fs)
File "/local/anaconda3/envs/cloops/lib/python2.7/site-packages/joblib/parallel.py", line 789, in __call__
self.retrieve()
File "/local/anaconda3/envs/cloops/lib/python2.7/site-packages/joblib/parallel.py", line 699, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "/local/anaconda3/envs/cloops/lib/python2.7/multiprocessing/pool.py", line 572, in get
raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[(('chr8', 'chr8'), 'hic/chr8-chr8.jd', ...
Based on scikit-learn/scikit-learn#8920, I wrapped all the Parallel() calls
in pipe.py inside with-blocks using the "threading" back-end, and that seems to have gotten around the error.
My question is whether this is the right way to go about this problem, given the "parallel computating bugs" mentioned in the README.
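For concreteness, the workaround described above looks roughly like the sketch below. The worker function is a stand-in, not cLoops' actual runDBSCAN; the point is only the with-block forcing joblib's "threading" back-end, which keeps results in-process and so avoids pickling the large per-chromosome results between processes (the source of the MaybeEncodingError).

```python
"""Sketch of the threading-backend workaround: results of each job stay in
the parent process, so nothing large has to be pickled back."""

from joblib import Parallel, delayed, parallel_backend

def cluster_chrom(name):
    # stand-in for the per-chromosome clustering job
    return name, len(name)

chroms = ["chr1", "chr2", "chr8"]

with parallel_backend("threading", n_jobs=2):
    results = Parallel()(delayed(cluster_chrom)(c) for c in chroms)

print(results)
```

The trade-off is that CPU-bound pure-Python work no longer runs in parallel under the GIL, so this can be slower than the process back-end when it does work.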
Hi,
I've generated loops and I've got the SAMPLE_washU.txt file.
Which option in the WashU Genome Browser should I use to show the loops as a track?
Thanks,
Roberto
Dear Yaqiang,
I am interested in using cLoops on our Hi-C data, which has been processed using TADbit and binless. Unfortunately, neither of these programs produces a BEDPE file, so I am a bit at a loss in terms of feeding this data to cLoops.
TADbit does create its own BED-like .tsv files with paired-end mapped reads, which in turn are used by binless, so I was wondering whether these could be re-formatted and used by cLoops.
I uploaded a sample file for testing purposes, see https://github.com/bontus/cLoops_TADbit/blob/master/FPR_5_both_map_chr10_sample.tsv
File format description (edited based on https://github.com/3DGenomes/binless)
TADbit .tsv: tab-separated text file containing paired-end mapped reads for a single experiment. All lines starting with # will be discarded. The order of the reads is unimportant. It has the following columns:
id: ID of the mapped pair (alphanumeric without spaces)
chr1: chromosome of read 1 (alphanumeric without spaces)
pos1: position of the 5' end of read 1 (integer)
strand1: whether the read maps on the forward (1) or reverse (0) direction
length1: how many bases were mapped (integer)
re.up1: upstream restriction site closest to pos1 (integer <= pos1)
re.dn1: downstream restriction site closest to pos1 (integer > pos1)
chr2: chromosome of read 2 (alphanumeric without spaces)
pos2: position of the 5' end of read 2 (integer)
strand2: whether the read maps on the forward (1) or reverse (0) direction
length2: how many bases were mapped (integer)
re.up2: upstream restriction site closest to pos2 (integer <= pos2)
re.dn2: downstream restriction site closest to pos2 (integer > pos2)
As I have not worked with HiC-Pro, I do not know whether its BEDPE file provides information about the positions of restriction sites or not. In case the BEDPE files for cLoops solely contain information based on the id, chr1/2, pos1/2 and strand1/2 information, one could convert the TADbit TSVs to BEDPE using AWK:
TSVFILE=FPR_5_both_map_chr10_sample.tsv;
grep -v "[#~]" $TSVFILE \
| awk -v OFS='\t' '{
if ($4 == 1 && $10 == 1) {print $2,$3,$3+$5,$8,$9,$9+$11,$1,".\t+\t+";}
if ($4 == 1 && $10 == 0) {print $2,$3,$3+$5,$8,$9-$11,$9,$1,".\t+\t-";}
if ($4 == 0 && $10 == 1) {print $2,$3-$5,$3,$8,$9,$9+$11,$1,".\t-\t+";}
if ($4 == 0 && $10 == 0) {print $2,$3-$5,$3,$8,$9-$11,$9,$1,".\t-\t-";}
}'
One thing to note is TADbit's mapping, as reads can be split at the ligation site if they do not map entirely, which makes their files quite complex. An example can be seen here (after converting their TSV to SAM):
D00733:158:CA67GANXX:1:1211:12600:68498#1/6 0 chr10 305327 33 1M = 430210 41 * * TC:i:1 E1:i:304175 E2:i:305356 E3:i:429917 E4:i:430247
D00733:158:CA67GANXX:1:1211:12600:68498#4/6 24 chr10 305327 33 1M = 305359 -33 * * TC:i:1 E1:i:304175 E2:i:305356 E3:i:305356 E4:i:305474
D00733:158:CA67GANXX:1:1211:12600:68498#5/6 16 chr10 305327 33 1M = 430250 -41 * * TC:i:1 E1:i:304175 E2:i:305356 E3:i:430247 E4:i:430302
D00733:158:CA67GANXX:1:1211:12600:68498~2~#2/6 0 chr10 305359 33 0M = 430210 41 * * TC:i:2 E1:i:305356 E2:i:305474 E3:i:429917 E4:i:430247
D00733:158:CA67GANXX:1:1211:12600:68498~2~#6/6 0 chr10 305359 33 0M = 430250 -41 * * TC:i:2 E1:i:305356 E2:i:305474 E3:i:430247 E4:i:430302
D00733:158:CA67GANXX:1:1211:12600:68498#4/6 24 chr10 305359 33 0M = 305327 33 * * TC:i:1 E3:i:305356 E4:i:305474 E1:i:304175 E2:i:305356
D00733:158:CA67GANXX:1:1211:12600:68498#1/6 0 chr10 430210 41 1M = 305327 33 * * TC:i:1 E3:i:429917 E4:i:430247 E1:i:304175 E2:i:305356
D00733:158:CA67GANXX:1:1211:12600:68498~2~#2/6 0 chr10 430210 41 1M = 305359 -33 * * TC:i:2 E3:i:429917 E4:i:430247 E1:i:305356 E2:i:305474
D00733:158:CA67GANXX:1:1211:12600:68498#3/6 24 chr10 430210 41 1M = 430250 -41 * * TC:i:1 E1:i:429917 E2:i:430247 E3:i:430247 E4:i:430302
D00733:158:CA67GANXX:1:1211:12600:68498~2~#6/6 0 chr10 430250 41 0M = 305359 -33 * * TC:i:2 E3:i:430247 E4:i:430302 E1:i:305356 E2:i:305474
D00733:158:CA67GANXX:1:1211:12600:68498#5/6 16 chr10 430250 41 0M = 305327 33 * * TC:i:1 E3:i:430247 E4:i:430302 E1:i:304175 E2:i:305356
D00733:158:CA67GANXX:1:1211:12600:68498#3/6 24 chr10 430250 41 0M = 430210 41 * * TC:i:1 E3:i:430247 E4:i:430302 E1:i:429917 E2:i:430247
Let me know what you think; it would be great if we could solve this issue and make cLoops work with TADbit too.
Kind regards
edit: fixed the code
I am running cLoops on the chr10 intra-chromosomal Hi-C data of Rao et al. (GM12878) with "-m 3".
There are 128M cis PETs on this chromosome,
and I find that over 50 GB of memory is used and cLoops keeps running after 1 day.
So, what is the expected running time according to your past tests?
So I have a .bedpe file whose head looks like this. I wrote the conversion code so as to also include the orientation and produce the last three columns, but unfortunately it still does not work for me. The columns are separated with \t (as required).
chr1 869398 870595 chr1 904618 906401 5 . + -
chr1 869398 870595 chr1 937699 942959 13 . + -
chr1 869398 870595 chr1 979636 987730 2 . + +
chr1 869398 870595 chr1 1001366 1003470 5 . + -
chr1 869398 870595 chr1 1058440 1061403 2 . + +
chr1 869398 870595 chr1 1118816 1123474 2 . + +
chr1 869398 870595 chr1 1250309 1252884 2 . + -
chr1 869398 870595 chr1 1290219 1292623 2 . + -
chr1 904618 906401 chr1 914193 915144 5 . + +
and the command that I am trying to run is something like,
cLoops -f GM12878WT_ChIAPET_SMC1A_B1S4B2S2B3S2_2.bedpe.gz -o cLoops_out -minPts 20,30 -eps 2500,5000,7500,10000 -hic -s -j -c chr21
as you propose in the documentation. My purpose is to call loops with cLoops first, so as to find stripes afterwards. The error that I get is:
2022-02-09 18:05:06,608 INFO Command line: cLoops -f GM12878WT_ChIAPET_SMC1A_B1S4B2S2B3S2_2.bedpe.gz -o cLoops_out -m 0 -eps 2500,5000,7500,10000 -minPts 20,30 -p 1 -w False -j True -s True -c chr21 -hic True -cut 0 -plot False -max_cut False
2022-02-09 18:05:06,632 INFO mode:0 eps:[2500, 5000, 7500, 10000] minPts:[30, 20] hic:True
2022-02-09 18:05:06,632 INFO Parsing PETs from GM12878WT_ChIAPET_SMC1A_B1S4B2S2B3S2_2.bedpe.gz, requiring initial distance cutoff > 0
300000 PETs processed from GM12878WT_ChIAPET_SMC1A_B1S4B2S2B3S2_2.bedpe.gz()
2022-02-09 18:05:07,933 INFO Totaly 333808 PETs from GM12878WT_ChIAPET_SMC1A_B1S4B2S2B3S2_2.bedpe.gz, in which 3535 cis PETs
Clustering chr21 and chr21 using eps as 2500, minPts as 30,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:07,960 INFO ERROR: no inter-ligation PETs detected for eps 2500 minPts 30,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 2500, minPts as 20,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:07,978 INFO ERROR: no inter-ligation PETs detected for eps 2500 minPts 20,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 5000, minPts as 30,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:07,996 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 30,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 5000, minPts as 20,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:08,015 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 20,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 7500, minPts as 30,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:08,034 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 30,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 7500, minPts as 20,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:08,052 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 20,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 10000, minPts as 30,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:08,070 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 30,can't model the distance cutoff,continue anyway
Clustering chr21 and chr21 using eps as 10000, minPts as 20,pre-set distance cutoff as > 0
Clustering chr21 and chr21 finished. Estimated 0 self-ligation reads and 0 inter-ligation reads
2022-02-09 18:05:08,089 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 20,can't model the distance cutoff,continue anyway
Traceback (most recent call last):
File "/home/blackpianocat/anaconda3/envs/cLoops/bin/cLoops", line 11, in <module>
load_entry_point('cLoops==0.93', 'console_scripts', 'cLoops')()
File "build/bdist.linux-x86_64/egg/cLoops/pipe.py", line 349, in main
File "build/bdist.linux-x86_64/egg/cLoops/pipe.py", line 280, in pipe
File "/home/blackpianocat/anaconda3/envs/cLoops/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2618, in amin
initial=initial)
File "/home/blackpianocat/anaconda3/envs/cLoops/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity
I also tried your new script cLoops2, but I still have problems since after the preprocessing it gives me empty files.
Hi,
I want to analyze some files for which I don't have the raw data, only .hic files. I wonder if it is possible to use cLoops with a .hic file by extracting the data in sparse format using 'juicer_tools dump' and then formatting the output as a BEDPE with awk. The reads would already be binned at a pre-set resolution (5k, 10k), and there would be a score associated with each pair of bins, which could be added to column 8 of the BEDPE file. Could this still work, or do you anticipate problems with the calling algorithm if I were to use this?
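A rough sketch of the conversion described above, offered with a caveat: for an intra-chromosomal matrix, juicer_tools dump emits "pos1 pos2 count" lines at a fixed resolution, and as far as I can tell cLoops treats every BEDPE line as a single PET rather than weighting by a score column. So instead of writing the count into column 8, this sketch repeats each bin pair count times. The bin size (RES) and the dummy name/strand fields are assumptions.

```python
"""Hypothetical conversion of 'juicer_tools dump' sparse output
("pos1 pos2 count") to BEDPE lines, one line per contact."""

RES = 10000  # assumption: the bin size used for the dump

def dump_to_bedpe(chrom, lines):
    out = []
    for line in lines:
        p1, p2, count = line.split()
        p1, p2, count = int(p1), int(p2), int(float(count))
        # repeat the bin pair `count` times: one BEDPE line == one PET;
        # "." name/score and "+/-" strands are placeholders
        for _ in range(count):
            out.append("\t".join(map(str, [chrom, p1, p1 + RES,
                                           chrom, p2, p2 + RES,
                                           ".", ".", "+", "-"])))
    return out

rows = dump_to_bedpe("chr21", ["5000000 5200000 3.0"])
print(len(rows), rows[0])
```

Whether binned, pre-aggregated contacts are statistically appropriate input for cLoops' density clustering is exactly the open question here; this only shows the mechanical reformatting.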
This is a question. After generating a list of loops for a given hic matrix file, is there any way using cLoops to get the enrichment score of the same loop coordinates across another matrix file for comparison purposes? If not possible, could you suggest any tool for this? Thank you.
Hi Yaqiang, recently I'm working with a Hi-C dataset. Since the sequencing depth of our data is not huge (about 80 million cis PETs), I need to find the optimal "eps", and here is my question: if I specify "eps" with a series of values, will the program return the consensus loops called at every "eps", or a simple merge of the loops called at each "eps" value? Thanks a lot :)
I am trying to run cLoops with public data (Vian et al., 2018), but it gives the error below:
cooler dump --join GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30.mcool::resolutions/5000 -o GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe
cLoops -f GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe -o GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.cLoops -j -m 3
2020-11-12 15:19:44,166 INFO Command line: cLoops -f GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe -o GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.cLoops -m 3 -eps 0 -minPts 0 -p 1 -w False -j True -s False -c -hic False -cut 0 -plot False -max_cut False
2020-11-12 15:19:44,166 INFO mode:3 eps:[5000, 7500, 10000] minPts:[50, 40, 30, 20] hic:1
2020-11-12 15:19:44,178 INFO Parsing PETs from GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe, requiring initial distance cutoff > 0
632800000 PETs processed from GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe()
2020-11-12 18:27:59,059 INFO Totaly 632884895 PETs from GSE82144_Kieffer-Kwon-2017-activated_B_cells_72_hours_WT_30_res5kb.bedpe, in which 0 cis PETs
2020-11-12 18:27:59,180 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 50,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,180 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 40,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 30,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 5000 minPts 20,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 50,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 40,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 30,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,181 INFO ERROR: no inter-ligation PETs detected for eps 7500 minPts 20,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,182 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 50,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,182 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 40,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,183 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 30,can't model the distance cutoff,continue anyway
2020-11-12 18:27:59,183 INFO ERROR: no inter-ligation PETs detected for eps 10000 minPts 20,can't model the distance cutoff,continue anyway
Traceback (most recent call last):
File "/mnt/data0/apps/anaconda/Anaconda2-5.2/bin/cLoops", line 11, in
load_entry_point('cLoops==0.93', 'console_scripts', 'cLoops')()
File "build/bdist.linux-x86_64/egg/cLoops/pipe.py", line 349, in main
File "build/bdist.linux-x86_64/egg/cLoops/pipe.py", line 280, in pipe
File "/mnt/data0/apps/anaconda/Anaconda2-5.2/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2618, in amin
initial=initial)
File "/mnt/data0/apps/anaconda/Anaconda2-5.2/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation minimum which has no identity
Please let me know if there is something that I missed. Thanks.
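A speculative note on the run above, not a confirmed diagnosis: cooler dump --join emits 7 columns (chrom1 start1 end1 chrom2 start2 end2 count) with no name/strand fields, and depending on the assembly the chromosome names may lack the "chr" prefix. Either could plausibly explain "0 cis PETs". The sketch below pads each line to 10 BEDPE columns and normalizes the names; both fixes are guesses.

```python
"""Speculative fix-up for 'cooler dump --join' output before feeding it to
cLoops: pad to 10 BEDPE columns and add a "chr" prefix if missing."""

def fix_line(line):
    f = line.rstrip("\n").split("\t")
    c1, s1, e1, c2, s2, e2 = f[:6]
    if not c1.startswith("chr"):
        c1, c2 = "chr" + c1, "chr" + c2
    # name=".", score=count if present, dummy strands for the parser
    score = f[6] if len(f) > 6 else "."
    return "\t".join([c1, s1, e1, c2, s2, e2, ".", score, "+", "-"])

print(fix_line("1\t5000\t10000\t1\t40000\t45000\t7"))
```

Even with the columns fixed, the count stays one line per bin pair here, so the earlier caveat about cLoops counting lines rather than scores would still apply.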
Hi,
I just wanted to ask a quick question about the strand information. Is it required to specify the strands in the input .bedpe file, or is it enough to provide chr1-start1-end1-chr2-start2-end2-score?
Thank you.
Dogancan
Hi,
This is an enhancement issue.
Can the output of cLoops be somehow made compatible with existing differential loop callers, e.g. diffloop?
Thank you,
Rishi
Hi,
I'm running cLoops with 4 BEDPE files and it has hung for more than 24 hours without consuming any resources on the computer.
This is my output:
$ cLoops \
    -f CHLA_1_S1_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,CHLA_1_S1_L002_R1_001.fastq_sorted_fixmate.bedpe.gz,CHLA_2_S4_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,CHLA_2_S4_L002_R1_001.fastq_sorted_fixmate.bedpe.gz \
    -o CHLA \
    -m 3 \
    -eps 5000,7500,10000 \
    -minPts 0 \
    -p 16 \
    -c chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chrX,chrY \
    -w \
    -j \
    -s \
    -hic \
    -plot
2020-07-14 13:20:43,940 INFO Command line: cLoops -f /tmp/tmpzvcg8_fw/stgbab360e9-91ca-4353-bd46-e1c0a23caa3e/CHLA_1_S1_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stg60cf6a5f-dfa9-4ec1-9dd8-254e57420dfa/CHLA_1_S1_L002_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stgbd15da67-02ed-4e88-99b6-1aab652edea7/CHLA_2_S4_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stg04f31652-6ef1-4445-ac9b-da8de561bffe/CHLA_2_S4_L002_R1_001.fastq_sorted_fixmate.bedpe.gz -o CHLA -m 3 -eps 5000,7500,10000 -minPts 0 -p 16 -w True -j True -s True -c chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chrX,chrY -hic True -cut 0 -plot True -max_cut False
2020-07-14 13:20:43,940 INFO mode:3 eps:[5000, 7500, 10000] minPts:[50, 40, 30, 20] hic:1
2020-07-14 13:20:43,940 INFO Parsing PETs from /tmp/tmpzvcg8_fw/stgbab360e9-91ca-4353-bd46-e1c0a23caa3e/CHLA_1_S1_L001_R1_001.fastq_sorted_fixmate.bedpe.gz, requiring initial distance cutoff > 0
92700000 PETs processed from /tmp/tmpzvcg8_fw/stgbab360e9-91ca-4353-bd46-e1c0a23caa3e/CHLA_1_S1_L001_R1_001.fastq_sorted_fixmate.bedpe.gz2020-07-14 13:31:55,055 INFO Parsing PETs from /tmp/tmpzvcg8_fw/stg60cf6a5f-dfa9-4ec1-9dd8-254e57420dfa/CHLA_1_S1_L002_R1_001.fastq_sorted_fixmate.bedpe.gz, requiring initial distance cutoff > 0
184800000 PETs processed from /tmp/tmpzvcg8_fw/stg60cf6a5f-dfa9-4ec1-9dd8-254e57420dfa/CHLA_1_S1_L002_R1_001.fastq_sorted_fixmate.bedpe.gz2020-07-14 13:43:04,119 INFO Parsing PETs from /tmp/tmpzvcg8_fw/stgbd15da67-02ed-4e88-99b6-1aab652edea7/CHLA_2_S4_L001_R1_001.fastq_sorted_fixmate.bedpe.gz, requiring initial distance cutoff > 0
241200000 PETs processed from /tmp/tmpzvcg8_fw/stgbd15da67-02ed-4e88-99b6-1aab652edea7/CHLA_2_S4_L001_R1_001.fastq_sorted_fixmate.bedpe.gz2020-07-14 13:49:51,955 INFO Parsing PETs from /tmp/tmpzvcg8_fw/stg04f31652-6ef1-4445-ac9b-da8de561bffe/CHLA_2_S4_L002_R1_001.fastq_sorted_fixmate.bedpe.gz, requiring initial distance cutoff > 0
297100000 PETs processed from /tmp/tmpzvcg8_fw/stg04f31652-6ef1-4445-ac9b-da8de561bffe/CHLA_2_S4_L002_R1_001.fastq_sorted_fixmate.bedpe.gz()
2020-07-14 13:56:39,257 INFO Totaly 297190142 PETs from /tmp/tmpzvcg8_fw/stgbab360e9-91ca-4353-bd46-e1c0a23caa3e/CHLA_1_S1_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stg60cf6a5f-dfa9-4ec1-9dd8-254e57420dfa/CHLA_1_S1_L002_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stgbd15da67-02ed-4e88-99b6-1aab652edea7/CHLA_2_S4_L001_R1_001.fastq_sorted_fixmate.bedpe.gz,/tmp/tmpzvcg8_fw/stg04f31652-6ef1-4445-ac9b-da8de561bffe/CHLA_2_S4_L002_R1_001.fastq_sorted_fixmate.bedpe.gz, in which 231457356 cis PETs
Clustering chrX and chrX using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr13 and chr13 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr18 and chr18 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr19 and chr19 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr15 and chr15 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr9 and chr9 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr10 and chr10 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr11 and chr11 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr8 and chr8 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr4 and chr4 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr17 and chr17 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr7 and chr7 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr6 and chr6 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr5 and chr5 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr2 and chr2 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr1 and chr1 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr18 and chr18 finished. Estimated 4159373 self-ligation reads and 4787 inter-ligation reads
Clustering chrX and chrX finished. Estimated 3892349 self-ligation reads and 30128 inter-ligation reads
Clustering chr12 and chr12 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr13 and chr13 finished. Estimated 4807700 self-ligation reads and 23345 inter-ligation reads
Clustering chr14 and chr14 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr16 and chr16 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr15 and chr15 finished. Estimated 6001390 self-ligation reads and 148954 inter-ligation reads
Clustering chr19 and chr19 finished. Estimated 4936452 self-ligation reads and 31579 inter-ligation reads
Clustering chr3 and chr3 using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr4 and chr4 finished. Estimated 8868202 self-ligation reads and 23993 inter-ligation reads
Clustering chrY and chrY using eps as 5000, minPts as 50,pre-set distance cutoff as > 0
Clustering chr9 and chr9 finished. Estimated 7327541 self-ligation reads and 231139 inter-ligation reads
Clustering chr11 and chr11 finished. Estimated 8249398 self-ligation reads and 8235 inter-ligation reads
Clustering chrY and chrY finished. Estimated 460686 self-ligation reads and 60557 inter-ligation reads
Clustering chr5 and chr5 finished. Estimated 9604772 self-ligation reads and 100824 inter-ligation reads
Clustering chr6 and chr6 finished. Estimated 9308583 self-ligation reads and 15008 inter-ligation reads
Clustering chr7 and chr7 finished. Estimated 9012988 self-ligation reads and 138554 inter-ligation reads
Clustering chr10 and chr10 finished. Estimated 8305044 self-ligation reads and 71578 inter-ligation reads
Clustering chr14 and chr14 finished. Estimated 5752252 self-ligation reads and 22969 inter-ligation reads
Clustering chr8 and chr8 finished. Estimated 8440548 self-ligation reads and 125033 inter-ligation reads
Clustering chr17 and chr17 finished. Estimated 9138294 self-ligation reads and 171760 inter-ligation reads
Clustering chr2 and chr2 finished. Estimated 14067151 self-ligation reads and 108679 inter-ligation reads
Clustering chr16 and chr16 finished. Estimated 5992621 self-ligation reads and 232902 inter-ligation reads
Clustering chr3 and chr3 finished. Estimated 11099243 self-ligation reads and 7972 inter-ligation reads
Clustering chr12 and chr12 finished. Estimated 12273217 self-ligation reads and 20282 inter-ligation reads
^CProcess PoolWorker-21:
Traceback (most recent call last):
  File "/bin/cloops/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
    self.run()
  File "/bin/cloops/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/bin/cloops/lib/python2.7/multiprocessing/pool.py", line 102, in worker
    task = get()
  File "/bin/cloops/lib/python2.7/site-packages/joblib/pool.py", line 360, in get
    racquire()
KeyboardInterrupt
[the same KeyboardInterrupt traceback is repeated, interleaved, for PoolWorker-17 through PoolWorker-33]
Traceback (most recent call last):
File "/bin/jupyter/bin/cwltool", line 10, in <module>
sys.exit(run())
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/main.py", line 1240, in run
sys.exit(main(*args, **kwargs))
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/main.py", line 1105, in main
tool, initialized_job_order_object, runtimeContext, logger=_logger
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/executors.py", line 54, in __call__
return self.execute(*args, **kwargs)
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/executors.py", line 137, in execute
self.run_jobs(process, job_order_object, logger, runtime_context)
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/executors.py", line 244, in run_jobs
job.run(runtime_context)
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/job.py", line 570, in run
self._execute([], env, runtimeContext, monitor_function)
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/job.py", line 373, in _execute
monitor_function=monitor_function,
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/job.py", line 957, in _job_popen
monitor_function(sproc)
File "/bin/jupyter/lib/python3.7/site-packages/cwltool/job.py", line 497, in process_monitor
sproc.wait()
File "/bin/jupyter/lib/python3.7/subprocess.py", line 1019, in wait
return self._wait(timeout=timeout)
File "/bin/jupyter/lib/python3.7/subprocess.py", line 1653, in _wait
(pid, sts) = self._try_wait(0)
File "/bin/jupyter/lib/python3.7/subprocess.py", line 1611, in _try_wait
(pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt
When I try to call stripes with cLoops, the shell shows: "callStripes: command not found".
I get the following error after installation when running the test script:
[seb@pc examples]$ sh run.sh
2019-05-03 18:39:59,823 INFO Command line: cLoops -f GSM1872886_GM12878_CTCF_ChIA-PET_chr21_hg38.bedpe.gz -o chiapet -m 0 -eps 0 -minPts 0 -p 1 -w True -j True -s True -c -hic False -cut 0 -plot False -max_cut False
2019-05-03 18:39:59,824 ERROR minPts not assigned!
Converting to .hic file which could be loaded in juicebox
juicer_tools pre -n -r 1000,5000,10000,20000 -d 0.951037801484 CTCF_chr21_juice.hic hg38
0.951037801484 does not exist or does not contain any reads.
rm 0.951037801484
Converting CTCF_chr21_juice.hic to juicer's hic file finished.
Converting to washU track.
bedtools sort -i 0.139065929601 > CTCF_chr21_PETs_washU.txt
rm 0.139065929601
bgzip CTCF_chr21_PETs_washU.txt
tabix -p bed CTCF_chr21_PETs_washU.txt.gz
Converting CTCF_chr21_PETs_washU.txt to washU random accessed track finished.
2019-05-03 18:40:02,585 INFO Getting finger print for chiapet
Traceback (most recent call last):
  File "/pc/home/seb/no_backup/python/bin/jd2fingerprint", line 4, in <module>
    __import__('pkg_resources').run_script('cLoops==0.93', 'jd2fingerprint')
  File "/pc/home/seb/.local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/pc/home/seb/.local/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1453, in run_script
    exec(script_code, namespace, namespace)
  File "/pc/home/seb/no_backup/python/lib/python2.7/site-packages/cLoops-0.93-py2.7.egg/EGG-INFO/scripts/jd2fingerprint", line 118, in <module>
  File "/pc/home/seb/no_backup/python/lib/python2.7/site-packages/cLoops-0.93-py2.7.egg/EGG-INFO/scripts/jd2fingerprint", line 94, in getFingerPrint
  File "/pc/home/seb/no_backup/python/lib/python2.7/site-packages/cLoops-0.93-py2.7.egg/EGG-INFO/scripts/jd2fingerprint", line 70, in jds2FingerPrint
ValueError: need at least one array to concatenate
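The final ValueError usually means NumPy was handed an empty list of arrays — i.e., jd2fingerprint collected no .jd files to work on (for example, if the output directory is empty or a chromosome filter matched nothing). A minimal reproduction of the NumPy behavior:

```python
import numpy as np

# np.concatenate raises ValueError when given an empty sequence,
# which is what happens when no .jd data files are collected.
try:
    np.concatenate([])
except ValueError as e:
    print(e)  # need at least one array to concatenate

# With at least one array it succeeds.
print(np.concatenate([np.array([1, 2]), np.array([3])]))  # [1 2 3]
```

Checking that the expected .jd files actually exist in the output directory is the first thing to verify when this error appears.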
Hi YaQiang!
When I download the stripe example file and run the code, the error says the file is not correct, but I can't find these files in the examples folder. Please help, thanks!
cLoops -f GSM1872886_GM12878_CTCF_ChIA-PET_chr21_hg38.bedpe.gz -o chiapet -w 1 -j 1 -s 1
usage: cLoops [-h] -f FNIN -o FNOUT [-m {0,1,2,3,4}] [-eps EPS]
[-minPts MINPTS] [-p CPU] [-c CHROMS] [-w] [-j] [-s] [-hic]
[-cut CUT] [-plot] [-v]
cLoops: error: unrecognized arguments: 1 1 1
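The "unrecognized arguments: 1 1 1" message comes from -w, -j and -s being on/off switches (as the usage line shows, they take no value), so argparse treats the trailing 1s as stray positionals. An illustrative parser (not cLoops' actual code) showing the behavior:

```python
import argparse

# Illustrative parser, not cLoops' actual code: -w, -j, -s are
# store_true switches, so they take no value on the command line.
parser = argparse.ArgumentParser(prog="cLoops")
parser.add_argument("-w", action="store_true")
parser.add_argument("-j", action="store_true")
parser.add_argument("-s", action="store_true")

# Correct usage: bare flags.
args = parser.parse_args(["-w", "-j", "-s"])
print(args.w, args.j, args.s)  # True True True

# "-w 1 -j 1 -s 1" leaves the 1s as unrecognized extras:
_, extra = parser.parse_known_args(["-w", "1", "-j", "1", "-s", "1"])
print(extra)  # ['1', '1', '1']
```

So the fix for the command above is simply `cLoops -f GSM1872886_GM12878_CTCF_ChIA-PET_chr21_hg38.bedpe.gz -o chiapet -w -j -s`.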
Hi! This tool looks really interesting and I would like to give it a try. Is it possible to make the input format more flexible? E.g., support the .pairs format from 4DN (https://github.com/4dn-dcic/pairix/blob/master/pairs_format_specification.md) and its extension from the Mirny lab (https://pairtools.readthedocs.io/en/latest/formats.html)?
Or allow specifying which column in the data corresponds to what. Converting such huge files takes a long time, considering they need to be un-gzipped and then gzipped again. Thank you!
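Until .pairs is supported natively, a small converter is straightforward to write. The sketch below assumes the default .pairs column order (readID, chrom1, pos1, chrom2, pos2, strand1, strand2) and writes each end as a short fixed-length interval — an assumption, since .pairs stores single positions rather than read spans:

```python
import gzip

def pairs_to_bedpe(pairs_path, bedpe_path, read_len=1):
    """Convert a 4DN .pairs file to BEDPE.

    Assumes the default .pairs column order:
    readID chrom1 pos1 chrom2 pos2 strand1 strand2
    Each end is written as an interval of `read_len` bp (an
    assumption: .pairs stores points, not read spans).
    """
    opener = gzip.open if pairs_path.endswith(".gz") else open
    with opener(pairs_path, "rt") as fin, open(bedpe_path, "w") as fout:
        for line in fin:
            if line.startswith("#"):  # skip .pairs header lines
                continue
            f = line.split()
            rid, c1, p1, c2, p2, s1, s2 = (
                f[0], f[1], int(f[2]), f[3], int(f[4]), f[5], f[6])
            fout.write("\t".join(map(str, [
                c1, p1, p1 + read_len,
                c2, p2, p2 + read_len,
                rid, ".", s1, s2])) + "\n")
```

The output follows the standard BEDPE layout (chrom1, start1, end1, chrom2, start2, end2, name, score, strand1, strand2).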
Hi,
It's an interesting new tool, but how do I prepare the BEDPE file to feed into this software?
I tried using HiCcompare to convert the HiC-Pro results; however, the output is still different from the example data (GSM1872886_GM12878_CTCF_ChIA-PET_chr21_hg38.bedpe).
Thanks.
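For HiC-Pro output, the allValidPairs file can be converted to BEDPE directly. A minimal sketch, assuming HiC-Pro's documented allValidPairs column order (readID, chr1, pos1, strand1, chr2, pos2, strand2, ...) and a fixed read length (an assumption, since allValidPairs stores positions rather than read spans; columns 11 and 12 are mapping quality, not read length):

```python
def validpairs_to_bedpe(vp_path, bedpe_path, read_len=50):
    """Convert HiC-Pro allValidPairs to BEDPE.

    Assumed allValidPairs column order:
    readID chr1 pos1 strand1 chr2 pos2 strand2 ...
    read_len is an assumed fixed read length.
    """
    with open(vp_path) as fin, open(bedpe_path, "w") as fout:
        for line in fin:
            f = line.split()
            rid, c1, p1, s1, c2, p2, s2 = (
                f[0], f[1], int(f[2]), f[3], f[4], int(f[5]), f[6])
            fout.write("\t".join(map(str, [
                c1, p1, p1 + read_len,
                c2, p2, p2 + read_len,
                rid, ".", s1, s2])) + "\n")
```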
I cannot set any parameters other than the defaults. The callStripes program reports an error even when a default parameter, such as "-minPts 5", appears on the command line.
I'm running cLoops with two bedpe files with the following commands:
2018-12-10 11:30:11,918 INFO Command line: cLoops -f /home/ykq/Data/BL-Hi-C/BL01/bl01.rmdup.bedpe,/home/ykq/Data/BL-Hi-C/BL02/bl02.rmdup.bedpe -o run2 -m 0 -eps 1000,2500,5000,7500,10000 -minPts 10,15,20 -p 10 -w True -j True -s False -c -hic True -cut 0 -plot True -max_cut False
2018-12-10 11:30:11,918 INFO mode:0 eps:[1000, 2500, 5000, 7500, 10000] minPts:[20, 15, 10] hic:True
And there were only *.jd and *_disCutoff.pdf files, no loops file.
I wonder how I can get the loops from these results, or should I rerun the data?
cLoops log file attached.
cLoops.log
Hi,
I was wondering if you could provide more information on how to use the different parameters in callStripes to optimize the calls. I ran it on my data using default parameters, and while I notice that some calls are accurate, many of them are not visually obvious. The algorithm also misses stripes that tend to be longer and wider.
I am attaching some screenshots.
The BEDPE format does not seem to require tab-delimited fields, but oi.py only splits on \t, which trips up unsuspecting users.
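The distinction in Python: split("\t") only splits on literal tab characters, while split() with no argument splits on any run of whitespace, which would also accept space-delimited files:

```python
line = "chr1 100 200 chr1 5000 5100"  # space-delimited BEDPE-like line

# Splitting only on tabs leaves the whole line as a single field.
print(len(line.split("\t")))  # 1

# split() with no argument handles tabs, spaces, or mixed whitespace.
print(len(line.split()))  # 6
```

A one-character-class change like this in the parser would make the tool tolerant of both delimiters.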
Hi, I am using cLoops to call loops on my Hi-C datasets (>200M valid pairs/sample), and I am trying to tune the parameters to increase or decrease the number of called loops. Could you explain how to use eps and minPts to increase the sensitivity of the calls? Also, I noticed that the majority of the calls are within a given distance; is there a way to increase that distance to detect longer-range interactions?
I have used -m 3 (equivalent to -eps 5000,7500,10000 -minPts 20,30,40,50 -hic) but I am not sure how this works. Thank you.
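For context on the two knobs: cLoops clusters PETs with a DBSCAN-style algorithm, where eps is the neighborhood radius (in bp) and minPts is the minimum number of PETs a cluster needs. Larger eps with smaller minPts yields more (and looser) candidate loops; smaller eps with larger minPts yields fewer, tighter ones. A generic illustration using scikit-learn's DBSCAN (not cLoops' own cDBSCAN implementation; the synthetic 2D points merely stand in for PET coordinates):

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Two dense blobs plus scattered background noise, standing in
# for clustered vs. random PET anchors.
pts = np.vstack([
    rng.normal(0, 1, (50, 2)),
    rng.normal(20, 1, (50, 2)),
    rng.uniform(-10, 30, (20, 2)),
])

# Looser parameters: more points are absorbed into clusters.
loose = DBSCAN(eps=3.0, min_samples=5).fit(pts)
# Stricter parameters: more points end up labeled as noise (-1).
strict = DBSCAN(eps=1.0, min_samples=15).fit(pts)

print("noise, loose :", (loose.labels_ == -1).sum())
print("noise, strict:", (strict.labels_ == -1).sum())
```

In cLoops, passing comma-separated values to -eps and -minPts makes it try each combination and combine the results, which is what the -m 3 preset does.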