Hi, dcHiC looks a good tool to analyze differential compartments. Have you com

dcHiC result about dchic HOT 13 CLOSED

ay-lab commented on June 15, 2024

dcHiC result

from dchic.

Comments (13)

ay-lab commented on June 15, 2024

Hi. A preprint is coming soon with detailed results. In terms of comparisons, we are not aware of a tool that does multi-way (>2 cell types) comparison of compartments. For pairwise comparisons, our method is finding all differences reported before which are mainly compartment flips as well as changes within a compartment (high A -> low A).

from dchic.

BenxiaHu commented on June 15, 2024

thanks.
I just run the dcHiC, and otained the result. Would you explain what the case & ctrl columns mean?
#chr start end case ctrl mdist dZsc pval padj
chr14 32800000 32900000 1.39196764497679 1.98860321563424 47.4894458754581 3.08855964842099 5.53008806504725e-12 8.33275306022834e-10
chr14 54600000 54700000 0.902929811279385 0.23864960511371 55.6862079152236 3.38604034058415 8.50126228736289e-14 1.91718553099108e-11
chr14 54800000 54900000 -0.0957086304259123 -0.706029881143173 37.5171340209211 3.08855964842099 9.06134744770622e-10 7.55698725042207e-08

Best,

from dchic.

ay-lab commented on June 15, 2024

The columns between "chr start end" and "mdist dZsc pval padj" in the differential compartment file are the PC values. I can't see your input file, but those should be the average PC values for the "case" replicates and "ctrl" replicates.

from dchic.

BenxiaHu commented on June 15, 2024

got it. thanks a lot.

from dchic.

BenxiaHu commented on June 15, 2024

the number of differential compartments is about 100. Do you think the low sequencing depth causes that? the validpairs for each library is about 100M. or how to increase the number of differential compartments?

or, can I just use the PC score to define compartment switching? for example, the PC score in case is greater than 0, while the PC score in control is less than 0, so this compartment undergoes swithcing.

from dchic.

ay-lab commented on June 15, 2024

Is it 100 for the entire genome or for a specific chromosome? Sequencing depth shouldn't affect results; we have done testing with this. If you just want to find compartment switches, using PC score is fine. If you want to extract significant changes in all scenarios (so compartment changes and A-->A, for instance), beyond merely changing the p-value threshold, here are a few things I can think of:

Visualize your data to see if there's an issue (such as large structural variations) there. This can be done with one line: Rscript igvtrack.r [Full Diff Compt Details bedGraph] [genome]. It will create a folder with an interactive HTML where you can browse results.
Double check that your input has groups of replicates. Check the file "chrdistances.txt," which shows the mean and standard deviation of PC value variation between replicates. dcHiC finds significant compartments by using these parameters, so if they are excessively high (mean, stdev > 0.4, for instance), that could indicate that the replicates don't have very similar PC values.

from dchic.

BenxiaHu commented on June 15, 2024

thanks for your reply.
about 100 differential compartments are detected in the entire genome.
only the stdev in chr21 is greater than 0.4, which indicates that the replicates in other chromosomes should befine.

from dchic.

ay-lab commented on June 15, 2024

I see. Could you share any data/result(s) with us, perhaps the compartment detail bedGraphs and IGV html page, here or by email ([email protected])?

from dchic.

BenxiaHu commented on June 15, 2024

thanks. this dataset has not been published.
when I ran dcHiC on another Hi-C dataset, I got this error:
Positions Not Shared Across All Data Sets, Chromosome 6:

None
4167

Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R
BalancedChrMatrix_exp_case.1.txt BalancedChrMatrix_exp_case.2.txt
BalancedChrMatrix_exp_case.3.txt BalancedChrMatrix_exp_case.4.txt
BalancedChrMatrix_exp_ctrl.1.txt BalancedChrMatrix_exp_ctrl.2.txt
BalancedChrMatrix_exp_ctrl.3.txt BalancedChrMatrix_exp_ctrl.4.txt
4 4 case.1 case.2 case.3 case.4 ctrl.1 ctrl.2 ctrl.3 ctrl.4
case ctrl 2 8 4167 40000 chr6 hg19 ./dcHiC/hg19_goldenpathData 2 /nas/longleaf/home/softwaredcHiC/dchic

R OUTPUT:
R function done!
Traceback (most recent call last):
File "/nas/longleaf/home/software/dcHiC/dchic/run.py", line 318, in
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_case.1_exp_1.txt'
[1] "/nas/longleaf/home/software/dcHiC/dchic"

when dcHiC analyzed chr20,21,22, I did not got the aforementioned error.

from dchic.

ay-lab commented on June 15, 2024

Hmmm. To be clear, with chromosomes 20, 21, and 22, everything in dchic.py ran to the end? And in chromosome 6 this unique error comes up?

Without any data, it is difficult to make a diagnosis. However, this error is where MFA in R is being called. Usually, after "R OUTPUT:" there are several lines of system output from the MFA R script.

One big possibility is that you are running out of memory mid-way (PCA memory demand scales with size), which is why it works for chromosomes 20-22 but not for 6. I did a test and ran into an error in the same spot when limiting my RAM capacity.

If that's not it, running the R script ("Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R ..... ") output standalone and seeing what the error output is will help.

from dchic.

BenxiaHu commented on June 15, 2024

thanks.
not only chr6, but aslo other large chromsomes, such as chr1, chr2 show the same error.
yes, when I ran "Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R ..... " in the terminal, Hier.R outputs results.
so how to deal with this issue? I set the memory to 800GB, I still got the same errors for big chromosomes.

from dchic.

ay-lab commented on June 15, 2024

Hello,

If Hier.R works fine run alone and smaller chromosomes work fine, I don't think this an error with dcHiC. Are you running the large chromosomes in parallel? Perhaps the multi-threading runs into a system limit because the 40kb matrices are already huge. Running in sequence (omitting the -parallel option) or just running everything at a lower resolution (100kb has been our default mode of analysis) may help.

from dchic.

BenxiaHu commented on June 15, 2024

thanks a lot.

from dchic.

dcHiC result about dchic HOT 13 CLOSED

Comments (13)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent