Git Product home page Git Product logo

Comments (13)

ay-lab avatar ay-lab commented on June 15, 2024

Hi. A preprint is coming soon with detailed results. In terms of comparisons, we are not aware of a tool that does multi-way (>2 cell types) comparison of compartments. For pairwise comparisons, our method is finding all differences reported before which are mainly compartment flips as well as changes within a compartment (high A -> low A).

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

thanks.
I just run the dcHiC, and otained the result. Would you explain what the case & ctrl columns mean?
#chr start end case ctrl mdist dZsc pval padj
chr14 32800000 32900000 1.39196764497679 1.98860321563424 47.4894458754581 3.08855964842099 5.53008806504725e-12 8.33275306022834e-10
chr14 54600000 54700000 0.902929811279385 0.23864960511371 55.6862079152236 3.38604034058415 8.50126228736289e-14 1.91718553099108e-11
chr14 54800000 54900000 -0.0957086304259123 -0.706029881143173 37.5171340209211 3.08855964842099 9.06134744770622e-10 7.55698725042207e-08

Best,

from dchic.

ay-lab avatar ay-lab commented on June 15, 2024

The columns between "chr start end" and "mdist dZsc pval padj" in the differential compartment file are the PC values. I can't see your input file, but those should be the average PC values for the "case" replicates and "ctrl" replicates.

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

got it. thanks a lot.

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

the number of differential compartments is about 100. Do you think the low sequencing depth causes that? the validpairs for each library is about 100M. or how to increase the number of differential compartments?

or, can I just use the PC score to define compartment switching? for example, the PC score in case is greater than 0, while the PC score in control is less than 0, so this compartment undergoes swithcing.

from dchic.

ay-lab avatar ay-lab commented on June 15, 2024

Is it 100 for the entire genome or for a specific chromosome? Sequencing depth shouldn't affect results; we have done testing with this. If you just want to find compartment switches, using PC score is fine. If you want to extract significant changes in all scenarios (so compartment changes and A-->A, for instance), beyond merely changing the p-value threshold, here are a few things I can think of:

  1. Visualize your data to see if there's an issue (such as large structural variations) there. This can be done with one line: Rscript igvtrack.r [Full Diff Compt Details bedGraph] [genome]. It will create a folder with an interactive HTML where you can browse results.

  2. Double check that your input has groups of replicates. Check the file "chrdistances.txt," which shows the mean and standard deviation of PC value variation between replicates. dcHiC finds significant compartments by using these parameters, so if they are excessively high (mean, stdev > 0.4, for instance), that could indicate that the replicates don't have very similar PC values.

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

thanks for your reply.
about 100 differential compartments are detected in the entire genome.
only the stdev in chr21 is greater than 0.4, which indicates that the replicates in other chromosomes should befine.

from dchic.

ay-lab avatar ay-lab commented on June 15, 2024

I see. Could you share any data/result(s) with us, perhaps the compartment detail bedGraphs and IGV html page, here or by email ([email protected])?

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

thanks. this dataset has not been published.
when I ran dcHiC on another Hi-C dataset, I got this error:
Positions Not Shared Across All Data Sets, Chromosome 6:

None
4167

Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R
BalancedChrMatrix_exp_case.1.txt BalancedChrMatrix_exp_case.2.txt
BalancedChrMatrix_exp_case.3.txt BalancedChrMatrix_exp_case.4.txt
BalancedChrMatrix_exp_ctrl.1.txt BalancedChrMatrix_exp_ctrl.2.txt
BalancedChrMatrix_exp_ctrl.3.txt BalancedChrMatrix_exp_ctrl.4.txt
4 4 case.1 case.2 case.3 case.4 ctrl.1 ctrl.2 ctrl.3 ctrl.4
case ctrl 2 8 4167 40000 chr6 hg19 ./dcHiC/hg19_goldenpathData 2 /nas/longleaf/home/softwaredcHiC/dchic

R OUTPUT:
R function done!
Traceback (most recent call last):
File "/nas/longleaf/home/software/dcHiC/dchic/run.py", line 318, in
dmfa_file = open(name, "r")
FileNotFoundError: [Errno 2] No such file or directory: 'hmfa_chrRAW_case.1_exp_1.txt'
[1] "/nas/longleaf/home/software/dcHiC/dchic"

when dcHiC analyzed chr20,21,22, I did not got the aforementioned error.

from dchic.

ay-lab avatar ay-lab commented on June 15, 2024

Hmmm. To be clear, with chromosomes 20, 21, and 22, everything in dchic.py ran to the end? And in chromosome 6 this unique error comes up?

Without any data, it is difficult to make a diagnosis. However, this error is where MFA in R is being called. Usually, after "R OUTPUT:" there are several lines of system output from the MFA R script.

One big possibility is that you are running out of memory mid-way (PCA memory demand scales with size), which is why it works for chromosomes 20-22 but not for 6. I did a test and ran into an error in the same spot when limiting my RAM capacity.

If that's not it, running the R script ("Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R ..... ") output standalone and seeing what the error output is will help.

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

thanks.
not only chr6, but aslo other large chromsomes, such as chr1, chr2 show the same error.
yes, when I ran "Rscript --vanilla /nas/longleaf/home/software/dcHiC/dchic/Hier.R ..... " in the terminal, Hier.R outputs results.
so how to deal with this issue? I set the memory to 800GB, I still got the same errors for big chromosomes.

from dchic.

ay-lab avatar ay-lab commented on June 15, 2024

Hello,

If Hier.R works fine run alone and smaller chromosomes work fine, I don't think this an error with dcHiC. Are you running the large chromosomes in parallel? Perhaps the multi-threading runs into a system limit because the 40kb matrices are already huge. Running in sequence (omitting the -parallel option) or just running everything at a lower resolution (100kb has been our default mode of analysis) may help.

from dchic.

BenxiaHu avatar BenxiaHu commented on June 15, 2024

thanks a lot.

from dchic.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.