
hifive's People

Contributors

jxtx, msauria


hifive's Issues

npz support

I have matrices in npz format (from NumPy). Can I input these directly into HiFive to get them normalized?

Thank you.
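As far as I know HiFive reads text matrices rather than .npz archives, so one option is to dump the arrays to the TXT matrix layout first. A minimal sketch (the function name and the assumption of a square, single-chromosome matrix are mine; check the HiFive docs for the exact header convention):

```python
import numpy as np

def npz_to_hifive_txt(npz_path, key, chrom, binsize, out_path):
    """Dump a dense cis matrix from an .npz archive to a tab-separated
    TXT matrix with chrN:start-end row/column labels (hypothetical
    helper; verify the header layout against HiFive's documentation)."""
    mat = np.load(npz_path)[key]
    n = mat.shape[0]
    labels = ["%s:%d-%d" % (chrom, i * binsize, (i + 1) * binsize)
              for i in range(n)]
    with open(out_path, "w") as out:
        out.write("\t".join(labels) + "\n")  # column labels
        for label, row in zip(labels, mat):
            out.write(label + "\t" +
                      "\t".join(str(int(v)) for v in row) + "\n")
```

The resulting file could then be passed to `hifive hic-data -X`, assuming the labels match the binned fend file.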

hic-mrheatmap - Segmentation fault

I am trying to fetch mrh data (especially trans interactions) from a normalized project. As soon as I specify more than one chromosome/scaffold, hic-mrheatmap often gives me a segmentation fault. The behavior is not consistent across all combinations of chromosomes/scaffolds: for some it returns an mrh, for others it dies with

Finding multi-resolution heatmap for 12 by SCF_7...Segmentation fault

[21071391.448646] hifive[12069]: segfault at 880240c8 ip 00007f1c13e2ada6 sp 00007ffff75859f0 error 4 in _hic_binning.so[7f1c13df8000+54000]

Any idea where to start looking?

KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"

Hi, sorry to bother you. I am using your package as part of my Hi-C analysis pipeline. I will just show the code and the error in case you have some clues.
Code:
# Filtering HiC fends
hic = hifive.HiC('HiC_project_object.hdf5')
hic.filter_fends(mininteractions=1, mindistance=0, maxdistance=0)
hic.save()
Error:

execfile('HiCtool_hifive.py')
Loading data from HiCfile_pair1.bam... Done
Read 0 validly-mapped read pairs.
No valid data was loaded.
Filtering fends...Traceback (most recent call last):
  File "", line 1, in
  File "HiCtool_hifive.py", line 75, in
    hic.filter_fends(mininteractions=1, mindistance=0, maxdistance=0)
  File "/home/lisky/.local/lib/python2.7/site-packages/hifive/hic.py", line 357, in filter_fends
    data = self.data['cis_data'][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/lisky/.local/lib/python2.7/site-packages/h5py/_hl/group.py", line 262, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_data' doesn't exist)"
Looking forward to your reply, many thanks.

How to create the input for "loading HiC fends"?

Hi,

I don't understand how to create the input file that gets loaded into a HiC Fend object.

I see it can be a BED file with the following information:
chromosome, start, and stop position of each restriction fragment
or
chromosome, start, and stop position of the restriction enzyme recognition sites

Additionally, HiCPipe-compatible tabular fend files are mentioned, but this is not very clear to a newcomer like me.

How do I create such a file? Can anyone help me?
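For the first option (a BED of restriction fragments), the fragments can be derived directly from the genome sequence by cutting at each occurrence of the recognition site. A rough sketch, assuming a plain FASTA file and using the site start as the boundary (a real pipeline would apply the enzyme's cut offset, e.g. A^AGCTT for HindIII):

```python
import re

def fragment_bed(fasta_path, site, out_path):
    """Write a BED file of restriction fragments (chrom, start, stop),
    cutting each sequence at every occurrence of the recognition site.
    Illustrative only: the site start is used as the fragment boundary."""
    # minimal FASTA parser, no dependencies
    seqs, name, parts = {}, None, []
    for line in open(fasta_path):
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(parts)
            name, parts = line[1:].split()[0], []
        elif line:
            parts.append(line.upper())
    if name is not None:
        seqs[name] = "".join(parts)
    with open(out_path, "w") as out:
        for chrom, seq in seqs.items():
            bounds = ([0] + [m.start() for m in re.finditer(site, seq)]
                      + [len(seq)])
            for start, stop in zip(bounds[:-1], bounds[1:]):
                out.write("%s\t%d\t%d\n" % (chrom, start, stop))
```

For example, `fragment_bed("genome.fa", "AAGCTT", "hindiii_fragments.bed")` would produce a three-column BED suitable for `hifive fends -B`.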

QUASAR-QC good quality score

Hi Michael,
Is there a cutoff for deciding whether a sample has a good quality score? What approximate quality score (range) should we expect for a high-quality sample? And when using multiple resolution values (-r) while running the tool, with fixed-size bins in the interaction matrices (e.g. 50 kb), how does that affect the quality score? This is what the documentation says:
"A simple quality or replicate score can be determined with a single resolution and coverage value, although testing across multiple resolutions and be used to determine sample resolution limits while multiple coverages can be used to model quality as a function of sequencing depth and determine the maximum sample quality score"
Does this mean we can use the quality score to determine whether we can go to a lower resolution (bin size) when creating the interaction matrices? This is what I understood from the paper (maximum usable resolution).
Thank You

Visualization Galaxy

Is there any documentation on how to visualize the multi-resolution heatmaps in Galaxy?

Unable to load txt matrices (interaction matrices) obtained from Homer

I tried to create a hic-data object using the following commands.

hifive fends -B fend.bed --binned=50000 out_filename.txt

When I run
hifive hic-data -X "test_*.MatA" out_filename.txt output.hic.data
I get "Done 0 cis reads, 0 trans reads" in the command-line output, and the resulting file (6.4 kB in size) cannot be used for the next steps.

I obtained the interaction matrices from Homer and modified them according to the explanation in the documentation.
My structure for one test_*.MatA file is
chr1:0-50000 chr1:50000-100000 chr1:100000-150000 chr1:150000-200000
chr1:0-50000 7 0 0 0
chr1:50000-100000 0 11 0 0
chr1:100000-150000 0 0 28 0
chr1:150000-200000 0 0 0 0

The headers are the column and row names, and the file is a TSV file.

Can you please guide me on how to resolve this issue, or tell me whether something is wrong with my matrix structure?
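One thing worth ruling out when everything comes back as "0 cis reads" is a mismatch between the matrix labels and the fend binning. A small sanity-check sketch (a hypothetical helper, not part of HiFive) that verifies the labels parse as chrN:start-end, the bin widths are uniform, and the row labels mirror the column labels:

```python
def check_matrix(path):
    """Sanity-check a TXT interaction matrix: labels must parse as
    chrN:start-end, bin widths must be uniform, and row labels must
    mirror the column labels (illustrative helper, not part of HiFive)."""
    lines = [l.rstrip("\n").split("\t") for l in open(path) if l.strip()]

    def parse(label):
        chrom, span = label.split(":")
        start, stop = (int(x) for x in span.split("-"))
        return chrom, start, stop

    cols = [parse(label) for label in lines[0]]
    rows = [parse(line[0]) for line in lines[1:]]
    widths = {stop - start for _, start, stop in cols}
    return {"n_bins": len(cols), "bin_widths": widths,
            "rows_match_columns": rows == cols}
```

If `bin_widths` disagrees with the `--binned` value given to `hifive fends`, or `rows_match_columns` is False, that could explain why no reads load.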

error handling w/ insufficient memory

Sometimes, without sufficient RAM, quasar crashes. I've encountered this several times. In this case, I'm using 256 GB of RAM and it's insufficient to generate correct stats at 10 kb.
Traceback (most recent call last):
  File "/data/CCRBioinfo/dalgleishjl/quasarenv/bin/hifive", line 849, in
    main()
  File "/data/CCRBioinfo/dalgleishjl/quasarenv/bin/hifive", line 93, in main
    run(args)
  File "/data/CCRBioinfo/dalgleishjl/quasarenv/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 134, in run
    q1.print_report(args.report)
  File "/data/CCRBioinfo/dalgleishjl/quasarenv/lib/python2.7/site-packages/hifive/quasar.py", line 642, in print_report
    self._print_txt_report(filename, qscores, rscores)
  File "/data/CCRBioinfo/dalgleishjl/quasarenv/lib/python2.7/site-packages/hifive/quasar.py", line 855, in _print_txt_report
    temp = [label, self._num2str(cov), "%0.6f" % qscores[-1, i, j]]
IndexError: index 1 is out of bounds for axis 1 with size 1

installing error: Couldn't find index page for 'setuptools_cython'

Hi,

I am trying to install hifive, but encountered the following error

$ ~/software/anaconda2/bin/pip install hifive==1.5.6
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x2b1cb0d4a990>: Failed to establish a new connection: [Errno 101] \xe7\xbd\x91\xe7\xbb\x9c\xe4\xb8\x8d\xe5\x8f\xaf\xe8\xbe\xbe',)': /simple/hifive/
Collecting hifive==1.5.6
  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/87/db/352ffd43f2ac26a6071b42ba7ad489d1c4328c536dc1099e0fa99b6fa43a/hifive-1.5.6.tar.gz (1.3 MB)
    ERROR: Command errored out with exit status 1:
     command: /home/niuyw/software/anaconda2/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-y5LMIJ/hifive/setup.py'"'"'; __file__='"'"'/tmp/pip-install-y5LMIJ/hifive/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-CHNMCV
         cwd: /tmp/pip-install-y5LMIJ/hifive/
    Complete output (26 lines):
    Couldn't find index page for 'setuptools_cython' (maybe misspelled?)
    No local packages or download links found for setuptools-cython
    install_dir .
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-y5LMIJ/hifive/setup.py", line 189, in <module>
        setup_package()
      File "/tmp/pip-install-y5LMIJ/hifive/setup.py", line 146, in setup_package
        setup(**metadata)
      File "/home/niuyw/software/anaconda2/lib/python2.7/distutils/core.py", line 111, in setup
        _setup_distribution = dist = klass(attrs)
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/setuptools/dist.py", line 221, in __init__
        self.fetch_build_eggs(attrs.pop('setup_requires'))
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/setuptools/dist.py", line 245, in fetch_build_eggs
        parse_requirements(requires), installer=self.fetch_build_egg
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/pkg_resources.py", line 544, in resolve
        dist = best[req.key] = env.best_match(req, self, installer)
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/pkg_resources.py", line 786, in best_match
        return self.obtain(req, installer) # try and download/install
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/pkg_resources.py", line 798, in obtain
        return installer(requirement)
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/setuptools/dist.py", line 293, in fetch_build_egg
        return cmd.easy_install(req)
      File "/home/niuyw/software/anaconda2/lib/python2.7/site-packages/distribute-0.6.14-py2.7.egg/setuptools/command/easy_install.py", line 576, in easy_install
        raise DistutilsError(msg)
    distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('setuptools-cython')
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Any ideas on how to solve this?

QuaSAR is having trouble recognizing its own output.

Edited: I noticed that R had left headers in, so I removed those from the original file, then reran the HiCData, HiCProject, quasar-file, and quasar-report steps (after deleting the old quasar file).
head Neg_allValidPairs.hifive.raw
chr1 15874 + chr15 66591371 +
chr1 15876 - chr15 102515347 -
chr1 15878 - chr15 102515333 -
chr1 15878 - chr15 102515348 -
chr1 15889 - chr15 102515319 -
chr1 15912 - chr15 102515317 -
chr1 16064 - chr15 94969060 -
chr1 16066 - chr10 103160580 +
chr1 16084 - chr15 102192091 +
chr1 57043 - chr15 69518897 +

hifive hic-data -R Neg_allValidPairs.hifive.raw -i 500 --skip-duplicate-filtering hindiii_unbinned.fend Neg_unbinned_r.HiCData
Loading data from T47D-HiChip-Neg_allValidPairs.hifive.raw...
23167171 validly-mapped reads pairs loaded.
23167171 total validly-mapped read pairs loaded. 20664582 valid fend pairs
Parsing fend pairs... Done 19041907 cis reads, 3877837 trans reads

hifive hic-project Neg_unbinned_r.HiCData Neg_unbinned_r.HiCProject
Filtering fends... Removed 1609453 of 1675302 fends
Finding distance curve... Done
bash-4.1$ hifive quasar -p Neg_unbinned_r.HiCProject -r 1000000 -d 0 Neg_unbinned_r.quasarfile
bash-4.1$ hifive quasar -o Neg_unbinned_r.quasarreport Neg_unbinned_r.quasarfile
The output format was not recognized.

Any ideas? I generated this data from HiC-Pro's allValidPairs.

load_data_from_matrices: Attribute not found

I am facing two (possibly related) issues:

I am trying to load Hi-C matrices that I have binned at 20 kb resolution (an N×M matrix, with appropriate row and column names as described in the input format for TXT matrices).

hifive worked fine up to the creation of the fend object file from the chromosome lengths file, using the following commands:
import hifive
fend = hifive.Fend(fend.object, mode='w', binned=20000)
fend.load_bins(chrom_length_filename, genome_name='MM10', format='len')
fend.save()

But next when I run,
data = hifive.HiCData(out_filename, mode='w')
data.load_data_from_matrices(fend.object, ['chr1.matrix', 'chr2.matrix', 'chr1_by_chr2.matrix'])
I get the error:
AttributeError: 'HiCData' object has no attribute 'load_data_from_matrices'

I also tried to run it directly from the Linux shell, where I am able to generate the binned fend file using the chromosome lengths, but I got an error when I tried to load the Hi-C data from matrices to create the HiC data object.

AttributeError: 'HiCData' object has no attribute 'fends'

This was generated from the following command

hifive hic-data -X /path/to/matrices.txt path/to/fend/ path/to/output.

Can you please help me with these issues? I am using the latest version (1.5) of hifive, which I installed using pip with Python 2.7.11.

Thank you.
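Since the AttributeError suggests the installed build may simply lack that method, a quick check of the installed version and the class attribute can narrow things down. An illustrative helper (it uses the modern importlib.metadata API; on Python 2 one would query pkg_resources instead):

```python
from importlib.metadata import version, PackageNotFoundError

def check_hifive_install():
    """Report the installed hifive version (None if absent) and whether
    HiCData exposes load_data_from_matrices."""
    try:
        ver = version("hifive")
    except PackageNotFoundError:
        ver = None
    try:
        from hifive import HiCData
        has_method = hasattr(HiCData, "load_data_from_matrices")
    except ImportError:
        has_method = False
    return ver, has_method
```

If `has_method` comes back False, upgrading the package (or checking the changelog for when the method was introduced) would be the next step.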

Custom Center Position of the Interacting Nucleosome

Dear msauria,

First of all, I must thank you for all your previous support, which has helped me with my analysis and exploration. I currently need to build my own nucleosome-resolved contact matrix from the Hi-C data obtained in our experiment. Based on our experimental assumption, plus- and minus-strand reads originate from interactions at the DNA exit and entry points of the nucleosome they are wrapped around. Under this assumption, the center position of the interacting nucleosome can be obtained as 67 bp upstream of the read-end coordinate. The assignment is done by finding the closest nucleosome locus along the genome to the center position obtained from each read. I am not sure whether your program can add/modify such detailed parameters and create a contact matrix. If not, could you please give me some advice and instructions on creating a Hi-C matrix and visualizing it?

Best regards
Eik
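HiFive itself assigns reads to restriction fends rather than nucleosome dyads, but the strand-aware 67 bp shift plus nearest-locus assignment described above is straightforward to compute as a preprocessing step. A sketch (the function and parameter names are mine; `dyads` is assumed to be a sorted list of known nucleosome center coordinates):

```python
from bisect import bisect_left

def assign_nucleosome(read_end, strand, dyads, offset=67):
    """Infer the interacting nucleosome: shift the read-end coordinate
    `offset` bp upstream (strand-aware), then snap to the closest dyad
    in a sorted list of nucleosome center positions."""
    center = read_end - offset if strand == "+" else read_end + offset
    i = bisect_left(dyads, center)
    candidates = dyads[max(0, i - 1):i + 1]  # neighbors of the insertion point
    return min(candidates, key=lambda d: abs(d - center))
```

The dyad-indexed pairs could then be binned into a contact matrix directly, or written out as a pair file for downstream tools.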

a small stripping issue

Hi,
First thank you for the great package.

I think there is a small problem in the code when reading RAW-format data, on lines 194 and 195:

temp[0].strip('chr')

Shouldn't it be

 temp[0] = temp[0].strip('chr')
 temp[3] = temp[3].strip('chr')

I tried with a file in the following format

chr1    10008   +   chr1    10108   -
chr1    10008   +   chr1    234897  -
chr1    12436   +   chr1    12886   -
chr1    12448   +   chr1    12832   -
chr1    12461   +   chr1    12879   -
chr1    12686   +   chr1    62882   +
chr1    12851   +   chr1    13308   -

but I got 0 loaded reads.

Thanks in advance
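For reference, the reassignment matters because Python strings are immutable: str.strip returns a new string and leaves the original untouched, so a bare call discards its result. A quick illustration:

```python
s = "chr1"
s.strip("chr")        # returns "1", but the result is discarded
assert s == "chr1"    # s is unchanged: strings are immutable

s = s.strip("chr")    # reassignment is required, as in the proposed fix
assert s == "1"

# note: strip("chr") removes any run of the characters c/h/r from both
# ends, not the literal prefix "chr" -- which happens to work for
# names like "chr10"
assert "chr10".strip("chr") == "10"
```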

Installation Problem

Hi, I have a problem when installing the hifive package.
The error message is "RuntimeError: Python version 2.7+ is required", but my Python version is 3.7.11. Please see the screenshot.
[screenshot: RuntimeError raised during installation, 2024-02-01]

I checked the install.py file. Is it true that only Python versions >= 2.7 and < 3.0 are valid?

issue KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"

Hi,

I'm trying to run your tool (thank you for it, by the way) and I'm running into this error:
0 validly-mapped read pairs loaded.
No valid data was loaded.
Traceback (most recent call last):
  File "/opt/miniconda3/envs/hiqc/bin/hifive", line 849, in
    main()
  File "/opt/miniconda3/envs/hiqc/bin/hifive", line 93, in main
    run(args)
  File "/opt/miniconda3/envs/hiqc/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "/opt/miniconda3/envs/hiqc/lib/python2.7/site-packages/hifive/quasar.py", line 190, in find_transformation
    elif hic.data['cis_indices'][chr_indices[i + 1]] - hic.data['cis_indices'][chr_indices[i]] == 0:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/opt/miniconda3/envs/hiqc/lib/python2.7/site-packages/h5py/_hl/group.py", line 264, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'cis_indices' doesn't exist)"

I have checked and rechecked my input, and it looks perfectly fine...
Any idea?

N.

Fends being completely or significantly filtered out by hic-project step.

I'm running a dataset from the literature, much the same in structure as the first one, but all the fends are being filtered out. Is there something I could change or look into?
bash-4.1$ hifive hic-data -R SRR3467175_allValidPairs.hifive.raw -i 500 --skip-duplicate-filtering mboi_unbinned.fend SRR3467175_unbinned_r.HiCData
Loading data from SRR3467175_allValidPairs.hifive.raw...
62409707 validly-mapped reads pairs loaded.
62409707 total validly-mapped read pairs loaded. 60716291 valid fend pairs
Parsing fend pairs... Done 48701403 cis reads, 13233775 trans reads
bash-4.1$ hifive hic-data -R SRR3467176_allValidPairs.hifive.raw -i 500 --skip-duplicate-filtering mboi_unbinned.fend SRR3467176_unbinned_r.HiCData
Loading data from SRR3467176_allValidPairs.hifive.raw...
43228016 validly-mapped reads pairs loaded.
43228016 total validly-mapped read pairs loaded. 42310523 valid fend pairs
Parsing fend pairs... Done 33278056 cis reads, 9636621 trans reads

bash-4.1$ hifive hic-project -m 1 -n 0 -x 0 SRR3467176_unbinned_r.HiCData SRR3467176_unbinned_r.HiCProject
Filtering fends... Removed 14255268 of 14255268 fends
bash-4.1$ hifive hic-project -f 1 -m 1 -n 0 -x 0 SRR3467176_unbinned_r.HiCData SRR3467176_unbinned_r.HiCProject
Filtering fends... Removed 2547651 of 14255268 fends
bash-4.1$ hifive hic-project SRR3467175_unbinned_r.HiCData SRR3467175_unbinned_r.HiCProject
Filtering fends... Removed 14255268 of 14255268 fends

[Question] hifive Tweaking question

Hi,
I have recently been using hifive to re-analyze the unprocessed data of Rao et al., but the final results contain a kind of noise. I'm just asking for suggestions, as maybe I am missing some parameters.

I used the following steps

mpirun -np 4 hifive hic-complete express -f 3 -B fend.bed --re Mbol --genome hg19 -R GSM1551559_HIC010_merged_nodups.bed -i 500 --knight-ruiz -P GSM1551559_HIC010_merged_nodups

I selected -f 3 because at low resolution I expect a small number of interactions in some bins, so if a bin has at least 3 interactions I consider it a potential signal. If I got it right, I expect the noise to be filtered in the next steps.

When hifive is done, I take the coordinates and plot the heatmap using R.

hifive hic-interval  -b 15000 -c 10 -d enrichment   GSM1551559_HIC010_merged_nodups.hcp $output

If I plot the heatmap I get the following

[screenshot: heatmap, 2015-11-16]

We can see the domains, but there is unfiltered noise. Is it because of the -f 3 parameter, and are there other filtering criteria? I think the default value of -f 8 is somewhat extreme.

Thanks for your answer.

Regards,

Error when changing resolutions in quasar

Hi,

I used quasar to get quality scores for my hic-project with resolutions of 1 Mb and 40 kb, which worked fine. When I changed the resolutions to 500 kb and 200 kb, quasar gave back this error:

Coverage 12426421 Resolution 200000 Chrom scaffold_12 - Normalizing counts
Traceback (most recent call last):
  File "/home/jawen108/.local/bin/hifive", line 849, in
    main()
  File "/home/jawen108/.local/bin/hifive", line 93, in main
    run(args)
  File "/home/jawen108/.local/lib/python2.7/site-packages/hifive/commands/find_quasar_scores.py", line 114, in run
    coverages=args.coverages, seed=args.seed)
  File "/home/jawen108/.local/lib/python2.7/site-packages/hifive/quasar.py", line 302, in find_transformation
    norm, dist, valid_rows = self._normalize(chrom, raw[indices[h]:indices[h + 1]], mids[chrom], res)
  File "/home/jawen108/.local/lib/python2.7/site-packages/hifive/quasar.py", line 618, in _normalize
    curr_binsize = mids[1] - mids[0]
IndexError: index 1 is out of bounds for axis 0 with size 1

I looked through the code to find out what's wrong. Clearly the mids object cannot be indexed at a second element.

However, this mids object is created from the hic.fends['fends']['mid'][...] object, which is complicated to understand (below):

179 else:
180     temp_mids = hic.fends['fends']['mid'][...]
181     chr_indices = hic.fends['chr_indices'][...]
...
209     raw[indices[i]:indices[i + 1], :] = temp
210     mids[chrom] = temp_mids[chr_indices[chrint]:chr_indices[chrint + 1]]

I would be thankful if somebody could describe to me what happens with these mids and how changing the resolution affects them.

Thank you in advance for your help.

Jan Wendt
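For what it's worth, the failure mode can be reproduced outside HiFive: any scaffold that spans fewer than two bins at the requested resolution yields a single midpoint, and the bin-size estimate mids[1] - mids[0] then has nothing to subtract. An illustrative sketch (not HiFive's internal code):

```python
def bin_midpoints(chrom_length, resolution):
    """Midpoints of the fixed-width bins tiling a chromosome
    (illustrative; not HiFive's internal computation)."""
    n_bins = max(1, -(-chrom_length // resolution))  # ceiling division
    return [i * resolution + resolution // 2 for i in range(n_bins)]

mids = bin_midpoints(150000, 200000)  # a scaffold shorter than one bin
# len(mids) == 1, so `mids[1] - mids[0]` raises IndexError, matching
# the traceback above
```

This would explain why 1 Mb and 40 kb worked while 200 kb did not, if some scaffold (here scaffold_12) happens to contain only one populated bin at that resolution.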
