aidenlab / straw
Extract data quickly from Juicebox via straw
License: MIT License
My intention is to use straw to extract a contact matrix from .hic file so that
result = straw.straw("NONE", "https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic",
str(args.chrom), str(args.chrom), "BP", args.bin)
When args.bin = 2.5Mb, I printed out the first few entries of result[0], result[1], and result[2]. By inspecting the result and reading the source code of straw.py, I learned that result[0] and result[1] are the x and y coordinates, respectively, and result[2] is the counts.
[0, 0, 2500000, 0, 2500000, 5000000, 0, 2500000, 5000000, 7500000, 0, 2500000, 5000000, 7500000, 10000000, 0, 2500000, 5000000, 7500000, 10000000]
[0, 2500000, 2500000, 5000000, 5000000, 5000000, 7500000, 7500000, 7500000, 7500000, 10000000, 10000000, 10000000, 10000000, 10000000, 12500000, 12500000, 12500000, 12500000, 12500000]
[1801463.0, 388561.0, 2243387.0, 140674.0, 667793.0, 2797495.0, 69644.0, 90349.0, 443364.0, 3702096.0, 63695.0, 68959.0, 155894.0, 591727.0, 3755324.0, 28924.0, 100082.0, 139273.0, 82155.0, 329323.0]
By tracing the x and y axes, I guessed that result gives the counts by iterating over the upper triangular matrix column by column (as shown in the first half of the image). However, when I binned at 25kb, the results were confusing. If result gives the full upper triangular matrix, then result[2] should have #ofbins(1+#ofbins)/2 entries; this holds when binSize=2.5Mb but not when binSize=25kb. I tried to trace the x,y labels and got a confusing line, as shown in the second half of that image.
So I want to ask: how is the upper triangular matrix encoded in the straw output? Please correct me if I misunderstand how STRAW works, since I haven't read the code very carefully.
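One way to check how the three lists relate to a matrix is to scatter them into a dense array. The sketch below assumes that the output is a sparse list of (x, y, count) triples in base-pair coordinates covering the upper triangle, as the 2.5Mb output above suggests. If only nonzero contacts are emitted, the number of entries will fall short of the full upper-triangle size at fine resolutions. The helper name to_dense is hypothetical and not part of straw.

```python
def to_dense(xs, ys, counts, bin_size):
    """Scatter sparse (x, y, count) triples into a symmetric dense matrix."""
    rows = [x // bin_size for x in xs]
    cols = [y // bin_size for y in ys]
    n = max(max(rows), max(cols)) + 1
    dense = [[0.0] * n for _ in range(n)]
    for r, c, value in zip(rows, cols, counts):
        dense[r][c] = value
        dense[c][r] = value  # mirror the upper triangle into the lower one
    return dense

# First six entries of the 2.5Mb output shown above
xs = [0, 0, 2500000, 0, 2500000, 5000000]
ys = [0, 2500000, 2500000, 5000000, 5000000, 5000000]
counts = [1801463.0, 388561.0, 2243387.0, 140674.0, 667793.0, 2797495.0]
dense = to_dense(xs, ys, counts, 2500000)
```

Because each entry carries its own coordinates, this works regardless of the order in which the entries arrive.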
As per title.
Given a MatrixZoomData file:
import hicstraw
hic = hicstraw.HiCFile(path)
resolution = min(hic.getResolutions())
chromosomes = {c.name : c for c in hic.getChromosomes()}
mzd = hic.getMatrixZoomData("1", "2", "observed", "KR", "BP", resolution)
what is the fastest way to determine if there will be no records?
Using getRecords seems to be pretty slow for large chromosomes...
records = mzd.getRecords(0, chromosomes["1"].length, 0, chromosomes["2"].length)
len(records) == 0
Thank you in advance!
Describe the bug
The straw python package can't fetch data correctly when running:
result = matrixObj.getDataFromBinRegion(4000000,6000000,4000000,6000000)
It blocks for a moment, then returns an empty list.
To Reproduce
see the Screenshot. My test data can be downloaded from here.
Expected behavior
Return a list with the values read.
Describe the bug
read_metadata does not work on version 0.0.6 (the one released on pip)
Solution
Can you update pip to reflect the current version?
Hello,
Sorry if there was an easier way to extract data that I haven't seen but:
Is your feature request related to a problem? Please describe.
The current way strawC reports data requires heavy conversion before it is useful. While the normal straw reports a list of lists, strawC reports objects that can't be accessed easily.
While the extraction itself is many times faster than the normal version, the added overhead of converting the data makes it slower than, or the same speed as, normal straw.
%%timeit
data = strawC.strawC('NONE', hic_folder+files[1], 'chr22', 'chr22', 'BP', 10000)
extract = lambda x: (x.binX, x.binY, x.counts)
converted_data = np.array(list(map(extract, data)), dtype = np.int64)
matrix = scipy.sparse.coo_matrix((converted_data[:,2],(converted_data[:,0]//10000,converted_data[:,1]//10000)))
707 ms ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
data = straw.straw('NONE', hic_folder+files[1], 'chr22', 'chr22', 'BP', 10000)
matrix = scipy.sparse.coo_matrix((data[2],(np.array(data[0])//10000,np.array(data[1])//10000)))
673 ms ± 19.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Describe the solution you'd like
Is it possible to report the data either like the normal straw, or as a numpy array, or even directly as a scipy sparse matrix?
If I understand correctly it is possible to use numpy structures in c++ in pybind, maybe a version designed like that?
Thanks!
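As a stopgap on the caller's side, the record objects can be flattened into the three parallel lists that the pure-Python straw returns. This is only a sketch: the Record namedtuple below is a hypothetical stand-in for the objects strawC returns, assuming they expose binX, binY, and counts attributes as in the timing code above.

```python
from collections import namedtuple

# Hypothetical stand-in for the record objects returned by strawC
Record = namedtuple("Record", ["binX", "binY", "counts"])

def records_to_columns(records):
    """Convert record objects into three parallel lists (x, y, counts)."""
    xs, ys, cs = [], [], []
    for rec in records:
        xs.append(rec.binX)
        ys.append(rec.binY)
        cs.append(rec.counts)
    return xs, ys, cs

data = [Record(0, 0, 12.0), Record(0, 10000, 3.0), Record(10000, 10000, 7.0)]
xs, ys, cs = records_to_columns(data)
```

The resulting lists can then be passed to scipy.sparse.coo_matrix in the same way as the straw.straw output in the second timing above. This does not remove the per-record Python overhead; it only gives both code paths the same shape.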
It seems the package on bioconda is updated quite slowly compared to GitHub.
I'm working on developing a converter from .hic to .cool files (hic2cool) and came across a possible issue with a test file of hic version 6. The binX, binY, and counts values were not correct within the readBlock function.
The file is IMR90.hic and can be found here: https://bcm.app.box.com/v/aidenlab/folder/11235404320.
I was able to get the correct counts (i.e. ones that matched the output of juice_tools dump) by changing lines 323 to 325 in python/straw.py. https://github.com/theaidenlab/straw/blob/65f0e94bf7cc11cfa6a9e7ddaf50205591bb6069/python/straw.py#L323-L325
Looking closely, the binX value == nRecords for i=0 in the loop over range(nRecords), which explains the problem. I changed the range of bytes read and it seems to work:
x = struct.unpack(b'<i', uncompressedBytes[(12*i+4):(12*i+8)])[0]
y = struct.unpack(b'<i', uncompressedBytes[(12*i+8):(12*i+12)])[0]
c = struct.unpack(b'<f', uncompressedBytes[(12*i+12):(12*i+16)])[0]
Just thought I would let you know in case this is a valid issue you want to fix.
Best,
Carl
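The offset arithmetic above can be checked on a synthetic buffer. The sketch below assumes the layout described in this issue: a 4-byte little-endian record count followed by 12-byte records of (int32 binX, int32 binY, float32 counts). The function name unpack_records is hypothetical, and real blocks may carry additional header fields that this does not handle.

```python
import struct

def unpack_records(buf):
    """Decode (binX, binY, counts) triples that follow a 4-byte record count."""
    (n_records,) = struct.unpack_from('<i', buf, 0)
    records = []
    for i in range(n_records):
        # Skip the 4-byte count, then step 12 bytes per record
        x, y, c = struct.unpack_from('<iif', buf, 4 + 12 * i)
        records.append((x, y, c))
    return records

# Synthetic block: count of 2, then two records
buf = (struct.pack('<i', 2)
       + struct.pack('<iif', 5, 7, 1.5)
       + struct.pack('<iif', 6, 9, 2.0))
records = unpack_records(buf)
```

Reading the first record at offset 0 instead of 4 would return the record count as binX, which matches the symptom described above.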
Hi, thank you for such tools. But:
1) How can I install straw as a python module? I get an error like "ImportError: No module named straw".
2) How can I extract the whole-genome interaction matrix? A full command line would be appreciated.
Cannot use the current straw.cpp to find the straw function but by including the ../python/old the compilation works:
straw-86c2939e3695e31a5a41f53cc1231f7fcb77fb87/C++$ g++ -lz -std=c++11 -o straw main.cpp straw.cpp
/tmp/cc4rs88z.o:main.cpp:function main: error: undefined reference to 'straw(std::string, std::string, int, std::string, std::string, std::string, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&, std::vector<float, std::allocator<float> >&)'
#working version
straw-86c2939e3695e31a5a41f53cc1231f7fcb77fb87/C++$ g++ -lz -std=c++11 -o straw main.cpp straw.cpp ../python/old/straw.cpp
ps: also try to include a makefile or include it into the README for this project.
Need to update strawC especially for v9.
Hi,
I would like to implement support for the hic format for Galaxy (https://github.com/galaxyproject/galaxy). For this I need test data for a functional test of whether the file format detection works as it should. Do you have any hic file with a file size < 1MB that I could use for this purpose? I looked in your repository here, but it seems you don't have test cases? And in the ipython notebook, the URL https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/HIC001.hic is given, but I only get an ERROR 403: Forbidden error using wget.
Best,
Joachim
Hi,
Just to mention that, for an unknown reason, I had to add "-std=c++11", because without it I can't compile the script with g++.
I hope this helps others.
C++ fails silently, Python fails with an obscure error when normalization vectors don't exist.
Hi, thank you for providing such a useful tool! We are wondering whether it is possible to add options such as:
$ straw HEADER file.hic
chromosomes 1 2 3 4 ....
blocks BP 10000 20000 500000 ...
blocks FRAG 1 5 10 ...
?
It would be very helpful for people who are trying to get this info without having to try and fail to read the file. Thank you!
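Until such an option exists, the same information can be read from the file header. The following is a minimal sketch, assuming the header layout used by .hic versions up to 8 (magic string, version, master index, genome ID, attributes, chromosomes, then base-pair and fragment resolutions, all little-endian). Later versions are assumed to differ and are rejected. The function names are hypothetical, and the header bytes are synthetic.

```python
import io
import struct

def read_cstr(f):
    """Read a null-terminated string."""
    out = bytearray()
    while True:
        b = f.read(1)
        if b in (b"", b"\0"):
            return out.decode("utf-8")
        out += b

def read_int(f):
    return struct.unpack("<i", f.read(4))[0]

def read_header(f):
    if f.read(4) != b"HIC\0":
        raise ValueError("not a .hic file: bad magic string")
    version = read_int(f)
    if version > 8:
        raise ValueError("header layout for version %d not handled" % version)
    master_index = struct.unpack("<q", f.read(8))[0]
    genome_id = read_cstr(f)
    for _ in range(read_int(f)):      # skip attribute key/value pairs
        read_cstr(f)
        read_cstr(f)
    chromosomes = {}
    for _ in range(read_int(f)):
        name = read_cstr(f)
        chromosomes[name] = read_int(f)
    bp = [read_int(f) for _ in range(read_int(f))]
    frag = [read_int(f) for _ in range(read_int(f))]
    return {"version": version, "master_index": master_index,
            "genome_id": genome_id, "chromosomes": chromosomes,
            "bp_resolutions": bp, "frag_resolutions": frag}

def cstr(s):
    return s.encode("utf-8") + b"\0"

# Synthetic header for illustration only
blob = (b"HIC\0" + struct.pack("<i", 8) + struct.pack("<q", 1234)
        + cstr("hg19") + struct.pack("<i", 0)
        + struct.pack("<i", 2)
        + cstr("1") + struct.pack("<i", 1000)
        + cstr("2") + struct.pack("<i", 900)
        + struct.pack("<i", 2) + struct.pack("<ii", 10000, 5000)
        + struct.pack("<i", 1) + struct.pack("<i", 5))
header = read_header(io.BytesIO(blob))
```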
So I am dumping obs/exp data from a hic file with command-line 'straw' versus R library 'strawr', and I am not getting the same results.
The data are very similar overall and correlate highly, but still are clearly not the same values, upwards of 80% of non-NA rows are different at 4 decimals of accuracy. This holds true across normalizations, bin sizes, and chromosomes, even unnormalized data (i.e. NONE oe) has this problem.
I am using the 'straw' compiled from the latest github release, and 'strawr' installed fresh just a few days ago on R-4.1.0 via install.packages().
I also compared the data from juicer tools 'dump' and found that it was basically identical to strawr.
Here is a row slice from a table showing both methods, same hic file, chr 1, VC, oe, 10kb:
PosA PosB dump strawr straw
40000 40000 0.463189 0.463189 0.463189
40000 45000 1.971135 1.971135 1.971135
40000 50000 2.149339 2.149339 2.149339
40000 55000 1.261088 1.261088 1.261088
40000 60000 0.776958 0.776958 0.624063
40000 65000 0.687151 0.687151 0.855503
40000 70000 0.394186 0.394187 0.333246
40000 80000 1.384906 1.384906 0.854544
40000 105000 1.731343 1.731343 1.358210
40000 110000 1.961904 1.961904 1.652741
40000 115000 0.312818 0.312818 0.240716
40000 120000 0.190295 0.190295 0.488769
40000 130000 0.333526 0.333526 0.338950
40000 135000 1.289947 1.289947 0.944601
40000 140000 0.450147 0.450147 0.437852
40000 145000 1.116514 1.116514 1.116514
40000 150000 0.638958 0.638958 0.459245
40000 165000 0.737243 0.737243 0.832634
40000 175000 0.632508 0.632508 0.678903
40000 190000 1.396106 1.396106 1.578750
Not sure how to proceed.
Thanks,
Ariel
Describe the bug
The current python version of straw (and strawC) isn't able to load a file downloaded from the 4DN Data Portal. Running straw/python/read_hic_header.py with the .hic file suggests that this file should be readable though.
To Reproduce
Steps to reproduce the behavior:
Run straw/python/read_hic_header.py <path/to/4DNFI4OUMWZ8.hic>; you should see:
HiC version : 8
Master index : 17049250642
Genome ID : /var/lib/cwl/stga58048be-68cd-4724-b145-c172fb1dd45b/4DNFI3UBJ3HZ.chrom.sizes
Chromosomes : {'ALL': 2725521, '1': 195471971, '2': 182113224, '3': 160039680, '4': 156508116, '5': 151834684, '6': 149736546, '7': 145441459, '8': 129401213, '9': 124595110, '10': 130694993, '11': 122082543, '12': 120129022, '13': 120421639, '14': 124902244, '15': 104043685, '16': 98207768, '17': 94987271, '18': 90702639, '19': 61431566, 'X': 171031299, 'Y': 91744698}
Base pair-delimited resolutions : [10000000, 5000000, 2500000, 1000000, 500000, 250000, 100000, 50000, 25000, 10000, 5000, 2000, 1000]
Fragment-delimited resolutions : []
import strawC
straw_out = strawC.strawC('NONE', r'path/to/4DNFI4OUMWZ8.hic', '5', '5', 'BP', 500000)
See: File doesn't have the given chr_chr map 5_5
straw.straw:
import straw
straw_out = straw.straw('NONE', '4DNFI4OUMWZ8.hic', '5', '5', 'BP', 500000)
HiC version: 8
See:
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-6-00891572fa3d> in <module>
----> 1 straw_out = straw.straw('NONE', '4DNFI4OUMWZ8.hic', '5', '5', 'BP', 500000)
~/miniconda3/lib/python3.7/site-packages/straw/straw.py in straw(norm, infile, chr1loc, chr2loc, unit, binsize, is_synapse)
508 req.seek(master)
509
--> 510 list1 = readFooter(req, c1, c2, norm, unit, binsize)
511 myFilePos=list1[0]
512 c1NormEntry=list1[1]
~/miniconda3/lib/python3.7/site-packages/straw/straw.py in readFooter(req, c1, c2, norm, unit, resolution)
130 c1NormEntry=dict()
131 c2NormEntry=dict()
--> 132 nBytes = struct.unpack('<i', req.read(4))[0]
133 key = str(c1) + "_" + str(c2)
134 nEntries = struct.unpack('<i', req.read(4))[0]
error: unpack requires a buffer of 4 bytes
Expected behavior
Being able to read the Hi-C file?
The C++ and python flavors for straw have been pretty significantly improved/upgraded. It would be great if some of these improvements could also get pulled into the strawR version and CRAN could also be updated.
Hello. I am an undergrad at the University of British Columbia. I am trying to extract contact probabilities from a .hic file and store them as a 2d array or similar. Any help is appreciated! :)
I use Mac.
So far I have downloaded the binary from: https://github.com/aidenlab/straw/blob/master/bin/Mac/straw
I navigated to the downloads folder in the terminal.
My data, data.hic, is also located in the downloads folder.
I use the following terminal command:
./straw NONE data.hic 1 1 BP 1000000 > output.txt
and I get the following message in terminal:
-bash: ./straw: Permission denied
I'm not sure what this means... Any suggestions would be appreciated. Thanks in advance!
Is your feature request related to a problem? Please describe.
Some functions in straw/straw.py print a message and return -1 when unexpected outcomes are encountered. Checking for a returned value of -1 could help, but 1) this isn't currently done in the code, 2) isn't ideal as it delegates the responsibility of catching these exceptions outside of the relevant function. Worse, it can let execution carry on for a while and then trigger an unexpected and hard-to-trace exception when the -1 value is subsequently used.
Describe the solution you'd like
Explicitly raising typed exceptions with informative messages and documenting common cases for their occurrence.
Describe alternatives you've considered
None, but suggestions are encouraged.
Example
if (magic_string != b"HIC"):
    print('This does not appear to be a HiC file magic string is incorrect')
    return -1
global version
version = struct.unpack('<i',req.read(4))[0]
if (version < 6):
    print("Version {0} no longer supported".format(str(version)))
    return -1
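For comparison, the same checks could raise typed exceptions instead of returning -1. This is only a sketch of the proposal: the exception classes and the function name read_version are hypothetical and not part of straw, and the header bytes are synthetic.

```python
import io
import struct

class HicFormatError(ValueError):
    """Raised when the input does not look like a .hic file."""

class UnsupportedVersionError(HicFormatError):
    """Raised when the .hic version is older than the reader supports."""

def read_version(req):
    magic_string = struct.unpack('<3s', req.read(3))[0]
    req.read(1)  # skip the null terminator after the magic string
    if magic_string != b"HIC":
        raise HicFormatError(
            "This does not appear to be a HiC file: magic string is incorrect")
    version = struct.unpack('<i', req.read(4))[0]
    if version < 6:
        raise UnsupportedVersionError(
            "Version {0} no longer supported".format(version))
    return version

version = read_version(io.BytesIO(b"HIC\0" + struct.pack('<i', 8)))
```

Callers that want the old behavior can still catch HicFormatError in one place, while everyone else gets a traceback that points at the real cause.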
dict.keys() returns a view object in Python 3, but in Python 2 it returns a list. Hence, iterating over dict.keys() while modifying the dict fails in Python 3, because the dict changes size during iteration.
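A small illustration of the failure and one common workaround, which is to iterate over a snapshot of the keys. The function and variable names are made up for this example and do not come from straw.

```python
def drop_small(counts, threshold):
    """Remove entries below threshold; works on both Python 2 and 3."""
    for key in list(counts.keys()):  # snapshot, so deletion is safe
        if counts[key] < threshold:
            del counts[key]
    return counts

contacts = {"1_1": 50, "1_2": 3, "2_2": 40}
drop_small(contacts, 10)

# Iterating the live view while deleting raises RuntimeError in Python 3
live = {"a": 1, "b": 2}
try:
    for key in live.keys():
        del live[key]
    raised = False
except RuntimeError:
    raised = True
```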
If a user tries to read .hic file with an unsupported resolution, readMatrix prints "Error finding block data" and returns -1. straw.py then raises a TypeError at line 548, which is a bit cryptic:
blockBinCount=list1[0] # where list1 = -1
It might be preferable to raise a ValueError in readMatrix indicating that the value for resolution wasn't found in the .hic file.
Hi,
Thanks for the awesome work!! I am new to this field and am trying to use straw to get a vanilla matrix from a .hic file. There may be a potential bug here.
Line 616 in 4d22336
if norm != "NONE":
    c1Norm = futureNorm1.result()
    if isIntra:
        c2Norm = c1Norm
    else:
        c2Norm = futureNorm2.result()
blockBinCount, blockColumnCount = futureMatrix.result()
return normalizedmatrix(self.infile, self.is_synapse, binsize, isIntra, neededToFlipIndices, blockBinCount,
                        blockColumnCount, blockMap, norm, c1Norm, c2Norm, self.version)
Describe the bug
When using the straw object to get a contact matrix without normalization:
UnboundLocalError: local variable 'c1Norm' referenced before assignment
To Reproduce
import straw
straw.__version__
hicFile = 'test.hic'
strawObj = straw.straw(hicFile)
matObj = strawObj.getNormalizedMatrix('5', '5', 'NONE', 'BP', 10000)
Desktop:
Describe the bug
A recent commit (0e9cfa5) changes function signatures, by adding 'matrix' parameter of 'observed' or 'oe': See the git blame record: https://github.com/aidenlab/straw/blame/90367afab2f4142860f53567a6cb5a1d26a007a9/R/R/RcppExports.R#L26
However, since matrix is a new argument and is placed first without a default value, this breaks other packages. For example, in the HiCRep package: https://github.com/TaoYang-dev/hicrep/blob/e485dfa71dc98cadbbda70424084e85a4a94e3b0/R/hic2mat.R#L23
For now, since this commit has been released for a while, and some other people's code may depend on this new signature, there's no way going back. I suggest you update your package's version number:
Line 3 in 90367af
So that other packages can at least update their dependency specification, and make sure the correct version of strawr is installed.
Thanks!
It’s my understanding that the Python package can be installed with pip install strawC. It’d be great if that was mentioned in the README.
Describe the bug
When I try to install the package using a conda environment, it doesn't seem to work. I get an error related to curl/curl.h. It has something to do with the c package, I think.
To Reproduce
Steps to reproduce the behavior:
Using environment.yml file
Expected behavior
The installation fails with this error:
src/straw.cpp:34:10: fatal error: curl/curl.h: No such file or directory
34 | #include <curl/curl.h>
| ^~~~~~~~~~~~~
compilation terminated.
error: command '/net/noble/vol1/home/mozesj/anaconda3/envs/protocol/bin/x86_64-conda-linux-gnu-cc' failed with exit code 1
[end of output]
Desktop (please complete the following information):
Additional context
When I remove the line that installs hic-straw from the environment.yml file, the installation of the environment works, so something goes wrong when the environment includes that line.
Collecting hic-straw
Using cached https://files.pythonhosted.org/packages/a3/f1/19bb1d40f659e5daa98ba6a2d845c1203ec45661c1b855958643ccc37c66/hic-straw-1.1.3.tar.gz
Requirement already satisfied: pybind11>=2.4 in c:\user
(from hic-straw) (2.7.1)
Installing collected packages: hic-straw
Running setup.py install for hic-straw: started
Running setup.py install for hic-straw: finished with status 'error'
ERROR: Command errored out with exit status 1:
Complete output (12 lines):
running install
running build
running build_ext
building 'hicstraw' extension
creating build
creating build\temp.win32-3.8
creating build\temp.win32-3.8\Release
creating build\temp.win32-3.8\Release\src
straw.cpp
src/straw.cpp(34): fatal error C1083: Cannot open include file: 'curl/curl.h': No such file or directory
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Professional\\VC\\Tools\\MSVC\\14.26.28801\\bin\\HostX86\\x86\\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: externally-managed --compile Check the logs for full command output.
WARNING: You are using pip version 19.2.3, however version 22.0.3 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
I am trying to use straw.py to extract the normalized contact matrix by using straw.straw("KR",file1, chrn, chrn,"KR",resolution).
In line 393: def readNormalizationVector(req):, the function readNormalizationVector only has one argument, while in line 524 and 526, when you call this function
c1Norm = readNormalizationVector(req, c2NormEntry)
c2Norm = readNormalizationVector(req, c2NormEntry)
There are two arguments given. That will be a problem for using this function.
Thanks in advance!
Hi, I am using straw with the following command:
./straw NONE GSE63525_K562_combined_30.hic 22 22 BP 500000 > K562.chHCT116_r22.500 kb.txt
I get following error:
"not enough arguments"
What could I be missing?
Indices will print out more than once with chromosome region
See email
We're using straw and found a few cases where the same bin pair appears two or more times with different count values. Sometimes the first one ('old' below) is the correct one ('pairs'), other times none of them match the correct number. We're not sure if this is an issue with .hic file or straw.
### Contact count conflicts with Juicer
### using 5kb binsize, KR normalization
### file: /n/data1/hms/transition/park/juicer-sample-dir/SRR1658650/aligned/inter.new_20170124.hic
### conflicts are in form chr1:start1|chr2:start2 --> old count, new count
17:25260000|24:28780000 --> old: 10 new: 7 //pairs: 10
17:25260000|24:28785000 --> old: 25 new: 32 //pairs: 25
17:25260000|24:28790000 --> old: 1 new: 8 //pairs: 1
17:25260000|24:28795000 --> old: 12 new: 13 //pairs: 12
17:25260000|24:28800000 --> old: 7 new: 24 //pairs: 7
17:25260000|24:28805000 --> old: 3 new: 8 //pairs: 3
17:25260000|24:28810000 --> old: 5 new: 4 //pairs: 5
17:25260000|24:28815000 --> old: 4 new: 12 //pairs: 4
25:0|25:5000 --> old: 227 new: 79 //pairs: 180
25:0|25:10000 --> old: 332 new: 330 //pairs: 251
25:0|25:10000 --> old: 330 new: 194 //pairs: 251
25:0|25:15000 --> old: 110 new: 73 //pairs: 84
25:0|25:15000 --> old: 73 new: 120 //pairs: 84
25:0|25:15000 --> old: 120 new: 12 //pairs: 84
Thanks in advance!
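A quick way to list such repeated bin pairs in any dump is to group the counts by coordinate. The sketch below uses a few of the values from the listing above; the function name find_conflicts is hypothetical, and the input is assumed to be an iterable of (start1, start2, count) tuples for a single chromosome pair.

```python
from collections import defaultdict

def find_conflicts(records):
    """Return bin pairs that appear more than once, with all their counts."""
    seen = defaultdict(list)
    for start1, start2, count in records:
        seen[(start1, start2)].append(count)
    return {pair: counts for pair, counts in seen.items() if len(counts) > 1}

# Values taken from the 25:0 rows listed above
records = [
    (0, 5000, 227),
    (0, 10000, 332), (0, 10000, 330),
    (0, 15000, 110), (0, 15000, 73), (0, 15000, 120),
]
conflicts = find_conflicts(records)
```

This only reports the repeats; it does not say which of the values, if any, is correct.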
Reading the documentation I could not find any reference to those normalization procedures. I wonder if there is a way of using them.
Also, when an inter-chromosomal map is extracted with the balanced norm, which of the following procedures (if any) is used: inter-balancing or genome-wide balancing?
Should be investigated - seems much slower than warranted just by URL / network.
When using the straw python only version and running some of the example commands it returns the following error.
straw.straw('VC', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4', '4', 'BP', 500000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() takes from 2 to 3 positional arguments but 7 were given
Using straw with only the .hic as input does not give an error.
The examples (https://github.com/theaidenlab/straw/wiki/CPP#running) show how to extract reads between two consecutive regions on the chromosome. If I have two sets of bins which are not consecutive, and I am only interested to extract reads between these two sets of bins, is there a fast way to use straw to do that? Thanks.
I installed straw locally in my python 3.9 environment via the commands proposed on your site, pip install hic-straw and pip install strawC. Then I also installed pybind11 from the Anaconda repository (https://anaconda.org/conda-forge/pybind11). Unfortunately, in your Colab tutorial the following commands,
import straw
import numpy as np
from scipy.sparse import coo_matrix
result = straw.straw('observed','KR', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4:0:1000000', '4:0:1000000', 'BP', 5000)
result = straw.straw('observed','KR', 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined_30.hic', '4:1000000:2000000', '4:1000000:2000000', 'BP', 5000)
# printing the first 10 rows from the sparse format
for i in range(10):
print("{0}\t{1}\t{2}".format(result[i].binX, result[i].binY, result[i].counts))
take about 30 sec to run, whereas locally I wait for more than 15 minutes and see no result. Do you know where the problem is?
Also, for some reason pybind11 cannot be installed for python 3.9 in the way you describe on your site (probably this is the problem).
Thank you!
In line 384 in the readBlock function, there's a variable "countsnot" which should simply be "counts":
elif (type_== 2):
    temp=14
    nPts = struct.unpack('<i',uncompressedBytes[temp:(temp+4)])[0]
    temp=temp+4
    w = struct.unpack('<h',uncompressedBytes[temp:(temp+2)])[0]
    temp=temp+2
    for i in range(nPts):
        row=int(i/w)
        col=i-row*w
        bin1=int(binXOffset+col)
        bin2=int(binYOffset+row)
        if (useShort==0):
            c = struct.unpack('<h',uncompressedBytes[temp:(temp+2)])[0]
            temp=temp+2
            if (c != -32768):
                record = dict()
                record['binX'] = bin1
                record['binY'] = bin2
                record['counts'] = c
                v.append(record)
                index = index + 1
        else:
            counts = struct.unpack('<f',uncompressedBytes[temp:(temp+4)])[0]
            temp=temp+4
            if (countsnot != 0x7fc00000): #<-----------------here
                record = dict()
                record['binX'] = bin1
                record['binY'] = bin2
                record['counts'] = counts
                v.append(record)
                index = index + 1
return v
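A related point that may be worth checking once the name is fixed: 0x7fc00000 is the bit pattern of a float32 NaN, and after struct.unpack('<f', ...) the value is a Python float, so comparing it with the integer 0x7fc00000 does not detect NaN. The sketch below shows this on a synthetic value, with math.isnan as one possible alternative. This is an observation about Python semantics, not a statement about how the maintainers intend to handle missing values.

```python
import math
import struct

NAN_BITS = 0x7FC00000

# Pack the bit pattern as an unsigned int, then decode it the way the reader does
raw = struct.pack('<I', NAN_BITS)
counts = struct.unpack('<f', raw)[0]

# The decoded value is a float NaN; it never equals the integer bit pattern
int_compare_detects_nan = (counts == NAN_BITS)
isnan_detects_nan = math.isnan(counts)

# An ordinary count is unaffected by either check
ordinary = struct.unpack('<f', struct.pack('<f', 1.5))[0]
ordinary_is_nan = math.isnan(ordinary)
```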
Since curl.h is required, it should be declared in SystemRequirements.
Hello,
I get the following error when I run the straw-R.cpp file. can you please let me know what went wrong?
C:/RBuildTools/3.4/mingw_64/bin/g++ -I"C:/PROGRA~1/R/R-33~1.0/include" -DNDEBUG -I"PATH TO R FOLDER" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c straw-R.cpp -o straw-R.o
straw-R.cpp: In function 'SEXPREC* straw(std::string, std::string, int, std::string, std::string, std::string)':
straw-R.cpp:525:34: error: no matching function for call to 'std::basic_ifstream::basic_ifstream(std::string&, const openmode&)'
ifstream fin(fname, fstream::in);
^
straw-R.cpp:525:34: note: candidates are:
In file included from straw-R.cpp:25:0:
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:470:7: note: std::basic_ifstream<_CharT, _Traits>::basic_ifstream(const char*, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits; std::ios_base::openmode = std::_Ios_Openmode]
basic_ifstream(const char* __s, ios_base::openmode __mode = ios_base::in)
^
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:470:7: note: no known conversion for argument 1 from 'std::string {aka std::basic_string}' to 'const char*'
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:456:7: note: std::basic_ifstream<_CharT, _Traits>::basic_ifstream() [with _CharT = char; _Traits = std::char_traits]
basic_ifstream() : __istream_type(), _M_filebuf()
^
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:456:7: note: candidate expects 0 arguments, 2 provided
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:430:11: note: std::basic_ifstream::basic_ifstream(const std::basic_ifstream&)
class basic_ifstream : public basic_istream<_CharT, _Traits>
^
C:/RBuildTools/3.4/mingw_64/x86_64-w64-mingw32/include/c++/fstream:430:11: note: candidate expects 1 argument, 2 provided
straw-R.cpp:535:20: error: 'stoi' was not declared in this scope
c1pos1 = stoi(x);
^
straw-R.cpp:541:20: error: 'stoi' was not declared in this scope
c2pos1 = stoi(x);
^
straw-R.cpp: In function 'SEXPREC* straw_R(Rcpp::String)':
straw-R.cpp:638:20: error: 'stoi' was not declared in this scope
binsize=stoi(size);
^
make: *** [straw-R.o] Error 1
Warning message:
running command 'make -f "C:/PROGRA~1/R/R-33~1.0/etc/x64/Makeconf" -f "C:/PROGRA~1/R/R-33~1.0/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="sourceCpp_5.dll" WIN=64 TCLBIN=64 OBJECTS="straw-R.o"' had status 2
Error in sourceCpp("straw-R.cpp") :
Error 1 occurred building shared library.
Best
Saradha
PS R session info is as follows:
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7600)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Rcpp_0.12.12
loaded via a namespace (and not attached):
[1] tools_3.3.0
hello,
Running read header I am getting this error. Can you please help resolving this:
$ python3 ./read_hic_header.py NlaIII_run01_UCSC_hg38.hic
Traceback (most recent call last):
File "./read_hic_header.py", line 19, in <module>
_=straw_module.read_metadata(sys.argv[1],verbose=verbose)
AttributeError: module 'straw.straw' has no attribute 'read_metadata'
Thanks
Ant
Should be fairly straightforward, see http://www.mathworks.com/help/matlab/matlab_external/standalone-example.html
Hi,
I'm playing with straw.cpp and the file https://github.com/aidenlab/Juicebox/blob/master/data/inter.hic that I saved on my side as 'jeter.hic'
$ wget -q -O - "https://github.com/aidenlab/Juicebox/blob/master/data/inter.hic?raw=true" | sha1sum
1f7fc1149306dc17b1e51b09053d152ebcef1cb0 -
$ ls -lah ~/jeter.hic
-rw-rw-r-- 1 lindenb lindenb 41M juin 4 11:32 /home/lindenb/jeter.hic
$ sha1sum ~/jeter.hic
1f7fc1149306dc17b1e51b09053d152ebcef1cb0 /home/lindenb/jeter.hic
I found that the following command:
./straw VC ~/jeter.hic 1:10:20 2:10:20 BP 100
reaches silently an EOF at https://github.com/aidenlab/straw/blob/master/C%2B%2B/straw.cpp#L266
you can check this by replacing the line 266 with
#define DEBUG(a) do { cerr << __LINE__ << ":" << a << endl; } while(0)
DEBUG("fin " << fin.tellg());
fin.read((char*)&nExpectedValues, sizeof(int));
DEBUG("nExpectedValues " << nExpectedValues << " eof ?" << fin.eof());
DEBUG("fin " << fin.tellg());
if ( (fin.rdstate() & std::ifstream::failbit ) != 0 ) DEBUG("failbit");
if ( (fin.rdstate() & std::ifstream::eofbit ) != 0 ) DEBUG("eofbit");
if ( (fin.rdstate() & std::ifstream::badbit ) != 0 ) DEBUG("badbit");
is it a bug in straw.cpp or is it a problem with the file inter.hic ?
thank you for your help.
Hi, I am using straw with the following command:
./straw NONE myfile.hic Y:1:15902555 Y:1:15902555 BP 100000 > outfile
I get following error:
"File doesn't have the given chr_chr map."
However, I am able to clearly visualize the data on those same exact coordinates from that file on https://www.aidenlab.org/juicebox/
Describe the bug
When a check in the package fails currently a message is printed and the function returns -1 or None instead of raising an Exception, which leads to random errors later.
To Reproduce
Steps to reproduce the behavior:
Have valid HiC file.
use:
import straw
straw.straw("NONE",filename,100,100,"BP",10000)
This will fail if the organism in the file does not have a chromosome 100; the error message, however, is:
File "/home/balthasar/.local/lib/python3.9/site-packages/straw/straw.py", line 470, in straw
master=list1[0]
TypeError: 'int' object is not subscriptable
Expected behavior
The error message should instead be:
File "/home/balthasar/.local/lib/python3.9/site-packages/straw/straw.py", line 110, in readHeader
raise ValueError("One of the chromosomes wasn't found in the file. Check that the chromosome name matches the genome.\n")
ValueError: One of the chromosomes wasn't found in the file. Check that the chromosome name matches the genome.
Describe the bug
straw aborts with a stoi issue
To Reproduce
I have compiled straw on my Mac (tried also with the binary version distributed here) and run it on .hic data provided by a collaborator
$ ~/src/straw/C++/straw NONE $i 19 19 10000 BP
libc++abi.dylib: terminating with uncaught exception of type std::invalid_argument: stoi: no conversion
Desktop (please complete the following information):
$ uname -a
Darwin C0004398.local 19.3.0 Darwin Kernel Version 19.3.0: Thu Jan 9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64 x86_64
Describe the bug
I tried to use pip to install this package on windows, but I got the following errors.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The installation should succeed.
Screenshots
(plus) D:\code\python\hicplus>set INCLUDE=%INCLUDE%;D:\apps\anaconda3\Library\include
(plus) D:\code\python\hicplus>python -m pip install -U hic-straw
Requirement already satisfied: hic-straw in d:\apps\anaconda3\envs\plus\lib\site-packages (0.0.6)
Collecting hic-straw
Using cached hic-straw-1.3.1.tar.gz (18 kB)
Preparing metadata (setup.py) ... done
Collecting pybind11>=2.4
Using cached pybind11-2.10.0-py3-none-any.whl (213 kB)
Building wheels for collected packages: hic-straw
Building wheel for hic-straw (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [40 lines of output]
D:\apps\anaconda3\envs\plus\lib\site-packages\setuptools\installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
warnings.warn(
running bdist_wheel
running build
running build_ext
building 'hicstraw' extension
creating build
creating build\temp.win-amd64-3.8
creating build\temp.win-amd64-3.8\Release
creating build\temp.win-amd64-3.8\Release\src
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -Ic:\users\liu.d.h\appdata\local\temp\pip-install-06_988jz\hic-straw_aa7b592d70914d3b886a3aa3d4054df1\.eggs\pybind11-2.10.0-py3.8.egg\pybind11\include -Ic:\users\liu.d.h\appdata\local\temp\pip-install-06_988jz\hic-straw_aa7b592d70914d3b886a3aa3d4054df1\.eggs\pybind11-2.10.0-py3.8.egg\pybind11\include -ID:\apps\anaconda3\envs\plus\include -ID:\apps\anaconda3\envs\plus\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -I%INCLUDE% -ID:\apps\anaconda3\Library\include -ID:\apps\anaconda3\Library\include /EHsc /Tpsrc/straw.cpp /Fobuild\temp.win-amd64-3.8\Release\src/straw.obj /EHsc /DVERSION_INFO=\\\"1.3.1\\\"
straw.cpp
C:\Users\Liu.D.H\AppData\Local\Temp\pip-install-06_988jz\hic-straw_aa7b592d70914d3b886a3aa3d4054df1\src\straw.h(74): warning C4244: 'argument': conversion from '__int64' to 'int', possible loss of data
src/straw.cpp(338): warning C4267: 'initializing': conversion from 'size_t' to 'int32_t', possible loss of data
src/straw.cpp(340): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data
src/straw.cpp(1023): warning C4244: 'return': conversion from 'uint64_t' to 'long', possible loss of data
src/straw.cpp(1286): warning C4244: 'argument': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1396): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1397): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1398): error C2131: expression did not evaluate to a constant
src/straw.cpp(1398): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1398): note: see usage of 'numRows'
src/straw.cpp(1398): error C2131: expression did not evaluate to a constant
src/straw.cpp(1398): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1398): note: see usage of 'numCols'
src/straw.cpp(1401): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1407): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1408): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1410): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1413): warning C4244: '=': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1414): warning C4244: '=': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1416): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1526): error C2131: expression did not evaluate to a constant
src/straw.cpp(1526): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1526): note: see usage of 'this'
src/straw.cpp(1530): error C3863: array type 'chromosome ['function']' is not assignable
src/straw.cpp(1690): error C2017: illegal escape sequence
src/straw.cpp(1690): error C2001: newline in constant
src/straw.cpp(1694): error C2143: syntax error: missing ';' before '}'
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.33.31629\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for hic-straw
Running setup.py clean for hic-straw
Failed to build hic-straw
Installing collected packages: pybind11, hic-straw
Attempting uninstall: hic-straw
Found existing installation: hic-straw 0.0.6
Uninstalling hic-straw-0.0.6:
Successfully uninstalled hic-straw-0.0.6
Running setup.py install for hic-straw ... error
error: subprocess-exited-with-error
× Running setup.py install for hic-straw did not run successfully.
│ exit code: 1
╰─> [40 lines of output]
running install
D:\apps\anaconda3\envs\plus\lib\site-packages\setuptools\command\install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_ext
building 'hicstraw' extension
creating build
creating build\temp.win-amd64-3.8
creating build\temp.win-amd64-3.8\Release
creating build\temp.win-amd64-3.8\Release\src
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -ID:\apps\anaconda3\envs\plus\lib\site-packages\pybind11\include -ID:\apps\anaconda3\envs\plus\lib\site-packages\pybind11\include -ID:\apps\anaconda3\envs\plus\include -ID:\apps\anaconda3\envs\plus\Include "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.33.31629\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -I%INCLUDE% -ID:\apps\anaconda3\Library\include -ID:\apps\anaconda3\Library\include /EHsc /Tpsrc/straw.cpp /Fobuild\temp.win-amd64-3.8\Release\src/straw.obj /EHsc /DVERSION_INFO=\\\"1.3.1\\\"
straw.cpp
C:\Users\Liu.D.H\AppData\Local\Temp\pip-install-06_988jz\hic-straw_aa7b592d70914d3b886a3aa3d4054df1\src\straw.h(74): warning C4244: 'argument': conversion from '__int64' to 'int', possible loss of data
src/straw.cpp(338): warning C4267: 'initializing': conversion from 'size_t' to 'int32_t', possible loss of data
src/straw.cpp(340): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data
src/straw.cpp(1023): warning C4244: 'return': conversion from 'uint64_t' to 'long', possible loss of data
src/straw.cpp(1286): warning C4244: 'argument': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1396): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1397): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1398): error C2131: expression did not evaluate to a constant
src/straw.cpp(1398): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1398): note: see usage of 'numRows'
src/straw.cpp(1398): error C2131: expression did not evaluate to a constant
src/straw.cpp(1398): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1398): note: see usage of 'numCols'
src/straw.cpp(1401): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1407): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1408): warning C4244: 'initializing': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1410): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1413): warning C4244: '=': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1414): warning C4244: '=': conversion from 'int64_t' to 'int32_t', possible loss of data
src/straw.cpp(1416): error C3863: array type 'float [numRows][numCols]' is not assignable
src/straw.cpp(1526): error C2131: expression did not evaluate to a constant
src/straw.cpp(1526): note: failure was caused by a read of a variable outside its lifetime
src/straw.cpp(1526): note: see usage of 'this'
src/straw.cpp(1530): error C3863: array type 'chromosome ['function']' is not assignable
src/straw.cpp(1690): error C2017: illegal escape sequence
src/straw.cpp(1690): error C2001: newline in constant
src/straw.cpp(1694): error C2143: syntax error: missing ';' before '}'
error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Community\\VC\\Tools\\MSVC\\14.33.31629\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
WARNING: No metadata found in d:\apps\anaconda3\envs\plus\lib\site-packages
Rolling back uninstall of hic-straw
Moving to d:\apps\anaconda3\envs\plus\lib\site-packages\hic_straw-0.0.6.dist-info\
from D:\apps\anaconda3\envs\plus\Lib\site-packages\~ic_straw-0.0.6.dist-info
Moving to d:\apps\anaconda3\envs\plus\lib\site-packages\straw\
from D:\apps\anaconda3\envs\plus\Lib\site-packages\~traw
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> hic-straw
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
(plus) D:\code\python\hicplus>
Additional context
See #112 (comment) also.
I got this error when using straw.straw on a local .hic file:
Traceback (most recent call last):
File "dumpWGstraw.py", line 16, in <module>
result = straw.straw(norm, hicname, '14', '22', 'BP', resolution)
File "/net/noble/vol2/home/gurkan/bin/anaconda2/envs/straw/lib/python3.6/site-packages/straw/straw.py", line 574, in straw
records=readBlock(req, idx['size'])
File "/net/noble/vol2/home/gurkan/bin/anaconda2/envs/straw/lib/python3.6/site-packages/straw/straw.py", line 384, in readBlock
if (countsnot != 0x7fc00000):
NameError: name 'countsnot' is not defined
I'm not sure what is causing this error. Straw was installed with pip, as instructed on the GitHub page.
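For context on the failing line: `countsnot` is not defined anywhere in the traceback, so it looks like a typo in the installed `straw.py` (possibly for `counts`; that is a guess, not something the report confirms). The constant it is compared against, `0x7fc00000`, is the bit pattern of a single-precision quiet NaN, so the line presumably meant to skip NaN counts. A minimal sketch of such a check, where `bits_to_float` and `is_nan_bits` are hypothetical helper names and not part of straw:

```python
import math
import struct

NAN_BITS = 0x7FC00000  # IEEE 754 single-precision quiet NaN


def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def is_nan_bits(bits):
    """Hypothetical helper: True if the raw 32-bit pattern decodes to NaN."""
    return math.isnan(bits_to_float(bits))


print(is_nan_bits(NAN_BITS))    # the NaN pattern from the traceback
print(is_nan_bits(0x3F800000))  # bit pattern of 1.0, for comparison
```

Checking which version of the package pip actually installed may also help narrow down whether the typo is specific to one release.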
Hi,
When I use straw, I run into a problem. I used your test data:
import straw
result = straw.straw('NONE', 'test.hic', 'X', 'X', 'BP', 1000000)
for i in range(len(result)):
    print("{0}\t{1}\t{2}".format(result[i].binX, result[i].binY, result[i].counts))
The error is:
Traceback (most recent call last):
File "test2.py", line 5, in <module>
result = straw.straw('NONE', 'test.hic', 'X', 'X', 'BP', 100000)
TypeError: __init__() takes from 2 to 3 positional arguments but 7 were given
Can you help me solve it?
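One thing worth noting about the script: it indexes `result[i].binX`, while an earlier report on this page describes `result` as three parallel lists (x positions, y positions, counts), so the return shape seems to depend on the installed version. The sketch below normalizes either shape into `(x, y, count)` tuples. `to_triples` and the `Record` stand-in are hypothetical and not part of straw, and this does not address the `TypeError` itself, which is raised by the call before any result exists.

```python
from collections import namedtuple

# Stand-in for a record object exposing binX, binY and counts (assumption).
Record = namedtuple("Record", ["binX", "binY", "counts"])


def to_triples(result):
    """Return a list of (x, y, count) tuples from either output shape."""
    result = list(result)
    if not result:
        return []
    if hasattr(result[0], "binX"):
        # Shape 1: a list of record objects.
        return [(r.binX, r.binY, r.counts) for r in result]
    # Shape 2: three parallel lists (x positions, y positions, counts).
    xs, ys, counts = result
    return list(zip(xs, ys, counts))


# Values taken from the 2.5 Mb example quoted earlier on this page.
parallel = [[0, 0, 2500000], [0, 2500000, 2500000], [1801463.0, 388561.0, 2243387.0]]
records = [Record(0, 0, 1801463.0), Record(0, 2500000, 388561.0)]

for x, y, c in to_triples(parallel):
    print("{0}\t{1}\t{2}".format(x, y, c))
```

Printing through a helper like this keeps the reporting loop unchanged regardless of which shape the installed version returns.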