koszullab / graal

(check out instaGRAAL for a faster, updated program!) This program is from Marie-Nelly et al., Nature Communications, 2014 (High-quality genome assembly using chromosomal contact data), also Marie-Nelly et al., 2013, PhD thesis (https://www.theses.fr/2013PA066714)

Home Page: https://research.pasteur.fr/fr/software/graal-software-for-genome-assembly-from-chromosome-contact-frequencies/

Python 67.62% Cuda 32.38%

graal's People

Contributors

baudrly, cooketho, rkoszul, rvmn57

graal's Issues

pyramid_sparse.py error

I used HiC-Box to pre-process my data, and loaded it into GRAAL. I clicked 'build pyramid' and it reported that the pyramid was built. I then started GRAAL, and after a few minutes, it gave the error below. I'm not sure if this worked correctly--I will send you a link to my data if you care to try and reproduce the error. I got GRAAL to work fine on the test data set, so I wonder if it is some problem with how I processed my data. Please let me know if you can help. Thanks!

25204
25205
25206
25207
25208
25209
25210
25211
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "main_window.py", line 425, in run
    self)
  File "/home/tom/Desktop/test/GRAAL-master/main_gl.py", line 81, in __init__
    self.fasta_file, candidates_blacklist, self.allow_repeats)
  File "/home/tom/Desktop/test/GRAAL-master/simulation_loader.py", line 66, in __init__
    self.level.build_seq_per_bin(genome_fasta=fasta_file) #
  File "/home/tom/Desktop/test/GRAAL-master/pyramid_sparse.py", line 1419, in build_seq_per_bin
    start = frag.start_pos
AttributeError: 'str' object has no attribute 'start_pos'


PyCUDA ERROR: The context stack was not empty upon module cleanup.

A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.

Use Context.pop() to avoid this problem.

Aborted (core dumped)
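
The PyCUDA abort above is plausibly a follow-on failure: the worker thread died on the AttributeError while a CUDA context was still on the stack. A minimal sketch of a guard that always pops the context follows. The names `run_with_context` and `RecordingContext` are hypothetical and not part of GRAAL. With PyCUDA, the factory passed in could be `pycuda.driver.Device(0).make_context`, called after `pycuda.driver.init()`.

```python
def run_with_context(make_context, work):
    """Create a context, run work(ctx), and always pop the context afterwards."""
    ctx = make_context()
    try:
        return work(ctx)
    finally:
        # Runs on success and on any exception, so the context is never left active.
        ctx.pop()


class RecordingContext(object):
    """Stand-in for a CUDA context, used only to demonstrate the guard."""

    def __init__(self):
        self.popped = False

    def pop(self):
        self.popped = True
```

The original exception still propagates to the caller; the guard only ensures that cleanup happens first.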

Make a non-GUI version.

A simple command-line interface would be really helpful. A mandatory GUI is a significant hindrance when dealing with remote compute machines, which often do not have X11 etc. installed.

Satisfying the large dependency tree is hard to justify to an HPC admin, when the program could run quite easily without one.
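
To make the request concrete, here is a rough sketch of what a headless entry point might look like. Everything in it is an assumption: the program name `graal-cli`, the flag names, and the default values are invented for illustration. The three parameters mirror the call `pyr.build_and_filter(self.base_folder, self.size_pyramid, self.factor)` that appears in tracebacks reported in other issues.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for a GUI-free pyramid build."""
    parser = argparse.ArgumentParser(
        prog="graal-cli",  # hypothetical name, no such program ships with GRAAL
        description="Sketch of a headless front end (illustration only).",
    )
    parser.add_argument("base_folder", help="folder holding the input files")
    # Defaults below are placeholders, not values taken from GRAAL.
    parser.add_argument("--size-pyramid", type=int, default=4,
                        help="number of pyramid levels")
    parser.add_argument("--factor", type=int, default=3,
                        help="sub-sampling factor between levels")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # A real port would call something like
    # pyr.build_and_filter(args.base_folder, args.size_pyramid, args.factor) here.
    return args
```

For example, `main(["/data/run1", "--factor", "2"])` parses the arguments without opening any window.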

How to download graal.zip?

Hi!

I have read the GRAAL paper. It is great work, and I want to try the software on my data. However, I couldn't find where to download graal.zip after going through the paper and this repository. I feel the README is too brief to use this tool. Also, where are the PDF files, which might be very useful?

Best wishes!

Bioconda

Hi,
would you be able to create a bioconda package for your tool?

Thank you in advance.

Michal

AttributeError: 'int' object has no attribute 'astype'

When I run GRAAL, I receive the following error:

Processing...
Description: convert dense file to COO sparse data.
Done.
start filtering
nfrags =  [95581]
n init frags =  [95581]
mean sparsity =  0.0021264316
std sparsity =  0.0022989216
max_sparsity =  0.059509736
cleaning : start
number of fragments to remove =  0
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "main_window.py", line 85, in run
    pyramid = pyr.build_and_filter(self.base_folder, self.size_pyramid, self.factor)
  File "/home/benedikt/Python/GRAAL/pyramid_sparse.py", line 69, in build_and_filter
    current_abs_fragments_contacts, pyramid_0)
  File "/home/benedikt/Python/GRAAL/pyramid_sparse.py", line 756, in remove_problematic_fragments
    p.render(pt ,'step %s\nProcessing...\nDescription: removing bad fragments.' % step)
  File "/home/benedikt/Python/GRAAL/progressbar.py", line 61, in render
    self.progress = (bar_width * percent.astype(np.int)) / 100
AttributeError: 'int' object has no attribute 'astype'

The input data has been generated using the HiC-Box (thanks again for your help there).

Most likely unrelated: the stdout is spammed with this message as well:

*** BUG ***
In pixman_region32_init_rect: Invalid rectangle passed
Set a breakpoint on '_pixman_log_error' to debug
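
The traceback shows that `percent` arrived as a plain Python `int`, which has no `.astype` method. A minimal sketch of a conversion that tolerates both plain numbers and NumPy scalars is below. The helper name `bar_fill` is hypothetical, and this is not necessarily how the maintainers would patch `progressbar.py`.

```python
def bar_fill(bar_width, percent):
    """Return how many cells of a bar_width-wide bar are filled at `percent`."""
    # int() accepts Python ints and floats as well as NumPy scalars,
    # whereas .astype() exists only on NumPy objects.
    return (bar_width * int(percent)) // 100
```

The `//` operator keeps the integer division that the original expression performed under Python 2.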

GRAAL for diploid assemblies?

Hi,

I wonder if GRAAL will fit my genome project.
I have a plant genome assembly with the following features:

  • estimated genome size: 2.6 Gb, diploid organism, no recent WGD;
  • total assembly size: 4.5 Gb, scf N50 3 Mb, scf N80 1 Mb, 1.4% Ns;
  • BUSCO genes: 98% present, >70% in two copies
    It is indeed a diploid assembly.

I wonder if GRAAL can use allelic variation to produce phased pseudochromosome sequences.
By collinearity I am able to assign 80% of the sequence to chromosomes (of a closely-related species), but I have pairs of scaffolds at each locus. I would like to split the pairs in the two allelic genomes in a phased fashion. Would GRAAL work with this?
Thanks,

Dario

Documentation is insufficient to get this software running.

The documentation you have provided is incomplete.

The README refers to data files, which I assume are examples, but these are nowhere to be found, either on GitHub or in your paper's supplementary data.

When I try to run this program using my own fasta sequences, I get the following error when hitting "build" after selecting a fasta file:

Traceback (most recent call last):
  File "main_window.py", line 353, in OnReturn
    self.main_window.pyramid = self.pyramid
AttributeError: 'LoaderWindow' object has no attribute 'pyramid'
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "main_window.py", line 89, in run
    pyramid = pyr.build_and_filter(self.base_folder, self.size_pyramid, self.factor)
  File "/Users/foobar/git/GRAAL/pyramid_sparse.py", line 36, in build_and_filter
    build(base_folder, init_size_pyramid, factor, min_bin_per_contig,)
  File "/Users/foobar/git/GRAAL/pyramid_sparse.py", line 168, in build
    shutil.copyfile(contig_info,current_contig_info)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: u'/Users/foobar/git/GRAAL/workdir/info_contigs.txt'

Your code is trying to copy a file that doesn't yet exist. If this is some sort of guide file that I need to create beforehand, it's necessary to tell me what goes into it and where to put it. Perhaps this is a hanging legacy of development, where you've always had this dependency satisfied but were unaware that it existed.

I am also only guessing which sequences are meant to be selected at this point. Is it the contigs? I would have a better idea if the next window were available, but GRAAL won't open until the pyramid is created.

So, please specify the file formats for:

  • info_contigs.txt
  • abs_fragments_contacts_weighted.txt

Or explain how these can be satisfied, as I cannot get past this step.
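
Until the formats are documented, a pre-flight check could at least turn the late IOError into an early, readable message. The sketch below assumes that both files named above must sit directly in the working folder, which the traceback suggests for info_contigs.txt but which is unconfirmed for the other file. `missing_inputs` and `REQUIRED_INPUTS` are hypothetical names.

```python
import os

# File names taken from this issue; their expected location is an assumption.
REQUIRED_INPUTS = ("info_contigs.txt", "abs_fragments_contacts_weighted.txt")


def missing_inputs(base_folder, required=REQUIRED_INPUTS):
    """Return the required file names that are absent from base_folder."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(base_folder, name))
    ]
```

A caller could run this before starting the build and refuse to continue, listing the missing names, whenever the returned list is not empty.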

Problems running the GUI

Hi there,

I encountered several problems running the graphical interface. I don't know whether they are version-related, but if my proposals don't crash the tool in your setup, they might be useful for others that have a setup similar to mine.

  • from PIL import Image instead of import Image
  • Remove last pipe character | from wcd in OnLoadFasta function in main_window.py
  • terminal.COLUMNS was None in progressbar.py. I just replaced it with an arbitrary number.
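
For the last workaround, a fallback can replace the arbitrary number. This is a minimal sketch, assuming Python 3 (where `shutil.get_terminal_size` is available); the helper name `terminal_columns` is hypothetical and is not part of progressbar.py.

```python
import shutil


def terminal_columns(reported, default=80):
    """Return a usable terminal width even when the reported value is None."""
    if reported:
        return int(reported)
    # Query the environment or terminal, falling back to `default` columns
    # when no size can be determined (for example, when output is piped).
    return shutil.get_terminal_size(fallback=(default, 24)).columns
```

Passing `terminal.COLUMNS` as `reported` would keep the detected width when there is one and fall back only when it is missing.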

At that point I ran into problems I couldn't fix as easily.
I hope this is useful to someone.

LaunchError

I am getting an error when running GRAAL saying "cuLaunchKernel failed: too many resources requested for launch". The full message is as follows:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "main_window.py", line 427, in run
    gl_window.start_EM()
  File "/home/GRAAL/main_gl.py", line 215, in start_EM
    self.simulation.sampler.modify_gl_cuda_buffer(0, self.dt)
  File "/home/GRAAL/cuda_lib_gl.py", line 1749, in modify_gl_cuda_buffer
    block=block
    , grid=grid
    )
  File "/usr/local/lib/python2.7/dist-packages/pycuda-2016.1.2-py2.7-linux-x86_64.egg/pycuda/driver.py", line 402, in function_call
    func._launch_kernel(grid, block, arg_buf, shared, None)
LaunchError: cuLaunchKernel failed: too many resources requested for launch
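
This CUDA error commonly means that the requested block size needs more registers or shared memory than the GPU provides for that kernel, although the cause has not been confirmed here. A minimal sketch of clamping a one-dimensional launch to a per-kernel limit follows. `launch_config` is a hypothetical helper, not GRAAL code. In PyCUDA the limit for a compiled kernel can be read with `func.get_attribute(pycuda.driver.function_attribute.MAX_THREADS_PER_BLOCK)`.

```python
def launch_config(n_items, requested_block, max_threads_per_block):
    """Return (block, grid) tuples covering n_items without exceeding the limit."""
    # Never ask for more threads per block than the kernel supports.
    block_x = max(1, min(requested_block, max_threads_per_block))
    # Ceiling division so that every item is covered by some thread.
    grid_x = (n_items + block_x - 1) // block_x
    return (block_x, 1, 1), (grid_x, 1)
```

Because a smaller block raises the number of blocks, the kernel itself still has to guard against thread indices beyond `n_items`.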

No documentation on data preparation

After fixing several bugs (see pull request), I'm able to successfully run GRAAL on the test data set (trichoderma). But the README is opaque when it comes to formatting and loading my own data set. The two biggest problems are:

  1. "(see start_graal.pdf and pending_graal.pdf)". Where are these files? They are referenced multiple times but I don't see them.

  2. "A pyramid of contact matrices, P = {M0, M1, ..., Mk}, is a data structure representing the 3C/HiC data at different scales." OK fine. But am I supposed to generate this data structure myself? How are the directory and/or files supposed to be structured? Can GRAAL do it for me? No guidance is provided.
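
To illustrate the idea behind the quoted definition, the sketch below builds a multi-scale stack of matrices by summing square blocks of the level below. It is only a conceptual illustration under assumed conventions (list-of-lists matrices, a fixed integer factor). It does not reproduce GRAAL's on-disk layout or its actual binning code, and `coarsen` and `build_pyramid` are hypothetical names.

```python
def coarsen(matrix, factor):
    """Sum factor x factor blocks of a square contact matrix (list of lists)."""
    n = len(matrix)
    m = (n + factor - 1) // factor  # size of the coarser matrix
    out = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            # Each fine-scale cell contributes to exactly one coarse cell.
            out[i // factor][j // factor] += matrix[i][j]
    return out


def build_pyramid(m0, levels, factor):
    """Return [M0, M1, ..., Mk] where each level coarsens the previous one."""
    pyramid = [m0]
    for _ in range(levels - 1):
        pyramid.append(coarsen(pyramid[-1], factor))
    return pyramid
```

With a 4 x 4 matrix of ones, three levels, and a factor of 2, the levels have sizes 4, 2, and 1, and the total contact count is the same at every level.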

I'm submitting a paper in the next few months and I'd love to use GRAAL and cite your work, but the documentation needs to be improved if I'm going to be able to do that.

load pyramid crash

After huge efforts to make the GUI work, we tried to run the program with the sample files but it crashes when loading the pyramid after hitting the build pyramid button. It reports a rather cryptic error related to the h5py module. Here is the traceback:

<HDF5 file "pyramid.hdf5" (mode r+)>
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "main_window.py", line 90, in run
    lev = pyr.level(pyramid, 2)
  File "/home/jtena/Desktop/graal/pyramid_sparse.py", line 1195, in __init__
    self.load_data(pyramid)
  File "/home/jtena/Desktop/graal/pyramid_sparse.py", line 1209, in load_data
    self.n_frags = np.copy(pyramid.data[str(self.level)]['nfrags'][0])
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-build-uVX5Nb/h5py/h5py/_objects.c:2579)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-build-uVX5Nb/h5py/h5py/_objects.c:2538)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/dataset.py", line 384, in __getitem__
    new_dtype = readtime_dtype(self.id.dtype, names)
  File "/usr/local/lib/python2.7/dist-packages/h5py/_hl/dataset.py", line 370, in readtime_dtype
    raise ValueError("Field names only allowed for compound types")
ValueError: Field names only allowed for compound types
