
Comments (8)

mikolmogorov commented on August 17, 2024

Hi,

Thanks for the report. It looks like there is a rare issue with the hash function for sequence ids: in this case it does not distribute the keys uniformly. I have incorporated a better hash function, which should solve the problem. Please try the updated version from the flye-devel branch. You can either check it out through the git interface or download a zip archive: https://github.com/fenderglass/Flye/archive/flye-devel.zip

Let me know if this helps,

Mikhail
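
Editor's sketch of the effect described above: a hash that leaves structure in the low bits of a key can send most keys to a handful of buckets, while a mixing hash spreads them out. This is not Flye's code. The key layout (ids that are multiples of 256), the identity hash, and the helper names are assumptions made for illustration, and the mixer is the well-known splitmix64 finalizer, not necessarily the function adopted in flye-devel.

```python
from collections import Counter

MASK64 = (1 << 64) - 1


def identity_hash(key: int) -> int:
    """Hypothetical weak hash: returns the key unchanged."""
    return key & MASK64


def splitmix64(key: int) -> int:
    """splitmix64 finalizer, used here as an example of a mixing hash."""
    x = (key + 0x9E3779B97F4A7C15) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def bucket_stats(hash_fn, keys, n_buckets):
    """Return (buckets used, largest bucket size) for a power-of-two table."""
    counts = Counter(hash_fn(k) & (n_buckets - 1) for k in keys)
    return len(counts), max(counts.values())


# Assumed key pattern: ids whose low 8 bits are always zero.
keys = [i * 256 for i in range(10_000)]

weak_used, weak_max = bucket_stats(identity_hash, keys, 1024)
mixed_used, mixed_max = bucket_stats(splitmix64, keys, 1024)

print("identity  :", weak_used, "buckets used, largest bucket", weak_max)
print("splitmix64:", mixed_used, "buckets used, largest bucket", mixed_max)
```

With the identity hash only 4 of the 1024 buckets are ever hit, so the table behaves as if it were nearly full even though it is almost empty. The exception text reported further down the thread is the kind of message one would expect when a hash table is forced to grow in that state.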

from flye.

hermeseduardo commented on August 17, 2024

thanks Mikhail, I tried but the problem is still there. Please let me know if you need additional information from my side.
best,

Hermes


mikolmogorov commented on August 17, 2024

Could you send me the log file and give some information about the dataset: technology, genome size, coverage, etc.?


hermeseduardo commented on August 17, 2024

Sure: PacBio subreads (RSII) in fasta files, coverage about 40X, genome size around 500 Mb. Below is some info about the sbatch settings, followed by the whole log file.

HE

#SBATCH --mem=1000GB
#SBATCH --cpus-per-task=20

/data/esc003/apps/Flye-flye-devel/bin/flye --pacbio-raw pacbio.fasta --genome-size 550m --threads 20 -m 5000 -o flye

[2018-02-24 01:57:36] root: DEBUG: Genome size: 576716800
[2018-02-24 01:57:36] root: INFO: Running Flye 2.3.2-release
[2018-02-24 01:57:36] root: DEBUG: Cmd: /data/esc003/apps/Flye-flye-devel/bin/flye --pacbio-raw /flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta --genome-size 550m --threads 20 -m 5000 -o flye
[2018-02-24 01:57:36] root: INFO: Assembling reads
[2018-02-24 01:57:36] root: DEBUG: -----Begin assembly log------
[2018-02-24 01:57:36] root: DEBUG: Running: flye-assemble -l /flush1/esc003/Flye_cynegetis_assembly/flye/flye.log -t 20 -v 5000 /flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta /flush1/esc003/Flye_cynegetis_assembly/flye/0-assembly/draft_assembly.fasta 576716800 /data/esc003/apps/Flye-flye-devel/flye/resource/asm_raw_reads.cfg
[2018-02-24 01:57:36] DEBUG: Build date: Feb 23 2018 21:50:23
[2018-02-24 01:57:36] DEBUG: Parameters:
[2018-02-24 01:57:36] DEBUG: kmer_size=15
[2018-02-24 01:57:36] DEBUG: kmer_size_big=17
[2018-02-24 01:57:36] DEBUG: big_genome_threshold=50000000
[2018-02-24 01:57:36] DEBUG: maximum_jump=1500
[2018-02-24 01:57:36] DEBUG: maximum_overhang=1500
[2018-02-24 01:57:36] DEBUG: hard_min_coverage_rate=10
[2018-02-24 01:57:36] DEBUG: repeat_coverage_rate=10
[2018-02-24 01:57:36] DEBUG: jump_divergence_rate=2
[2018-02-24 01:57:36] DEBUG: overlap_divergence_rate=5
[2018-02-24 01:57:36] DEBUG: penalty_window=100
[2018-02-24 01:57:36] DEBUG: max_coverage_drop_rate=5
[2018-02-24 01:57:36] DEBUG: chimera_window=100
[2018-02-24 01:57:36] DEBUG: min_reads_in_contig=4
[2018-02-24 01:57:36] DEBUG: max_inner_reads=10
[2018-02-24 01:57:36] DEBUG: max_inner_fraction=0.25
[2018-02-24 01:57:36] DEBUG: max_separation=500
[2018-02-24 01:57:36] DEBUG: tip_length_threshold=20000
[2018-02-24 01:57:36] DEBUG: unique_edge_length=50000
[2018-02-24 01:57:36] DEBUG: min_repeat_res_support=0.5
[2018-02-24 01:57:36] DEBUG: out_paths_ratio=5
[2018-02-24 01:57:36] DEBUG: graph_cov_drop_rate=10
[2018-02-24 01:57:36] DEBUG: coverage_estimate_window=100
[2018-02-24 01:57:36] DEBUG: low_cutoff_warning=1
[2018-02-24 01:57:36] DEBUG: assemble_kmer_sample=1
[2018-02-24 01:57:36] DEBUG: assemble_gap=500
[2018-02-24 01:57:36] DEBUG: repeat_graph_kmer_sample=5
[2018-02-24 01:57:36] DEBUG: repeat_graph_gap=100
[2018-02-24 01:57:36] DEBUG: repeat_graph_max_kmer=500
[2018-02-24 01:57:36] DEBUG: read_align_kmer_sample=1
[2018-02-24 01:57:36] DEBUG: read_align_gap=500
[2018-02-24 01:57:36] DEBUG: read_align_max_kmer=500
[2018-02-24 01:57:36] DEBUG: Running with k-mer size: 17
[2018-02-24 01:57:36] INFO: Reading sequences
[2018-02-24 02:05:51] DEBUG: Mean read length: 8958
[2018-02-24 02:05:51] DEBUG: Estimated coverage: 46
[2018-02-24 02:05:51] INFO: Generating solid k-mer index
[2018-02-24 02:05:51] DEBUG: Hard threshold set to 4
[2018-02-24 02:05:51] DEBUG: Started kmer counting
[2018-02-24 02:08:01] INFO: Counting kmers (1/2):
[2018-02-24 02:15:22] INFO: Counting kmers (2/2):
[2018-02-24 02:33:14] DEBUG: Filtered 1247354 repetitive kmers
[2018-02-24 02:33:14] DEBUG: Estimated minimum kmer coverage: 8, 537665765 unique kmers selected
[2018-02-24 02:33:14] INFO: Filling index table
[2018-02-24 02:34:25] DEBUG: Solid kmers: 537665765
[2018-02-24 02:34:25] DEBUG: Kmer index size: 11555644070
[2018-02-24 02:56:33] INFO: Extending reads
[2018-02-24 03:07:29] DEBUG: Mean read coverage: 9
[2018-02-24 03:07:39] DEBUG: Assembled contig 1
With 7 reads
Start read: +m170808_50191
At position: 2
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 31157
[2018-02-24 03:07:39] DEBUG: Inner: 16 covered: 88 total: 6005312
[2018-02-24 03:07:42] DEBUG: Assembled contig 2
With 13 reads
Start read: -m170713_180044
At position: 1
leftTip: 1 rightTip: 1
Suspicios: 1
Mean extensions: 3
Inner reads: 0
Length: 79404
[2018-02-24 03:07:42] DEBUG: Inner: 82 covered: 200 total: 6005312
[2018-02-24 03:07:44] DEBUG: Assembled contig 3
With 12 reads
Start read: +m170515_163220
At position: 9
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 2
Inner reads: 0
Length: 83051
[2018-02-24 03:07:44] DEBUG: Inner: 124 covered: 306 total: 6005312
[2018-02-24 03:07:45] DEBUG: Assembled contig 4
With 11 reads
Start read: +m170712_152947
At position: 2
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 75947
[2018-02-24 03:07:45] DEBUG: Inner: 196 covered: 429 total: 6005312
[2018-02-24 03:07:49] DEBUG: Assembled contig 5
With 20 reads
Start read: +m170516_157558
At position: 18
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 116272
[2018-02-24 03:07:49] DEBUG: Inner: 312 covered: 658 total: 6005312
[2018-02-24 03:07:58] DEBUG: Assembled contig 6
With 30 reads
Start read: -m170713_72622
At position: 16
leftTip: 1 rightTip: 1
Suspicios: 5
Mean extensions: 3
Inner reads: 0
Length: 157831
[2018-02-24 03:07:58] DEBUG: Inner: 444 covered: 972 total: 6005312
[2018-02-24 03:08:00] DEBUG: Assembled contig 7
With 25 reads
Start read: +m170516_207012
At position: 12
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 5
Inner reads: 0
Length: 170570
[2018-02-24 03:08:00] DEBUG: Inner: 600 covered: 1393 total: 6005312
[2018-02-24 03:08:01] DEBUG: Assembled contig 8
With 22 reads
Start read: -m170713_59169
At position: 13
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 139444
[2018-02-24 03:08:01] DEBUG: Inner: 734 covered: 1720 total: 6005312
[2018-02-24 03:08:04] DEBUG: Assembled contig 9
With 23 reads
Start read: -m170809_48452
At position: 10
leftTip: 1 rightTip: 0
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 147044
[2018-02-24 03:08:04] DEBUG: Inner: 852 covered: 1976 total: 6005312
[2018-02-24 03:08:04] DEBUG: Assembled contig 10
With 22 reads
Start read: -m170712_111658
At position: 8
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 6
Inner reads: 0
Length: 120833
[2018-02-24 03:08:04] DEBUG: Inner: 980 covered: 2388 total: 6005312
[2018-02-24 03:08:08] DEBUG: Assembled contig 11
With 14 reads
Start read: -m170609_42878
At position: 8
leftTip: 1 rightTip: 1
Suspicios: 3
Mean extensions: 5
Inner reads: 0
Length: 81458
[2018-02-24 03:08:08] ERROR: Caught unhandled exception: Automatic expansion triggered when load factor was below minimum threshold
[2018-02-24 03:08:08] ERROR: flye-assemble(_Z16exceptionHandlerv+0x2d) [0x43cc8d]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0x96706) [0x2aaaab277706]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0x96751) [0x2aaaab277751]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0xc1708) [0x2aaaab2a2708]
[2018-02-24 03:08:08] ERROR: /lib64/libpthread.so.0(+0x8744) [0x2aaaab789744]
[2018-02-24 03:08:08] ERROR: /lib64/libc.so.6(clone+0x6d) [0x2aaaaba87aad]
-----------End assembly log------------
[2018-02-24 03:08:24] root: ERROR: Command '['flye-assemble', '-l', '/flush1/esc003/Flye_cynegetis_assembly/flye/flye.log', '-t', '20', '-v', '5000', '/flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta', '/flush1/esc003/Flye_cynegetis_assembly/flye/0-assembly/draft_assembly.fasta', '576716800', '/data/esc003/apps/Flye-flye-devel/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1
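
Editor's sketch: two numbers in the log above can be cross-checked with simple arithmetic. The logged genome size 576716800 equals 550 × 1024², so the "550m" argument appears to have been expanded with a binary multiplier. The estimated coverage of 46 is reproduced if the "total: 6005312" figure counts every read in both orientations; that is an assumption, not something the log states. `parse_size` is a hypothetical helper written for this note, not Flye's parser.

```python
def parse_size(text: str) -> int:
    """Hypothetical helper: expand a size such as '550m' with binary multipliers."""
    multipliers = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return int(float(text[:-1]) * multipliers[suffix])
    return int(text)


genome_size = parse_size("550m")
print(genome_size)  # 576716800, matching "Genome size" in the log

# ASSUMPTION: "total: 6005312" counts forward and reverse copies of each read.
reads = 6005312 // 2
mean_read_length = 8958  # "Mean read length" from the log

coverage = reads * mean_read_length / genome_size
print(int(coverage))  # 46, matching "Estimated coverage" in the log
```

Under these assumptions the reported coverage is consistent with the roughly 40X quoted in the comment, so the input size itself does not look unusual.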


mikolmogorov commented on August 17, 2024

Thanks,

This seems strange: we have never encountered issues with the newer hash function before. It could be machine- or OS-specific. Could you provide the output of "uname -a"? Would it also be possible to run the dataset on a different machine?


hermeseduardo commented on August 17, 2024

sure:
uname -a
Linux XXX 4.4.59-92.17-default #1 SMP Thu Apr 6 14:16:09 UTC 2017 (7bc489d) x86_64 x86_64 x86_64 GNU/Linux


mikolmogorov commented on August 17, 2024

Thanks,

I have not been able to reproduce this problem on our datasets so far, which makes it hard to debug. Would it be possible for you to share the data? That would make debugging much easier. If so, you can write to me at [email protected]


hermeseduardo commented on August 17, 2024

Looks like it had something to do with the installation on the cluster; it seems to be working now (v 2.3.2).

thanks
