pinocchio's Issues

Heap buffer overflow when creating map for fragmentation

Hello,

As discussed in our online meeting last week, we would like to use the new Pinocchio version 5 (currently in the fivedotzero branch) for our simulations related to the Square Kilometre Array, but we find it to be less stable than version 4.1.3 (current master).

On the HPC cluster, the crashes are not 100% reproducible, but they tend to happen more often in large runs where we (naturally) max out the memory per node. This instability makes us hesitant to use v5 instead of v4.

I have noticed that, when I turn on the Address Sanitizer at compile time and run the small example that ships with Pinocchio (in the example folder) locally, it flags parts of the code where crashes often occur in the large runs on the HPC cluster. So I think it is worth taking a closer look.

In the interest of reproducibility, I'm attaching the Dockerfile and Makefile that I used to diagnose possible memory errors. Remove the .txt ending after downloading those files; GitHub wouldn't let me upload them without the extension. With the fivedotzero branch checked out, put them into the project root folder.

Build Pinocchio inside the Docker container with:

docker build --tag pinocchio_v5 .

Then run the example with:

docker run --interactive --tty --rm --volume $(pwd):/cwd --workdir /cwd/example pinocchio_v5 pinocchio parameter_file

This Docker container is very similar to the one in which I deploy Pinocchio on the cluster, via Sarus. It uses an MPI implementation (MPICH 3.1.4) that is ABI-compatible with the one installed natively (i.e. outside the container) on the cluster itself. The Makefile overrides the one in the src folder, and ultimately just compiles with the debug flags that include the Address Sanitizer (-fsanitize=address).

Here is the (shortened) output from running the example:

❯ docker run --interactive --tty --rm --volume $(pwd):/cwd --workdir /cwd/example pinocchio_v5 pinocchio parameter_file
[Wed Oct 25 2023 08:58:38] This is pinocchio V5.0, running on 1 MPI tasks

This version uses 3LPT displacements
Radiation is included in the Friedmann equations
Ellipsoidal collapse will be computed as Monaco (1995)

Reading parameters from file parameter_file
Flag for this run: example

…

[Wed Oct 25 2023 08:59:39] Storing velocities
[Wed Oct 25 2023 08:59:39] Done computing velocities, cpu time = 0.179195 s
[Wed Oct 25 2023 08:59:40] Number of collapsed particles to z=0: 3343409
[Wed Oct 25 2023 08:59:40] Finishing fmax, total fmax cpu time =      55.598078
                 IO       :       0.000000 (     55.598077 total time without I/O)
                 FFT      :      15.093578
                 COLLAPSE :      25.301670

[Wed Oct 25 2023 08:59:40] Second part: fragmentation of the collapsed medium
[Wed Oct 25 2023 08:59:40] Task 0 reallocated memory for 1.117586 Gb
[Wed Oct 25 2023 08:59:40] Creating map of needed particles
=================================================================
==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f7563d580b0 at pc 0x7f75c25d4681 bp 0x7fff09fa0c30 sp 0x7fff09fa03e0
WRITE of size 1000000 at 0x7f7563d580b0 thread T0
    #0 0x7f75c25d4680 in __interceptor_memset ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:799
    #1 0x564414111fd3 in create_map /source/Pinocchio/fragment.c:801
    #2 0x56441410e13b in fragment /source/Pinocchio/fragment.c:299
    #3 0x56441410de20 in fragment_driver /source/Pinocchio/fragment.c:135
    #4 0x5644140d5d77 in main /source/Pinocchio/pinocchio.c:212
    #5 0x7f75c1d3d1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x271c9)
    #6 0x7f75c1d3d284 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27284)
    #7 0x5644140d5820 in _start (/usr/bin/pinocchio+0x19820)

0x7f7563d580b0 is located 0 bytes to the right of 1199999152-byte region [0x7f751c4ef800,0x7f7563d580b0)
allocated by thread T0 here:
    #0 0x7f75c26448d5 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:85
    #1 0x564414106171 in reallocate_memory_for_fragmentation /source/Pinocchio/allocations.c:555
    #2 0x56441410de10 in fragment_driver /source/Pinocchio/fragment.c:132
    #3 0x5644140d5d77 in main /source/Pinocchio/pinocchio.c:212
    #4 0x7f75c1d3d1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x271c9)

SUMMARY: AddressSanitizer: heap-buffer-overflow ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:799 in __interceptor_memset
Shadow bytes around the buggy address:
  0x0fef2c7a2fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fef2c7a2fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fef2c7a2fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fef2c7a2ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fef2c7a3000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0fef2c7a3010: 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa fa
  0x0fef2c7a3020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0fef2c7a3030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0fef2c7a3040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0fef2c7a3050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0fef2c7a3060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==1==ABORTING

Here, the Address Sanitizer outright aborts the run due to a heap buffer overflow when it gets to this piece of code:

Pinocchio/src/fragment.c

Lines 800 to 801 in e5d3780

/* sets to 1 all particles in the well-resolved region plus one row for each side (without PBCs) */
memset(frag_map_update, 0, subbox.maplength*sizeof(unsigned int));

By contrast, with v4 (current master) and a few straightforward fixes to make it compile inside that container, it runs until the very end ("Pinocchio done!") and only afterwards reports a few "detected memory leaks", which are not critical.
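For anyone who wants to cross-check the numbers in the sanitizer report above, here is a minimal sketch in C. The helper names (range_fits, overrun_bytes, elements_in) are hypothetical and not part of Pinocchio. It assumes that frag_map_update points somewhere inside the single block returned by reallocate_memory_for_fragmentation, which the trace suggests but which I have not verified in the source. The block is 1199999152 bytes (0x7f7563d580b0 - 0x7f751c4ef800, which also matches the logged 1.117586 Gb), and the memset writes 1000000 bytes, i.e. a maplength of 250000 if unsigned int is 4 bytes. The start offset of the map is not in the trace, so it is left as a parameter.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; none of these exist in Pinocchio. */

/* Returns 1 if the byte range [offset, offset + nbytes) lies entirely
   inside a block of `total` bytes, 0 otherwise. Written to avoid
   overflow in offset + nbytes. */
int range_fits(size_t total, size_t offset, size_t nbytes)
{
    if (offset > total)
        return 0;
    return nbytes <= total - offset;
}

/* Number of bytes by which [offset, offset + nbytes) runs past the end
   of a block of `total` bytes (0 if the range fits). */
size_t overrun_bytes(size_t total, size_t offset, size_t nbytes)
{
    size_t room = (offset > total) ? 0 : total - offset;
    return (nbytes > room) ? nbytes - room : 0;
}

/* Number of elements of size `elem_size` covered by a write of `nbytes`. */
size_t elements_in(size_t nbytes, size_t elem_size)
{
    return nbytes / elem_size;
}
```

With the values from the report, range_fits(1199999152, offset, 1000000) only holds for offset <= 1198999152, so the map would have to start at least 1000000 bytes before the end of the block. A check of this kind just before the memset (using the real base pointer and allocated size) could turn the sporadic crash into a deterministic error message.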
