Comments (6)

rtobar commented on August 26, 2024

@MatthieuSchaller thanks for the report. I don't think I'll be able to look into this until early next week though. In the meantime, could you please share the data, or push for my cosma application to proceed? I tried to reproduce locally with some of the inputs I had received in the past, but they seem to lack the fields needed by the hydro-enabled code.

from velociraptor-stf.

MatthieuSchaller commented on August 26, 2024

If it helps, there is a smaller test case here:

  • /snap7/scratch/dp004/jlvc76/SWIFT/EoS_tests/swiftsim/examples/EAGLE_low_z/EAGLE_6/eagle_0000.hdf5
  • /snap7/scratch/dp004/jlvc76/SWIFT/EoS_tests/swiftsim/examples/EAGLE_low_z/EAGLE_6/vrconfig_3dfof_subhalos_SO_hydro.cfg

This one crashes about 20s after start, so it might be easier to debug.

Config is: cmake ../ -DVR_USE_HYDRO=ON -DCMAKE_BUILD_TYPE=Debug
Run command line is: stf -C vrconfig_3dfof_subhalos_SO_hydro.cfg -i eagle_0000 -o halos_0000 -I 2

  • Problem happens with gcc or ICC,
  • Running with -DVR_OPENMP=OFF also crashes in the same way,
  • Running without VR_MPI_REDUCE crashes in a different way. There the crash happens when reading in stuff.

rtobar commented on August 26, 2024

Thanks @MatthieuSchaller for the extra information. I managed to reproduce this locally with the data you pointed to above. I hope I can post an update soon with some information.

rtobar commented on August 26, 2024

Running under valgrind revealed an invalid write:

==14074==    by 0x2D2CC6: MPISendReceiveFOFHydroInfoBetweenThreads(Options&, fofid_in*, std::vector<long long, std::allocator<long long> >&, std::vector<float, std::allocator<float> >&, int, int, ompi_communicator_t*&) (mpiroutines.cxx:2697)

Then, in gdb, the reason became clear: the proprecvbuff vector needed to be properly sized before receiving data. Its size should be the same as that of propsendbuff, which in turn should be the size of indicessend times numextrafields.

On a new issue-54 branch I fixed the resizing of proprecvbuff to accommodate all incoming data, and also added an assertion to make sure the sizes of the input vectors are related to each other as expected. After these changes I can run the code to completion. @MatthieuSchaller please confirm that this is working for you too.
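
For illustration only, the sizing rule and the assertion can be sketched roughly as below. This is not the code on the branch: `receive_props` and `raw_receive` are hypothetical names, and `raw_receive` is a plain copy standing in for the MPI receive call so that the snippet compiles without MPI. It also assumes, as stated above, that the receive buffer has the same size as the send buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a raw receive such as an MPI receive call: it
// writes `count` floats through a bare pointer and cannot grow the
// destination container on its own.
inline void raw_receive(const float *incoming, std::size_t count, float *dest)
{
    std::copy(incoming, incoming + count, dest);
}

// Illustrative sketch only; the variable names mirror the comment above, but
// the signature is an assumption and not the actual VELOCIraptor routine.
inline void receive_props(const std::vector<long long> &indicessend,
                          const std::vector<float> &propsendbuff,
                          std::size_t numextrafields,
                          std::vector<float> &proprecvbuff)
{
    // The send buffer is expected to hold numextrafields values per index.
    assert(propsendbuff.size() == indicessend.size() * numextrafields);

    // Size the receive buffer before the raw write. Writing through data()
    // of an undersized vector is the kind of invalid write valgrind reports.
    proprecvbuff.resize(propsendbuff.size());

    raw_receive(propsendbuff.data(), propsendbuff.size(), proprecvbuff.data());
}
```

Calling `receive_props` with two indices, three extra fields and a six-element send buffer leaves `proprecvbuff` with six elements. Dropping the `resize` call would make the copy write past the end of an empty vector.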

I also tried running without VR_MPI_REDUCE and got a crash as well. That looks like a separate issue though, so I'll create a new ticket to keep track of it.

MatthieuSchaller commented on August 26, 2024

I confirm this works both with and without OpenMP (as expected since unrelated). Thanks for tracking this down!

(I am never brave enough to fire up valgrind inside an mpirun call...)

rtobar commented on August 26, 2024

Great! I have now merged this into the master branch, thanks for testing!
