Comments (7)

zingale commented on July 28, 2024

I think the failures started with this AMReX change:

commit 5dfb0400581e0e2deb7bb0dc11b8f7efb14c8d17
Author: Weiqun Zhang
Date:   Mon Jul 24 20:43:04 2023 -0700

    Disable Managed Memory for The_Arena by default. (#3438)
    
    It used to be that The_Arena was managed for CUDA and SYCL, but not for
    HIP. The users can still turn it on with
    `amrex.the_arena_is_managed=1`. They can also use The_Managed_Arena
    explicitly.

from maestroex.

zingale commented on July 28, 2024

In particular, it looks like we use Gpu::ManagedVector a lot in BaseState.H, so we need to use the managed arena.

biboyd commented on July 28, 2024

Okay, running with amrex.the_arena_is_managed=1 fixes the main issue, and the test now initializes correctly.

However, there is a new issue when we try to advance a timestep:

Timestep 0 starts with TIME = 0 DT = 0.0004741795139

Cell Count:
Level 0, 1327104 cells
inner sponge: r_sp      , r_tp      : 186468750, 224718750
<<< STEP 1 : react state >>>
<<< STEP 2 : make w0 >>>
<<< STEP 3 : create MAC velocities >>>
MLMG: Initial rhs               = 2728631.973
MLMG: Initial residual (resid0) = 2728631.973
MLMG: Final Iter. 1 resid, resid/bnorm = -1.797693135e+308, -6.588257971e+301
MLMG: Timers: Solve = 0.175576918 Iter = 0.145792589 Bottom = 0.00399682
<<< STEP 4 : advect base >>>
            :  density_advance >>>
            :   tracer_advance >>>
Erroneous arithmetic operation
See Backtrace.0 file for details

I'll look into this further to see exactly where it is breaking.
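
A side note on the log above: the residual that MLMG prints, -1.797693135e+308, matches the most negative finite double (-DBL_MAX) to the printed precision, and the resid/bnorm value is roughly that number divided by the initial rhs. Since a norm cannot be negative, this is not a meaningful residual. The sketch below checks the match in plain C++. The helper names `is_lowest_double` and `format_like_log` are hypothetical, and the 10-significant-digit format is an assumption based on how the values appear in the log; what the value says about the solver itself is not established here.

```cpp
#include <cstdio>
#include <limits>
#include <string>

// True when x is exactly the most negative finite double (-DBL_MAX).
inline bool is_lowest_double(double x) {
    return x == std::numeric_limits<double>::lowest();
}

// Format a value with 10 significant digits.
// Assumption: this mirrors the precision seen in the MLMG log lines.
inline std::string format_like_log(double x) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.10g", x);
    return std::string(buf);
}
```

Calling `format_like_log(std::numeric_limits<double>::lowest())` gives the same string as the residual in the log, which is a quick way to recognize this value in other output.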

ajnonaka commented on July 28, 2024

The Backtrace is pointing to MaestroBaseStateGeometry.cpp:55.
When I build an executable with DEBUG=FALSE, it happily continues running.
A DEBUG=TRUE executable takes so long to get there (it gets stuck in the nodal solver) that I gave up and built with TEST=TRUE, which immediately hit the "Erroneous arithmetic operation" error.

ajnonaka commented on July 28, 2024

I found it: there is a race condition in SlopeZ. Near certain physical boundaries, the code uses slopes computed in neighboring cells, which haven't necessarily been computed yet when running on a GPU. Interestingly, SlopeX and SlopeY had already been fixed. I'll work on a PR.
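
To make the failure mode concrete, here is a minimal sketch in plain C++ (not the MAESTROeX or AMReX code). It assumes a hypothetical 1D slope kernel in which the two boundary cells copy the slope of their interior neighbor; the names `centered_slope`, `slope_cell_racy`, `slope_cell_safe`, `run_in_order`, and `has_unset` are all illustrative. A GPU launch gives no ordering guarantee between cells, which is mimicked here by running the per-cell kernel in a caller-chosen order over an output array prefilled with NaN. The safe variant recomputes the neighbor's slope from the input data instead of reading the output array; whether #402 takes this approach is not stated in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Centered-difference slope for an interior cell i (1 <= i <= n-2).
inline double centered_slope(const std::vector<double>& s, std::size_t i) {
    return 0.5 * (s[i + 1] - s[i - 1]);
}

// Racy per-cell kernel: boundary cells read a neighbor's entry in the
// OUTPUT array, which may not have been written yet.
inline void slope_cell_racy(const std::vector<double>& s,
                            std::vector<double>& slope, std::size_t i) {
    const std::size_t n = s.size();
    if (i == 0) {
        slope[i] = slope[1];
    } else if (i == n - 1) {
        slope[i] = slope[n - 2];
    } else {
        slope[i] = centered_slope(s, i);
    }
}

// Safe per-cell kernel: boundary cells recompute the neighbor's slope
// from the INPUT array, so the result is independent of execution order.
inline void slope_cell_safe(const std::vector<double>& s,
                            std::vector<double>& slope, std::size_t i) {
    const std::size_t n = s.size();
    if (i == 0) {
        slope[i] = centered_slope(s, 1);
    } else if (i == n - 1) {
        slope[i] = centered_slope(s, n - 2);
    } else {
        slope[i] = centered_slope(s, i);
    }
}

// Run a per-cell kernel in the given order, standing in for the
// unspecified order in which GPU threads execute.
template <typename Kernel>
std::vector<double> run_in_order(const std::vector<double>& s,
                                 const std::vector<std::size_t>& order,
                                 Kernel kernel) {
    assert(s.size() >= 3);
    // NaN marks "not computed yet".
    std::vector<double> slope(s.size(),
                              std::numeric_limits<double>::quiet_NaN());
    for (std::size_t i : order) {
        kernel(s, slope, i);
    }
    return slope;
}

// True if any cell ended up holding a value that was never computed.
inline bool has_unset(const std::vector<double>& slope) {
    for (double v : slope) {
        if (std::isnan(v)) return true;
    }
    return false;
}
```

With `s = {0, 1, 4, 9, 16}`, the safe kernel yields `{2, 2, 4, 6, 6}` for any visiting order, while the racy kernel leaves a NaN in cell 0 when cells are visited as 0..4 and in cell 4 when they are visited as 4..0. A build with floating-point exception trapping enabled would flag the first use of such a value, which is consistent with the error only showing up in the DEBUG and TEST builds above.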

ajnonaka commented on July 28, 2024

@biboyd, please test it with #402; this should fix it.

biboyd commented on July 28, 2024

Everything seems to work now, thanks for all the help!
