geodynamics / sw4lite

Testing numerical kernels in SW4

License: Other


sw4lite's Introduction

Sw4lite

Sw4lite is a bare-bones version of SW4 intended for testing performance optimizations in a few important numerical kernels of SW4.

To build

The Makefiles are suited for our systems at LLNL and LBNL; you will have to modify them to suit your system.

Type:

make

to build the code with OpenMP. The executable will be named optimize_mp_hostname/sw4lite.

A debug version with OpenMP can be built by:

make debug=yes

which will be located at debug_mp_hostname/sw4lite.

To build with only C code (no Fortran) and with OpenMP, type:

make ckernel=yes

The executable will be optimize_mp_c_hostname/sw4lite.

To build without OpenMP, type:

make openmp=no

The executable will be optimize_hostname/sw4lite.

The CUDA version is built by:

make -f Makefile.cuda

and the executable will be under optimize_cuda_hostname/sw4lite.

More options are described in the Makefile.

An experimental CMake build is available for the CUDA version:

mkdir build;
cd build;
cmake ..; # optionally add -DCMAKE_PREFIX_PATH=$PWD/../../lapack_build/ if lapack is not found by default.
make;

To run

To run sw4lite with OpenMP threading, set the number of threads per MPI task through the environment variable OMP_NUM_THREADS, e.g., in csh or tcsh:

setenv OMP_NUM_THREADS 4

In bash or other POSIX-style shells, the equivalent is export OMP_NUM_THREADS=4.

An example input file is provided under tests/pointsource/pointsource.in. This case solves the elastic wave equation for a single point source in a whole space or a half space. The input file is given as an argument to the executable, as in the example:

mpirun -np 16 sw4lite pointsource.in

Output from a run is provided at tests/pointsource/pointsource.out. For this point source example, the analytical solution is known. The error is printed at the end:

Errors at time 0.6 Linf = 0.569416 L2 = 0.0245361 norm of solution = 3.7439

When modifying the code, it is important to verify that these numbers have not changed.
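One way to automate this check is sketched below in Python. The script is not part of sw4lite: the names parse_errors and matches_reference and the relative tolerance are assumptions made for illustration, and the reference values are the ones quoted above.

```python
import re

# Reference values copied from the expected output line in this README.
REFERENCE = {"Linf": 0.569416, "L2": 0.0245361, "norm": 3.7439}

# Matches a line such as:
# Errors at time 0.6 Linf = 0.569416 L2 = 0.0245361 norm of solution = 3.7439
ERROR_LINE = re.compile(
    r"Errors at time\s+(\S+)\s+Linf\s*=\s*(\S+)\s+L2\s*=\s*(\S+)\s+"
    r"norm of solution\s*=\s*(\S+)"
)


def parse_errors(text):
    """Return Linf, L2 and the solution norm from the last error line in text."""
    matches = ERROR_LINE.findall(text)
    if not matches:
        raise ValueError("no 'Errors at time' line found")
    _time, linf, l2, norm = matches[-1]
    return {"Linf": float(linf), "L2": float(l2), "norm": float(norm)}


def matches_reference(values, rel_tol=1e-6):
    """True if every value agrees with REFERENCE within rel_tol (an assumed tolerance)."""
    return all(
        abs(values[key] - REFERENCE[key]) <= rel_tol * abs(REFERENCE[key])
        for key in REFERENCE
    )


sample = "Errors at time 0.6 Linf = 0.569416 L2 = 0.0245361 norm of solution = 3.7439"
print(matches_reference(parse_errors(sample)))
```

In practice, the text passed to parse_errors would be the captured standard output of a run. The tolerance may need loosening if different compilers or thread counts change the last printed digits.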

Some timings are also output. The average execution times (in seconds) over all MPI processes are reported as follows:

Total execution time for the time stepping loop,
Communication between MPI tasks (BC comm),
Imposing boundary conditions (BC phys),
Evaluating the difference scheme for divergence of the stress tensor (Scheme),
Evaluating supergrid damping terms (Supergrid), and
Evaluating the forcing functions (Forcing).

The code under tests/testil is a standalone single-core program that only exercises the computational kernel (Scheme).

sw4lite's Issues

Cmake build auto fetch lapack from github

The CMake build now (after #16) complains if MPI or LAPACK is not found. It would be even better if it downloaded LAPACK automatically.
A couple of approaches:

Add https://github.com/Reference-LAPACK/lapack as git submodule
Use cmake to fetch lapack

Lack of Explicit Fortran Module Dependencies Causes Race Condition

While making an initial Spack package (spack/spack#5917), I ran into issues building the Fortran version: some files would occasionally be compiled before type_defs.f90 had been used to generate its .mod file. It happened maybe one in every ten times (and I have not been able to reproduce it in a plain terminal with make -j), so I disabled the Fortran version and built only the C kernels.

I am not an expert on Fortran, but a quick search suggests this is a common issue, since Make has no knowledge of these module dependencies.

ERROR: developer option corder, must be zero when fortran routines are used

Hi,
I am new to this proxy app.
I built it according to the readme and tried to run it with:

mpirun -n 4 ./sw4lite ../tests/pointsource/pointsource.in

It reported:

ERROR: developer option corder, must be zero when fortran routines are used

I modified tests/pointsource/pointsource.in to set corder=0, and the run then gives the same output as the README:

...
Errors at time 0.6 Linf = 0.569416 L2 = 0.0245361 norm of solution = 3.7439

Am I doing this right? What is corder?

Thanks in advance.

Regards,
Chen

Suspicious array indices in device-routines.C

When compiling sw4lite with a clang-based compiler for AMD GPUs, I'm getting warnings for device-routines.C, lines 1841 and 1843, e.g.:

src/device-routines.C:1841:26: warning: array index 3 is past the end of the array (which contains 3 elements)
      [-Warray-bounds]
            (qum[c][4]-2*qum[c][3]+sum(c,ith,jth,k  ))
                         ^      ~
src/device-routines.C:1744:3: note: array 'qum' declared here
  float_sw4 qu[DIAMETER][3], qum[DIAMETER][3];
  ^

Since DIAMETER has the value 5 and c is at most 3 in the loop, these accesses do not run past the end of the storage of the whole array, but I can't tell whether this is an actual mistake or whether something clever is going on. Could you please double-check, and perhaps add a comment if the code is in fact correct?

The relevant lines were added by Anders Petersson in Oct last year:

[rvanoo@snell:~/repos/sw4lite] $ git blame -L 1840,+5 src/device-routines.C
^dfbecf0 (Anders Petersson 2017-10-19 16:07:10 -0700 1840)             -rho(i,j,k+1)*dcz(k+1)*
^dfbecf0 (Anders Petersson 2017-10-19 16:07:10 -0700 1841)             (qum[c][4]-2*qum[c][3]+sum(c,ith,jth,k  ))
^dfbecf0 (Anders Petersson 2017-10-19 16:07:10 -0700 1842)             +2*rho(i,j,k)*dcz(k)  *
^dfbecf0 (Anders Petersson 2017-10-19 16:07:10 -0700 1843)             (qum[c][3]-2*sum(c,ith,jth,k  )+qum[c][1])
^dfbecf0 (Anders Petersson 2017-10-19 16:07:10 -0700 1844)             -rho(i,j,k-1)*dcz(k-1)*

[Edit] I just noticed that the commit mentioned by 'git blame' above was the initial commit.
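To make the index arithmetic behind the warning concrete, here is a small Python sketch of where qum[c][4] would land in a row-major 5 x 3 array. It only computes offsets and does not touch the sw4lite code. The helper name flat_offset is hypothetical, and c is assumed to start at 0. Note that C and C++ formally treat indexing past the end of a row as undefined behavior even when the resulting address lies inside the enclosing array, so the sketch does not settle whether the original code is correct.

```python
ROWS, COLS = 5, 3  # qum[DIAMETER][3] with DIAMETER = 5, as stated in the issue


def flat_offset(row, col):
    """Row-major offset of element [row][col] in a ROWS x COLS array."""
    return row * COLS + col


total = ROWS * COLS
for c in range(4):  # c is at most 3 per the issue; starting at 0 is an assumption
    off = flat_offset(c, 4)
    alias_row, alias_col = divmod(off, COLS)
    print(
        f"qum[{c}][4] -> offset {off} of {total}, "
        f"same slot as qum[{alias_row}][{alias_col}]"
    )
```

Under these assumptions every offset stays below 15, and qum[c][4] shares a slot with qum[c+1][1]. This is consistent with the observation that the accesses remain inside the array's storage, but it leaves open whether that aliasing is intended.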

Incorrect output for CUDA version?

When building the latest CUDA version, and running it with the "pointsource.in" example input, as described in README.md, the expected output is

Errors at time 0.6 Linf = 0.569416 L2 = 0.0245361 norm of solution = 3.7439

Instead, I'm getting

Errors at time 0.6 Linf = 3.7439 L2 = 1.01524 norm of solution = 3.7439

(That is, Linf and L2 differ, while the norm of the solution is identical.)

I interpret the statement in README.md "When modifying the code, it is important to verify that these numbers have not changed" to say that all three values should match, and they don't. Does this indicate an error in the code?
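As a rough aid to reading the two lines, the Python sketch below divides the reported Linf error by the reported norm of the solution. It assumes that the two quantities are measured in comparable norms, which the README does not state, and the helper name relative_linf is hypothetical.

```python
# Values copied from the two output lines quoted above.
expected = {"Linf": 0.569416, "L2": 0.0245361, "norm": 3.7439}
observed = {"Linf": 3.7439, "L2": 1.01524, "norm": 3.7439}


def relative_linf(values):
    """Linf error divided by the reported norm of the solution."""
    return values["Linf"] / values["norm"]


print(f"expected relative Linf error: {relative_linf(expected):.3f}")
print(f"observed relative Linf error: {relative_linf(observed):.3f}")
```

In the observed output the Linf error equals the reported norm of the solution, so under the stated assumption the error is as large as the solution itself, whereas the expected output has a ratio well below one.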
