
rajaproxies's Introduction

RAJA Proxy Applications

This project contains a collection of proxy applications written using the RAJA Performance Portability Layer. These applications are examples of how RAJA is used in real codes, and they provide a convenient vehicle for testing features and analyzing performance.

Quick Start

This repository is hosted on GitHub. To clone the repo into your local working space, use the command:

$ git clone --recursive https://github.com/LLNL/RAJAProxies.git 

The --recursive argument is used to download the repository's submodules, RAJA and the build system BLT. After you execute this command, you will see the master branch in the raja-proxies directory.

Then, you can build RAJA and the proxy applications like any other CMake project, provided you have a C++ compiler that supports the C++11 standard. The simplest way to build the code is to do the following in the top-level raja-proxies directory (in-source builds are not allowed!):

$ mkdir build
$ cd build
$ cmake ../
$ make

More details about RAJA configuration options are located in the RAJA User Guide and Tutorial.

The executable for each application will be located in the bin directory of your build space. Each executable name includes the name of the proxy app, its version, and the parallel programming model it uses. To run an application, simply run the desired executable. For example, to run the RAJA version of LULESH v1.0 with the OpenMP backend, execute the following command:

$ ./lulesh-v1.0-RAJA-omp.exe

Proxy Application Information

Information about each available proxy application is available in RAJA_Proxy_Apps.md.

Questions?

If you have any questions about this repo, please send email to [email protected] or contact one of the individuals listed below.

Authors

This repository is maintained by:

Release Information

Please see RELEASE.md for release information for each proxy application.

rajaproxies's People

Contributors

artv3, davidbeckingsale, davideberius, davidpoliakoff, mdavis36, rchen20, rhornung67


rajaproxies's Issues

LULESH v2.0 - CUDA execution results in illegal memory access

I think there's something strange going on which prevents any application with CUDA enabled from running without a memory error during execution.

I'm not sure whether this is a failure of RAJA or of RAJAProxies. I was able to build and run Kripke standalone (v1.2.4) from the Kripke repository just fine, but I got the same error on head. I filed an issue there as well, so this issue is only related to LULESH 2.0 (CUDA).

cmake -C /path/to/host-config.cmake \
  -S . \
  -B build \
  -DCMAKE_C_COMPILER=gcc \
  -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_BUILD_TYPE=Release \
  -DENABLE_OPENMP=Off \
  -DENABLE_CHAI=On \
  -DENABLE_CUDA=On \
  -DCUDA_ARCH=sm_86 \
  -DENABLE_COMD=Off \
  -DENABLE_LULESH_ONE=Off \
  -DENABLE_KRIPKE=Off \
  -DENABLE_EXAMPLES=Off \
  -DENABLE_TESTS=Off

The host config:

set(CMAKE_CXX_FLAGS_RELEASE "-Ofast -finline-functions" CACHE STRING "")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-Ofast -g -finline-functions" CACHE STRING "")
set(CMAKE_CXX_FLAGS_DEBUG "-O0 -g" CACHE STRING "")

set(CUDA_COMMON_OPT_FLAGS -restrict; -gencode=arch=compute_86,code=sm_86; -std c++14; --expt-extended-lambda)
#set(CUDA_COMMON_OPT_FLAGS -restrict; -arch sm_86; -std c++14; --expt-extended-lambda)
#set(CUDA_COMMON_OPT_FLAGS -restrict; -arch compute_86; -std c++14; --expt-extended-lambda)
set(CUDA_COMMON_DEBUG_FLAGS -restrict; -arch compute_86; -std c++14; --expt-extended-lambda)

#set(HOST_OPT_FLAGS -Xcompiler -O3 -Xcompiler -finline-functions -Xcompiler -fopenmp)
set(HOST_OPT_FLAGS -Xcompiler -O3 -Xcompiler -finline-functions -Xcompiler)

if(CMAKE_BUILD_TYPE MATCHES Release)
  set(RAJA_NVCC_FLAGS -O3; ${CUDA_COMMON_OPT_FLAGS}; -ccbin; ${CMAKE_CXX_COMPILER} ; ${HOST_OPT_FLAGS} CACHE LIST "")
elseif(CMAKE_BUILD_TYPE MATCHES RelWithDebInfo)
  set(RAJA_NVCC_FLAGS -g; -G; -O3; ${CUDA_COMMON_OPT_FLAGS}; -ccbin; ${CMAKE_CXX_COMPILER} ; ${HOST_OPT_FLAGS} CACHE LIST "")
elseif(CMAKE_BUILD_TYPE MATCHES Debug)
  set(RAJA_NVCC_FLAGS -g; -G; -O0; ${CUDA_COMMON_DEBUG_FLAGS}; -ccbin; ${CMAKE_CXX_COMPILER} ; -Xcompiler -fopenmp CACHE LIST "")
endif()

The machine has an NVIDIA A6000, which is why the compute capability is set to 86. The system GCC (8.4) is used with the NVIDIA Toolkit installed in the default path (/usr/local/cuda/).

Everything builds as expected with zero failures. The only warnings are of the form "calling a __host__ function from a __host__ __device__ function is not allowed".

LULESH 2.0

$ ./build/bin/lulesh-v2.0-RAJA-cuda.exe
Running problem size 45^3 per domain until completion
Num processors: 1
Total number of elements: 91125

To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options

CUDAassert: an illegal memory access was encountered /path/to/RAJAProxies/tpl/RAJA/include/RAJA/policy/cuda/MemUtils_CUDA.hpp 183
terminate called after throwing an instance of 'std::runtime_error'
  what():  CUDAassert
Aborted (core dumped)

cuda-memcheck output:
========= CUDA-MEMCHECK
========= Invalid __global__ read of size 4
=========     at 0x000000c0 in _ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_
=========     by thread (63,0,0) in block (8,0,0)
=========     Address 0x55b7888abba8 is out of bounds
=========     Device Frame:_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ (_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ : 0xc0)
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/lib/x86_64-linux-gnu/libcuda.so.1 [0x20d6ea]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x6e91b]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0xc4f58]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x2d197]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x16068]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xf3) [0x270b3]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x1710e]
=========
========= Invalid __global__ read of size 4
=========     at 0x000000c0 in _ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_
=========     by thread (62,0,0) in block (8,0,0)
=========     Address 0x55b7888abba0 is out of bounds
=========     Device Frame:_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ (_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ : 0xc0)
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/lib/x86_64-linux-gnu/libcuda.so.1 [0x20d6ea]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x6e91b]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0xc4f58]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x2d197]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x16068]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xf3) [0x270b3]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x1710e]
=========
========= Invalid __global__ read of size 4
=========     at 0x000000c0 in _ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_
=========     by thread (61,0,0) in block (8,0,0)
=========     Address 0x55b7888abb98 is out of bounds
=========     Device Frame:_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ (_ZN4RAJA6policy4cuda4impl18forall_cuda_kernelILm256EPlZN68_INTERNAL_46_tmpxft_0009d47b_00000000_7_lulesh_cuda_cpp1_ii_ee65fee143ApplyAccelerationBoundaryConditionsForNodesEP6DomainEUliE_lEEvT1_T0_T2_ : 0xc0)
=========     Saved host backtrace up to driver entry point at kernel launch time
=========     Host Frame:/lib/x86_64-linux-gnu/libcuda.so.1 [0x20d6ea]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x6e91b]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0xc4f58]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x2d197]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x16068]
=========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xf3) [0x270b3]
=========     Host Frame:./build_newest/bin/lulesh-v2.0-RAJA-cuda.exe [0x1710e]
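The device frames in the cuda-memcheck output are mangled C++ symbols, which makes it hard to see which RAJA kernel and which LULESH routine are involved. A minimal sketch of a demangling helper follows, assuming a GCC or Clang toolchain that ships <cxxabi.h>. The function name demangle is made up for illustration, and nvcc-generated lambda names such as the ones above may not demangle completely with this call.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the input
// unchanged if demangling fails. "demangle" is a hypothetical helper name.
std::string demangle(const char* mangled) {
  int status = 0;
  // __cxa_demangle allocates the result with malloc; the caller must free it.
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) {
    std::free(out);
    return std::string(mangled);
  }
  std::string result(out);
  std::free(out);
  return result;
}

// Small self-check on a simple symbol (not one taken from the log above).
void demo_demangle() {
  assert(demangle("_Z3fooi") == "foo(int)");
}
```

Even without full demangling, the readable fragments of the symbol (forall_cuda_kernel and ApplyAccelerationBoundaryConditionsForNodes) already identify the kernel that performs the invalid read.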

compilation errors if nvcc debugging flags are used

I wanted to use the NVIDIA Visual Profiler, which requires debugging information to be generated, for some performance analysis. So I passed nvcc debugging flags to CMake using the following command line (-DCMAKE_BUILD_TYPE:STRING=Debug alone does not pass "-g -G" to nvcc, which may be another bug):

..path/RAJAProxies.git/buildCuda] cmake -DENABLE_CUDA=on -DENABLE_OPENMP=On -DCMAKE_BUILD_TYPE:STRING=Debug -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DCUDA_NVCC_FLAGS_DEBUG:STRING="-g -G" ../

However, nvcc will generate compilation errors if the -DCUDA_NVCC_FLAGS_DEBUG:STRING="-g -G" is used.

-- Generating /path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-cuda.exe.dir//./lulesh-v2.0-RAJA-cuda.exe_generated_lulesh-cuda.cpp.o
/usr/tce/packages/cuda/cuda-9.2.148/bin/nvcc /path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp -x=cu -c -o /path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-cuda.exe.dir//./lulesh-v2.0-RAJA-cuda.exe_generated_lulesh-cuda.cpp.o -m64 -DUSE_OPENMP -DUSE_CUDA -DUSE_CASE=9 -DUSE_MPI=0 -DLULESH_DEVICE=device -Xcompiler -fopenmp -restrict -arch sm_35 -std c++11 --expt-extended-lambda -ccbin /usr/tcetmp/bin/c++ -g -G -lineinfo -DNVCC -I/usr/tce/packages/cuda/cuda-9.2.148/include -I/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/lulesh-v2.0/RAJA -I/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/tpl/RAJA/include -I/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/tpl/RAJA/include -I/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/tpl/RAJA/tpl/cub
nvcc warning : '--device-debug (-G)' overrides '--generate-line-info (-lineinfo)'
/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(288): error: calling a host function("Domain::p") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(288): error: identifier "Domain::p" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(288): error: calling a host function("Domain::q") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(288): error: identifier "Domain::q" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(516): error: calling a host function("Domain::nodelist") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(516): error: identifier "Domain::nodelist" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(528): error: calling a host function("CollectDomainNodesToElemNodes") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(528): error: identifier "CollectDomainNodesToElemNodes" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(560): error: calling a host function("Domain::nodeElemCount") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(560): error: identifier "Domain::nodeElemCount" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(561): error: calling a host function("Domain::nodeElemCornerList") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(561): error: identifier "Domain::nodeElemCornerList" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(571): error: calling a host function("Domain::fx") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(571): error: identifier "Domain::fx" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(572): error: calling a host function("Domain::fy") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(572): error: identifier "Domain::fy" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(573): error: calling a host function("Domain::fz") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(573): error: identifier "Domain::fz" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(758): error: calling a host function("Domain::nodelist") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(758): error: identifier "Domain::nodelist" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(819): error: calling a host function("Domain::ss") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(819): error: identifier "Domain::ss" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(820): error: calling a host function("Domain::elemMass") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(820): error: identifier "Domain::elemMass" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(832): error: calling a host function("Domain::xd") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(832): error: identifier "Domain::xd" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(833): error: calling a host function("Domain::xd") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(833): error: identifier "Domain::xd" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(834): error: calling a host function("Domain::xd") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(834): error: identifier "Domain::xd" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(835): error: calling a host function("Domain::xd") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(835): error: identifier "Domain::xd" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(836): error: calling a host function("Domain::xd") from a device function(" const") is not allowed

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(836): error: identifier "Domain::xd" is undefined in device code

/path2file/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/lulesh-cuda.cpp(837): error: calling a host function("Domain::xd") from a device function(" const") is not allowed
....
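The errors above say that accessors such as Domain::p are host-only functions being called from device code. One common pattern for making an accessor callable from both sides is a macro that expands to CUDA execution-space qualifiers under nvcc and to nothing elsewhere. The sketch below assumes that pattern. SKETCH_HOST_DEVICE and MiniDomain are hypothetical names and are not taken from LULESH, and whether this is the right fix for this issue is not established here.

```cpp
#include <cassert>

// Expands to CUDA qualifiers when compiled by nvcc, and to nothing when
// compiled by a host-only compiler. Hypothetical macro name.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical stand-in for a Domain-like class; not LULESH's actual Domain.
class MiniDomain {
 public:
  explicit MiniDomain(double* pressure) : m_p(pressure) {}

  // Annotated accessor: callable from host code and, under nvcc, device code.
  SKETCH_HOST_DEVICE double& p(int idx) const { return m_p[idx]; }

 private:
  double* m_p;  // must point to device-accessible memory if used in a kernel
};

// Host-side self-check of the accessor.
void demo_accessor() {
  double buf[3] = {1.0, 2.0, 3.0};
  MiniDomain d(buf);
  d.p(1) = 5.0;
  assert(buf[1] == 5.0);
}
```

Annotating the accessor only addresses the compile-time error. The pointer it dereferences still has to refer to memory the GPU can reach at run time.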

Add CI testing.

It would help to have CI check for any issues caused by updates to the RCU repos. Perhaps a weekly CI job which builds and runs against develop branches of RCU?

Extra -fopenmp in nvcc link line.

Enabling OpenMP on CUDA builds adds an extra -fopenmp to the nvcc link line, causing an unrecognized parameter failure. The most likely cause is that a CMake OpenMP link variable is set and added to the general link line. The current workaround is to disable OpenMP on CUDA builds.

Lulesh Compilation Warnings

Some Lulesh v1 and v2 warnings when building with nvcc 10.1.243 + gcc 8.3.1:

v1:

In function 'T* Allocate(size_t) [with T = double]',
inlined from 'void ApplyMaterialPropertiesForElems()' at /usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v1.0/baseline/lulesh.cpp:2691:37,
inlined from 'void LagrangeElements()' at /usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v1.0/baseline/lulesh.cpp:2774:34,
inlined from 'int main(int, char**)' at /usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v1.0/baseline/lulesh.cpp:2876:20:
/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v1.0/baseline/lulesh.cpp:515:34: warning: argument 1 range [18446744056529682432, 18446744073709551608] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
return static_cast<T *>(malloc(sizeof(T)*size)) ;
~~~~~~^~~~~~~~~~~~~~~~
In file included from /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/bits/std_abs.h:38,
from /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/cmath:47,
from /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/include/c++/8/math.h:36,
from /usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v1.0/baseline/lulesh.cpp:66:
/usr/include/stdlib.h: In function 'int main(int, char**)':
/usr/include/stdlib.h:465:14: note: in a call to allocation function 'void* malloc(size_t)' declared here
extern void *malloc (size_t __size) __THROW __attribute_malloc__ __wur;
^~~~~~
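The range in the v1 warning, [18446744056529682432, 18446744073709551608], matches what sizeof(double) * size produces when a negative 32-bit int count is converted to a 64-bit size_t. That suggests GCC cannot rule out a negative element count at this call site, though the log itself does not state the cause. The sketch below reproduces the arithmetic and shows one possible guarded allocator. The names unchecked_bytes and allocate_checked are hypothetical, and it assumes a 64-bit size_t, a 32-bit int, and an 8-byte double.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

static_assert(sizeof(std::size_t) == 8, "sketch assumes 64-bit size_t");
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");
static_assert(sizeof(double) == 8, "sketch assumes 8-byte double");

// Byte count produced by an unchecked sizeof(T)*count when a signed count
// is implicitly converted to size_t (wraps modulo 2^64).
template <typename T>
std::size_t unchecked_bytes(int count) {
  return sizeof(T) * static_cast<std::size_t>(count);
}

// Hypothetical guarded allocator: rejects non-positive counts and products
// that would overflow size_t. Returns nullptr on rejection or malloc failure.
template <typename T>
T* allocate_checked(int count) {
  if (count <= 0) return nullptr;
  const std::size_t n = static_cast<std::size_t>(count);
  if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) return nullptr;
  return static_cast<T*>(std::malloc(sizeof(T) * n));
}

// Checks that the two ends of the warning's range correspond to counts of
// -1 and INT_MIN for T = double.
void demo_alloc_range() {
  assert(unchecked_bytes<double>(-1) == 18446744073709551608ULL);
  assert(unchecked_bytes<double>(std::numeric_limits<int>::min()) ==
         18446744056529682432ULL);
  assert(allocate_checked<double>(-1) == nullptr);
}
```

An explicit check of this kind gives the compiler a bound on the count, which is one way such a warning is sometimes silenced. Whether it fits the LULESH code is for the maintainers to judge.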

v2:

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(248): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(249): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(250): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(251): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(252): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(253): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/usr/WS1/chen59/allraja/rajaproxy/rajaproxy_git6/lulesh-v2.0/RAJA/lulesh.cpp(254): warning: calling a __host__ function from a __host__ __device__ function is not allowed

Cannot build Kripke with CUDA

Hi,

Would it be better to automatically skip building Kripke if cuda is enabled, instead of stopping with an error message?
'''
cmake -DENABLE_CUDA=on -DENABLE_OPENMP=On ../.

CMake Error at CMakeLists.txt:47 (message):
Cannot build Kripke with CUDA

-- Configuring incomplete, errors occurred!
'''

For now, I edited CMakeLists.txt to turn off Kripke manually:
option(ENABLE_KRIPKE "Build Kripke" Off)

a warning: calling a host function from a device function is not allowed

I'm not sure if this matters, but I'm reporting it just in case.

Full command line leading to the warning message:

/usr/tce/packages/cmake/cmake-3.12.1/bin/cmake -E remove /g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-omp.exe.dir//lulesh-v2.0-RAJA-omp.exe_generated_lulesh.cpp.o.depend.tmp /g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-omp.exe.dir//lulesh-v2.0-RAJA-omp.exe_generated_lulesh.cpp.o.NVCC-depend
-- Generating /g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-omp.exe.dir//./lulesh-v2.0-RAJA-omp.exe_generated_lulesh.cpp.o
/usr/tce/packages/cuda/cuda-9.2.148/bin/nvcc /g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/lulesh-v2.0/RAJA/lulesh.cpp -x=cu -c -o /g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/lulesh-v2.0/RAJA/CMakeFiles/lulesh-v2.0-RAJA-omp.exe.dir//./lulesh-v2.0-RAJA-omp.exe_generated_lulesh.cpp.o -m64 -DUSE_OPENMP -DUSE_CUDA -DUSE_MPI=0 -DLULESH_DEVICE= -Xcompiler -fopenmp -restrict -arch sm_35 -std c++11 --expt-extended-lambda -ccbin /usr/tcetmp/bin/c++ -DNVCC -I/usr/tce/packages/cuda/cuda-9.2.148/include -I/g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/tpl/RAJA/include -I/g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/buildCuda/tpl/RAJA/include -I/g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/tpl/RAJA/tpl/cub
/g/g17/username/workspace-wsa/tempdir/lulesh/RAJAProxies.git/tpl/RAJA/tpl/cub/cub/device/dispatch/../../agent/../util_device.cuh(110): warning: calling a __host__ function("RAJA::detail::ReduceOMP<double, ::RAJA::reduce::min > ::~ReduceOMP") from a __host__ __device__ function("") is not allowed

RAJAProxies.git/tpl/RAJA/include/RAJA/pattern/detail/reduce.hpp(194): warning: calling a __host__ function("RAJA::detail::ReduceOMP<double, ::RAJA::reduce::sum > ::~ReduceOMP") from a __host__ __device__ function("RAJA::reduce::detail::BaseReduce<double, ::RAJA::reduce::sum, ::RAJA::detail::ReduceOMP> ::~BaseReduce") is not allowed

lulesh-v2.0-baseline-seq.exe terminates prematurely

Hi,

I noticed that the baseline seq version does not print out verification output like other versions do. It actually aborts before reaching the verification stage.

[../RAJAProxies.git/buildRayCuda/bin]./lulesh-v2.0-baseline-seq.exe -s 45
Running problem size 45^3 per domain until completion
Num processors: 1
Total number of elements: 91125

To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options

[../RAJAProxies.git/buildRayCuda/bin]echo $?
255

//------------compared to omp version's output

[../RAJAProxies.git/buildRayCuda/bin]./lulesh-v2.0-baseline-omp.exe -s 45
Running problem size 45^3 per domain until completion
Num processors: 1
Num threads: 64
Total number of elements: 91125

To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options

Run completed:
Problem size = 45
MPI tasks = 1
Iteration count = 1477
Final Origin Energy = 4.234875e+05
Testing Plane 0 of Energy Array on rank 0:
MaxAbsDiff = 1.600711e-10
TotalAbsDiff = 2.642516e-09
MaxRelDiff = 1.830255e-12

Elapsed time = 45.58 (s)
Grind time (us/z/c) = 0.33867557 (per dom) (0.33867557 overall)
FOM = 2952.6783 (z/s)

[../RAJAProxies.git/buildRayCuda/bin] echo $?
0
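An exit status of 255 is what a POSIX shell reports when a process exits with status -1, because only the low 8 bits of the status are kept. That may point to an explicit error exit somewhere in the sequential baseline, although the output above does not confirm which code path is taken. A minimal POSIX-only sketch that demonstrates the mapping follows. The helper name observed_exit_status is hypothetical.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that immediately exits with `status` and returns the exit
// status the parent observes, or -1 if fork/waitpid fails or the child did
// not exit normally. Hypothetical helper for illustration only.
int observed_exit_status(int status) {
  const pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) _exit(status);  // child: exit without running atexit handlers
  int wstatus = 0;
  if (waitpid(pid, &wstatus, 0) != pid) return -1;
  if (!WIFEXITED(wstatus)) return -1;
  return WEXITSTATUS(wstatus);  // low 8 bits of the child's status
}

// Self-check: -1 is observed as 255, and 0 stays 0.
void demo_exit_status() {
  assert(observed_exit_status(-1) == 255);
  assert(observed_exit_status(0) == 0);
}
```

Searching the sequential baseline for exit or abort calls that pass -1 would be one way to narrow down where the run stops.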
