
npb-nas-omp-nk's Introduction

Linux/OpenMP

Linux/OpenMP for Nautilus/RTK and Nautilus/CCK

To build the Linux/OpenMP version with the same flags as Nautilus/RTK and Nautilus/CCK:

git checkout rtk-cck-flags

make clean;make suite

To run the benchmarks for testing, change to the ./bin directory, where the executables are placed.

Linux/OpenMP for Nautilus/PIK

To build the Linux/OpenMP version with the same flags as Nautilus/PIK:

git checkout pik-flags

make clean;make suite

To run the benchmarks for testing, again change to the ./bin directory, where the executables are placed.

/***********************************************************************
 *
 *                NAS Parallel Benchmarks 3.0
 *
 *                Unofficial OpenMP C Version
 *
 * Copyright 2014 University of Versailles Saint Quentin en Yvelines
 * Copyright 2000 Omni OpenMP Compiler Project
 * Copyright 1991-2014 NASA Advanced Supercomputing Division
 *                     November 08, 2014
 *
 ***********************************************************************/

  1. Introduction

This package contains an unofficial C version of the NAS Parallel Benchmarks OpenMP 3.0. The benchmarks are derived from the Omni OpenMP Compiler Project 2.3 unofficial C version of NPB 2.3.

The benchmarks were modified to match the official NPB 3.0 Fortran benchmarks. In particular, the benchmarks in this release follow the same parallel region structure as the official 3.0 Fortran NPB.

  2. Change Log

This section tracks all the modifications that were made to update the Omni OpenMP 2.3 unofficial C version to the 3.0 version.

Each modification includes an annotation of the form XXX:YYY, where XXX is the line number in the 2.3 version and YYY is the line number in the 3.0 version.

BT

  • Transforms the initialization process into distinct parallel regions:
    • 129:127 - initialize
    • 131:128 - lhsinit
    • 133:129 - exact_rhs
  • Splits the adi function into separate parallel regions:
    • 211:206 - compute_rhs
    • 213:209 - x_solve
    • 215:212 - y_solve
    • 217:215 - z_solve
    • 219:218 - add
  • TODO - Initialization part different
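
The region splitting listed above for BT (the same pattern recurs in SP, LU, FT and MG) can be pictured with a minimal sketch. The names `scale_phase`, `shift_phase` and `adi_like_step` are hypothetical stand-ins invented for illustration, not the benchmark routines: each phase opens and closes its own parallel region, and the driver that calls them stays serial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical phase 1: opens its own parallel region for a single loop. */
static void scale_phase(double *u, int n, double factor)
{
    int i;
    assert(u != NULL || n == 0);
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        u[i] *= factor;
    /* The region ends here, so every thread has finished this phase. */
}

/* Hypothetical phase 2: a second, independent parallel region. */
static void shift_phase(double *u, int n, double offset)
{
    int i;
    assert(u != NULL || n == 0);
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        u[i] += offset;
}

/* Serial driver in the split style: no enclosing parallel region,
   each call below forks and joins on its own. */
void adi_like_step(double *u, int n)
{
    scale_phase(u, n, 2.0);
    shift_phase(u, n, 1.0);
}
```

Compiled without OpenMP support the pragmas are ignored and the code runs serially with the same result, which makes the sketch easy to check.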

CG

  • Updates sparse in makea:
    • 761:756 - add a parallel region loop to preload data pages
  • Splits the first main parallel region:
    • 187:195 - the initialization parallel regions in main are reduced, and single regions are replaced by serial parts
    • 356:347 - conj grad (see below)
    • 209:219 - parallelization of three loops
  • Decomposes the conj grad function into parallel regions:
    • 402:395 - initialize conj grad algorithm
    • 413:405 - conj grad iteration loop
    • 551:551 - compute residual norm explicitly
  • Splits the second main parallel region:
    • 261:260 - conj grad
    • 275:271 - parallelization of the two loops after conj grad
  • 78:78 - Remove temporary array w
  • 375:367 - Remove static variables in conj grad since they are initialized at each iteration
  • 500:494 - Add barrier after reduction because LLVM OpenMP does not support implicit synchronization after a reduction
  • 504:498 - Remove single because all the variables are private due to the parallel region updates
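
The explicit barrier noted at 500:494 can be sketched as follows. This is an illustration under assumptions, not the CG source: the function `dot_then_scale` and its arguments are hypothetical. A reduction loop is followed by an explicit barrier before any thread reads the reduced value. The rationale is the one given in the change log; on a runtime that already synchronizes at the end of the worksharing loop, the extra barrier is redundant but harmless.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: computes sum = p . q, then out[i] = sum * p[i].
   Returns the reduced sum. */
double dot_then_scale(const double *p, const double *q, double *out, int n)
{
    double sum = 0.0;

    assert(p != NULL && q != NULL && out != NULL);

    #pragma omp parallel
    {
        int i;

        /* Each thread accumulates a private partial sum that is
           combined into the shared variable at the end of the loop. */
        #pragma omp for reduction(+:sum)
        for (i = 0; i < n; i++)
            sum += p[i] * q[i];

        /* Explicit synchronization, mirroring the change log entry:
           no thread may read sum before the reduction is complete. */
        #pragma omp barrier

        #pragma omp for
        for (i = 0; i < n; i++)
            out[i] = sum * p[i];
    }
    return sum;
}
```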

EP

  • Parallelizes x-array initialization 109:110 - main

FT

  • Splits the first main parallel region:
    • 123:123 - compute_indexmap
    • 131:130 - fft (see below)
  • Splits the second main parallel region:
    • 176:168 - evolve
    • 164:158 - fft
    • 187:177 - fft
    • 845:849 - checksum
  • Decomposes fft into parallel regions:
    • 514:501 - cffts1
    • 562:556 - cffts2
    • 607:606 - cffts3
  • Defines y0 and y1 in cffts[123] on the stack because there is no pointer in Fortran
  • TODO - need to insert init_ui to touch the initial data

IS

  • Already in C

LU

  • Splits the boundary setup, the initialization of dependent variables and the forcing term computation into parallel regions:
    • 125:123 - setbv
    • 130:128 - setiv
    • 135:133 - erhs
  • Produces separate parallel regions for ssor:
    • 3073:3092 - l2norm
    • 3068:3087 - rhs
    • 3064:3083 - SSOR initialization
    • 3094:3112 - SSOR iteration
  • TODO - Parallelize post computation part
    • error
    • pintgr

MG

  • Turns the omp parallel region into an omp parallel loop:
    • 1239:1211 - zero3
  • Splits the first big parallel region:
    • 238:233 - norm2u3
    • 257:249 - mg3P
    • 258:243 - resid
  • Splits the big parallel region of the main iteration:
    • 273:262 - resid
    • 274:263 - norm2u3
    • 277:266 - mg3P
  • Transforms the omp parallel regions into omp parallel loops from main and mg3P:
    • 846:826 - norm2u3
    • 527:516 - resid
    • 608:595 - rprj3
    • 1245:1217 - zero3
    • 463:454 - psinv
    • 684:669 - interep (produces two parallel regions)
  • 820:835 - Remove static because of algorithmic changes
  • 836:862 - Algorithmic changes
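
The move from a parallel region to a parallel loop, listed above for zero3 and the other MG routines, can be pictured with a small sketch. The functions below are hypothetical and work on a flat array rather than the 3-D grids MG uses. In the first style the routine holds an orphaned worksharing loop and relies on the caller to open the parallel region; in the second style the routine carries a combined `parallel for` and can be called from serial code.

```c
#include <assert.h>
#include <stddef.h>

/* Region style: an orphaned worksharing loop. It is shared among threads
   only when the caller has already opened a parallel region. */
static void zero_orphaned(double *z, int n)
{
    int i; /* local to the function, hence private to each thread */
    #pragma omp for
    for (i = 0; i < n; i++)
        z[i] = 0.0;
}

/* Loop style: the routine opens and closes its own parallel loop. */
static void zero_combined(double *z, int n)
{
    int i;
    assert(z != NULL || n == 0);
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        z[i] = 0.0;
}

/* Hypothetical caller in the region style: one region wraps the call. */
void clear_region_style(double *z, int n)
{
    #pragma omp parallel
    {
        zero_orphaned(z, n);
    }
}

/* Hypothetical caller in the loop style: plain serial code. */
void clear_loop_style(double *z, int n)
{
    zero_combined(z, n);
}
```

Both callers produce the same result; the difference is only where the threads are forked and joined.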

SP

  • Splits the adi function into separate parallel regions:
    • 205:204 - compute_rhs
    • 208:207 - txinvr
    • 211:210 - x_solve and ninvr
    • 214:213 - y_solve and tzetar
    • 216:216 - z_solve and pinvr
    • 220:219 - add
  • TODO - Initialize has to be updated
    • 654:659 - parallelize the function
  • TODO - Post computation verify has to be parallelized
    • 221:226 - error_norm

  3. Installation

The package should contain the following files/directories:

  • README - this file
  • README.omc - Readme of the Omni OpenMP Compiler release
  • LOG.omc - Change log of the Omni OpenMP Compiler release
  • Makefile - makefile for the suite (not modified from NPB2.3-omp)
  • Doc - documentation (not modified from NPB2.3-serial)
  • BT, CG, EP, FT, IS, LU, MG, SP - directory for each program
  • bin - directory to put executable files
  • common - common routines (only the version display changed from NPB2.3-omp)
  • config - configuration files used by 'make' (not modified from NPB2.3-omp)
  • sys - utilities (only the version display changed from NPB2.3-omp)

To use the suite, edit the file 'make.def' in the 'config' directory. You must specify the compiler and linker names and the compiler options.
For more details, refer to file "README.install" in subdirectory "Doc" and to "README.omc".

  4. Information
