epcced / epcc-reframe
Repository for ReFrame configuration and tests for EPCC systems
License: GNU General Public License v3.0
Add tests to check that Singularity works as expected.
Small error but it should read:
# Performance test reference values
self.reference = {
    'archer2:compute': {
        'perf': (335, 12, -12, 'seconds'),
    }
}
Test that MPMD jobs run as expected. How these are implemented depends on the scheduler; for example, Slurm would typically use hetjob to run this type of job.
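As a rough illustration of what such a test would need to launch, the sketch below assembles a colon-separated Slurm `srun` line with one `--het-group` per component. The helper name `mpmd_srun_command` and the executables in the usage note are hypothetical; only the colon separator and the `--het-group` and `--ntasks` options are Slurm's own.

```python
import shlex


def mpmd_srun_command(components):
    """Build an srun line for a heterogeneous (MPMD) launch.

    components: list of (ntasks, executable, [args]) tuples, one per
    het-group. This helper is illustrative, not part of the repository.
    """
    parts = []
    for group, (ntasks, exe, args) in enumerate(components):
        # Each component gets its own het-group index and task count.
        words = [f"--het-group={group}", f"--ntasks={ntasks}", exe, *args]
        parts.append(shlex.join(words))
    # Slurm separates heterogeneous components with a bare colon.
    return "srun " + " : ".join(parts)
```

For example, `mpmd_srun_command([(8, "./prog_a", []), (4, "./prog_b", ["-v"])])` would describe two components, with 8 tasks in het-group 0 and 4 tasks in het-group 1.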
ReFrame has changed how parameterised tests are specified in more recent versions, and existing tests need to be updated to use the new syntax. Documentation on the new syntax can be found at:
https://reframe-hpc.readthedocs.io/en/stable/tutorial_advanced.html
Update the CP2K benchmark in ReFrame to run at the desired scale for continuous monitoring.
The file system paths are currently configured in each test that does I/O. At the moment, they all write to a shared z19 directory on each of the ARCHER2 file systems. That means the tests will break when run from another project account or when run on a machine with different file systems, such as the TDS.
It would be good for the base directories to be configured at run time. One would add an option to the reframe command or modify a config file and all the tests would perform output within those directories.
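One possible shape for this is sketched below: the base directories are read from an environment variable at run time and each test derives its output directory from them. The variable name `EPCC_REFRAME_IO_BASE`, its `label=path` format, and both helper names are assumptions made for illustration; nothing here is taken from the existing tests.

```python
import os
from pathlib import Path

# Hypothetical variable: colon-separated "label=path" pairs,
# e.g. "fs1=/path/on/fs1:fs2=/path/on/fs2".
ENV_VAR = "EPCC_REFRAME_IO_BASE"


def io_base_dirs(environ=None):
    """Return {label: Path} parsed from ENV_VAR (empty dict if unset)."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR, "")
    dirs = {}
    for item in filter(None, raw.split(":")):
        label, sep, path = item.partition("=")
        if not sep or not label or not path:
            raise ValueError(f"expected 'label=path', got {item!r}")
        dirs[label] = Path(path)
    return dirs


def output_dir_for(base_dirs, label, test_name):
    """Directory a given test should write to on the chosen file system."""
    return base_dirs[label] / test_name
```

With this approach a test would no longer hard-code a project directory; running from another account or on another machine would only require a different value for the variable.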
Add a test that runs the IO500 benchmark (at 10 nodes initially)
Having a consistent style guide helps when creating new, or upgrading older tests.
Furthermore, including linter configuration files in the repository would ensure everyone can easily follow the rules.
I am partial to the philosophy behind Black - I think it makes for readable code.
Regarding PEP8, I don't like the 79 character limit (E501) and W503, as I feel the first pressures people into writing more cryptic variable names, and the latter is supposed to be reversed soon. I would suggest ignoring W503 and setting a line-length of 120 characters.
Add simple tests to ensure that all module paths that should be visible are visible (on login and compute nodes).
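A minimal sketch of the core check, assuming the expected directories are supplied as a list and that visibility is judged from the `MODULEPATH` environment variable. The function name is hypothetical, and the list of expected paths would have to come from the site configuration.

```python
import os


def missing_modulepaths(expected, modulepath=None):
    """Return the expected directories that are absent from MODULEPATH."""
    if modulepath is None:
        modulepath = os.environ.get("MODULEPATH", "")
    # Normalise trailing slashes so "/a/b/" and "/a/b" compare equal.
    visible = {p.rstrip("/") for p in modulepath.split(":") if p}
    return [p for p in expected if p.rstrip("/") not in visible]
```

The same function could be run once on a login node and once inside a job on a compute node, with the test failing if the returned list is not empty.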
Add a test that runs distributed STREAM on compute nodes: https://github.com/adrianjhpc/DistributedStream
Suggestion from HPC Systems team
Thinking ahead to ReFrame tests we might want to run - on the spinning-rust Lustre, there's one OST per server (and if the site E-1000 we have is anything to go by, this will also be the case for the NVMe) - it might be worth doing some kind of I/O test on every OST individually to look for slowdowns.
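One way to target OSTs individually is to give each OST its own directory, striped over a single OST with a fixed starting index, and then run the I/O test in each directory. The sketch below only generates the `lfs setstripe` commands; the helper name, the directory naming scheme, and the idea of passing in a list of OST indices are assumptions for illustration.

```python
import shlex


def ost_setup_commands(base_dir, ost_indices):
    """Return one (directory, command) pair per OST index.

    Each command pins new files in the directory to a single OST.
    """
    plan = []
    for idx in ost_indices:
        target = f"{base_dir.rstrip('/')}/ost_{idx:04d}"
        # -c 1: stripe count of one; -i idx: place the stripe on OST idx.
        cmd = ["lfs", "setstripe", "-c", "1", "-i", str(idx), target]
        plan.append((target, shlex.join(cmd)))
    return plan
```

The directories would need to exist before the commands are run, and the per-directory timings could then be compared to spot a slow OST.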
We should set up synthetic MPI tests to check that performance and functionality are as expected. We can use the OSU MPI benchmarks for this and build on top of the ReFrame example at:
https://reframe-hpc.readthedocs.io/en/stable/tutorial_deps.html
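As a sketch of the performance-checking side, the code below parses the two-column table that OSU point-to-point benchmarks such as `osu_latency` print (message size, then the measured value) and compares one value against a reference. The function names and the default 10% tolerances are assumptions; in an actual ReFrame test the extraction and comparison would normally be done with ReFrame's own facilities.

```python
import re

# A data row is "<size> <value>"; header lines start with "#" and are skipped.
_ROW = re.compile(r"^\s*(\d+)\s+([0-9]*\.?[0-9]+)\s*$")


def parse_osu_table(stdout):
    """Map message size (bytes) to the reported value for each data row."""
    results = {}
    for line in stdout.splitlines():
        match = _ROW.match(line)
        if match:
            results[int(match.group(1))] = float(match.group(2))
    return results


def within_tolerance(measured, reference, lower=-0.1, upper=0.1):
    """True if measured lies in [reference*(1+lower), reference*(1+upper)]."""
    return reference * (1 + lower) <= measured <= reference * (1 + upper)
```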
Which core functionality testing is missing from the EPCC reframe tests?
At the moment, whether Lustre client caching is used is left to the system, which can lead to inconsistent test performance. We should update the benchio source code so that the test can switch caching off at run time for all the different tests.
For MPIIO:
call MPI_File_sync(fh,ierr)
call MPI_File_close(fh, ierr)
For HDF5:
! Flush all buffers to disk, then close the file.
CALL h5fflush_f(file_id, H5F_SCOPE_GLOBAL_F, ierr)
CALL h5fclose_f(file_id, ierr)
For file-per-process:
! Declare the interface for POSIX fsync function
interface
  function fsync (fd) bind(c,name="fsync")
    use iso_c_binding, only: c_int
    integer(c_int), value :: fd
    integer(c_int) :: fsync
  end function fsync
end interface
integer :: ret
! ...file writing...
! Flush and sync
flush(10)
! fnum() is a compiler extension (available in GNU Fortran) that returns
! the POSIX file descriptor associated with a Fortran unit number.
ret = fsync(fnum(10))
! Handle possible error
if (ret /= 0) stop "Error calling FSYNC"
Add tests to verify that Slurm process/thread placement on compute nodes is behaving as expected.
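A sketch of one check such a test could make, assuming each rank reports the cores it is bound to as a Linux-style CPU list (the format used by `Cpus_allowed_list` in `/proc/<pid>/status`, for example `0-3,8`). How the bindings are collected from the job is left open, and both function names are hypothetical.

```python
from itertools import combinations


def parse_cpu_list(text):
    """Expand a CPU list such as "0-3,8" into a set of core IDs."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        low, sep, high = part.partition("-")
        # A single ID like "8" is treated as the range 8-8.
        cpus.update(range(int(low), int(high if sep else low) + 1))
    return cpus


def overlapping_ranks(bindings):
    """bindings: {rank: cpu-list string}; return rank pairs sharing a core."""
    sets = {rank: parse_cpu_list(cpus) for rank, cpus in bindings.items()}
    return [(a, b) for a, b in combinations(sorted(sets), 2)
            if sets[a] & sets[b]]
```

A placement test could then assert that no pairs are returned when ranks are expected to have exclusive cores, and that each rank's set has the expected size for the requested `--cpus-per-task`.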