Gromax

Gromax is a tool to help optimize GROMACS performance on any hardware. It is particularly useful when working with GPUs.


Today's molecular simulation engines are complicated. To get the most performance out of Gromacs and other packages, a large number of simulation parameters need to be considered and tweaked. With the incorporation of GPUs this problem becomes even more complicated, and differing flags and options between yearly releases of Gromacs add yet another layer of complexity. Gromax is here to help.

Installation

Via pip

To install the most recent release:

pip install git+https://github.com/scal444/[email protected]

To install the current master version:

pip install git+https://github.com/scal444/gromax

Requirements

  • Python 3.7 or greater

Capabilities

  • Given a Gromacs TPR file and a description of the hardware (CPU count and GPU IDs), generates a series of Gromacs run commands to explore which parameters provide the best performance for the hardware.
  • Supports Gromacs major versions 2016, 2018, 2019, 2020, and 2021.
  • Breaks down the available hardware into subcomponents to assess maximum throughput on a single node.
  • Generates a simple bash script for you to execute.
  • Analyzes results and reports the best parameter combinations.
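
To make the first capability concrete, here is a minimal sketch of how a set of candidate run commands could be enumerated from a CPU count and a list of GPU IDs. It is not Gromax's actual implementation: the function name `candidate_commands`, the `ranks_per_gpu_options` parameter, and the rule that the CPUs must divide evenly among ranks are all assumptions made for illustration. Only mdrun flags that appear elsewhere in this document are used.

```python
def candidate_commands(tpr, n_cpus, gpu_ids, ranks_per_gpu_options=(1, 2, 4), gmx="gmx"):
    """Return mdrun command strings for rank counts that divide the CPUs evenly.

    Hypothetical helper for illustration; not part of Gromax.
    Assumes single-digit GPU IDs, since -gputasks takes one digit per GPU task.
    """
    commands = []
    for ranks_per_gpu in ranks_per_gpu_options:
        ntmpi = len(gpu_ids) * ranks_per_gpu
        # Skip combinations that cannot split the CPUs evenly (assumed rule).
        if ntmpi == 0 or n_cpus % ntmpi != 0:
            continue
        ntomp = n_cpus // ntmpi
        # One GPU task (nonbonded) per rank, ranks grouped by GPU.
        gputasks = "".join(str(gpu) * ranks_per_gpu for gpu in gpu_ids)
        commands.append(
            f"{gmx} mdrun -s {tpr} -ntmpi {ntmpi} -ntomp {ntomp} "
            f"-nb gpu -pme cpu -gputasks {gputasks} -pin on"
        )
    return commands


if __name__ == "__main__":
    for command in candidate_commands("npt.tpr", 28, [0, 1]):
        print(command)
```

With 28 CPUs and 2 GPUs, the 8-rank option is dropped because 28 is not divisible by 8, leaving the 2-rank and 4-rank commands.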

See the future work doc for planned features.

Usage

You can find some details in the help text (gromax --help). See the examples doc for the various possible usages!

Notes

  • Gromax is designed for single-node optimization and works best with non-MPI Gromacs (think gmx binary rather than gmx_mpi). It may work with an MPI-compiled Gromacs, but there are no guarantees. Non-MPI Gromacs does have a limit of 64 threads, which should be sufficient for most purposes.
  • Some simulation features (free energy, thermostats) affect which runtime optimizations can be used. Gromax currently does not take these into account, so a few simulation combinations may fail with valid errors. Send me an email or open a bug report if you're unsure whether this is the case with your failures.

Contributing

  • Feel free to file an issue (bug report or feature request) or create a PR. There is a known issues doc for problems that are known but can't yet be addressed.

Release notes

Other awesome resources

  • Want to see how well your system scales to various clusters? Check out MDBenchmark!


gromax's Issues

Improve output script configurability

There are several improvements and generalizations to make:

  • Split off gmx binary and mdrun variables.
  • Add a trials variable for the loop.
  • Redirect stderr/stdout.
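
The three items above could look roughly like the following sketch, which renders a bash script with separate variables for the binary and the subcommand, a trials variable driving the loop, and per-run stdout/stderr redirection. The function name `render_script`, its parameters, and the log file naming are assumptions for illustration, not the script Gromax currently writes.

```python
def render_script(run_args, gmx="gmx", mdrun="mdrun", trials=3, log_dir="logs"):
    """Render a bash script that runs each argument string `trials` times.

    Hypothetical helper for illustration; not part of Gromax.
    """
    lines = [
        "#!/bin/bash",
        f'GMX="{gmx}"',        # binary, e.g. gmx or gmx_mpi
        f'MDRUN="{mdrun}"',    # subcommand, kept separate from the binary
        f"TRIALS={trials}",    # number of repeats of the whole set of runs
        f'mkdir -p "{log_dir}"',
        'for trial in $(seq 1 "$TRIALS"); do',
    ]
    for index, args in enumerate(run_args, start=1):
        log = f"{log_dir}/run_{index}_trial_${{trial}}"
        # stdout and stderr go to separate files for each run and trial.
        lines.append(f'  $GMX $MDRUN {args} > "{log}.out" 2> "{log}.err"')
    lines.append("done")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_script(["-s npt.tpr -ntmpi 2 -ntomp 14"], trials=5))
```

Leaving `$GMX` unquoted in the generated line is a deliberate choice here so that a multi-word launcher could be substituted; a stricter script would use a bash array instead.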

Add single-sim option

A fairly common use case is wanting just one simulation for the entire hardware. Add a flag specifically for this; it can be unified with --max_sims later.

Add log customization

Features include:

  • Detail level, with --logging_level: normal or debug
  • --log_to_stdout: true or false
  • --log_file: optional, set to debug mode.
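
One way the proposed flags could be wired up with the standard library is sketched below. The flag names come from the list above; everything else (the logger name `gromax_sketch`, the function names, and reading "set to debug mode" as "the log file always records at debug level") is an assumption rather than the project's implementation.

```python
import argparse
import logging
import sys


def build_parser():
    """Parser exposing only the three proposed logging flags."""
    parser = argparse.ArgumentParser(prog="gromax")
    parser.add_argument("--logging_level", choices=("normal", "debug"), default="normal")
    parser.add_argument("--log_to_stdout", choices=("true", "false"), default="true")
    parser.add_argument("--log_file", default=None)
    return parser


def configure_logging(args):
    """Attach handlers according to the parsed flags and return the logger."""
    logger = logging.getLogger("gromax_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)  # handlers do the filtering
    level = logging.DEBUG if args.logging_level == "debug" else logging.INFO
    if args.log_to_stdout == "true":
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setLevel(level)
        logger.addHandler(stream_handler)
    if args.log_file:
        file_handler = logging.FileHandler(args.log_file)
        file_handler.setLevel(logging.DEBUG)  # assumed: file is always debug
        logger.addHandler(file_handler)
    return logger


if __name__ == "__main__":
    log = configure_logging(build_parser().parse_args())
    log.info("logging configured")
```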

Cap max ranks with PME GPU

There's often an odd number of PP ranks when there's a PME rank. For example, if a system naturally decomposes into 8 ranks, using 7 PP ranks is worse for decomposition. Not sure what the cap should be, but Gromacs will fail if the decomposition is too ugly.
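
The issue leaves the cap open, so the following is only one possible heuristic, not a decision made by the project: drop rank counts whose PP rank count has a large prime factor, since such counts are hard to split into a regular grid. The function names and the `max_prime=5` threshold are placeholders chosen for illustration.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (1 for n <= 1)."""
    factor, largest = 2, 1
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    # Whatever remains above 1 is itself a prime factor.
    return max(largest, n) if n > 1 else largest


def pp_rank_options(total_ranks_options, n_pme_ranks=1, max_prime=5):
    """Keep (total, pp) pairs whose PP count has no prime factor above max_prime.

    Hypothetical filter; the threshold is an arbitrary placeholder.
    """
    kept = []
    for total in total_ranks_options:
        pp_ranks = total - n_pme_ranks
        if pp_ranks >= 1 and largest_prime_factor(pp_ranks) <= max_prime:
            kept.append((total, pp_ranks))
    return kept


if __name__ == "__main__":
    # 8 total ranks with 1 PME rank leaves 7 PP ranks, which is filtered out.
    print(pp_rank_options([4, 5, 8, 9]))
```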

Gromacs compiled without thread MPI support

Hi, I wanted to give this tool a little spin.
On our machines we have Gromacs preinstalled but compiled without thread MPI, so the thread-MPI options (in particular -nt) do not work. Is there support for this kind of installation? I set the correct gmx_executable flag but didn't find another option.

Below there are some details on the installation and hardware.

gmx_mpi mdrun -bonded cpu -deffnm group_1_trial_1_component_1 -gputasks 01 -nb gpu -noconfout -nsteps 15000 -nstlist 80 -nt 28 -ntmpi 2 -ntomp 14 -pin on -pinoffset 0 -pinstride 1 -pme cpu -resetstep 10000 -s ../../npt.tpr -update cpu

GROMACS version:    2020.1
Verified release checksum is 5cde61b9d46b24153ba84f499c996612640b965eff9a218f8f5e561f94ff4e43
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support:        CUDA
SIMD instructions:  AVX2_256
FFT library:        fftw-3.3.8-sse2-avx-avx2-avx2_128
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      disabled
Tracing support:    disabled
C compiler:         /software/gcc/6.3.0/bin/gcc GNU 6.3.0
C compiler flags:   -mavx2 -mfma -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp
C++ compiler:       /software/gcc/6.3.0/bin/g++ GNU 6.3.0
C++ compiler flags: -mavx2 -mfma -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp
CUDA compiler:      /software/cuda/9.2/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on Tue_Jun_12_23:07:04_CDT_2018;Cuda compilation tools, release 9.2, V9.2.148
CUDA compiler flags:-std=c++14;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_35,code=compute_35;-gencode;arch=compute_50,code=compute_50;-gencode;arch=compute_52,code=compute_52;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-gencode;arch=compute_70,code=compute_70;-use_fast_math;-D_FORCE_INLINES;-mavx2 -mfma -pthread -fexcess-precision=fast -funroll-all-loops -fopenmp
CUDA driver:        10.10
CUDA runtime:       9.20


Running on 1 node with total 28 cores, 56 logical cores, 2 compatible GPUs
Hardware detected on host lcbcpc81 (the node of MPI rank 0):
 CPU info:
   Vendor: Intel
   Brand:  Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz
   Family: 6   Model: 85   Stepping: 4
   Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
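
The build information above already says which MPI flavor is in use, on the "MPI library:" line. A tool could read that line and leave out the thread-MPI flags for a real MPI build, as in the sketch below. This is not something Gromax is known to do: the function names are hypothetical, and the sketch assumes that thread-MPI builds report `thread_mpi` on that line and that the rank count of a real MPI build is set by the MPI launcher.

```python
def mpi_flavor(version_text):
    """Return the value of the 'MPI library:' line, or None if it is absent."""
    for line in version_text.splitlines():
        if line.startswith("MPI library:"):
            return line.split(":", 1)[1].strip()
    return None


def thread_flags(flavor, ntmpi, ntomp):
    """Return mdrun threading flags suited to the detected build.

    Hypothetical helper: -ntmpi is emitted only for thread-MPI builds
    (assumed to report 'thread_mpi'); otherwise only -ntomp is passed.
    """
    if flavor == "thread_mpi":
        return ["-ntmpi", str(ntmpi), "-ntomp", str(ntomp)]
    return ["-ntomp", str(ntomp)]


if __name__ == "__main__":
    sample = "GROMACS version:    2020.1\nMPI library:        MPI\n"
    flavor = mpi_flavor(sample)
    print(flavor, thread_flags(flavor, 2, 14))
```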

Make help output prettier

Split off required and optional arguments.

Reorder so important args are first

Get rid of useless metavariables
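
A rough sketch of how these three changes could be done with argparse argument groups follows. The option names `--cpu_ids` and `--gpu_ids` and the help strings are placeholders for illustration (`--max_sims` and `--gmx_executable` are mentioned elsewhere in these issues); the real Gromax options may differ.

```python
import argparse


def build_parser():
    """Parser with required arguments listed first and short metavariables."""
    # add_help=False lets -h/--help be placed in the optional group below.
    parser = argparse.ArgumentParser(prog="gromax", add_help=False)

    required = parser.add_argument_group("required arguments")
    required.add_argument("--cpu_ids", required=True, metavar="IDS",
                          help="CPU IDs to use (placeholder option name)")
    required.add_argument("--gpu_ids", required=True, metavar="IDS",
                          help="GPU IDs to use (placeholder option name)")

    optional = parser.add_argument_group("optional arguments")
    optional.add_argument("-h", "--help", action="help",
                          help="show this help message and exit")
    optional.add_argument("--gmx_executable", default="gmx", metavar="PATH",
                          help="Gromacs binary to call")
    optional.add_argument("--max_sims", type=int, metavar="N",
                          help="maximum number of simultaneous simulations")
    return parser


if __name__ == "__main__":
    build_parser().print_help()
```

Groups are printed in the order they are created, so creating the required group first is what puts the important arguments at the top of the help output.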

Add check for empty options for config

Example case: accidentally inputting 0-20 for CPUs with 2 GPUs gives 21 CPUs and 2 GPUs, for which there is no breakdown into multiple ranks. Fail verbosely rather than silently giving no options.
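
The check could look something like this sketch, which expands the CPU specification and raises a descriptive error when no rank breakdown exists. The function names, the accepted spec syntax, and the even-division rule are assumptions for illustration, not Gromax's actual logic.

```python
def parse_id_range(spec):
    """Expand a spec such as '0-20' or '0,1' into a list of integer IDs."""
    ids = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            ids.extend(range(int(start), int(end) + 1))
        else:
            ids.append(int(part))
    return ids


def rank_breakdowns(n_cpus, n_gpus, ranks_per_gpu_options=(1, 2, 4)):
    """Return (ranks, threads_per_rank) pairs, or raise if there are none.

    Hypothetical rule: the CPUs must divide evenly among the ranks.
    """
    options = [
        (n_gpus * per_gpu, n_cpus // (n_gpus * per_gpu))
        for per_gpu in ranks_per_gpu_options
        if n_gpus > 0 and n_cpus % (n_gpus * per_gpu) == 0
    ]
    if not options:
        # Fail loudly instead of returning an empty list of run options.
        raise ValueError(
            f"No valid rank breakdown for {n_cpus} CPUs and {n_gpus} GPUs; "
            "check the CPU ID range (for example, 0-20 is 21 CPUs)."
        )
    return options


if __name__ == "__main__":
    cpus = parse_id_range("0-20")
    try:
        rank_breakdowns(len(cpus), 2)
    except ValueError as error:
        print(error)
```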
