# Multi-Task Multi-Stage library

Introduction

This library supports the Multi-Task Multi-Stage pattern: a workflow of parallel Tasks (pipelines) that all have the same number of stages (steps). All tasks execute the same application, possibly with different configuration and parameters. For execution, mtms relies on RADICAL-Pilot.
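To make the pattern concrete, here is a minimal sketch in plain Python that does not use MTMS or RADICAL-Pilot at all. The names `run_stage`, `run_task` and `run_workflow` are hypothetical, and the sketch assumes that the stages within one task run one after another while the tasks themselves run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(task, stage):
    # Stand-in for launching the application for one (task, stage) pair.
    return 'task %d, stage %d' % (task, stage)


def run_task(task, num_stages):
    # Assumption: the stages of a single task execute sequentially.
    return [run_stage(task, stage) for stage in range(num_stages)]


def run_workflow(num_tasks, num_stages):
    # Every task has the same number of stages; tasks run side by side.
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        return list(pool.map(lambda t: run_task(t, num_stages),
                             range(num_tasks)))


results = run_workflow(num_tasks=3, num_stages=2)
for task_result in results:
    print(task_result)
```

In MTMS itself the scheduling and execution are delegated to RADICAL-Pilot, so the thread pool here only illustrates the shape of the workflow.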

Installation

First we create a Python virtual environment to experiment in safely:

virtualenv /tmp/mtms-ve
source /tmp/mtms-ve/bin/activate

As MTMS depends on unreleased versions of RADICAL-Pilot, SAGA-Python and MD-Kernels, we install those first:

mkdir /tmp/mtms-src
cd /tmp/mtms-src
git clone https://github.com/radical-cybertools/saga-python.git
cd saga-python
git checkout devel
python setup.py install
cd ..
git clone https://github.com/radical-cybertools/radical.pilot.git
cd radical.pilot
git checkout feature/staging
python setup.py install
cd ..
git clone https://github.com/radical-cybertools/radical.ensemblemd.mdkernels.git
cd radical.ensemblemd.mdkernels
git checkout release
python setup.py install
cd ..

Currently MTMS is only installable from source:

git clone https://github.com/radical-cybertools/MTMS.git
cd MTMS
git checkout staging
python setup.py install

To quickly verify that the installation was successful, you can run:

python -c 'import radical.ensemblemd.mtms as mtms; print mtms.version'

This should print a version number. If you get an import error, get in touch with us.

Tests provided

For more elaborate testing of the code, a test suite is available. It can be executed with:

python setup.py test

Please report any errors to us, as all tests are expected to pass.

The core library

This is the generic Multi-Task Multi-Stage library. The minimal structure of the API and its usage is displayed below.

from radical.ensemblemd import mtms

res = mtms.Resource_Description()
io = mtms.IO_Description()
tasks = mtms.Task_Description()
engine = mtms.Engine()

verbose = True  # assumption: a flag that enables verbose output

engine.execute(res, tasks, io, verbose)

Implementation can be found at: https://github.com/radical-cybertools/MTMS/blob/staging/src/radical/ensemblemd/mtms/mtms.py.

I/O Description

# Task I/O specification in the form of { 'label': 'pattern' }
io.input_per_task_first_stage={}
io.input_all_tasks_per_stage={}
io.input_per_task_all_stages={}

io.output_per_task_per_stage={}
io.output_per_task_final_stage={}

# Intermediate in the form of [{input_label, output_label, pattern}]
io.intermediate_output_per_task_per_stage=[]

Templating / Variable expansion

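task_desc = mtms.Task_Description()
task_desc.kernel = 'NAMD'
# ${i_conf} refers to the 'i_conf' label defined in the I/O description below
task_desc.arguments = '${i_conf}'

io_desc = mtms.IO_Description()
# DATA_PREFIX is expected to be defined elsewhere in your script
io_desc.input_all_tasks_per_stage = {
  'i_conf': '%s/dyn-conf-files/dyn${STAGE}.conf' % (DATA_PREFIX),
}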

In the task description we can use any variable defined in the I/O description (such as ${i_conf}). The special variables ${TASK} and ${STAGE} are also available.
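The following sketch shows how this kind of `${...}` expansion behaves, using `string.Template` from the Python standard library. It is not the MTMS implementation: the helpers `expand` and `expand_arguments` and the value of `DATA_PREFIX` are hypothetical, chosen only to illustrate the idea.

```python
from string import Template

DATA_PREFIX = '/path/to/data'  # hypothetical location of the input data

# Same shapes as in the snippet above: a label -> pattern mapping and
# an argument string that refers to one of the labels.
io_patterns = {'i_conf': '%s/dyn-conf-files/dyn${STAGE}.conf' % DATA_PREFIX}
arguments = '${i_conf}'


def expand(pattern, task, stage, labels=None):
    # Substitute the special variables and, optionally, resolved I/O labels.
    values = {'TASK': task, 'STAGE': stage}
    values.update(labels or {})
    return Template(pattern).substitute(values)


def expand_arguments(arguments, io_patterns, task, stage):
    # First resolve every I/O pattern, then expand the labels in the arguments.
    resolved = {label: expand(pattern, task, stage)
                for label, pattern in io_patterns.items()}
    return expand(arguments, task, stage, resolved)


result = expand_arguments(arguments, io_patterns, task=0, stage=2)
print(result)  # /path/to/data/dyn-conf-files/dyn2.conf
```

Under these assumptions each stage picks up its own configuration file, because `${STAGE}` is substituted separately for every stage.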

NAMD workflow execution

This is a NAMD-specific workflow example that makes use of the mtms library. To run the supplied example, you need to perform the steps described below (from the /tmp/mtms-src/MTMS directory created earlier).

The experiment configuration is based on the paper "Scalable online comparative genomics of mononucleosomes: a BigJob". The script uses the same hierarchical directory layout for the input data as the paper: the first tier represents 5 chromosome sites, and the second tier represents 21 locations along the DNA sequence, each marking the start of the nucleosome. You can see how that is used in the script at line 59 of examples/namd_mtms_wf.py. For every location, 20 simulations of 1 ns are performed.

The code of the example can be seen here: https://github.com/radical-cybertools/MTMS/blob/staging/examples/namd_mtms_wf.py.

To cut the execution time of this example, the number of chromosomes is reduced to 2, each with just 1 location, and the number of simulations per location is reduced to 3. This leads to 6 MD simulations instead of 2100. You are free to change these numbers, starting at line 46 of examples/namd_mtms_wf.py.
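The simulation counts follow directly from multiplying the three tiers. As a quick check, with a hypothetical helper `count_simulations` that is not part of the example script:

```python
def count_simulations(chromosomes, locations, simulations_per_location):
    # Total number of MD simulations for a given experiment size.
    return chromosomes * locations * simulations_per_location


full = count_simulations(5, 21, 20)   # configuration from the paper
reduced = count_simulations(2, 1, 3)  # reduced configuration of the example

print(full)     # 2100
print(reduced)  # 6
```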

The current script assumes you have an account on the TACC XSEDE Stampede cluster. If not, you can configure the script to run on another cluster or on your localhost by changing the code starting at line 18 of examples/namd_mtms_wf.py.

To prepare the input data on Stampede (and save yourself the data transfers during the tutorial), please follow the instructions below while logged into Stampede:

cd $WORK
mkdir demo
cd demo
cp -pr /work/01740/marksant/demo/data_bishop .
cd data_bishop
./populate_data_directory_bishop.sh

To start the experiment, run the following command from the MTMS source directory on your laptop:

cd /tmp/mtms-src/MTMS
python examples/namd_mtms_wf.py

Depending on network speed and queueing times, this should take around 5 minutes to execute.

All in all, this should give you the output of a verbose run of an MTMS application. Please look at the example code to get a feeling for how to use MTMS for your own application.
