weiz-grid's Introduction

###############################################################################
## WeizGrid
##  By Stav Yagev, 2013 (Please contact me if you find bugs!)
##
## 
## Simple framework for using the SGE cluster on Weizmann for parallel work.
##
## FEATURES:
##  - Helps split your own code into pieces that can be run in parallel on the cluster.
##  - Allows you to aggregate results in an efficient manner.
##  - Write your code ONCE! Execute the same code on PC for debugging and on the cluster
##      for production.
##  - Recover from errors on the cluster - splitting means that if 1 iteration out of
##      1000 fails, you still have the other 999 results in hand!
##
## Use case example: 
##  You have a job that processes 1000 images in some way, running the same algorithm 
##  for each image, outputting something for each image, and finally aggregating the
##  results from all images. Up until now, you ran this in a single job that took 1000
##  minutes (let's assume it takes 1 minute to process each image). Using WeizGrid
##  you can take advantage of the fact that the job can be parallelized - instead of 1
##  job, WeizGrid helps you easily split the work into 80 parallel jobs so that you get
##  all 1000 images in about 1000/80=12.5 minutes (!!)
##
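
The 12.5-minute figure is an average. 1000 images do not divide evenly over 80 jobs, so some jobs process 13 images and the slowest job bounds the total. A minimal sketch of that arithmetic, assuming each image takes exactly 1 minute, the images are spread as evenly as possible, and queue wait and startup time are ignored. `wg_estimate_minutes` is a hypothetical name, not part of WeizGrid:

```shell
# Hypothetical helper (not part of WeizGrid): estimated wall-clock minutes
# when `items` one-minute tasks are spread over `jobs` parallel jobs.
wg_estimate_minutes() {
    items="$1"
    jobs="$2"
    # Ceiling division: the busiest job gets ceil(items / jobs) tasks.
    echo $(( (items + jobs - 1) / jobs ))
}

# The use case above: 1000 images over 80 jobs.
wg_estimate_minutes 1000 80
```

With 100 jobs instead of 80 the split is even and the same estimate gives 10 minutes.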


Auto Installation:
==================
1. Copy all the files to ~/WeizGrid
2. In a UNIX terminal, type:
    chmod +x ~/WeizGrid/wgsetup
3. Then type:
    ~/WeizGrid/wgsetup

Usage:
======
1. Create files in the spirit of 'sample.m' and 'calcPrimes.m' (make sure 
	the WG m-files are in MATLAB's path)
2. Upload your files to UNIX
3. Under UNIX, from the directory of YOUR project, run: 
        ~/WeizGrid/qsubWG [name] [your-m-file] [queue]

    Example: ~/WeizGrid/qsubWG ParallelCracker sample all.q

4.  Enjoy!
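
Before submitting, it can help to check that the pieces the submit command relies on are in place. The sketch below is a hypothetical pre-flight check, not part of WeizGrid. It assumes the m-file is passed without the `.m` extension (as in the example above) and that each queue needs its own folder under `~/.matlab/cluster_jobs` (as described under Manual Installation). It only prints the command, so nothing is submitted:

```shell
# Hypothetical pre-flight check (not part of WeizGrid).
# Usage: wg_preflight NAME MFILE QUEUE [JOBS_ROOT]
wg_preflight() {
    if [ "$#" -lt 3 ]; then
        echo "usage: wg_preflight NAME MFILE QUEUE [JOBS_ROOT]" >&2
        return 2
    fi
    name="$1"
    mfile="$2"
    queue="$3"
    jobs_root="${4:-$HOME/.matlab/cluster_jobs}"
    # Assumption: MFILE is given without the .m extension, as in the example.
    if [ ! -f "./$mfile.m" ]; then
        echo "missing $mfile.m in $(pwd)" >&2
        return 1
    fi
    # Assumption: each queue needs its own folder under JOBS_ROOT.
    if [ ! -d "$jobs_root/$queue" ]; then
        echo "missing queue folder $jobs_root/$queue" >&2
        return 1
    fi
    # Print the command instead of running it, so it can be reviewed first.
    echo "~/WeizGrid/qsubWG $name $mfile $queue"
}
```

Run from the project directory, `wg_preflight ParallelCracker sample all.q` prints the submit command if both checks pass, and a short message on stderr otherwise.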





Manual Installation (if auto doesn't work...):
=============================================
1. Copy all files to ~/WeizGrid.
2. Make sure all bash scripts have execute permission.
3. Create the following directory structure (1 folder for each queue you intend to use):
        ~/.matlab/cluster_jobs/all.q
        ~/.matlab/cluster_jobs/test.q
        ...
    NOTE: The ~/.matlab/ directory usually already exists but is hidden, so make sure
            you are viewing hidden files and folders.
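
The permission and directory steps above can be sketched as a small shell function. `wg_manual_setup` is a hypothetical name, not part of WeizGrid. The sketch assumes `wgsetup` and `qsubWG` are the scripts that need the execute bit, since they are the only two this README names; other scripts in the package may need it too:

```shell
# Hypothetical helper (not part of WeizGrid).
# Usage: wg_manual_setup WG_DIR JOBS_ROOT QUEUE...
wg_manual_setup() {
    wg_dir="$1"
    jobs_root="$2"
    shift 2
    # Assumption: wgsetup and qsubWG are the scripts needing execute permission.
    chmod +x "$wg_dir/wgsetup" "$wg_dir/qsubWG" || return 1
    # One folder per queue; -p also creates missing parent directories
    # and does nothing if the folder already exists.
    for queue in "$@"; do
        mkdir -p "$jobs_root/$queue" || return 1
    done
}

# Example call (commented out so that pasting the block changes nothing):
# wg_manual_setup ~/WeizGrid ~/.matlab/cluster_jobs all.q test.q
```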




Tips & Troubleshooting:
=======================
- If your algorithm uses random numbers, make sure you are not setting the RngShuffle
    option to false when invoking WGexec, because this will yield the same random
    series for every "parallel" piece of work.
- Sometimes there will be a bug in the aggregation part of your script, and you
    might think all your work is lost! This is not the case: look into part #2
    of the documentation of WGgetResults for more information.
- There isn't verbose error checking, so if something doesn't work, check the output
    files on your UNIX home for hints... If still unsuccessful, feel free to contact 
    me for help.
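
One way to find those output files is sketched below. It assumes the cluster uses SGE's default naming (`<job_name>.o<job_id>` for stdout, `<job_name>.e<job_id>` for stderr, written to the home directory) and that the [name] given to qsubWG is used as the SGE job name. Neither is confirmed by this README. `wg_list_logs` is a hypothetical name, not part of WeizGrid:

```shell
# Hypothetical helper (not part of WeizGrid): list SGE output files for a job.
# Usage: wg_list_logs JOB_NAME [DIR]
wg_list_logs() {
    job_name="$1"
    dir="${2:-$HOME}"
    # Assumption: SGE default naming, <job_name>.o<job_id> for stdout and
    # <job_name>.e<job_id> for stderr (array tasks append .<task_id>).
    for f in "$dir/$job_name".o* "$dir/$job_name".e*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] && echo "$f"
    done
    return 0
}
```

For example, `wg_list_logs ParallelCracker` prints any matching files in the home directory, and the `.e` files are usually the first place to look.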

weiz-grid's Issues

MATLAB shortcuts

Include MATLAB shortcuts that can spawn jobs and grid jobs seamlessly from the GUI.

Subset of iterations

Add a mechanism for running only a subset of the iterations defined by the localParams struct array. This is especially useful when you want to "redo" some iterations you were not satisfied with.

GlobalParams

Change so that GlobalParams is saved in only one file and not duplicated for each worker - that way it is more efficient to pass big variables. Also, make it possible to pass a list of file names to be loaded - that way if the data you want to pass is already in a file you don't have to duplicate it.

Make a Java client for seamless grid execution

The idea is something of the sort:

wg_begin
  ...
  < matlab statements >
  optionally: WGexec, WGwaitForResults
  ...
wg_end

Implementation could use a Java-based client that uses Putty to push the required files to UNIX, submit the job and somehow synchronously wait for the results.

This completely replaces the need for the MATLAB shortcuts.
