
conductor's Introduction

🎶 Conductor

Conductor is a simple and elegant tool for orchestrating your research computing. It automates your research computing pipeline, all the way from running experiments to producing the figures in your paper.

Note: Conductor is still under active development. Its usage and system requirements are subject to change between versions. Conductor uses semantic versioning. Before the 1.0.0 release, backward compatibility between minor versions will not be guaranteed.


Installation

Conductor requires Python 3.10+ and is currently only supported on macOS and Linux machines. It has been tested on macOS 10.14 and Ubuntu 20.04.

Conductor is available on PyPI, so it can be installed using pip.

pip install conductor-cli

After installation, the cond executable should be available in your shell.

cond --help

Note that if you install Conductor locally on a Linux machine (e.g., using pip install --user conductor-cli), you may need to add $HOME/.local/bin to your $PATH to get access to the cond executable in your shell.

Documentation and Getting Started

A quick way to get started is to look at the example projects under the examples directory. For more details, please check out Conductor's reference documentation.

Acknowledgements

Conductor's interface was largely inspired by Bazel and Buck.


conductor's Issues

Experiment versioning and tracking

We need to store all experiment data and keep track of it by git hash and the time at which it was generated. We will also want to include machine specifications. cond archive will be used to save and restore these versions.
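
As a rough sketch of the kind of metadata each archived version could carry (the field names here are assumptions, not Conductor's actual format), the Python standard library already provides the timestamp and machine details:

import platform
from datetime import datetime, timezone

# Illustrative version-index entry; field names are assumptions.
version_record = {
    "commit_hash": None,  # to be filled in by the Git integration
    "generated_at": datetime.now(timezone.utc).isoformat(),
    "machine": {
        "system": platform.system(),      # e.g., "Linux"
        "release": platform.release(),
        "processor": platform.processor(),
    },
}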

Run as much of a task as possible when there are failures

Right now, Conductor aborts a task the moment one of its dependencies fails. By default, we should instead try to run as much of the task as possible (i.e., run the other tasks in the task's transitive dependency closure that do not rely on outputs of the failed task). We can also add a command line option to restore the existing "fail early" behavior.
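
To make this concrete, consider a hypothetical experiments/COND file like the one below (the task names, scripts, and the assumption that run_command() accepts deps= are all illustrative). If exp_a fails, exp_b should still run; only plot, which needs both, would be skipped:

run_experiment(
  name="exp_a",
  run="./run_exp_a.sh",
)

run_experiment(
  name="exp_b",
  run="./run_exp_b.sh",
)

run_command(
  name="plot",
  run="python3 plot_results.py",
  deps=["//experiments:exp_a", "//experiments:exp_b"],
)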

Add progress indicator

Some tasks have lots of dependencies and take a long time to run. It would be nice to have a progress indicator to indicate how many subtasks remain.

Record task output (stdout and stderr)

We should record the task's output (stdout and stderr) automatically and save it with the rest of the task outputs. We should definitely do this for run_experiment() tasks (maybe run_command() too).
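
A minimal sketch of the idea, assuming the task is launched via subprocess and noting that the log file names and locations below are only placeholders:

import subprocess

# Redirect the task's stdout and stderr into files stored alongside
# the task's other outputs (file and script names are illustrative).
with open("stdout.log", "wb") as out, open("stderr.log", "wb") as err:
    subprocess.run(["./run_benchmark.sh"], stdout=out, stderr=err, check=True)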

Git integration to record experiment commit hashes

We should integrate with Git to record experiment commit hashes. This integration will also be useful down the road for other features related to a repository's state. The Git integration should degrade gracefully if the user is not using Git in their project (see the sketch after the checklist below).

  • Python utility class for interfacing with git (#43)
  • Modify version index persistent format to support commit hash tracking
  • Configuration option to disable git tracking (#46)
  • Graceful fallback when the Conductor project is not tracked using git (#45)
  • Record commit hash when running an experiment (#45)
  • Experiment version selection using closest commit ancestor (#45)
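
A minimal sketch of the graceful-fallback behavior, assuming the hash is retrieved by shelling out to git rev-parse (not necessarily how the utility class in #43 is implemented):

import subprocess

def current_commit_hash():
    # Returns the current commit hash, or None when the project is not
    # inside a Git repository or git is not installed.
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()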

Add options and args support to run_command() tasks

Right now, only run_experiment() tasks support the options={...} and args=[...] arguments. Having them available on run_command() too would be nice when the command we want to run requires many arguments (i.e., the run_command() task would be easier to read).
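
A hypothetical example of what this could look like, mirroring the existing run_experiment() interface (the task, script, and option names below are made up for illustration):

run_command(
  name="preprocess",
  run="python3 preprocess.py",
  args=["raw_data.csv"],
  options={
    "batch-size": "64",
    "output-format": "parquet",
  },
)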

Add color to informational output

Conductor now outputs a lot of status information during execution (in part due to #58). Using colors for this output would help better distinguish between the task's output and Conductor's output.

Add failed experiments garbage collection

We should add a utility command to Conductor that removes any experiment task directories that are not in the version index (i.e., this would let us garbage collect any failed experiment task directories under cond-out).

Functionality needed for the first feature release

Required

  • Declaring a task instance
  • Adding a "runnable" task (i.e. run something)
  • Declaring task dependencies
  • Processing an execution plan with task instances and dependencies
  • Ability to run a complete execution plan
  • cond clean functionality
  • cond archive functionality
    • Experiment versioning by git hash and date time (#7)
    • Timestamp based identifiers (handle deduplication too)
  • cond restore
  • Release scripts
  • Tests (#6)
  • "Combine" pseudo-task

Nice to have

  • CI integration for tests, formatting, and type checking
  • Relative task identifiers for deps (#5)
  • Consider renaming .condconfig to a non-dotfile? (e.g., cond_config.toml)

Moved to next version

  • Task aliases in cond_config.toml

Add ability to "test run" a COND file to resolve definition errors

It's useful to be able to validate that the tasks in a COND file were defined correctly, especially since we use Python to programmatically create tasks. The check should not actually run any of the tasks defined in the file.

This is useful because the best practice right now is to commit the COND file changes first and then run. But if there is an error, we end up making many small commits while fixing up the file.
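
One rough way to approximate such a check, assuming COND files are plain Python and that the task constructors can be swapped for stubs that only record their keyword arguments (this is a sketch, not Conductor's actual loading mechanism):

# Dry-parse a COND file: syntax errors and bad names surface
# immediately, but no task is actually executed.
definitions = []

def _record_task(**kwargs):
    definitions.append(kwargs)

stub_globals = {"run_experiment": _record_task, "run_command": _record_task}
with open("COND") as cond_file:
    exec(compile(cond_file.read(), "COND", "exec"), stub_globals)

print(f"Parsed {len(definitions)} task definitions")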

Add support for implied task names on the command line

If a task has the same name as its parent directory, it would be convenient if we could refer to it on the command line (when running cond run) without the trailing task name (i.e., using just the path portion of the task identifier).

For example, task world defined in hello/world/COND could be referred to as //hello/world instead of its fully qualified name, //hello/world:world. This would only apply to the CLI, not to dependencies in COND files.

Run tasks again for current commit

Now that we have git integration, it would be useful to be able to run a task again for the current commit. The motivation is that using --again force-reruns the task and its dependencies. But if a previous --again run only partially succeeded, it's not straightforward to retry the run without rerunning the dependent tasks that already completed successfully.

Allow providing extra arguments in the task definition

It would help to be able to declare arguments that should be passed to the executable of a run_experiment() or run_command() task as part of the task definition in the COND file. Certain experiments may require many arguments, and it is tedious to either (i) create a separate script for each instance, or (ii) tack them onto the run argument (making it really long).

For example

run_experiment(
  name="benchmark",
  run="./run_benchmark.sh",
  args={
    "memory": "2G",
    "threads": "3",
  },
)

would translate to running ./run_benchmark.sh --memory=2G --threads=3.

Initial execution planning

Conductor should be able to parse the dependency graph to create an execution plan and detect any errors in the graph (e.g., dependency cycles). Conductor should then be able to execute the plan. (A rough sketch of the planning step follows the checklist below.)

  • Assemble dependencies into a graph
  • Validate graph and generate an execution plan
  • Execute the execution plan
    • Set up input/output paths properly
    • Detect re-execution of cached results
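
For reference, a sketch of the planning step using Kahn's algorithm, where deps maps each task identifier to the identifiers it depends on. This is a generic illustration, not Conductor's actual planner:

from collections import deque

def plan_execution(deps):
    # Returns the tasks in a valid execution order (dependencies first),
    # or raises if the dependency graph contains a cycle.
    indegree = {task: len(task_deps) for task, task_deps in deps.items()}
    dependents = {task: [] for task in deps}
    for task, task_deps in deps.items():
        for dep in task_deps:
            dependents[dep].append(task)
    ready = deque(task for task, count in indegree.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for dependent in dependents[task]:
            indegree[dependent] -= 1
            if indegree[dependent] == 0:
                ready.append(dependent)
    if len(order) != len(deps):
        raise RuntimeError("Dependency cycle detected in the task graph")
    return order

# Example: the plot task depends on the benchmark experiment.
print(plan_execution({
    "//figures:plot": ["//experiments:benchmark"],
    "//experiments:benchmark": [],
}))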

Investigate improvements to task type parsing code

The way we currently implement task types is not ideal (a bit too hacky), especially with our need to separately compile the Conductor "standard library" task types. The code can probably be refactored and improved.

Help users with generating experiment instances through parameter sweeps

Often one needs to generate a parameter sweep, i.e., several run_experiment() instances where certain parameters are varied. We should help users with this and provide a nice way to specify a "configuration", where that configuration can then also be stored with the experiment task's output.
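
A sketch of what a sweep could look like inside a COND file today, given that COND files are Python; the parameter names and option values below are illustrative:

import itertools

# Generate one run_experiment() task per parameter combination.
for memory, threads in itertools.product(["1G", "2G"], [1, 4]):
  run_experiment(
    name=f"benchmark-{memory}-{threads}",
    run="./run_benchmark.sh",
    options={
      "memory": memory,
      "threads": str(threads),
    },
  )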

Add task timeouts

Specifying a timeout for a task can be useful if there is a risk of the task running for longer than expected.
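
A hypothetical way this could be expressed in a COND file (the timeout parameter does not exist yet and its name is only illustrative):

run_experiment(
  name="benchmark",
  run="./run_benchmark.sh",
  timeout_minutes=30,  # hypothetical parameter; expiry would count as a task failure
)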

Set up testing infrastructure

This project should be well tested. We need to set up tests for our Python code.

  • Set up initial testing infrastructure
  • Add a reasonable number of tests for the core parts of Conductor
  • Set up a GitHub Action workflow that runs the tests

Ignore identical versions when restoring

Sometimes we may want to restore results multiple times (to include new results). It would be nice to ignore any cached results that have an identical version to existing cached results.

Improve output format when running tasks in parallel

Conductor currently prints blank lines between progress messages. This works fine when tasks are executed sequentially. But when we launch tasks in parallel, the spacing between messages is less useful (and can be confusing).

Add ability to parallelize tasks

Some tasks may be parallelizable, in the sense that they are "safe" to run concurrently with other tasks (e.g., they are not performance sensitive). It would be useful if the user could specify this when defining their tasks (see the sketch after the checklist below); Conductor should then launch parallelizable tasks concurrently on behalf of the user.

  • Refactor task execution to enable parallel execution (#47)
  • Implement parallel execution (#48)
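
A hypothetical task definition with such a flag (the parameter name is made up for illustration):

run_command(
  name="analyze",
  run="python3 analyze_results.py",
  parallelizable=True,  # hypothetical: safe to run concurrently with other tasks
)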

Add command to help locate experiment results

Experiment results are stored in directories with a timestamp for uniqueness. However, this makes it a bit cumbersome to find the latest results for a particular task. We should provide a way for users to easily get this information (e.g., a command that prints the path to a task's latest outputs).
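
As a stopgap, something along these lines can find the most recent output directory, assuming each run of a task gets its own timestamped directory under cond-out and that the timestamps sort lexicographically (the path below is purely illustrative, not Conductor's actual layout):

from pathlib import Path

task_dir = Path("cond-out/experiments/benchmark")  # illustrative layout
latest = max(p for p in task_dir.iterdir() if p.is_dir())
print(latest)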

Launching cmake from run_command() does not seem to work

Running cmake from run_command() seems to result in an error:

CMake Error: Could not find CMAKE_ROOT !!!
CMake has most likely not been installed correctly.
Modules directory not found in

CMake Error: Error executing cmake::LoadCache(). Aborting.

Maybe this is due to PATH problems or some other environment misconfiguration.
