htmap's Introduction

HTMap

Jump right into the tutorials with Binder!

HTMap is a library that wraps the process of mapping Python function calls out to an HTCondor pool. It provides tools for submitting, managing, and processing the output of arbitrary functions.

Our goal is to provide as transparent an interface as possible to high-throughput computing resources so that you can spend more time thinking about your own code, and less about how to get it running on a cluster.

For tutorials, installation instructions, and more details, see the documentation.

Please post bug reports and feature requests to our issue tracker.

htmap's People

Contributors

duncanmmacleod, joshkarpel, matyasselmeci, stsievert

htmap's Issues

Implement *_or_recover functions

We should implement a suite of <map_function>_or_recover() functions that either run a new map or recover an existing map with a given map_id. This will save users from having to wrap all of their persistent map calls in try/except blocks themselves.

MapResult job modification shortcuts

It should be possible to inspect and modify jobs from a MapResult just like you can via condor_<x> commands. Common operations (condor_rm <clusterid>, condor_q -held -af <stuff>, condor_qedit etc.) should have convenient shortcuts.

Add CLI

There should be a small set of common tasks available via CLI:

  • remove a map
  • rename a map
  • display the status of all maps
  • display the status of a map

This will help non-notebook users work with pre-built scripts instead of purely in a REPL.
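One possible shape for such a CLI, sketched with the standard argparse module. The program name, subcommand names, and argument names below are assumptions for illustration, not an existing HTMap interface.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per task listed above."""
    parser = argparse.ArgumentParser(prog="htmap")
    sub = parser.add_subparsers(dest="command", required=True)

    remove = sub.add_parser("remove", help="remove a map")
    remove.add_argument("map_id")

    rename = sub.add_parser("rename", help="rename a map")
    rename.add_argument("map_id")
    rename.add_argument("new_map_id")

    # With no map_id, report on all maps; with one, report on that map only.
    status = sub.add_parser("status", help="display the status of maps")
    status.add_argument("map_id", nargs="?", default=None)

    return parser


args = build_parser().parse_args(["rename", "old-name", "new-name"])
print(args.command, args.map_id, args.new_map_id)
```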

Implement caching for transplant delivery

Introspect the package list and compare to what is currently cached. I think we only need to store one Python install at a time - most people doing scientific work probably do what I do and throw everything in one install.
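The comparison step could be sketched as a fingerprint over the package list. This assumes the list is available as "name==version" strings; install_fingerprint and needs_new_install are hypothetical helper names.

```python
import hashlib


def install_fingerprint(packages):
    """Hash a package list so that ordering and case do not matter.

    `packages` is an iterable of 'name==version' strings.
    """
    canonical = "\n".join(sorted(p.strip().lower() for p in packages))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_new_install(cached_fingerprint, packages):
    """True when the cached install does not match the current package list."""
    return cached_fingerprint != install_fingerprint(packages)


current = ["numpy==1.15.0", "cloudpickle==0.5.3"]
cached = install_fingerprint(current)
print(needs_new_install(cached, current))
```

Storing a single fingerprint alongside the cached install matches the idea of keeping only one Python install at a time.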

MapResults should be singletons

I think this just needs to be enforced at the recovery level. Maps would register themselves in a dictionary when they are submitted or recovered, and clear themselves from the dictionary when removed.
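The registry idea can be sketched as follows. RegisteredMap and submit_or_recover are hypothetical stand-ins rather than HTMap's actual MapResult; only the register-on-create and clear-on-remove behavior is illustrated.

```python
class RegisteredMap:
    """Stand-in for a map result that is unique per map_id."""

    _registry = {}  # map_id -> the single live instance for that id

    def __init__(self, map_id):
        self.map_id = map_id

    @classmethod
    def submit_or_recover(cls, map_id):
        """Return the registered instance, creating and registering it if needed."""
        try:
            return cls._registry[map_id]
        except KeyError:
            instance = cls(map_id)
            cls._registry[map_id] = instance
            return instance

    def remove(self):
        """Clear this map from the registry so the id can be reused."""
        self._registry.pop(self.map_id, None)


a = RegisteredMap.submit_or_recover("demo")
b = RegisteredMap.submit_or_recover("demo")
print(a is b)
```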

Print Python info on execute node

We should print a message containing information about the Python executable to stdout on the execute node.

Unfortunately, it looks like sys.executable can be an empty string under certain conditions, which will make this very tricky.
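A small sketch of such a message using only the standard library. It falls back to a placeholder when sys.executable is empty or None; the function name python_info and the message layout are assumptions.

```python
import platform
import sys


def python_info():
    """Describe the running interpreter, tolerating a missing sys.executable."""
    executable = sys.executable or "<unknown: sys.executable is empty>"
    return "\n".join(
        [
            f"Python executable: {executable}",
            f"Python version: {platform.python_version()}",
            f"Implementation: {platform.python_implementation()}",
            f"Prefix: {sys.prefix}",
        ]
    )


print(python_info())
```

When the executable path is unavailable, sys.prefix and the version string still give some indication of which install is running.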

Handling Python dependencies on the execute node

We need to figure out how to get Python to the execute node. Two things to keep in mind:

  1. The run.py script that we run on the execute node has one implicit dependency beyond whatever the user's function needs: cloudpickle.
  2. Some people don't need this at all - they have full control over their cluster, and can install Python on all of their execute nodes. So the system should probably be "pluggable" to some degree.

See https://docs.google.com/document/d/1rB711Qe7xg6gDH6smavRwjjV6c0J8DWgMVUoZjAGtJ0/edit?usp=sharing

Import/Export maps from/to other users

It should be possible to transfer a map to another user.

This is related to #47, which will allow an imported map to be re-run by replacing certain submit variables in a very predictable way.

Logging integration

We should output log messages when stuff happens, using the standard Python logging module.
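A sketch of the usual library-side logging setup: a named logger with a NullHandler attached, so nothing is printed unless the application configures logging. The logger name "htmap" and the submit_map function are assumptions for illustration.

```python
import logging

# Library code attaches a NullHandler and leaves real configuration to the user.
logger = logging.getLogger("htmap")
logger.addHandler(logging.NullHandler())


def submit_map(map_id, n_inputs):
    """Hypothetical operation that logs what it is doing."""
    logger.info("submitting map %s with %d inputs", map_id, n_inputs)


if __name__ == "__main__":
    # Application code opts in to seeing the messages.
    logging.basicConfig(level=logging.INFO)
    submit_map("demo", 3)
```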

License

This project should probably have a license.

Occasional unpickling error when iterating over output

Occasionally we try to unpickle an object from a file that HTCondor is still trying to transfer. This tends to throw weird pickling errors.

We should either handle the error gracefully, or use the event tracker to avoid trying to load partially-transferred files.
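The first option (handling the error gracefully) might look like a bounded retry loop. This sketch assumes a partially transferred file shows up as a missing file, an EOFError, or a pickle.UnpicklingError; other failure modes are possible. The name load_when_ready and its defaults are hypothetical.

```python
import pickle
import time
from pathlib import Path


def load_when_ready(path, attempts=5, delay=0.5):
    """Unpickle `path`, retrying while the file looks incomplete."""
    last_error = None
    for _ in range(attempts):
        try:
            with Path(path).open("rb") as f:
                return pickle.load(f)
        except (FileNotFoundError, EOFError, pickle.UnpicklingError) as error:
            last_error = error
            time.sleep(delay)
    raise TimeoutError(
        f"{path} was not loadable after {attempts} attempts"
    ) from last_error
```

Using the event tracker to wait until the transfer has finished would avoid the polling entirely; the retry loop is only a fallback.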

Batch inputs within a single map

There's significant overhead in running very short jobs, so there should be a way to batch up multiple inputs into a single job. The user just sets the number of inputs per batch, and we handle grouping them up and un-grouping the output.

This should just involve adding one more layer of iterables on the execute and output points.
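The extra layer of iterables can be sketched with three small helpers: one groups inputs on the submit side, one runs a whole batch on the execute side, and one flattens the outputs again. The helper names are hypothetical.

```python
from itertools import islice


def batched(inputs, batch_size):
    """Yield lists of up to batch_size consecutive inputs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(inputs)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_batch(func, batch):
    """What a single job would do: apply func to every input in its batch."""
    return [func(x) for x in batch]


def unbatch(batch_outputs):
    """Flatten per-batch output lists back into one stream, preserving order."""
    for outputs in batch_outputs:
        yield from outputs


batches = list(batched(range(5), 2))
outputs = list(unbatch(run_batch(lambda x: x * 2, b) for b in batches))
print(batches, outputs)
```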

Unexpected output file when exception occurs during execute

This:

import htmap

@htmap.htmap
def zero(x):
    return 1 / x

results = zero.map('zero', range(1), force_overwrite = True)
list(results)

produces an output file despite getting a ZeroDivisionError during execute.

Expected behavior: no output file.
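One way to get the expected behavior, sketched independently of how HTMap's run script actually works: call the function first, and only then write the result to a temporary file that is renamed into place. The name run_and_save is hypothetical.

```python
import os
import pickle
import tempfile
from pathlib import Path


def run_and_save(func, arg, output_path):
    """Write func(arg) to output_path only if the call succeeds."""
    result = func(arg)  # if this raises, no file is created below

    output_path = Path(output_path)
    fd, tmp_name = tempfile.mkstemp(dir=output_path.parent)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(result, f)
        os.replace(tmp_name, output_path)  # move the finished file into place
    except BaseException:
        # Do not leave a partial temporary file behind.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return output_path
```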

Provide simpler way of passing custom descriptors to MapOptions

Current plan:

MapOptions(
    request_memory = '10MB',
    custom_options = {'WantFlocking': 'true'}
)

We will coerce all of the keys in custom_options to have a leading +, so that they can be entered with

  • a leading +
  • a leading MY. (case-insensitive)
  • no + or MY. (preferred!)
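The coercion described above could be sketched like this. The function name normalize_custom_options is hypothetical; only the three accepted key forms come from the plan.

```python
def normalize_custom_options(custom_options):
    """Return a copy of custom_options in which every key has a leading '+'."""
    normalized = {}
    for key, value in custom_options.items():
        name = key.strip()
        if name.startswith("+"):
            name = name[1:]
        elif name[:3].lower() == "my.":  # 'MY.' prefix, case-insensitive
            name = name[3:]
        normalized["+" + name] = value
    return normalized


print(normalize_custom_options({"WantFlocking": "true", "MY.Foo": "1"}))
```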

Query caching

Queries against HTCondor daemons should be cached for a brief period of time to prevent spamming them.
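A minimal time-to-live cache that would serve this purpose. The class name TTLCache, its interface, and the injectable clock are assumptions; the real query would be passed in as the compute callable.

```python
import time


class TTLCache:
    """Cache values per key and recompute them once they are older than ttl seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (timestamp, value)

    def get(self, key, compute):
        """Return the cached value for key, calling compute() if it is missing or stale."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()
        self._entries[key] = (now, value)
        return value


cache = TTLCache(ttl=5.0)
status = cache.get("cluster-status", lambda: "result of an expensive query")
```

Repeated calls within the TTL return the stored value without contacting the daemon again.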

Behavior of MapResult after calling remove is unclear

It's not obvious what a MapResult should do after it's been removed. Right now, things will just fail with vague plain-Python exceptions.

We can't delete the MapResult ourselves. Marking it as "active" gives it a lot of state that could be hard to manage, and means that lots of methods would need an "am I active?" check at the top that could get ugly.

Map options

It should be possible to pass in submit descriptors ("map options") at the global settings, per-mapped-function, and individual map level.
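The three levels could be merged with collections.ChainMap. The precedence used here (individual map over per-function over global settings) is an assumption, as is the name resolve_options.

```python
from collections import ChainMap


def resolve_options(global_settings, function_options, map_options):
    """Merge three layers of submit descriptors; earlier layers in the chain win."""
    return dict(ChainMap(map_options, function_options, global_settings))


resolved = resolve_options(
    {"request_memory": "100MB", "request_disk": "1GB"},  # global settings
    {"request_memory": "500MB"},                         # per mapped function
    {"request_disk": "5GB"},                             # individual map
)
print(resolved)
```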
