
Dask EC2

Easily launch a cluster on Amazon EC2 configured with dask.distributed, Jupyter Notebooks, and Anaconda.

DEPRECATED

This project is not actively maintained. Instead, to deploy Dask on EC2 we recommend the use of Kubernetes. See dask.pydata.org/en/latest/setup/cloud.html for up-to-date information.

Installation

You can install dask-ec2 using pip:

$ pip install dask-ec2

You can also install dask-ec2 and its dependencies from the conda-forge repository using conda:

$ conda install dask-ec2 -c conda-forge

Usage

Note: dask-ec2 uses boto3 to interact with Amazon EC2. You can configure your AWS credentials using Environment Variables or Configuration Files.
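Before launching a cluster it can help to confirm that credentials are actually visible. The following is a minimal sketch using only the standard library; it is not part of dask-ec2. It assumes the two most common sources that boto3 reads: the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables and the shared credentials file at ~/.aws/credentials (or the path in AWS_SHARED_CREDENTIALS_FILE). boto3 supports other sources as well, which this check ignores. The function name find_aws_credentials_source is hypothetical.

```python
import os
from pathlib import Path


def find_aws_credentials_source(env=None, home=None):
    """Return a label for where AWS credentials appear to be configured, or None.

    Hypothetical helper: it only checks environment variables and the shared
    credentials file, not every source that boto3 can use.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Environment variables: both the key ID and the secret key must be set.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"

    # Shared credentials file; AWS_SHARED_CREDENTIALS_FILE overrides the default path.
    default_file = home / ".aws" / "credentials"
    credentials_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", default_file))
    if credentials_file.is_file():
        return f"credentials file ({credentials_file})"

    return None


if __name__ == "__main__":
    source = find_aws_credentials_source()
    print(source or "No AWS credentials found in the environment or credentials file")
```

A result of None does not prove that boto3 will fail, since it may still find credentials elsewhere (for example an instance role), but it is a quick first check.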

The dask-ec2 up command can be used to create and provision a cluster on Amazon EC2:

$ dask-ec2 up --help
Usage: dask-ec2 up [OPTIONS]

Options:
  --keyname TEXT                Keyname on EC2 console  [required]
  --keypair PATH                Path to the keypair that matches the keyname
                                [required]
  --name TEXT                   Tag name on EC2
  --tags TEXT                   Additional EC2 tags.  Comma separated K:V
                                pairs: K1:V1,K2:V2
  --region-name TEXT            AWS region  [default: us-east-1]
  --vpc-id TEXT                 EC2 VPC ID
  --subnet-id TEXT              EC2 Subnet ID on the VPC
  --iaminstance-name TEXT       IAM Instance Name
  --ami TEXT                    EC2 AMI  [default: ami-d05e75b8]
  --username TEXT               User to SSH to the AMI  [default: ubuntu]
  --type TEXT                   EC2 Instance Type  [default: m3.2xlarge]
  --count INTEGER               Number of nodes  [default: 4]
  --security-group TEXT         Security Group Name  [default: dask-ec2-default]
  --security-group-id TEXT      Security Group ID (overwrites Security Group
                                Name)
  --volume-type TEXT            Root volume type  [default: gp2]
  --volume-size INTEGER         Root volume size (GB)  [default: 500]
  --file PATH                   File to save the metadata  [default:
                                cluster.yaml]
  --provision / --no-provision  Provision salt on the nodes  [default: True]
  --anaconda / --no-anaconda    Bootstrap anaconda  [default: True]
  --dask / --no-dask            Install Dask.Distributed in the cluster
                                [default: True]
  --notebook / --no-notebook    Start a Jupyter Notebook in the head node
                                [default: True]
  --nprocs INTEGER              Number of processes per worker  [default: 1]
  --source / --no-source        Install Dask/Distributed from git master
                                [default: False]
  -h, --help                    Show this message and exit.
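When launching clusters from a script, the options above can be assembled programmatically. The sketch below builds an argument list for dask-ec2 up using only flags shown in the help output, and renders a dict of tags in the K1:V1,K2:V2 form that --tags describes. The helper names format_tags and build_up_command are hypothetical, and rejecting tag keys or values that contain "," or ":" is an assumption made here because those characters are the separators; how dask-ec2 itself treats such values is not confirmed.

```python
import os
import shlex


def format_tags(tags):
    """Render a dict as the comma-separated K:V string described for --tags."""
    for key, value in tags.items():
        for text in (str(key), str(value)):
            # Assumption: separators inside a key or value would be ambiguous.
            if "," in text or ":" in text:
                raise ValueError(f"tag part {text!r} contains ',' or ':'")
    return ",".join(f"{key}:{value}" for key, value in tags.items())


def build_up_command(keyname, keypair, count=4, instance_type="m3.2xlarge",
                     region_name="us-east-1", tags=None):
    """Build the argv list for `dask-ec2 up` (defaults mirror the help text)."""
    argv = [
        "dask-ec2", "up",
        "--keyname", keyname,
        # Expand "~" here because no shell will do it when argv is passed directly.
        "--keypair", os.path.expanduser(keypair),
        "--count", str(count),
        "--type", instance_type,
        "--region-name", region_name,
    ]
    if tags:
        argv += ["--tags", format_tags(tags)]
    return argv


if __name__ == "__main__":
    argv = build_up_command(
        "my_aws_key", "~/.ssh/my_aws_key.pem",
        count=8, tags={"project": "demo", "owner": "me"},
    )
    # Print the command for review; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Building a list rather than a single string avoids shell quoting problems if the command is later run with subprocess.run(argv).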

The minimal required arguments for the dask-ec2 up command are:

$ dask-ec2 up --keyname my_aws_key --keypair ~/.ssh/my_aws_key.pem

This creates a cluster.yaml file in the directory where the command was run. The other commands in the CLI require this file.
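Because the other commands depend on that file, a wrapper script may want to fail early with a clear message when it is missing. This is a minimal sketch, not part of dask-ec2; the helper name require_cluster_file is hypothetical, and it only checks that the file exists without reading its contents.

```python
from pathlib import Path


def require_cluster_file(directory=".", filename="cluster.yaml"):
    """Return the path to the cluster metadata file, or raise if it is missing."""
    path = Path(directory) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run 'dask-ec2 up' in this directory first"
        )
    return path


if __name__ == "__main__":
    try:
        print(f"Using cluster metadata from {require_cluster_file()}")
    except FileNotFoundError as error:
        print(error)
```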

The dask-ec2 command provides subcommands to create or destroy a cluster, SSH into its nodes, and run other provisioning tasks:

$ dask-ec2
Usage: dask-ec2 [OPTIONS] COMMAND [ARGS]...

Options:
  --version   Show the version and exit.
  -h, --help  Show this message and exit.

Commands:
  anaconda          Provision anaconda
  dask-distributed  dask.distributed option
  destroy           Destroy cluster
  notebook          Provision the Jupyter notebook
  provision         Provision salt instances
  ssh               SSH to one of the node. 0-index
  up                Launch instances

Contributors

arokem, danielfrg, jalessio, jbcrail, koverholt, mrocklin, mrphilroth, quasiben, stumitchell
