PipeEdge

PipeEdge is an inference framework that pipelines neural network (e.g., transformer) model shards on distributed devices. It includes an automatic partition scheduler which maps model layers to devices to optimize throughput.

Prerequisites

System dependencies:

  • Python >= 3.7
  • Compiler with C++17 support
  • CMake >= 3.8 (for C++17 support)
  • yaml-cpp >= 0.6.0

On macOS:

brew install cmake yaml-cpp

On Debian (>= buster) or Debian-based Linux (including Ubuntu >= 20.04):

sudo apt-get install build-essential cmake libyaml-cpp-dev

We recommend using a Python virtual environment (virtualenv), e.g., on Debian-based Linux:

sudo apt-get install python3-venv

or directly with a system-installed pip:

pip3 install virtualenv

Create and activate the virtualenv:

python3 -m venv .venv
. .venv/bin/activate

Install the development package, Python package dependencies, and runtime application dependencies with:

pip install -U pip
pip install -e '.[runtime]'

Download model weight files (ViT files are from Google Cloud):

python save_model_weights.py

Optional dependencies:

System dependencies required for runtime monitoring:

  • EnergyMon - with a system-appropriate "default" library (which may have transitive dependencies)

Usage

For full usage help, run:

python runtime.py -h

To run with default parameters (using ViT-Base) on a single node:

python runtime.py 0 1

To run on multiple nodes, e.g., with 2 stages and even partitioning, on rank 0:

python runtime.py 0 2 -pt 1,24,25,48

and on rank 1:

python runtime.py 1 2 -pt 1,24,25,48

Partitioning

A partition is specified as a flat list of [start, end] layer-index pairs, one pair per pipeline stage. Each transformer layer maps to 4 shard layers, so an L-layer model spans layer indices [1, 4*L]. For example, the ViT-Base model has 12 layers, so the range is [1, 12*4] = [1, 48].

An even partitioning for 2 nodes is:

partition = [1,24,25,48]

An uneven partitioning for 2 nodes could be:

partition = [1,47,48,48]

A partitioning for 4 nodes could be:

partition = [1,4,5,8,9,20,21,48]
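
For convenience, an even partition can also be generated programmatically. The sketch below is a hypothetical helper (not part of PipeEdge) that builds a partition list in the flat start/end-pair format expected by -pt/--partition, assuming the shard-layer count divides evenly across stages:

def even_partition(num_layers, num_stages):
    """Build an even partition list in the -pt/--partition format.

    Illustrative only: each transformer layer maps to 4 shard layers,
    so an L-layer model spans shard-layer indices [1, 4 * num_layers].
    """
    total = 4 * num_layers
    per_stage = total // num_stages  # assumes an even split
    bounds = []
    start = 1
    for _ in range(num_stages):
        end = start + per_stage - 1
        bounds.extend([start, end])  # one [start, end] pair per stage
        start = end + 1
    return bounds

print(even_partition(12, 2))  # [1, 24, 25, 48] -- the ViT-Base example above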

Automatic Partition Scheduling

The sched-pipeline scheduling application uses three input YAML files to map model partitions to devices (hosts). Automated profiling helps produce two of these files; the third lists available hosts and is straightforward to create for your deployment environment. For detailed instructions and documentation, see README_Profiler.md and README_Scheduler.md.

Point runtime.py to the YAML files using the options -sm/--sched-models-file, -sdt/--sched-dev-types-file, and -sd/--sched-dev-file. The runtime passes these through to the previously compiled scheduler application, along with other configuration such as the model name and microbatch size. Then map the hosts specified in the third YAML file to the distributed ranks using the -H/--hosts option. Do not specify the -pt/--partition option, which manually specifies the schedule and takes precedence over automatic scheduling.
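
For example, rank 0 of a 2-node deployment using automatic scheduling might be launched as follows (the YAML file names and host names are placeholders for your own deployment; see python runtime.py -h for the exact -H/--hosts format):

python runtime.py 0 2 --sched-models-file models.yml --sched-dev-types-file device-types.yml --sched-dev-file devices.yml -H host0,host1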

Datasets

GLUE CoLA

Supported by the following models:

  • textattack/bert-base-uncased-CoLA

The dataset will be automatically downloaded from huggingface.co.

Use in runtime.py with the option(s): --dataset-name=CoLA
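
A possible single-node invocation is sketched below; it assumes the BERT model is selected with the runtime's model-name option (check python runtime.py -h for the exact flag):

python runtime.py 0 1 --model-name textattack/bert-base-uncased-CoLA --dataset-name=CoLA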

ImageNet

Supported by the following models:

  • google/vit-base-patch16-224
  • google/vit-large-patch16-224
  • facebook/deit-base-distilled-patch16-224
  • facebook/deit-small-distilled-patch16-224
  • facebook/deit-tiny-distilled-patch16-224

This dataset cannot be downloaded automatically because a login is required to access the files. Register with image-net.org, then download ILSVRC2012_devkit_t12.tar.gz and at least one of ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar. Place the files in their own directory. The archives will be automatically parsed and extracted into a usable folder structure within the same directory.

Use in runtime.py with the option(s): --dataset-name=ImageNet --dataset-root=/path/to/archive_dir
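
For example, a single-node run with the default ViT-Base model could use:

python runtime.py 0 1 --dataset-name=ImageNet --dataset-root=/path/to/archive_dir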

Citation

If using this software for scientific research or publications, please cite as:

Yang Hu, Connor Imes, Xuanang Zhao, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul Walters, "PipeEdge: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices," 2022 25th Euromicro Conference on Digital System Design (DSD), 2022, pp. 298-307, doi: 10.1109/DSD57027.2022.00048.

@INPROCEEDINGS{PipeEdge,
  author={Hu, Yang and Imes, Connor and Zhao, Xuanang and Kundu, Souvik and Beerel, Peter A. and Crago, Stephen P. and Walters, John Paul},
  booktitle={2022 25th Euromicro Conference on Digital System Design (DSD)},
  title={{PipeEdge}: Pipeline Parallelism for Large-Scale Model Inference on Heterogeneous Edge Devices},
  year={2022},
  pages={298-307},
  doi={10.1109/DSD57027.2022.00048}}
