Git Product home page Git Product logo

scinet's Introduction

License Python 3.7, 3.8 Hits

SciNet

Open in gitpod

The rise of distributed cloud computing technologies has been pivotal for the large-scale adoption of Artificial Intelligence (AI) based applications for high fidelity and scalable service delivery. Systematic resource management is central in maintaining optimal Quality of Service (QoS) in cloud platforms and is divided into three fundamental types: resource provisioning, AI model deployment and workload placement. To exploit the synergy among these decision types, it becomes imperative to concurrently design (co-design) the provisioning, deployment and placement decisions for optimal QoS. As users and cloud service providers shift to non-stationary AI-based workloads, frequent decision making imposes severe time constraints on the resource management models. Existing AI-based solutions often optimize decision types independently and tend to ignore the dependencies across various system performance aspects such as energy consumption and CPU utilization, making them perform poorly in large-scale cloud systems. To address this, we propose a novel method, called SciNet, that leverages a co-simulated digital-twin of the infrastructure to capture inter-metric dependencies and accurately estimate QoS scores. To avoid expensive simulation overheads at test time, SciNet trains a neural network based imitation learner that aims to mimic an oracle, which takes optimal decisions based on co-simulated QoS estimates. Offline model training and online decision making based on the imitation learner, enables SciNet to take optimal decisions while being time-efficient. Experiments with real-life AI-based benchmark applications on a public cloud testbed show that SciNet gives up to 48% lower execution cost, 79% higher inference accuracy, 71% lower energy consumption and 56% lower response times compared to the current state-of-the-art methods.

Quick Start Guide

Installation.

# install prerequisites
sudo apt -y update && sudo apt install -y rsync python3-pip
pip3 install --upgrade pip
pip3 install matplotlib scikit-learn
pip3 install -r requirements.txt
export PATH=$PATH:~/.local/bin
sudo chmod 400 keys/id_rsa

# install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash

Or use pre-installed gitpod.

Quick test.

# deploy a single VM and print VM IP
python3 debug/deployazure.py

# test function
http http://<public_ip>:7071/api/onnx @debug/babyyoda.jpg > output.jpg

Running the SciNet model

python3 main.py

Cite this work

Our work is published in IEEE Transactions on Computers. Cite using the following bibtex entry.

@article{tuli2023scinet,
  author={Tuli, Shreshth and Casale, Giuliano and Jennings, Nicholas R.},
  journal={IEEE Transactions on Computers}, 
  title={{SciNet: Co-Design of Resource Management in Cloud Computing Environments}}, 
  year={2023}
}

License

BSD-3-Clause. Copyright (c) 2022, Shreshth Tuli. All rights reserved.

See License file for more details.

scinet's People

Contributors

shreshthtuli avatar

Watchers

 avatar  avatar

Forkers

helloqian12138

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.