Modeling System Resource Usage for Predictive Scheduling

A machine learning-based solution for real-time resource allocation in the cloud

Consulting Project

Project Summary

I consulted for a project that used machine learning and neural networks to model and predict resource usage. The company offers startups and new developers a simple yet revolutionary platform to manage their applications and cloud resources in one place. It also serves as a marketplace for additional applications and services, as well as an easy way to work with multiple APIs in one location.

As a consultant, I first wanted to understand Company’s needs so that I could help them reach their goals. Company wanted a clearer picture of its current network resource usage and provisioning. In addition, Company wanted to move towards a machine learning approach to make real-time usage predictions and automate the scheduling of provisions. This was especially important because they were spending valuable hours manually setting limits, and keeping those limits continuously high to minimize downtime (i.e., system crashes). Historically, companies like this one have had to balance two competing needs: minimizing the cost of paying for CPU bandwidth, for example, and minimizing downtime by slightly over-provisioning. This trade-off was evident even in the publicly available dataset that I used for my analysis in place of Company’s own data because of an NDA. Therefore, my role as a consultant was to develop a model that would provide a more intelligent prediction of resource usage.

Here is how I translated Company’s business objectives into an actionable deliverable.

First, I used time series analysis, advanced regression techniques, and time series cross-validation in Python (using sklearn) to characterize resource usage and to identify important predictive features.
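To illustrate the idea behind time series cross-validation, here is a minimal, dependency-free sketch of expanding-window splitting. It is not the project's code: the function name `expanding_window_splits` and its parameters are hypothetical, and sklearn provides `TimeSeriesSplit` in `sklearn.model_selection` for the same purpose. The point is that every fold trains only on observations that come before its test window.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on everything before its test window, so the model
    is never scored on data that precedes what it was trained on.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Hypothetical example: 10 observations, 3 folds, 2 test points per fold.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

Unlike shuffled k-fold, the training window only grows forward in time, which is what keeps future values from leaking into the model being evaluated.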

Next, I implemented DeepAR, a recently developed built-in algorithm from Amazon SageMaker (hosted on AWS), to help Company shift towards real-time analytics. Amazon SageMaker DeepAR is a supervised learning algorithm that forecasts time series using recurrent neural networks (RNNs).
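DeepAR reads its training data as JSON Lines, one time series per line, with a `start` timestamp and a `target` array of values. The sketch below shows one way to produce that format using only the standard library. The helper names and the CPU usage numbers are made up for illustration and are not taken from the project notebooks.

```python
import json
from datetime import datetime


def to_deepar_record(start, values):
    """Build one DeepAR record: a start timestamp plus the observed values."""
    return {
        "start": start.strftime("%Y-%m-%d %H:%M:%S"),
        "target": [float(v) for v in values],
    }


def to_json_lines(series):
    """Serialize (start, values) pairs as JSON Lines, one series per line."""
    return "\n".join(json.dumps(to_deepar_record(s, v)) for s, v in series)


# Hypothetical hourly CPU usage for two VMs (illustrative values only).
cpu_by_vm = [
    (datetime(2013, 8, 1, 0, 0, 0), [12.5, 14.0, 13.2]),
    (datetime(2013, 8, 1, 0, 0, 0), [40.1, 38.7, 41.3]),
]
payload = to_json_lines(cpu_by_vm)
print(payload)
```

The resulting string can then be written to a file and uploaded to an S3 bucket for training; that upload step is left out here because it depends on account-specific settings.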

I had not used Amazon SageMaker before, but the agreement required me to use it.

Overview of my analysis pipeline:

[Figure: analysis_pipeline]

Summary of the deliverables I provided to Company:

[Figure: summary_deliverables]

Notebook Organization

  1. Manifold_TimeSeries_Models.py: Models 500 time series using sklearn. Techniques include feature engineering, rolling window averages (smoothing), linear regression, scaled regression, and lasso and ridge regression to perform feature selection and reduce overfitting.

  2. Manifold_AWS_DeepAR.py: AWS Jupyter notebook for SageMaker DeepAR. Includes code to download and read in data, format it into JSON strings, push it to an S3 bucket, create and train a recurrent neural network (RNN), and visualize model predictions.

  3. Manifold_Visualize_Initial_Explore.py.ipynb: Includes some initial visualizations of the data (aggregated and resampled hourly) using Python and matplotlib.

  4. Timeseries_FirstLook_1month.py.ipynb: Contains code for exploring time series from 1 month and 100 VMs. Includes initial models using ARIMA, SARIMAX, and Holt-Winters (smoothing), some visualizations, and stationarity tests.

  5. HyperparameterTuning_DeepAR_Example.ipynb: Example code for a hyperparameter tuning request in AWS SageMaker.
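As a small illustration of the rolling window averages (smoothing) mentioned in the first notebook, here is a dependency-free sketch of a trailing moving average. The function name `rolling_mean` and the sample values are hypothetical. The notebooks themselves may use a library such as pandas for this step.

```python
def rolling_mean(values, window):
    """Trailing moving average: one output per position with a full window."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical usage readings; a window of 2 averages each value with its predecessor.
smoothed = rolling_mean([10, 20, 30, 40], window=2)
print(smoothed)
```

A larger window smooths out more short-lived spikes but also reacts more slowly to real changes in load, so the window size is a choice to tune against the data.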

Helpful Resources

  1. Data publicly available from The Grid Workloads Archive (Bitbrains; I used the rnd traces)

  2. AWS SageMaker documentation

  3. Helpful blog and source of some code here

  4. ARIMA description

Contributors

shubhamtiwari10
