Git Product home page Git Product logo

aml-kubernetes's Introduction

Azure Arc-enabled Machine Learning

As part of Azure Machine Learning (AzureML) service capabilities, Azure Arc enabled ML extends Azure ML service capabilities seamlessly to Kubernetes clusters, and enables customer to train and deploy models on Kubernetes anywhere at scale. With a simple AzureML extension deployment, customer can instantly onboard their data science team with productivity tools for full ML lifecycle, and have access to both Azure managed compute and customer managed Kubernetes anywhere. Customer is flexible to train and deploy models wherever and whenever business requires so. With built-in AzureML pipeline and MLOps support for Kubernetes, customer can scale machine learning adoption in their organization easily. Built on top of Azure Arc hybrid cloud platform, Arc enabled ML natively supports on-premise machine learning with Kubernetes.

This repository is intended to serve as an information hub for customers and partners who are interested in Azure Arc enabled ML public preview. Use this repository for onboarding and testing instructions as well as an avenue to provide feedback, issues, enhancement requests and stay up to date as the preview progresses.

Prerequisites

  1. An Azure subscription. If you don't have an Azure subscription, create a free account before you begin.
  2. A Kubernetes cluster up and running - We recommend minimum of 4 vCPU cores and 8GB memory, around 2 vCPU cores and 3GB memory will be used by Azure Arc agent and AzureML extension components.
  3. A kubeconfig file and context pointing to the Kubernetes cluster.
  4. Install the latest release of Helm 3.
  5. The Kubernetes cluster is connected to Azure Arc. For Azure Kubernetes Services (AKS) cluster, connection to Azure Arc is optional.
  6. (Optional) If the Kubernetes cluster is behind of an outbound proxy server or firewall, please ensure to meet network requirements.
  7. Meet the pre-requisites listed under the generic cluster extensions documentation.
    • Azure CLI version >=2.24.0
    • Azure CLI extension k8s-extension version >=1.0.0.
  8. Create an AzureML workspace if you don't have one already.

Getting started

Getting started with ArcML is easy with following steps:

  1. Install AzureML extension
  2. Attach Kubernetes cluster to AzureML workspace and create a compute target
  3. Train image classification model with AzureML CLI v2
  4. Train image classification model with AzureML Python SDK 1.30 or above
  5. Deploy an image classification model - create an endpoint with blue/green deployment

Using Azure Stack HCI:

Trouble shooting:

Supported AzureML built-features and ArcML unique features

Azure Arc enabled ML essentially brings a new compute target to Azure Machine Learning service. With this new Kubernetes compute target, customer can use existing Azure ML tools and service capabilities to train or deploy model on Kubernetes anywhere. You can git clone available Azure ML examples on the internet, with simple change of compute target name, and those examples will run seamlessly on Kubernetes.

AzureML built-in features ArcML support
Train with the CLI (v2)
Train with the job creation UI
Train with the Python SDK
Train with the REST API
Basic Python training job
Distributed training - PyTorch
Distributed training - TensorFlow
Distributed training - MPI
Sweep hyperparameters
Batch Inference
Deploy model with Online Endpoint - safe production rollout with blue/green deployment
Deploy model with Online Endpoint - REST endpoint with public/private IP and full TLS encryption
Deploy model with isolated Azure Kubernetes Service - VNET and private link support
Build and use ML Pipeline (Python)
Train with Designer pipeline
Train and deploy with AutoML (Python)
Use managed identity to access Azure resources
Deploy model with Designer UI Coming soon
Train and deploy with AutoML (Studio UI) Coming soon
Deploy model with Batch Endpoint Coming soon
Use Azure Private Link to securely access Azure services

In addition to above built-in AzureML features, ArcML also supports following unique machine learning features:

Supported Kubernetes distributions and versions

  • Azure Kubernetes Services (AKS)
  • AKS Engine
  • AKS on Azure Stack HCI
  • Azure RedHat OpenShift Service (ARO)
  • OpenShift Container Platform (OCP)
  • Google GKE
  • Canonical Kubernetes Distribution
  • Amazon EKS
  • Kind
  • K3s-Lightweight Kubernetes
  • Kubernetes 1.18.x, 1.19.x, 1.20.x, 1.21.x, 1.22.x

Support

We are always looking for feedback on our current experiences and what we should work on next. If there is anything you would like us to prioritize, please feel free to suggest so via our GitHub Issue Tracker. You can submit a bug report, a feature suggestion or participate in discussions.

Or reach out to us: [email protected] if you have any questions or feedback.

Activities

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Impressions

Disclaimer

The lifecycle management (health, kubernetes version upgrades, security updates to nodes, scaling, etc.) of the AKS or Arc enabled Kubernetes cluster is the responsibility of the customer.

For AKS, read what is managed and what is shared responsibility here

All preview features are available on a self-service, opt-in basis and are subject to breaking design and API changes. Previews are provided "as is" and "as available," and they're excluded from the service-level agreements and limited warranty.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.