Deploying Saxml on Google Kubernetes Engine

This guide presents a reference architecture for deploying Saxml on Google Kubernetes Engine (GKE) and provides guidelines for serving models developed in Paxml, JAX, and PyTorch, including the Llama2 series, on Cloud TPU v5e.

Saxml on GKE high-level architecture

The diagram below depicts a high-level architecture of the Saxml system on Google Kubernetes Engine.

[Figure: Saxml on GKE architecture diagram]

Saxml deployment configuration

WIP

Environment setup

Prerequisites

Create and set project

Terraform service account

Terraform state bucket

Services
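
The prerequisite steps listed above are not yet documented, so the following is only a rough sketch of what they might look like with the gcloud CLI. The service account name `terraform-sa`, the bucket location `US`, and the two services enabled here (inferred from the use of GKE and Cloud Build elsewhere in this guide) are assumptions, and the IAM role grants that the Terraform service account needs are omitted. `setup_prereqs` is a hypothetical helper, not part of the repository.

```shell
# Sketch of the prerequisite steps; nothing runs until setup_prereqs is called.
# TF_SA_NAME and BUCKET_LOCATION are hypothetical values for illustration.
PROJECT_ID="${PROJECT_ID:-jk-mlops-dev}"
TF_STATE_BUCKET="${TF_STATE_BUCKET:-jk-mlops-dev-tf-state}"
TF_SA_NAME="${TF_SA_NAME:-terraform-sa}"
BUCKET_LOCATION="${BUCKET_LOCATION:-US}"

setup_prereqs() {
  # Make the project the default for subsequent gcloud commands
  gcloud config set project "$PROJECT_ID" || return 1

  # Service account that Terraform runs as (role grants are not shown)
  gcloud iam service-accounts create "$TF_SA_NAME" \
    --project "$PROJECT_ID" \
    --display-name "Terraform service account" || return 1

  # Bucket that holds the Terraform state
  gcloud storage buckets create "gs://$TF_STATE_BUCKET" \
    --project "$PROJECT_ID" \
    --location "$BUCKET_LOCATION" || return 1

  # Assumed minimum set of APIs; the real list is likely longer
  gcloud services enable \
    container.googleapis.com \
    cloudbuild.googleapis.com \
    --project "$PROJECT_ID" || return 1
}
```

After reviewing the values, run `setup_prereqs` once from an authenticated shell.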

Provision environment

Provision infrastructure
Clone repo
Update terraform.tfvars
Run Cloud Build

Deploy Saxml and HTTP proxy

Deploy Locust

WIP

Initialize Terraform

cd environment/infrastructure

export TF_STATE_BUCKET=jk-mlops-dev-tf-state
export TF_STATE_PREFIX=gke-tpu-serving-environment

terraform init -backend-config="bucket=$TF_STATE_BUCKET" -backend-config="prefix=$TF_STATE_PREFIX"
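
If either variable is unset, the command above passes an empty `bucket=` or `prefix=` value to the backend. A small guard can catch that before Terraform runs. This is a minimal sketch; `tf_init` is a hypothetical wrapper, not something the repository provides.

```shell
# Hypothetical wrapper (not part of the repository): refuses to run
# `terraform init` unless both backend settings are non-empty.
tf_init() {
  if [ -z "${TF_STATE_BUCKET:-}" ] || [ -z "${TF_STATE_PREFIX:-}" ]; then
    echo "TF_STATE_BUCKET and TF_STATE_PREFIX must both be set" >&2
    return 1
  fi
  terraform init \
    -backend-config="bucket=$TF_STATE_BUCKET" \
    -backend-config="prefix=$TF_STATE_PREFIX"
}
```

Call `tf_init` from `environment/infrastructure` after exporting the two variables.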

Apply configuration

REMINDER: Restrict the Pub/Sub permissions of the saxml_sa service account. It currently has pubsub.admin.

export PROJECT_ID=jk-mlops-dev
export REGION=us-central2
export ZONE=us-central2-b
export SAXML_ADMIN_BUCKET_NAME=jk-saxml-admin-bucket
export MODEL_REPOSITORY_BUCKET_NAME=jk-saxml-model-repository
export NETWORK_NAME=jk-gke-network
export SUBNET_NAME=jk-gke-subnet
export CLUSTER_NAME=jk-saxml-cluster
export NAMESPACE=saxml
export TPU_TYPE=v4-8
export NUM_TPU_POOLS=1

terraform apply \
  -var="project_id=$PROJECT_ID" \
  -var="cluster_name=$CLUSTER_NAME" \
  -var="region=$REGION" \
  -var="zone=$ZONE" \
  -var="network_name=$NETWORK_NAME" \
  -var="subnet_name=$SUBNET_NAME" \
  -var="saxml_namespace=$NAMESPACE" \
  -var="repository_bucket_name=$MODEL_REPOSITORY_BUCKET_NAME" \
  -var="saxml_admin_bucket_name=$SAXML_ADMIN_BUCKET_NAME" \
  -var="tpu_type=$TPU_TYPE" \
  -var="num_tpu_pools=$NUM_TPU_POOLS"
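
As an alternative to the long list of `-var` flags, the same values can be kept in a variables file and passed with Terraform's `-var-file` option. The sketch below writes such a file from the environment variables, falling back to the example values from this guide. The file name `saxml.tfvars` is an arbitrary choice, and writing `num_tpu_pools` as an unquoted number assumes the module declares it as a numeric variable.

```shell
# Write the same values to a variables file instead of passing -var flags.
# The file name saxml.tfvars is an arbitrary choice for this sketch.
TFVARS_FILE="${TFVARS_FILE:-saxml.tfvars}"

cat > "$TFVARS_FILE" <<EOF
project_id              = "${PROJECT_ID:-jk-mlops-dev}"
cluster_name            = "${CLUSTER_NAME:-jk-saxml-cluster}"
region                  = "${REGION:-us-central2}"
zone                    = "${ZONE:-us-central2-b}"
network_name            = "${NETWORK_NAME:-jk-gke-network}"
subnet_name             = "${SUBNET_NAME:-jk-gke-subnet}"
saxml_namespace         = "${NAMESPACE:-saxml}"
repository_bucket_name  = "${MODEL_REPOSITORY_BUCKET_NAME:-jk-saxml-model-repository}"
saxml_admin_bucket_name = "${SAXML_ADMIN_BUCKET_NAME:-jk-saxml-admin-bucket}"
tpu_type                = "${TPU_TYPE:-v4-8}"
num_tpu_pools           = ${NUM_TPU_POOLS:-1}
EOF

echo "Wrote $TFVARS_FILE"
```

The file can then be used with `terraform apply -var-file=saxml.tfvars`, and the same file works for `terraform destroy`.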

Destroy infrastructure

export PROJECT_ID=jk-mlops-dev
export REGION=us-central2
export ZONE=us-central2-b
export SAXML_ADMIN_BUCKET_NAME=jk-saxml-admin-bucket
export MODEL_REPOSITORY_BUCKET_NAME=jk-saxml-model-repository
export NETWORK_NAME=jk-gke-network
export SUBNET_NAME=jk-gke-subnet
export CLUSTER_NAME=jk-saxml-cluster
export NAMESPACE=saxml
export TPU_TYPE=v4-8
export NUM_TPU_POOLS=0


terraform destroy \
  -var="project_id=$PROJECT_ID" \
  -var="cluster_name=$CLUSTER_NAME" \
  -var="region=$REGION" \
  -var="zone=$ZONE" \
  -var="network_name=$NETWORK_NAME" \
  -var="subnet_name=$SUBNET_NAME" \
  -var="saxml_namespace=$NAMESPACE" \
  -var="repository_bucket_name=$MODEL_REPOSITORY_BUCKET_NAME" \
  -var="saxml_admin_bucket_name=$SAXML_ADMIN_BUCKET_NAME" \
  -var="tpu_type=$TPU_TYPE" \
  -var="num_tpu_pools=$NUM_TPU_POOLS"
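
The apply and destroy commands take the same set of flags, so one way to avoid keeping two copies in sync is a small shell function that forwards the subcommand. This is a sketch only; `tf_saxml` is a hypothetical name, and the function assumes the environment variables shown above have already been exported.

```shell
# Hypothetical wrapper (not part of the repository): run any Terraform
# subcommand with the -var flags used in this guide. Values come from the
# exported environment variables, so export them first.
tf_saxml() {
  subcommand="$1"
  shift
  terraform "$subcommand" \
    -var="project_id=$PROJECT_ID" \
    -var="cluster_name=$CLUSTER_NAME" \
    -var="region=$REGION" \
    -var="zone=$ZONE" \
    -var="network_name=$NETWORK_NAME" \
    -var="subnet_name=$SUBNET_NAME" \
    -var="saxml_namespace=$NAMESPACE" \
    -var="repository_bucket_name=$MODEL_REPOSITORY_BUCKET_NAME" \
    -var="saxml_admin_bucket_name=$SAXML_ADMIN_BUCKET_NAME" \
    -var="tpu_type=$TPU_TYPE" \
    -var="num_tpu_pools=$NUM_TPU_POOLS" \
    "$@"
}
```

With this in place, `tf_saxml plan`, `tf_saxml apply`, and `tf_saxml destroy` all use identical inputs. Any extra arguments are passed through to Terraform unchanged.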


Deploy Saxml on GKE application components


cd environment/applications

PROJECT_ID=jk-mlops-dev
CONVERTER_IMAGE_URI=gcr.io/$PROJECT_ID/checkpoint-converter
MACHINE_TYPE=e2-highcpu-8
PAXML_VERSION=1.2.0
CLOUD_SDK_VERSION=google-cloud-cli-453.0.0-linux-x86_64.tar.gz


gcloud builds submit \
  --project "$PROJECT_ID" \
  --config build.yaml \
  --substitutions "_CONVERTER_IMAGE_URI=$CONVERTER_IMAGE_URI,_PAXML_VERSION=$PAXML_VERSION,_CLOUD_SDK_VERSION=$CLOUD_SDK_VERSION" \
  --machine-type="$MACHINE_TYPE" \
  --quiet

Serving workload examples

WIP
