Git Product home page Git Product logo

aws-lambda-autoscale-provisioned-concurrency-example's Introduction

Autoscale AWS Lambda provisioned concurrency

Provisioned concurrency is a way to prepare certain number of AWS Lambda execution environments in advance to respond immediately to incoming requests. It’s a way to minimise impact of a cold start on response latency. Application Auto Scaling allows you to manage provisioned concurrency on a schedule or based on utilisation. In this pattern I will show how to use Amazon CloudWatch monitoring features and Artillery load test framework to fine tune utilisation based autoscaling.

Prerequisites

Before start make sure you have installed following tools:

To make life easier it is recommended to also have:

Limitations

  • Bear in mind that extensive load test generates significant amounts of lambda executions that can lead to additional cost. Setup billing alerts and tweak the default test scenario settings to keep cost under control.
  • New AWS Accounts can sometimes have lower quotas set for concurrent execution. Follow up this guide if you need your quotas increased.
  • API Gateway does not have any authorisation controls or WAF protection enabled for the simplicity of this example. For production workflow you should always consider those.

Installation

To install app dependencies run npm i.

Architecture

Image Target architecture is a simplistic representation of a backend application. Amazon API Gateway is the entry point for the end client - in this case this will be Artillery.io test framework. The API exposes two endpoints /custom and /standard which are backed by AWS Lambda functions. Each of them is actually the same code doing nothing except sleeping for a defined amount of time to simulate latency. The only difference lays in the autoscaling setting that we want to compare.

Provisioned concurrency for each Lambda function is initially set to 1. Application Auto Scaling is responsible of scaling it up and down based on the Provisioned Concurrency Utilisation metric. The standardTestFun uses predefined settings for Provisioned Concurrency Utilisation metric which means the alarm is set on when three consecutive data points are above threshold. Each data point is the average utilisation over a one minute window. The customTestFun uses custom metric. The alarm is also set on when three consecutive data points are above the same threshold but this time each data point is calculated as the maximum utilisation. This makes the auto scaling more responsive to even short spikes in traffic load.

The observability part comes with an Amazon CloudWatch Dashboard that consists of several widgets presenting useful metrics:

  • Api Latency: Latency is the time between when API Gateway receives a request from a client and when it returns a response to the client. Integration latency is the time between when API Gateway relays a request to the backend and when it receives a response from the backend. The latency includes the integration latency and other API Gateway overhead.
  • Api Number of Calls: Count is the total number of API requests in a given period (in this example 30 seconds). 4XXError is the number of client-side errors and 5XXError is the number of server-side errors captured in a given period.
  • Api Calls Per Second: This is the calculation done to illustrate number of API request per second
  • Lambda Duration: Avarage and maximum duration time for each function
  • Lambda Invocations: Graphs without suffix present total number of invocations for each function. The '-Spillover' graphs show how many invocation are against instances that are not in the Provisioned pool.
  • Lambda Concurrent Executions: Graphs without suffix show number of parallel Lambda instances being executed. The -'Provisioned' graphs show how many of those parallel Lambdas are from Provisioned pool.
  • Summary: These are the calculations from the API access logs for each endpoint showing time statistic.

Using these metrics one can observe how the scaling performs during different load, how this influences latency and in the end fine tune the desired autoscaling behaviour. For example on above picture you can see that with each spike the customFun uses more and more provisioned instance because this metric is more sensitive in scaling on spikes.

Usage

Deploy this stack to your AWS account

cdk deploy -O output.json

Run load tests. Script extracts API Gateway url from output.json

./run_artillery.sh load-test-spikes.yaml

Clean up

cdk destroy && rm ./output.json

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

aws-lambda-autoscale-provisioned-concurrency-example's People

Contributors

amazon-auto avatar dependabot[bot] avatar wolanlu avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Forkers

is-os

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.