
elb-metrics's Introduction


elb-metrics

Node module which collects metrics from an AWS Elastic Load Balancer and makes assertions about the performance of the stack by calculating the number of requests that returned 2xx, 3xx, 4xx, and 5xx responses, along with the time taken to get back a response (latency).

Install

git clone git@github.com:mapbox/elb-metrics.git
cd elb-metrics
npm install 
npm link 

You can use elb-metrics as a command line tool:

usage: elb-metrics --startTime --endTime --region --elbname
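For example, assuming millisecond unix timestamps like the ones in the test case later in this document (my-elb is a placeholder name):

elb-metrics --startTime 1471356157000 --endTime 1471356167000 --region us-east-1 --elbname my-elb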

Note: The values you provide for startTime and endTime should be unix timestamps. The region is the AWS region of the Elastic Load Balancer, and the elbname is the name of the Elastic Load Balancer, which you can find in the AWS console.

Tests

After cloning and installing elb-metrics, you can run the tests with:

npm test

elb-metrics's People

Contributors

aarthykc, tmcw


elb-metrics's Issues

set up CI

Let's get this running on CI once tests are merged, so we don't break anything later on!

command line tool

Add a function to bin so that we can run it like:

elb-metrics --startTime --endTime

input / output

Let's keep it simple in terms of what this tool provides:

  • Input: the bare minimum params you need to retrieve ELB metrics.
  • Output: an object that provides a high-level overview of results.

The current bench tool provides a high-level overview of the whole time period, like:

### Bench (600s)

- requests:     225.2/s
- connect mean: 142.0ms
- finish mean:  142.1ms
- payload mean: 3315.2
- non 200/304:  1602 (1.2%)
- cacheHit:     0 (0.0%)
- cacheRefresh: 0 (0.0%)
- cacheMiss:    0 (0.0%)
- cacheUnknown: 135119 (100.0%)

I would like to see us do the same. How about we return

  • Time period (number of seconds)
  • Average requests/second over the period
  • Average latency over the period
  • Percentage of total requests that were 2xx
  • Percentage of total requests that were 3xx
  • Percentage of total requests that were 4xx
  • Percentage of total requests that were 5xx

Here's an example of how I want to interact with this utility:

// Assuming a tape-style test harness and the module's main export:
var test = require('tape');
var elbMetrics = require('../');

test('give me metrics', function(assert) {
  var params = {
    start: 1471356157000,
    end: 1471356167000,
    elbName: 'my-elb',
    region: 'us-east-1'
  };

  elbMetrics(params, function(err, res) {
    assert.ifError(err);
    assert.deepEqual(res, {
      'period': '10s',
      'requests': '500/s',
      'mean latency': '100ms',
      '2xx': '67%',
      '3xx': '20%',
      '4xx': '10%',
      '5xx': '3%'
    });
    assert.end();
  });
});

The elbMetrics function will be doing more work behind the scenes, and should be composed of other functions that we test individually:

  1. take user parameters and combine them with other assumed parameters that we need to make a request to CloudWatch
  2. make the request to CloudWatch
  3. parse the results and do some math to find the averages/values we are interested in
  4. prepare the final object based on those values ^ and return it

Even though there's a lot of different work to do, the idea is to expose a single function, elbMetrics, to the user.
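A minimal sketch of that composition, assuming internal helpers named prepareQueries, getMetrics, and prepareResults (the names suggested in the issues below); error handling beyond passing errors to the callback is elided:

// Sketch only: combine user params with assumed defaults, query CloudWatch,
// do the math, and hand a single summary object back to the caller.
function elbMetrics(params, callback) {
  var queries;
  try {
    // 1. user params + assumed params -> CloudWatch query objects (synchronous, throws on bad input)
    queries = prepareQueries(params.start, params.end, params.elbName);
  } catch (err) {
    return callback(err);
  }

  // 2. make the requests to CloudWatch
  getMetrics(queries, params.region, function(err, rawMetrics) {
    if (err) return callback(err);

    // 3 + 4. parse the results, do the math, and shape the final object
    callback(null, prepareResults(rawMetrics, params.start, params.end));
  });
}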

@aarthykc do you feel 👍 about tackling each of those functions (and tests) separately and then putting them all together in a function like the test case above describes?

Engineering Standards Adherence

Required Elements

If any elements in the below list are not checked, this repo will fail standards compliance.

  • Not running node 4 or below
  • Has at least some test coverage?
  • Has a README?

Rubric

  • 1 pt Is in Version Control/Github ✅ (free points)
  • 1-2 pt node version:
    • 2 pt Best: running node 8+ 🏅
    • 1 pt Questionable: node 6
  • 1 pt CI enabled for repo?
  • 1 pt Not running Circle CI version 1? (Point awarded if using Travis)
  • 1 pt nyc integrated to show test coverage summary?
  • 1-3 pt test coverage percentage from nyc?
    • 3 pt High coverage: > 90%
    • 2 pt Moderate coverage: between 75 and 90% total coverage
    • 1 pt 0 - 74% test coverage
  • 1-2 pt evidence of bug fixes/edge cases being tested?
    • 2 pt Strong evidence/several instances noted
    • 1 pt Some evidence - I had a hard time deciding this one
  • 1 pt no flags to enable different functionality in non-test environments?
  • 1 pt Has README?
  • 1 pt Has CHANGELOG?
  • 1-2 pt README explains purpose of a project and how it works to some detail?
    • 2 pt High (but appropriate) amount of detail about the project
    • 1 pt Some detail about the project documented, could be more extensive
  • 1 pt README contains deploy/publish/install instructions?
  • 1 pt README contains CI badges, as appropriate?
  • 1-2 pt Code seems self-documenting: file/module names, function names, variables? No redundant comments to explain naming conventions?
    • 2 pt Strongly self-documented code, little to no improvements needed
    • 1 pt Some evidence of self-documenting code
  • 1 pt No potential security vulnerabilities are reported in dependencies?
  • 1 pt Package is scoped to @mapbox on NPM?
  • 1-2 pt master branch protected?
    • 1 pt PRs can only be merged if tests are passing?
    • 1 pt PRs must be approved before merging?
  • 2 pt BONUS: was this repo covered in a deep dive at some point? (part of bench, so, going to award points for that!)

Total possible: 24 points (+2 bonus)
Grading scale:

Point Total     Qualitative Description                                   Scaled Grade
20+ points      Strongly adheres to eng. standards                        5
16-19 points    Adheres to eng. standards fairly well                     4
12-15 points    Adheres to some eng. standards                            3
8-11 points     Starting to adhere to some eng. standards                 2
4-7 points      Following a limited number of eng. standard practices     1
< 4 points      Needs significant work, does not follow most standards    0

Repo grade: 15/24. Grade 3 (2018-08-08)

cc/ @mapbox/sreious-business

Prepare human-readable results

From #5, our goal for this project is some human-readable output like:

{
  'period': '10s',
  'requests': '500/s',
  'mean latency': '100ms',
  '2xx': '67%',
  '3xx': '20%',
  '4xx': '10%',
  '5xx': '3%'
}

Doing the math

  • period
    • Calculate this with the difference between startTime and endTime - should be in seconds, for now
  • requests
    • Add up all datapoints from RequestCount metric, and divide by period (ie number of seconds), to get number of requests per second
  • mean latency
    • Average of all datapoints from Latency metric
  • 2xx
    • Calculate total requests by adding all datapoints from RequestCount metric. Calculate total 2xx requests by adding all datapoints from HTTPCode_Backend_2XX metric. 2xx percentage will be total 2xx requests / total requests
  • 3xx
    • Rinse and repeat process from 2xx
  • 4xx
    • Rinse and repeat process from 2xx
  • 5xx
    • Rinse and repeat process from 2xx

Scope of work

✅ New (synchronous) function prepareResults

  • takes, as input, raw metrics (i.e. what comes back from your outputMetrics function). Also needs, as input, startTime and endTime
  • output should be an object that looks something like what is below, making sure that units are also included
{
  'period': '10s',
  'requests': '500/s',
  'mean latency': '100ms',
  '2xx': '67%',
  '3xx': '20%',
  '4xx': '10%',
  '5xx': '3%'
}

✅ Tests for prepareResults

  • test that all the math makes sense on a small scale (using small fixtures so you can do the math in your head to confirm it makes sense)
  • test that it handles edge cases gracefully -- e.g. what if there are no 5xx requests?
  • test any error cases we need to handle here

✅ Integrate into elbMetrics function

  • should be able to feed this output into your new prepareResults function, and return that response to the end user
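A minimal sketch of what prepareResults could look like, assuming rawMetrics is keyed the way the [getMetrics] response format issue below describes, that counts come back as CloudWatch Sum datapoints and latency as Average datapoints (in seconds), and that startTime/endTime are millisecond unix timestamps:

// Sketch only; assumes rawMetrics looks like
// { 'request count': [...], 'latency': [...], '2xx': [...], '3xx': [...], '4xx': [...], '5xx': [...] }
// with CloudWatch-style datapoints ({ Sum: n } for counts, { Average: n } for latency, in seconds).
function prepareResults(rawMetrics, startTime, endTime) {
  var periodSeconds = (endTime - startTime) / 1000;

  function sum(datapoints) {
    return datapoints.reduce(function(total, d) { return total + d.Sum; }, 0);
  }

  var totalRequests = sum(rawMetrics['request count']);

  function percentage(key) {
    // Guard against an empty window so we never divide by zero.
    if (totalRequests === 0) return '0%';
    return Math.round(100 * sum(rawMetrics[key]) / totalRequests) + '%';
  }

  var latency = rawMetrics.latency;
  var meanLatencySeconds = latency.length === 0 ? 0 :
    latency.reduce(function(total, d) { return total + d.Average; }, 0) / latency.length;

  return {
    'period': periodSeconds + 's',
    'requests': Math.round(totalRequests / periodSeconds) + '/s',
    'mean latency': Math.round(meanLatencySeconds * 1000) + 'ms',
    '2xx': percentage('2xx'),
    '3xx': percentage('3xx'),
    '4xx': percentage('4xx'),
    '5xx': percentage('5xx')
  };
}

For example, a 10-second window with 5,000 total requests and a 0.1 s mean latency would yield the '10s' / '500/s' / '100ms' values in the object above.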

Compatibility with ELB2

ELB 2 / Application Load Balancer uses a different CloudWatch namespace, and some of the metrics are named differently (e.g. "Latency" becomes "TargetResponseTime"). It would be great to be compatible with both versions of ELB. Can this be detected automatically, or passed as a flag, so that the right metrics are returned whether you're using classic ELB or ELB 2?

cc @aarthykc @yhahn
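One way to handle it could be a small lookup keyed by load balancer generation. The namespaces and metric names below are CloudWatch's actual names for classic ELB and ALB; the shape of the lookup and the elbVersion flag are assumptions for illustration only:

// Sketch only: the metric-name and namespace differences between classic ELB and ALB.
var METRICS = {
  classic: {
    namespace: 'AWS/ELB',
    dimensionName: 'LoadBalancerName',
    requestCount: 'RequestCount',
    latency: 'Latency',
    code2xx: 'HTTPCode_Backend_2XX',
    code5xx: 'HTTPCode_Backend_5XX'
  },
  alb: {
    namespace: 'AWS/ApplicationELB',
    dimensionName: 'LoadBalancer',
    requestCount: 'RequestCount',
    latency: 'TargetResponseTime',
    code2xx: 'HTTPCode_Target_2XX_Count',
    code5xx: 'HTTPCode_Target_5XX_Count'
  }
};

// Hypothetical flag: pick the right set of names up front, then build queries as usual.
function metricNamesFor(elbVersion) {
  return elbVersion === 'alb' ? METRICS.alb : METRICS.classic;
}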

[getMetrics] prepare all of the queries!

We're close with getMetrics, but we need to make sure it gathers everything we're looking for. I'm seeing a need for a synchronous function that prepares all of your CloudWatch queries. It could be called something like prepareQueries or prepareParams.

metric name             statistic to look for
HTTPCode_Backend_2XX    count
HTTPCode_Backend_3XX    count
HTTPCode_Backend_4XX    count
HTTPCode_Backend_5XX    count
RequestCount            count
Latency                 average

@aarthykc let's add this to your branch in #1 and close this issue when you have the following:

  • new function in index.js that takes startTime, endTime, elbname and returns all of the params objects you will need (in an array)
  • new tests in index.test.js that:
    • tests successful cases
    • tests error cases (e.g. if some input is missing, or if the input is the incorrect type) -- remember this is a synchronous function, so we should throw ^_^

Once we have ^ we can close here and worry about integration back in #1.
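A minimal sketch of what such a function might return, using CloudWatch's getMetricStatistics parameter shape for classic ELB (the 60-second Period and the mapping of "count" to CloudWatch's Sum statistic are assumptions):

// Sketch only: one getMetricStatistics params object per metric in the table above.
// 'count' in the table maps to CloudWatch's Sum statistic here, and the 60-second
// Period is an assumed granularity.
function prepareQueries(startTime, endTime, elbname) {
  var metrics = [
    { name: 'HTTPCode_Backend_2XX', statistic: 'Sum' },
    { name: 'HTTPCode_Backend_3XX', statistic: 'Sum' },
    { name: 'HTTPCode_Backend_4XX', statistic: 'Sum' },
    { name: 'HTTPCode_Backend_5XX', statistic: 'Sum' },
    { name: 'RequestCount', statistic: 'Sum' },
    { name: 'Latency', statistic: 'Average' }
  ];

  return metrics.map(function(metric) {
    return {
      Namespace: 'AWS/ELB',
      MetricName: metric.name,
      Dimensions: [{ Name: 'LoadBalancerName', Value: elbname }],
      StartTime: new Date(startTime),
      EndTime: new Date(endTime),
      Period: 60,
      Statistics: [metric.statistic]
    };
  });
}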

[getMetrics] response format

The final act of getMetrics will be to return all datapoints in a sensible way, so that they can be consumed for analysis later on.

@aarthykc I'm thinking the final response from getMetrics should be formatted like this:

{
    'request count': [ /* all of your datapoints here */ ],
    'latency': [ /* datapoints */ ],
    '2xx': [ /* datapoints */ ],
    '3xx': [ /* datapoints */ ],
    '4xx': [ /* datapoints */ ],
    '5xx': [ /* datapoints */ ]
}

Just opening this for visibility, you can implement in #1 and close here when done?
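A minimal sketch of how getMetrics could assemble that object with the AWS SDK, assuming the queries array is ordered the way the prepareQueries sketch above builds it:

var AWS = require('aws-sdk');

// Sketch only: run each prepared query against CloudWatch and key the datapoints
// by a friendly name for later analysis.
function getMetrics(queries, region, callback) {
  var cloudwatch = new AWS.CloudWatch({ region: region });
  var keys = ['2xx', '3xx', '4xx', '5xx', 'request count', 'latency'];
  var results = {};
  var remaining = queries.length;
  var failed = false;

  queries.forEach(function(query, i) {
    cloudwatch.getMetricStatistics(query, function(err, data) {
      if (failed) return;
      if (err) { failed = true; return callback(err); }
      results[keys[i]] = data.Datapoints;
      if (--remaining === 0) callback(null, results);
    });
  });
}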

Documentation

The README.md should have some basic documentation on how to use this tool.

  • Description of what it does
  • How to install
  • Usage
