
Elasticsearch Drift Plugin

Time series aggregation for flow records.

Drift Plugin   Elasticsearch   Release date
1.0.x          6.2.4           May 2018
1.1.0          6.5.4           Feb 2019
x.y.z          x.y.z           June 2019

After 1.1.0 we switched to using the same version number as the Elasticsearch version being targeted.

Overview

This plugin provides a new aggregation function proportional_sum that can be used to:

  1. Group documents that contain a date range into multiple buckets
  2. Calculate a sum on a per-bucket basis, using a ratio proportional to the portion of the document's time range that falls within that bucket.

This aggregation function behaves like a hybrid of the Metrics and Bucket aggregation types, since it both creates buckets and calculates a new metric.
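As an illustration, the per-bucket ratio can be thought of as the overlap between the document's range and the bucket, divided by the total length of the document's range. The following is a sketch in Python of that idea, not the plugin's actual Java implementation:

```python
def proportional_value(range_start, range_end, bucket_start, bucket_end, value):
    """Portion of `value` attributed to the bucket [bucket_start, bucket_end)."""
    # Length of the intersection between the document's range and the bucket.
    overlap = min(range_end, bucket_end) - max(range_start, bucket_start)
    if overlap <= 0:
        return 0.0  # the document does not touch this bucket
    # Spread the value evenly across the document's range.
    return value * overlap / (range_end - range_start)
```

For example, a document spanning 100-450 with a value of 350 contributes 350 * 50/350 = 50 to the bucket [400, 500).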

Installation

RPM

Install the package repository:

sudo yum install https://yum.opennms.org/repofiles/opennms-repo-stable-rhel7.noarch.rpm
sudo rpm --import https://yum.opennms.org/OPENNMS-GPG-KEY

Install the package:

sudo yum install elasticsearch-drift-plugin

Debian

Create a new apt source file (e.g. /etc/apt/sources.list.d/opennms.list) and add the following two lines:

deb https://debian.opennms.org stable main
deb-src https://debian.opennms.org stable main

Import the repository's authentication key with the following command:

wget -O - https://debian.opennms.org/OPENNMS-GPG-KEY | sudo apt-key add -

Install the package:

sudo apt-get update
sudo apt-get install elasticsearch-drift-plugin

Use Case

We are interested in generating time series for Netflow records stored in Elasticsearch. Each Netflow record is stored as a separate document and contains the following fields of interest:

{
  "timestamp": 460,
  "netflow.first_switched": 100,
  "netflow.last_switched": 450,
  "netflow.bytes": 350
}

For this record, we’d like to be able to generate a time series with start=0, end=500, step=100, and have the following data points:

t=0, bytes=0
t=100, bytes=100
t=200, bytes=100
t=300, bytes=100
t=400, bytes=50
t=500, bytes=0

In this case, each step (or bucket) contains a fraction of the bytes, relative to how much of the flow falls into that step. We assume that the flow's bytes are spread evenly across its range; if multiple flow records fall into a single step, we sum their corresponding portions of the bytes.
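The behavior above can be sketched in Python. This is a simplified model of the semantics, not the plugin's code; bucket keys here label the start of each step:

```python
def proportional_sum_series(flows, start, end, step):
    """Distribute each flow's value across fixed-width steps, proportionally
    to the overlap between the flow's range and each step."""
    buckets = {t: 0.0 for t in range(start, end + step, step)}
    for first, last, value in flows:
        for t in buckets:
            # Overlap between the flow [first, last) and the step [t, t + step).
            overlap = min(last, t + step) - max(first, t)
            if overlap > 0:
                buckets[t] += value * overlap / (last - first)
    return buckets

series = proportional_sum_series([(100, 450, 350)], start=0, end=500, step=100)
# Reproduces the data points above: 0, 100, 100, 100, 50, 0
```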

Since the existing aggregation facilities in Elasticsearch don't support this behavior, we've gone ahead and developed our own.

Usage

Using the record above, the proportional_sum aggregation can be used as follows:

Request

{
  "size": 0,
  "aggs": {
    "bytes_over_time": {
      "proportional_sum": {
        "fields": [
          "netflow.first_switched",
          "netflow.last_switched",
          "netflow.bytes"
        ],
        "interval": 100,
        "start": 0,
        "end": 500
      }
    },
    "bytes_total": {
      "sum": {
        "field": "netflow.bytes"
      }
    }
  }
}

The fields option must be present and must reference the following document fields, in order:

  1. The start of the range
  2. The end of the range
  3. The value

The interval can be set to a string with a date format, or to a numeric value representing the number of milliseconds between steps.

The start and end fields are optional and take a Unix timestamp in milliseconds. When set, the generated buckets are limited to those that fall within this range. This allows the documents themselves to contain wider ranges for which we do not want to generate buckets/series.
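To make the clipping behavior concrete, here is a hypothetical sketch of how the generated bucket keys could be limited to the requested window (assumed semantics, not the plugin's source):

```python
def bucket_keys(doc_start, doc_end, interval, start=None, end=None):
    """Keys of the buckets a document can contribute to, optionally
    clipped to the [start, end) window given in the request."""
    lo = doc_start if start is None else max(doc_start, start)
    hi = doc_end if end is None else min(doc_end, end)
    first = (lo // interval) * interval  # align to the interval grid
    return list(range(first, hi, interval))
```

For the sample document (range 100-450) with start=0 and end=500, this yields the keys 100, 200, 300 and 400; a document reaching far beyond end=500 would still produce only buckets inside the window.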

Response

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "bytes_total" : {
      "value" : 350.0
    },
    "bytes_over_time" : {
      "buckets" : [
        {
          "key" : 100,
          "doc_count" : 1,
          "value" : 100.0
        },
        {
          "key" : 200,
          "doc_count" : 1,
          "value" : 100.0
        },
        {
          "key" : 300,
          "doc_count" : 1,
          "value" : 100.0
        },
        {
          "key" : 400,
          "doc_count" : 1,
          "value" : 50.0
        }
      ]
    }
  }
}

Here we can see that multiple buckets were generated from the single document, and that its value was spread across those buckets accordingly.

Building and installing from source

To compile the plugin, run:

mvn clean package

Next, set up an Elasticsearch instance using the same version as defined in the pom.xml. The version must match exactly; otherwise Elasticsearch will refuse to start.

Install the plugin using:

/usr/share/elasticsearch/bin/elasticsearch-plugin install file:///path/to/elasticsearch-drift/plugin/target/releases/elasticsearch-drift-plugin-1.0.0-SNAPSHOT.zip

Contributors

christianpape, fooker, indigo423, mershad-manesh, mszoek, naicisum, opennms-bamboo, swachter

Issues

Where do I find updated builds?

Elasticsearch 8.x is out, and the latest 7.x release is 7.17.14. I see a directory for 7.17.13, but no built artifacts (RPMs).

I checked the YUM repos on the OpenNMS site, but the latest package available is elasticsearch-drift-plugin-1.0.2, which makes me think I'm missing something.

Intermediate elasticsearch dir prevents from installing from ZIP

When I downloaded the ZIP file compatible with Elasticsearch 6.5.4 from the release page and tried to install it, I got the following error message:

 ./elasticsearch-plugin install file:///apps_nms/InstallFiles/elasticsearch-drift-plugin-1.1.0-SNAPSHOT.zip
-> Downloading file:///apps_nms/InstallFiles/elasticsearch-drift-plugin-1.1.0-SNAPSHOT.zip
[=================================================] 100%
ERROR: This plugin was built with an older plugin structure. Contact the plugin author to remove the intermediate "elasticsearch" directory within the plugin zip.

The ZIP file should be built without the intermediate "elasticsearch" directory.

Versioning and branching scheme

Given that Elasticsearch requires the plugin to match the specific major/minor/patch versions, we find ourselves constantly having to update the plugin to work with new versions.

In our current versioning scheme, we increment the minor version of the plugin every time we make an update for compatibility with a new Elasticsearch version.

Is there a better way to handle this? What if we update the plugin version to simply match the Elasticsearch version we're targeting?
