
Kapacitor-unit

A test framework for TICKscripts


Kapacitor-unit is a testing framework that makes TICK script testing easy and automated. Testing with kapacitor-unit is as simple as writing a test configuration that states which alerts are expected to trigger when the TICK script processes specific data.

Read more about the idea and motivation behind kapacitor-unit in this blog post.

Show me Kapacitor-unit in action!

[usage-example animation]

Features

✔️ Run tests for stream TICK scripts using protocol line data input

✔️ Run tests for batch TICK scripts using protocol line data input

🔜 Run tests for stream and batch TICK scripts using recordings

Requirements:

In order for all features to be supported, the Kapacitor version running the tests must be v1.3.4 or higher.

Running kapacitor-unit:

  1. Install kapacitor-unit and run
 $ make install
 $ make build-cmd

 $ make run  	# same as ./cmd/kapacitor-unit/kapacitor-unit

You can add ./cmd/kapacitor-unit/kapacitor-unit to your $PATH so you can call the kapacitor-unit executable from anywhere.

  2. Define the test configuration file (see below)

  3. Run the tests

kapacitor-unit --dir <*.tick directory> --kapacitor <kapacitor host> --influxdb <influxdb host> --tests <test configuration path>

Test case definition:

# Test case for alert_weather.tick
tests:
  
   # This is the configuration for a test case. The 'name' must be unique in the
   # same test configuration. 'description' is optional

  - name: Alert weather:: warning
    description: Task should trigger Warning when temperature rises above 80

    # 'task_name' is the file name of the tick script to be loaded
    # when running the test
    task_name: alert_weather.tick

    db: weather
    rp: default 
    type: stream

    # 'data' is an array of data points in line protocol
    data:
      - weather,location=us-midwest temperature=75
      - weather,location=us-midwest temperature=82

    # Alerts that should be triggered by Kapacitor when the test data is run
    # against the task
    expects:
      ok: 0
      warn: 1
      crit: 0


  - name: Alert no. 2 using recording
    task_name: alert_weather.tick
    db: weather
    rp: default 
    type: stream
    recording_id: 7c581a06-769d-45cb-97fe-a3c4d7ba061a
    expects:
      ok: 0
      warn: 1
      crit: 0


  - name: Alert no. 3 - Batch
    task_name: alert_weather.tick
    db: weather
    rp: default 
    type: batch
    data:
      - weather,location=us-midwest temperature=80
      - weather,location=us-midwest temperature=82
    expects:
      ok: 0
      warn: 1
      crit: 0

Contributions:

Fork the repository and open a PR; use issues for bug reports, feature requests, and general comments.

©️ MIT

Contributors

gdunstone, gpestana, j3ffrw, manojm321, neki, nico-acidtango, tkupari, vlaurenzano


kapacitor-unit's Issues

Missing packages

Hello,

I don't know if I've missed something here or not but the unit test package won't install. I followed the documentation on this repository to no avail.

Expected result: I run the commands listed in the documentation, the kapacitor-unit tool installs, and I can use it for testing.

Actual result:
kapacitorunit.go:5:2: cannot find package "github.com/fatih/color" in any of:
/usr/local/go/src/github.com/fatih/color (from $GOROOT)
/home/philb/gp_projects/src/github.com/fatih/color (from $GOPATH)
kapacitorunit.go:6:2: cannot find package "github.com/gpestana/kapacitor-unit/cli" in any of:
/usr/local/go/src/github.com/gpestana/kapacitor-unit/cli (from $GOROOT)
/home/philb/gp_projects/src/github.com/gpestana/kapacitor-unit/cli (from $GOPATH)
kapacitorunit.go:7:2: cannot find package "github.com/gpestana/kapacitor-unit/io" in any of:
/usr/local/go/src/github.com/gpestana/kapacitor-unit/io (from $GOROOT)
/home/philb/gp_projects/src/github.com/gpestana/kapacitor-unit/io (from $GOPATH)
kapacitorunit.go:8:2: cannot find package "github.com/gpestana/kapacitor-unit/task" in any of:
/usr/local/go/src/github.com/gpestana/kapacitor-unit/task (from $GOROOT)
/home/philb/gp_projects/src/github.com/gpestana/kapacitor-unit/task (from $GOPATH)
kapacitorunit.go:9:2: cannot find package "github.com/gpestana/kapacitor-unit/test" in any of:
/usr/local/go/src/github.com/gpestana/kapacitor-unit/test (from $GOROOT)
/home/philb/gp_projects/src/github.com/gpestana/kapacitor-unit/test (from $GOPATH)
kapacitorunit.go:10:2: cannot find package "gopkg.in/yaml.v2" in any of:
/usr/local/go/src/gopkg.in/yaml.v2 (from $GOROOT)
/home/philb/gp_projects/src/gopkg.in/yaml.v2 (from $GOPATH)

As far as i know the environment is set up correctly.

Go ENV:
GOARCH="amd64"
GOBIN="/home/philb/gp_projects/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/philb/gp_projects"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

Go Version is 1.9

If other packages are required to install this, they should be listed in the installation instructions.

Have I missed something, or is it broken?

Thanks,

Phil

Implement Result to encapsulate expected results and tests results

Implement a Result type which encapsulates expected results and tests results.

The Result type must keep information about the number of alerts triggered and their type (ok, warn, crit). An instance of Result can be constructed from data in the test configuration file or from the node-stats entry of the kapacitor task.

A compare method is implemented to compare two results, i.e. an expected result against a test result. If the Results are the same, the test passes; otherwise it fails.

The String method of Test will compare the results and print whether the test has passed or not (and show the reason why it didn't).
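
For illustration, a minimal Go sketch of such a type (names and fields are assumptions, not the project's actual API):

package main

import "fmt"

// Result holds the number of alerts triggered, either expected (from the
// test configuration) or observed (from the task's node-stats).
type Result struct {
	Ok   int
	Warn int
	Crit int
}

// Compare checks an expected Result against an observed one and returns an
// error describing the mismatch, or nil when the test passes.
func (expected Result) Compare(observed Result) error {
	if expected == observed {
		return nil
	}
	return fmt.Errorf(
		"expected (ok: %d, warn: %d, crit: %d), triggered (ok: %d, warn: %d, crit: %d)",
		expected.Ok, expected.Warn, expected.Crit,
		observed.Ok, observed.Warn, observed.Crit)
}

func main() {
	expected := Result{Ok: 0, Warn: 1, Crit: 0}
	observed := Result{Ok: 0, Warn: 0, Crit: 0}
	if err := expected.Compare(observed); err != nil {
		fmt.Println("FAIL:", err)
	}
}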

Add support to provide testdata with timestamps

When developing alerts, for example, it would be handy to be able to insert test data with differing values over a certain time period, e.g. by providing either an explicit timestamp or a relative time adjustment from the current time.

e.g.

tests:
..
  data: 
    - tag value=1 0
    - tag value=2 -1m
    - tag value=2 -2m
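
A rough Go sketch of how a trailing relative offset such as -1m could be resolved into an absolute line-protocol timestamp (purely illustrative; this is not an implemented feature):

package main

import (
	"fmt"
	"strings"
	"time"
)

// resolveTimestamp replaces a trailing relative offset such as "-1m" or "-2h"
// on a line protocol point with an absolute nanosecond timestamp.
func resolveTimestamp(line string, now time.Time) string {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return line
	}
	last := fields[len(fields)-1]
	offset, err := time.ParseDuration(last)
	if err != nil {
		// No relative offset present; leave the point untouched.
		return line
	}
	fields[len(fields)-1] = fmt.Sprintf("%d", now.Add(offset).UnixNano())
	return strings.Join(fields, " ")
}

func main() {
	fmt.Println(resolveTimestamp("tag value=2 -1m", time.Now()))
}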

How to check test results when replaying a recording?

Problem:
Kapacitor endpoint /kapacitor/v1/tasks/alert.tick does not show information about the alerts triggered when replaying data. Although kapacitor may trigger the alert (based on the logs), the information is not mapped in the alert status. This behaviour is cumbersome for the development of #7.

Possible solution:
One way to work around this limitation would be to parse the Kapacitor logs and compare that information with the expected results. Since it requires log parsing, it is not clean and will potentially break across Kapacitor versions.
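
As a rough illustration only, a Go sketch of that log-parsing workaround (the pattern matches the example log line shown in the reproduction steps below and is not a complete parser):

package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// alertRe matches log lines of the form
// "[alert_weather.tick:alert2] ... D! WARNING alert triggered id:Temperature ..."
var alertRe = regexp.MustCompile(`\[([^:\]]+):[^\]]+\] .* (OK|INFO|WARNING|CRITICAL) alert triggered`)

// countAlerts counts triggered alerts per level for one task in a Kapacitor log.
func countAlerts(log, task string) map[string]int {
	counts := map[string]int{}
	scanner := bufio.NewScanner(strings.NewReader(log))
	for scanner.Scan() {
		m := alertRe.FindStringSubmatch(scanner.Text())
		if m != nil && m[1] == task {
			counts[m[2]]++
		}
	}
	return counts
}

func main() {
	log := "[alert_weather.tick:alert2] 2017/09/01 13:34:24 D! WARNING alert triggered id:Temperature msg:Temperature alert"
	fmt.Println(countAlerts(log, "alert_weather.tick")) // map[WARNING:1]
}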

How to reproduce:
(using Kapacitor 1.0.1 (git: master 04f1ab3116b6bab27cbf63cb0e6b918fc77c7320))

  1. Add the following task to Kapacitor:
var data = stream
	| from()
		.database('weather')
		.retentionPolicy('default')
		.measurement('temperature')

data
	|alert().id('Temperature')
		.message('Temperature alert')
		.warn(lambda: "temperature" > 80)
		.crit(lambda: "temperature" > 100)
		.stateChangesOnly()
    .log('/tmp/temperature.tick.log')
  2. Start a recording:

/kapacitor/v1/recordings/stream

  3. Record data that will trigger an alert:

POST /kapacitor/v1/write?db=weather&rp=default temperature,location=us-midwest temperature=90

  4. Replay the recording:

POST /kapacitor/v1/replays -d '{"task": "<task_name>", "recording": "'<recording_id>'", "clock": "real"}'

  5. Check outputs when the replay has finished:

A) logs:

[alert_weather.tick:alert2] 2017/09/01 13:34:24 D! WARNING alert triggered id:Temperature msg:Temperature alert data:&{temperature map[location:us-midwest] [time temperature] [[2017-09-01 13:34:24.286546253 +0000 UTC 82]] <nil>}
[httpd] 172.21.0.1 - - [01/Sep/2017:13:34:24 +0000] "POST /kapacitor/v1/write?db=weather&rp=default HTTP/1.1" 204 0 "-" "curl/7.43.0" 44b93656-8f1a-11e7-86d3-000000000000 97

B) Task information at /kapacitor/v1/tasks/alert.tick:

...
    "node-stats": {
      "alert2": {
        "alerts_triggered": 0,
        "avg_exec_time_ns": 0,
        "collected": 0,
        "crits_triggered": 0,
        "emitted": 0,
        "infos_triggered": 0,
        "oks_triggered": 0,
        "warns_triggered": 0
      },
...

As seen above, the alert is triggered based on kapacitor logs but the task information does not show it.

Add support to override the .period in batch query

Kapacitor-unit already overrides the .every it finds in a query; it would be similarly handy to be able to override the time period used for evaluations (.period).

Or this could become unnecessary if support for test data time adjustment were implemented 😄
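
A hedged sketch of what such an override could look like, mirroring the idea of the existing .every rewrite (the regex and function are assumptions, not the current implementation):

package main

import (
	"fmt"
	"regexp"
)

// periodRe matches the .period(...) property of a batch query node.
var periodRe = regexp.MustCompile(`\.period\([^)]*\)`)

// overridePeriod replaces every .period(...) in a batch TICK script with the
// period supplied in the test configuration, e.g. "1h".
func overridePeriod(script, period string) string {
	return periodRe.ReplaceAllString(script, fmt.Sprintf(".period(%s)", period))
}

func main() {
	script := "var weather = batch\n\t| query('...')\n\t\t.period(5m)\n\t\t.every(10m)\n"
	fmt.Print(overridePeriod(script, "1h"))
}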

Allow configurations to be set by ENV variables

Allowing for configurations to be set by environment variables will be beneficial for a few scenarios:

  • Containerization of kapacitor-unit itself
  • Using kapacitor unit as a library

Env variables should be checked first; CLI flags would then override any env vars.

Thoughts?
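
For illustration, a minimal Go sketch of that precedence, env vars providing the defaults and CLI flags overriding them (variable names such as KAPACITOR_UNIT_KAPACITOR_HOST are hypothetical):

package main

import (
	"flag"
	"fmt"
	"os"
)

// envOr returns the value of the environment variable key, or fallback if unset.
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok {
		return v
	}
	return fallback
}

func main() {
	// Environment variables supply the defaults; explicit CLI flags override them.
	kapacitor := flag.String("kapacitor", envOr("KAPACITOR_UNIT_KAPACITOR_HOST", "http://localhost:9092"), "kapacitor host")
	influxdb := flag.String("influxdb", envOr("KAPACITOR_UNIT_INFLUXDB_HOST", "http://localhost:8086"), "influxdb host")
	flag.Parse()

	fmt.Println("kapacitor:", *kapacitor, "influxdb:", *influxdb)
}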

Tests sometimes FAIL when they should return OK

Tests sometimes FAIL when they should return OK. If the test is run again, it succeeds.

Kapacitor is running in a VM with the IP 192.168.59.101

To reproduce the error, run kapacitor-unit multiple times and check the output. Sometimes a test that was successful last time fails (see Alert weather:: critical).

Console Output:

weefreeman@snuk-wee:~/go/src/github.com/gpestana/kapacitor-unit$ kapacitor-unit -dir sample/ -kapacitor http://192.168.59.101:9092 -tests sample/test_case.yaml 
2017/09/12 11:26:41 TEST Alert weather:: critical (alert_weather.tick) OK
2017/09/12 11:26:41 TEST Alert weather:: critical (alert_weather.tick) OK
2017/09/12 11:26:41 TEST Alert weather:: failure (alert_weather.tick) FAIL
Should have triggered 2 Ok alerts, triggered 0
Should have triggered 1 Warning alerts, triggered 0
Should have triggered 1 Critical alerts, triggered 0
Alerts triggered (ok: 0, warn: 0, crit: 0)

2017/09/12 11:26:41 TEST Alert weather:: invalid - recording_id + data (alert_weather.tick) ERROR: Configuration file cannot define a recording_id and line protocol data input for the same test case

weefreeman@snuk-wee:~/go/src/github.com/gpestana/kapacitor-unit$ kapacitor-unit -dir sample/ -kapacitor http://192.168.59.101:9092 -tests sample/test_case.yaml 
2017/09/12 11:26:43 TEST Alert weather:: critical (alert_weather.tick) OK
2017/09/12 11:26:43 TEST Alert weather:: critical (alert_weather.tick) FAIL
Should have triggered 1 Warning alerts, triggered 0
Alerts triggered (ok: 0, warn: 0, crit: 0)

2017/09/12 11:26:43 TEST Alert weather:: failure (alert_weather.tick) FAIL
Should have triggered 2 Ok alerts, triggered 0
Should have triggered 1 Critical alerts, triggered 0
Alerts triggered (ok: 0, warn: 1, crit: 0)

2017/09/12 11:26:43 TEST Alert weather:: invalid - recording_id + data (alert_weather.tick) ERROR: Configuration file cannot define a recording_id and line protocol data input for the same test case

🚀🚀 Looking for new maintainers!

I'm really happy to see that kapacitor-unit has been used in "production" by a bunch of people and companies! Unfortunately I don't have the time to put in for bug squashing and new feature development and issues are starting to pile up.

I'd be really happy if someone would like to jump in as the main maintainer, with total independence on where the project should head. I'd be helping on the side and would also dedicate as much time as possible to explaining the concept and the current source code.

batch tests do not support --influxdb parameter consistently

When passing the --influxdb parameter to kapacitor-unit, the test data is created in the target destination, but the queries are executed against localhost.

Using the sample batch test data in this repo:

root@5604febba2a3:/kapacitor# cat ticks/alert_weather_batch.tick 
var weather = batch
	| query('''
		SELECT mean(temperature)
		FROM "weather"."default"."temperature"
		''')
			.period(5m)
			.every(10m)

var rain = batch
	| query('''
		SELECT count(rain) 
		FROM "weather"."default"."temperature"
	''')
		.period(5m)
		.every(3d)


// simple case with only one batch query

	weather
	| alert().id('Temperature')
		.message('Temperature alert - batch')
		.warn(lambda: "mean" > 80)
		.crit(lambda: "mean" > 100)
		.stateChangesOnly()
    .log('/tmp/temperature_batch.tick.log')
root@5604febba2a3:/kapacitor# cat test/sample.yaml
tests:
  # batch script
  - name: "Alert weather:: batch"  
    task_name: alert_weather_batch.tick
    db: weather
    rp: default
    type: batch
    data:
      - temperature,location=us-midwest temperature=110
      - temperature,location=us-midwest temperature=91
    expects:
      ok: 0
      warn: 0
      crit: 1

The run fails like so:

root@5604febba2a3:/kapacitor# kapacitor-unit --dir /kapacitor/ticks --kapacitor http://localhost:9092 --influxdb http://myinfluxdb1:8086 --tests /kapacitor/test/sample.yaml
  _                          _ _                                _ _            
 | |                        (_) |                              (_) |           
 | | ____ _ _ __   __ _  ___ _| |_ ___  _ __ ______ _   _ _ __  _| |_          
 | |/ / _` | '_ \ / _` |/ __| | __/ _ \| '__|______| | | | '_ \| | __|      
 |   < (_| | |_) | (_| | (__| | || (_) | |         | |_| | | | | | |_          
 |_|\_\__,_| .__/ \__,_|\___|_|\__\___/|_|          \__,_|_| |_|_|\__| 
           | |                                                                 
           |_|                                                        		      
The unit test framework for TICK scripts (v0.9)

Processing batch script alert_weather_batch.tick...
ts=2020-02-25T19:09:28.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query1 err="Post http://localhost:8086/query?db=&q=SELECT+mean%28temperature%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A28.007892268Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A28.007892268Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
ts=2020-02-25T19:09:28.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query2 err="Post http://localhost:8086/query?db=&q=SELECT+count%28rain%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A28.007889895Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A28.007889895Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
ts=2020-02-25T19:09:29.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query2 err="Post http://localhost:8086/query?db=&q=SELECT+count%28rain%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A29.007402193Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A29.007402193Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
ts=2020-02-25T19:09:29.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query1 err="Post http://localhost:8086/query?db=&q=SELECT+mean%28temperature%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A29.007404716Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A29.007404716Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
ts=2020-02-25T19:09:30.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query1 err="Post http://localhost:8086/query?db=&q=SELECT+mean%28temperature%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A30.007409273Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A30.007409273Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
ts=2020-02-25T19:09:30.008Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query2 err="Post http://localhost:8086/query?db=&q=SELECT+count%28rain%29+FROM+weather.%22default%22.temperature+WHERE+time+%3E%3D+%272020-02-25T19%3A04%3A30.007407014Z%27+AND+time+%3C+%272020-02-25T19%3A09%3A30.007407014Z%27: dial tcp 127.0.0.1:8086: connect: connection refused"
2020/02/25 19:09:30 TEST Alert weather:: batch (alert_weather_batch.tick) FAIL
 Should have triggered 1 Critical alerts, triggered 0
 Alerts triggered (ok: 0, warn: 0, crit: 0)

Note the http://localhost:8086/query target in the errors above.

But when I look at myinfluxdb1, I see the appropriate drop/create for each run:

highland@myinfluxdb1:~$ journalctl --since="3 hours ago" | grep weather | tail -n4
Feb 25 19:09:26 myinfluxdb1 influxd[1433]: ts=2020-02-25T19:09:26.977619Z lvl=info msg="Executing query" log_id=0KlPpmyl000 service=query query="CREATE DATABASE weather WITH DURATION 1h0m0s REPLICATION 1 NAME \"default\""
Feb 25 19:09:27 myinfluxdb1 influxd[1433]: [httpd] 192.168.0.17 - - [25/Feb/2020:11:09:27 -0800] "POST /write?db=weather&rp=default HTTP/1.1" 204 0 "-" "Go-http-client/1.1" 578b1f8f-5802-11ea-8e69-0050562ae7ac 23852
Feb 25 19:09:27 myinfluxdb1 influxd[1433]: [httpd] 192.168.0.17 - - [25/Feb/2020:11:09:27 -0800] "POST /write?db=weather&rp=default HTTP/1.1" 204 0 "-" "Go-http-client/1.1" 578ee596-5802-11ea-8e6a-0050562ae7ac 1111
Feb 25 19:09:30 myinfluxdb1 influxd[1433]: ts=2020-02-25T19:09:30.055504Z lvl=info msg="Executing query" log_id=0KlPpmyl000 service=query query="DROP DATABASE weather"

tests need to delete topic between runs

The following tick + test will work fine:

var weather = batch
	| query('''
		SELECT mean(temperature)
		FROM "weather"."default"."temperature"
		''')
			.period(5m)
			.every(10m)

	weather
	| alert().id('Temperature')
		.topic('weather')
		.message('Temperature alert - batch')
		.warn(lambda: "mean" > 80)
		.crit(lambda: "mean" > 100)
		.stateChangesOnly()
    .log('/tmp/temperature_batch.tick.log')
tests:

  - name: "Alert weather:: batch"
    task_name: alert_weather_batch.tick
    db: weather
    rp: default
    type: batch
    data:
      - temperature,location=us-midwest temperature=110
      - temperature,location=us-midwest temperature=91
    expects:
      ok: 0
      warn: 0
      crit: 1

However, if a topic is added to the tick:

	| alert().id('Temperature')
		.topic('weather')

...then the test will only work once, and will fail on subsequent runs.

This is due to .stateChangesOnly() combined with the lack of topic state deletion between test runs.

That is to say -- in kapacitor, the "weather" topic tracks the critical state of the task, so even deleting and recreating the task between tests is not sufficient to reset the state.

The problem is confirmed for batch, unconfirmed for stream.
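
One possible teardown step would be to delete the alert topic between runs via Kapacitor's alert topics API. A hedged Go sketch (endpoint per the Kapacitor HTTP API; integration with the existing teardown is left out):

package main

import (
	"fmt"
	"net/http"
)

// deleteTopic removes an alert topic (and its stored state) from Kapacitor,
// so that stateChangesOnly() starts from a clean slate on the next test run.
func deleteTopic(kapacitorHost, topic string) error {
	url := fmt.Sprintf("%s/kapacitor/v1/alerts/topics/%s", kapacitorHost, topic)
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusNoContent {
		return fmt.Errorf("unexpected status deleting topic %q: %s", topic, resp.Status)
	}
	return nil
}

func main() {
	if err := deleteTopic("http://localhost:9092", "weather"); err != nil {
		fmt.Println(err)
	}
}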

How to use the script

Hi and many thanks for the tool.
Unfortunately I am not able to get any results. The command given in the readme file does not work. I am using the following command to run the script (inside the kapacitor-unit folder):

./bin/kapacitor-unit -d ./sample/ -t ./sample/alert_1.tick

and the output:

influxdb is up-to-date
Starting test_manager
kapacitor is up-to-date

I was not able to set an output file or to find any results of the test.

Thanks in advance,
Simon

Inconsistent behavior with first/last data points

I have the following kapacitor-unit test file:

tests:
  - name: "Alert my_alert:: critical when value < 1314873000"
    task_name: my_alert.tick
    db: telegraf
    rp: autogen
    type: stream
    data:
      - my_measurement,my_tag=matching_tag_value value=3629746000
      - my_measurement,my_tag=matching_tag_value value=2629746000
      - my_measurement,my_tag=matching_tag_value value=2629745999
      - my_measurement,my_tag=matching_tag_value value=1314873000
      - my_measurement,my_tag=matching_tag_value value=1314872999
      - my_measurement,my_tag=matching_tag_value value=314873000
      - my_measurement,my_tag=matching_tag_value value=3629746000
    expects:
      ok: 2
      warn: 1
      crit: 1

Sometimes this generates the correct expectation of 2 OKs. Sometimes it generates 1 OK, sometimes 0 OKs. If I double up the first and last data points, sending each one twice, it is far more consistent, but still sometimes fails to generate one or both OK states. This behavior indicates that kapacitor-unit is sometimes missing one or more of the first/last data points. Likely a race condition where the task is not registered and enabled in Kapacitor prior to sending the data points, and possibly the task is disabled/deleted prior to receiving all data points.
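
If the root cause is indeed that race, one mitigation would be to poll the task status before writing data points. A Go sketch (the polling logic is an assumption, not current kapacitor-unit behaviour):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// waitForTaskEnabled polls Kapacitor until the task reports status "enabled"
// (or the timeout expires), so data points are not written before the task is live.
func waitForTaskEnabled(kapacitorHost, taskID string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	url := fmt.Sprintf("%s/kapacitor/v1/tasks/%s", kapacitorHost, taskID)
	for time.Now().Before(deadline) {
		resp, err := http.Get(url)
		if err == nil {
			var task struct {
				Status string `json:"status"`
			}
			if json.NewDecoder(resp.Body).Decode(&task) == nil && task.Status == "enabled" {
				resp.Body.Close()
				return nil
			}
			resp.Body.Close()
		}
		time.Sleep(100 * time.Millisecond)
	}
	return fmt.Errorf("task %s not enabled after %s", taskID, timeout)
}

func main() {
	if err := waitForTaskEnabled("http://localhost:9092", "alert_weather.tick", 5*time.Second); err != nil {
		fmt.Println(err)
	}
}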

[feature] Support recordings as data input for tests

Kapacitor recordings are an easy way to replay a set of data against a task. Kapacitor-unit must support recordings as data input for the tests.

  • The recordings must be saved previously in the kapacitor instance used to run the tests;
  • A recording is identified by a unique ID, which is generated by Kapacitor when the recording is created;
  • The recording ID must be provided in the configuration file:
  - name: Alert weather:: warning
    task_script: alert.tick
    db: weather
    rp: default 
    recording_id: e24db07d-1646-4bb3-a445-828f5049bea0
    expects:
      ok: 0
      warn: 1
      crit: 0

When a recording is provided, there cannot be a data field in the test configuration file. If both fields exist, the test configuration is considered invalid: the test will not run and the test output message should describe the validation error.
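
For illustration, a minimal Go sketch of that validation rule (struct fields trimmed down and hypothetical):

package main

import (
	"errors"
	"fmt"
)

// Case is a single test case from the configuration file (only the fields
// relevant for this check are shown).
type Case struct {
	Name        string   `yaml:"name"`
	RecordingID string   `yaml:"recording_id"`
	Data        []string `yaml:"data"`
}

// Validate rejects test cases that define both a recording_id and line
// protocol data input, as required by the feature description above.
func (c Case) Validate() error {
	if c.RecordingID != "" && len(c.Data) > 0 {
		return errors.New("configuration file cannot define a recording_id and line protocol data input for the same test case")
	}
	return nil
}

func main() {
	c := Case{
		Name:        "Alert weather:: invalid - recording_id + data",
		RecordingID: "e24db07d-1646-4bb3-a445-828f5049bea0",
		Data:        []string{"weather,location=us-midwest temperature=82"},
	}
	if err := c.Validate(); err != nil {
		fmt.Println("ERROR:", err)
	}
}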

cannot specify dbrp in implicitly and explicitly

Hi, I was very excited to try this framework, but I'm having a problem I don't understand. I have tried removing db/rp from the YAML as well as removing it from the tick script (not allowed), but neither seems to help. Any ideas?

$ kapacitor-unit --dir etc/kapacitor/load/tasks --tests tests/test_cleanup_nomad.yaml 
...
The unit test framework for TICK scripts (v0.8)

ts=2018-04-17T19:51:02.850Z lvl=info msg="http request" service=http host=127.0.0.1 username=- start=2018-04-17T19:51:02.834445448Z method=POST uri=/kapacitor/v1/tasks protocol=HTTP/1.1 status=400 referer=- user-agent=Go-http-client/1.1 request-id=a8aeea8f-4278-11e8-800d-000000000000 duration=16.322373ms
2018/04/17 19:51:02 400 Bad Request:: {
    "error": "cannot specify dbrp in both implicitly and explicitly"
}

tick script:

dbrp "nomad"."one_month"

var allocated = stream
    |from()
        .measurement('nomad_node_allocated_cpu_megahertz')
    |groupBy('tg_region')
    |window()
        .period(30s)
        .every(30s)
    |mean('gauge')
    |sum('mean')

var capacity = stream
    |from()
        .measurement('nomad_node_resource_cpu_megahertz')
    |groupBy('tg_region')
    |window()
        .period(30s)
        .every(30s)
    |mean('total')
    |sum('mean')

allocated
    |join(capacity)
        .as('allocated', 'capacity')
    |eval(lambda: ("allocated.sum" / "allocated.sum") * 100)
        // Give the resulting field a name
        .as('perc_allocated')
    |log()
    |httpOut('nomad_perc_allocated')
    |alert()
        .id('{{ .Name }}/{{ index .Tags "tg_region" }}')
        .message('{{ .ID }} is {{ .Level }} CPU allocation/capacity rate for {{ index .Tags "tg_region" }}: {{ index .Fields "perc_allocated" | printf "%0.2f" }}%')
        .crit(lambda: "sum" > 80)

test yaml:

tests:
  - name: "Alert allocation approaching capacity:: crit"
    description: "Task should trigger Critical when allocated/capacity above 80"
    task_name: "cleanup_nomad.tick"
    db: "nomad"
    rp: "one_month"
    type: "stream"
    data:
      - "nomad_node_allocated_cpu_megahertz,tg_region=test_region gauge=9000"
      - "nomad_node_resource_cpu_megahertz,tg_region=test_region gauge=10000"
    expects:
      ok: 0
      warn: 0
      crit: 1

Getting "is a directory" error

I feel like I must be doing something really basic wrong...

If I pass a directory with tickscripts in it to kapacitor-unit it complains that it's a directory.

[abutler@dev kapacitor-unit]$ kapacitor-unit --tests tests/connections.yml --dir /opt/tickscripts/enabled/
  _                          _ _                                _ _
 | |                        (_) |                              (_) |
 | | ____ _ _ __   __ _  ___ _| |_ ___  _ __ ______ _   _ _ __  _| |_
 | |/ / _` | '_ \ / _` |/ __| | __/ _ \| '__|______| | | | '_ \| | __|
 |   < (_| | |_) | (_| | (__| | || (_) | |         | |_| | | | | | |_
 |_|\_\__,_| .__/ \__,_|\___|_|\__\___/|_|          \__,_|_| |_|_|\__|
           | |
           |_|
The unit test framework for TICK scripts (v0.8)

2018/03/16 21:16:51 Init Tests failed: read /opt/tickscripts/enabled/: is a directory

If I pass a tickscript file to kapacitor-unit it complains that it's not a directory.

[abutler@dev kapacitor-unit]$ kapacitor-unit --tests tests/connections.yml --dir /opt/tickscripts/enabled/connections.tick
  _                          _ _                                _ _
 | |                        (_) |                              (_) |
 | | ____ _ _ __   __ _  ___ _| |_ ___  _ __ ______ _   _ _ __  _| |_
 | |/ / _` | '_ \ / _` |/ __| | __/ _ \| '__|______| | | | '_ \| | __|
 |   < (_| | |_) | (_| | (__| | || (_) | |         | |_| | | | | | |_
 |_|\_\__,_| .__/ \__,_|\___|_|\__\___/|_|          \__,_|_| |_|_|\__|
           | |
           |_|
The unit test framework for TICK scripts (v0.8)

2018/03/16 21:18:54 Init Tests failed: open /opt/tickscripts/enabled/connections.tick/: not a directory

Support for providing the script path in yaml

It would be helpful to be able to provide a relative tick script path in the configuration YAML. That would make it possible to just pass the root scripts directory on the command line and keep the relative path in the config YAML.

e.g.

scripts
+- scripts1
     +- script1.tick
+- scripts2
     +- script2.tick
tests
+- test1.yaml
+- test2.yaml

Then having in test1.yaml

...
   task_name: scripts1/script1.tick

And running:
kapacitor-unit --dir scripts --tests tests/test1.yaml

I will add another improvement for just providing a tests directory and then iterating over all test yaml configs in there :)
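
A small Go sketch of how the lookup could resolve such a relative path against the --dir root (illustrative only):

package main

import (
	"fmt"
	"path/filepath"
)

// resolveScriptPath joins the --dir root with the task_name from the test
// configuration, so task_name may contain a relative path like "scripts1/script1.tick".
func resolveScriptPath(scriptsDir, taskName string) string {
	return filepath.Join(scriptsDir, taskName)
}

func main() {
	// Prints: scripts/scripts1/script1.tick
	fmt.Println(resolveScriptPath("scripts", "scripts1/script1.tick"))
}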

Add component/e2e tests

It should be straightforward to add component tests running on Docker which verify the output of a series of kapacitor-unit workflows.

One idea would be something like:

  1. Create a docker-compose setup with a kapacitor container and a component-tests container (where the latest kapacitor-unit binary is installed)
  2. The component-tests container runs a bunch of predefined kapacitor-unit tests and logs the outputs to files
  3. When tests finish, the component-tests container verifies the output

batch tests sometimes fail due to "database not found:" errors

Sometimes batch tests can fail, citing "database not found:".

Sample run:

I0326 03:30:26.170480     147 kapacitor.go:33] DEBUG:: Kapacitor loading task: alert_weather_batch.tick
I0326 03:30:26.170561     147 kapacitor.go:42] DEBUG:: batch script after replace: 
var weather = batch
	| query('''
		SELECT mean(temperature)
		FROM "weather"."default"."temperature"
		''')
			.period(5m)
			.every(1s)

var snow = batch
	| query('''
		SELECT count(precipitation)
		FROM "weather"."default"."snow"
	''')
		.period(3h)
		.every(1s)
		.groupBy(time(1h), 'location')
		.alignGroup()


// simple case with only one batch query
//   https://github.com/gpestana/kapacitor-unit/issues/41
//   * must include topic() + stateChangesOnly()

	weather
	| alert().id('Temperature')
		.topic('weather')
		.message('Temperature alert - batch')
		.warn(lambda: "mean" > 80)
		.crit(lambda: "mean" > 100)
		.stateChangesOnly()
    .log('/tmp/temperature_batch.tick.log')

// case with dynamic data

	snow
	| alert().id('Snow')
		.message('Snow alert - batch')
		.warn(lambda: "count" > 2)
		.crit(lambda: "count" > 5)
		.stateChangesOnly()
    .log('/tmp/snow_batch.tick.log')
I0326 03:30:26.250655     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=5   1585186217818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.253084     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585186817818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.254637     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585187417818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.260366     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585188017818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.261962     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585188617818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.263748     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585189217818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.265532     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585189817818311680] to http://localhost:8086/write?db=weather&rp=default
I0326 03:30:26.266927     147 influxdb.go:95] DEBUG:: Influxdb added [snow,location=us-midwest precipitation=10  1585190417818311680] to http://localhost:8086/write?db=weather&rp=default
Processing batch script alert_weather_batch.tick...
I0326 03:30:29.267116     147 kapacitor.go:149] DEBUG:: Kapacitor fetching status of: alert_weather_batch.tick
E0326 03:30:29.285754     147 test.go:137] DEBUG:: teardown test: Alert snow:: batch dynamic timestamps
I0326 03:30:29.286334     147 influxdb.go:119] DEGUB:: Influxdb delete monitor weather
I0326 03:30:29.286408     147 influxdb.go:147] DEGUB:: Influxdb checking database weather
E0326 03:30:29.288026     147 influxdb.go:165] DEBUG:: Database = weather
ts=2020-03-26T03:30:30.217Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query2 err="database not found: weather"
ts=2020-03-26T03:30:30.218Z lvl=error msg="error executing query" service=kapacitor task_master=main task=alert_weather_batch.tick node=query1 err="database not found: weather"
I0326 03:30:30.288275     147 influxdb.go:147] DEGUB:: Influxdb checking database weather
E0326 03:30:30.289786     147 influxdb.go:165] DEBUG:: Database = _internal
E0326 03:30:30.289900     147 influxdb.go:173] DEBUG:: No database = weather
E0326 03:30:30.289954     147 influxdb.go:127] DEGUB:: Database absent: weather
I0326 03:30:30.290011     147 influxdb.go:212] DEBUG:: Influxdb cleanup database q=DROP DATABASE "weather"

Not certain about repro rate.

Also, the repro rate seems to be somewhat dependent upon how users have influx and/or kapacitor set up. In this project, both are started externally via docker-compose, and I've had a moderate repro rate. However, in my work setup, we want to make sure the test container is self-sufficient and self-contained, so we start kapacitor and influx inside the container itself before running kapacitor-unit. With that approach, the repro rate is much higher.

Which is to say, the crux of the problem might be a delicate reliance upon response times / asynchronous behavior.

During debugging, I was able to confirm that performing the influx commands "DELETE DATABASE foo" and then "SHOW DATABASES" in rapid succession would still show database foo briefly after the deletion. Additionally, the test teardown performs the database deletion first and then the task deletion. The query error also appears to occur AFTER the database deletion has begun -- though that part is more confusing, since the code seems clearly sequential; perhaps this is a quirk of error processing flow?
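
If the timing theory holds, one mitigation could be to poll SHOW DATABASES after the drop until the database actually disappears before starting the next test. A rough Go sketch (retry count and response matching are arbitrary assumptions):

package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// waitForDatabaseGone polls SHOW DATABASES until the given database no longer
// appears in the response, or the retries are exhausted.
func waitForDatabaseGone(influxHost, db string, retries int) error {
	queryURL := fmt.Sprintf("%s/query?q=%s", influxHost, url.QueryEscape("SHOW DATABASES"))
	for i := 0; i < retries; i++ {
		resp, err := http.Get(queryURL)
		if err != nil {
			return err
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if !strings.Contains(string(body), fmt.Sprintf("%q", db)) {
			return nil
		}
		time.Sleep(200 * time.Millisecond)
	}
	return fmt.Errorf("database %q still present after teardown", db)
}

func main() {
	if err := waitForDatabaseGone("http://localhost:8086", "weather", 25); err != nil {
		fmt.Println(err)
	}
}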

Test configuration parse failed

kapacitor-unit -dir ./tickscript/ -influxdb http://localhost:8086 -kapacitor http://localhost:9092 --tests tests_conf.yaml -logtostderr

cat tickscript/cpu_alert.tick
stream
|from()
.database('telegraf')
.retationPolicy('autogen')
.measurement('cpu')
.where(lambda: "host" == 'ni-77000-0')
|alert()
.warn(lambda: int("usage_idle") < 70)
.log('/tmp/alerts.log')

cat tests_conf.yaml
tests:
name: Alert
cpu: warning
description: Task should trigger Warning when usage_idle < 70
task_script: cpu_alert.tick
db: telegraf
rp: autogen
data:
- cpu,host=ni-77000-0 usage_idle < 70
expects: warning

Execute all tests found within the tests directory

It would be helpful to be able to provide just the tests directory on the command line. The tool would then iterate over all the tests it finds and run them.

e.g.

scripts
+- scripts1
     +- script1.tick
+- scripts2
     +- script2.tick
tests
+- test1.yaml
+- test2.yaml

And running:
kapacitor-unit --dir scripts --tests tests/

The tool would then execute all tests it finds in the directory.
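
For illustration, a small Go sketch of that iteration (assumes test configurations end in .yaml or .yml; names are illustrative):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findTestConfigs returns every .yaml/.yml file directly inside testsDir,
// so each one can be run as a separate test configuration.
func findTestConfigs(testsDir string) ([]string, error) {
	entries, err := os.ReadDir(testsDir)
	if err != nil {
		return nil, err
	}
	var configs []string
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		ext := filepath.Ext(e.Name())
		if ext == ".yaml" || ext == ".yml" {
			configs = append(configs, filepath.Join(testsDir, e.Name()))
		}
	}
	return configs, nil
}

func main() {
	configs, err := findTestConfigs("tests")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, c := range configs {
		fmt.Println("would run:", c)
	}
}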

[Discussion] How to test batch scripts

When the script type is batch, Kapacitor will trigger the query on the schedule defined by the script (see example below). The query aggregates the data points that the tick script will run against.

...
var data = batch|query(<query>)
    .period(10m)
    .every(5m)
...

kapacitor-unit should not wait for the batch queries to be triggered, since that could make the test run too slow. We need to figure out how to trigger the batch queries as soon as the script is loaded, without influencing the script behaviour.

One idea is to change the query period on the fly before the script is loaded into kapacitor.

Installation broken

Installation appears to be broken now:

xxx@xxx:~/repos/go$ go get github.com/gpestana/kapacitor-unit
package github.com/gpestana/kapacitor-unit: no Go files in /home/xxx/repos/go/src/github.com/gpestana/kapacitor-unit

(but the files are downloaded)
Then:

xxx@xxx:~/repos/go$ go install github.com/gpestana/kapacitor-unit
can't load package: package github.com/gpestana/kapacitor-unit: no Go files in /home/xxx/repos/go/src/github.com/gpestana/kapacitor-unit

I'm new to go, but I believe the problem is this commit:

fe70122

I think it's because there's no longer a top-level file with a main() function, so go get/install throw errors.

Test tick scripts using node output

What

Kapacitor features node outputs, in which processed data from kapacitor can be output to a certain service. Kapacitor has two types of node output implemented out of the box and ready to use: HTTPOutputNode and InfluxOutputNode. It is also possible to define custom output nodes. TICKscripts which process data and output it to a certain endpoint may or may not trigger alerts.

It would be a great feature for kapacitor-unit to test node outputs.

Test flow:

  1. Define the test configuration (which data to load, which tick script to load, what are the output expected after the data processing)
  2. Load tick script
  3. Load task
  4. Inspect output data
  5. Compare expected data with output data

Open points

A) How to pipe data coming from an output node to inspect its contents and compare with the expected data defined in the test definition?
kapacitor-unit should be able to consume processed data coming from any type of NodeOuput, predefined or custom.

B) How would the test definition look like?

C) Should we add a way to split test configurations so that they are easier to manage (e.g. one test configuration for alerts and another for data processing (OutputNode) tests)?
