nredistimeseries's Introduction

RedisTimeSeries

RedisTimeSeries is a time-series database (TSDB) module for Redis, by Redis.

RedisTimeSeries can hold multiple time series. Each time series is accessible via a single Redis key (similar to any other Redis data structure).

What is a Redis time series?

A Redis time series comprises:

  • Raw samples: each raw sample is a {time tag, value} pair.

    • Time tags are measured in milliseconds since January 1st, 1970, at 00:00:00.

      Time tags can be specified by the client or filled automatically by the server.

    • 64-bit floating-point values.

    The intervals between time tags can be constant or variable.

    Raw samples can be reported in-order or out-of-order.

    Duplication policy for samples with identical time tags can be set: block/first/last/min/max/sum.

  • An optional configurable retention period.

    Raw samples older than the retention period (relative to the raw sample with the highest time tag) are discarded.

  • Series Metadata: a set of name-value pairs (e.g., room = 3; sensorType = ‘xyz’).

    RedisTimeSeries supports cross-time-series commands. One can, for example, aggregate data over all sensors in the same room or all sensors of the same type.

  • Zero or more compactions.

    Compactions are an economical way to retain historical data.

    Each compaction is defined by:

    • A timeframe. E.g., 10 minutes
    • An aggregator: min, max, sum, avg, …
    • An optional retention period. E.g., 10 years

    For example, the compaction {10 minutes; avg; 10 years} stores the average of the raw values measured in each 10-minute timeframe, and keeps those averages for 10 years.
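The bucketing behind a compaction can be sketched in a few lines of Python. This is only an illustration of the idea under simple assumptions (in-memory samples, buckets aligned to timestamp 0); the function name `compact_avg` and the sample data are hypothetical and are not part of RedisTimeSeries.

```python
from collections import defaultdict


def compact_avg(samples, bucket_ms):
    """Average raw (timestamp_ms, value) samples per fixed-size time bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        # A sample belongs to the bucket that starts at the nearest
        # lower multiple of bucket_ms.
        buckets[ts - ts % bucket_ms].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


TEN_MINUTES_MS = 10 * 60 * 1000
raw = [(0, 20.0), (300_000, 22.0), (600_000, 30.0)]  # hypothetical readings
print(compact_avg(raw, TEN_MINUTES_MS))  # [(0, 21.0), (600000, 30.0)]
```

The first two readings fall into the same 10-minute bucket and are averaged; the third starts a new bucket.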

How do I Redis?

Learn for free at Redis University

Build faster with the Redis Launchpad

Try the Redis Cloud

Dive into developer tutorials

Join the Redis community

Work at Redis

Examples of time series

  • Sensor data: e.g., temperatures or fan velocity for a server in a server farm
  • Historical prices of a stock
  • Number of vehicles passing through a given road (count per 1-minute timeframes)

Features

  • High volume inserts, low latency reads
  • Query by start time and end-time
  • Aggregated queries (Min, Max, Avg, Sum, Range, Count, First, Last, STD.P, STD.S, Var.P, Var.S, twa) for any time bucket
  • Configurable maximum retention period
  • Compactions - automatically updated aggregated timeseries
  • Secondary index - each time series has labels (name-value pairs), which allow querying by label

Using with other metrics tools

In the RedisTimeSeries organization you can find projects that help you integrate RedisTimeSeries with other tools, including:

  1. Prometheus - read/write adapter to use RedisTimeSeries as backend db.
  2. Grafana - using the Redis Data Source.
  3. Telegraf
  4. StatsD, Graphite exports using graphite protocol.

RedisTimeSeries is part of Redis Stack.

Setup

You can set up RedisTimeSeries either in a Docker container or on your own machine.

Docker

To quickly try out RedisTimeSeries, launch an instance using docker:

docker run -p 6379:6379 -it --rm redis/redis-stack-server:latest

Build it yourself

You can also build RedisTimeSeries on your own machine. Major Linux distributions as well as macOS are supported.

The first step is to install Redis. The following, for example, builds Redis on a clean Ubuntu docker image (docker pull ubuntu) or a clean Debian docker image (docker pull debian:stable):

mkdir ~/Redis
cd ~/Redis
apt-get update -y && apt-get upgrade -y
apt-get install -y wget make pkg-config build-essential
wget https://download.redis.io/redis-stable.tar.gz
tar -xzvf redis-stable.tar.gz
cd redis-stable
make distclean
make
make install

Next, you should get the RedisTimeSeries repository from git and build it:

apt-get install -y git
cd ~/Redis
git clone --recursive https://github.com/RedisTimeSeries/RedisTimeSeries.git
cd RedisTimeSeries
./sbin/setup
bash -l
make

Then run exit to leave the bash login shell.

Note: to get a specific version of RedisTimeSeries, e.g. 1.8.10, add -b v1.8.10 to the git clone command above.

Next, run make run -n and copy the full path of the RedisTimeSeries module (e.g., /root/Redis/RedisTimeSeries/bin/linux-x64-release/redistimeseries.so).

Next, add the RedisTimeSeries module to redis.conf, so that Redis loads it on startup:

apt-get install -y vim
cd ~/Redis/redis-stable
vim redis.conf

Add: loadmodule /root/Redis/RedisTimeSeries/bin/linux-x64-release/redistimeseries.so under the MODULES section (use the full path copied above).

Save and exit vim (ESC :wq ENTER)

For more information about modules, go to the Redis official documentation.

Run

Run redis-server in the background and then redis-cli:

cd ~/Redis/redis-stable
redis-server redis.conf &
redis-cli

Give it a try

After you set up RedisTimeSeries, you can interact with it using redis-cli.

Here we'll create a time series representing sensor temperature measurements. After you create the time series, you can send temperature measurements. Then you can query the data for a time range using an aggregation rule.

With redis-cli

$ redis-cli
127.0.0.1:6379> TS.CREATE temperature:3:11 RETENTION 60 LABELS sensor_id 2 area_id 32
OK
127.0.0.1:6379> TS.ADD temperature:3:11 1548149181 30
(integer) 1548149181
127.0.0.1:6379> TS.ADD temperature:3:11 1548149191 42
(integer) 1548149191
127.0.0.1:6379> TS.RANGE temperature:3:11 1548149180 1548149210 AGGREGATION avg 5
1) 1) (integer) 1548149180
   2) "30"
2) 1) (integer) 1548149190
   2) "42"

Client libraries

Some languages have client libraries that provide support for RedisTimeSeries commands:

Project             Language    License   Author              Package
jedis               Java        MIT       Redis               Maven
redis-py            Python      MIT       Redis               pypi
node-redis          Node.js     MIT       Redis               npm
nredisstack         .NET        MIT       Redis               nuget
redistimeseries-go  Go          Apache-2  Redis               GitHub
rueidis             Go          Apache-2  Rueian              GitHub
phpRedisTimeSeries  PHP         MIT       Alessandro Balasco  GitHub
redis-time-series   JavaScript  MIT       Rafa Campoy         GitHub
redistimeseries-js  JavaScript  MIT       Milos Nikolovski    GitHub
redis_ts            Rust        BSD-3     Thomas Profelt      GitHub
redistimeseries     Ruby        MIT       Eaden McKee         GitHub
redis-time-series   Ruby        MIT       Matt Duszynski      GitHub

Tests

The module includes a basic set of unit tests and integration tests.

Unit tests

To run all unit tests, follow these steps:

$ make unit_tests

Integration tests

Integration tests are based on RLTest, and specific setup parameters can be provided to configure tests. By default the tests are run for all common commands, with variations of persistence and replication.

To run all integration tests in a Python virtualenv, follow these steps:

$ mkdir -p .env
$ virtualenv .env
$ source .env/bin/activate
$ pip install -r tests/flow/requirements.txt
$ make test

To understand what test options are available simply run:

$ make help

For example, to run only the tests designed for the TS.ADD command, follow these steps:

$ make test TEST=test_ts_add.py

Documentation

Read the docs at http://redistimeseries.io

Mailing List / Forum

Got questions? Feel free to ask at the RedisTimeSeries forum.

License

RedisTimeSeries is licensed under the Redis Source Available License 2.0 (RSALv2) or the Server Side Public License v1 (SSPLv1).

nredistimeseries's People

Contributors

aviavni, avitalfineredis, chayim, cjdsellers, dvirdukhan, filipecosta90, gkorland, shaunsales

nredistimeseries's Issues

Missing TS.REVRANGE and TS.MREVRANGE commands

I found that these commands are not available via any methods in the TimeSeriesClient. I have a use case whereby I need to retrieve the latest data with a count specified and the reverse command achieves this.

I've implemented my own extension method to get around the issue however I thought I would flag this as I know you're preparing a new release. I could also open a PR.

Support for DUPLICATE_POLICY on TS.CREATE and TS.ADD

As of RedisTimeSeries >= 1.4, you can add samples to a time series where the time of the sample is older than the newest sample in the series. Bundled with that is a policy that defines how duplicate samples are handled, which the client needs to support via the arguments [DUPLICATE_POLICY policy] on TS.CREATE and [ON_DUPLICATE policy] on TS.ADD. The following are the possible policies:

  • BLOCK - an error will occur for any out of order sample
  • FIRST - ignore the new value
  • LAST - override with latest value
  • MIN - only override if the value is lower than the existing value
  • MAX - only override if the value is higher than the existing value

Further reference: documentation link
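To make the policies listed above concrete, here is a minimal Python sketch of how a value could be chosen when a new sample arrives at a timestamp that already holds one. The function `resolve_duplicate` is hypothetical and only mirrors the five policies in this issue; it is not the server's implementation or any client's API.

```python
def resolve_duplicate(policy, existing, new):
    """Return the value to keep for a timestamp that already has a sample."""
    policy = policy.upper()
    if policy == "BLOCK":
        # BLOCK rejects the write instead of choosing a value.
        raise ValueError("duplicate sample blocked")
    if policy == "FIRST":
        return existing            # ignore the new value
    if policy == "LAST":
        return new                 # override with the latest value
    if policy == "MIN":
        return min(existing, new)  # override only if the new value is lower
    if policy == "MAX":
        return max(existing, new)  # override only if the new value is higher
    raise ValueError(f"unknown policy: {policy}")


print(resolve_duplicate("MIN", 30.0, 42.0))   # 30.0
print(resolve_duplicate("LAST", 30.0, 42.0))  # 42.0
```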

Using TimeStamp class can lead to incompatible timestamps in RedisTimeSeries

The TimeStamp class uses DateTime.Ticks which is incompatible with the native Unix Epoch milliseconds used by Redis TimeSeries.

If you have cross-platform clients accessing the time series data, or use the * auto timestamp which defaults to Unix ms, it's easy to end up with unexpected results when querying ranges and aggregating samples.

I suggest the TimeStamp class bounds check entries and default to supporting only UnixMs min and max timestamps. After testing, it appears that Redis TimeSeries only supports positive timestamp values, which means any dates before epoch won't work.

This would allow * to work as expected from the NRedisTimeSeries client and timestamps would be compatible across Unix/MacOS and Windows platforms. However using Unix ms means that DateTime.MinValue would need to throw an out of range exception. I feel if we build out a decent TimeStamp struct, this is an acceptable tradeoff.
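The mismatch between .NET ticks and Unix milliseconds can be illustrated with a short Python sketch. It assumes the standard definition of a tick (100 nanoseconds, counted from 0001-01-01) and shows the bounds check suggested above; the function names are hypothetical and are not part of NRedisTimeSeries.

```python
from datetime import datetime

TICKS_PER_MS = 10_000  # one tick is 100 nanoseconds
# Ticks elapsed between 0001-01-01 and the Unix epoch (1970-01-01).
UNIX_EPOCH_TICKS = (datetime(1970, 1, 1) - datetime(1, 1, 1)).days * 86_400 * 10_000_000


def ticks_to_unix_ms(ticks):
    """Convert a tick count to Unix milliseconds, rejecting pre-epoch dates."""
    unix_ms = (ticks - UNIX_EPOCH_TICKS) // TICKS_PER_MS
    if unix_ms < 0:
        raise ValueError("timestamps before the Unix epoch are out of range")
    return unix_ms


def unix_ms_to_ticks(unix_ms):
    """Convert Unix milliseconds back to a tick count."""
    if unix_ms < 0:
        raise ValueError("timestamps before the Unix epoch are out of range")
    return unix_ms * TICKS_PER_MS + UNIX_EPOCH_TICKS


print(UNIX_EPOCH_TICKS)                                   # 621355968000000000
print(ticks_to_unix_ms(unix_ms_to_ticks(1548149181000)))  # 1548149181000
```

A tick count of 0 (the equivalent of DateTime.MinValue) raises here, which is the out-of-range behaviour the issue proposes.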

Improve performance and reduce allocations when parsing RedisResults

Currently when a RedisResult[] is returned from a TS.RANGE or similar query, we enumerate the entire result and allocate it to a new IReadOnlyCollection. This feels unnecessary, and iteration should be left up to the user.

private static TimeStamp ParseTimeStamp(RedisResult result)
{
	if (result.Type == ResultType.None) return default;
	return new TimeStamp((long)result);
}

private static TimeSeriesTuple ParseTimeSeriesTuple(RedisResult result)
{
	RedisResult[] redisResults = (RedisResult[])result;
	if (redisResults.Length == 0) return null;
	return new TimeSeriesTuple(ParseTimeStamp(redisResults[0]), (double)redisResults[1]);
}

private static IReadOnlyList<TimeSeriesTuple> ParseTimeSeriesTupleArray(RedisResult result)
{
	RedisResult[] redisResults = (RedisResult[])result;
	var list = new List<TimeSeriesTuple>(redisResults.Length);
	if (redisResults.Length == 0) return list;
	Array.ForEach(redisResults, tuple => list.Add(ParseTimeSeriesTuple(tuple)));
	return list;
}

I propose we wrap the RedisResult in a TsTimeSeriesCollection object, something like:

public class TsTimeSeriesCollection
{
	// Raw reply from Redis; individual samples are cast only when accessed.
	private readonly RedisResult[] _redisResults;

	public TsTimeSeriesCollection(RedisResult redisResult)
	{
		_redisResults = (RedisResult[])redisResult;
	}

	// Casts a single [timestamp, value] entry on demand.
	public (long TimeStamp, double Value) this[int index] => ((long)((RedisResult[]) _redisResults[index])[0], (double) ((RedisResult[]) _redisResults[index])[1]);

	public int Count => _redisResults.Length;
}

The above will only cast the samples RedisResult when it's accessed, which should help with performance when large sample sets are returned. I haven't extended this to implement IEnumerable<(long,double)> or started optimizing, but the general idea is to create a wrapper around the RedisResult[] and make time series arrays less allocatey and easier to work with.
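The same lazy-wrapper idea can be shown in Python as a rough analogue. The class `LazySamples` is hypothetical: it keeps the raw reply untouched and converts an entry only when it is indexed or iterated, which is the behaviour the proposal is after. It is a sketch of the pattern, not code for this library.

```python
class LazySamples:
    """Wrap a raw reply of [timestamp, value] pairs; convert entries on access."""

    def __init__(self, raw_reply):
        # The reply is stored as-is; no per-sample objects are built up front.
        self._raw = raw_reply

    def __len__(self):
        return len(self._raw)

    def __getitem__(self, index):
        # Conversion happens here, for the requested entry only.
        ts, value = self._raw[index]
        return int(ts), float(value)

    def __iter__(self):
        return (self[i] for i in range(len(self)))


samples = LazySamples([[1548149180, "30"], [1548149190, "42"]])
print(len(samples))  # 2
print(samples[1])    # (1548149190, 42.0)
```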

Since this is a breaking change, it might make sense to do prior to the next release. Feedback welcome!

Add aggregation ALIGN option

RedisTimeSeries/RedisTimeSeries#801

Detailed TS.RANGE args with the new ALIGN feature

TS.RANGE key fromTimestamp toTimestamp [FILTER_BY_TS TS1 TS2 ..] [FILTER_BY_VALUE min max] [COUNT count] [ALIGN value] [AGGREGATION aggregationType timeBucket]
TS.REVRANGE key fromTimestamp toTimestamp [FILTER_BY_TS TS1 TS2 ..] [FILTER_BY_VALUE min max] [COUNT count] [ALIGN value] [AGGREGATION aggregationType timeBucket]

Detail of ALIGN docs

  • ALIGN - Time bucket alignment control for AGGREGATION. This will control the time bucket timestamps by changing the reference timestamp on which a bucket is defined.
    Possible values:

    • start or -: The reference timestamp will be the query start interval time (fromTimestamp).
    • end or +: The reference timestamp will be the signed remainder of query end interval time by the AGGREGATION time bucket (toTimestamp % timeBucket).
    • A specific timestamp: align the reference timestamp to a specific time.

    Note: when not provided alignment is set to 0.


Sample behaviour:

( first ingestion )

127.0.0.1:6379> ts.add serie1 1 10.0
(integer) 1
127.0.0.1:6379> ts.add serie1 3 5.0
(integer) 3
127.0.0.1:6379> ts.add serie1 11 10.0
(integer) 11
127.0.0.1:6379> ts.add serie1 21 11.0
(integer) 21

Old behaviour and the default behaviour when no ALIGN is specified ( aligned to 0 ):

127.0.0.1:6379> ts.range serie1 1 30 AGGREGATION COUNT 10
1) 1) (integer) 0
   2) 2
2) 1) (integer) 10
   2) 1
3) 1) (integer) 20
   2) 1

Align to the query start interval time (fromTimestamp)

127.0.0.1:6379> ts.range serie1 1 30 ALIGN start AGGREGATION COUNT 10
1) 1) (integer) 1
   2) 2
2) 1) (integer) 11
   2) 1
3) 1) (integer) 21
   2) 1

Align to the query end interval time (toTimestamp). The reference timestamp will be the signed remainder of query end interval time by the AGGREGATION time bucket (toTimestamp % timeBucket).

127.0.0.1:6379> ts.range serie1 1 30 ALIGN end AGGREGATION COUNT 10
1) 1) (integer) 0
   2) 2
2) 1) (integer) 10
   2) 1
3) 1) (integer) 20
   2) 1

Align to a timestamp

127.0.0.1:6379> ts.range serie1 1 30 ALIGN 1 AGGREGATION COUNT 10
1) 1) (integer) 1
   2) 2
2) 1) (integer) 11
   2) 1
3) 1) (integer) 21
   2) 1
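The effect of ALIGN on bucket timestamps can be reproduced with a small Python sketch. It follows the rules described in this issue (start uses fromTimestamp, end uses toTimestamp % timeBucket, a number is used directly, and the default is 0) and counts samples per bucket. The function `aggregate_count` is hypothetical and only models the sample behaviour shown above; it is not the server's algorithm.

```python
def aggregate_count(samples, from_ts, to_ts, bucket, align=0):
    """Count samples per time bucket, with buckets anchored at an ALIGN reference."""
    if align in ("start", "-"):
        reference = from_ts
    elif align in ("end", "+"):
        reference = to_ts % bucket
    else:
        reference = int(align)
    counts = {}
    for ts, _value in samples:
        if from_ts <= ts <= to_ts:
            # Bucket start: the largest reference + k * bucket that is <= ts.
            start = ts - (ts - reference) % bucket
            counts[start] = counts.get(start, 0) + 1
    return sorted(counts.items())


serie1 = [(1, 10.0), (3, 5.0), (11, 10.0), (21, 11.0)]
print(aggregate_count(serie1, 1, 30, 10))                 # [(0, 2), (10, 1), (20, 1)]
print(aggregate_count(serie1, 1, 30, 10, align="start"))  # [(1, 2), (11, 1), (21, 1)]
print(aggregate_count(serie1, 1, 30, 10, align="end"))    # [(0, 2), (10, 1), (20, 1)]
```

With toTimestamp = 30 and a bucket of 10, the end reference is 30 % 10 = 0, which is why ALIGN end matches the default output in this particular example.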

Refactor TimeStamp class to UNIXTimeStamp

As a follow up to #20
DateTime should be converted to UNIXTimeStamp so that this library's API is fully compliant with RedisTimeSeries native operations.
This change will trigger 2.0.x Release
Side effects:

  1. Users will no longer be able to store DateTime values losslessly, because converting a DateTime to a UNIX timestamp and back to a DateTime discards sub-millisecond resolution.
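The loss of resolution in a round trip can be demonstrated with Python's datetime, which carries microseconds (coarser than .NET ticks, but enough to show the effect). The helper names below are hypothetical and the chosen date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(dt):
    """Convert an aware datetime to whole Unix milliseconds (truncating)."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000


def from_unix_ms(unix_ms):
    """Convert Unix milliseconds back to an aware datetime."""
    return EPOCH + timedelta(milliseconds=unix_ms)


original = datetime(2019, 1, 22, 9, 26, 21, 123456, tzinfo=timezone.utc)
restored = from_unix_ms(to_unix_ms(original))
print(original.microsecond)  # 123456
print(restored.microsecond)  # 123000
```

Everything below one millisecond is dropped, so the restored value is no longer equal to the original.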

SELECTED_LABELS label1 ... support for TS.MRANGE and TS.MREVRANGE

Changes to TS.MRANGE / MREVRANGE

SELECTED_LABELS lets you request only a subset of the key-value pair labels of a series.
Note that SELECTED_LABELS and WITHLABELS are mutually exclusive.

TS.MRANGE fromTimestamp toTimestamp
          [FILTER_BY_TS TS1 TS2 ..]
          [FILTER_BY_VALUE min max]
          [COUNT count]
          [WITHLABELS | SELECTED_LABELS label1 ..]
          [AGGREGATION aggregationType timeBucket]
          FILTER filter..
          [GROUPBY <label> REDUCE <reducer>]

TS.MREVRANGE fromTimestamp toTimestamp
          [FILTER_BY_TS TS1 TS2 ..]
          [FILTER_BY_VALUE min max]
          [COUNT count]
          [WITHLABELS | SELECTED_LABELS label1 ..]
          [AGGREGATION aggregationType timeBucket]
          FILTER filter..
          [GROUPBY <label> REDUCE <reducer>]

More detailed args specs:

* WITHLABELS - Include in the reply the label-value pairs that represent metadata labels of the time series. If neither `WITHLABELS` nor `SELECTED_LABELS` is set, an empty Array is returned in the labels array position by default.
* SELECTED_LABELS - Include in the reply a subset of the label-value pairs that represent metadata labels of the time series. This is useful when you have a large number of labels per series but are only interested in the values of some of them. If neither `WITHLABELS` nor `SELECTED_LABELS` is set, an empty Array is returned in the labels array position by default.

Example:

Query time series with metric=cpu, but only reply the team label

127.0.0.1:6379> TS.ADD ts1 1 90 labels metric cpu metric_name system team NY
(integer) 1
127.0.0.1:6379> TS.ADD ts1 2 45
(integer) 2
127.0.0.1:6379> TS.ADD ts2 2 99 labels metric cpu metric_name user team SF
(integer) 2
127.0.0.1:6379> TS.MRANGE - + SELECTED_LABELS team FILTER metric=cpu
1) 1) "ts1"
   2) 1) 1) "team"
         2) "NY"
   3) 1) 1) (integer) 1
         2) 90
      2) 1) (integer) 2
         2) 45
2) 1) "ts2"
   2) 1) 1) "team"
         2) "SF"
   3) 1) 1) (integer) 2
         2) 99
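The reply shape in the example above can be modelled with an in-memory Python sketch. The `mrange` function and the `series` dictionary are hypothetical; the sketch supports only a single `label=value` filter and, as an assumption, skips selected labels that a series does not have.

```python
series = {
    "ts1": {"labels": {"metric": "cpu", "metric_name": "system", "team": "NY"},
            "samples": [(1, 90), (2, 45)]},
    "ts2": {"labels": {"metric": "cpu", "metric_name": "user", "team": "SF"},
            "samples": [(2, 99)]},
}


def mrange(series, label_filter, selected_labels=None, with_labels=False):
    """Return (key, labels, samples) for every series matching 'label=value'."""
    if selected_labels and with_labels:
        raise ValueError("SELECTED_LABELS and WITHLABELS are mutually exclusive")
    key, _, wanted = label_filter.partition("=")
    reply = []
    for name in sorted(series):
        labels = series[name]["labels"]
        if labels.get(key) != wanted:
            continue
        if with_labels:
            shown = list(labels.items())
        elif selected_labels:
            # Assumption of this sketch: labels missing from a series are skipped.
            shown = [(l, labels[l]) for l in selected_labels if l in labels]
        else:
            shown = []  # neither option set: empty labels array
        reply.append((name, shown, series[name]["samples"]))
    return reply


for row in mrange(series, "metric=cpu", selected_labels=["team"]):
    print(row)
# ('ts1', [('team', 'NY')], [(1, 90), (2, 45)])
# ('ts2', [('team', 'SF')], [(2, 99)])
```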

Add GROUPBY and REDUCE support

Expected argument extension to TS.MRANGE: GROUPBY <label> REDUCE <reducer>

TS.MRANGE 1451679382646 1451682982646 WITHLABELS 
AGGREGATION MAX 60000 
FILTER measurement=cpu 
      fieldname=usage_user 
      hostname=(host_9,host_3,host_5,host_1,host_7,host_2,host_8,host_4)
GROUPBY hostname REDUCE MAX

Return Value

Array-reply, specifically it should include the following labels:

Labels:

  • <label>=<groupbyvalue>
  • __reducer__=<reducer>
  • __source__=key1,key2,key3
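A minimal Python sketch of the grouping and reduction, producing the three labels listed above, might look like this. The function `group_and_reduce`, the key names, and the sample data are hypothetical, and the set of reducer names other than max is an assumption of the sketch rather than something stated in this issue.

```python
from collections import defaultdict

# Reducers applied across series at each timestamp (names are assumptions).
REDUCERS = {"max": max, "min": min, "sum": sum}


def group_and_reduce(series, label, reducer):
    """Group series by a label value, then reduce their samples per timestamp."""
    reduce_fn = REDUCERS[reducer.lower()]
    groups = defaultdict(list)
    for name, entry in series.items():
        if label in entry["labels"]:
            groups[entry["labels"][label]].append(name)
    reply = []
    for value, names in sorted(groups.items()):
        by_ts = defaultdict(list)
        for name in names:
            for ts, sample in series[name]["samples"]:
                by_ts[ts].append(sample)
        labels = {
            label: value,
            "__reducer__": reducer.lower(),
            "__source__": ",".join(sorted(names)),
        }
        samples = [(ts, reduce_fn(vals)) for ts, vals in sorted(by_ts.items())]
        reply.append((f"{label}={value}", labels, samples))
    return reply


hosts = {
    "cpu:a": {"labels": {"hostname": "host_1"}, "samples": [(1, 10), (2, 30)]},
    "cpu:b": {"labels": {"hostname": "host_1"}, "samples": [(1, 20), (2, 25)]},
    "cpu:c": {"labels": {"hostname": "host_2"}, "samples": [(1, 5)]},
}
for row in group_and_reduce(hosts, "hostname", "MAX"):
    print(row)
```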

Add support for TS.DEL

TS.DEL

Delete data points for a given timeseries and interval range in the form of start and end delete timestamps.

The given timestamp interval is closed (inclusive), meaning start and end data points will also be deleted.

TS.DEL key fromTimestamp toTimestamp
  • key - Key name for timeseries
  • fromTimestamp - Start timestamp for the range deletion.
  • toTimestamp - End timestamp for the range deletion.

Complexity

TS.DEL complexity is O(n).

n = Number of data points that are in the requested range

Delete range of data points example

TS.DEL temperature:2:32 1548149180000 1548149183000
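The inclusive range semantics can be illustrated with a short Python sketch over a list of samples sorted by timestamp. The function `delete_range` is hypothetical, and returning the number of removed samples is a choice made for this sketch.

```python
from bisect import bisect_left, bisect_right


def delete_range(samples, from_ts, to_ts):
    """Remove samples with from_ts <= timestamp <= to_ts; return how many were removed.

    `samples` must be a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in samples]
    lo = bisect_left(timestamps, from_ts)   # first index with timestamp >= from_ts
    hi = bisect_right(timestamps, to_ts)    # first index with timestamp > to_ts
    del samples[lo:hi]
    return hi - lo


readings = [
    (1548149179000, 20.0),
    (1548149180000, 21.0),
    (1548149181500, 22.0),
    (1548149183000, 23.0),
    (1548149184000, 24.0),
]
print(delete_range(readings, 1548149180000, 1548149183000))  # 3
print(readings)  # [(1548149179000, 20.0), (1548149184000, 24.0)]
```

Both boundary samples (at 1548149180000 and 1548149183000) are removed, matching the closed interval described above.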

Support for OSS Cluster

The current edge version of RedisTimeSeries is suited for working with an OSS Cluster topology in the following manner:

  • on single key commands like any other command
  • on commands that require multi-key gather/aggregation/filtering ( like TS.MRANGE, TS.MGET, TS.MREVRANGE,... ) when used together with RedisGears ( Gears integration PR RedisTimeSeries/RedisTimeSeries#653 ).

Even though this is not yet released, we should start embracing cluster-ready ( or topology agnostic ) clients.

Pipelining of TS.ADD

Version

1.3.0

Issue

Does the TS.ADD implement a pipelining mechanism to allow for higher throughput? Is using the Async mechanism analogous to pipelining, if so are there any examples on how to implement pipelining correctly?

Add zrevrangebyts ?

When using sorted sets for time series I use zrevrangebyscore to find the latest element prior to a given timestamp. See the below example. It seems I cannot perform a similar nearest-neighbour look up with TS.RANGE, since the min and max values must match existing keys exactly. Is there a way to find the nearest-neighbour with TS.RANGE? or would a TS.zrangebyts need to be added? I'm sure this kind of usage is common and would be a great addition to this phenomenal library.

ZADD mySortedSet 1.1 100
ZADD mySortedSet 1.2 200
ZADD mySortedSet 1.3 300
ZADD mySortedSet 1.4 400
ZADD mySortedSet 1.5 500

zrevrangebyscore mySortedSet 1.25 -inf LIMIT 0 1
1) "200"
zrevrangebyscore mySortedSet 1.35 -inf LIMIT 0 1
1) "300"
zrevrangebyscore mySortedSet 1.42 -inf LIMIT 0 1
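The lookup being asked for, the latest sample at or before a given timestamp, can be sketched in Python with a binary search over sorted samples. The function `latest_at_or_before` is hypothetical, and the data mirrors the sorted-set example with the scores scaled to integer timestamps.

```python
from bisect import bisect_right


def latest_at_or_before(samples, ts):
    """Return the last (timestamp, value) with timestamp <= ts, or None."""
    timestamps = [t for t, _ in samples]
    i = bisect_right(timestamps, ts)  # number of samples with timestamp <= ts
    return samples[i - 1] if i else None


# Scores 1.1 .. 1.5 from the example, scaled by 1000 to integer timestamps.
samples = [(1100, 100), (1200, 200), (1300, 300), (1400, 400), (1500, 500)]
print(latest_at_or_before(samples, 1250))  # (1200, 200)
print(latest_at_or_before(samples, 1350))  # (1300, 300)
print(latest_at_or_before(samples, 1000))  # None
```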

Simplify the Aggregation type

Working with the Aggregation type is currently not as easy as it could be. We can avoid creating a new object, and the ambiguity with the Aggregation.Name property (easy to confuse when comparing).

I propose we convert it to an enum, which will be easier for the end user:

public enum Aggregation
{
	Avg,
	Sum,
	Min,
	Max,
	Range,
	Count,
	First,
	Last,
	StdP,
	StdS,
	VarP,
	VarS,
}

And internally use an extension method to retrieve the RedisTimeSeries param:

internal static class AggregationExtensions
{
	public static string Param(this Aggregation aggregation) => aggregation switch
	{
		Aggregation.Avg => "avg",
		Aggregation.Sum => "sum",
		Aggregation.Min => "min",
		Aggregation.Max => "max",
		Aggregation.Range => "range",
		Aggregation.Count => "count",
		Aggregation.First => "first",
		Aggregation.Last => "last",
		Aggregation.StdP => "std.p",
		Aggregation.StdS => "std.s",
		Aggregation.VarP => "var.p",
		Aggregation.VarS => "var.s",
		_ => throw new ArgumentException("Invalid Aggregation parameter"),
	};
}
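For comparison, the same enum-to-parameter mapping can be expressed in Python, where the enum value itself carries the command parameter. This is an illustrative analogue only; the helper `aggregation_args` is hypothetical and is not part of any client library.

```python
from enum import Enum


class Aggregation(Enum):
    """Aggregation types; each value is the parameter sent to RedisTimeSeries."""
    AVG = "avg"
    SUM = "sum"
    MIN = "min"
    MAX = "max"
    RANGE = "range"
    COUNT = "count"
    FIRST = "first"
    LAST = "last"
    STD_P = "std.p"
    STD_S = "std.s"
    VAR_P = "var.p"
    VAR_S = "var.s"


def aggregation_args(aggregation, time_bucket):
    """Build the AGGREGATION portion of a command's argument list."""
    return ["AGGREGATION", aggregation.value, time_bucket]


print(aggregation_args(Aggregation.STD_P, 5000))  # ['AGGREGATION', 'std.p', 5000]
```

Comparing two enum members is unambiguous, which avoids the confusion around a separate Name property.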

How to store/retrieve multivariate time series

I have a stock price model with multiple fields, like this:

public class IndexSeries
{ 
	public int IndexId {get;set;}
	public long TradingDate {get;set;}  //Unix Epoch milliseconds
	public decimal OpenIndex { get; set; }
	public decimal CloseIndex { get; set; }
	public decimal HighestIndex { get; set; }
	public decimal LowestIndex { get; set; }
	public long TotalMatchVolume { get; set; } 
	public long MatchValue { get; set; }
	public long MatchVolume { get; set; }    
        ...  
}

Data is received at high frequency (at least once per second), and I want to use RedisTimeSeries to store and retrieve it. How can I do this?
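One possible approach, sketched here in Python, is to keep one time series per field and write all fields of a record with a single TS.MADD call, which takes repeated key, timestamp, value triples. The key naming scheme `index:<IndexId>:<field>` and the function `madd_args` are assumptions made for illustration. Note that values are stored as 64-bit floats, so decimal fields lose their exact decimal representation.

```python
from decimal import Decimal


def madd_args(record, id_field="IndexId", ts_field="TradingDate"):
    """Flatten one record into TS.MADD arguments: key, timestamp, value triples."""
    args = []
    for field, value in record.items():
        if field in (id_field, ts_field):
            continue
        # Hypothetical key scheme: one series per field of each index.
        key = f"index:{record[id_field]}:{field}"
        args += [key, record[ts_field], float(value)]
    return args


record = {
    "IndexId": 7,
    "TradingDate": 1548149181000,  # Unix epoch milliseconds
    "OpenIndex": Decimal("101.5"),
    "CloseIndex": Decimal("102.25"),
    "MatchVolume": 1200,
}
print(["TS.MADD"] + madd_args(record))
```

Reading a full record back then means querying each field's series for the same time range; giving every series a shared label for the index id would allow that to be done with a single multi-key query.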

Add examples and further notes about duplicate policy

Since RedisTimeSeries 1.4 we've added the ability to back-fill time series, with different duplicate policies.

However, we still see several issues being raised on the core repo/client repo that point to users not being aware of it. Example: RedisTimeSeries/redistimeseries-py#86

We should address this by adding examples and further notes about duplicate policy.
You can check our docs about duplicate policy here: https://oss.redislabs.com/redistimeseries/configuration/#duplicate_policy.

Sample Readme of python client with an example of the expected outcome of this documentation/examples task:
https://github.com/RedisTimeSeries/redistimeseries-py#further-notes-on-back-filling-time-series
