chronixdb / chronix.server
The Chronix Server implementation that is based on Apache Solr.
License: Apache License 2.0
Chronix currently has aggregations and high-level analyses but no transformations like filter or window / sliding window.
We should provide a simple ingestion interface for time series data, e.g. pairs of timestamp, value. We should adapt the protocols of InfluxDB, Graphite, ...
If the join key uses a field that is not part of the requested fields, then the join fails.
join_key: "null-testmetric"
The documentation says:
/**
* Calculates the moving average of the time series using the following algorithm:
* <p>
* 1) Get all points within the defined time window
* -> Calculate the time and value average sum(values)/#points
* <p>
* 2) If the distance of two timestamps (i, j) is larger than the time window
* -> Ignore the empty window
* -> Use j as the start of the window and continue with step 1)
*
* @param timeSeries the time series that is transformed
* @return the transformed time series
*/
That is not a moving average (and it's jumpy).
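For contrast, a genuine sliding-window moving average keeps every point inside the trailing time window, not per-gap block averages. The following is a minimal standalone sketch; class and method names are illustrative, not the Chronix API:

```java
// Sketch of a true trailing-window moving average: for each point i,
// average all values whose timestamps lie within the last `windowMillis`.
// Assumes timestamps are sorted ascending. Illustrative names only.
public class SlidingAverage {

    public static double[] movingAverage(long[] timestamps, double[] values, long windowMillis) {
        double[] result = new double[values.length];
        int start = 0;      // first index still inside the window
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            // shrink the window from the left until it spans at most windowMillis
            while (timestamps[i] - timestamps[start] > windowMillis) {
                sum -= values[start];
                start++;
            }
            result[i] = sum / (i - start + 1);
        }
        return result;
    }
}
```

Unlike the documented algorithm, this produces one smoothed value per input point, so the output is not "jumpy" at window boundaries.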
Implement a transformation that does a time series forecast.
Implement signed difference aggregation.
For negative values
first = -1
last = -10
=> diff = -9
For positive values
first = 1
last = 10
=> diff = 9
Positive first, negative last
first = 1
last = -10
=> diff = -11
Negative first, positive last
first = -1
last = 10
=> diff = 11
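All four cases above reduce to last - first. A minimal sketch of the aggregation core (class and method names are illustrative, not part of Chronix):

```java
// Signed difference aggregation: last value minus first value,
// preserving the sign. Illustrative stand-in, not the Chronix API.
public class SignedDifference {

    public static double apply(double first, double last) {
        return last - first;
    }
}
```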
We should provide a document describing Chronix' functions.
Absolute difference between the minimum and the maximum:
min = -100
max = 200
range => 300
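The range aggregation above is simply the absolute difference of the extremes; a sketch with illustrative names:

```java
// Range aggregation: absolute difference between minimum and maximum.
// Illustrative stand-in, not the Chronix API.
public class RangeAggregation {

    public static double apply(double min, double max) {
        return Math.abs(max - min);
    }
}
```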
Should we provide a simple chronix client without SolrJ dependency?
Build an integration between Prometheus and Chronix to read data out of Prometheus into Chronix, in order to use Chronix as long-term storage.
Implement difference aggregation:
abs(first - last)
The subquery currently does not parse the start or end dates.
q=host:xyz&fq=aggregation=min,max, ...
The results of the aggregations and analyses are added to the resulting document:
start: A
end: B
data:[...]
min:X
max:Y
...
Hence one can ask for several values at once.
A request with fl=dataAsJson returns unsorted points.
Solution: Insert a ts.sort() call before JSON serialization.
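The fix could look like the following standalone sketch; the Point record and method names are stand-ins, not the actual Chronix types:

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the proposed fix: sort the points by timestamp before
// handing them to the JSON serializer. Illustrative types only.
public class SortBeforeSerialize {

    record Point(long timestamp, double value) {}

    // Returns the points ordered by ascending timestamp.
    public static List<Point> sorted(List<Point> points) {
        return points.stream()
                .sorted(Comparator.comparingLong(Point::timestamp))
                .toList();
    }
}
```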
We currently only provide a moving average with a time window. Hence the number of points within a window varies. Therefore we should also provide a moving average transformation with a fixed number of points.
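A count-based variant could look like this sketch (names are illustrative, not the Chronix API):

```java
// Sketch of a count-based moving average: each output point averages
// the previous `n` samples, independent of their time distance.
public class CountMovingAverage {

    public static double[] apply(double[] values, int n) {
        double[] result = new double[values.length];
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            if (i >= n) {
                sum -= values[i - n];   // drop the sample that left the window
            }
            // until n samples are seen, average over what is available
            result[i] = sum / Math.min(i + 1, n);
        }
        return result;
    }
}
```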
Hi, I've started a time series comparison table and wondered if you'd be able to help fill it in?
I've only just discovered Chronix and it looks pretty cool. Would be good to have it listed.
https://docs.google.com/spreadsheets/d/1sMQe9oOKhMhIVw9WmuCEWdPtAoccJ4a-IuZv4fXDHxM/edit#gid=0
Happy to expand the table comparison criteria too if you think there's anything else that should be noted in particular about Chronix that would help people choose.
Thanks!
Upgrade the codebase to Solr 6.0
Hi,
I just learned about Chronix, so bear with me if I have overlooked this, but is it possible to add key-value metadata to the measurements, like host:myhost, application:myapp? Like the InfluxDB format or the format described here: https://www.elastic.co/blog/elasticsearch-as-a-time-series-data-store.
Also it would be nice to have documentation about the HTTP ingestion protocol and format, if available, as well as the query API and aggregation functions.
The Chronix client asks for the number of time series in a first call. If the subsequent result (e.g. of an analysis) reduces the number of time series, then the result is never returned.
Currently aggregations do not return the data field but high-level analyses do.
We should provide an option so that the user can decide whether he needs the data or not.
The default is that no data is returned for any type of analysis (aggregations / high-level analyses). With the option fl=data enabled, Chronix returns the raw data.
Currently Chronix expects pairs of timestamp and value. With a wall clock we could accept values without a timestamp, as Chronix would add the current server time to the values. See how InfluxDB does this.
If a field used for joining records is not defined in the requested fields, the join key contains "null" values leading to wrong joins.
Add this to the jetty-gzip.xml in chronix-X.X/chronix-solr-X.X.X/server/etc
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">
<!-- =============================================================== -->
<!-- Mixin the GZIP Handler -->
<!-- This applies the GZIP Handler to the entire server -->
<!-- If a GZIP handler is required for an individual context, then -->
<!-- use a context XML (see test.xml example in distribution) -->
<!-- =============================================================== -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Call name="insertHandler">
    <Arg>
      <New id="GzipHandler" class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
        <Set name="minGzipSize"><Property name="jetty.gzip.minGzipSize" deprecated="gzip.minGzipSize" default="0"/></Set>
        <Set name="checkGzExists"><Property name="jetty.gzip.checkGzExists" deprecated="gzip.checkGzExists" default="false"/></Set>
        <Set name="compressionLevel"><Property name="jetty.gzip.compressionLevel" deprecated="gzip.compressionLevel" default="1"/></Set>
        <Set name="excludedAgentPatterns">
          <Array type="String">
            <Item><Property name="jetty.gzip.excludedUserAgent" deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*"/></Item>
          </Array>
        </Set>
        <Set name="includedMethods">
          <Array type="String">
            <Item>GET</Item>
          </Array>
        </Set>
      </New>
    </Arg>
  </Call>
</Configure>
And the following snippet to chronix-X.X/chronix-solr-X.X.X/server/contexts/solr-jetty-context.xml
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_0.dtd">
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  <Set name="contextPath"><Property name="hostContext" default="/solr"/></Set>
  <Set name="war"><Property name="jetty.base"/>/solr-webapp/webapp</Set>
  <Set name="defaultsDescriptor"><Property name="jetty.base"/>/etc/webdefault.xml</Set>
  <Set name="extractWAR">false</Set>
  <!-- Enable gzip compression -->
  <Set name="gzipHandler">
    <New class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
      <Set name="minGzipSize">2048</Set>
    </New>
  </Set>
</Configure>
Add the gzip.mod to chronix-X.X/chronix-solr-X.X.X/server/modules
The JSON serialization is now part of the query handler.
-> The response writer does not work with transformed results.
I just had a look at the frequency detection code to find out what its purpose is.
Reading the documentation is not very enlightening: "Detects if a point occurs multiple times within a defined time range".
Reading the code doesn't really help either: it uses List<Long> currentWindow as a counter (only its size matters; the contents are irrelevant). Each window is windowSize minutes in duration, and the analysis returns true if a chunk has at least windowThreshold more observations than its predecessor.
Without CORS enabled, the Grafana plugin won't work.
Solution: Enable it per default.
Add this to the web.xml
<!-- Activates CORS for queries, e.g. from Grafana -->
<filter>
  <filter-name>cross-origin</filter-name>
  <filter-class>org.eclipse.jetty.servlets.CrossOriginFilter</filter-class>
  <init-param>
    <param-name>allowedOrigins</param-name>
    <param-value>http://localhost*</param-value>
  </init-param>
  <init-param>
    <param-name>allowedMethods</param-name>
    <param-value>GET,POST,DELETE,PUT,HEAD,OPTIONS</param-value>
  </init-param>
  <init-param>
    <param-name>allowedHeaders</param-name>
    <param-value>origin, content-type, cache-control, accept, options, authorization, x-requested-with</param-value>
  </init-param>
  <init-param>
    <param-name>supportsCredentials</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>chainPreflight</param-name>
    <param-value>false</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>cross-origin</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
Did not find an email address to write to, hence writing it here.
Does Chronix provide a Node.js/JavaScript client driver?
In some cases a time series has two or more points with exactly the same timestamp. FastDTW cannot deal with that. Hence we have to filter / aggregate the points with the same timestamp.
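One possible pre-processing step, sketched under the assumption that duplicates should be averaged; the class and method names are illustrative, not the Chronix API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a FastDTW pre-processing step: points sharing the same
// timestamp are collapsed into one point holding their average value.
// Insertion order is preserved, so sorted input stays sorted.
public class DeduplicateTimestamps {

    public static Map<Long, Double> averageDuplicates(long[] timestamps, double[] values) {
        Map<Long, double[]> acc = new LinkedHashMap<>();   // timestamp -> {sum, count}
        for (int i = 0; i < timestamps.length; i++) {
            double[] sc = acc.computeIfAbsent(timestamps[i], t -> new double[2]);
            sc[0] += values[i];
            sc[1]++;
        }
        Map<Long, Double> result = new LinkedHashMap<>();
        acc.forEach((t, sc) -> result.put(t, sc[0] / sc[1]));
        return result;
    }
}
```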
Currently the dataAsJson functionality only works for range queries without any functions (aggregations, transformations, analyses). We should provide this feature also for queries that include functions.
We should provide a prometheus servlet for monitoring purposes.
Asked in Gitter...
@FlorianLautenschlager I have a question about Chronix. Maybe about chronix-storage in particular...
It seems like Chronix is designed more for data-mining than real-time use, is that correct?
I ask, because it seems that a time series is only (or should only be) added when a sufficient number of data points have been collected.
For example, in order to benefit from the compression it seems that "chunks" of data points need to be accumulated before adding the total series to Solr. If this is true, the "recent" values would not be available for query. Correct?
Or can I collect a set of metrics every 5 seconds, and add them through the storage service, whereby they can be queried? Does something underlying in Chronix "merge" them in some way into a document of "significant size" over time to achieve better compression and query performance?
My concern is that we are building a monitoring system with thousands (or tens of thousands) of disparate metrics collected every 5 seconds, but for any given host/metric pair there would only be 12 per minute -- but they need to be available "immediately" for query to display on real-time dashboards.
The attributes of the time series included in an analysis or aggregation are currently not merged; only the attributes of the first time series are set in the result. We should merge the attributes, using a set as the value that holds all attributes of the same key.
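The proposed merge could be sketched as follows (a standalone illustration, not the actual Chronix types):

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the proposed attribute merge: instead of keeping only the
// first time series' attributes, collect all values per key into a set.
public class MergeAttributes {

    public static Map<String, Set<Object>> merge(List<? extends Map<String, ?>> attributeMaps) {
        Map<String, Set<Object>> merged = new LinkedHashMap<>();
        for (Map<String, ?> attributes : attributeMaps) {
            attributes.forEach((key, value) ->
                    merged.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value));
        }
        return merged;
    }
}
```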
We should describe a way to add time series to Chronix using the HTTP Interface.
Discuss if we should provide a way that a user can send a groovy script to Chronix that is evaluated on the side of the server. This could be a way to easily extend Chronix with missing analyses.
Returns the first value of the time series.
Returns the last value of the time series.
Steps to reproduce:
Install chronix-solr-6.0.1
solr start
go to http://localhost:8983/solr/#/chronix/query
q = metric:Load
qf = anything you want
solr ui shows:
{
"responseHeader":{
"status":400,
"QTime":4},
"error":{
"metadata":[
"error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.common.SolrException"],
"msg":"no field name specified in query and no default specified via 'df' param",
"code":400}}
solr.log shows:
2016-06-07 21:04:59.630 ERROR (qtp110456297-17) [ x:chronix] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name specified in query and no default specified via 'df' param
at org.apache.solr.parser.SolrQueryParserBase.checkNullField(SolrQueryParserBase.java:700)
Implement a transformation that does a server-side time series vectorization.
This is useful in many cases, e.g. data reduction on the client side.
Could be something like that:
transform=vector:points,threshold
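One naive way such a vectorization could work is a value-threshold filter that keeps only points deviating noticeably from the last kept point. This is purely an assumed illustration of the idea, not necessarily the intended algorithm:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical vectorization sketch: a point is dropped when its value
// deviates from the last kept value by less than `threshold`. The first
// and last points are always kept so the series' extent is preserved.
public class Vectorize {

    public static List<Integer> keptIndices(double[] values, double threshold) {
        List<Integer> kept = new ArrayList<>();
        if (values.length == 0) return kept;
        kept.add(0);                                   // always keep the first point
        double lastKept = values[0];
        for (int i = 1; i < values.length - 1; i++) {
            if (Math.abs(values[i] - lastKept) >= threshold) {
                kept.add(i);
                lastKept = values[i];
            }
        }
        if (values.length > 1) kept.add(values.length - 1);  // always keep the last point
        return kept;
    }
}
```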
If Chronix is asked for multiple percentile aggregations, only one is answered.
Currently the libraries of Chronix are released on bintray.
We should also release them on maven central.
Chronix' performance is best when the chunk size is ideal (1024 kbyte, uncompressed). But in live monitoring we need small chunks (short time range) << 1024 kbyte. Hence the query and storage performance drops.