chronixdb / chronix.server
The Chronix Server implementation that is based on Apache Solr.
License: Apache License 2.0
Chronix currently has aggregations and high-level analyses but no transformations like filter or window / sliding window.
We should provide a simple ingestion interface for time series data, e.g. pairs of timestamp, value. We should adapt the protocols of InfluxDB, Graphite, ...
If the join key uses a field that is not part of the requested fields, then the join fails.
join_key: "null-testmetric"
The documentation says:
/**
* Calculates the moving average of the time series using the following algorithm:
* <p>
* 1) Get all points within the defined time window
* -> Calculate the time and value average sum(values)/#points
* <p>
* 2) If the distance of two timestamps (i, j) is larger than the time window
* -> Ignore the empty window
* -> Use j as the start of the window and continue with step 1)
*
* @param timeSeries the time series that is transformed
* @return the transformed time series
*/
That is not a moving average (and it's jumpy).
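For contrast, a genuine sliding-window moving average keeps every point inside the trailing time window, not per-gap block averages. The following is a minimal standalone sketch; class and method names are illustrative, not the Chronix API:

```java
// Sketch of a true trailing-window moving average: for each point i,
// average all values whose timestamps lie within the last `windowMillis`.
// Assumes timestamps are sorted ascending. Illustrative names only.
public class SlidingAverage {

    public static double[] movingAverage(long[] timestamps, double[] values, long windowMillis) {
        double[] result = new double[values.length];
        int start = 0;      // first index still inside the window
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            // shrink the window from the left until it spans at most windowMillis
            while (timestamps[i] - timestamps[start] > windowMillis) {
                sum -= values[start];
                start++;
            }
            result[i] = sum / (i - start + 1);
        }
        return result;
    }
}
```

Unlike the documented algorithm, this produces one smoothed value per input point, so the output is not "jumpy" at window boundaries.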
Implement a transformation that does a time series forecast.
Implement signed difference aggregation.
For negative values
first = -1
last = -10
=> diff = -9
For positive values
first = 1
last = 10
=> diff = 9
Positive first, negative last
first = 1
last = -10
=> diff = -11
Negative first, positive last
first = -1
last = 10
=> diff = 11
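All four cases above reduce to last - first. A minimal sketch of the aggregation core (class and method names are illustrative, not part of Chronix):

```java
// Signed difference aggregation: last value minus first value,
// preserving the sign. Illustrative stand-in, not the Chronix API.
public class SignedDifference {

    public static double apply(double first, double last) {
        return last - first;
    }
}
```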
We should provide a document describing Chronix' functions.
Absolute difference between the minimum and the maximum:
min = -100
max = 200
range => 300
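The range aggregation above is simply the absolute difference of the extremes; a sketch with illustrative names:

```java
// Range aggregation: absolute difference between minimum and maximum.
// Illustrative stand-in, not the Chronix API.
public class RangeAggregation {

    public static double apply(double min, double max) {
        return Math.abs(max - min);
    }
}
```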
Should we provide a simple chronix client without SolrJ dependency?
Build an integration between Prometheus and Chronix to read data out of Prometheus into Chronix, in order to use Chronix as long-term storage.
Implement difference aggregation:
abs(first - last)
The subquery currently does not parse the start or end dates.
q=host:xyz&fq=aggregation=min,max, ...
The results of the aggregations and analyses are added to the resulting document:
start: A
end: B
data:[...]
min:X
max:Y
...
Hence one can ask for several values at once.
A request with fl=dataAsJson returns unsorted points.
Solution: Insert a ts.sort() call before JSON serialization.
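The fix could look like the following standalone sketch; the Point record and method names are stand-ins, not the actual Chronix types:

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the proposed fix: sort the points by timestamp before
// handing them to the JSON serializer. Illustrative types only.
public class SortBeforeSerialize {

    record Point(long timestamp, double value) {}

    // Returns the points ordered by ascending timestamp.
    public static List<Point> sorted(List<Point> points) {
        return points.stream()
                .sorted(Comparator.comparingLong(Point::timestamp))
                .toList();
    }
}
```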
We currently only provide a moving average with a time window. Hence the number of points within a window varies. Therefore we should also provide a moving average transformation with a fixed number of points.
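A count-based variant could look like this sketch (names are illustrative, not the Chronix API):

```java
// Sketch of a count-based moving average: each output point averages
// the previous `n` samples, independent of their time distance.
public class CountMovingAverage {

    public static double[] apply(double[] values, int n) {
        double[] result = new double[values.length];
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            if (i >= n) {
                sum -= values[i - n];   // drop the sample that left the window
            }
            // until n samples are seen, average over what is available
            result[i] = sum / Math.min(i + 1, n);
        }
        return result;
    }
}
```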
Hi, I've started a time series comparison table and wondered if you'd be able to help fill it in?
I've only just discovered Chronix and it looks pretty cool. Would be good to have it listed.
https://docs.google.com/spreadsheets/d/1sMQe9oOKhMhIVw9WmuCEWdPtAoccJ4a-IuZv4fXDHxM/edit#gid=0
Happy to expand the table comparison criteria too if you think there's anything else that should be noted in particular about Chronix that would help people choose.
Thanks!
Upgrade the codebase to Solr 6.0
Hi,
I just learned about Chronix, so bear with me if I have overlooked this, but is it possible to add key-value metadata to the measurements, like host:myhost, application:myapp? Like the InfluxDB format or the format described here: https://www.elastic.co/blog/elasticsearch-as-a-time-series-data-store.
Also it would be nice to have documentation about the HTTP ingestion protocol and format, if available, as well as the query API and aggregation functions.
The Chronix client asks for the number of time series in a first call. If the subsequent result (e.g. of an analysis) reduces the number of time series, then the result is never returned.
Currently aggregations do not return the data field but high-level analyses do.
We should provide an option so that the user can decide whether he needs the data or not.
The default is that no data is returned for any type of analysis (aggregations / high-level analyses). With the option fl=data enabled, Chronix returns the raw data.
Currently Chronix expects pairs of timestamp and value. With a wall clock we could accept values without a timestamp, as Chronix would add the current server time to the values. See how InfluxDB does this.
If a field used for joining records is not defined in the requested fields, the join key contains "null" values leading to wrong joins.
Add this to the jetty-gzip.xml in chronix-X.X/chronix-solr-X.X.X/server/etc
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">
<!-- =============================================================== -->
<!-- Mixin the GZIP Handler -->
<!-- This applies the GZIP Handler to the entire server -->
<!-- If a GZIP handler is required for an individual context, then -->
<!-- use a context XML (see test.xml example in distribution) -->
<!-- =============================================================== -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Call name="insertHandler">
    <Arg>
      <New id="GzipHandler" class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
        <Set name="minGzipSize"><Property name="jetty.gzip.minGzipSize" deprecated="gzip.minGzipSize" default="0"/></Set>
        <Set name="checkGzExists"><Property name="jetty.gzip.checkGzExists" deprecated="gzip.checkGzExists" default="false"/></Set>
        <Set name="compressionLevel"><Property name="jetty.gzip.compressionLevel" deprecated="gzip.compressionLevel" default="1"/></Set>
        <Set name="excludedAgentPatterns">
          <Array type="String">
            <Item><Property name="jetty.gzip.excludedUserAgent" deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*"/></Item>
          </Array>
        </Set>
        <Set name="includedMethods">
          <Array type="String">
            <Item>GET</Item>
          </Array>
        </Set>
      </New>
    </Arg>
  </Call>
</Configure>
And the following snippet to chronix-X.X/chronix-solr-X.X.X/server/contexts/solr-jetty-context.xml
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_0.dtd">
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  <Set name="contextPath"><Property name="hostContext" default="/solr"/></Set>
  <Set name="war"><Property name="jetty.base"/>/solr-webapp/webapp</Set>
  <Set name="defaultsDescriptor"><Property name="jetty.base"/>/etc/webdefault.xml</Set>
  <Set name="extractWAR">false</Set>
  <!-- Enable gzip compression -->
  <Set name="gzipHandler">
    <New class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
      <Set name="minGzipSize">2048</Set>
    </New>
  </Set>
</Configure>
Add the gzip.mod to chronix-X.X/chronix-solr-X.X.X/server/modules
The JSON serialization is now part of the query handler.
-> The response writer does not work with transformed results.
I just had a look at the frequency detection code to find out what its purpose is.
Reading the documentation is not very enlightening: "Detects if a point occurs multiple times within a defined time range".
Reading the code doesn't really help either: it uses List<Long> currentWindow as a counter (only its size matters; the contents are irrelevant). Each window is windowSize minutes in duration, and the analysis returns true if a chunk has at least windowThreshold more observations than its predecessor.
Without CORS enabled, the Grafana plugin won't work.
Solution: Enable it per default.
Add this to the web.xml
<!-- Activates CORS for queries, e.g. from Grafana -->
<filter>
  <filter-name>cross-origin</filter-name>
  <filter-class>org.eclipse.jetty.servlets.CrossOriginFilter</filter-class>
  <init-param>
    <param-name>allowedOrigins</param-name>
    <param-value>http://localhost*</param-value>
  </init-param>
  <init-param>
    <param-name>allowedMethods</param-name>
    <param-value>GET,POST,DELETE,PUT,HEAD,OPTIONS</param-value>
  </init-param>
  <init-param>
    <param-name>allowedHeaders</param-name>
    <param-value>origin, content-type, cache-control, accept, options, authorization, x-requested-with</param-value>
  </init-param>
  <init-param>
    <param-name>supportsCredentials</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>chainPreflight</param-name>
    <param-value>false</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>cross-origin</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
Did not find an email address to write to, hence writing it here.
Does Chronix provide a Node.js/JavaScript client driver?
In some cases a time series has two or more points with exactly the same timestamp. FastDTW cannot deal with that. Hence we have to filter / aggregate the points with the same timestamp.
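One possible pre-processing step, sketched under the assumption that duplicates should be averaged; the class and method names are illustrative, not the Chronix API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a FastDTW pre-processing step: points sharing the same
// timestamp are collapsed into one point holding their average value.
// Insertion order is preserved, so sorted input stays sorted.
public class DeduplicateTimestamps {

    public static Map<Long, Double> averageDuplicates(long[] timestamps, double[] values) {
        Map<Long, double[]> acc = new LinkedHashMap<>();   // timestamp -> {sum, count}
        for (int i = 0; i < timestamps.length; i++) {
            double[] sc = acc.computeIfAbsent(timestamps[i], t -> new double[2]);
            sc[0] += values[i];
            sc[1]++;
        }
        Map<Long, Double> result = new LinkedHashMap<>();
        acc.forEach((t, sc) -> result.put(t, sc[0] / sc[1]));
        return result;
    }
}
```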
Currently the dataAsJson functionality only works for range queries without any functions (aggregations, transformations, analyses). We should provide this feature also for queries that include functions.
We should provide a prometheus servlet for monitoring purposes.
Asked in Gitter...
@FlorianLautenschlager I have a question about Chronix. Maybe about chronix-storage in particular...
It seems like Chronix is designed more for data-mining than real-time use, is that correct?
I ask, because it seems that a time series is only (or should only be) added when a sufficient number of data points have been collected.
For example, in order to benefit from the compression it seems that "chunks" of data points need to be accumulated before adding the total series to Solr. If this is true, the "recent" values would not be available for query. Correct?
Or can I collect a set of metrics every 5 seconds, and add them through the storage service, whereby they can be queried? Does something underlying in Chronix "merge" them in some way into a document of "significant size" over time to achieve better compression and query performance?
My concern is that we are building a monitoring system with thousands (or tens of thousands) of disparate metrics collected every 5 seconds, but for any given host/metric pair there would only be 12 per minute -- but they need to be available "immediately" for query to display on real-time dashboards.
The attributes of the time series included in an analysis or aggregation are currently not merged; only the attributes of the first time series are set in the result. We should merge the attributes, using a set as the value that holds all attributes of the same key.
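The proposed merge could be sketched as follows (a standalone illustration, not the actual Chronix types):

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the proposed attribute merge: instead of keeping only the
// first time series' attributes, collect all values per key into a set.
public class MergeAttributes {

    public static Map<String, Set<Object>> merge(List<? extends Map<String, ?>> attributeMaps) {
        Map<String, Set<Object>> merged = new LinkedHashMap<>();
        for (Map<String, ?> attributes : attributeMaps) {
            attributes.forEach((key, value) ->
                    merged.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(value));
        }
        return merged;
    }
}
```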
We should describe a way to add time series to Chronix using the HTTP Interface.
Discuss if we should provide a way that a user can send a groovy script to Chronix that is evaluated on the side of the server. This could be a way to easily extend Chronix with missing analyses.
Returns the first value of the time series.
Returns the last value of the time series.
Steps to reproduce:
Install chronix-solr-6.0.1
solr start
go to http://localhost:8983/solr/#/chronix/query
q = metric:Load
qf = anything you want
solr ui shows:
{
"responseHeader":{
"status":400,
"QTime":4},
"error":{
"metadata":[
"error-class","org.apache.solr.common.SolrException",
"root-error-class","org.apache.solr.common.SolrException"],
"msg":"no field name specified in query and no default specified via 'df' param",
"code":400}}
solr.log shows:
2016-06-07 21:04:59.630 ERROR (qtp110456297-17) [ x:chronix] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name specified in query and no default specified via 'df' param
at org.apache.solr.parser.SolrQueryParserBase.checkNullField(SolrQueryParserBase.java:700)
Implement a transformation that does a server-side time series vectorization.
This is useful in many cases, e.g. data reduction on the client side.
Could be something like that:
transform=vector:points,threshold
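One naive way such a vectorization could work is a value-threshold filter that keeps only points deviating noticeably from the last kept point. This is purely an assumed illustration of the idea, not necessarily the intended algorithm:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical vectorization sketch: a point is dropped when its value
// deviates from the last kept value by less than `threshold`. The first
// and last points are always kept so the series' extent is preserved.
public class Vectorize {

    public static List<Integer> keptIndices(double[] values, double threshold) {
        List<Integer> kept = new ArrayList<>();
        if (values.length == 0) return kept;
        kept.add(0);                                   // always keep the first point
        double lastKept = values[0];
        for (int i = 1; i < values.length - 1; i++) {
            if (Math.abs(values[i] - lastKept) >= threshold) {
                kept.add(i);
                lastKept = values[i];
            }
        }
        if (values.length > 1) kept.add(values.length - 1);  // always keep the last point
        return kept;
    }
}
```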
If Chronix is asked for multiple percentile aggregations, only one is answered.
Currently the libraries of Chronix are released on bintray.
We should also release them on maven central.
Chronix' performance is best when the chunk size is ideal (1024 kbyte, uncompressed). But in live monitoring we need small chunks (short time range) << 1024 kbyte. Hence the query and storage performance drops.