TSViewDB

*~ Under development: API or features may change. ~*

<a name="Overview"></a> Overview

TSViewDB is a high-performance storage and graphing web service for time-series data from experiments with multiple iterations (time-series of time-series). It provides:

  • A RESTful API.
  • A pluggable storage backend:
    • Currently uses Apache Cassandra.
  • Interactive graphs.
  • Regression detection over non-cyclic data.
  • Easy horizontal infrastructure scaling.

[Figure: Typical deployment]

Time-series of Time-series

Time-series of time-series can be produced by periodically run experiments, each of which contains multiple iterations. TSViewDB stores the data points associated with these iterations and also automatically calculates various summary statistics over them (mean, min, max, 50th percentile, etc.).
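
As a rough illustration of what "summary statistics over iterations" means, the sketch below reduces one experiment run to a handful of aggregates. It is not TSViewDB's implementation: the function name `summarize`, the sample values, and the choice of the median as the 50th percentile are assumptions made for this example.

```python
import statistics


def summarize(points):
    """Reduce one run's iteration points to a few summary statistics.

    Hypothetical helper for illustration only; TSViewDB computes its own
    aggregates server-side and may define percentiles differently.
    """
    ordered = sorted(points)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        # The median is used here as the 50th percentile (an assumption).
        "p50": statistics.median(ordered),
    }


# One experiment run with five iterations (made-up values).
run = [12.0, 15.0, 11.0, 14.0, 13.0]
stats = summarize(run)
print(stats)
```

Each periodic run would yield one such row of aggregates, and the sequence of rows over time forms the outer time-series.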

[Figure: Time-series of time-series]

Interactive Graphs

TSViewDB provides a UI which quickly shows you what's in your data. You have ready access to interactive graphs for summary statistics (aggregates), for iteration points, and for histograms of iteration points. Common functionality is a mouseover, a click, or a click-and-drag away. Graphs are zoomable and auto-resizing.

[Figure: TSViewDB screenshot]

Regression Detection

TSViewDB can optionally determine if a regression has occurred in non-cyclic data for any of the read result data it returns. The regression function works over noisy, non-cyclic data and returns the precise regressing segments (which may also be graphed). This facilitates setting up daily regression alert emails and analysis systems.
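
The README does not describe the regression algorithm, so the following is only a stand-in to make the idea of "regressing segments" concrete. It compares the median of a window of points with the median of the preceding window and reports index ranges where the value rose by more than a fractional threshold. The function name, the window size, the threshold, and the assumption that higher values are worse are all hypothetical.

```python
import statistics


def find_regressions(values, window=3, threshold=0.2):
    """Return (start, end) index segments where a level shift is visible.

    Simple median-window comparison, NOT TSViewDB's algorithm. A split
    index i is flagged when the median of values[i:i+window] exceeds the
    median of values[i-window:i] by more than `threshold` (fractional).
    """
    flagged = []
    for i in range(window, len(values) - window + 1):
        before = statistics.median(values[i - window:i])
        after = statistics.median(values[i:i + window])
        if before > 0 and (after - before) / before > threshold:
            flagged.append(i)

    # Merge runs of consecutive flagged indices into (start, end) segments.
    segments = []
    for i in flagged:
        if segments and i == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], i)
        else:
            segments.append((i, i))
    return segments


# Noisy but flat series that steps up at index 5 (made-up values).
series = [10.0, 10.4, 9.8, 10.1, 10.2, 13.0, 13.4, 12.9, 13.1]
print(find_regressions(series))
```

Using medians rather than means keeps a single noisy point from triggering a report, which is one plausible way to cope with noisy, non-cyclic data.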

Easy Horizontal Infrastructure Scaling

TSViewDB's current Apache Cassandra backend can be scaled simply by adding nodes, and is known to handle significant write traffic. The TSViewDB server itself can be replicated behind a load balancer. It includes an in-process, cluster-aware cache server which allows latencies for read hits to be low (<100 microseconds for 3.2 GHz Xeon) and throughput high (~100,000 reads/sec for 12-core 3.2 GHz Xeon).

<a name="Features"></a> Features

  • RESTful JSON API
    • Sparse record input; table output
      • Input records may have arbitrary metrics
      • Reads return fully-populated tables
    • Automatic aggregate calculation
      • On writes, 13 aggregations are calculated if not supplied (min, mean, max, count, and various percentiles)
    • Optional, per-record name/value tagging (config tags)
    • X axis changeable to any arbitrary metric or numeric config tag
    • Multiple data sources can be specified in one request and merged into a single result table
  • Graph generation
    • PNG
    • Embeddable interactive HTML5
  • Regression detection functionality
    • Can be enabled on any range read
    • Allows daily summary alerts, backdated re-analysis
    • Optional regression segment highlighting on PNG graphs
    • Works with noisy non-cyclic data
  • UI for analysis
    • Rapid access to aggregations, points, and histogram of points.
  • Horizontal infrastructure scaling
    • Scalable Apache Cassandra backend (just add nodes)
    • In-server cache with cluster support
  • Low-latency
    • ~5% latency penalty over raw Cassandra reads (3.2GHz Xeon)
      • Reads: ~100ms/1000 rows
      • Cached: ~400us/1000 rows
    • ~30-40% latency penalty over raw Cassandra writes (2.4GHz i7)
      • Writes: <1ms (20 points)
  • High-throughput
    • ~100 reads/sec/Cassandra node
    • ~5,000 writes/sec/Cassandra node
    • Read cache hits: ~10,000 reads/sec/core
  • Easy deployment
    • Single statically-linked TSViewDB server + Cassandra server
    • In-server cache, no need for Varnish-type front-end or additional memcache-type deployment
      • Can serve stale data if background DB request misses deadline (for quick loading dashboards)
      • No thundering herd problem (guaranteed one regen across a cluster)
    • Cross-platform (Linux, OSX, Windows)
  • Disk space efficient
    • Uses application-specific lossy compression (controllable through API, including disabling)
  • Resource efficient retrieval
    • Metric indexing per row for efficiently retrieving only desired data
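
The "sparse record input; table output" item in the list above can be pictured with a small sketch. It is a conceptual model only: the function `to_table`, the metric names, and the use of `None` for missing cells are assumptions, and the README does not say how TSViewDB fills cells a record did not supply.

```python
def to_table(records):
    """Merge sparse records into column names plus fully populated rows.

    Hypothetical helper: every row gets one cell per metric seen in any
    record, with None standing in for metrics a record did not report.
    """
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    rows = [[record.get(name) for name in columns] for record in records]
    return columns, rows


# Three records, each reporting a different subset of metrics.
records = [
    {"latency": 12.5, "throughput": 900.0},
    {"latency": 11.9},
    {"throughput": 950.0, "errors": 2.0},
]
columns, rows = to_table(records)
print(columns)
for row in rows:
    print(row)
```

A rectangular table like this is what makes it possible to plot any metric against any other, as the "X axis changeable" item describes.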

What is it good for?

  • Data collection for continuously run experiments composed of repeated trials:
    • Continuous pre-production performance measurement.
  • One-off experiments composed of repeated trials.
  • Write-heavy workloads.

What is it NOT good for?

  • Production monitoring.
    • Flat time-series.
  • Storing or analyzing only recent data (round-robin type databases are better fits).
  • Data analysis which requires rolling up (roll-ups are not supported).

Installation

1. Install and Set Up Apache Cassandra

  • Install Apache Cassandra and follow its install instructions: http://cassandra.apache.org/download/
  • Modify the Cassandra installation config to allow ordered scan results (don't start the Cassandra server until you do this):
cd apache-cassandra-$VERSION/conf
sed -i.bak 's/^partitioner:.*/partitioner: org.apache.cassandra.dht.ByteOrderedPartitioner/' cassandra.yaml
  • Start the Cassandra server:
cd apache-cassandra-$VERSION
bin/cassandra -f
  • Run the setup script to create a keyspace (default is named "perf"):
cd apache-cassandra-$VERSION
bin/cassandra-cli -f init_perf_keyspace.script

2. Set Up TSViewDB

Either download an executable or build from source.

Download executable:

tar xzf tsviewdb-$VERSION.tar.gz
export GOPATH=$HOME/tsviewdb-$VERSION

Build From Source:

  mkdir $HOME/tsviewdb-$VERSION
  export GOPATH=$HOME/tsviewdb-$VERSION
  go get github.com/google/tsviewdb/server
  • Make resources:
$GOPATH/src/github.com/google/tsviewdb/MAKE_RESOURCES.sh
  • Start the server:
cd $HOME/tsviewdb-$VERSION
bin/server -logtostderr

Make sure to start the server first (see Installation).

1. Register a new data source: "testdir/testsubdir/testdata"

curl -X PUT 'localhost:8080/src/v1/testdir/testsubdir/testdata'

2. Upload some data to it.

curl -X POST 'localhost:8080/src/v1/testdir/testsubdir/testdata' \
 --data-binary '{
 "points":[{"name": "testMetric", "data": [1.8, 2.2, 0.7, 10.5, 3.4, 2.0, 2.1, 8.4, 5.8, 1.1]}
 ]}'
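
When the upload body is generated by a test harness rather than typed by hand, it may be easier to build it programmatically. The sketch below only constructs the JSON shape shown in step 2; the helper name `build_payload` is hypothetical, and no request is sent.

```python
import json


def build_payload(metrics):
    """Build a write body from a {metric name: iteration values} mapping.

    Mirrors the "points" structure used in the curl example above; any
    other fields the API may accept are not covered here.
    """
    points = [{"name": name, "data": list(values)}
              for name, values in metrics.items()]
    return json.dumps({"points": points})


body = build_payload({"testMetric": [1.8, 2.2, 0.7]})
print(body)
```

The resulting string can be passed to curl with `--data-binary` or posted with any HTTP client.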

3. Read aggregate data back.

curl --compressed 'localhost:8080/srcs/v1?src=testdir/testsubdir/testdata:testMetric.mean'

Result should be something like:

{
    "aggregates": [
        [
            1378703896816.0, 
            5.5
        ]
    ], 
    "aggregatesColumnNames": [
        "_Time", 
        "testMetric.mean"
    ]
}
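
The response pairs a list of rows (`aggregates`) with a parallel list of column names (`aggregatesColumnNames`). A minimal sketch for turning that into one dictionary per row is shown below, using the sample response above as input; the helper name `rows_as_dicts` is an assumption made for this example.

```python
import json


def rows_as_dicts(result):
    """Zip each aggregate row with the column names from the response."""
    names = result["aggregatesColumnNames"]
    return [dict(zip(names, row)) for row in result["aggregates"]]


# Sample response copied from the quick-start output.
response_text = """
{
    "aggregates": [[1378703896816.0, 5.5]],
    "aggregatesColumnNames": ["_Time", "testMetric.mean"]
}
"""
rows = rows_as_dicts(json.loads(response_text))
print(rows)
```

Keying cells by column name avoids relying on column order when several metrics or sources are requested together.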

<a name="Additional_Documentation"></a> Additional Documentation

<a name="License"></a> License

TSViewDB is licensed under the Apache License version 2.0. This is not an official Google product.

tsviewdb's People

Contributors

adilhn

tsviewdb's Issues

WebPageTest integration with tsviewdb.

I have integrated my WebPageTest private instance as per the documentation. Now when I look at the results here: http://ec2-52-23-200-102.compute-1.amazonaws.com:8080/v#src=testdir%2Ftestsubdir%2Famazon-repeat-view&last_pts=15&display=c6C8K7N9 there is no way to get the aggregate from more than two tests. The aggregates graph just plots all the points for three tests. I thought it was supposed to give the mean of all three results. The aggregates just do not work with WebPageTest results. It plots all three, and points show just one.

Is there a way in which I can compare the test metrics from different WebPageTest results to get mean, max, min, etc.? Also, what does the second slider in the aggregates do?

The aggregates work correctly with testdata example provided in the documentation, but when integrated with WebPageTest the results do not come out as expected.

install bombs at MAKE_RESOURCES.sh

I get this error when I try to MAKE_RESOURCES.

[root@tsviewdb var]# $GOPATH/src/github.com/google/tsviewdb/MAKE_RESOURCES.sh
cp: cannot stat `/root/tsviewdb-0.1.linux-amd64/src/code.google.com/p/plotinum/vg/fonts/*': No such file or directory

I've never used google go before, but it seems like this might be broken.

go get code.google.com/p/plotinum/

missing data in aggregate chart, with chart range "daysOfData=N"

navigate a particular datasource
if you select "range", which results in a default of "daysOfData=10" on the url params,
then the data displayed gets truncated to slightly less than 1 day.

the workaround is to manually adjust the date range slider right hand button to the left by 1 notch, which sets the range ending "today", and the url params to change to "startDate=20141007&endDate=20141015".
the correct data is displayed now.

root cause should be fixed though

UI shows no graphs.

Hi,

I have followed the installation instructions and installed TSViewDB successfully. After hitting localhost:8080 I get a textbox to search for a data source. But upon entering testdata I get an empty page with ERR_INVALID_RESPONSE. I see the data present in the DB.

What should I do to pull the graphs?

The only thing I missed during installation was this step.

Modify the Cassandra installation config to allow ordered scan results (don't start the Cassandra server until you do this): This wouldn't let me start cassandra.

Would this have created a problem? Kindly suggest what needs to be done here.
