xephon-b's Issues

[cmd] Setting log level and log src is not working

xb

import (
	"fmt"
	"os"
	"runtime"

	icli "github.com/at15/go.ice/ice/cli"
	goicelog "github.com/at15/go.ice/ice/util/logutil"

	"github.com/xephonhq/xephon-b/pkg/config"
	"github.com/xephonhq/xephon-b/pkg/util/logutil"
)

const (
	myname = "xb"
)

// FIXME: debug logging is not working ....
var log = logutil.Registry

var (
	version   string
	commit    string
	buildTime string
	buildUser string
	goVersion = runtime.Version()
)

var buildInfo = icli.BuildInfo{Version: version, Commit: commit, BuildTime: buildTime, BuildUser: buildUser, GoVersion: goVersion}

var cli *icli.Root
var cfg config.XephonBConfig

func main() {
	cli = icli.New(
		icli.Name(myname),
		icli.Description("Xephon-B Time Series Benchmark cli"),
		icli.Version(buildInfo),
		icli.LogRegistry(log),
	)
	root := cli.Command()
	root.AddCommand(runCmd)
	if err := root.Execute(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

func mustLoadConfig() {
	if err := cli.LoadConfigTo(&cfg); err != nil {
		log.Fatal(err)
	}
}

func init() {
	log.AddChild(goicelog.Registry)
}
 xb run                    
info 2018-03-05T00:39:23-08:00 target database is influxdb_0 type influxdb
info 2018-03-05T00:39:23-08:00 workload is workload_0 series 1 value generator is constant
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something
info 2018-03-05T00:39:23-08:00 TODO: worker should do something

[runner][worker] Limit total number of points

To test disk space usage, we need to limit the total number of points.

  • each worker now reports the number of points in each request
  • the result channel needs to fan out, which is also needed to support multiple reporters
  • a special reporter counts the total number of requests and cancels the context
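
A minimal sketch of the cancelling reporter, assuming workers send one result per request on a shared channel. The `result` type, `limitPoints`, and `runWorkers` are hypothetical names for illustration, not the runner's actual API. The sketch sums the points reported per request and skips the fan-out step.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// result is a hypothetical per-request report sent by a worker.
type result struct {
	points int
}

// limitPoints consumes worker results, sums the reported points, and calls
// cancel once the total reaches max. It returns the total it counted.
func limitPoints(cancel context.CancelFunc, results <-chan result, max int) int {
	total := 0
	for r := range results {
		total += r.points
		if total >= max {
			cancel() // calling cancel more than once is safe
		}
	}
	return total
}

// runWorkers starts n workers that each report batch points per request until
// the context is cancelled, and returns the total seen by the limiter.
func runWorkers(n, batch, max int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	results := make(chan result)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ctx.Err() == nil {
				select {
				case <-ctx.Done():
					return
				case results <- result{points: batch}:
				}
			}
		}()
	}
	// Close the channel once every worker has stopped so the limiter returns.
	go func() {
		wg.Wait()
		close(results)
	}()
	return limitPoints(cancel, results, max)
}

func main() {
	fmt.Println(runWorkers(4, 100, 1000) >= 1000)
}
```

The total can overshoot the limit by a few in-flight requests, because workers only notice the cancellation between requests.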

Provision scripts for develop environment

Related #7

For people who have to use Windows, it's hard to set up the development environment. (Win10 has bash, but it's too young and causes more problems than it solves.) Also, installing databases with a package manager can sometimes mess up your OS.

The vagrant box should include

  • an up to date go environment with glide and gopath set (may need to change vagrant mount folder)
  • JDK8, maven, gradle (if you want to try some jvm based system)
  • nvm + latest nodejs (we may need to do some front end stuff)
  • essential build tools (some database have native binding)
  • vim, git, curl (which does not ship with ubuntu)
  • docker (Windows has native docker support now, but it requires Win 10 Pro)
  • docker compose
  • publish to vagrant cloud

A former box I made for php development can be found here

Ref

Migrate Xephon-B back from Xephon-K

Related to xephonhq/xephon-k#60 Xephon-K clean up

Major issues

  • requires libtsdb-go to have protocol, client & server implementation in HTTP(s) and gRPC
  • support tracing, since the Xephon-K server will do it as well; it has a penalty, but it will give more detailed insight
  • more workloads, we were only using the extreme workload in CMPS 278 and CMPS 229

TODO

  • wait for libtsdb-go ...
  • switch to dep from glide
  • make extreme workload start running
  • plugin in BenchBoard
  • add API to control workload generation, so it can be used for distributed benchmark
  • plugin in BenchHub

Data loading

Related #1 #9

Generated data is stored on disk and needs to be inserted into the TSDB.

NOTE: the generated data is independent of any specific TSDB and is serialized using protobuf, so it needs to be transformed before being inserted into the TSDB.

On the fly transform

  • read and deserialize the generated file, post into the TSDB
  • Pro
    • takes less disk space
    • less disk IO
  • Con
    • deserialization overhead
    • cannot be reused

Pre transform

  • read and deserialize the generated file, save it to a file in the exact post format
  • read the saved file and post the exact bytes to the TSDB
  • Pro
    • no deserialization overhead
    • can be reused
  • Con
    • may take a lot of disk space, because tags and series names are duplicated
    • more disk IO

However, when it comes to implementation, these two differ very little:

data -> deserialization -> pack into a certain format (e.g. JSON) -> client lib -> TSDB
data -> deserialization -> pack into a certain format (e.g. JSON) -> file -> client lib -> TSDB
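
To illustrate why the two approaches share most of their code, here is a rough sketch where the packing step writes to an `io.Writer`. The `point` type, `pack`, and `replay` are hypothetical names, and in-memory buffers stand in for both the file and the client connection.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// point is a hypothetical, already deserialized data point.
type point struct {
	Name  string  `json:"name"`
	Time  int64   `json:"time"`
	Value float64 `json:"value"`
}

// pack encodes points as one JSON document per line and writes them to w.
// w can be the client's request body (on the fly) or a file (pre transform).
func pack(w io.Writer, pts []point) error {
	enc := json.NewEncoder(w)
	for _, p := range pts {
		if err := enc.Encode(p); err != nil {
			return err
		}
	}
	return nil
}

// replay copies already packed bytes to the client without decoding them again.
func replay(client io.Writer, packed io.Reader) (int64, error) {
	return io.Copy(client, packed)
}

// sameBytes reports whether both paths deliver identical bytes to the client.
func sameBytes() bool {
	pts := []point{
		{Name: "cpu.idle", Time: 1, Value: 0.5},
		{Name: "cpu.idle", Time: 2, Value: 0.7},
	}

	// On the fly: pack straight into the stand-in client.
	var direct bytes.Buffer
	if err := pack(&direct, pts); err != nil {
		return false
	}

	// Pre transform: pack into a stand-in file first, then replay the bytes.
	var file, viaFile bytes.Buffer
	if err := pack(&file, pts); err != nil {
		return false
	}
	if _, err := replay(&viaFile, &file); err != nil {
		return false
	}
	return bytes.Equal(direct.Bytes(), viaFile.Bytes())
}

func main() {
	fmt.Println(sameBytes())
}
```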

filter log by package like in java's logback

Java is verbose, but with libraries like logback you can configure which packages log on the fly, which is quite useful. Turning on verbose logging makes every package print logs, while you may only want one package to print debug information. Currently we are using logrus, and it seems possible to filter logs by adding a pkg field and a hook.

  • add hook
  • create own wheel, see dyweb/Ayi#59, which created a logrus-like package in order to add filter functionality
  • enable filtering logs from the command line, e.g. --debug x.tsdb.kairosdb only prints the kairosdb package and its subpackages
  • enable filtering logs using a config file, like log4j

stretchr/testify/suite panic: reflect: Call with too few input arguments

ok  	github.com/xephonhq/xephon-b/pkg/generator	0.002s	coverage: 85.7% of statements
=== RUN   TestSerializerInterface
=== RUN   TestSerializeTestSuite
=== RUN   TestDebugSerializer
--- FAIL: TestDebugSerializer (0.00s)
panic: reflect: Call with too few input arguments [recovered]
	panic: reflect: Call with too few input arguments

goroutine 7 [running]:
panic(0x6a8ca0, 0xc4201f86d0)
	/home/at15/app/go/src/runtime/panic.go:500 +0x1a1
testing.tRunner.func1(0xc4200943c0)
	/home/at15/app/go/src/testing/testing.go:579 +0x25d
panic(0x6a8ca0, 0xc4201f86d0)
	/home/at15/app/go/src/runtime/panic.go:458 +0x243
reflect.Value.call(0xc4201e0ba0, 0xc420036530, 0x13, 0x70fb7d, 0x4, 0xc420034760, 0x1, 0x1, 0x506788, 0x70b060, ...)
	/home/at15/app/go/src/reflect/value.go:358 +0x13c1
reflect.Value.Call(0xc4201e0ba0, 0xc420036530, 0x13, 0xc420059f40, 0x1, 0x1, 0x87c140, 0x0, 0xf6)
	/home/at15/app/go/src/reflect/value.go:302 +0xa4
github.com/xephonhq/xephon-b/vendor/github.com/stretchr/testify/suite.Run.func2(0xc4200943c0)
	/home/at15/workspace/src/github.com/xephonhq/xephon-b/vendor/github.com/stretchr/testify/suite/suite.go:95 +0x1cb
testing.tRunner(0xc4200943c0, 0xc420223960)
	/home/at15/app/go/src/testing/testing.go:610 +0x81
created by testing.(*T).Run
	/home/at15/app/go/src/testing/testing.go:646 +0x2ec
FAIL	github.com/xephonhq/xephon-b/pkg/serialize	0.005s

Line 95 is method.Func.Call([]reflect.Value{reflect.ValueOf(suite)}). However, after seeing https://github.com/pavlo/gosuite/blob/master/gosuite.go#L56,
I changed it to method.Func.Call([]reflect.Value{reflect.ValueOf(suite), reflect.ValueOf(t)}) and the test works.

Possible reasons

  • I am not using it properly, since Ayi also works
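
One plausible cause, not confirmed in this issue, is a suite test method declared with an extra parameter (for example `t *testing.T`), while testify's suite examples declare test methods with the receiver only. The sketch below reproduces the same panic message with the standard `reflect` package alone; `demoSuite` and its methods are made up for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

type demoSuite struct{}

// TestNoArgs takes the receiver only, so a receiver-only call succeeds.
func (demoSuite) TestNoArgs() {}

// TestWithArg takes an extra parameter, so a receiver-only call is one
// argument short.
func (demoSuite) TestWithArg(name string) {}

// callWithReceiverOnly invokes the named method through reflection, passing
// just the receiver, and returns "ok" or the recovered panic message.
func callWithReceiverOnly(method string) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	s := demoSuite{}
	m, ok := reflect.TypeOf(s).MethodByName(method)
	if !ok {
		return "no such method"
	}
	m.Func.Call([]reflect.Value{reflect.ValueOf(s)})
	return "ok"
}

func main() {
	fmt.Println(callWithReceiverOnly("TestNoArgs"))
	fmt.Println(callWithReceiverOnly("TestWithArg"))
}
```

If this is the cause, removing the parameter from the test method and using the suite's own `T()` accessor would avoid patching the vendored suite.go.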

[integration] generator + loader + monitor + tsdb proxy (kairosdb)

Since all these parts already have a primitive prototype, it's time to put them together before digging into each one further. The process should be

  • start monitors for both client and server, writing into InfluxDB?
  • the generator generates synthetic data
  • the loader reads the data and feeds it through the tsdb proxy client into KairosDB
  • collect latency and db metrics into InfluxDB?

Related

  • #6 Config file, need to define the syntax of config file before integration
  • #30 tsdb-proxy and xephon-b are now in different repositories, though xephon-b still has all the code for now.

[reporter][counter] Align metrics from libtsdb-go and metrics/result.go

Currently there are three places we define metrics

  • libtsdb-go returns net/http/httptrace results on http clients
  • metrics/result.go contains some numbers, but not all of them can be obtained from what libtsdb-go exposes
  • counter has some numbers, but most of them are not updated correctly

TODO

  • work with http based tsdb clients
  • be compatible with raw tcp based clients, e.g. graphite
  • return the result in the finalize stage so the manager can write it somewhere

[runner][worker] Limit QPS

Besides limiting by time/points, we can add extra constraints like a QPS limit. It differs from the first two because those determine when the workload terminates, while QPS caps how fast each worker thread may run.

YCSB describes how latency is reported when QPS is limited:

if you specify a target of 10 operations per second (and a single thread) then the Client will only execute an operation every 100 milliseconds. If the operation takes 12 milliseconds, then the client will wait for an additional 88 milliseconds before trying the next operation. However, the reported latency will not include this wait time; a latency of 12 milliseconds, not 100, will be reported.
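
A rough sketch of a per-worker throttle that follows the YCSB behavior quoted above, using only the standard library. `runThrottled` and `intervalFor` are hypothetical names, and the sketch assumes a positive QPS value.

```go
package main

import (
	"fmt"
	"time"
)

// intervalFor converts a target QPS for a single worker into the gap between
// the start of two consecutive requests. qps must be positive.
func intervalFor(qps int) time.Duration {
	return time.Second / time.Duration(qps)
}

// runThrottled calls op n times, starting at most one call per interval, and
// returns the latency of each call. Time spent waiting for the next slot is
// not included in the latency.
func runThrottled(n int, interval time.Duration, op func()) []time.Duration {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	latencies := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		op()
		latencies = append(latencies, time.Since(start))
		if i < n-1 {
			<-ticker.C // wait for the next slot; not counted as latency
		}
	}
	return latencies
}

func main() {
	lat := runThrottled(3, intervalFor(100), func() { time.Sleep(time.Millisecond) })
	fmt.Println(len(lat), intervalFor(10))
}
```

With a target of 10 operations per second, `intervalFor` yields 100 milliseconds, which matches the example in the quote.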

KairosDB client

Related #10 #14

Payload

  • allow adding one point, which turns into bytes right away (d7852a9 json serialize example)
  • allow adding a point to a buffer; the pointer to the point is stored and points are grouped by series (TODO: this raises the problem of tag order ...., maybe use a set?)

Client

  • https support (need to disable some checks in order to use self-signed certificates)
  • share connections to avoid running out of file handles, as mentioned in hey rakyll/hey#31
  • config qps
  • * load following the time in the data
  • track latency and errors
  • put bench data into TSDB

Metric

  • pull KairosDB metrics
  • pull Cassandra metrics
  • pull machine metrics

Existing clients

Ref

[db][kairosdb] Invalid json. No content due to end of input

  • this error didn't show up when running test using libtsdb
  • points are still written into kairosdb, we can read it in the web ui
  • it could be we are not setting content type correctly
  • not draining connection? close body?
WARN 0009 failed to flush {"errors":["Invalid json. No content due to end of input.","Invalid json. No content due to end of input."]}
WARN 0009 failed to flush {"errors":["Invalid json. No content due to end of input.","Invalid json. No content due to end of input."]}
WARN 0009 failed to flush {"errors":["Invalid json. No content due to end of input.","Invalid json. No content due to end of input."]}
cassandra_1  | WARN  [Native-Transport-Requests-12] 2018-03-05 23:58:42,775 BatchStatement.java:301 - Batch for [kairosdb.data_points] is of size 5.469KiB, exceeding specified threshold of 5.000KiB by 0.469KiB.
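
Two of the guesses above (content type, draining and closing the body) can be ruled out with a client like the following sketch. It is not the libtsdb client code: `postJSON` and `demo` are hypothetical, the payload is a placeholder, and an `httptest` server stands in for KairosDB.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// postJSON sends payload with an explicit JSON content type, then drains and
// closes the response body so the keep-alive connection can be reused cleanly.
func postJSON(c *http.Client, url string, payload []byte) (int, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := c.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return resp.StatusCode, err
	}
	return resp.StatusCode, nil
}

// demo posts one payload to a local stand-in server (not KairosDB) and
// returns the status code the server replied with, or -1 on error.
func demo() int {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		if len(body) == 0 || r.Header.Get("Content-Type") != "application/json" {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()
	code, err := postJSON(srv.Client(), srv.URL, []byte(`[{"name":"cpu.idle"}]`))
	if err != nil {
		return -1
	}
	return code
}

func main() {
	fmt.Println(demo())
}
```

If the error persists with a client shaped like this, an empty request body (for example a flush with nothing buffered) would be the next thing to check.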

Time distribution of generated data

Related https://github.com/at15/xephon-b-paper/issues/4

In the current PR #9, point generation is the dumbest possible: a fixed value at a fixed time interval. This lets the other parts of the program (serialize, bulk load, query etc.) be implemented ASAP. To simulate real world use cases, complex and configurable data generation is needed. The original problem is discussed in the private repo, which contains some papers addressing this issue.

Will update the issue when I finish bulk load and query using the simplest point.

Logo and website

  • init the gh-pages
  • a logo
  • a hand drawn landing page
  • introduction for the lightning talk

Since the lt is on Friday .... got to finish it tonight

Config snapshot util

It is a pain to make people write down a copy of their config when they run a benchmark. Some people change their config right after one test, start a new one immediately, and end up unable to match their configs to their test results.

Xephon-B should take a snapshot of all the necessary config and put it in the report. We only consider the micro benchmark now

  • config file (may need to filter out credentials, or better, put credentials in a separate file)
    • load config
  • database information (e.g. if using docker, version information can be obtained using the docker client)
  • host information (e.g. during development, the problem is sometimes the developer's machine, not the code)
    • runtime versions
    • basic hardware information, mem, disk space etc.
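
The host part of the snapshot could start from what the Go standard library already exposes. This is only a sketch: `HostSnapshot` and `takeSnapshot` are hypothetical names, and memory and disk space are left out because they need platform-specific code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"runtime"
)

// HostSnapshot is a hypothetical record of the host a benchmark ran on.
type HostSnapshot struct {
	Hostname  string `json:"hostname"`
	OS        string `json:"os"`
	Arch      string `json:"arch"`
	NumCPU    int    `json:"num_cpu"`
	GoVersion string `json:"go_version"`
}

// takeSnapshot collects the host information available from the standard library.
func takeSnapshot() HostSnapshot {
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}
	return HostSnapshot{
		Hostname:  host,
		OS:        runtime.GOOS,
		Arch:      runtime.GOARCH,
		NumCPU:    runtime.NumCPU(),
		GoVersion: runtime.Version(),
	}
}

func main() {
	b, err := json.MarshalIndent(takeSnapshot(), "", "  ")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(b))
}
```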

Data generation

Time series database writes are different from other NoSQL writes. A write typically has

  • key: a string describing the source, e.g. cpu.idle
  • timestamp: when the event happened
  • value: a numeric value, integer or float
  • tags: k=>v pairs for adding attributes to the data
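
As a sketch, the four parts map onto a Go struct like the one below. The field names, the JSON encoding, and the use of `float64` for every value are assumptions made for illustration, not the format the generator uses.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Point is a hypothetical shape for one write; field names are illustrative only.
type Point struct {
	Key       string            `json:"key"`       // source of the data, e.g. cpu.idle
	Timestamp int64             `json:"timestamp"` // when the event happened (Unix milliseconds here)
	Value     float64           `json:"value"`     // numeric value; integers are stored as floats in this sketch
	Tags      map[string]string `json:"tags"`      // extra attributes
}

// encode returns the JSON form of p, or an empty string on error.
func encode(p Point) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	p := Point{
		Key:       "cpu.idle",
		Timestamp: 1520239163000,
		Value:     99.5,
		Tags:      map[string]string{"host": "node1"},
	}
	fmt.Println(encode(p))
}
```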

Examples

Since generating complex data costs a lot of resources, it's wise to save the data to disk.
While influx-comparison uses the bulk format of the target database, I think it's better to use a general serialization format and store metadata in another file (you can even do some dirty tricks on the metadata
to change the load without regenerating the data).

Serialization

[refactor] Split tsdb package out into separate repo

Related issues: #28, #18, #15

What the tsdb package does is like JDBC, except it's in Golang and for TSDB only. It would also provide a server side implementation in order to act as a proxy. So it's better to make it a standalone repo instead of bundling it inside xephon-b. Current problems are

  • the series and point packages in pkg/common are coupled with almost every package in xephon-b including tsdb, possible solutions are:

    • have a xephonhq/tsdb-proxy/common package in order to avoid possible cycle problems
  • tsdb actually has its own command related files in xephonhq/xephon-b/cmd/tsdb-proxy, while xephon-b has its command related files in xephonhq/xephon-b/pkg/cmd and only the binary in xephonhq/xephon-b/cmd/xephon-b

  • the tracing functionality is not limited to the benchmark tool and can be switched off for normal usage. For the benchmark, another goroutine can pull the metrics instead of having each goroutine send into a channel. If nobody collects the metrics, they are simply replaced by new ones, so there is no need to worry about memory usage when tracing is enabled for tons of goroutines.

  • create tsdb-proxy repository using git filter-branch

  • change import and remove unused files from each repo (starting from xephon-b might be easier), though the generator and simulator are really close related to time series logic

  • make sure both repositories pass their travis tests

  • update documentation for both repositories

[runner][reporter] Report progress

Runner should report the overall progress of the benchmark so the user can estimate the remaining time, e.g. 10% 1s/10s when limited by time, or 10% 10M/100M when limited by points (numbers should be human readable when printed to the console; it's hard to count the zeros ...)

  • have multiple reporters for progress; this is different from the reporter for worker results
  • write to terminal
    • by time
    • by percentage
  • expose to http handler?
  • (optional) report to xephon-b central for distributed processing? then xephon-b itself becomes a framework runs inside benchhub's job scheduler
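
A small sketch of the human readable formatting for the points case. `humanCount` and `progress` are hypothetical helpers, and the K/M/G suffixes (powers of 1000) are an assumption about the desired output.

```go
package main

import "fmt"

// humanCount formats n with a K, M, or G suffix (powers of 1000). It prints
// one decimal place only when n is not an exact multiple of the unit.
func humanCount(n int64) string {
	units := []struct {
		size   int64
		suffix string
	}{
		{1000000000, "G"},
		{1000000, "M"},
		{1000, "K"},
	}
	for _, u := range units {
		if n >= u.size {
			if n%u.size == 0 {
				return fmt.Sprintf("%d%s", n/u.size, u.suffix)
			}
			return fmt.Sprintf("%.1f%s", float64(n)/float64(u.size), u.suffix)
		}
	}
	return fmt.Sprintf("%d", n)
}

// progress renders a line with the percentage followed by done/total.
func progress(done, total int64) string {
	pct := int64(0)
	if total > 0 {
		pct = done * 100 / total
	}
	return fmt.Sprintf("%d%% %s/%s", pct, humanCount(done), humanCount(total))
}

func main() {
	fmt.Println(progress(10000000, 100000000))
}
```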

Config file

Related #1 #2

item with a * prefix will most likely be ignored

  • use viper to read yaml file
  • borrow viper patch from Ayi if necessary
  • one for general config xephon-b.yml
  • one for test config bench.yml, a copy will be stored with the benchmark result
  • * use ~/.xephon-b folder to store history of experiments, use some go k-v store

Write

  • load scenario
  • load type
  • ... many more

Read

do write first

[runner] Server for remote control

Currently Xephon-B just runs and stops; however, it could be a server program for

  • dynamic control during benchmark
  • running multiple workloads at the same time (just create multiple managers)
  • reduce warm up time if dataset is loaded from disk
  • detect memory and goroutine leak in current runner

TSDB Proxy

Related PR: #19

We need to (can) have a TSDB proxy for the following reasons

  • we have to develop multiple tsdb clients
  • cAdvisor supports various backends, but maybe none of them is the one we want
  • we need to store benchmark result in tsdb, and we should allow people to choose the one they like. (actually we don't know which one works well for storing the result)
  • we may have our own tsdb implementation

Clients

Proxy

Ref

Script for license check

Currently, xephon-b is released under MIT. However, if there are dependencies that use GPL or some strange
license, we may need to switch the library (not the license).

It should do the following

  • loop over the vendor folder and find each repo's license (since we are using glide there should not be nested vendor folders like in old node.js)
  • deal with duplicates, since some projects commit vendor into scm
  • print a tree
  • work on windows (under gitbash and/or msys64)

Since it's just a util script, I will write it in python

Container monitor

We need to provide some insight into the databases. For example, if a database fails while most of the system is idle, there is certainly a wrong configuration or a poor design.

Existing solutions

Ref
