cr8

A collection of command line tools for Crate developers (and maybe users as well).

Why cr8? 🤔

  1. To quickly produce sample data. When someone reports an issue, sample data is often needed to reproduce it. insert-fake-data and insert-json address this problem.
  2. To benchmark queries and compare runtime across Crate versions. timeit, run-spec and run-track can be used to get runtime statistics of queries. These tools focus on response latencies. Benchmarking throughput is NOT a goal of cr8. Similarly, simulating real-world use cases is NOT a goal of cr8.

Note

Most commands output text by default, but most of them also accept a --output-fmt json argument to output JSON instead. This is useful in combination with jq to post-process the output.

Install 💾

Python >= 3.6 is required to use the command line tools.

Install them using pip:

python3.6 -m pip install --user cr8

Usage

The main binary is called cr8 and provides several sub-commands.

Use cr8 -h or cr8 <subcommand> -h to get a more detailed usage description.

The included sub-commands are described in more detail below:

Sub-commands

timeit

A tool that can be used to measure the runtime of a given SQL statement on a cluster:

>>> echo "select name from sys.cluster" | cr8 timeit --hosts localhost:4200
Runtime (in ms):
    mean:    ... ± ...
    min/max: ... → ...
Percentile:
    50:   ... ± ... (stdev)
    95:   ...
    99.9: ...
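
To make the figures in this report easier to read, here is a minimal sketch of how such a summary could be derived from a list of latency samples. It is not cr8's implementation: the function name summarize, the sample values and the nearest-rank percentile method are all assumptions made for illustration.

```python
import math
import statistics


def summarize(samples_ms):
    """Summarize latency samples (in milliseconds). Hypothetical helper, not part of cr8."""
    ordered = sorted(samples_ms)

    def percentile(p):
        # Nearest-rank method (an assumption): take the sample at rank ceil(p% * n).
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {
        "mean": statistics.mean(ordered),
        # The sample standard deviation needs at least two values.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
        "min": ordered[0],
        "max": ordered[-1],
        "percentile": {p: percentile(p) for p in (50, 95, 99.9)},
    }


stats = summarize([1.0, 2.0, 3.0, 4.0, 10.0])
print(f"mean:    {stats['mean']:.3f} ± {stats['stdev']:.3f}")
print(f"min/max: {stats['min']:.3f} → {stats['max']:.3f}")
```

A single slow response barely moves the median but dominates the upper percentiles, which is why latency reports usually show both.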

insert-fake-data

A tool that can be used to fill a table with random data. The script will generate the records using faker.

For example given the table as follows:

create table x.demo (
    id int,
    name string,
    country string
);

The following command can be used to insert 200 records:

>>> cr8 insert-fake-data --hosts localhost:4200 --table x.demo --num-records 200
Found schema:
{
    "country": "string",
    "id": "integer",
    "name": "string"
}
Using insert statement:
insert into "x"."demo" ("country", "id", "name") values (?, ?, ?)
Will make 1 requests with a bulk size of 200
Generating fake data and executing inserts
<BLANKLINE>

It automatically reads the schema from the table, maps the columns to faker providers and inserts the given number of records.

(Currently only top-level columns are supported)
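
The flow described above (read the schema, map each column to a generator, build a bulk insert) can be sketched roughly as follows. This is an illustration under stated assumptions, not cr8's code: the GENERATORS mapping and all function names are hypothetical, and the standard library's random module stands in for faker so the example runs without extra dependencies.

```python
import math
import random
import string

# Hypothetical generators keyed by column type; cr8 itself maps columns to faker providers.
GENERATORS = {
    "integer": lambda: random.randint(0, 2**31 - 1),
    "string": lambda: "".join(random.choices(string.ascii_lowercase, k=8)),
}


def insert_statement(schema_name, table, columns):
    """Build a parameterized insert statement with quoted identifiers."""
    cols = ", ".join(f'"{c}"' for c in columns)
    params = ", ".join("?" for _ in columns)
    return f'insert into "{schema_name}"."{table}" ({cols}) values ({params})'


def fake_rows(schema, num_records):
    """Generate one list of values per record, in sorted column order."""
    columns = sorted(schema)
    return [[GENERATORS[schema[c]]() for c in columns] for _ in range(num_records)]


def num_requests(num_records, bulk_size):
    """Number of bulk requests needed to send all records."""
    return math.ceil(num_records / bulk_size)


schema = {"country": "string", "id": "integer", "name": "string"}
stmt = insert_statement("x", "demo", sorted(schema))
rows = fake_rows(schema, 200)
print(stmt)
print(f"Will make {num_requests(len(rows), 200)} requests with a bulk size of 200")
```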

insert-json

insert-json can be used to insert records from a JSON file:

>>> cat tests/demo.json | cr8 insert-json --table x.demo --hosts localhost:4200
Executing inserts: bulk_size=1000 concurrency=25
Runtime (in ms):
    mean:    ... ± 0.000

Or simply print the insert statement generated from a JSON string:

>>> echo '{"name": "Arthur"}' | cr8 insert-json --table mytable
('insert into mytable ("name") values (?)', ['Arthur'])
...
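
As a rough illustration of how a JSON record maps to a parameterized insert statement, consider the sketch below. The helper build_insert is hypothetical and sorting the keys is an assumption; only the shape of the result is taken from the output shown above.

```python
import json


def build_insert(table, line):
    """Turn one JSON object into (statement, args). Hypothetical helper, not cr8's API."""
    record = json.loads(line)
    # Sorting keeps the column order stable across records (an assumption).
    columns = sorted(record)
    cols = ", ".join(f'"{c}"' for c in columns)
    params = ", ".join("?" for _ in columns)
    stmt = f"insert into {table} ({cols}) values ({params})"
    return stmt, [record[c] for c in columns]


print(build_insert("mytable", '{"name": "Arthur"}'))
# prints: ('insert into mytable ("name") values (?)', ['Arthur'])
```

Passing the values separately as arguments, rather than interpolating them into the statement, avoids quoting problems and lets many records share the same statement in a bulk request.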

insert-blob

A tool to upload a file into a blob table:

>>> cr8 insert-blob --hosts localhost:4200 --table blobtable specs/sample.toml
http://.../_blobs/blobtable/2917773e74ff46d08f399435ed9b99afb9ed34bd

run-spec

A tool to run benchmarks against a cluster and store the result in another cluster. The benchmark itself is defined in a spec file which defines setup, benchmark and teardown instructions.

The instructions themselves are just SQL statements (or files containing SQL statements).

An example spec file can be found in the specs folder.

Usage:

>>> cr8 run-spec specs/sample.toml localhost:4200 -r localhost:4200
# Running setUp
# Running benchmark
<BLANKLINE>
## Running Query:
   Statement: select count(*) from countries
   Concurrency: 2
   Iterations: 100
Runtime (in ms):
    mean:    ... ± ...
    min/max: ... → ...
Percentile:
    50:   ... ± ... (stdev)
    95:   ...
    99.9: ...
...
## Skipping (Version ...
   Statement: ...
# Running tearDown
<BLANKLINE>

-r is optional and can be used to save the benchmark result into a cluster. A table named benchmarks will be created if it doesn't exist.

Writing spec files in Python is also supported:

>>> cr8 run-spec specs/sample.py localhost:4200
# Running setUp
# Running benchmark
...

run-crate

Launch a Crate instance:

> cr8 run-crate 0.55.0

This requires Java 8.

run-crate supports chaining of additional commands using --. Within the context of run-crate, any host URLs can be formatted using the {node.http_url} format string:

>>> cr8 run-crate latest-stable -- timeit -s "select 1" --hosts '{node.http_url}'
 # run-crate
===========
<BLANKLINE>
...
Starting Crate process
Crate launched:
    PID: ...
    Logs: ...
    Data: ...
<BLANKLINE>
...
Cluster ready to process requests
<BLANKLINE>
<BLANKLINE>
# timeit
========
<BLANKLINE>
<BLANKLINE>
<BLANKLINE>
<BLANKLINE>

In the above example timeit is a cr8-specific sub-command, but it's also possible to use arbitrary commands by prefixing them with @:

cr8 run-crate latest-nightly -- @http '{node.http_url}'
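
The {node.http_url} placeholder follows Python's format string syntax, where a field can reference an attribute of a named argument. The sketch below shows how such a substitution could work. The Node class, the expand_args helper and the URL are hypothetical stand-ins; only the placeholder itself comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical stand-in for the launched node; only http_url is mentioned in the text.
    http_url: str


def expand_args(args, node):
    """Substitute {node.<attribute>} placeholders in each chained argument."""
    return [arg.format(node=node) for arg in args]


node = Node(http_url="http://127.0.0.1:4200")
chained = ["timeit", "-s", "select 1", "--hosts", "{node.http_url}"]
print(expand_args(chained, node))
```

With plain str.format, an argument that contains literal braces (a JSON snippet, for example) would need them doubled as {{ and }}.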

run-track

A tool to run .toml track files. A track is a matrix definition of node version, configurations and spec files.

For each version and configuration a Crate node will be launched and all specs will be executed:

>>> cr8 run-track tracks/sample.toml
# Version:  latest-testing
## Starting Crate latest-testing, configuration: default.toml
### Running spec file:  sample.toml
# Running setUp
# Running benchmark
...
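
The matrix idea can be pictured as a cross product: one node launch per version and configuration pair, with every spec run against that node. The sketch below only illustrates this expansion. The function expand_track and the input lists are assumptions, and the layout of a real track file is not shown here.

```python
from itertools import product


def expand_track(versions, configurations, specs):
    """Yield one entry per node launch, together with the specs to run against it."""
    for version, configuration in product(versions, configurations):
        yield version, configuration, list(specs)


runs = list(
    expand_track(
        versions=["latest-testing", "latest-stable"],
        configurations=["default.toml"],
        specs=["sample.toml"],
    )
)
for version, configuration, spec_files in runs:
    print(f"# Version: {version}, configuration: {configuration}, specs: {spec_files}")
```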

Development ☢

To get a sandboxed environment with all dependencies installed use venv:

python -m venv .venv
source .venv/bin/activate

Install the cr8 package using pip:

python -m pip install -e .

Run cr8:

cr8 -h

Tests are run with python -m unittest

Contributors

chaudum, lowks, mfussenegger