tensorbase's Introduction

What is TensorBase

TensorBase is a modern engineering effort to build a high-performance, cost-effective big data warehouse in an open source culture.

News

Here is a year-end summary of Base's wonderful 2020 journey with Rust.

Base has missed the planned 2020.11 milestone because I want a solid new release. The backend engine was 80% ready in early November, but I am still fighting with the server side. This is the price of compatibility with the ClickHouse communication protocol. The good news is that a server with an order of magnitude higher raw packet throughput than the official ClickHouse server will be out in the coming days.

Let me see if I can share more about these outcomes at Rust China Conf 2020.

The core work and practices of TensorBase will be presented with the Rust context in mind. In addition, the current progress (a.k.a. TensorBase 2020.11) will be shown as far as possible.

Let's meet at the talk, all data nerds!

Status

TensorBase has released milestone 0 as a developer preview release to invite interested contributors to join in.

Development is currently active in the background. This repo is not synced regularly because different editions are planned starting from m1 and there are no external contributions at present.

  • TensorBase is an architectural performance design.

In milestone 0, it has been demonstrated to query ~1.5 billion rows of the NYC taxi dataset in ~100 milliseconds of total response time. This is about 6x faster than ClickHouse.

Aggregation results in Base's baseshell (95 - 118ms)

Aggregation result in ClickHouse client (0.642s or 642ms)
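
The speedup can be checked directly from the two timings quoted above. The following is only a small arithmetic sketch. The helper name `speedup` is made up for illustration, and the inputs are just the figures from the captions (95 to 118 ms for baseshell, 642 ms for the ClickHouse client).

```rust
/// Ratio of a baseline latency to a measured latency (both in milliseconds).
fn speedup(baseline_ms: f64, measured_ms: f64) -> f64 {
    baseline_ms / measured_ms
}

fn main() {
    let clickhouse_ms = 642.0; // ClickHouse client figure quoted above
    let base_fast_ms = 95.0; // fastest baseshell figure quoted above
    let base_slow_ms = 118.0; // slowest baseshell figure quoted above

    // Prints the speedup for the fastest and slowest baseshell runs.
    println!("best case:  {:.1}x", speedup(clickhouse_ms, base_fast_ms));
    println!("worst case: {:.1}x", speedup(clickhouse_ms, base_slow_ms));
}
```

With these inputs the ratio falls between roughly 5.4x and 6.8x, which is consistent with the rounded 6x figure.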

  • TensorBase is a highly hackable system

TensorBase is written from scratch in Rust and its friend C. With comfortable languages, minimal dependencies, and a from-scratch architecture, you can use the most familiar tools to take on the most difficult problems.

If you like this project, please give it a star to help it grow.

Roadmap

The coming m1 will be the first milestone targeted at providing a production-friendly release.

A special edition will be shown to interested individuals and organizations. If you are interested, subscribe to TensorBase's Newsletter here to be among the first to get information.

Try TensorBase

TensorBase is developed for Linux, but should work on any Docker-enabled system (for example, Windows 10 WSL2).

  • from source

TensorBase follows the idiomatic development flow of Rust. Make sure your Rust nightly toolchain works. If you only want to run it, just follow the Quick Start. Thanks to the strong Rust ecosystem, it is not necessary to run a build first. Please check the prerequisites before running from source.

  • docker

This mode is portable (but has some platform-dependent resource and performance effects).

Try like this:

docker pull tensorbase/tensorbase:m0
docker run -ti tensorbase/tensorbase:m0 /bin/bash
>> /base/baseshell

Then run a sum aggregation SQL query against the preshipped data (1MB):

select sum(trip_id) from nyc_taxi
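
To make concrete what this query computes, here is a minimal Rust sketch of a sum aggregation over a single integer column. It is an illustration only and not TensorBase's engine. The function `sum_column`, the in-memory slice of `u32` values, and the sample data are all assumptions made for the example.

```rust
/// Sums a column of 32-bit integers into a 64-bit accumulator.
/// Widening to u64 keeps the total from overflowing for up to 2^32 rows.
fn sum_column(column: &[u32]) -> u64 {
    column.iter().map(|&v| u64::from(v)).sum()
}

fn main() {
    // Hypothetical stand-in for the trip_id column of nyc_taxi.
    let trip_id: Vec<u32> = vec![1, 2, 3, 4, 5];
    println!("sum(trip_id) = {}", sum_column(&trip_id));
}
```

A real engine would read the column from storage and could split the work across cores, but the result is the same single number.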

Quick Start

TensorBase currently provides two binaries to enable the following workflow:

  • baseops: CLI/workbench for devops, including starting and stopping various processes/roles

  • baseshell: query client (currently a monolith that includes everything). m0 intentionally supports only queries with a sum aggregation over a single integer column.

  1. run baseops to create a table definition in Base
cargo run --bin baseops table create -c samples/nyc_taxi_create_table_sample.sql

Base explicitly separates write/mutation behaviors into the baseops CLI. The provided SQL file is just an ANSI SQL DDL script, which can be found in the samples directory of the repo.

  2. run baseops to import nyc_taxi csv dataset into Base
cargo run --release --bin baseops import csv -c /jian/nyc-taxi.csv -i nyc_taxi:trip_id,pickup_datetime,passenger_count:0,2,10:51

The Base import tool uniquely supports importing only part of a CSV file into storage, as shown above. Use help for more information.

  3. run baseshell to issue query against Base
cargo run --release --bin baseshell

Dev Docs provides a little more explanation of why the above commands work.
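
The partial import in the `baseops import csv` command above can be pictured with a short Rust sketch. It is not baseops code, and the meaning of the `-i` argument is not documented here. The sketch assumes the argument has the form `table:columns:field_indexes:total_fields`, so that `0,2,10` are zero-based CSV field positions and `51` is the number of fields per row. `ImportSpec`, `parse_spec` and `project` are hypothetical names.

```rust
/// Hypothetical parsed form of an import spec such as
/// `nyc_taxi:trip_id,pickup_datetime,passenger_count:0,2,10:51`.
#[derive(Debug, PartialEq)]
struct ImportSpec {
    table: String,
    columns: Vec<String>,
    field_indexes: Vec<usize>,
    total_fields: usize,
}

/// Parses `table:col,col,...:idx,idx,...:total` (assumed layout, see above).
fn parse_spec(spec: &str) -> Option<ImportSpec> {
    let parts: Vec<&str> = spec.split(':').collect();
    if parts.len() != 4 {
        return None;
    }
    let columns: Vec<String> = parts[1].split(',').map(str::to_string).collect();
    let field_indexes = parts[2]
        .split(',')
        .map(|s| s.parse::<usize>().ok())
        .collect::<Option<Vec<usize>>>()?;
    let total_fields = parts[3].parse::<usize>().ok()?;
    // Each target column needs exactly one source field, and every index must be in range.
    if columns.len() != field_indexes.len() || field_indexes.iter().any(|&i| i >= total_fields) {
        return None;
    }
    Some(ImportSpec { table: parts[0].to_string(), columns, field_indexes, total_fields })
}

/// Keeps only the selected fields of one CSV line (naive split, no quoting support).
fn project<'a>(line: &'a str, spec: &ImportSpec) -> Option<Vec<&'a str>> {
    let fields: Vec<&str> = line.split(',').collect();
    if fields.len() != spec.total_fields {
        return None;
    }
    Some(spec.field_indexes.iter().map(|&i| fields[i]).collect())
}

fn main() {
    // A tiny three-field example instead of the 51-field taxi file.
    let spec = parse_spec("t:a,b:0,2:3").expect("valid spec");
    println!("importing into {} columns {:?}", spec.table, spec.columns);
    println!("{:?}", project("10,skip,30", &spec));
}
```

Only the selected fields reach storage, so a wide CSV can feed a narrow table without a preprocessing step.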

Engineering Efforts

Find more information on the dev page.

Communications

Feel free to report any problem via issues.

Free-style discussion:

Discord Server

Slack Channel

WeChat Group


Contributing

Thanks for your contributions!

Dev Docs

License

TensorBase is distributed under the terms of the Apache License (Version 2.0).

See LICENSE for details.

Contributors

jinmingjian, ranjithpmankada, sundy-li
