Git Product home page Git Product logo

vectorsql's Introduction

NOTICE: This project have moved to Databend

VectorSQL Logo


Github Actions Status Github Actions Status Github Actions Status codecov.io License

VectorSQL is a free analytics DBMS for IoT & Big Data, compatible with ClickHouse.

Features

  • High Performance
  • High Scalability
  • High Reliability

Server

$git clone https://github.com/vectorengine/vectorsql
$cd vectorsql
$make build
$./bin/vectorsql-server -c conf/vectorsql-default.toml
	
 2020/01/27 19:02:39.245654    	 [DEBUG] 	Database->Attach Table:system.tables, engine:SYSTEM_TABLES <attachTable@database_system.go:116>
 2020/01/27 19:02:39.245670    	 [DEBUG] 	Database->Attach Table:system.databases, engine:SYSTEM_DATABASES <attachTable@database_system.go:116>
 2020/01/27 19:02:39.245680    	 [INFO] 	Database->Load Database:system <[email protected]:110>
 2020/01/27 19:02:39.245794    	 [INFO] 	Listening for connections with native protocol (tcp)::9000 <[email protected]:33>
 2020/01/27 19:02:39.245806    	 [INFO] 	Servers start... <[email protected]:62>

Client

  • clickhouse-client
$clickhouse-client --compression=0
VectorSQL :) SELECT SUM(IF(status!=200, 1, 0)) AS errors, SUM(IF(status=200, 1, 0)) as success, (errors/COUNT(server)) AS error_rate,(success/COUNT(server)) as success_rate, (SUM(response_time)/COUNT(server)) AS load_avg, MIN(response_time), MAX(response_time), path, server FROM logmock(rows->15) GROUP BY server, path HAVING errors>0 ORDER BY server ASC, load_avg DESC;

SELECT 
    SUM(IF(status != 200, 1, 0)) AS errors, 
    SUM(IF(status = 200, 1, 0)) AS success, 
    errors / COUNT(server) AS error_rate, 
    success / COUNT(server) AS success_rate, 
    SUM(response_time) / COUNT(server) AS load_avg, 
    MIN(response_time), 
    MAX(response_time), 
    path, 
    server
FROM logmock(rows -> 15)
GROUP BY 
    server, 
    path
HAVING errors > 0
ORDER BY 
    server ASC, 
    load_avg DESC

┌─errors─┬─success─┬─error_rate─┬─success_rate─┬─load_avg─┬─MIN(response_time)─┬─MAX(response_time)─┬─path───┬─server──────┐
│      2 │       1 │     0.6667 │       0.3333 │       12 │                 10 │                 13 │ /login │ 192.168.0.1 │
│      1 │       5 │     0.1667 │       0.8333 │  11.1667 │                 10 │                 12 │ /index │ 192.168.0.1 │
│      1 │       3 │       0.25 │         0.75 │    11.25 │                 10 │                 14 │ /index │ 192.168.0.2 │
│      1 │       1 │        0.5 │          0.5 │       11 │                 10 │                 12 │ /login │ 192.168.0.2 │
└────────┴─────────┴────────────┴──────────────┴──────────┴────────────────────┴────────────────────┴────────┴─────────────┘
↓ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) 
4 rows in set. Elapsed: 0.005 sec. 
  • http-client
curl -XPOST http://127.0.0.1:8123 -d "SELECT SUM(IF(status!=200, 1, 0)) AS errors, SUM(IF(status=200, 1, 0)) as success, (errors/COUNT(server)) AS error_rate,(success/COUNT(server)) as success_rate, (SUM(response_time)/COUNT(server)) AS load_avg, MIN(response_time), MAX(response_time), path, server FROM logmock(rows->15) GROUP BY server, path HAVING errors>0 ORDER BY server ASC, load_avg DESC"
2	1	0.6667	0.3333	12.0000	10	13	/login	192.168.0.1
1	5	0.1667	0.8333	11.1667	10	12	/index	192.168.0.1
1	3	0.2500	0.7500	11.2500	10	14	/index	192.168.0.2
1	1	0.5000	0.5000	11.0000	10	12	/login	192.168.0.2

Query Language Features

Query language Current version Future versions Example
Scans by Value + + SELECT a,b
Scans by Expression + + SELECT IF(a>2,a,b),SUM(a)
Filter by Value + + WHERE a>10
Filter by Expression + + WHERE a>(b+10)
Group-Aggregate by Value + + GROUP BY a
Group-Aggregate by Expression + + GROUP BY (a+1)
Group-Having by Value + + HAVING count_a>2
Group-Having by Expression + + HAVING (count_a+1)>2
Order by Value + + ORDER BY a desc
Order by Expression + + ORDER BY (a+b)
Window Functions - +
Common Table Expressions - +
Join - +

Performance

  • Dataset: 10,000,000 (10 Million)
  • Hardware: 16vCPUx16G KVM Cloud Instance
  • Benchmark
Query Cost(second)
SELECT COUNT(id) FROM testdata 0.269s
SELECT COUNT(id) FROM testdata WHERE id!=0 0.438s
SELECT SUM(data1) FROM testdata 0.287s
SELECT SUM(data1) AS sum, COUNT(data1) AS count, sum/count AS avg FROM testdata 1.814s
SELECT MAX(id), MIN(id) FROM testdata 0.473s
SELECT COUNT(data1) AS count, data1 FROM testdata GROUP BY data1 ORDER BY count DESC LIMIT 10 0.728s
SELECT email FROM testdata WHERE email like '%[email protected]%' LIMIT 1 0.076s
SELECT COUNT(email) FROM testdata WHERE email like '%[email protected]%' 1.470s
SELECT data1 AS x, x - 1, x - 2, x - 3, count(data1) AS c FROM testdata GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10 2.396s

Metrics

http://localhost:8080/debug/metrics

vectorsql's People

Contributors

apavlo avatar bohutang avatar sundy-li avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

vectorsql's Issues

I tried vectorsql and this is what happened.

I just tried it out of curiosity. I assume that this is just a development prototype, so no real expectations.
Sorry to bother.

git clone https://github.com/vectorengine/vectorsql
cd vectorsql
make build
./bin/vectorsql-server -c conf/vectorsql-default.toml
clickhouse-client --compression 0
# Cannot load data for command line suggestions: Code: 0, e.displayText() = DB::Exception: Received from localhost:9000. . Every derived table must have its own alias at position 849 near 'where'. (version 20.10.1.1)
VectorSQL :) SELECT 1

SELECT 1


Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Couldn't find table:dual storage.

VectorSQL :) SHOW DATABASES

SHOW DATABASES

┌─name───┬─engine─┬─data_path────────────┬─metadata_path────────────┐
│ system │ SYSTEM │ data9000/data/system │ data9000/metadata/system │
└────────┴────────┴──────────────────────┴──────────────────────────┘

1 rows in set. Elapsed: 0.004 sec. 

VectorSQL :) SHOW TABLES FROM system

SHOW TABLES FROM system

┌─name──────┐
│ databases │
│ numbers   │
│ tables    │
└───────────┘

3 rows in set. Elapsed: 0.001 sec. 

VectorSQL :) SELECT count() FROM system.numbers

SELECT count()
FROM system.numbers

↙ Progress: 3.00 rows, 155.00 B (4.10 thousand rows/s., 212.08 KB/s.)  99%
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Unsupported Expression:COUNT. 

0 rows in set. Elapsed: 0.014 sec.

VectorSQL :) SELECT count(*) FROM system.numbers

SELECT count(*)
FROM system.numbers


Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000

Connecting to localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x9ad671, 0x16)
        /usr/lib/go-1.12/src/runtime/panic.go:617 +0x72
runtime.sysMap(0xca68000000, 0x1f28000000, 0xe546f8)
        /usr/lib/go-1.12/src/runtime/mem_linux.go:170 +0xc7
runtime.(*mheap).sysAlloc(0xe36e40, 0x1f26286000, 0xe36e50, 0xf93143)
        /usr/lib/go-1.12/src/runtime/malloc.go:633 +0x1cd
runtime.(*mheap).grow(0xe36e40, 0xf93143, 0x0)
        /usr/lib/go-1.12/src/runtime/mheap.go:1222 +0x42
runtime.(*mheap).allocSpanLocked(0xe36e40, 0xf93143, 0xe54708, 0x7f6100000000)
        /usr/lib/go-1.12/src/runtime/mheap.go:1150 +0x37f
runtime.(*mheap).alloc_m(0xe36e40, 0xf93143, 0x100, 0x7f614d7f9dc8)
        /usr/lib/go-1.12/src/runtime/mheap.go:977 +0xc2
runtime.(*mheap).alloc.func1()
        /usr/lib/go-1.12/src/runtime/mheap.go:1048 +0x4c
runtime.(*mheap).alloc(0xe36e40, 0xf93143, 0xc000010100, 0xc0005e4900)
        /usr/lib/go-1.12/src/runtime/mheap.go:1047 +0x8a
runtime.largeAlloc(0x1f262849b8, 0xc000020001, 0xdc7201)
        /usr/lib/go-1.12/src/runtime/malloc.go:1055 +0x99
runtime.mallocgc.func1()
        /usr/lib/go-1.12/src/runtime/malloc.go:950 +0x46
runtime.systemstack(0x7f6130000020)
        /usr/lib/go-1.12/src/runtime/asm_amd64.s:351 +0x66
runtime.mstart()
        /usr/lib/go-1.12/src/runtime/proc.go:1153

...
VectorSQL :) SELECT count(*) FROM system.numbers

SELECT count(*)
FROM system.numbers


Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000

Let's try with clickhouse-cli:

$ pip3 install clickhouse-cli

$ clickhouse-cli
clickhouse-cli version: 0.3.6
Connecting to 127.0.0.1:8123
Error: Request failed: `SELECT version();` query failed.

Let's try HTTP interface:

$ curl http://localhost:8123/ -d 'SELECT count() FROM system.numbers'
Unsupported Expression:COUNT
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'
$ 
$ 
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'

Returns immediately, server prints panic.

VectorSQL :) CREATE TEMPORARY TABLE t (x UInt64);

CREATE TEMPORARY TABLE t
(
    `x` UInt64
)


Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . syntax error at position 23 near 'table'. 
VectorSQL :) CREATE DATABASE test

CREATE DATABASE test

Ok.

0 rows in set. Elapsed: 0.001 sec. 

VectorSQL :) USE test

USE test


Exception on client:
Code: 210. DB::NetException: Connection refused (localhost:9000)

Connecting to localhost:9000 as user default.
Code: 210. DB::NetException: Connection refused (localhost:9000)

Server crashed.

benchmark$ ./run.sh 
01_create_table.sh
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . database:benchmark doesn't exists. 
02_generate_data.sh
rm: cannot remove 'data.tsv': No such file or directory
03_load_data.sh
clickhouse-client: ../src/DataStreams/ParallelParsingBlockInputStream.cpp:190: void DB::ParallelParsingBlockInputStream::parserThreadFunction(DB::ThreadGroupStatusPtr, size_t): Assertion `unit.is_last || !unit.block_ext.block.empty()' failed.
./03_load_data.sh: line 3: 15092 Broken pipe             cat data.tsv
     15093 Aborted                 (core dumped) | clickhouse-client --compression=0 --database=benchmark --query="insert into testdata FORMAT TabSeparated"
04_run_bench.sh
| SELECT COUNT(id) FROM testdata | 0.001s |
| SELECT COUNT(id) FROM testdata WHERE id!=0 | 0.001s |
| SELECT SUM(data1) FROM testdata | 0.001s |
| SELECT SUM(data1) AS sum, COUNT(data1) AS count, sum/count AS avg FROM testdata | 0.001s |
| SELECT MAX(id), MIN(id) FROM testdata | 0.001s |
| SELECT COUNT(data1) AS count, data1 FROM testdata GROUP BY data1 ORDER BY count DESC LIMIT 10 | 0.001s |
| SELECT email FROM testdata WHERE email like '%[email protected]%' LIMIT 1 | 0.001s |
| SELECT COUNT(email) FROM testdata WHERE email like '%[email protected]%' | 0.001s |
| SELECT data1 AS x, x - 1, x - 2, x - 3, count(data1) AS c FROM testdata GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10 | 0.001s |

Benchmark works Ok.

But server does not return any data:

milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0
ClickHouse client version 20.10.1.1.
Connecting to localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.

ClickHouse server version is older than ClickHouse client. It may indicate that the server is out of date and can be upgraded.

Cannot load data for command line suggestions: Code: 0, e.displayText() = DB::Exception: Received from localhost:9000. . Every derived table must have its own alias at position 849 near 'where'. (version 20.10.1.1)
VectorSQL :) USE test

USE test

Ok.

0 rows in set. Elapsed: 0.001 sec. 

VectorSQL :) SHOW TABLES

SHOW TABLES

Ok.

0 rows in set. Elapsed: 0.002 sec. 

VectorSQL :) USE benchmark

USE benchmark

Ok.

0 rows in set. Elapsed: 0.001 sec. 

VectorSQL :) SHOW TABLES

SHOW TABLES

┌─name─────┐
│ testdata │
└──────────┘

1 rows in set. Elapsed: 0.003 sec. 

VectorSQL :) SELECT count() FROM testdata

SELECT count()
FROM testdata

Ok.

0 rows in set. Elapsed: 0.001 sec. 

VectorSQL :) SELECT count(*) FROM testdata

SELECT count(*)
FROM testdata


Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000

Connecting to database benchmark at localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.

ClickHouse server version is older than ClickHouse client. It may indicate that the server is out of date and can be upgraded.

VectorSQL :) SELECT count(ID) FROM testdata

SELECT count(ID)
FROM testdata

Ok.

0 rows in set. Elapsed: 0.002 sec. 

VectorSQL :) Bye.
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --query "SELECT count(ID) FROM testdata"
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Couldn't find table:testdata storage. 
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --database benchmark --query "SELECT count(ID) FROM testdata"
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ 
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --database benchmark --query "SELECT count(ID) FROM testdata"
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$

[feature] implement window functions

Summary

implement window functions

  SELECT
        EmpName, 
        DeptName,
        SUM(Salary) OVER( PARTITION BY DeptName ) AS SalaryByDept
    FROM @employees;
EmpName DeptName SalaryByDept
Noah Engineering 60000
Sophia Engineering 60000
Liam Engineering 60000
Mason Executive 50000
Emma HR 30000
Jacob HR 30000
Olivia HR 30000
Ava Marketing 25000
Ethan Marketing 25000

Dev plan

What's the completed features and development plan?

[feature] implement CASE expression

Summary

Syntax:

CASE <cond>
  WHEN <condval1> THEN <expr1>
  [ WHEN <condvalx> THEN <exprx> ] ...
  [ ELSE <expr2> ]
END

For example:

SELECT CustomerName, City, Country
FROM Customers
ORDER BY
(CASE
    WHEN City IS NULL THEN Country
    ELSE City
END);

clickhouse-client connection time out

my docker clickhouse-client:

leo@LEO MINGW64 ~/Desktop
$ winpty docker run -it yandex/clickhouse-client --host 192.168.0.106 --compression=0
ClickHouse client version 20.1.4.14 (official build).
Connecting to 192.168.0.106:9000 as user default.


Code: 209. DB::NetException: Timeout exceeded while reading from socket (192.168.0.106:9000)

server log:

 2020/02/29 15:39:22.557296      [INFO]         Memory InUse: 3.3 MB    Alloc: 2.1 MB    Sys: 7.2 MB <[email protected]:80>
 2020/02/29 15:39:32.558532      [INFO]         Memory InUse: 3.3 MB    Alloc: 2.1 MB    Sys: 7.2 MB <[email protected]:80>
 2020/02/29 15:39:42.557965      [INFO]         Memory InUse: 3.3 MB    Alloc: 2.1 MB    Sys: 7.2 MB <[email protected]:80>
 2020/02/29 15:39:42.798833      [DEBUG]        Receive client hello:&{ClientName:ClickHouse client ClientVersionMajor:20 ClientVersionMinor:1 ClientRevision:54431 Database: User:default Password:} <processHello@tcp_hello.go:49>
 2020/02/29 15:39:42.799832      [ERROR]        EOF, *errors.errorString <handle@tcp_handler.go:71>
 2020/02/29 15:39:52.557393      [INFO]         Memory InUse: 3.3 MB    Alloc: 2.1 MB    Sys: 7.2 MB <[email protected]:80>
 2020/02/29 15:40:02.557775      [INFO]         Memory InUse: 3.3 MB    Alloc: 2.1 MB    Sys: 7.2 MB <[email protected]:80>

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.