vectorengine / vectorsql
VectorSQL is a free analytics DBMS for IoT & Big Data, compatible with ClickHouse.
Home Page: https://github.com/vectorengine/vectordb
License: Apache License 2.0
implement group by
why not use go mod
create table test(tag string, metrics) Engine=TimeStore
document the expression
I just tried it out of curiosity. I assume that this is just a development prototype, so no real expectations.
Sorry to bother.
git clone https://github.com/vectorengine/vectorsql
cd vectorsql
make build
./bin/vectorsql-server -c conf/vectorsql-default.toml
clickhouse-client --compression 0
# Cannot load data for command line suggestions: Code: 0, e.displayText() = DB::Exception: Received from localhost:9000. . Every derived table must have its own alias at position 849 near 'where'. (version 20.10.1.1)
VectorSQL :) SELECT 1
SELECT 1
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Couldn't find table:dual storage.
VectorSQL :) SHOW DATABASES
SHOW DATABASES
┌─name───┬─engine─┬─data_path────────────┬─metadata_path────────────┐
│ system │ SYSTEM │ data9000/data/system │ data9000/metadata/system │
└────────┴────────┴──────────────────────┴──────────────────────────┘
1 rows in set. Elapsed: 0.004 sec.
VectorSQL :) SHOW TABLES FROM system
SHOW TABLES FROM system
┌─name──────┐
│ databases │
│ numbers │
│ tables │
└───────────┘
3 rows in set. Elapsed: 0.001 sec.
VectorSQL :) SELECT count() FROM system.numbers
SELECT count()
FROM system.numbers
↙ Progress: 3.00 rows, 155.00 B (4.10 thousand rows/s., 212.08 KB/s.) 99%
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Unsupported Expression:COUNT.
0 rows in set. Elapsed: 0.014 sec.
VectorSQL :) SELECT count(*) FROM system.numbers
SELECT count(*)
FROM system.numbers
Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000
Connecting to localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.
fatal error: runtime: out of memory
runtime stack:
runtime.throw(0x9ad671, 0x16)
/usr/lib/go-1.12/src/runtime/panic.go:617 +0x72
runtime.sysMap(0xca68000000, 0x1f28000000, 0xe546f8)
/usr/lib/go-1.12/src/runtime/mem_linux.go:170 +0xc7
runtime.(*mheap).sysAlloc(0xe36e40, 0x1f26286000, 0xe36e50, 0xf93143)
/usr/lib/go-1.12/src/runtime/malloc.go:633 +0x1cd
runtime.(*mheap).grow(0xe36e40, 0xf93143, 0x0)
/usr/lib/go-1.12/src/runtime/mheap.go:1222 +0x42
runtime.(*mheap).allocSpanLocked(0xe36e40, 0xf93143, 0xe54708, 0x7f6100000000)
/usr/lib/go-1.12/src/runtime/mheap.go:1150 +0x37f
runtime.(*mheap).alloc_m(0xe36e40, 0xf93143, 0x100, 0x7f614d7f9dc8)
/usr/lib/go-1.12/src/runtime/mheap.go:977 +0xc2
runtime.(*mheap).alloc.func1()
/usr/lib/go-1.12/src/runtime/mheap.go:1048 +0x4c
runtime.(*mheap).alloc(0xe36e40, 0xf93143, 0xc000010100, 0xc0005e4900)
/usr/lib/go-1.12/src/runtime/mheap.go:1047 +0x8a
runtime.largeAlloc(0x1f262849b8, 0xc000020001, 0xdc7201)
/usr/lib/go-1.12/src/runtime/malloc.go:1055 +0x99
runtime.mallocgc.func1()
/usr/lib/go-1.12/src/runtime/malloc.go:950 +0x46
runtime.systemstack(0x7f6130000020)
/usr/lib/go-1.12/src/runtime/asm_amd64.s:351 +0x66
runtime.mstart()
/usr/lib/go-1.12/src/runtime/proc.go:1153
...
VectorSQL :) SELECT count(*) FROM system.numbers
SELECT count(*)
FROM system.numbers
Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000
Let's try with clickhouse-cli:
$ pip3 install clickhouse-cli
$ clickhouse-cli
clickhouse-cli version: 0.3.6
Connecting to 127.0.0.1:8123
Error: Request failed: `SELECT version();` query failed.
Let's try HTTP interface:
$ curl http://localhost:8123/ -d 'SELECT count() FROM system.numbers'
Unsupported Expression:COUNT
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'
$
$
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'
$ curl http://localhost:8123/ -d 'SELECT count(*) FROM system.numbers'
Returns immediately, server prints panic.
VectorSQL :) CREATE TEMPORARY TABLE t (x UInt64);
CREATE TEMPORARY TABLE t
(
`x` UInt64
)
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . syntax error at position 23 near 'table'.
VectorSQL :) CREATE DATABASE test
CREATE DATABASE test
Ok.
0 rows in set. Elapsed: 0.001 sec.
VectorSQL :) USE test
USE test
Exception on client:
Code: 210. DB::NetException: Connection refused (localhost:9000)
Connecting to localhost:9000 as user default.
Code: 210. DB::NetException: Connection refused (localhost:9000)
Server crashed.
benchmark$ ./run.sh
01_create_table.sh
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . database:benchmark doesn't exists.
02_generate_data.sh
rm: cannot remove 'data.tsv': No such file or directory
03_load_data.sh
clickhouse-client: ../src/DataStreams/ParallelParsingBlockInputStream.cpp:190: void DB::ParallelParsingBlockInputStream::parserThreadFunction(DB::ThreadGroupStatusPtr, size_t): Assertion `unit.is_last || !unit.block_ext.block.empty()' failed.
./03_load_data.sh: line 3: 15092 Broken pipe cat data.tsv
15093 Aborted (core dumped) | clickhouse-client --compression=0 --database=benchmark --query="insert into testdata FORMAT TabSeparated"
04_run_bench.sh
| SELECT COUNT(id) FROM testdata | 0.001s |
| SELECT COUNT(id) FROM testdata WHERE id!=0 | 0.001s |
| SELECT SUM(data1) FROM testdata | 0.001s |
| SELECT SUM(data1) AS sum, COUNT(data1) AS count, sum/count AS avg FROM testdata | 0.001s |
| SELECT MAX(id), MIN(id) FROM testdata | 0.001s |
| SELECT COUNT(data1) AS count, data1 FROM testdata GROUP BY data1 ORDER BY count DESC LIMIT 10 | 0.001s |
| SELECT email FROM testdata WHERE email like '%[email protected]%' LIMIT 1 | 0.001s |
| SELECT COUNT(email) FROM testdata WHERE email like '%[email protected]%' | 0.001s |
| SELECT data1 AS x, x - 1, x - 2, x - 3, count(data1) AS c FROM testdata GROUP BY x, x - 1, x - 2, x - 3 ORDER BY c DESC LIMIT 10 | 0.001s |
Benchmark works Ok.
But server does not return any data:
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0
ClickHouse client version 20.10.1.1.
Connecting to localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.
ClickHouse server version is older than ClickHouse client. It may indicate that the server is out of date and can be upgraded.
Cannot load data for command line suggestions: Code: 0, e.displayText() = DB::Exception: Received from localhost:9000. . Every derived table must have its own alias at position 849 near 'where'. (version 20.10.1.1)
VectorSQL :) USE test
USE test
Ok.
0 rows in set. Elapsed: 0.001 sec.
VectorSQL :) SHOW TABLES
SHOW TABLES
Ok.
0 rows in set. Elapsed: 0.002 sec.
VectorSQL :) USE benchmark
USE benchmark
Ok.
0 rows in set. Elapsed: 0.001 sec.
VectorSQL :) SHOW TABLES
SHOW TABLES
┌─name─────┐
│ testdata │
└──────────┘
1 rows in set. Elapsed: 0.003 sec.
VectorSQL :) SELECT count() FROM testdata
SELECT count()
FROM testdata
Ok.
0 rows in set. Elapsed: 0.001 sec.
VectorSQL :) SELECT count(*) FROM testdata
SELECT count(*)
FROM testdata
Exception on client:
Code: 210. DB::NetException: Connection reset by peer, while reading from socket (127.0.0.1:9000): while receiving packet from localhost:9000
Connecting to database benchmark at localhost:9000 as user default.
Connected to VectorSQL server version 19.17.1 revision 54428.
ClickHouse server version is older than ClickHouse client. It may indicate that the server is out of date and can be upgraded.
VectorSQL :) SELECT count(ID) FROM testdata
SELECT count(ID)
FROM testdata
Ok.
0 rows in set. Elapsed: 0.002 sec.
VectorSQL :) Bye.
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --query "SELECT count(ID) FROM testdata"
Received exception from server (version 19.17.1):
Code: 0. DB::Exception: Received from localhost:9000. . Couldn't find table:testdata storage.
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --database benchmark --query "SELECT count(ID) FROM testdata"
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$ clickhouse-client --compression 0 --database benchmark --query "SELECT count(ID) FROM testdata"
milovidov@milovidov-desktop:~/work/vectorsql/benchmark$
Currently, datavalues.Value uses a lot of memory as the base datum; we need to rewrite a lighter one.
What are the completed features, and what is the development plan?
implement groupby having
implement timestamp data type
$ go build -v -o bin/vectorsql-server src/cmd/server.go
base/xlog
# base/xlog
src\base\xlog\xlog.go:52:12: undefined: syslog.New
src\base\xlog\xlog.go:52:23: undefined: syslog.LOG_DEBUG
Add scheduler for distributed query engine.
my docker clickhouse-client:
leo@LEO MINGW64 ~/Desktop
$ winpty docker run -it yandex/clickhouse-client --host 192.168.0.106 --compression=0
ClickHouse client version 20.1.4.14 (official build).
Connecting to 192.168.0.106:9000 as user default.
Code: 209. DB::NetException: Timeout exceeded while reading from socket (192.168.0.106:9000)
server log:
2020/02/29 15:39:22.557296 [INFO] Memory InUse: 3.3 MB Alloc: 2.1 MB Sys: 7.2 MB <[email protected]:80>
2020/02/29 15:39:32.558532 [INFO] Memory InUse: 3.3 MB Alloc: 2.1 MB Sys: 7.2 MB <[email protected]:80>
2020/02/29 15:39:42.557965 [INFO] Memory InUse: 3.3 MB Alloc: 2.1 MB Sys: 7.2 MB <[email protected]:80>
2020/02/29 15:39:42.798833 [DEBUG] Receive client hello:&{ClientName:ClickHouse client ClientVersionMajor:20 ClientVersionMinor:1 ClientRevision:54431 Database: User:default Password:} <processHello@tcp_hello.go:49>
2020/02/29 15:39:42.799832 [ERROR] EOF, *errors.errorString <handle@tcp_handler.go:71>
2020/02/29 15:39:52.557393 [INFO] Memory InUse: 3.3 MB Alloc: 2.1 MB Sys: 7.2 MB <[email protected]:80>
2020/02/29 15:40:02.557775 [INFO] Memory InUse: 3.3 MB Alloc: 2.1 MB Sys: 7.2 MB <[email protected]:80>
Syntax:
IF ( <cond>, <expr1>, <expr2> )
Evaluates <cond>, then evaluates <expr1> if the condition is true, or <expr2> otherwise.
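A minimal Go sketch of the IF semantics, in which the condition is evaluated first and only the selected branch is evaluated afterwards. The `Expr` and `Cond` types and the `EvalIf` function are hypothetical names chosen for illustration; they are not VectorSQL's actual expression API.

```go
package main

import "fmt"

// Expr is a hypothetical lazily evaluated expression (not VectorSQL's real type).
type Expr func() (interface{}, error)

// Cond is a hypothetical lazily evaluated boolean condition.
type Cond func() (bool, error)

// EvalIf evaluates cond, then evaluates expr1 if the condition is true,
// or expr2 otherwise. The branch that is not selected is never evaluated.
func EvalIf(cond Cond, expr1, expr2 Expr) (interface{}, error) {
	ok, err := cond()
	if err != nil {
		return nil, err
	}
	if ok {
		return expr1()
	}
	return expr2()
}

func main() {
	v, err := EvalIf(
		func() (bool, error) { return 3 > 2, nil },
		func() (interface{}, error) { return "expr1", nil },
		func() (interface{}, error) { return "expr2", nil },
	)
	fmt.Println(v, err)
}
```

Modelling the branches as closures keeps evaluation lazy, so an error or an expensive computation in the unselected branch has no effect.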
Implement the ClickHouse client insert
Add more checks on the GROUP BY/WHERE clauses in the planners; for example, an aggregate expression inside them is invalid.
MySQL 8.0 CTE:
WITH
cte1 AS (SELECT a, b FROM table1),
cte2 AS (SELECT c, d FROM table2)
SELECT b, d FROM cte1 JOIN cte2
WHERE cte1.a = cte2.c;
https://dev.mysql.com/doc/refman/8.0/en/with.html#common-table-expressions
implement ClickHouse send progress protocol to client.
Implement the http client insert
implement projection
implement merge-sort orderby
add Cost interface to measure the transform performance
implement JOIN
Syntax:
CASE <cond>
WHEN <condval1> THEN <expr1>
[ WHEN <condvalx> THEN <exprx> ] ...
[ ELSE <expr2> ]
END
For example:
SELECT CustomerName, City, Country
FROM Customers
ORDER BY
(CASE
WHEN City IS NULL THEN Country
ELSE City
END);
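The CASE syntax and the ORDER BY example above can be sketched in Go as follows. `WhenClause`, `EvalCase`, `Customer`, and `SortByCityOrCountry` are hypothetical names, and the sample customers are invented for illustration; none of this is taken from the VectorSQL code base.

```go
package main

import (
	"fmt"
	"sort"
)

// WhenClause is a hypothetical WHEN <condval> THEN <expr> pair.
type WhenClause struct {
	CondVal interface{}
	Result  interface{}
}

// EvalCase returns the Result of the first clause whose CondVal equals cond,
// or elseVal when no clause matches (pass nil to model a missing ELSE).
// Values must be comparable with ==; slices or maps would panic here.
func EvalCase(cond interface{}, whens []WhenClause, elseVal interface{}) interface{} {
	for _, w := range whens {
		if w.CondVal == cond {
			return w.Result
		}
	}
	return elseVal
}

// Customer is a hypothetical row; a nil City models SQL NULL.
type Customer struct {
	Name    string
	City    *string
	Country string
}

// orderKey mirrors CASE WHEN City IS NULL THEN Country ELSE City END.
func orderKey(c Customer) string {
	if c.City == nil {
		return c.Country
	}
	return *c.City
}

// SortByCityOrCountry sorts rows by the CASE-derived key, keeping ties stable.
func SortByCityOrCountry(cs []Customer) {
	sort.SliceStable(cs, func(i, j int) bool { return orderKey(cs[i]) < orderKey(cs[j]) })
}

func main() {
	berlin := "Berlin"
	cs := []Customer{{"A", nil, "Spain"}, {"B", &berlin, "Germany"}}
	SortByCityOrCountry(cs)
	fmt.Println(cs[0].Name, cs[1].Name)

	whens := []WhenClause{{1, "one"}, {2, "two"}}
	fmt.Println(EvalCase(2, whens, "other"))
}
```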
implement float64 datatype
implement scalar functions
implement cancel protocol in tcp server:
call ctx.Cancel()
Currently we need to set vectorsql as GOPATH, which is quite weird.
implement window functions
SELECT
EmpName,
DeptName,
SUM(Salary) OVER( PARTITION BY DeptName ) AS SalaryByDept
FROM @employees;
EmpName | DeptName | SalaryByDept |
---|---|---|
Noah | Engineering | 60000 |
Sophia | Engineering | 60000 |
Liam | Engineering | 60000 |
Mason | Executive | 50000 |
Emma | HR | 30000 |
Jacob | HR | 30000 |
Olivia | HR | 30000 |
Ava | Marketing | 25000 |
Ethan | Marketing | 25000 |
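A minimal Go sketch of what `SUM(Salary) OVER (PARTITION BY DeptName)` computes: one pass accumulates a total per partition, and a second pass attaches that total to every input row without collapsing the rows. The `Employee` and `Row` types and the `SumOverPartition` function are hypothetical, and the individual salaries in `main` are invented for illustration, since the table above shows only the per-department totals.

```go
package main

import "fmt"

// Employee is a hypothetical input row.
type Employee struct {
	Name   string
	Dept   string
	Salary int
}

// Row is a hypothetical output row carrying the windowed sum.
type Row struct {
	Name         string
	Dept         string
	SalaryByDept int
}

// SumOverPartition returns one output row per input row, in input order,
// each annotated with the salary total of its department.
func SumOverPartition(emps []Employee) []Row {
	// Pass 1: total salary per partition key.
	totals := make(map[string]int)
	for _, e := range emps {
		totals[e.Dept] += e.Salary
	}
	// Pass 2: attach the partition total to every row.
	out := make([]Row, 0, len(emps))
	for _, e := range emps {
		out = append(out, Row{Name: e.Name, Dept: e.Dept, SalaryByDept: totals[e.Dept]})
	}
	return out
}

func main() {
	// Invented salaries; only the department totals come from the table above.
	emps := []Employee{
		{"Noah", "Engineering", 10000},
		{"Sophia", "Engineering", 20000},
		{"Liam", "Engineering", 30000},
		{"Mason", "Executive", 50000},
	}
	for _, r := range SumOverPartition(emps) {
		fmt.Println(r.Name, r.Dept, r.SalaryByDept)
	}
}
```

Unlike GROUP BY, the window form keeps every employee row, which is why the same department total repeats in the expected result.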