background

pipeline flow

Bug Report

What version of KipSQL are you using?
latest

What version of Rust are you using?

rustc 1.77.0-nightly

What did you do?

explain select t.a from t1;
            PLAN
----------------------------
 Projection [t.a] [Project]+
   Scan t1 -> [] [SeqScan]
(1 row)

What did you expect to see?

some error message

Keyword splicing abbreviation in `TableCodec`

Feature Request

Use a certain format to simplify key generation
like: TableName + TypeNum + 0 + TypeId + 0 + DataValue
e.g. T0_Index_0_0_0_0000000000000000000 => T020000000000000000000000

Feature Request

Is your feature request related to a problem? Please describe:
currently, we just support like select * from t1 where a=1
we should add order operation for exp.
select a, b from t1 order by a

Describe the feature you'd like:
order by a desc/asc

Feature Request

e.g. select v2 from t where v1 like 'G%';

Insert operations where some columns are null are not supported

Bug Report

What version of KipSQL are you using?
latest

What version of Rust are you using?
nightly
What's the status of the running?

What did you do?
create table t1 (a int, b bool null);

insert into table t1 values (1, null);

like bustub we should support the part of the column is null

What did you expect to see?
insert success
and the value is exp. ' bool(null) '

What did you see instead?
the value is 'NULL'

Feat: Support `TopK` on `Order By ... Limit ...`

Feature Request

When OrderBy .. Limit .. exists, the TopK operator will be used

Tips: TopK should also use index

Feature Request

this issue is #97 sub-issue

support alert table:

ALTER TABLE table ADD column_1 INTERGER;
ALTER TABLE table MODIFY column_1 INTERGER;
ALTER TABLE table CHANGE COLUMN column_1 column_2 INTERGER;
ALTER TABLE table DROP COLUMN column_1;
RENAME table TO table ;

Feature Request

TPC-H:

CREATE TABLE lineitem (
    l_orderkey INT,
    l_partkey INT,
    l_suppkey INT,
    l_linenumber INT,
    l_quantity DECIMAL(15,2),
    l_extendedprice DECIMAL(15,2),
    l_discount DECIMAL(15,2),
    l_tax DECIMAL(15,2),
    l_returnflag CHAR(1),
    l_linestatus CHAR(1),
    l_shipdate DATE,
    l_commitdate DATE,
    l_receiptdate DATE,
    l_shipinstruct CHAR(25),
    l_shipmode CHAR(10),
    l_comment VARCHAR(44),
    PRIMARY KEY (l_orderkey, l_linenumber)
);

Bug Report

What version of KipSQL are you using?

latest

What version of Rust are you using?

1.77.0

What's the status of the running?

What did you do?
When I run a SUBSTRING function that length is larger than the string, the connectionn is reseted.

What did you expect to see?

=> select substring('abc', 1, 10);
---------------------------------
 abc

What did you see instead?

=> select substring('abc', 1, 10);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

I think it's better to handle this by retrieving all the characters after the start index(like PostgreSQL and MySQL)

Feature Request

Is your feature request related to a problem? Please describe:
currently, we just support select a from t1
we should add filter operation for exp.
select * from t1 where a=1

Describe the feature you'd like:
where a>1 a<1 a =1
where c1 > c2
where c1 > c2 * 2

Bug Report

What version of KipSQL are you using?
main

What version of Rust are you using?

What's the status of the running?
build error

https://github.com/KipData/KipSQL/blob/main/src/storage/mod.rs#L151
KipDB new version(0.1.2-alpha.17) CacheError has been removed, we should remove it in KipSql

What did you do?

What did you expect to see?

What did you see instead?

Feature Request

Function

CURRENT_TIMESTAMP

Create Table

PRIMARY KEY (column_1, column_2, ...)
UNIQUE (column_1, column_2, ...)
Options
- Default e.g. CREATE TABLE CUSTOMER (C_ID INTEGER DEFAULT '0' NOT NULL)

Ref: #103

Create Index

CREATE UNIQUE INDEX u_index ON table(column_1, column_2, ...);
Perf Unique Index Choose (options)

Alert Table

ALTER TABLE table ADD column_1 INTERGER; ref: #104
ALTER TABLE table MODIFY column_1 INTERGER;
ALTER TABLE table CHANGE COLUMN column_1 column_2 INTERGER;
ALTER TABLE table DROP COLUMN column_1; ref: #109
RENAME table TO table ;

Ref:

#101

Desc

Desc Table: #102

the predicate order of Filter will make it impossible to push down to Join.

Bug Report

Columns from different tables should be divided as much as possible on both sides so that they can be pushed down normally under the Join.

Bug: In multi-level nested subquery, the field relationships of different tables are wrong.

Bug Report

It seems to project twice on temp table.
And I don't think it can handle more complex situations.

explain SELECT b FROM t1 WHERE a in (SELECT a FROM t2 where (select a from t1));
                               PLAN
-------------------------------------------------------------------
Projection [t1.b] [Project]                                      +
  LeftSemi Join On t1.a = (t2.a) as (_temp_table_2_.a) [HashJoin]+
    Scan t1 -> [a, b] [SeqScan]                                  +
    Projection [(t2.a) as (_temp_table_2_.a)] [Project]          +
      Projection [t2.a] [Project]                                +
        Inner Join Where _temp_table_1_.a [HashJoin]             +
          Scan t2 -> [a] [SeqScan]                               +
          Projection [(t2.a) as (_temp_table_1_.a)] [Project]    +
            Projection [t2.a] [Project]                          +
              Scan t1 -> [] [SeqScan]

I think every query using their own binder context may be a solution, but it may be a big work.

#168 (comment)

Feature Request

Extract KipDB transactions and put them under user control

Invalid:

The Update statement involves two transaction creations:

scan
insert

PR

#82

PANIC occurs during bind `Insert`

Bug Report

What version of KipSQL are you using?
0.0.1-alpha.6

What version of Rust are you using?

❯ rustc -V
rustc 1.76.0-nightly (de686cbc6 2023-12-14)

What's the status of the running?

What did you do?

> 👇👇🏻👇🏼👇🏽👇🏾👇🏿 <
create table blog_1 (id int primary key, title varchar unique, created_at datetime);

+----------------------+
| CREATE TABLE SUCCESS |
+======================+
| blog_1               |
+----------------------+

> 👇👇🏻👇🏼👇🏽👇🏾👇🏿 <
insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
thread 'main' panicked at /home/jojo/code/open-source/kipsql/src/binder/insert.rs:52:52:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

What did you expect to see?
Insert bind fails with error

> 👇👇🏻👇🏼👇🏽👇🏾👇🏿 <
insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
Oops!: bind error: column count not match, expect 2, got 3

What did you see instead?
PANIC occurs when the number of columns does not match the expression during bind Insert.

> 👇👇🏻👇🏼👇🏽👇🏾👇🏿 <
insert into blog_1 (id, title) values (0, 'KipSQL', '2022-10-12 12:22:33');
thread 'main' panicked at /home/jojo/code/open-source/kipsql/src/binder/insert.rs:52:52:
index out of bounds: the len is 2 but the index is 2
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Feature Request

By default, kipsql provides cargo run for testing sql statements. As an embedded database, kipsql needs to perform some maintenance operations. It is more user-friendly to operate with cli instead of coding. Therefore, use the main method of this test directly. Just provide an optional path parameter

e.g. cargo run -path '/home/kould/kipsql_data'

Support `Text` type

Feature Request

Separate Varchar from Text, use a prefix of the same length, and combine it with the suffix to form a complete string.

Tips: The suffix of Varchar is string, and the suffix of Text is file.

Perf: Optimize the loading of `StatisticsMetaLoader`

Feature Request

As mentioned in issue: #162, StatisticsMetaLoader::load takes a considerable amount of time.
Currently, StatisticsMetaLoader loads table statistical information at a granularity. This may happen when a table has only a few indexes that are frequently used, but indexes that are not commonly used are still loaded. We can reduce the granularity and increase the loading speed.

Tips: StatisticsMeta is based on Index as the granularity

pub fn load(&self, table_name: TableName) -> Result<&Vec<StatisticsMeta>, DatabaseError> {
        let option = self.cache.get(&table_name);

        if let Some(statistics_metas) = option {
            Ok(statistics_metas)
        } else {
            let paths = self.tx.statistics_meta_paths(&table_name)?;
            let mut statistics_metas = Vec::with_capacity(paths.len());

            for path in paths {
                statistics_metas.push(StatisticsMeta::from_file(path)?);
            }

            Ok(self
                .cache
                .get_or_insert(table_name, |_| Ok(statistics_metas))?)
        }
    }

Create a new table with an existing table name

table t1 has existed，but we can create another t1 table, that is a bug~

parser

Background

In KipSQL, we use sqlparser-rs to parse SQLs.

Parser will return an AST, which is a tree that represents users' query. For example, the above SQL will produce an AST like...

create table t1 (v1 int not null, v2 int);

AST: CreateTable {
    or_replace: false,
    temporary: false,
    external: false,
    global: None,
    if_not_exists: false,
    transient: false,
    name: ObjectName(
        [
            Ident {
                value: "t1",
                quote_style: None,
            },
        ],
    ),
    columns: [
        ColumnDef {
            name: Ident {
                value: "v1",
                quote_style: None,
            },
            data_type: Int(
                None,
            ),
            collation: None,
            options: [
                ColumnOptionDef {
                    name: None,
                    option: NotNull,
                },
            ],
        },
        ColumnDef {
            name: Ident {
                value: "v2",
                quote_style: None,
            },
            data_type: Int(
                None,
            ),
            collation: None,
            options: [],
        },
    ],
    constraints: [],
    hive_distribution: NONE,
    hive_formats: Some(
        HiveFormat {
            row_format: None,
            storage: None,
            location: None,
        },
    ),
    table_properties: [],
    with_options: [],
    file_format: None,
    location: None,
    query: None,
    without_rowid: false,
    like: None,
    clone: None,
    engine: None,
    default_charset: None,
    collation: None,
    on_commit: None,
    on_cluster: None,
    order_by: None,
}

Tasks

Implemented with traits, it is convenient to implement parser by yourself later

csv storage

Feature Request

Is your feature request related to a problem? Please describe:

for test
support bound and projection

Describe the feature you'd like:

add csv storage

Teachability, Documentation, Adoption, Migration Strategy:

use arrow type

support desc table

Feature Request

we need support desc table to query table structure：

DESC table_name;

I will try impl this

catalog

Background

Build the most basic component of the database: catalog. It provides metadata information for database tables, which is used for subsequent binder and logical planner construction.
At the same time, it is a component that runs through the entire query engine processing flow and provides table metadata information.

To simplify implementation, there is no introduction of the concepts of database and schema here.
Under RootCatalog, a HashMap directly contains all TableCatalogs.
The TableCatalog contains all ColumnCatalogs.

root->table->column

Tasks

root
table
column

Implement SQL Insert

Feature Request

Convert sql to Logical Plan of Insert -> Projection -> Values

The new Values operator is used to convert Rows to BoxedExecutor as a temporary data source

Bug: Only one related transaction should be allowed to exist when making a DDL statement

Feature Request

When DDL exists in multiple transactions, it may cause the data modification to be incorrect but still able to be submitted.(Write conflict detection requires the key to be the same)

`[not] In\[not] Like\is [not] null` in the predicate in Filter needs to be rewritten as an equivalent if range condition

Bug Report

In

c1 In (1, 2, 3) => c1 = 1 or c1 = 2 or c1 = 3

Like

c1 like 'abc%' => c1 > 'abc' and c1 < 'abd'

Is null

c1 is null => c1 == null(logical)

By converting these advanced operators into equality and range queries, CBO can provide more accurate row number estimates.

is [not] null
[not] like
[not] in

RBO rules supplement

Feature Request

max-min-eliminate
- ref: https://docs.pingcap.com/zh/tidb/stable/max-min-eliminate
correlated-subquery-optimization
- ref: https://docs.pingcap.com/zh/tidb/stable/subquery-optimization
predicate-pushdown
- outer -> inner, ref: link
outer_join_eliminate
- ref: https://cn.pingcap.com/blog/tidb-source-code-reading-21/
aggregation_push_down
- ref: https://www.openpie.com/community/blog/1284050105

TODO: SQL 2016

Feature Request

If you want to contribute code to FnckSQL and find requirements, you can refer to the following list

The corresponding test files are attached under the requirements description. The specific location is in the test directory.
e.g. E011_02 => tests/slt/sql_2016/E011_02.slt

Tips: When you complete the requirement and submit the PR, please do not forget to delete the comment corresponding to the Case.

binder

Background

Binder is a module in the entire database system that is very important.
Its function is to bind the AST and Schema information generated after parsing, specifically including:

Checking whether the input name is legal, whether there are duplicates or ambiguities
Inferring the return value type of expressions and checking whether it is legal
Converting input names into internal IDs.

Tasks

In this stage, you should Implement 3 types of statement binding

can not select from exsiting table after rebooting

Bug Report

What version of KipSQL are you using?

What version of Rust are you using?

What's the status of the running?

What did you do?
create table t1 (a int primary key, b int, k int)
insert into t1 (a, b, k) values (-99, 1, 1), (-1, 2, 2), (5, 2, 2)
quit
cargo run
select * from t1

What did you expect to see?
(-99, 1, 1), (-1, 2, 2), (5, 2, 2)

What did you see instead?
Oops!: bind error: invalid table bind table t1

Feature: Support Nested Loop Join

Feature Request

There are currently many features that depend on this

Cross Join
Subquery on where: c1 >= (select 1)

`DataValue::to_index_key` and `DataValue::to_primary_key` abbreviation

Feature Request

When converting most numbers to the corresponding key, you can use type conversion to [u8] to convert to a string to reduce the length of the key while ensuring the bitwise comparison order.

Supports `WASM` for JavaScript compatibility

Feature Request

As a lightweight embedded SQL storage database, KipSQL can enhance support for micro-frontends to expand usage scenarios.

[1] https://developer.mozilla.org/zh-CN/docs/WebAssembly/Rust_to_Wasm

Support type length limit

Feature Request

e.g. Varchar(255)

before: #57

support update operator

Feature Request

UPDATE table_name SET field1=new-value1, field2=new-value2 [WHERE Clause]
logic plan
physics plan
executor
storage using hash replace vector

Support `sub query`

Feature Request

e.g. SELECT * FROM employees WHERE salary > (SELECT salary FROM employees WHERE last_name = 'Abel');

Perf: Composite Index Range Detach optimization

Feature Request

TODO: When the case is select * from t1 where (c1 = 7 or c1 = 10) and c2 < 2;, the range is (-inf, (10, 2)), in fact it is better to be ((7), (7) ,2)), ((10), (10, 2))

on pr: #154 TODO part

Reduce performance issues caused by `Build Plan`

Feature Request

I try to perform this case in a table with 200,000 rows: select * from t1 where c1 > 500 and c1 < 1000, and use pprof-rs to generate a flame graph

The most obvious one is StatisticMetaLoader::load, which will then be cached, and then I generate a flame graph after loading.

In the new flame graph, HepOptimizer::find_best is a relatively large performance bottleneck.

Feature: drop table should support if_exists

Feature Request

Is your feature request related to a problem? Please describe:

execute drop table t, if table t is not exist will get an error, I think need support if_exists param
Oops!: executor error: storage_ap error: kipdb error

Describe the feature you'd like:

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

Invalid field specified when inserting

Bug Report

let _ = kipsql.run("create table t1 (a int, b int)").await?;
let _ = kipsql.run("insert into t1 (b, a) values (1, 1), (3, 3), (5, 4)").await?;

let vec_batch_full_fields = kipsql.run("select * from t1").await?;
print_batches(&vec_batch_full_fields)?;

👇

a	b
1	1
3	3
5	4

Tips: Fix it by directly adding Projection to Insert and Values

add testing framework for kipsql

In order to facilitate testing and ensure the correctness of existing functions, the e2e testing framework sqllogictest-rs is introduced
https://github.com/risinglightdb/sqllogictest-rs

`select * from t1 where a > 0 order by a, b` error on test `test_run_sql`

Bug Report

Fix null failure of insert part

Bug Report

when inserting some null values, the DataType conversion fails.

because AST recognizes null in Insert Values as DataValue::Null, and the data type Null corresponding to KipSQL's DataValue::i32 is represented as i32(None)
therefore, Binder::bind_insert should convert the DataType::Null of expr in the cols again according to the datatype of the corresponding catalog (if the DataType of ColumnCataLog is DataType::i32, then DataValue::Null -> DataValue::i32(None) )

PG Wire support

Feature Request

skip startup step

Simple Query
Extended Query

Question:
KipSQL is currently mainly used as an embedded SQL database. Its similar database may be SQLite, but SQLite does not support remote connections.
Therefore, it may be a bit redundant for KipSQL to support PG Wire?
However, if it is used to support operation and maintenance, it may be a good choice.

Tips:
This is a lower priority issue(I think)，Because there is no suitable answer in the Question part.

[1] http://mysql.taobao.org/monthly/2020/03/02/
[2] https://zhuanlan.zhihu.com/p/493045524

Support `is null`

Feature Request

e.g. select v2 from t where v1 is null;

Roadmap v0.1

Milestone

The sql-query-engine-rs v0.1 milestone is to build a basic SQL query on CSV storage, like:

DDl:

CREATE TABLE student (
    id      INTEGER PRIMARY KEY,
    name    VARCHAR NOT NULL,
    age     INTEGER
);

DQL:

SELECT * FROM t
SELECT a, b FROM t
SELECT a, b, 1 FROM t

DML:

INSERT INTO t(a, b) VALUES (1, 10)

Therefore, we should implement the following database components:

Reading and writing CSV

Feature Request

By supporting CSV reading and output, it provides testing, backup and other applications.

kipdata / fncksql Goto Github PK

fncksql's People

Contributors

Stargazers

Watchers

Forkers

fncksql's Issues

background

pipeline flow

Bug Report

Feature Request

Feature Request

Feature Request

Bug Report

Feature Request

Feature Request

Feature Request

Bug Report

Feature Request

Bug Report

Feature Request

Bug Report

Bug Report

Feature Request

Invalid:

PR

Bug Report

Feature Request

Feature Request

Feature Request

Background

Tasks

Feature Request

Feature Request

Background

Tasks

Feature Request

Feature Request

Bug Report

Feature Request

Feature Request

Background

Tasks

Bug Report

Feature Request

Feature Request

Feature Request

Feature Request

Feature Request

Feature Request

Feature Request

Feature Request

Feature Request

Bug Report

Bug Report

Bug Report

Feature Request

Feature Request

Milestone

Feature Request

Recommend Projects

Recommend Topics

Recommend Org