libsql / sqld Goto Github PK

View Code? Open in Web Editor NEW

900.0 17.0 38.0 3.43 MB

LibSQL with extended capabilities like HTTP protocol, replication, and more.

Home Page: https://libsql.org

database rust serverless sqlite

sqld's Introduction

SQLD (archived)

The code in this repository has been moved to tursodatabase/libsql and moved within the libsql-server folder.

To open a new issue please do so here.

sqld's People

Contributors

Stargazers

Watchers

sqld's Issues

SQLite shell support

Let's add support for the sqlite3 shell using the client code we have.

Enable client library to run in browser apps

The only things stopping @libsql/client from running in the browser (in order to access sqld over HTTP) are:

The sqlite3 module used by the Sqlite3Driver code requires nodejs
sqld does not currently accept CORS requests

If the TypeScript code is simply copied into a React project, and modified to remove all references to SQLite, while also installing cross-fetch, it runs to the point of getting a CORS error.

If we make the sqlite driver its own installable module and enable cors on sqld, that should be sufficient to allow browser apps to directly make queries.

Bottomless storage

Integrate with libsql bottomless storage https://github.com/libsql/bottomless so that we can easily suspend and resume sqld database instances.

Replication protocol: bottomless frame replication

In order to offer bottomless storage, and prevent data losses, we need to persist the WAL log.

We will build on top of the work done in bottomless, and use its replication module to replicate the replication WAL instead of the standard libsql WAL.

Instead of writing frames directly to the WAL file, we first write them to the bottomless replication module. When we know for sure the frames have been replicated, we write them to the log file, so they can be replicated to the read replicas. This is to prevent the situation where a crash would happen, and the primary would be restored to a point back in time relative to the read replicas.

Doing this will probably require some generalization changes to be performed to bottomless.

Replication Protocol: Handshake

Currently, the replication protocol does not perform any handshake. A reader connects to a writer and starts asking for a frame the next frame in its log.

When a reader initiates a connection with the primary, it sends a handshake request message. The writer responds to that handshake message with the id of the database, along with the current it's current WAL frame offset. If the reader was not replicating any database yet (fresh replica), it saves the id of the database and starts pulling frames from the primary. If the reader was already replicating a database, it ensures that the database id it received matches that of the database it was replicating and that it's current frame index is less that the frame index sent by the primary. If any of those invariants is broken, the replica should shut down.

client developer experience would be better if we could pass a local file from the start

I am testing this configuration:

  const config = {
    url: process.env.DATABASE_URL
  };
  const db = connect(config);

If DATABASE_URL is empty it works fine but it seems to be in memory.
If I pass a filename, I get:

	throw new TypeError('Only absolute URLs are supported');
		      ^

And if I try a sqlite:// string, I get:

/Users/glaubercosta/sqldblog/node_modules/node-fetch/lib/index.js:1331
		throw new TypeError('Only HTTP(S) protocols are supported');

This forces the user to set up a local sqld for dispatch, but for usage in integration tests, for example, this is less optimal.
If a local file is passed, this should be a no-setup local sqlite instance.

Serverless JavaScript driver

The client API we have is great for cases where you can do LD_PRELOAD, but other databases are adopting serverless JavaScript drivers such as:

https://github.com/planetscale/database-js

https://developers.cloudflare.com/d1/platform/client-api/

Let's implement a JavaScript driver that connects to sqld.

Param binding silently fails in HTTP query to DB replica

The values of query parameters are not bound to the query, when the query is sent to a DB replica.

Version:

version: 0.1.0
commit SHA: 1ff0f63a60c350e50f5d8831f7cc8bcd330f59e6
build date: 2023-01-27

Steps to reproduce:

Run sqld primary on :8080 and a replica on :8081
- sqld --grpc-listen-addr=0.0.0.0:5001 --http-listen-addr=0.0.0.0:8080
- sqld --pg-listen-addr 127.0.0.1:5433 --primary-grpc-url http://127.0.0.1:5001 --http-listen-addr=0.0.0.0:8081 (make sure to run in a different folder or set a different DB file)
Setup the table: curl localhost:8080/ -d '{"statements":["CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"]}'
Insert into main DB:

curl localhost:8080/ -d '{"statements":[{ "q": "INSERT INTO user (name) VALUES(:name);", "params": { ":name": "main" } }]}' 
# works!

Fail to insert into replica:

curl localhost:8081/ -d '{"statements":[{ "q": "INSERT INTO user (name) VALUES(:name);", "params": { ":name": "replica" } }]}'
# query fails with error:
# [{"error":{"message":"missing parameters, expected 1 found 0"}}]

Clean sqld database file structure

sqld current file structure is messy. Let's clean it up, and put all of the generated files in a directory managed by sqld.

This will also make cleaning the db easier since only one file will need to be cleaned.

Fedora 38:/usr/include/sqlite3.h:35:10: fatal error: 'stdarg.h' file not found

cargo run failed on fedora 35

[me@fedora01d sqld]$ sudo ./scripts/install-deps.sh
[me@fedora01d sqld]$ cargo run
<snipped>

  cargo:rerun-if-env-changed=LIBSQLITE3_SYS_BUNDLING

  --- stderr
  /usr/include/sqlite3.h:35:10: fatal error: 'stdarg.h' file not found
  thread 'main' panicked at 'could not run bindgen on header /usr/include/sqlite3.h', /home/me/.cargo/registry/src/github.com-1ecc6299db9ec823/libsqlite3-sys-0.                               25.2/build.rs:611:33
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
[me@fedora01d sqld]$ cat /etc/redhat-release
Fedora release 35 (Thirty Five)
[me@fedora01d sqld]$

Where is stdarg.h ?

[me@fedora01d sqld]$ grep stdarg.h /usr/include/sqlite3.h
#include <stdarg.h>     /* Needed for the definition of va_list */
[me@fedora01d sqld]$
[me@fedora01d sqld]$ locate stdarg.h
/usr/include/c++/11/tr1/stdarg.h
/usr/lib/gcc/x86_64-redhat-linux/11/include/cross-stdarg.h
/usr/lib/gcc/x86_64-redhat-linux/11/include/stdarg.h
/usr/lib64/clang/13.0.1/include/stdarg.h
/usr/src/kernels/6.0.12-100.fc35.x86_64/include/linux/stdarg.h
[me@fedora01d sqld]$

Binding params in HTTP queries

It would be interesting to be able to bind parameters in HTTP queries, once #12 lands. I'm not completely sure about the API though.

Currently, there request payload looks like this:

{
    "statements": [/* stmts */],
}

We could for, for example add a params fields to the payload, with variables scoping all the statements in statement, e.g:

{
    "statements": ["select * from users where name = $name"],
    "params": {
        "name": "adhoc",
    }
}

another option could be statement-scoped variables, like so:

{
    "statements": [{ "statement": "select * from users where name = $name", "params": {"name": "adhoc"}}],
}

Invalid/useless query params should yield an error

This issue refers to the changes made in #45.

The following queries silently yield no results because the parameters don't make sense for the queries. Instead, my opinion is that they should yield an error in order to prevent confusion.

In the below case, $2 refers to a positional parameter that doesn't exist. It's unclear what value actually gets bound to the statement.

{
  "statements": [
    {
      "q": "select * from foo where bar = $2",
      "params": [
        "hi"
      ]
    }
  ]
}

In the next case, the valx parameter doesn't get used in the query. The query is also expecting val, which isn't specified. These conditions should both yield an error (the latter moreso than the former).

{
  "statements": [
    {
      "q": "select * from foo where bar = $val",
      "params": {
        "valx": "some-value"
      }
    }
  ]
}

I also question whether these cases (provided in the original issue) make sense to allow due to potential confusion:

        {"q": "select * from users where name = $1", "params": {"name": "adhoc"}}, # object is order sensitive
        {"q": "select * from users where name = ?", "params": {"name": "adhoc"}},

Lastly, specifying a null value for a parameter doesn't work in the case of a SQL where clause, probably because SQL is expecting "x IS NULL" instead of "x = null". If we can catch that, I'm sure developers will appreciate it.

Support parameter binding in the client library

Currently the client library does not support parameter binding and instead requires fully formatted SQL statements as input. Let's support both positional and named parameters as implemented in #45.

HTTP basic authentication support

Add two new command line options, --http-username and --http-password, to enable HTTP basic authentication in the sqld HTTP server for basic access control.

SQL over HTTP

Serverless JavaScript apps, for example, integrate best with HTTP-based things (using the fetch() API). Let's add a SQL over HTTP protocol to sqld, which we can use to implement #4. We should support both single statements and transactional batches as PlanetScale's JavaScript driver does: https://github.com/planetscale/database-js

Example bash client does not handle new batch query results JSON

With batch query rules JSON from #33, the example yields errors like this:

$ ./examples/http/client.sh localhost:8080

sqld> select * from foo

jtbl:  Cannot represent this part of the JSON Object as a table.
       (Could be an Element, an Array, or Null data instead of an Object):
       [ ... JSON here ... ]

Disable buf_redux feature of sqlite3-parser

As you already have the whole SQL query in memory here:

sqld/sqld/src/query_analysis.rs

Line 104 in 4f0368b

let mut parser = Box::new(Parser::new(s.as_bytes()));

you should be able to remove the buf_redux feature here:

sqld/sqld/Cargo.toml

Line 40 in 4f0368b

 sqlite3-parser = { git = "https://github.com/gwenn/lemon-rs.git", rev = "be6d0a96c4424fada2e068b51c25d1fdc9f983f4" } 

sqlite3-parser = { git = "https://github.com/gwenn/lemon-rs.git", rev = "be6d0a96c4424fada2e068b51c25d1fdc9f983f4", default-features = false, features = [
    "YYNOERRORRECOVERY"
] }

Regards

Fix build caching in dockerfile

The docker file uses cargo-chef to cache the building of dependencies and increase repeated docker builds. This seems to be broken now, since every change that I make tot he project seems to retrigger a full compilation in docker.

@libsql/client does not handle new batch query results

The return type of execute() return type needs to be changed to support multiple result sets that come from sqld as a result of #33. Right now it just handles a single ResultSet. Perhaps it should be Promise<ResultSet[]>.

END statement not supported by parser

We use sqlparser-rs to parse and analyze queries. Despite using the SQLite dialect, compatibility with the language seems a bit limited. Support for the END statement seems to be one such instance of bad support.

Maybe we should consider alternatives. I'm considering https://github.com/gwenn/lemon-rs.

In the meantime, we should be returning a parsing error, rather than ignoring it.

sqld docker image does not build and run correctly using libSQL enabled with wasm

When modifying server/build.rs to include the --enable-wasm-runtime for libsql, then running locally with docker-compose up, the server fails with the following messages:

Attaching to sqld-nginx-1, sqld-reader-1, sqld-writer-1
sqld-writer-1  | /bin/sqld: error while loading shared libraries: libsqlite3-3.41.0.so.0: cannot open shared object file: No such file or directory
sqld-writer-1 exited with code 127
sqld-reader-1  | /bin/sqld: error while loading shared libraries: libsqlite3-3.41.0.so.0: cannot open shared object file: No such file or directory
sqld-reader-1 exited with code 127

@psarna was able to work around this by modifying Dockerfile:

# install dependencies
FROM rust:slim-bullseye AS compiler
RUN apt update && apt install -y libclang-dev clang \
    build-essential tcl protobuf-compiler file
WORKDIR /sqld

# build sqld
FROM compiler AS builder
COPY . .
RUN ( cd libsql && ./configure --enable-wasm-runtime --with-pic --enable-releasemode && make )
COPY ./libsql/.libs/lib* /usr/lib/
COPY ./libsql/sqlite3.h /usr/include
RUN cargo build --release -p sqld

# runtime
FROM debian:bullseye-slim
COPY --from=builder /sqld/target/release/sqld /bin/sqld
COPY --from=builder /sqld/libsql/.libs/lib* /usr/lib/
COPY docker-entrypoint.sh /usr/local/bin
ENTRYPOINT ["docker-entrypoint.sh"]

EXPOSE 5000
CMD ["/bin/sqld"]

Clean up top-level error messages (e.g. SQL syntax errors)

Issuing SQL with a syntax error like this:

selectx * from foo

Results in a somewhat confusing error message:

near ID, "Some("selectx")": syntax error at (1, 8)

This could probably be cleaned up a bit and made more helpful.

MySQL protocol? is it hard to say mysql protol?

DBeaver: serverVersion must not be null

Attempting to connect to sqld with DBeaver triggers an error message:

Something unusual has occurred to cause the driver to fail. Please report this exception.
serverVersion must not be null

sqld 0.0.1 installed on macOS via Homebrew.

command:

sqld -d sqld.db -p 127.0.0.1:5432 --http-listen-addr=127.0.0.1:8000

DBeaver 22.3.3.202301211053

Driver: PostgreSQL
Connect by Host: localhost:5432
Local PGclient: 14.6

P.S. darn cool project.

Client library support for transactions

The sqlite client library we're using, better-sqlite3, has a transaction API that blocks the event thread until completion. This is incompatible with our async transaction method that would invoke it. We need some sort of workaround (manual transaction statements with robust error handling?) or a new library that will do transactions asynchronously.

Wrong transaction boundary in JavaScript client `transaction()` in SQLite mode

Let's add BEGIN and COMMIT statements in the statement list executed by transaction() to ensure the same transaction boundary between HTTP and SQLite.

Use more robust fetching in JavaScript driver

Kyle Mathews suggested using a more robust fetch that handlers retries, for example, in the JavaScript driver: https://github.com/sindresorhus/ky-universal

Prepared statement support in client

Currently, we use simple queries in the client. Let's switch to prepared statements, which has the extra benefit of us being able to implement the SQLite API semantics correctly.

Multiple statements should be wrapped in an implicit transaction in sqld

As discussed today, we should add a transaction to the batch of statements.

Replication protocol: Log compaction & snapshots

The replication WAL is an append-only series of WAL frames. Each frame contains the data for a certain page. Since this file can grow indefinitely, and because we can to be able to restore a database from a snapshot we need to be able to compact this file into a snapshot and store it somewhere safe.

The snapshot file will be a deduplicated version of the log file. This snapshot can later be restored as if this was just a simple transaction.

here is a general overview of the process:

Compacting the replication WAL

In order to compact the replication WAL, we decide on a compaction point in the file. This point is a commit boundary so there is no dangling transaction in a snapshot. Once this point is determined, we walk backward in the log while maintaining a set of already-seen pages, and for each frame in the log, we check if we already saw that page. If we did, we move on to the next frame. If we haven't we add the page to the snapshot file, and mark this page as seen. The snapshot header contains the start and end frame index contained in the snapshot. Once the snapshot file is created, it is sent to a safe storage place (likely s3), and then, the replication log can be truncated, and the header start frame index is updated to reflect the truncation.

Restoring from snapshot

Restoring from snapshots can be done in an ad-hoc manner. In an intercepted call to xFrame, we restore the snapshot as if it was a flight of frames, all part of the same transaction. The easiest way is probably to memory map the snapshot file, and build a PageHdr linked list pointing to pages in the mapped memory. Once the snapshot has been successfully applied, it can safely be discarded.

Another method would be to use a buffer and apply frames by batch, only committing to the last batch.

Snapshot chains

The process we have described here deals with partial snapshots. A snapshot only contains pages that were written too between the snapshot start frameno and the snapshot end frameno. It may be necessary to download earlier snapshots to get a complete view. Snapshots must be applied in chronological order. To avoid downloading too many snapshot files to rebuild a consistent view of the database, snapshots can be merged in a similar process as the compaction process.

Optimize single statement execution in JavaScript client

We currently wrap transaction() statements with BEGIN + COMMIT. Unfortunately, this now means as single statement (that would be autocommitted) is also wrapped. Let's optimize the single statement case by avoiding the wrapping.

Wal logging ans snapshotting should be performed even when not in writer mode

The WAL logging is enabled only if the node is in writer mode open for replication.

The issue with this approach is that if a writer node is not in the replication mode cannot be upgraded to be replicated. This is because we don't keep track of the pages already written to the log, nor have a snapshot to bootstrap a replica with.

To fix that, we should maintain a log no matter what, so that a node can always be upgraded.

libsql-client: use secrets instead of regular variables for sensitive data

Cloudflare Workers API gives access to secrets: https://docs.rs/worker/latest/worker/struct.RouteContext.html#method.secret

Let's use it instead of getting the auth variables from regular variables, which are not protected by default.

Client library error handling needs to be improved

The current behavior of execute() that causes an error leads to return values that look like this:

{
  results: { error: 'no such table: x' },
  success: true,
  meta: { duration: 0 }
}

Firstly, success should be false. Secondly, it's confusing (and not typesafe) that the error message is embedded in the results property. I suggest that we define the output such that results only exists if there is an actual result set from a statement that did not error, and a separate optional error property which contains an object that describes the error. Clients can check for errors by checking the success property to understand which of the other properties to consult.

libsql-client(Rust): detect and add https:// to the URL if necessary

To make sure that the client successfully connects, let's check if urls start with https://. If they don't, or start with http:// instead, let's fix the URL in-place.

Replication fails silently if theres a dangling `.wal_index.iku` file

Http: disallow interactive transactions

We don't want interactive (stateful) transactions over http

Remove posgres over websocket

With hrana as the superior protocol to communicate over websocket, we should get rid of the WS protocol for postgres. This simplify postgres code significantly.

Add startup header

It would be useful to have a startup header that summarizes the configuration, version etc

Docker build fails

When I run docker build . the build returns with the following error:

#18 0.475    Compiling sqld v0.1.0 (/sqld/server)
#18 1.731 error: failed to run custom build command for `sqld v0.1.0 (/sqld/server)`
#18 1.731
#18 1.731 Caused by:
#18 1.731   process didn't exit successfully: `/sqld/target/release/build/sqld-2c6ab55d9ddaac33/build-script-build` (exit status: 101)
#18 1.731   --- stderr
#18 1.731   thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', server/build.rs:17:37
#18 1.731   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
------
executor failed running [/bin/sh -c cargo build --release -p sqld]: exit code: 101

System Spec

MacBook Pro Apple M1
Docker version 20.10.21, build baeda1f

Add shellcheck to CI

The shellcheck too is great at catching bugs in shell scripts so let's add it to the CI.

Make SQL console opt-in

We don't want to deploy the SQL console by default so let's make it opt-in.

Identify database with unique ID

When creating a new database, and its associated WAL log, we should give it a unique identifier.

When a new read replica is started, it first asks the writer for metadata about the database that is about to replicate. If a database already exists at the path passed to the read replica, it ensures that is matches the one from the writer. If they don't match, the reader returns an error.

Setup test harness

We now have a lot of different protocols to interact with sqld, all with specificities. On top of that is the distributed setup. This create a lot of surface to test. Let's find a way to test all of those configurations without repeating ourselves too much.

It should be an error to try to run more than one statement with client lib `execute()`

If a statement string is issued with more than one command separated by semicolons, everything but the first command is silently ignored. Instead, I think sqld should reject the query with an error saying that each string can only contain one statement.

Enable client library to work in browsers

Right now, the client library will work in a browser app if the better-sqlite3 dependency is not present. I did this by simply copying the TS source code into a react app and removing all references to the sqlite driver. What would be great is to allow this package to work without modification by making the better-sqlite3 package optional somehow.

RL8: Unknown flag: --experimental_allow_proto3_optional

Not knowing much on protobuf. I am just playing with sqld build for RL8.

The offending code when compiling on RL8

  tonic_build::configure()
        .protoc_arg("--experimental_allow_proto3_optional")
        .compile_with_config(
            config,
            &["proto/wal_log.proto", "proto/proxy.proto"],
            &["proto"],
        )?;

The error message on RL8

  --- stderr
  Error: Custom { kind: Other, error: "protoc failed: Unknown flag: --experimental_allow_proto3_optional\n" }
[me@rocky8t01d server]$ protoc --version
libprotoc 3.5.0
[me@rocky8t01d server]$

protoc version difference

[me@fedora01d sqld]$ protoc --version
libprotoc 3.14.0
[me@fedora01d sqld]$
[me@rocky8t01d sqld]$ protoc --version
libprotoc 3.5.0
[me@rocky8t01d sqld]$

HTTP API ignores all statements but the first

For some reason which I haven't investigated, sending multiple statements does not work as expected, and only the first one is effective, e.g.:

curl -s -X POST -d '{"statements": ["create table testme(a,b,c)", "insert into testme values(1,2,3)"]}' localhost:8000

only creates the table, but doesn't insert anything.

I also tried sending a single string with multiple statements delimited by ;, but it does not work either, which is probably the root cause:

curl -s -X POST -d '{"statements": ["create table testme(a,b,c); insert into testme values(1,2,3)"]}' localhost:8000

OAuth bearer token support for HTTP client authentication

Let's add support for OAuth bearer tokens so people are not forced to use HTTP basic authentication.

SQL over HTTP API tests

We currently have some tests for the JavaScript driver, but let's add tests that use the SQL over HTTP API directly.