go-dqlite's Introduction

This repository provides the go-dqlite Go package, containing bindings for the dqlite C library and a pure-Go client for the dqlite wire protocol.

Usage

The best way to understand how to use the go-dqlite package is probably to look at the source code of the demo program and use it as an example.

In general your application will use code such as:

dir := "/path/to/data/directory"
address := "1.2.3.4:666" // Unique node address
cluster := []string{...} // Optional list of existing nodes, when starting a new node
app, err := app.New(dir, app.WithAddress(address), app.WithCluster(cluster))
if err != nil {
        // ...
}

db, err := app.Open(context.Background(), "my-database")
if err != nil {
        // ...
}

// db is a *sql.DB object
if _, err := db.Exec("CREATE TABLE my_table (n INT)"); err != nil {
        // ...
}
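
Before serving traffic you'll typically also want to wait for the node to be ready, and on shutdown hand over any leadership and close the application. A minimal sketch using the same app object (app.Ready, app.Handover and app.Close; error handling mostly elided):

// Wait until this node has caught up with the cluster.
if err := app.Ready(context.Background()); err != nil {
        // ...
}

// ... use db ...

// On shutdown, transfer leadership/roles away if needed, then close.
db.Close()
app.Handover(context.Background())
app.Close()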

Build

In order to use the go-dqlite package in your application, you'll need to have the dqlite C library installed on your system, along with its dependencies.

By default, go-dqlite's client module supports storing a cache of the cluster's state in a SQLite database, locally on each cluster member. (This is not to be confused with any SQLite databases that are managed by dqlite.) In order to do this, it imports https://github.com/mattn/go-sqlite3, and so you can use the libsqlite3 build tag to control whether go-sqlite3 links to a system libsqlite3 or builds its own. You can also disable support for SQLite node stores entirely with the nosqlite3 build tag (unique to go-dqlite). If you pass this tag, your application will not link directly to libsqlite3 (but it will still link it indirectly via libdqlite, unless you've dropped the sqlite3.c amalgamation into the dqlite build).
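
For example, you might build your application with one of the following, depending on how you want libsqlite3 handled (the tag names are the ones described above; the package path is your own):

go build -tags libsqlite3 ./...
go build -tags nosqlite3 ./...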

Documentation

The documentation for this package can be found on pkg.go.dev.

Demo

To see dqlite in action, either install the Debian package from the PPA:

sudo add-apt-repository -y ppa:dqlite/dev
sudo apt install dqlite-tools libdqlite-dev

or build the dqlite C library and its dependencies from source, as described here, and then run:

go install -tags libsqlite3 ./cmd/dqlite-demo

from the top-level directory of this repository.

This builds a demo dqlite application, which exposes a simple key/value store over an HTTP API.

Once the dqlite-demo binary is installed (normally under ~/go/bin or /usr/bin/), start three nodes of the demo application:

dqlite-demo --api 127.0.0.1:8001 --db 127.0.0.1:9001 &
dqlite-demo --api 127.0.0.1:8002 --db 127.0.0.1:9002 --join 127.0.0.1:9001 &
dqlite-demo --api 127.0.0.1:8003 --db 127.0.0.1:9003 --join 127.0.0.1:9001 &

The --api flag tells the demo program where to expose its HTTP API.

The --db flag tells the demo program to use the given address for internal database replication.

The --join flag is optional and should be used only for additional nodes after the first one. It informs them about the existing cluster, so they can automatically join it.

Now we can start using the cluster. Let's insert a key/value pair:

curl -X PUT -d my-value http://127.0.0.1:8001/my-key

and then retrieve it from the database:

curl http://127.0.0.1:8001/my-key

Currently the first node is the leader. If we stop it and then try to query the key again, curl will fail, but we can simply change the endpoint to another node and things will work, since an automatic failover has taken place:

kill -TERM %1; curl http://127.0.0.1:8002/my-key

Shell

A basic SQLite-like dqlite shell is available in the dqlite-tools package or can be built with:

go install -tags libsqlite3 ./cmd/dqlite

Usage:
  dqlite -s <servers> <database> [command] [flags]

Example usage against the dqlite-demo cluster started above:

dqlite -s 127.0.0.1:9001 demo

dqlite> SELECT * FROM model;
my-key|my-value

The shell supports normal SQL queries plus the special .cluster and .leader commands to inspect the cluster members and the current leader.
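
For example, against the demo cluster started above, .cluster output looks roughly like this (node IDs will differ):

dqlite> .cluster
2dc171858c3155be|127.0.0.1:9001|voter
7f60f4c8d1fb81f2|127.0.0.1:9002|voter
588d2ca432a7bb09|127.0.0.1:9003|voter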

go-dqlite's People

Contributors

cnnrznn, cole-miller, freeekanayaka, juneezee, ktsakalozos, letfunny, marco6, masnax, mhilton, morphis, nanjj, paulstuart, peterwishart, rabits, simonrichardson, snoe925, stgraber, tomponline, utsav2, vianaddsv, zxilly

go-dqlite's Issues

Invalid memory address or nil pointer dereference within DQLite library

This issue is cloned from https://github.com/lxc/lxd/issues/9908

Required information

  • Distribution: Arch Linux
  • Distribution version: Rolling at 12.02.2022 ( DD.MM.YYYY)
  • The output of "lxc info" or if that fails:

Issue description

lxd -dv results in a crash with internal error:

DBUG[02-12|11:11:31] Connecting to a local LXD over a Unix socket 
DBUG[02-12|11:11:31] Sending request to LXD                   method=GET url=http://unix.socket/1.0 etag=
INFO[02-12|11:11:31] LXD is starting                          version=4.22 mode=normal path=/var/lib/lxd
DBUG[02-12|11:11:31] Unknown backing filesystem type: 0x61756673 
INFO[02-12|11:11:31] Kernel uid/gid map: 
INFO[02-12|11:11:31]  - u 0 0 4294967295 
INFO[02-12|11:11:31]  - g 0 0 4294967295 
INFO[02-12|11:11:31] Configured LXD uid/gid map: 
INFO[02-12|11:11:31]  - u 0 1000000 1000000000 
INFO[02-12|11:11:31]  - g 0 1000000 1000000000 
WARN[02-12|11:11:31] AppArmor support has been disabled because of lack of kernel support 
INFO[02-12|11:11:31] Kernel features: 
INFO[02-12|11:11:31]  - closing multiple file descriptors efficiently: yes 
INFO[02-12|11:11:31]  - netnsid-based network retrieval: yes 
INFO[02-12|11:11:31]  - pidfds: yes 
INFO[02-12|11:11:31]  - core scheduling: yes 
INFO[02-12|11:11:31]  - uevent injection: yes 
INFO[02-12|11:11:31]  - seccomp listener: yes 
INFO[02-12|11:11:31]  - seccomp listener continue syscalls: yes 
INFO[02-12|11:11:31]  - seccomp listener add file descriptors: yes 
INFO[02-12|11:11:31]  - attach to namespaces via pidfds: yes 
INFO[02-12|11:11:31]  - safe native terminal allocation : yes 
INFO[02-12|11:11:31]  - unprivileged file capabilities: yes 
INFO[02-12|11:11:31]  - cgroup layout: hybrid 
WARN[02-12|11:11:31]  - AppArmor support has been disabled, Disabled because of lack of kernel support 
WARN[02-12|11:11:31]  - Couldn't find the CGroup blkio.weight, disk priority will be ignored 
INFO[02-12|11:11:31]  - shiftfs support: no 
INFO[02-12|11:11:31]  - idmapped mounts kernel support: yes 
INFO[02-12|11:11:31] Initializing local database 
DBUG[02-12|11:11:31] Refreshing local trusted certificate cache 
INFO[02-12|11:11:31] Set client certificate to server certificate fingerprint=cdfdcab6239bd6969e1998d4ac9a011468e9da0eac5e1ccc4fefcbc9eba6eae5
DBUG[02-12|11:11:31] Initializing database gateway 
INFO[02-12|11:11:31] Starting database node                   id=1 local=1 role=voter
INFO[02-12|11:11:31] Daemon stopped 
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xd08 pc=0x5565af5f7a4e]

goroutine 1 [running]:
github.com/canonical/go-dqlite/internal/bindings._Cfunc_GoString(...)
	_cgo_gotypes.go:102
github.com/canonical/go-dqlite/internal/bindings.NewNode(0x5565af65f707, {0x5565b09d5198, 0x1}, {0xc000124b40, 0x1c})
	github.com/canonical/[email protected]/internal/bindings/server.go:127 +0x105
github.com/canonical/go-dqlite.New(0x1, {0x5565b09d5198, 0xc000040140}, {0xc000124b40, 0x1c}, {0xc000687090, 0x1, 0xc000687068})
	github.com/canonical/[email protected]/node.go:70 +0xb5
github.com/lxc/lxd/lxd/cluster.(*Gateway).init(0xc00029c380, 0x0)
	github.com/lxc/lxd/lxd/cluster/gateway.go:764 +0x6ca
github.com/lxc/lxd/lxd/cluster.NewGateway({0x5565b0de3bc0, 0xc00040ffc0}, 0xc000318078, 0xc0000f4000, 0xc000401aa0, {0xc000687638, 0x2, 0xc00050f4b0})
	github.com/lxc/lxd/lxd/cluster/gateway.go:65 +0x277
main.(*Daemon).init(0xc00016a000)
	github.com/lxc/lxd/lxd/daemon.go:1020 +0x1a9b
main.(*Daemon).Init(0xc00016a000)
	github.com/lxc/lxd/lxd/daemon.go:745 +0x5f
main.(*cmdDaemon).Run(0xc00040cd98, 0xc00050fc58, {0xc000401a30, 0x0, 0x0})
	github.com/lxc/lxd/lxd/main_daemon.go:78 +0x5b0
github.com/spf13/cobra.(*Command).execute(0xc00044aa00, {0xc00003c070, 0x1, 0x1})
	github.com/spf13/[email protected]/command.go:856 +0x60e
github.com/spf13/cobra.(*Command).ExecuteC(0xc00044aa00)
	github.com/spf13/[email protected]/command.go:974 +0x3bc
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/[email protected]/command.go:902
main.main()
	github.com/lxc/lxd/lxd/main.go:226 +0x1acd

Steps to reproduce

Unknown, on my system it appears to always crash.

Information to attach

  • No apparent relevant dmesg output
  • No container as daemon failed to launch
  • My root filesystem is rebind mounted ( i.e. has filesystem == none )

Output of df -h

Filesystem                          Size  Used Avail Use% Mounted on
dev                                 5.8G     0  5.8G   0% /dev
run                                 5.8G  1.7M  5.8G   1% /run
none                                196G  324M  186G   1% /
/dev/mapper/ssdgroup-pkg            147G   52G   88G  38% /aufs/layers/pkg
/dev/mapper/ssdgroup-config         9.8G   43M  9.3G   1% /aufs/layers/config
/dev/mapper/ssdgroup-runtime        196G  324M  186G   1% /aufs/layers/runtime
/dev/mapper/ssdgroup-games_pkg       98G   32K   93G   1% /aufs/layers/gpkg
/dev/mapper/ssdgroup-games_runtime   98G  1.2G   92G   2% /aufs/layers/gruntime
none                                9.8G   43M  9.3G   1% /aufs/stacks/config
none                                 98G  1.2G   92G   2% /aufs/stacks/gaming
tmpfs                               5.8G     0  5.8G   0% /dev/shm
tmpfs                               4.0M     0  4.0M   0% /sys/fs/cgroup
tmpfs                               5.8G   28K  5.8G   1% /tmp
/dev/sda1                            16G  156M   16G   1% /boot
/dev/mapper/ssdgroup-home           167G  131G   28G  83% /home
tmpfs                               1.2G   40K  1.2G   1% /run/user/1000

Output of various package information

➜  ~ pacman -Q  | grep lx
lxc 1:4.0.12-1
lxcfs 4.0.12-1
lxd 4.22-1
python-lxml 4.7.1-1
➜  ~ pacman -Q | grep ql
dqlite 1.9.1-1
postgresql 13.4-6
postgresql-libs 13.4-6
sqlite 3.37.2-1

record will not be replicated to other clusters with returning clause

refer: https://sqlite.org/lang_returning.html

This can easily be reproduced on the master branch with the test case below.
Prerequisite: go-dqlite linked against sqlite3 >= v3.35.0.

func TestDqliteBasic(t *testing.T) {
	var (
		file1 = "/db1"
		file2 = "/db2"
		file3 = "/db3"

		url1 = "127.0.0.1:2379"
		url2 = "127.0.0.1:2380"
		url3 = "127.0.0.1:2381"

		tempDir string
		err     error

		ctx = context.Background()

		failOnError = func(err error) {
			t.Helper()
			if err != nil {
				t.Fatal(err)
			}
		}
		
	)

	tempDir, err = ioutil.TempDir("", "dqlite-test-*")
	failOnError(err)
	defer func() {
		if t.Failed() {
			t.Logf("db data kept in %s =====\n", tempDir)
		} else {
			os.RemoveAll(tempDir)
		}
	}()

	err = os.MkdirAll(tempDir+file1, 0755)
	failOnError(err)

	err = os.MkdirAll(tempDir+file2, 0755)
	failOnError(err)

	err = os.MkdirAll(tempDir+file3, 0755)
	failOnError(err)

	app1, err := app.New(tempDir+file1, app.WithAddress(url1))
	failOnError(err)
	defer app1.Close()

	db1, err := app1.Open(ctx, "main")
	failOnError(err)


	app2, err := app.New(tempDir+file2, app.WithAddress(url2), app.WithCluster([]string{url2, url1}))
	failOnError(err)
	defer app2.Close()

	db2, err := app2.Open(ctx, "main")
	failOnError(err)


	app3, err := app.New(tempDir+file3, app.WithAddress(url3), app.WithCluster([]string{url2, url1, url3}))
	failOnError(err)
	defer app3.Close()

	db3, err := app3.Open(ctx, "main")
	failOnError(err)

	_, err = db1.Exec("CREATE TABLE my_table (n INT)")
	failOnError(err)

	var d2 int
	// row := db2.QueryRow("INSERT INTO my_table (n) VALUES (10)")   <--  everything works fine without RETURNING
	row := db2.QueryRow("INSERT INTO my_table (n) VALUES (10) RETURNING n")
	err = row.Scan(&d2)
	failOnError(err)
	t.Logf("got from db2: %d", d2)

	row = db3.QueryRow(`SELECT * FROM my_table`)
	var d3 int
	err = row.Scan(&d3)
	failOnError(err)

	t.Logf("got from db3: %d", d3)
}

Mocking conflict with internal packages.

When attempting to use gomock or similar mocking libraries with go-dqlite, the mocking libraries don't interpret type aliases correctly, which causes errors when using them for mocks. In essence, types that alias to an internal package become unmockable.

As a client library, this makes go-dqlite hard to use.

Could you consider putting the types (DialFunc, LogFunc, NodeInfo, NodeStore, NodeRole, etc.) that are aliased into another package that isn't internal? Even if you want to keep the aliases, you could have them point to a package outside of internal.
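
For illustration, the exported names are (roughly) type aliases into the internal protocol package, along these lines (a simplified sketch, not the actual source):

// in package client
type NodeInfo = protocol.NodeInfo // protocol lives under internal/, so mocks
                                  // generated against these types reference a
                                  // package that code outside go-dqlite cannot import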

production-readiness?

I know the README mentions this is currently in beta; may I ask if that is still accurate, or is it production-ready? Thanks.

Test failures in IPv6-only environment

While updating this package in Debian to v1.10.1, I discovered that two tests fail when run in an IPv6-only environment. The code tries to append the port as :9000 to the raw address, but when combined with an IPv6 address, it produces an error:

=== RUN   TestNew_PristineDefault
    app_test.go:25: 
        	Error Trace:	app_test.go:964
        	            				app_test.go:939
        	            				app_test.go:25
        	Error:      	Received unexpected error:
        	            	listen to 2a02:16a8:dc41:100::238:9000: listen tcp: address 2a02:16a8:dc41:100::238:9000: too many colons in address
        	Test:       	TestNew_PristineDefault
--- FAIL: TestNew_PristineDefault (0.00s)
=== RUN   TestOptions
    app_test.go:875: 
        	Error Trace:	app_test.go:964
        	            				app_test.go:939
        	            				app_test.go:875
        	Error:      	Received unexpected error:
        	            	listen to 2a02:16a8:dc41:100::238:9000: listen tcp: address 2a02:16a8:dc41:100::238:9000: too many colons in address
        	Test:       	TestOptions
--- FAIL: TestOptions (0.00s)

error message when trying to build/install the demo app

Hi, I was following the README and trying to build/install the demo app, but I keep getting the following error. Could you please advise on what needs to be done, or what is the most likely thing I am missing? Thanks.

/go/src/github.com/canonical/go-dqlite/internal/bindings/errors.go:33:27: could not determine kind of name for C.SQLITE_IOERR_LEADERSHIP_LOST

Compiling with gccgo?

Hi, I tried to compile the demos (in cmd/) with gccgo, but it fails with the following error:

go1: internal compiler error: in write_hash_function, at go/gofrontend/types.cc:1955
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions

Some minimal working example (on a Dockerfile):

FROM golang:1.14-alpine
WORKDIR /
RUN apk add --no-cache --update gcc-go ca-certificates git dumb-init automake libtool autoconf libuv-dev lz4-dev make gcc musl-dev linux-headers

#### raft and dqlite ####
RUN git clone https://github.com/canonical/raft 
WORKDIR /raft
RUN autoreconf -i && ./configure --enable-uv --prefix=/usr  && make -j4 install
WORKDIR /
RUN apk add sqlite-dev
RUN git clone https://github.com/canonical/dqlite
WORKDIR /dqlite
RUN autoreconf -i && ./configure --prefix=/usr && make -j4 install
WORKDIR /

RUN git clone https://github.com/canonical/go-dqlite
WORKDIR /go-dqlite

ENV CGO_LDFLAGS_ALLOW="-Wl,-z,now"
RUN go build -tags libsqlite3 -compiler gccgo -buildmode=exe -gccgoflags -g ./cmd/dqlite-demo

Is gccgo supported? Or at least, is it compilable?

Support for schema v1 requests with more parameters

I've implemented server-side support for versions of the EXEC, EXEC_SQL, QUERY, and QUERY_SQL requests that can include up to 2^32 - 1 statement parameters in canonical/dqlite#407. We'll want to implement client-side support for these in go-dqlite. The higher-level interface in the client package will need to take into account that older dqlite servers won't handle the new-style requests properly (they will try and presumably fail to parse them in the old format), perhaps with some kind of graceful fallback to the more limited old-style requests.

Expose GO_DQLITE_MULTITHREAD programmatically

Right now setting GO_DQLITE_MULTITHREAD=1 is required in my application because I use SQLite in tandem with dqlite and my SQLite instance is expected to be multithreaded. It's not feasible to have all users set the GO_DQLITE_MULTITHREAD=1 env var, and it's a bit fragile if I do os.Setenv in an init() method because it may run before or after go-dqlite's init().

Ideally it would be nice if singlethread mode were just not set by default, but if that is not feasible, would it be possible to expose a method to set multithread mode that can be called from a consumer's init method?
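
The fragile workaround mentioned above looks roughly like this (a sketch; whether it runs before go-dqlite's own init function is not guaranteed):

func init() {
	// Hope this executes before go-dqlite reads the variable.
	os.Setenv("GO_DQLITE_MULTITHREAD", "1")
}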

Cannot find package "context"

Problem encountered:
go install -tags libsqlite3 ./cmd/dqlite-demo
cmd/dqlite-demo/add.go:4:2: cannot find package "context" in any of:
/usr/lib/go-1.6/src/context (from $GOROOT)
/home/zz/go/src/context (from $GOPATH)
/home/zz/go/src/github.com/canonical/go-dqlite/src/context

Originally posted by @179416326 in #64 (comment)

Cannot store and retrieve a sql.NullTime

Inserting and then selecting a NULL (Valid == false) sql.NullTime somehow loses the nullness.

If you save the following as nulltime.go

// +build ignore

package main

import (
	"context"
	"database/sql"
	"fmt"
	"io/ioutil"
	"os"

	"github.com/canonical/go-dqlite/app"
)

func main() {
	if err := run(context.Background()); err != nil {
		fmt.Fprintf(os.Stderr, "nulltime: %s\n", err)
		os.Exit(1)
	}
	fmt.Println("OK!")
}

func run(ctx context.Context) error {
	// Note this does not remove the temporary storage so that it can
	// be examined later.
	dir, err := ioutil.TempDir("", "dqlite-*")
	if err != nil {
		return err
	}
	app, err := app.New(dir)
	if err != nil {
		return err
	}
	if err := app.Ready(ctx); err != nil {
		return err
	}
	db, err := app.Open(ctx, "test")
	if err != nil {
		return err
	}
	defer db.Close()

	if _, err := db.Exec(`CREATE TABLE test (tm DATETIME)`); err != nil {
		return err
	}
	var t sql.NullTime
	res, err := db.Exec(`INSERT INTO test (tm) VALUES (?)`, t)
	if err != nil {
		return err
	}
	rows, err := res.RowsAffected()
	if err != nil {
		return err
	}
	if rows != 1 {
		return fmt.Errorf("unexpected number of rows updated (got %d, expected 1)", rows)
	}
	row := db.QueryRow(`SELECT tm FROM test LIMIT 1`)
	var t2 sql.NullTime
	if err := row.Scan(&t2); err != nil {
		return err
	}
	if t2.Time != t.Time || t2.Valid != t.Valid {
		return fmt.Errorf("sql.NullTime round-trip failure %#v != %#v", t2, t)
	}
	return nil
}

And then execute it with go run -tags libsqlite3 nulltime.go. You will get the following output showing that the NullTime becomes not-null somewhere in the process:

nulltime: sql.NullTime round-trip failure sql.NullTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Valid:true} != sql.NullTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Valid:false}
exit status 1

dqlite does not error out when -s option provided with non existing cluster file

I ran the dqlite command with the -s option pointing at a cluster YAML file.
However, that cluster YAML does not exist, so I expected the dqlite command to error out.
Instead, it presented the dqlite prompt and the .leader command then hung.

$ ls /home/ubuntu/cluster.yaml
ls: cannot access '/home/ubuntu/cluster.yaml': No such file or directory
$ /home/ubuntu/go/bin/dqlite -s file:///home/ubuntu/cluster.yaml k8s
dqlite> .leader

$

Unable to execute demo, Invalid flag error

I'm trying to execute the demo, but when I get to the command:
go install -tags libsqlite3 ./cmd/dqlite-demo

I got the following error

go build github.com/canonical/go-dqlite/internal/bindings: invalid flag in #cgo LDFLAGS: -Wl,-z,now

After I ran:

export CGO_LDFLAGS_ALLOW="-Wl,-z,now"

The error now is:

internal/bindings/server.go:166:11: could not determine kind of name for C.dqlite_node_set_snapshot_params

for both go build -tags libsqlite3 and go install -tags libsqlite3 ./cmd/dqlite-demo

I'm using a fresh install of go version go1.15.13 linux/amd64 on ubuntu 20.04.

I'm also very inexperienced with Go, so it could also be some simple thing I'm missing.

fatal error: dqlite.h: No such file or directory

$ go get github.com/canonical/go-dqlite

github.com/canonical/go-dqlite/internal/bindings

github.com\canonical\go-dqlite\internal\bindings\server.go:11:20: fatal error: dqlite.h: No such file or directory
compilation terminated.

Error when restarting a bootstrap node that was in a cluster

Restarting a bootstrapping dqlite node fails if it was previously part of a cluster.
Tested on 684b536 with https://pastebin.ubuntu.com/p/N5trdB2jNQ/ applied, on both 18.04 and 20.04 with go snap 1.13.11 and 1.14.3

Steps to reproduce
1/ start two nodes in a cluster
./dqlite-demo --api 127.0.0.1:5001 --db 127.0.0.1:4001
./dqlite-demo --api 127.0.0.1:5002 --db 127.0.0.1:4002 --join 127.0.0.1:4001

2/ stop both nodes
3/ start the bootstrap node again, it will fail to connect and spew out the following error message

Dqlite: WARN: no known leader address=127.0.0.1:4001 attempt=1

Side note: after 30 seconds, two of the same logs appear each second, with different attempt numbers, e.g.:

2020/05/25 22:14:16 Dqlite: WARN: no known leader address=127.0.0.1:4001 attempt=98
2020/05/25 22:14:16 Dqlite: WARN: no known leader address=127.0.0.1:4001 attempt=128
2020/05/25 22:14:17 Dqlite: WARN: no known leader address=127.0.0.1:4001 attempt=99
2020/05/25 22:14:17 Dqlite: WARN: no known leader address=127.0.0.1:4001 attempt=129

Starting the slave node produces the same log, as well as indicating failed to establish network connection: dial tcp 127.0.0.1:4001 (even if we remove --join 127.0.0.1:4001 when starting it).

Observations:

  • Removing dqlite data dir fixes the issue but drops all data. Removing only cluster.yaml doesn't help.
  • Restarting only one node does not fail, both nodes have to restart.

How to start node with multi machine

Starting on machine A works fine:

  • dqlite-demo -a :8001 -d @1.1.1.1:9001 -v

Start at machine B get error

  • dqlite-demo -a :8001 -d @2.2.2.2:9001 -j 1.1.1.1:9001 -v
    unsupported protocol 1, attempt with legacy
  • dqlite-demo -a :8001 -d @2.2.2.2:9001 -j @1.1.1.1:9001
    unsupported protocol 1, attempt with legacy and write handshake: write unix @->@1.1.1.1:9001: write: broken pipe

So what's the correct way to create a node and have it join the cluster?

ld error running "go build" with go-dqlite version > 1.8.0

Hi,

I get the following errors when I run go build -tags libsqlite3 in my project that is using dqlite:

(screenshot of ld errors in the original issue)

Everything built fine in v1.8.0, but I get this error for both v1.9.0 and v1.10.0. I have built and installed the latest version of C-raft and dqlite C library from source following the Readme in the dqlite repo.

Any help getting this working would be greatly appreciated. Thanks!

Issue building lxd on Fedora >=33: undefined reference to `co_swap_function'

I'm maintaining lxd RPMs on Fedora COPR. Unfortunately I have errors when trying to build them for aarch64 (ganto/copr-lxc4#5) and x86_64 (ganto/copr-lxc4#6) on the upcoming Fedora 33.

The error happens when building github.com/lxc/lxd/vendor/github.com/canonical/go-dqlite/internal/bindings where it fails with:

  • aarch64:
TERM='dumb' gcc -I . -fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=$WORK/b187=/tmp/go-build -gno-record-gcc-switches -o $WORK/b187/_cgo_.o $WORK/b187/_cgo_main.o $WORK/b187/_x001.o $WORK/b187/_x002.o $WORK/b187/_x003.o -L/builddir/build/BUILD/lxd-4.5/_dist/deps/sqlite/.libs/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/libco/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/raft/.libs/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/dqlite/.libs/ -Wl,-rpath,/usr/lib64/lxd -lsqlite3 -lraft -lco -ldqlite
# github.com/lxc/lxd/vendor/github.com/canonical/go-dqlite/internal/bindings
/usr/bin/ld: /builddir/build/BUILD/lxd-4.5/_dist/deps/dqlite/.libs//libdqlite.so: undefined reference to `co_swap'
collect2: error: ld returned 1 exit status
  • x86_64:
TERM='dumb' gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=$WORK/b189=/tmp/go-build -gno-record-gcc-switches -o $WORK/b189/_cgo_.o $WORK/b189/_cgo_main.o $WORK/b189/_x001.o $WORK/b189/_x002.o $WORK/b189/_x003.o -L/builddir/build/BUILD/lxd-4.5/_dist/deps/sqlite/.libs/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/libco/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/raft/.libs/ -L/builddir/build/BUILD/lxd-4.5/_dist/deps/dqlite/.libs/ -Wl,-rpath,/usr/lib64/lxd -lsqlite3 -lraft -lco -ldqlite
# github.com/lxc/lxd/vendor/github.com/canonical/go-dqlite/internal/bindings
/usr/bin/ld: /builddir/build/BUILD/lxd-4.5/_dist/deps/dqlite/.libs//libdqlite.so: undefined reference to `co_swap_function'
collect2: error: ld returned 1 exit status

It's building fine for i686. Do you have any pointers as to what's going wrong here?

Is it possible to let it work with hostnames instead of ip addresses?

I'm trying to use dqlite for an application that will live as a container inside Kubernetes as a StatefulSet.
The problem is that Kubernetes, by its nature, doesn't assign fixed IPs to containers, so when an app using dqlite gets restarted its IP changes, which breaks how dqlite tries to reconfigure the cluster, since that logic is based on IPs.

Is this something that could be changed?
Which part has to be modified to make this happen?

I investigated the code just a bit: could we write hostnames in the store files (cluster.yaml, info.yaml) and then resolve them to IPs at runtime?

If someone could direct me to the right places, I can make a pull request on my own.

Thanks

Unable to use hostname as bind address

Starting dqlite using hostname:port fails with this error: failed to set bind address...

What I am trying to do is run dqlite inside a pod using a StatefulSet.

Database not immediately accessible

When using the driver sub-package, the database appears to be inaccessible before the execution of certain unrelated statements.
For example, in the demo, if you create an entry in the database using the update command, then restart the service and query that entry, using the query command, you are able to obtain the previously created value. However, if you remove the section of the query command code that creates the table if it does not already exist (here) and attempt the above scenario, you get an error saying that the table does not exist when you execute the query. The only way for the query to succeed is by first executing certain unrelated statements; such as the one that attempts to create the table, or something like: PRAGMA ignored_pragma

Test failure when only loopback interface available

While updating this package in Debian to v1.10.1, I encountered an issue with defaultAddress() as defined in app/options.go returning an empty string when only the loopback interface is available during a build. That empty string then causes a test failure (output below). I think it would be better if a proper error was returned, rather than silently passing on an invalid address.

I made a quick patch to manually set an actual address for the failing test; there may be a better way to fix this:

diff --git a/app/app_test.go b/app/app_test.go
index f8ef633..b21e2ca 100644
--- a/app/app_test.go
+++ b/app/app_test.go
@@ -855,7 +855,7 @@ func TestRolesAdjustment_ReplaceStandByHonorFailureDomains(t *testing.T) {
 
 // Open a database on a fresh one-node cluster.
 func TestOpen(t *testing.T) {
-	app, cleanup := newApp(t)
+	app, cleanup := newApp(t, app.WithAddress("127.0.0.1:9000"))
 	defer cleanup()
 
 	db, err := app.Open(context.Background(), "test")
=== RUN   TestOpen
    app_test.go:957: 14:28:11.796 - 91: WARN: attempt 0: server : dial: dial tcp: missing address
[snip]
    app_test.go:957: 14:37:11.276 - 91: WARN: attempt 498: server : dial: dial tcp: missing address
panic: test timed out after 10m0s

goroutine 5290 [running]:
testing.(*M).startAlarm.func1()
	/usr/lib/go-1.17/src/testing/testing.go:1788 +0x8e
created by time.goFunc
	/usr/lib/go-1.17/src/time/sleep.go:180 +0x31

goroutine 1 [chan receive, 8 minutes]:
testing.(*T).Run(0xc000103ba0, {0x7c2685, 0x46fb33}, 0x7df688)
	/usr/lib/go-1.17/src/testing/testing.go:1307 +0x375
testing.runTests.func1(0xc00011f620)
	/usr/lib/go-1.17/src/testing/testing.go:1598 +0x6e
testing.tRunner(0xc000103ba0, 0xc00017fd18)
	/usr/lib/go-1.17/src/testing/testing.go:1259 +0x102
testing.runTests(0xc000176100, {0xa40720, 0x19, 0x19}, {0x48870d, 0x7c49a4, 0xa8f280})
	/usr/lib/go-1.17/src/testing/testing.go:1596 +0x43f
testing.(*M).Run(0xc000176100)
	/usr/lib/go-1.17/src/testing/testing.go:1504 +0x51d
main.main()
	_testmain.go:95 +0x14b

goroutine 5313 [sleep]:
time.Sleep(0x3b9aca00)
	/usr/lib/go-1.17/src/runtime/time.go:193 +0x12e
github.com/canonical/go-dqlite/internal/protocol.makeRetryStrategies.func1(0xc000131b18)
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/internal/protocol/connector.go:320 +0x49
github.com/Rican7/retry.shouldAttempt(0x7fb12e6b9a68, {0xc000178008, 0x1, 0xc0007ba018})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/Rican7/retry/retry.go:32 +0x3e
github.com/Rican7/retry.Retry(0xc000131b20, {0xc000178008, 0x1, 0x1})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/Rican7/retry/retry.go:19 +0x92
github.com/canonical/go-dqlite/internal/protocol.(*Connector).Connect(0xc000259c40, {0x842b30, 0xc00013e000})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/internal/protocol/connector.go:74 +0xac
github.com/canonical/go-dqlite/driver.(*Connector).Connect(0xc00021a198, {0x842b30, 0xc00013e000})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/driver/driver.go:256 +0x269
database/sql.(*DB).conn(0xc0002b41a0, {0x842b30, 0xc00013e000}, 0x1)
	/usr/lib/go-1.17/src/database/sql/sql.go:1364 +0x7ac
database/sql.(*DB).PingContext(0xc00001a200, {0x842b30, 0xc00013e000})
	/usr/lib/go-1.17/src/database/sql/sql.go:853 +0x7a
github.com/canonical/go-dqlite/app.(*App).Open(0xc00017c1a0, {0x842b30, 0xc00013e000}, {0x7c18db, 0x474})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:391 +0x9b
github.com/canonical/go-dqlite/app_test.TestOpen(0xc0004f0708)
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app_test.go:861 +0x5d
testing.tRunner(0xc00017c1a0, 0x7df688)
	/usr/lib/go-1.17/src/testing/testing.go:1259 +0x102
created by testing.(*T).Run
	/usr/lib/go-1.17/src/testing/testing.go:1306 +0x35a

goroutine 5321 [IO wait, 8 minutes]:
internal/poll.runtime_pollWait(0x7fb0e46f68a0, 0x72)
	/usr/lib/go-1.17/src/runtime/netpoll.go:234 +0x89
internal/poll.(*pollDesc).wait(0xc0006a6080, 0x62aa7b, 0x0)
	/usr/lib/go-1.17/src/internal/poll/fd_poll_runtime.go:84 +0x32
internal/poll.(*pollDesc).waitRead(...)
	/usr/lib/go-1.17/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0006a6080)
	/usr/lib/go-1.17/src/internal/poll/fd_unix.go:402 +0x22c
net.(*netFD).accept(0xc0006a6080)
	/usr/lib/go-1.17/src/net/fd_unix.go:173 +0x35
net.(*TCPListener).accept(0xc00021a108)
	/usr/lib/go-1.17/src/net/tcpsock_posix.go:140 +0x28
net.(*TCPListener).Accept(0xc00021a108)
	/usr/lib/go-1.17/src/net/tcpsock.go:262 +0x3d
github.com/canonical/go-dqlite/app.(*App).proxy(0xc0002b40d0)
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:423 +0x77
created by github.com/canonical/go-dqlite/app.New
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:239 +0x1766

goroutine 5323 [select, 8 minutes]:
database/sql.(*DB).connectionOpener(0xc0002b41a0, {0x842af8, 0xc000088080})
	/usr/lib/go-1.17/src/database/sql/sql.go:1196 +0x93
created by database/sql.OpenDB
	/usr/lib/go-1.17/src/database/sql/sql.go:794 +0x188

goroutine 5322 [sleep]:
time.Sleep(0x3b9aca00)
	/usr/lib/go-1.17/src/runtime/time.go:193 +0x12e
github.com/canonical/go-dqlite/internal/protocol.makeRetryStrategies.func1(0xc00012fd48)
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/internal/protocol/connector.go:320 +0x49
github.com/Rican7/retry.shouldAttempt(0x7, {0xc0004e8040, 0x1, 0xc00015a108})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/Rican7/retry/retry.go:32 +0x3e
github.com/Rican7/retry.Retry(0xc00012fd50, {0xc0004e8040, 0x1, 0x1})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/Rican7/retry/retry.go:19 +0x92
github.com/canonical/go-dqlite/internal/protocol.(*Connector).Connect(0xc0001fdde8, {0x842af8, 0xc000088040})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/internal/protocol/connector.go:74 +0xac
github.com/canonical/go-dqlite/client.FindLeader({0x842af8, 0xc000088040}, {0x83fe20, 0xc00028e840}, {0xc0000740e0, 0x2, 0xc0002831a0})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/client/leader.go:26 +0x1b8
github.com/canonical/go-dqlite/app.(*App).Leader(0xc0002b40d0, {0x842af8, 0xc000088040})
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:410 +0x50
github.com/canonical/go-dqlite/app.(*App).run(0xc0002b40d0, {0x842af8, 0xc000088040}, 0x6fc23ac00, 0xb8)
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:465 +0x166
created by github.com/canonical/go-dqlite/app.New
	/build/golang-github-canonical-go-dqlite-1.10.1/_build/src/github.com/canonical/go-dqlite/app/app.go:245 +0x18f6
FAIL	github.com/canonical/go-dqlite/app	600.013s

errors are from internal package

From what I see, the errors from the SQL driver are of type bindings.Error. bindings is an internal package, so it is not possible for the caller to cast the error and read the actual error code from SQLite. Right now I need to detect a unique constraint error.
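
In other words, the caller can't write something like the following, because internal/bindings can't be imported from outside the module (illustrative only; the field name is hypothetical):

// does not compile outside go-dqlite: internal package
if derr, ok := err.(bindings.Error); ok {
	// derr.Code would hold the SQLite error code (field name illustrative)
}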

Heads up: github.com/Rican7/retry >= 0.3.0 backwards-incompatible changes

Just a heads up that github.com/Rican7/retry >= 0.3.0 has backwards-incompatible changes that will cause test failures in internal/protocol/connector_test.go. Debian updated that library about a month ago, which has then exposed this issue when running tests for go-dqlite.

It would be great if this library could be updated to use the current version of github.com/Rican7/retry.

Issues with demo

$ ./dqlite-demo start 1 &
[1] 12448
$ ./dqlite-demo start 2 &
[2] 12456
$ ./dqlite-demo start 3 &
[3] 12471
$ ./dqlite-demo add 2
Error: can't add node: failed to receive response: failed to receive header: failed to receive header: read tcp 127.0.0.1:58432->127.0.0.1:9181: i/o timeout
Usage:
  dqlite-demo add <id> [flags]

Flags:
  -a, --address string    address of the node to add (default is 127.0.0.1:918<ID>)
  -c, --cluster strings   addresses of existing cluster nodes (default [127.0.0.1:9181,127.0.0.1:9182,127.0.0.1:9183])
  -h, --help              help for add

$ ./dqlite-demo add 3
$ ./dqlite-demo cluster
ID      Leader  Address
1       true    127.0.0.1:9181
2       false   127.0.0.1:9182
$ ./dqlite-demo add 7889
$ ./dqlite-demo add 788912123
$ ./dqlite-demo add 2
$ ./dqlite-demo add 3
$ ./dqlite-demo cluster
ID      Leader  Address
1       true    127.0.0.1:9181
2       false   127.0.0.1:9182
$ ./dqlite-demo query a
Error: can't create demo table: driver: bad connection
Usage:
  dqlite-demo query <key> [flags]

Flags:
  -c, --cluster strings   addresses of existing cluster nodes (default [127.0.0.1:9181,127.0.0.1:9182,127.0.0.1:9183])
  -h, --help              help for query

Can't get any errors out of it. Not sure what's going on. Ports seem to be open and listening. I don't think add (any number) should just work. Not sure what the logic in the code is there.

Dynamic addresses

Hi, I noticed that the addresses of the nodes are saved in a file. I am experimenting with container deployments and those usually don't come with fixed repeatable IPs. For example, docker compose or Kubernetes stateful set.

I was trying to use DNS as the address of the leader to connect to, but the nodes look up the IP of this hostname and remember it.

The problem is, if the IPs change after a restart or for other reasons, the nodes are not able to start any more.

I am unsure how this could be solved. For now, I am thinking of deleting the files from the data volumes which contain this information before starting the node, but I don't know which file the actual data is in and which files can safely be deleted.

Is there some recommended way to deal with this scenario? I would assume it is not uncommon these days to deploy apps in a Kubernetes cluster, for example.

Tests fail with "leadership transfer failed"

Recently, two tests have begun failing when building this library on a Debian unstable system. What's puzzling is that the version of this library hasn't changed (1.10.2), which points to a change in one of its dependencies, yet I can't find an obvious culprit.

From my attempts to figure out why the tests are failing, it seems to be related to the version of the raft library (v0.9.25 works fine, with errors appearing when using v0.11.2), but the tests had previously passed when the raft version in unstable was 0.11.2+git211119. I thought it might be related to newer versions of dqlite (which hasn't changed in Debian unstable since mid-November) or sqlite3 (v3.37 was just recently uploaded, but tests fail with v3.36 as well).

I'm hoping someone else may have some idea about what's changed that is now causing the tests to fail.

Versions:

  • Debian unstable
  • golang v1.17.6
  • go-dqlite v1.10.2
  • dqlite v1.9.0+git211116
  • raft v0.11.2+git211214
  • sqlite3 v3.37.2
=== RUN   TestHandover_TwoNodes
    app_test.go:958: 22:17:01.858 - 36: DEBUG: new connection from 127.0.0.1:39042
    app_test.go:958: 22:17:01.885 - 36: DEBUG: attempt 0: server 127.0.0.1:9001: connected
    app_test.go:958: 22:17:01.888 - 36: DEBUG: new connection from 127.0.0.1:39044
    app_test.go:958: 22:17:01.024 - 36: DEBUG: new connection from 127.0.0.1:39046
    app_test.go:958: 22:17:01.041 - 37: DEBUG: attempt 0: server 127.0.0.1:9001: connected
    app_test.go:958: 22:17:01.053 - 36: DEBUG: new connection from 127.0.0.1:39050
    app_test.go:958: 22:17:01.053 - 37: DEBUG: new connection from 127.0.0.1:46340
    app_test.go:958: 22:17:01.077 - 36: DEBUG: new connection from 127.0.0.1:39052
    app_test.go:958: 22:17:01.102 - 36: DEBUG: attempt 0: server 127.0.0.1:9001: connected
    app_test.go:958: 22:17:01.102 - 37: DEBUG: new connection from 127.0.0.1:46346
    app_test.go:958: 22:17:01.102 - 36: DEBUG: new connection from 127.0.0.1:39056
    app_test.go:958: 22:17:01.128 - 37: DEBUG: new connection from 127.0.0.1:46350
    app_test.go:958: 22:17:01.153 - 36: DEBUG: new connection from 127.0.0.1:39060
    app_test.go:958: 22:17:01.783 - 36: DEBUG: promoted 127.0.0.1:9002 from spare to voter
    app_test.go:958: 22:17:01.783 - 37: DEBUG: new connection from 127.0.0.1:46354
    app_test.go:958: 22:17:01.783 - 36: DEBUG: new connection from 127.0.0.1:39064
    app_test.go:958: 22:17:01.822 - 36: WARN: transfer leadership to 127.0.0.1:9002: leadership transfer failed (1)
    app_test.go:318: 
        	Error Trace:	app_test.go:318
        	Error:      	Received unexpected error:
        	            	transfer leadership: leadership transfer failed (1)
        	Test:       	TestHandover_TwoNodes
--- FAIL: TestHandover_TwoNodes (1.16s)

=== RUN   TestClient_Transfer
    client_test.go:117: 
        	Error Trace:	client_test.go:117
        	Error:      	Received unexpected error:
        	            	leadership transfer failed (1)
        	Test:       	TestClient_Transfer
--- FAIL: TestClient_Transfer (1.16s)

segmentation fault during node creation

I've been trying to create rust bindings for dqlite, but kept stumbling on issues. I figured I'd try go-dqlite to see if my system was maybe misconfigured.

I tried running the demo app but I immediately got this segfault:

$ go run ./cmd/dqlite-demo/dqlite-demo.go -a 127.0.0.1:5005 -db 127.0.0.1:5006
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]

runtime stack:
runtime.throw(0x9f24b4, 0x2a)
        /usr/lib/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
        /usr/lib/go/src/runtime/signal_unix.go:679 +0x46a

goroutine 1 [syscall]:
runtime.cgocall(0x810328, 0xc0000db8b8, 0xc0000100b8)
        /usr/lib/go/src/runtime/cgocall.go:133 +0x5b fp=0xc0000db888 sp=0xc0000db850 pc=0x40b95b
github.com/canonical/go-dqlite/internal/bindings._Cfunc_dqlite_node_create(0x2dc171858c3155be, 0x126b8a0, 0x126b8c0, 0xc0000100b8, 0xc000000000)
        _cgo_gotypes.go:137 +0x4d fp=0xc0000db8b8 sp=0xc0000db888 pc=0x7869ed
github.com/canonical/go-dqlite/internal/bindings.NewNode.func3(0x2dc171858c3155be, 0x126b8a0, 0x126b8c0, 0xc0000100b8, 0x0)
        /home/jerome/go/src/github.com/canonical/go-dqlite/internal/bindings/server.go:116 +0x89 fp=0xc0000db8f0 sp=0xc0000db8b8 pc=0x788729
github.com/canonical/go-dqlite/internal/bindings.NewNode(0x2dc171858c3155be, 0x7ffe237fc194, 0x1, 0xc000026620, 0x12, 0x0, 0x0, 0x0)
        /home/jerome/go/src/github.com/canonical/go-dqlite/internal/bindings/server.go:116 +0x10e fp=0xc0000db978 sp=0xc0000db8f0 pc=0x78769e
github.com/canonical/go-dqlite.New(0x2dc171858c3155be, 0x7ffe237fc194, 0x1, 0xc000026620, 0x12, 0xc0000dbb60, 0x2, 0x2, 0x100000000000000, 0x0, ...)
        /home/jerome/go/src/github.com/canonical/go-dqlite/node.go:56 +0xbe fp=0xc0000db9e8 sp=0xc0000db978 pc=0x7a634e
github.com/canonical/go-dqlite/app.New(0xc000026620, 0x12, 0xc0000dbce8, 0x2, 0x2, 0x0, 0x8000103, 0x0)
        /home/jerome/go/src/github.com/canonical/go-dqlite/app/app.go:104 +0x42b fp=0xc0000dbc10 sp=0xc0000db9e8 pc=0x7a9a4b
main.main.func1(0xc0000bcb00, 0xc000034480, 0x1, 0x4, 0x0, 0x0)
        /home/jerome/go/src/github.com/canonical/go-dqlite/cmd/dqlite-demo/dqlite-demo.go:37 +0x1b3 fp=0xc0000dbd38 sp=0xc0000dbc10 pc=0x80eb23
github.com/spf13/cobra.(*Command).execute(0xc0000bcb00, 0xc0000200b0, 0x4, 0x4, 0xc0000bcb00, 0xc0000200b0)
        /home/jerome/go/src/github.com/spf13/cobra/command.go:842 +0x453 fp=0xc0000dbe10 sp=0xc0000dbd38 pc=0x7ff403
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000bcb00, 0x9dfa62, 0x2, 0x0)
        /home/jerome/go/src/github.com/spf13/cobra/command.go:950 +0x349 fp=0xc0000dbee8 sp=0xc0000dbe10 pc=0x7fff29
github.com/spf13/cobra.(*Command).Execute(...)
        /home/jerome/go/src/github.com/spf13/cobra/command.go:887
main.main()
        /home/jerome/go/src/github.com/canonical/go-dqlite/cmd/dqlite-demo/dqlite-demo.go:106 +0x384 fp=0xc0000dbf88 sp=0xc0000dbee8 pc=0x80e294
runtime.main()
        /usr/lib/go/src/runtime/proc.go:203 +0x212 fp=0xc0000dbfe0 sp=0xc0000dbf88 pc=0x43dd32
runtime.goexit()
        /usr/lib/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc0000dbfe8 sp=0xc0000dbfe0 pc=0x46b271
exit status 2

I'm running go 1.14.2 on Arch linux. I am using dqlite 1.4.1 and sqlite-replication 3.31.1.4, installed via pacman. I had to set the following environment variables to even get it to run at all:

export CGO_CFLAGS="-I/usr/include/sqlite-replication/"
export CGO_LDFLAGS="-L/usr/lib/sqlite-replication/"
export LD_LIBRARY_PATH=/usr/lib/sqlite-replication/

Am I doing something wrong?

Using returning clause causes errors

Hi, I noticed different types of errors when using the returning clause.

Sometimes I see this:

server: src/vfs.c:1701: vfsFileShmLock: Assertion `wal->n_tx == 0' failed

And sometimes this:

error: another row available

These errors happen with the below code after the app is ready and the DB has been created.

if _, err := db.ExecContext(ctx, "create table if not exists test (id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, value TEXT NOT NULL)"); err != nil {
	return err
}

for i := 0; i < 10; i++ {
	result, err :=  db.ExecContext(ctx, "INSERT INTO test (value) VALUES (?) RETURNING *",  i)
	if err != nil {
		return err
	}
	id, err := result.LastInsertId()
	if err != nil {
		return err
	}
	log.Printf("inserted %d", id)
}

It works OK, when removing the returning clause.

I have installed the c libraries with this script: https://gist.github.com/bluebrown/85e1b39980f50c66682743afe0d8b316.

Insert problems from dqlite shell

Trying to do some inserts into a dqlite database from the dqlite shell, I observed that the insert statement seems to overwrite records, e.g.:

dqlite> CREATE TABLE test (id INT, s TEXT)
dqlite> INSERT INTO test VALUES (1, 'a')
dqlite> INSERT INTO test VALUES (2, 'b')
dqlite> SELECT * FROM test
2|b
dqlite> 

Support for linearizable/quorum reads

Raft as described in the whitepaper guarantees linearizability by (among other things) having leaders respond to read-only requests only after contacting a majority of the cluster. This ensures that leaders who have been deposed can't respond to a read-only request with stale data. Our implementation of raft doesn't support this kind of checked read-only request, and dqlite leaders will answer queries without contacting a majority of the cluster first; as noted in the documentation, this means that queries can return stale data. I think it would be useful for dqlite (and go-dqlite) to provide an API for performing linearizable reads/queries that do check that the current leader hasn't been deposed.

One subtlety: dqlite leaders will run a barrier before performing queries if there are uncommitted entries in the leader's log. In that case, the barrier does the work of contacting a majority of the cluster, and the operation will be linearizable. But it's possible for the leader to have been deposed without having uncommitted entries in its log, so linearizability is not guaranteed in all circumstances.

only linux ?

It seems this uses libuv. So no macOS, no Windows?

Tests remain sensitive to timing when on slower hardware

While working on the Debian packaging for this library, I've found that several of the builds for various architectures fail semi-randomly because the tests are sensitive to timing. The recent pull request #167 makes things much better, but there are still occasional failures when building/running on "slower" hardware like an arm box.

A good way of testing that I've found to reliably expose this issue is to build the library and run its tests on a RaspberryPi 3B that I have available locally (arm64, running Debian bullseye off a micro-SD card). Building v1.10.1 plus that cherry-picked pull request, I will typically see one or maybe two test failures -- it's not always the same test that fails, nor do they seem to fail with equal probability. I haven't taken rigorous notes on the failing tests, but some of the more frequent ones are:

TestHandover_TransferLeadership
TestRolesAdjustment_ReplaceVoter
TestRolesAdjustment_ReplaceVoterHonorFailureDomain
TestRolesAdjustment_ReplaceVoterHonorWeight
TestRolesAdjustment_ReplaceStandByHonorFailureDomains

If there's other information that I can provide to help resolve this issue, just let me know!

more details from dqlite

I enter the dqlite shell with dqlite -s file://xxx -c xxx -k xxxx -f json k8s.
I can input a SELECT command, but I do not know the table names in the database. What should I do to get the table names?
And when I want to quit the shell, what should I do to exit the dqlite prompt?

dqlite> select
Error:  query: incomplete input
dqlite> select *
Error:  query: no tables specified
dqlite> select * from k8s
Error:  query: no such table: k8s
dqlite> .table
Error:  near ".": syntax error
dqlite> .quit
Error:  near ".": syntax error
dqlite> exit
Error:  near "exit": syntax error
dqlite> quit
Error:  near "quit": syntax error
dqlite> .exit
Error:  near ".": syntax error

cluster.yaml can become out of date for killed nodes

The cluster.yaml can become out of date if a node in the cluster is removed in a non-programmatic way or without user interaction. A typical scenario could be an OOM'd node, or a restart that gives us a different IP address. In that case, cluster.yaml will still show the old node even if it's gone away, even after a substantial amount of time has passed.

Having spoken with @MathieuBordere, a possible solution would be to include a last-seen timestamp in cluster.yaml and have the leader run a goroutine that spots when the last-seen timestamp is older than we can work with and then uses client.Remove().

Alternatively, this could be done directly in the app abstraction in the run loop, and remove the nodes after a configurable timeout.
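
A rough sketch of the last-seen pruning idea, run periodically by the leader or from the app run loop (the lastSeen bookkeeping and staleTimeout are hypothetical; client.Remove is the call named above):

for id, seen := range lastSeen {
	if time.Since(seen) > staleTimeout {
		if err := cli.Remove(ctx, id); err != nil {
			log.Printf("failed to remove stale node %d: %v", id, err)
		}
	}
}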

"$GOPATH not set" when running go install -tags libsqlite3.

Hi freeekanayaka, sorry for the disturbance again. I have built dqlite by hand (and also tried apt install), but encountered the following problems. What else do I need to do to set $GOPATH?

zz@zz-virtual-machine:~/canonical/go-dqlite-master$ go version
go version go1.6.2 linux/amd64

zz@zz-virtual-machine:~/canonical/go-dqlite-master$ go install -tags libsqlite3 ./cmd/dqlite-demo
cmd/dqlite-demo/add.go:4:2: cannot find package "context" in any of:
/usr/lib/go-1.6/src/context (from $GOROOT)
($GOPATH not set)
cmd/dqlite-demo/start.go:12:2: cannot find package "github.com/canonical/go-dqlite" in any of:
/usr/lib/go-1.6/src/github.com/canonical/go-dqlite (from $GOROOT)
($GOPATH not set)
cmd/dqlite-demo/add.go:9:2: cannot find package "github.com/canonical/go-dqlite/client" in any of:
/usr/lib/go-1.6/src/github.com/canonical/go-dqlite/client (from $GOROOT)
($GOPATH not set)
cmd/dqlite-demo/benchmark.go:9:2: cannot find package "github.com/canonical/go-dqlite/driver" in any of:
/usr/lib/go-1.6/src/github.com/canonical/go-dqlite/driver (from $GOROOT)
($GOPATH not set)
cmd/dqlite-demo/add.go:10:2: cannot find package "github.com/pkg/errors" in any of:
/usr/lib/go-1.6/src/github.com/pkg/errors (from $GOROOT)
($GOPATH not set)
cmd/dqlite-demo/add.go:11:2: cannot find package "github.com/spf13/cobra" in any of:
/usr/lib/go-1.6/src/github.com/spf13/cobra (from $GOROOT)
($GOPATH not set)

dqlite does not report any leader

Hi there,

I'm giving dqlite a try, and I just applied the following commands with the demo program:

dqlite-demo -v --api 127.0.0.1:8001 --db 127.0.0.1:9001
dqlite-demo -v --api 127.0.0.1:8002 --db 127.0.0.1:9002 --join 127.0.0.1:9001
dqlite-demo -v --api 127.0.0.1:8003 --db 127.0.0.1:9003 --join 127.0.0.1:9001

According to the logs, there is an elected leader:

[...]
2022/09/08 21:56:38 127.0.0.1:8003: DEBUG: attempt 1: server 127.0.0.1:9001: connect to reported leader 127.0.0.1:9002
2022/09/08 21:56:38 127.0.0.1:8003: DEBUG: attempt 1: server 127.0.0.1:9001: connected
[...]

However, when I run .cluster from the dqlite command, like this dqlite -s 127.0.0.1:9001 /tmp/dqlite-demo/127.0.0.1:9002, I get the following output:

dqlite -s 127.0.0.1:9001 /tmp/dqlite-demo/127.0.0.1:9001
dqlite> .cluster
2dc171858c3155be|127.0.0.1:9001|voter
7f60f4c8d1fb81f2|127.0.0.1:9002|voter
588d2ca432a7bb09|127.0.0.1:9003|voter
dqlite> %                            

7f60f4c8d1fb81f2 is not reported as leader, contrary to what I see in the logs.

All of this was noticed after fresh git clone from the master branch.

Did I do anything wrong?

Thanks :)

panic: connection is already registered with serial X

I randomly(?) encounter the following panic with LXD currently:

panic: connection is already registered with serial 15

goroutine 69 [running]:
github.com/CanonicalLtd/go-dqlite/internal/registry.(*Registry).connAdd(0xc4204d6000, 0x7f10ea49f3f0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/internal/registry/conn.go:136 +0x162
github.com/CanonicalLtd/go-dqlite/internal/registry.(*Registry).ConnLeaderAdd(0xc4204d6000, 0xc42120cd4a, 0x6, 0x7f10ea49f3f0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/internal/registry/conn.go:26 +0x39
github.com/CanonicalLtd/go-dqlite.(*cluster).Register(0xc4201a7900, 0x7f10ea49f3f0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/cluster.go:66 +0x47
github.com/CanonicalLtd/go-dqlite/internal/bindings.clusterRegisterCb(0x64, 0x7f10ea49f3f0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/internal/bindings/cluster.go:207 +0x63
github.com/CanonicalLtd/go-dqlite/internal/bindings._cgoexpwrap_c04adbaf2773_clusterRegisterCb(0x64, 0x7f10ea49f3f0)
        _cgo_gotypes.go:911 +0x35
github.com/CanonicalLtd/go-dqlite/internal/bindings._Cfunc_dqlite_server_run(0x7f10e41f0420, 0x7f1000000000)
        _cgo_gotypes.go:397 +0x49
github.com/CanonicalLtd/go-dqlite/internal/bindings.(*Server).Run.func1(0x7f10e41f0420, 0xc420063f90)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/internal/bindings/server.go:136 +0x56
github.com/CanonicalLtd/go-dqlite/internal/bindings.(*Server).Run(0x7f10e41f0420, 0x10324a0, 0x0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/internal/bindings/server.go:136 +0x2f
github.com/CanonicalLtd/go-dqlite.(*Server).run(0xc4204dcfa0)
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/server.go:131 +0x54
created by github.com/CanonicalLtd/go-dqlite.NewServer
        /home/schu/code/go/src/github.com/CanonicalLtd/go-dqlite/server.go:88 +0x2be

My LXD version at the moment is 32ba6b5e14c9c325ed095ba8eb27d270ca6969ec (stable-3.0), built from source.

LXD log before the panic:

[...]
DBUG[08-12|11:17:43] handling                                 ip=@ method=GET url=/1.0/images/44b8e3990ae1b339887b24bfdf5efddfff80f9057bfa16e8f2ecba69bd5b7165                                                     
DBUG[08-12|11:17:43] Database error: &errors.errorString{s:"sql: no rows in result set"}
DBUG[08-12|11:17:43] handling                                 ip=@ method=GET url=/1.0/events
DBUG[08-12|11:17:43] New event listener: db8c929d-905b-4c33-af4d-dc130cce9f39
DBUG[08-12|11:17:43] handling                                 ip=@ method=POST url=/1.0/containers
DBUG[08-12|11:17:43] Responding to container create                                                  
INFO[08-12|11:17:48] Dqlite: closing client                                                          
DBUG[08-12|11:17:48] Database error: &errors.errorString{s:"driver: bad connection"}                                                                                                                               
DBUG[08-12|11:17:48] Retry failed db interaction (driver: bad connection)                          
INFO[08-12|11:17:48] Dqlite: handling new connection (fd=26)                       
INFO[08-12|11:17:53] Dqlite: closing client                                         
DBUG[08-12|11:17:53] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0
attempt=0                                                                                              
DBUG[08-12|11:17:53] Dqlite: connection failed err=no available dqlite leader server found attempt=0                                                                                                               
INFO[08-12|11:17:59] Dqlite: closing client
DBUG[08-12|11:17:59] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0
attempt=1                        
DBUG[08-12|11:17:59] Dqlite: connection failed err=no available dqlite leader server found attempt=1 
INFO[08-12|11:18:04] Dqlite: closing client                                                           
DBUG[08-12|11:18:04] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0
attempt=2                                                                                             
DBUG[08-12|11:18:04] Dqlite: connection failed err=no available dqlite leader server found attempt=2
INFO[08-12|11:18:09] Dqlite: closing client                                         
DBUG[08-12|11:18:09] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0
attempt=3                                                                           
DBUG[08-12|11:18:09] Dqlite: connection failed err=no available dqlite leader server found attempt=3
INFO[08-12|11:18:15] Dqlite: closing client
DBUG[08-12|11:18:15] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0
attempt=4
DBUG[08-12|11:18:15] Dqlite: connection failed err=no available dqlite leader server found attempt=4 
INFO[08-12|11:18:21] Dqlite: closing client 
DBUG[08-12|11:18:21] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0 attempt=5 
DBUG[08-12|11:18:21] Dqlite: connection failed err=no available dqlite leader server found attempt=5 
INFO[08-12|11:18:27] Dqlite: closing client 
DBUG[08-12|11:18:27] Dqlite: server connection failed err=failed to send Leader request: failed to receive response: failed to receive header: failed to receive header: read unix @->@00310: i/o timeout address=0 attempt=6 
DBUG[08-12|11:18:27] Dqlite: connection failed err=no available dqlite leader server found attempt=6 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=25) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=26) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=25) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=26) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=25) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=26) 
INFO[08-12|11:18:28] Dqlite: handling new connection (fd=25) 
INFO[08-12|11:18:28] Dqlite: connected address=0 attempt=7 
INFO[08-12|11:18:33] Dqlite: closing client 
DBUG[08-12|11:18:33] Database error: &errors.errorString{s:"driver: bad connection"} 
DBUG[08-12|11:18:33] Retry failed db interaction (driver: bad connection) 
INFO[08-12|11:18:33] Dqlite: handling new connection (fd=26) 
INFO[08-12|11:18:34] Dqlite: connected address=0 attempt=0 

Outbound connection does not appear to be closed/removed from use when remote side closes it

See https://github.com/lxc/lxd/pull/9891 for more info.

The reproducer is fairly easy:

  1. Launch 3 Ubuntu Focal VMs and install the lxd edge snap (sudo snap install lxd --edge).
  2. Setup a 3 node LXD cluster (sudo lxd init).
  3. Once cluster is up and running, shutdown the leader member cleanly (i.e lxc stop and not lxc stop -f).
  4. On the third member (the one not promoted to leader), check sudo ss -tnp and you should see an outbound LXD 8443 connection to the shutdown member in close-wait state (the logging this PR adds has identified that it is an outbound dqlite and not raft connection).
  5. Wait a few seconds and then try running lxc cluster ls on the non-leader member; it should start giving the error at that point.
Error: failed to begin transaction: call exec-sql (budget 9.999963394s): receive: header: EOF

This doesn't happen 100% of the time, but trying a couple of times should reproduce it, making sure you always test on a non-leader member.

This issue has appeared since we removed the continuous background job for maintaining the LXD event websocket connections (which was causing every member to query the cluster database for members every 1s); somehow, by stopping those continuous queries, we have prevented go-dqlite from detecting/handling a remotely closed connection.

My hunch is that it's an issue in driverError in go-dqlite: somehow it is not handling/detecting the EOF error, and so not returning the driver.ErrBadConn error that would presumably cause the connection to be taken out of use.
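
What the hunch expects would look roughly like this in the driver's error handling (a sketch of the idea, not go-dqlite's actual code):

// If the remote side closed the connection, report driver.ErrBadConn so
// database/sql discards the connection instead of reusing it.
if errors.Is(err, io.EOF) {
	return driver.ErrBadConn
}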
