Raft as described in the whitepaper guarantees linearizability by (among other things)

Support for linearizable/quorum reads about go-dqlite HOT 6 OPEN

canonical commented on August 17, 2024

Support for linearizable/quorum reads

from go-dqlite.

Comments (6)

cole-miller commented on August 17, 2024

With support for linearizable reads in go-dqlite, we could add this Jepsen test to our current battery:

http://jepsen-io.github.io/jepsen/jepsen.tests.linearizable-register.html

from go-dqlite.

freeekanayaka commented on August 17, 2024

My understanding is that linearizability applies naturally to atomic/single operations (like reading a register). The SQL model actually has transactional/multi operations, where usually the most you can have is serializability.

In particular SQLite in WAL mode uses snapshot isolation. I'm not sure how can you avoid stale reads with snapshot isolation: it's always possible to start a read transaction and read a stale value that was overwritten by a write transaction just after the read transaction started. I don't think there's any raft modification that can avoid that, since it's a property of the SQL/snapshot-isolation mode.

For example, if you use the sqlite3 command line to create a database like:

sqlite3 foo "pragma journal_mode=WAL; begin; create table foo(a text, b int); insert into foo (a,b) values ('a', 1); insert into foo (a,b) values ('b', 2); commit;"

then in 2 separate shells you start 2 sqlite3 shells and run these commands in the following order:

# shell 1
sqlite> begin;
sqlite> select * from foo where a='a';
a|1

this starts a read transaction on shell 1, then:

# shell 2
sqlite> begin;
sqlite> update foo set b=3 where a='b';
sqlite> begin;

this overwrites the value 'b', which transaction 1 still has to read. Now:

# shell 1
sqlite> select * from foo where a='b';
b|2

which is a stale read. This is kind of non-linearizable "by design", in the sense that the snapshot isolation transaction model is serializable but not linearizable (and indeed in the SQL standard there's no isolation mode guarantees linearizability, because it doesn't quite match with multi-operation transactions).

I've used the sqlite3 command line, but with dqlite is the same.

from go-dqlite.

cole-miller commented on August 17, 2024

My understanding is that linearizability applies naturally to atomic/single operations (like reading a register). The SQL model actually has transactional/multi operations, where usually the most you can have is serializability.

But as far as dqlite is concerned, the whole database is one big piece of state, right? Is the inclusion of transactions in the SQL model relevant when all query/exec operations are atomic and opaque to raft (i.e. they might as well be modifying and reading a single register)?

I'm not sure how can you avoid stale reads with snapshot isolation: it's always possible to start a read transaction and read a stale value that was overwritten by a write transaction just after the read transaction started.

I don't think the situation you describe is a stale read, is it? It's consistent with a sequencing of transactions where the read happens before the right, and linearizability (or strict serializability, I guess, since we're talking about transactions) isn't violated because each transaction appears to execute atomically at some point between invocation and return.

from go-dqlite.

freeekanayaka commented on August 17, 2024

My understanding is that linearizability applies naturally to atomic/single operations (like reading a register). The SQL model actually has transactional/multi operations, where usually the most you can have is serializability.

But as far as dqlite is concerned, the whole database is one big piece of state, right? Is the inclusion of transactions in the SQL model relevant when all query/exec operations are atomic and opaque to raft (i.e. they might as well be modifying and reading a single register)?

It is relevant because a dqlite SQL transaction can't be quite mapped to a raft operation, because it's not atomic in terms of client interactions. Typically the client starts the transaction, sends SQL commands, gets back results, possibly sends other SQL commands, until commit. It's only at commit time that we can possibly start a raft operation. However, precisely because they involve multiple client interactions, transactions can interleave. And if a write and read transaction interleave you might get the effect described in the example, without any possibility of resolving it with raft.

I believe the only way would be to allow either N concurrent read transactions OR one single write transaction. At the moment rN ead transactions can proceed concurrently with one single write transactions thanks to the WAL and its snapshot isolation.

Another way to see it, is that although the whole database is one big piece of state as you say, the operations that are run through raft are not the SQL transactions, but rather the result of the SQL transactions (i.e. the WAL pages that get added to the WAL).

Not sure how well I've articulated this, if it's confusing I'm happy to elaborate further.

I'm not sure how can you avoid stale reads with snapshot isolation: it's always possible to start a read transaction and read a stale value that was overwritten by a write transaction just after the read transaction started.

I don't think the situation you describe is a stale read, is it? It's consistent with a sequencing of transactions where the read happens before the right, and linearizability (or strict serializability, I guess, since we're talking about transactions) isn't violated because each transaction appears to execute atomically at some point between invocation and return.

The thing is that the definition of linearizable refers to single objects. If you take just that definition and apply it to the single object "second row of table foo", then this is a stale read because it violates real-time ordering. Likewise if you take the stronger multi-object version of linearizability, called strict serializability, this example is still a stale read since it violates real-time ordering.

As you point, these two transactions can be logically sequenced (i.e. serialized). However real-time ordering is violated, which is the key of linearizability (or actually strict serializability for transactions) vs serializability.

from go-dqlite.

cole-miller commented on August 17, 2024

As you point, these two transactions can be logically sequenced (i.e. serialized). However real-time ordering is violated, which is the key of linearizability (or actually strict serializability for transactions) vs serializability.

Hmm, I'm looking at this part from the jepsen.io page about strict serializability that you linked:

When Herlihy and Wing introduced linearizability, they defined strict serializability in terms of a serializable system which is compatible with real-time order.

A history is serializable if it is equivalent to one in which transactions appear to execute sequentially, i.e., without interleaving. A (partial) precedence order can be defined on non-overlapping pairs of transactions in the obvious way. A history is strictly serializable if the transactions’ order in the sequential history is compatible with their precedence order.

“The obvious way” means that transaction A precedes transaction B if A completes before B begins; this is the real-time constraint from linearizability.

Under that partial precedence order, the two transactions in your example wouldn't even be comparable, hence strict serializability doesn't impose any requirement on the order they should be applied -- right?

from go-dqlite.

freeekanayaka commented on August 17, 2024

As you point, these two transactions can be logically sequenced (i.e. serialized). However real-time ordering is violated, which is the key of linearizability (or actually strict serializability for transactions) vs serializability.

Hmm, I'm looking at this part from the jepsen.io page about strict serializability that you linked:

When Herlihy and Wing introduced linearizability, they defined strict serializability in terms of a serializable system which is compatible with real-time order.

A history is serializable if it is equivalent to one in which transactions appear to execute sequentially, i.e., without interleaving. A (partial) precedence order can be defined on non-overlapping pairs of transactions in the obvious way. A history is strictly serializable if the transactions’ order in the sequential history is compatible with their precedence order.

“The obvious way” means that transaction A precedes transaction B if A completes before B begins; this is the real-time constraint from linearizability.

Under that partial precedence order, the two transactions in your example wouldn't even be comparable, hence strict serializability doesn't impose any requirement on the order they should be applied -- right?

If you call the read transaction in the example "A" and the write transaction "B", you have B completing before A in real-time ordering but the serialization (logical) ordering needs to be that A completes before B (because it does not see the modifications made by B). This violates strict linearizability.

The ordering between these 2 specific transactions can't be partial, because they operate on the same data.

In case you aren't yet convinced, it might be worth writing to jepsen's author Kyle Kingsbury (aphyr at jepsen.io) to confirm or provide more details. Canonical had paid for training/consulting from him back then, so I believe he will be happy to reply. I'm not nearly as expert as him.

from go-dqlite.

Support for linearizable/quorum reads about go-dqlite HOT 6 OPEN

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent