Git Product home page Git Product logo

crdt's Introduction

CRDT

An implementation of ∂-state based Conflict-free Replicated Data Types (CRDT) in the Swift language.

codecov

code coverage chart

Overview

This library implements well-known state-based CRDTs as swift generics, sometimes described as convergent replicated data types (CvRDT). The implementation includes delta-state replication functions, which allows for more compact representations when syncing between collaboration endpoints. The alternative is to replicate the entire state for every sync.

The CRDT API documentation is hosted at the Swift Package Index.

  • G-Counter (grow-only counter)
  • PN-Counter (A positive-negative counter)
  • LWW-Register (last write wins register)
  • G-Set (grow-only set)
  • OR-Set (observed-remove set, with LWW add bias)
  • OR-Map (observed-remove map, with LWW add or update bias)
  • List (causal-tree list)

For more information on CRDTs, the Wikipedia page on CRDTs is quite good. I'd also suggest the website CRDT.tech as a wonderful collection of further resources. The implementations within this library were heavily based on algorithms described in Conflict-free Replicated Data Types by Nuno Preguiça, Carlos Baquero, and Marc Shapiro (2018), and heavily influenced/sourced from the package ReplicatingTypes, created by Drew McCormack, used under license (MIT).

What's Different about this Package

The two most notable change from Drew's code are:

  • consistently exposing the type used to identify the collaboration instance (be that person, process, or machine) as a generic type
  • adding explicit delta-state transfer mechanisms so that you didn't need to transfer the entirety of a CRDT instance to another location in order to merge the data.

Like the ReplicatingTypes package, this package is available under the MIT license for you to use as you like, asking only for recognition that it was sourced.

If your goal is creating local-first software, this implementation is start, but (in my opinion) incomplete to those needs. In particular, there are none of the serialization optimizations included that would reduce the space needed by the instances when serialized in their entirety to be stored. There are also none of the optimizations that other libraries (for example Automerge or Yjs) that improve memory overhead needed to support longer-form collaborative text interactions.

These limitations may change in the future, and contributions are welcome.

Alternative Packages and Libraries

Other Swift implementations of CRDTs:

Two very well established CRDT libraries used for collaborative text editing:

Optimizations

Articles discussing tradeoffs, algorithm details, and performance, specifically for sequence based CRDTs:

Benchmarks

Running the library:

swift run -c release crdt-benchmark library run Benchmarks/results.json --library Benchmarks/Library.json --cycles 5 --mode replace-all
swift run -c release crdt-benchmark library render Benchmarks/results.json --library Benchmarks/Library.json --output Benchmarks

Current Benchmarks

There's also stubbed benchmarks using package-benchmark under the ExternalBenchmarks directory. These additional benchmarks are primarily one-dimensional and DO require that additional libraries are installed (jemalloc) in order for them to operate. If you just want to explore, the .devContainer setting in this repository includes that library - so it's easy to trial this out from within VSCode and Docker. To explore the 1-dimension external benchmarks:

cd ExternalBenchmarks
swift package benchmark

crdt's People

Contributors

finestructure avatar hassila avatar heckj avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

crdt's Issues

(more) performant memory storage

Look into using one of the newer pieces of swift collections, specifically the pending-1.1-release HashTreeCollections, which provides an optimized append-only structure that sounds like it would be lovely for the backing store of operations. (right now, it's just using Set<> which, while functional, might not provide the best operations for scaled up sets of data.

To verify the performance aspects, I think I'd want to bolster the benchmarks applied here, and seriously consider some of the classic CRDT benchmark structures, such as the string-sequence one that Martin Kleppmann (encoded at https://github.com/automerge/automerge-perf)

ORSet and GSet need more replication verification tests

While looking at sizing numbers, I ran across the following delta's and numbers (GSet and ORSet):

let orset_1 = ORSet(actorId: UInt(31), [1, 2, 3, 4])
let orset_2 = ORSet(actorId: UInt(31), [4, 5])

print("ORSet1(4 elements, UInt actorId) = \(orset_1.sizeInBytes())")
print("ORSet2(2 elements, UInt actorId) = \(orset_2.sizeInBytes())")
print("ORSet1's state size: \(orset_1.state.sizeInBytes())")
print("ORSet2's state size: \(orset_1.state.sizeInBytes())")
print("Delta size of Set1 merging into Set2: \(orset_2.delta(orset_1.state).sizeInBytes())")
print("Delta size of Set2 merging into Set1: \(orset_1.delta(orset_2.state).sizeInBytes())")

Delta Numbers for GSet:

GSet1(4 elements, UInt actorId) = 64
GSet2(2 elements, UInt actorId) = 40
GSet1's state size: 48
GSet2's state size: 48
Delta size of Set1 merging into Set2: 24
Delta size of Set2 merging into Set1: 40

Delta Numbers for ORSet:

ORSet1(4 elements, UInt actorId) = 116
ORSet2(2 elements, UInt actorId) = 66
ORSet1's state size: 16
ORSet2's state size: 16
Delta size of Set1 merging into Set2: 0
Delta size of Set2 merging into Set1: 50

The 0 in set1 merging into set2 is a notable issue - means something in the delta/state computation logic is screwed up.
ORSet has fairly complicated state, delta, and diffing logic - so best to work the corners there and verify its all operating as expected.

investigate using benchmark package

Package-benchmark (ref: https://forums.swift.org/t/package-benchmark-0-6-0-released/63103) release 0.6 recently, and didn't exist when I was first starting this. I'm currently using collections-benchmark, but I'd like to investigate using this newer one for the one-dimensional benchmarks that make sense - and I'm insanely curious about the capability that might be available with the histogram built-in - with maybe providing visual insights into distributions of calls.

create an iCloud file sync example showing the use of CRDT

Partially to show how to use the library, but also to work out more kinks - dig around in iCloud/CloudKit and set up an iCloud-based app that synchronizes a file and load/store the CRDT information from that - including reloading and merging on updates, and if that works and doesn't step over each other as two collaborators edit a single, simple document.

support retaining the full history for LWWRegister, ORMap, and ORSet

Currently, the history is optimized out and dropped on delta updates (or more specifically, generating a delta for these types only includes the most recent value - not all the possible changes that have happened since a previous update).

However, if you want to be able to view the state of the CRDT at any arbitrary point in its history, then that additional detail becomes worth maintaining, even thought the overall CRDT size can become notably larger.

The current logic would require a few updates to maintain this history, but it should be pretty directly doable as an initialization option.

Is the constraint on List.T conforming to Comparable necessary?

Apologies if this isn't the right venue to ask the question, but is it really necessary for the elements of List to be comparable? I don't see anywhere that the elements are being compared against one another, and commenting out the constraint doesn't prevent compilation or cause tests to fail.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.