zheguang / resdb Goto Github PK
View Code? Open in Web Editor NEWLicense: GNU General Public License v3.0
License: GNU General Public License v3.0
Virtual node -> physical nodes
"Preferred list"
Vector clock for data reads and writes
D([(S, ctr)...])
where D
is the object, S
is the coordinator node id, ctr
is the counter. There is a list of such node-counter pairs.For now, assume R and W quorum to be 1. So one concurrent example is:
App1
writes on S_1
: D_1([S_1,1])
.App2
writes on S_2
: D_2([S_2,1])
.App1
reads from S_2
: D_2
App1
reconcile D_1
and D_2
App1
updates and writes on S_1
: D_3([S_1,2],[S_2,1])
Two flavors: Rendezvous Hashing and Consistent Hashing.
Without considering failure and replication, take care of node addition and node departure.
Unavailability can be caused by unexpected failure to reach a node, or node taking offline for maintenance.
Hinted handoff for temporary failures:
If a node S just temporarily went offline, replicas that are intended for S is forwarded to S'. S' is chosen from outside the preferred list of N, say the N+1st node. S' then stores a special marker for this S's replica, (S, replica). When it detects that S becomes available again, it sends all S's replicas back to S.
Detection of failures is to use heartbeat ('are you alive?') + gossip ('i just learned that node F has failed').
Another consideration for failure recovery is the efficiency. When a node fails and its data has to be redistributed to other node(s). So one way to characterize efficiency in failure recovery is the load of the receiving node(s). For example, if only one node is to receive the failed node's whole load, then that this receiving node might be overwhelmed, and the requests for objects on this receiving node may experience delay. However, if the load of the failed node is distributed more evenly to more nodes, then we can expect each receiving node only experience a minor delay.
We will understand Dynamo by running an end-to-end example through, focusing on these questions:
Future steps in mind:
Background links:
A key step here is the communication layer. I suggest ZeroMQ.
In the context of Resilient DB, I can see two venues to explore:
(1) Resilient relational database
Problem here is to build a lightweight mechanism to make a standard relational database (e.g. MySQL) resilient to failures.
The idea is to run a bunch of secondaries in a p2p network, similar to Dynamo. The challenge here is to extend Dynamo to relational setting, and consider different deployment issues such as varying resources and failing probabilities.
Reading:
Dynamo: Amazon's highly available key-value store. DeCandia et al. SOSP'07.
(2) Resilient ML
Problem here is to build a tool for private-preserving federated learning that can tolerate failures such as failing devices. This tool is called Secure Aggregation. The challenge is to extend it to handle floating points (duh), and make it more efficient. Another challenge is to extend it to handle different types of aggregations, such as affine transform, counts, etc.
Reading:
Practical Secure Aggregation for Privacy-Preserving Machine Learning. Bonawitz et al., CCS'17.
Consistent hashing:
The simplest explanation I found is the Voldemort Design, and source code, ConsistentRoutingStrategy
Otherwise, for the theoretically-inclined: https://dl.acm.org/doi/pdf/10.1145/258533.258660
Rendevous hashing
An alternative to consistent hashing. Wiki page has a good explanation: https://en.wikipedia.org/wiki/Rendezvous_hashing
Merkle tree
This one is not needed as a first step implementation, but it's a lingering question we had from our last Dynamo discussion.
Again wiki page has a good overview: https://en.wikipedia.org/wiki/Merkle_tree
Cryptographic hash function provides the collision resistance that Merkle tree relies on: https://en.wikipedia.org/wiki/Cryptographic_hash_function
A common crypto hash function, as you might have heard of, is SHA-256: https://en.wikipedia.org/wiki/SHA-2
Todo:
Sync Replication
Async Replication
We do not consider failures just yet. But we can hopefully already see the effect of replication on reducing version divergence.
Slides:
Demo:
Project profile
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.