Comments (6)
Another thing I had not thought about before is that the Lamport clock can make read/query/search responses easier to understand when you aggregate results from multiple nodes (Lamport clocks provide value when you look at both writes and reads, not only writes).
from peerbit.
Lamport clocks don't add much value IMO. As you said, when using `nexts` they don't provide any more information. They can also be played with a lot, which, funnily enough, shouldn't actually affect ordering that much, since ordering by `nexts` always comes first. Meaning, whether the clocks are being used correctly or have been poisoned, it should make no difference as long as they are not completely disjoint, and even then the logical clock can be emulated with the merkle one.
I could see situations where the logical clock may be more efficient to depend on, but I don't think the trade-off is worth it. It complicates things and opens up room for shenanigans without offering much, and it gives less reliable ordering than the merkle clock.
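To make the "ordering by `nexts` always comes first" point concrete, here is a minimal sketch, not Peerbit's actual implementation. The `LogEntry` shape and `orderEntries` are hypothetical names, and it assumes `nexts` holds the hashes of an entry's direct predecessors. Causal links decide the order; the self-reported clock is only consulted between entries that are both ready, so a poisoned clock cannot move an entry ahead of its own predecessors.

```typescript
// Hypothetical entry shape for illustration; not Peerbit's real types.
interface LogEntry {
  hash: string;
  nexts: string[]; // hashes of the entries this entry directly follows
  clock: number; // self-reported logical time, may be wrong or poisoned
}

// Topological order over `nexts`; the clock (then the hash) only breaks ties
// between entries whose known predecessors have all been emitted already.
function orderEntries(entries: LogEntry[]): LogEntry[] {
  const byHash = new Map(entries.map((e) => [e.hash, e] as const));
  const pending = new Map<string, number>(); // hash -> unemitted predecessors
  const children = new Map<string, LogEntry[]>();

  for (const e of entries) {
    // Only count predecessors we actually have locally.
    const known = e.nexts.filter((h) => byHash.has(h));
    pending.set(e.hash, known.length);
    for (const h of known) {
      const list = children.get(h) ?? [];
      list.push(e);
      children.set(h, list);
    }
  }

  const tieBreak = (a: LogEntry, b: LogEntry): number =>
    a.clock - b.clock || (a.hash < b.hash ? -1 : a.hash > b.hash ? 1 : 0);

  const ready = entries.filter((e) => pending.get(e.hash) === 0);
  const out: LogEntry[] = [];
  while (ready.length > 0) {
    ready.sort(tieBreak);
    const next = ready.shift()!;
    out.push(next);
    for (const child of children.get(next.hash) ?? []) {
      const left = pending.get(child.hash)! - 1;
      pending.set(child.hash, left);
      if (left === 0) ready.push(child);
    }
  }
  return out;
}
```

With a root `a` (clock 10), a child `b` that follows `a` but claims clock 1, and an unrelated root `c` (clock 5), the result is `c, a, b`: the bad clock on `b` changes nothing about its position relative to `a`.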
The main issue with removing the clock in Peerbit as of now is how to order unrelated documents in a document store if they are not connected to each other in any meaningful way (like through a shared clock). How do we make sure that new entries are submitted with a truthful timestamp?
It sounds like in this situation the database is sharded and updates do not link to updates in other shards (makes sense) but do use a shared clock. Shards in Peerbit sound dynamic, so deterministically sharding/partitioning the database doesn't work, which is probably why cross-shard ordering is needed.
It might be worth investigating randomness beacons like drand, either referencing them directly from shard entries or creating a separate beacon log with ordered drand randomness and referencing that from shards. I haven't looked at drand too much, and maybe it's not needed, but it might provide the ordered randomness needed.
Using a shared logical clock for now doesn't seem that bad either.
I agree with you. Good reasoning.
It sounds like in this situation the database is sharded and updates do not link to updates in other shards (makes sense) but do use a shared clock. Shards in Peerbit sound dynamic, so deterministically sharding/partitioning the database doesn't work, which is probably why cross-shard ordering is needed.
Yes, exactly. In a sharded database the clock is even more ambiguous, since the clock is defined (ish) by the length of the logs, and in a sharded setup a database is represented by multiple logs. If you have multiple shards locally, you could sync them yourself, but that does not help with synchronization across peers. (I had an "exchange clock" routine implemented before so you could synchronize across peers, but it did not play out nicely because it introduces new problems, like what happens if not all peers are connected, what happens if someone provides the wrong time, etc. So I removed it some time ago.)
It might be worth investigating randomness beacons like drand for...
Smart! Will check it out.
My gut feeling as of now is to remove it from the log entry and have it on the database level instead. For example, if you have a social media Post database, you would perhaps have a timestamp on each post. Then it would be the access controller's job to approve or reject new posts depending on whether they have reasonable timestamps (based on previously observed posts and the current time). This would at least let you order things in time somewhat "okay" for real-world applications. In practice, whether or not a Lamport clock exists, you will still need a real-world clock on posts, because a Lamport clock cannot easily be translated into a normal date. Hence, this kind of timestamp checking has to be implemented no matter what.
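As a rough sketch of what such an access-controller check could look like (the names `TimestampPolicy` and `isReasonableTimestamp` and the drift limits are made up for illustration and are not part of Peerbit's API): a post is rejected if it claims to be too far in the future relative to local time, or too far behind the newest post already observed.

```typescript
// Hypothetical policy object; the thresholds are application-specific assumptions.
interface TimestampPolicy {
  maxFutureDriftMs: number; // how far ahead of the local clock a post may be
  maxPastDriftMs: number; // how far behind the newest observed post it may be
}

// Returns true if the candidate timestamp is plausible given what this peer
// has seen so far. All times are milliseconds since the Unix epoch.
function isReasonableTimestamp(
  candidateMs: number,
  newestObservedMs: number | undefined,
  nowMs: number,
  policy: TimestampPolicy
): boolean {
  // Reject posts that claim to come from the future.
  if (candidateMs > nowMs + policy.maxFutureDriftMs) return false;
  // Reject posts that are backdated far behind what has already been seen.
  if (
    newestObservedMs !== undefined &&
    candidateMs < newestObservedMs - policy.maxPastDriftMs
  ) {
    return false;
  }
  return true;
}
```

Note that the result depends on the verifier's own clock and history, so two peers can disagree about the same post. The thresholds trade tolerance for late-arriving posts against room for backdating.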
(On a related note, I previously removed "refs" from the log entry and shared "refs" on the "exchange heads" pubsub level instead, so you are not paying storage for that kind of link and you keep the door open for more dynamic behaviour and changes. The reduced complexity from that change makes me keen on tackling the clock problem.)
it would be the access controller's job to approve or reject new posts depending on whether new posts have reasonable timestamps (based on previously observed posts and the current time).
I don't usually like the idea of an update being valid or invalid based on its predecessors, because it is not easy to verify its correctness. It's a similar problem to verifying that the logical clock has been set correctly, which requires traversal all the way to the root, and that isn't possible if entries have been lost.
I wonder what the best option is for handling an incorrectly set timestamp and its successors. You can either add the successors until you find the entry with the invalid timestamp and then remove them, or only add entries after verifying the correctness of their clocks.
I agree with you. You have valid points and have changed my mind. Since yesterday I have been deep diving into the clock issue and actually found that a lot of distributed DBs use an HLC (hybrid logical clock), which keeps track of both physical and logical time in the same clock instance. The implementation seems to be fairly easy. Here is a good article about the subject
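For reference, the usual HLC update rules fit in a few lines. This is a minimal sketch of the textbook algorithm, not the code that ended up in Peerbit. The class and function names are made up, and plain `number` milliseconds are used for simplicity where a real implementation might use a 64-bit integer.

```typescript
interface HlcTimestamp {
  wallTime: number; // physical part, milliseconds since the Unix epoch
  logical: number; // counter that breaks ties within the same wallTime
}

class HybridLogicalClock {
  private last: HlcTimestamp = { wallTime: 0, logical: 0 };
  private readonly now: () => number;

  // `now` is injectable so the clock can be driven by a fake time source.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Call when creating a local entry or sending a message.
  tick(): HlcTimestamp {
    const physical = this.now();
    if (physical > this.last.wallTime) {
      this.last = { wallTime: physical, logical: 0 };
    } else {
      // Physical time has not advanced (or went backwards): bump the counter.
      this.last = { wallTime: this.last.wallTime, logical: this.last.logical + 1 };
    }
    return { ...this.last };
  }

  // Call when receiving an entry stamped by another peer.
  receive(remote: HlcTimestamp): HlcTimestamp {
    const physical = this.now();
    const wallTime = Math.max(this.last.wallTime, remote.wallTime, physical);
    let logical = 0;
    if (wallTime === this.last.wallTime && wallTime === remote.wallTime) {
      logical = Math.max(this.last.logical, remote.logical) + 1;
    } else if (wallTime === this.last.wallTime) {
      logical = this.last.logical + 1;
    } else if (wallTime === remote.wallTime) {
      logical = remote.logical + 1;
    }
    this.last = { wallTime, logical };
    return { ...this.last };
  }
}

// Orders by physical time first, then by the logical counter.
function compareHlc(a: HlcTimestamp, b: HlcTimestamp): number {
  return a.wallTime - b.wallTime || a.logical - b.logical;
}
```

Every timestamp the clock hands out is strictly greater than the previous one and than any timestamp it has received, while `wallTime` stays close to real time as long as peers report honest clocks. The untrusted-peer case still needs the kind of bounds checking discussed in this thread.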
I am not sure you have to traverse to the root to verify the clock for every new entry. Since the log is append-only, you only have to verify an entry once. So, if you have previously verified three entries linked in a chain, 1 <- 2 <- 3, and then entry 4 comes in with a next pointing to entry 3, you only have to verify the clock of entry 4 in relation to 3 (and not 3 to 2 again, since that has already been verified)?
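A minimal sketch of that incremental check, assuming a plain numeric clock and hypothetical names (`ClockedEntry`, `IncrementalClockVerifier`): each accepted entry is remembered, so a new entry is compared only against its direct `nexts`.

```typescript
// Hypothetical entry shape for illustration; not Peerbit's real types.
interface ClockedEntry {
  hash: string;
  nexts: string[]; // hashes of direct predecessors
  clock: number;
}

class IncrementalClockVerifier {
  // hash -> clock value of every entry accepted so far
  private verified = new Map<string, number>();

  // Accepts the entry if every direct predecessor has already been accepted
  // and the entry's clock is strictly greater than each of theirs.
  // Entries without nexts are accepted as-is: there is nothing to compare to.
  accept(entry: ClockedEntry): boolean {
    for (const hash of entry.nexts) {
      const parentClock = this.verified.get(hash);
      if (parentClock === undefined || entry.clock <= parentClock) {
        return false;
      }
    }
    this.verified.set(entry.hash, entry.clock);
    return true;
  }
}
```

This only works if entries are fed in causal order and earlier entries are still known. If a predecessor has been lost, `accept` returns false and the caller has to decide whether to fetch it or reject the entry.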
However, there is an additional layer of complexity when you cannot trust peers (which is not present for traditional distributed DBs). This is mainly a problem for the first entry of any log (I think), since it is not possible to verify it against previous entries.
Last night I played around with the idea of a simple Peerbit clock program hosted by a cluster of nodes (who are somewhat blindly trusted), just like NTP. The only thing they do is sign clock/time messages as a service, so that entries without any nexts can get a somewhat OK physical time in the HLC.
Another solution could perhaps be to have an ACL for the first entry that compares its timestamp against local time and allows some error margin.
Physical time is now implemented with a hybrid logical clock on commits, and you can use a ClockService to verify that a root (or any) commit has a timestamp that is accurate to within an error threshold.
For the document store, you can now search for documents that were created or edited on a particular date:
```typescript
it("modified between", async () => {
    let response: Results<Document> = undefined as any;
    const allDocs = writeStore.docs.store.oplog.values;
    await stores[1].docs.index.query(
        new DocumentQueryRequest({
            queries: [
                new ModifiedAtQuery({
                    modified: [
                        // Lower bound (inclusive): wall time of the second entry
                        new U64Compare({
                            compare: Compare.GreaterOrEqual,
                            value: allDocs[1].metadata.clock.timestamp.wallTime,
                        }),
                        // Upper bound (exclusive): wall time of the third entry
                        new U64Compare({
                            compare: Compare.Less,
                            value: allDocs[2].metadata.clock.timestamp.wallTime,
                        }),
                    ],
                }),
            ],
        }),
        (r: Results<Document>) => {
            response = r;
        },
        { waitForAmount: 1 }
    );
    // Only the second entry falls inside the [allDocs[1], allDocs[2]) window
    expect(
        response.results.map((x) => x.context.head)
    ).toContainAllValues([allDocs[1].hash]);
});
```