
Comments (10)

v0y4g3r commented on June 7, 2024

How about providing support for a configurable consistency level: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlConfigConsistency.html

It allows users to trade consistency for performance when they have a large amount of data and each data point is not critical to lose.

In a quorum-write pattern, a configurable consistency level can be easily implemented with a client-side option such as ACK_QUORUM, meaning the client confirms a write as soon as it has received the given number of ACKs. That's exactly how Cassandra and Kafka implement this feature.
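The pattern above can be sketched in a few lines of Python. This is a hypothetical illustration, not GreptimeDB's actual API: ACK_QUORUM semantics are modeled by a `required_acks` parameter, and the replica senders are stand-ins for real network calls.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def quorum_write(replica_senders, entry, required_acks):
    """Send `entry` to every replica; confirm the write as soon as
    `required_acks` of them have ACKed (Cassandra/Kafka-style)."""
    if required_acks > len(replica_senders):
        raise ValueError("required_acks cannot exceed the replica count")
    acks = 0
    with ThreadPoolExecutor(max_workers=len(replica_senders)) as pool:
        futures = [pool.submit(send, entry) for send in replica_senders]
        for fut in as_completed(futures):
            if fut.result():  # True means this replica ACKed the write
                acks += 1
                if acks >= required_acks:
                    return True  # quorum reached; don't wait for the rest
    return False  # too many replicas failed to ACK
```

With `required_acks=1` this trades durability for latency; with `required_acks=len(replica_senders)` it behaves like consistency level ALL.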

from bunshin.

evenyag commented on June 7, 2024

What about putting this into a design-docs directory?

sunng87 commented on June 7, 2024

> How about providing support for a configurable consistency level: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlConfigConsistency.html
>
> It allows users to trade consistency for performance when they have a large amount of data and each data point is not critical to lose.

v0y4g3r commented on June 7, 2024

> What about putting this into a design-docs directory?

Will move it to a design doc when it's finished.

evenyag commented on June 7, 2024

The design looks similar to BookKeeper. I'd recommend adding a reference or related-works section.

> Will move it to a design doc when it's finished.

Maybe you could open a pull request for this (like an RFC), which would be easier to review.

evenyag commented on June 7, 2024

> A sequence of log entries in a segment that are written to the same write quorum is logically called a "chunk".

The naming of "chunk" is a bit confusing alongside "segment".

killme2008 commented on June 7, 2024

> A sequence of log entries in a segment that are written to the same write quorum is logically called a "chunk".
>
> The naming of "chunk" is a bit confusing alongside "segment".

Agree. Maybe a diagram describing these concepts would be better.
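One possible reading of the quoted sentence as a small data model (a hypothetical sketch, not the actual design): a segment is the physical log unit, and whenever the write quorum changes, subsequent entries start a new chunk, so a segment is a sequence of chunks, each bound to exactly one write quorum.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    quorum: tuple                          # node ids forming one write quorum
    entries: list = field(default_factory=list)

@dataclass
class Segment:
    chunks: list = field(default_factory=list)

    def append(self, entry, quorum):
        # A quorum change (e.g. after a node failure) starts a new chunk.
        q = tuple(quorum)
        if not self.chunks or self.chunks[-1].quorum != q:
            self.chunks.append(Chunk(quorum=q))
        self.chunks[-1].entries.append(entry)
```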

v0y4g3r commented on June 7, 2024

> The design looks similar to BookKeeper. I'd recommend adding a reference or related-works section.
>
> Will move it to a design doc when it's finished.
>
> Maybe you could open a pull request for this (like an RFC), which would be easier to review.

Quorum systems all look much the same, and we have to somehow simplify BookKeeper's design, since its operation and maintenance bring a lot of pain in real-world production environments.

tisonkun commented on June 7, 2024

If we build such a quorum read/write system on the cloud, it will require a StatefulSet to store multiple replicas. That would cause a dependency on PVs or similar things.

Instead, projects like AutoMQ build the WAL over single-AZ EBS (and Azure Storage perhaps supports cross-region replication out of the box).

I suspect a BookKeeper-like system that does not leverage cloud storage introduces unnecessary complexity.

What if we read/write a block service on the cloud as the WAL? IIRC we can tolerate some data loss in the WAL. And AWS EBS is 99.999% reliable; maybe even better than operating a multi-server quorum system with StatefulSets and PVs.

WDYT?
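A minimal sketch of this alternative, assuming a plain file on an EBS-backed volume stands in for the block service (class and function names here are illustrative): the WAL becomes a length-prefixed append-only log whose durability comes from the volume itself rather than from client-side replication.

```python
import os

class BlockVolumeWal:
    """Append-only WAL on a single cloud block volume."""
    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def append(self, entry: bytes) -> None:
        # Length-prefix each entry so the log can be replayed in order.
        os.write(self.fd, len(entry).to_bytes(4, "big") + entry)
        os.fsync(self.fd)  # durable on the volume before we ack the write

def replay(path):
    """Read every entry back for recovery."""
    entries = []
    with open(path, "rb") as f:
        while (header := f.read(4)):
            if len(header) < 4:
                break  # truncated tail from a crash mid-write
            entries.append(f.read(int.from_bytes(header, "big")))
    return entries
```

The single-writer design sidesteps quorum bookkeeping entirely; the trade-off is that availability and failover are delegated to the cloud volume.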

v0y4g3r commented on June 7, 2024

You're correct. That's why we are still revisiting the design.

> That would cause a dependency on PVs or similar things.

That's not the main concern. Persistent Volumes are just the K8s abstraction over durable storage like EBS.

The problems of BookKeeper-like quorum systems are:

  • First, it mixes cold and hot data in a single storage layer, where they may contend for IO and bandwidth. In fact, WAL systems have one write pattern (tail append) and two read patterns (tail read and catch-up read), and the latter (reading cold data) can tolerate higher latency.
  • Second, quorum systems implicitly replicate data multiple times. At the quorum level, take Qw=3 for example: data has 3 replicas. Within each quorum node, data is stored on EBS, which itself keeps 3 replicas internally. That sums to 9 copies, with no region-failover capability.
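The second point is simple multiplication: with Qw copies at the quorum level, each landing on storage that internally keeps r copies, the effective copy count is Qw × r.

```python
def effective_copies(quorum_writes: int, volume_internal_replicas: int) -> int:
    # Quorum-level replicas multiply with the storage's internal replication.
    return quorum_writes * volume_internal_replicas

print(effective_copies(3, 3))  # -> 9, as in the Qw=3 on EBS example
```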

Whether it's quorum-based or not, we still need a standalone WAL service. It serves not only as the WAL for GreptimeDB components, but also as a channel that streams database events to facilitate features like CDC and incremental materialized views.
