thesecretlivesofdata's Introduction

The Secret Lives of Data

Understanding what your bits do when you're not looking.

Overview

So often we use databases and servers without really understanding how they work. The way that data flows is critical to performance and reliability.

This project seeks to spread the knowledge of our systems through interactive visualization. If you have a system that you understand and wish to share then please add a GitHub Issue. Data visualization knowledge is not necessary -- just the desire to spread some knowledge.

Visualizations

Below is a list of data visualizations and their associated GitHub issues. Please report any bugs you find or any suggestions you have for making these visualizations more understandable.

  1. Raft: Understandable Distributed Consensus (#1)

  2. Apache Kafka (#4) - Planning

If you have suggestions for new topics, please submit a new GitHub issue.

thesecretlivesofdata's People

Contributors

benbjohnson, jxlwqq

thesecretlivesofdata's Issues

Great initiative - PBFT would be great

Congrats on the idea. It's much easier to understand a concept when you are able to see it.
As a suggestion, I'd love to see the Gossip protocol and the PBFT consensus protocol implemented.

Greetings

Apache Kafka

Overview

Status: Proposal

Apache Kafka is a scalable, distributed publish-subscribe messaging system. Organizations can use it as an event bus to decouple systems from one another.

Sections

  • Basics of Apache Kafka
  • Replication
  • Internals / Zookeeper interaction

Raft Distributed Consensus Protocol

Overview

The Raft distributed consensus protocol allows a collection of processes to maintain consistency even in the face of multiple node failures. The two main tenets of the protocol are leader election and log replication.

This visualization will lay out the problem of distributed consensus followed by a general overview of leader election and log replication. It will then follow up with details of Leader Election using best case (Single Candidate) and worst case (Split Vote) scenarios. Then it will show details of Log Replication using the best case (Network OK) and worst case (Network Partitions) scenarios. Finally, it will conclude with additional resources on where to learn more.

URL

http://thesecretlivesofdata.com/raft/

Frames

- [x] What is Distributed Consensus?
- [x] Overview
  - [x] States (Follower, Candidate, Leader)
- [x] Leader Election
  - [x] Election Timeout
  - [x] Candidacy
  - [x] Leadership & heartbeat timeout.
  - [x] Re-election
  - [x] Split Vote
- [ ] Log Replication
  - [x] Complex state machine example.
  - [x] Commitment rules
  - [x] Network Partitions
  - [x] Client reads.
- [x] Conclusion & Additional Resources

RAFT: Follower's term is not incremented for request_vote RPC

Follower's term is only incremented after it receives a heartbeat.

Here is the relevant section from the spec.

\* Server i receives a RequestVote request from server j with
\* m.mterm <= currentTerm[i].
HandleRequestVoteRequest(i, j, m) ==
    LET logOk == \/ m.mlastLogTerm > LastTerm(log[i])
                 \/ /\ m.mlastLogTerm = LastTerm(log[i])
                    /\ m.mlastLogIndex >= Len(log[i])
        grant == /\ m.mterm = currentTerm[i]
                 /\ logOk
                 /\ votedFor[i] \in {Nil, j}
    IN /\ m.mterm <= currentTerm[i]
       /\ \/ grant  /\ votedFor' = [votedFor EXCEPT ![i] = j]
          \/ ~grant /\ UNCHANGED votedFor
       /\ Reply([mtype        |-> RequestVoteResponse,
                 mterm        |-> currentTerm[i],
                 mvoteGranted |-> grant,
                 \* mlog is used just for the `elections' history variable for
                 \* the proof. It would not exist in a real implementation.
                 mlog         |-> log[i],
                 msource      |-> i,
                 mdest        |-> j],
                 m)
       /\ UNCHANGED <<state, currentTerm, candidateVars, leaderVars, logVars>>

The reply contains the follower's original term.
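
For readers less familiar with TLA+, the quoted handler can be sketched in Python roughly as follows. This is only an illustrative translation under stated assumptions: the `Server` class, its field names, and the dictionary reply are hypothetical, and `last_term` is assumed to return 0 for an empty log. Like the quoted excerpt, it covers only requests whose term is less than or equal to the server's current term; requests with a larger term fall outside this handler's precondition and are not modeled here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    # Hypothetical in-memory stand-in for one server's state in the spec.
    current_term: int = 0
    voted_for: Optional[int] = None  # None plays the role of Nil
    log_terms: List[int] = field(default_factory=list)  # term of each log entry

    def last_term(self) -> int:
        # Assumption: the last term of an empty log is 0.
        return self.log_terms[-1] if self.log_terms else 0


def handle_request_vote(server, candidate_id, m_term, m_last_log_term, m_last_log_index):
    """Sketch of the quoted handler; requires m_term <= server.current_term."""
    if m_term > server.current_term:
        raise ValueError("outside this handler's precondition (m_term <= current_term)")

    # logOk: the candidate's log is at least as up to date as ours.
    log_ok = m_last_log_term > server.last_term() or (
        m_last_log_term == server.last_term()
        and m_last_log_index >= len(server.log_terms)
    )
    # grant: same term, log is ok, and we have not voted for someone else.
    grant = (
        m_term == server.current_term
        and log_ok
        and server.voted_for in (None, candidate_id)
    )
    if grant:
        server.voted_for = candidate_id

    # current_term is left unchanged; the reply carries the server's own term.
    return {"term": server.current_term, "vote_granted": grant}
```

Calling `handle_request_vote(Server(current_term=2, log_terms=[1, 2]), 7, 2, 2, 2)` grants the vote and replies with term 2, which matches the observation that the reply carries the follower's own term.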

WOW! So good

This was fascinating. Such great work. Please do more.

Are there any plans to expand to Paxos and other Distributed Systems?

Hi, I wanted to say that I love the animations here for Raft; they helped me understand it better when implementing it for my Distributed Systems course. I wanted to ask if there are any plans to make animations for Paxos, PBFT, or other distributed systems protocols.

Randomised election timeouts

Hi Ben
Let me start off by thanking you for the amazing Raft visualiser. I watched it after your Go-Time podcast, and it has helped me a great deal now that I have started reading the paper.
Thanks to your work, it was really easy for me to mentally visualise the protocol while reading the paper.

One callout that I have - I noticed that the nodes seem to experience election timeout at fixed (individual) intervals. Was that a conscious choice?

I'm not sure if the paper specifies that behaviour, and it might help with lowering the probability of candidacy collisions even further by choosing a random duration after every timeout

Just wanted to run this by you in case you agree

Thanks!
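
The suggestion above, drawing a fresh random duration every time the timer is reset, can be sketched as follows. The Raft paper describes choosing election timeouts at random from a fixed interval; the 150 to 300 ms range used here is the one quoted in the Questions issue, and the `ElectionTimer` class and its method names are hypothetical, not taken from the visualization's code.

```python
import random

# Assumed interval, matching the 150~300ms range quoted elsewhere in these issues.
ELECTION_TIMEOUT_MIN_MS = 150.0
ELECTION_TIMEOUT_MAX_MS = 300.0


class ElectionTimer:
    """Hypothetical timer that re-randomizes its duration on every reset."""

    def __init__(self, rng=None):
        # An injectable RNG keeps the sketch deterministic in tests.
        self._rng = rng if rng is not None else random.Random()
        self.remaining_ms = 0.0
        self.reset()

    def reset(self):
        # Called on startup, on each heartbeat, and when a new election starts.
        self.remaining_ms = self._rng.uniform(
            ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS
        )

    def tick(self, elapsed_ms):
        # Returns True once the timeout has expired and the node should
        # become a candidate.
        self.remaining_ms -= elapsed_ms
        return self.remaining_ms <= 0
```

Because each reset draws a new value, two nodes that happened to collide in one election are unlikely to pick the same timeout again in the next one.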

What is "majority"?

What is "majority" exactly? More detailed questions:

  1. If the network is split, will "majority" be recalculated for each partition separately?

  2. What if node A connects to nodes B and C, but node B doesn't connect to C? What is "majority" in such a case?
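
As a rough illustration of the first question, here is a minimal sketch that assumes a fixed cluster membership (no configuration changes). In Raft, a majority is counted against the full cluster size rather than against whichever nodes happen to be reachable, so it is not recalculated per partition. The function names are hypothetical.

```python
def majority(cluster_size: int) -> int:
    # Smallest number of nodes that is more than half of the whole cluster.
    return cluster_size // 2 + 1


def has_quorum(reachable_including_self: int, cluster_size: int) -> bool:
    # A node (or partition) can win an election or commit entries only if it
    # reaches a majority of the *configured* cluster, not of its partition.
    return reachable_including_self >= majority(cluster_size)


# A 5-node cluster split 3/2: the required majority stays at 3 on both sides.
print(majority(5))       # 3
print(has_quorum(3, 5))  # True  (larger partition)
print(has_quorum(2, 5))  # False (smaller partition)
```

For the second question, in a 3-node cluster where A reaches both B and C, A can gather 3 votes counting its own. B, which reaches only A, could still gather 2 of 3, which also meets the majority of 2; since each node grants at most one vote per term, at most one of them can win a given term.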

Questions

First, thank you for the presentation, extremely easy to understand. I have a few questions regarding this protocol:

  1. How long is the election term? You talk about the election timeout of 150~300ms to become a candidate, but I missed the term timeframe info.
  2. How do you handle attackers? If nodes can decide for themselves to become candidates, couldn't one bypass the election timeout and send out request vote messages instantly?
  3. How do you ensure that the leader is fair? Even if bypassing the election timeout isn't possible, what if I wait to become a leader and then start sending whatever message I want to send (e.g. in a blockchain, let everyone know that some follower owes me all his coins)?

This kind of central authority, even if for a brief term, is a bit worrying ^^' Wouldn't a number of cluster centers sharing the authority (and elected more carefully with some trust weight from past operations) be more robust? [/me goes off to study federated consensus]

Keyboard Navigation

Add support for the right arrow key instead of requiring the user to constantly hit the "Continue" button.

/cc: @philips

JIT Compilation

This'd probably be a nice visualization, something showing how hot paths get optimized in JIT compilation and all that.

Of course, I'm not really a JIT expert, so I don't know what would be considered notable enough to visualize.

Paxos

Paxos visualization.
Willing to help code :)

Service Orchestration?

Would service orchestration make sense? Things like etcd and Serf seem like they are going to be important in managing multi-node clusters.

GGG

Using Python for the drawing may be a good choice. By the way, are there any other projects like this?

It's very wonderful

The visualization is wonderful. Maybe this could grow into a full organization that helps people understand complex algorithms.

Raft leader election during network split

Isn't the leader elected by a majority of all nodes? If so, why does the network partition in the presentation allow two leaders to exist in disjoint partitions?
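
One way to reason about this scenario is sketched below, under the assumption of a fixed 5-node cluster. A leader that was elected before the split can keep believing it is the leader inside a minority partition, but it cannot commit entries there, while the majority side can elect a new leader in a higher term. When the partition heals, the leader with the lower term steps down. The `Leader` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Leader:
    name: str
    term: int
    reachable: int  # nodes this leader can reach, counting itself


def can_commit(leader: Leader, cluster_size: int) -> bool:
    # Committing an entry needs acknowledgement from a majority of the
    # whole cluster, not just of the leader's partition.
    return leader.reachable >= cluster_size // 2 + 1


def resolve_after_heal(a: Leader, b: Leader) -> Leader:
    # Two leaders in different partitions hold different terms; once they
    # can talk again, the one with the lower term steps down.
    return a if a.term > b.term else b


old_leader = Leader("B", term=1, reachable=2)  # stuck in the minority side
new_leader = Leader("C", term=2, reachable=3)  # elected by the majority side

print(can_commit(old_leader, 5))                        # False
print(can_commit(new_leader, 5))                        # True
print(resolve_after_heal(old_leader, new_leader).name)  # C
```

So two nodes may both be in the leader state for a while, but only the one backed by a majority of the full cluster can make progress.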

Request: WebRTC

WebRTC consists of P2P + an unspecified initial communication channel that is often a central server. An explanation of WebRTC and ICE/STUN would be helpful since so many sites are using datachannels and/or WebRTC.

Kubernetes

It would be nice to have a presentation for Kubernetes.

How about PBFT?

PBFT is popular. If possible, please make an introduction to PBFT.

DO IT NOW!!!!

THANKS A LOT!!!!
I HOPE TO LEARN MORE!!!!
IT'S VERY USEFUL!!!!

May I use your visualization to teach students about the Raft consensus algorithm?

Hi Ben. I am preparing teaching materials as part of Samsung's effort to provide free-to-use educational material. I came across your most excellent visualization tool and hope to use it as part of my curriculum. Unfortunately, Samsung's policy prevents me from simply including links, because many of the countries where the educational material will be provided have very poor or no internet connections. So, in order to use your visualization, I created a small video of it (a screen recording of me pushing the buttons).

Please let me know if I may use your materials in this manner, and if so, what type of copyright disclosure you would like me to present in the materials.

Thank you.

Henry Park

visualization request: Nakamoto Consensus Algorithm

Really awesome visualizations with Raft. Great work. I'd like to request that you add the Nakamoto Consensus algorithm used in Bitcoin, invented by Satoshi Nakamoto in 2008 to solve the Byzantine Generals' Problem. It allows consensus between a network of nodes in a completely decentralized and trustless way.

It's truly awesome and I would really appreciate if you can make a visualization for it.

Thanks. Keep going.

Much awesome

Not an issue, just a quick note to say that this is freakin' awesome. The raft protocol explanation was fantastic. It's like the Khan Academy meets data. Clearly explains the protocol in a few short minutes in super simple terms. Thank you kindly for your inspiring work in this area. You rock.

Gossip Protocols

I find there is very little information available about Gossip Protocols. What exists is typically found within implementations such as Riak and Cassandra. That is a pity, because Gossip Protocols are extremely powerful once understood, although certain trade-offs are made (higher availability at the cost of consistency).

That said, Gossip Protocols vary quite a bit, so it might be hard to display "the" gossip protocol the way Raft can be shown; you'd have to show one variation of sorts.

Menu Navigation

Add a dropdown menu to jump to sections of the current visualization.

Paxos

I'm not sure that Paxos lends itself to simple conceptual examples, but if it can be done, an overview to Paxos allowing people to compare/contrast with Raft would be very interesting.

Gossip Protocol

I developed a good understanding of Raft after going through the visualization. Can we have one for the Gossip protocol?
Please let me know if I can contribute here.

Backwards button

I think it would add educational value if there was an option to go backwards step-by-step.
To the left of the continue button, perhaps.

browser tests under raft/test/index.html give errors

If I go to "http://thesecretlivesofdata.com/raft/test/" I get test errors when using Chrome. If I check out the latest code from the GitHub page and run a local Python server, I get the same set of errors:

Node

initialize()

should initialize as a follower ‣
should initialize with model ‣

bbox()

should return a bbox around the circle ‣

execute()

should append a log entry ‣
TypeError: Cannot read property 'invalidate' of null
at Node.onStateChange (http://thesecretlivesofdata.com/raft/scripts/model/node.js:703:30)
at Node.EventDispatcher.dispatchEvent (http://thesecretlivesofdata.com/scripts/playback/playback.js:776:26)
at Node.dispatchEvent (http://thesecretlivesofdata.com/raft/scripts/model/node.js:739:53)
at Node.EventDispatcher.dispatchChangeEvent (http://thesecretlivesofdata.com/scripts/playback/playback.js:791:17)
at Node.state (http://thesecretlivesofdata.com/raft/scripts/model/node.js:280:14)
at Node.currentTerm (http://thesecretlivesofdata.com/raft/scripts/model/node.js:243:18)
at Context.<anonymous> (http://thesecretlivesofdata.com/raft/test/model/node.js:39:22)
at Test.Runnable.run (http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4263:32)
at Runner.runTest (http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4635:10)
at http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4681:12

nextIndex()

should default to 1 ‣
should set and return value ‣

matchIndex()

"before each" hook ‣
TypeError: Cannot read property 'invalidate' of null
at Node.onLeaderIdChange (http://thesecretlivesofdata.com/raft/scripts/model/node.js:707:30)
at Node.EventDispatcher.dispatchEvent (http://thesecretlivesofdata.com/scripts/playback/playback.js:776:26)
at Node.dispatchEvent (http://thesecretlivesofdata.com/raft/scripts/model/node.js:739:53)
at Node.EventDispatcher.dispatchChangeEvent (http://thesecretlivesofdata.com/scripts/playback/playback.js:791:17)
at Node.candidateEventLoop (http://thesecretlivesofdata.com/raft/scripts/model/node.js:359:14)
at Node.state (http://thesecretlivesofdata.com/raft/scripts/model/node.js:267:18)
at Context.<anonymous> (http://thesecretlivesofdata.com/raft/test/model/node.js:75:19)
at Hook.Runnable.run (http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4263:32)
at next (http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4526:10)
at http://thesecretlivesofdata.com/scripts/mocha/mocha-1.14.0.js:4538:5
