The Google File System

This project aims to implement the Google File System.

It aims for a simple implementation that stays close to what is described in the paper.

https://research.google/pubs/the-google-file-system/

Client library

Is the data path a separate request from the commit?

Based on the wording of the paper, it appears so: the client first pushes the data to the replicas, then sends a write request to the primary.
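To make the two-request reading concrete, here is a minimal in-memory sketch. The `Replica` class, its `push_data` and `commit` methods, and `client_write` are hypothetical names chosen for illustration; they are not this project's API, and no networking is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for a chunk server."""
    buffered: dict = field(default_factory=dict)      # data id -> bytes not yet applied
    chunk: bytearray = field(default_factory=bytearray)

    def push_data(self, data_id: str, data: bytes) -> None:
        # Data path: only buffer the bytes, nothing is applied to the chunk yet.
        self.buffered[data_id] = data

    def commit(self, data_id: str, offset: int) -> None:
        # Control path: apply previously pushed data at the given offset.
        data = self.buffered.pop(data_id)  # raises KeyError if never pushed
        end = offset + len(data)
        if len(self.chunk) < end:
            self.chunk.extend(b"\x00" * (end - len(self.chunk)))
        self.chunk[offset:end] = data


def client_write(replicas: list, data_id: str, offset: int, data: bytes) -> None:
    # Request 1: push the data to every replica.
    for replica in replicas:
        replica.push_data(data_id, data)
    # Request 2: commit. In the paper the client contacts the primary, which
    # forwards the order to the secondaries; a plain loop stands in for that here.
    for replica in replicas:
        replica.commit(data_id, offset)
```

Pushed data stays invisible until the commit arrives, which is the property that makes the two requests distinct.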

Master server

A replica can expire in two ways:

  1. A chunk server stops sending heartbeats. In this case the server is excluded and its chunks are no longer counted as replicated on it.
  2. A chunk server reports fewer chunks than it did previously (for example, because some chunks are corrupted).
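Both cases can be sketched with a small piece of master-side bookkeeping. Everything here is an assumption made for illustration: the `ReplicaTracker` class, its method names, and the 30-second default timeout do not come from the project.

```python
class ReplicaTracker:
    """Hypothetical master-side bookkeeping; names and timeout are assumptions."""

    def __init__(self, heartbeat_timeout: float = 30.0):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen = {}   # server id -> timestamp of the last heartbeat
        self.chunks = {}      # server id -> set of chunk ids last reported

    def heartbeat(self, server: str, reported: set, now: float) -> set:
        """Record a heartbeat; return the chunks the server no longer reports (case 2)."""
        lost = self.chunks.get(server, set()) - reported
        self.last_seen[server] = now
        self.chunks[server] = set(reported)
        return lost

    def expire_silent_servers(self, now: float) -> dict:
        """Drop servers whose heartbeat is overdue (case 1); return their chunks."""
        expired = {}
        for server, seen in list(self.last_seen.items()):
            if now - seen > self.heartbeat_timeout:
                expired[server] = self.chunks.pop(server, set())
                del self.last_seen[server]
        return expired

    def replica_count(self, chunk: str) -> int:
        """Number of live servers currently reporting this chunk."""
        return sum(chunk in held for held in self.chunks.values())
```

Passing `now` explicitly keeps the sketch deterministic; a real master would read a monotonic clock instead.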

Chunk server

The chunk server stores each chunk in a separate file, with the checksums at the beginning, so the data is stored contiguously after them. This wastes around 4 KB per file on checksums, which should be acceptable because GFS is not optimized for small files: "Small files must be supported, but we need not optimize for them."
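The 4 KB figure is consistent with the parameters given in the paper (64 MB chunks, 64 KB checksum blocks, 32-bit checksums). The sketch below works through that arithmetic and the resulting file layout. It assumes the project uses the paper's sizes; the helper names and the choice of CRC32 are illustrative only.

```python
import struct
import zlib

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MiB chunk, as in the paper
BLOCK_SIZE = 64 * 1024          # 64 KiB checksum block, as in the paper
CHECKSUM_SIZE = 4               # one 32-bit checksum per block
HEADER_SIZE = (CHUNK_SIZE // BLOCK_SIZE) * CHECKSUM_SIZE  # 1024 * 4 = 4096 bytes


def data_offset(chunk_offset: int) -> int:
    """File offset of a byte of chunk data, skipping the checksum header."""
    return HEADER_SIZE + chunk_offset


def checksum_slot(block_index: int) -> int:
    """File offset of the checksum belonging to a given block."""
    return block_index * CHECKSUM_SIZE


def pack_checksum(block: bytes) -> bytes:
    # CRC32 is an assumption; the README does not name the checksum function.
    return struct.pack("<I", zlib.crc32(block))
```

With a fixed-size header, the position of any data byte or checksum is a constant-time computation, and the chunk data itself is never interleaved with checksums.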

Chunk versions are stored in a separate file to speed up restarts. Otherwise, the chunk server would need to read every chunk file before starting.
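One possible shape for such a version file is sketched below. The JSON format, the file name used in the usage note, and the `save_versions` / `load_versions` helpers are all assumptions for illustration, not the project's actual on-disk format.

```python
import json
import os
import tempfile


def save_versions(path: str, versions: dict) -> None:
    """Write the chunk-id -> version map, replacing the old file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(versions, f)
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes reach the disk before the rename
    os.replace(tmp, path)      # atomic when source and target share a filesystem


def load_versions(path: str) -> dict:
    """Read all chunk versions in one small read; empty if no file exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)
```

At startup, a call such as `load_versions("versions.json")` would give the chunk server every version it needs to report without opening a single chunk file.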

When are chunks created?

Chunks are created when a client wants to write to them. The chunk file is created on the primary and on the replicas when the lease is received.

How is data integrity ensured?

We assume a crash-stop model in which all 3 servers holding a chunk (3 being the minimum replication factor) cannot crash together. It follows that once a write has been accepted in memory, at least one chunk server holding it will survive to save the user data.

What about crash-recovery?

The tricky case is a chunk server that crashes right after accepting an operation but before writing anything to disk (including the new checksums). We can handle this with versions: the chunk will have an out-of-date version, and in the worst case the master will find out that the chunk is not up to date and will remove this replica.
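The version check itself is small. The sketch below assumes the master keeps the current version of each chunk and that chunk servers report the version they hold; `stale_replicas` and `drop_stale` are hypothetical helpers, not functions from this project.

```python
def stale_replicas(master_version: int, reported: dict) -> list:
    """Return the servers whose reported chunk version is behind the master's."""
    return sorted(server for server, version in reported.items()
                  if version < master_version)


def drop_stale(master_version: int, reported: dict) -> dict:
    """Keep only the replicas that are up to date with the master's version."""
    stale = set(stale_replicas(master_version, reported))
    return {server: version for server, version in reported.items()
            if server not in stale}
```

A server that crashed before persisting an accepted write comes back reporting the older version, so it shows up in `stale_replicas` and can be dropped and replaced by re-replicating from an up-to-date copy.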
