PekarDrive

This is a simple distributed storage system made up of a single master and multiple workers. The workers carry out file operations such as writing, reading and deleting files. The master is responsible for load balancing. A client is included and is used to initiate operations on the master. An archive file found in the /include folder can be used as an API to the master.

The general architecture of PekarDrive is seen in the figure below:

Figure: PekarDrive Architecture

Fault Tolerance

Fault tolerance is achieved by having the master periodically write checkpoints of itself. Furthermore, the master pings the workers every minute. Master failure is not currently detected, though the workers could be made to detect it when no ping message has been received for over a minute.
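The worker-side check suggested above could look roughly like the following. This is a minimal sketch, not PekarDrive code: the struct master_monitor, both functions, and the 60-second threshold are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical threshold: the master pings every minute, so a longer
 * silence is treated as a master failure. */
#define MASTER_TIMEOUT_SEC 60

/* Hypothetical per-worker state; not part of PekarDrive. */
struct master_monitor {
    time_t last_ping; /* time at which the most recent ping arrived */
};

/* Record that a ping from the master arrived at time `now`. */
void monitor_on_ping(struct master_monitor *m, time_t now)
{
    assert(m != NULL);
    m->last_ping = now;
}

/* True if no ping has arrived for more than MASTER_TIMEOUT_SEC seconds. */
bool master_presumed_failed(const struct master_monitor *m, time_t now)
{
    assert(m != NULL);
    return difftime(now, m->last_ping) > MASTER_TIMEOUT_SEC;
}
```

A worker would call monitor_on_ping whenever a ping is handled and check master_presumed_failed from its main loop, passing time(NULL) as the current time.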

Guide

This section describes with examples how to use the PekarDrive interface as well as the terminal application.

Compiling

Run the following command to build the archive libpekardrive.a in the /lib folder, which is used for linking:

make install

Then, compile with the following options:

... -I<PATH TO LIBRARY>/include -L<PATH TO LIBRARY>/lib -lpekardrive

Terminal

To specify the address of a PekarDrive, use the commands set-ip and set-port, as seen in the example below:

pekar set-ip <IP>

pekar set-port <PORT>

The PekarDrive terminal application comes with 5 operations:

  • ls: List all contents stored on all worker nodes.
  • read: Read all contents of a given file.
  • write: Write a new file.
  • append: Append to an existing file.
  • delete: Delete a given file.

List all contents stored on every worker node with the following command:

pekar ls

Read contents of a file with the following command:

pekar read <FILENAME>

Write a new file with the following command:

pekar write <FILENAME> <DATA>

Append to an existing file with the following command:

pekar append <FILENAME> <DATA>

Delete a file with the following command:

pekar delete <FILENAME>

PekarDrive library

The library comes with three headers. The header interface.h contains everything needed to interact with a master node and declares the following functions:

const void *ls(const char *host, unsigned short port)

const void *file_read(const char *file_name, const char *host, unsigned short port)

size_t file_write(const char *file_name, const void *buffer, size_t len, short append, const char *host, unsigned short port)

short file_delete(const char *file_name, const char *host, unsigned short port)
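The sketch below shows how a client might chain the functions listed above: write a file, append to it, read it back, and delete it. The return-value conventions are not documented here, so the in-memory stand-ins define them as assumptions: file_write returns the number of bytes written, file_read returns a NUL-terminated buffer or NULL, and file_delete returns 1 on success. The function roundtrip, the file name, the host and the port are likewise hypothetical. ls is left out because the layout of its result is not described.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Prototypes copied from the list above. With the real library, include
 * interface.h and link libpekardrive.a instead of using the stand-ins. */
const void *file_read(const char *file_name, const char *host,
                      unsigned short port);
size_t file_write(const char *file_name, const void *buffer, size_t len,
                  short append, const char *host, unsigned short port);
short file_delete(const char *file_name, const char *host,
                  unsigned short port);

/* ---- Stand-ins (ASSUMPTIONS, not PekarDrive code) ----
 * A single in-memory file replaces the master so the sketch runs offline. */
static char stored_name[64];
static char stored_data[256];
static size_t stored_len;

size_t file_write(const char *file_name, const void *buffer, size_t len,
                  short append, const char *host, unsigned short port)
{
    size_t start;
    (void)host; (void)port;
    if (strlen(file_name) >= sizeof stored_name)
        return 0;
    /* Appending continues after the existing data of the same file. */
    start = (append && strcmp(stored_name, file_name) == 0) ? stored_len : 0;
    if (start + len >= sizeof stored_data)
        return 0;
    strcpy(stored_name, file_name);
    memcpy(stored_data + start, buffer, len);
    stored_len = start + len;
    stored_data[stored_len] = '\0';
    return len;
}

const void *file_read(const char *file_name, const char *host,
                      unsigned short port)
{
    (void)host; (void)port;
    if (stored_name[0] == '\0' || strcmp(stored_name, file_name) != 0)
        return NULL;
    return stored_data;
}

short file_delete(const char *file_name, const char *host,
                  unsigned short port)
{
    (void)host; (void)port;
    if (stored_name[0] == '\0' || strcmp(stored_name, file_name) != 0)
        return 0;
    stored_name[0] = '\0';
    stored_len = 0;
    return 1;
}

/* ---- Example client code ----
 * Write a greeting, append to it, copy it into `out`, then delete it.
 * Returns 0 on success and -1 on any failure. */
int roundtrip(const char *host, unsigned short port, char *out,
              size_t out_size)
{
    const char *name = "greeting.txt";
    const char *first = "hello";
    const char *second = " world";
    const char *data;

    assert(out != NULL && out_size > 0);
    if (file_write(name, first, strlen(first), 0, host, port) != strlen(first))
        return -1;
    if (file_write(name, second, strlen(second), 1, host, port) != strlen(second))
        return -1;
    data = file_read(name, host, port);
    if (data == NULL || strlen(data) >= out_size)
        return -1;
    strcpy(out, data);
    return file_delete(name, host, port) ? 0 : -1;
}
```

Against a real master, the stand-ins would be removed and the checks adapted to whatever the library actually returns.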

Master and Worker Setup

Run the following make command to build the master and worker executables, along with a client for interacting with the master:

make MASTER_TKN=<MASTER_TOKEN> WORKER_TKN=<WORKER_TOKEN>

A token must be assigned to both the master and the worker, and each token must be an integer value. To build only a single executable, add either master or worker after make to build the master or the worker executable, respectively.

Note that the master must not run on the same machine as any worker. If this is needed, change the globally used port number in lib/comm.c and src/server/boot.h before building an individual component.

Also, if you have not downloaded a release, the project is compiled in debug mode. To disable debug mode and verbose logging, remove -DDEBUG from the Makefile and replace -DVERBOSE_2 with -DVERBOSE_1.

pekardrive's People

Contributors

mrpekar98

pekardrive's Issues

Implement master index

Master should have a way of indexing the files stored at nodes. We already store information about which nodes are connected, but we should also store some metadata about what is stored in each node.
This way, we don't have to ask every node which files they have in order to locate a file.

A synchronization mechanism must be implemented to ensure the index is up to date.
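One possible shape for such an index is a small table that maps file names to the node holding them. The sketch below is only an illustration under stated assumptions: struct file_index, index_put, index_lookup, the integer worker_id and both size limits are hypothetical, and synchronization with the workers is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INDEX_CAPACITY 128 /* hypothetical maximum number of files */
#define INDEX_NAME_LEN 64  /* hypothetical maximum name length + 1 */

struct index_entry {
    char file_name[INDEX_NAME_LEN];
    int worker_id; /* hypothetical identifier of the node holding the file */
};

struct file_index {
    struct index_entry entries[INDEX_CAPACITY];
    size_t count;
};

/* Record that `file_name` lives on `worker_id`, updating an existing entry.
 * Returns 0 on success, -1 if the name is too long or the index is full. */
int index_put(struct file_index *idx, const char *file_name, int worker_id)
{
    size_t i;

    assert(idx != NULL && file_name != NULL);
    if (strlen(file_name) >= INDEX_NAME_LEN)
        return -1;
    for (i = 0; i < idx->count; i++) {
        if (strcmp(idx->entries[i].file_name, file_name) == 0) {
            idx->entries[i].worker_id = worker_id;
            return 0;
        }
    }
    if (idx->count == INDEX_CAPACITY)
        return -1;
    strcpy(idx->entries[idx->count].file_name, file_name);
    idx->entries[idx->count].worker_id = worker_id;
    idx->count++;
    return 0;
}

/* Return the worker holding `file_name`, or -1 if it is not indexed. */
int index_lookup(const struct file_index *idx, const char *file_name)
{
    size_t i;

    assert(idx != NULL && file_name != NULL);
    for (i = 0; i < idx->count; i++) {
        if (strcmp(idx->entries[i].file_name, file_name) == 0)
            return idx->entries[i].worker_id;
    }
    return -1;
}
```

A linear scan keeps the sketch short; a hash table would be the natural replacement once the number of files grows.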

Logger file for error messages

Create a file for error and warning logging. Name of component must be specified to allow debugging of specific components.

FIFO ordering of transmission sequence

Received chunks from transmissions must be ordered by sequence number.
This ordering should be implemented in function receive in file transmission.c.
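A minimal sketch of the ordering step, assuming the received chunks are collected in an array first. The struct chunk layout and the function order_chunks are hypothetical; the actual types in transmission.c may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk layout; the real one in transmission.c may differ. */
struct chunk {
    unsigned int seq;  /* sequence number assigned by the sender */
    const void *data;  /* payload of this chunk */
    size_t len;        /* payload length in bytes */
};

/* qsort comparator: ascending sequence number, written to avoid overflow. */
static int chunk_cmp(const void *a, const void *b)
{
    const struct chunk *ca = a;
    const struct chunk *cb = b;
    return (ca->seq > cb->seq) - (ca->seq < cb->seq);
}

/* Reorder `n` received chunks in place so they can be reassembled. */
void order_chunks(struct chunk *chunks, size_t n)
{
    assert(chunks != NULL || n == 0);
    if (n > 1)
        qsort(chunks, n, sizeof *chunks, chunk_cmp);
}
```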

Write master checkpoints

The master must write server_table to a file. This file can be read to re-register workers after the master fails.
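As a rough illustration of such a checkpoint, the sketch below writes a table to a file and reads it back. The struct worker_entry is a hypothetical stand-in for one server_table entry, and checkpoint_write and checkpoint_read are assumed names. The binary format is host-specific, so the file is only meant to be read by the same build that wrote it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one server_table entry. */
struct worker_entry {
    char host[64];
    unsigned short port;
};

/* Write `n` entries to `path`. Returns 0 on success, -1 on failure. */
int checkpoint_write(const char *path, const struct worker_entry *table,
                     size_t n)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    /* Store the entry count first so the reader knows how much follows. */
    ok = fwrite(&n, sizeof n, 1, f) == 1 &&
         fwrite(table, sizeof *table, n, f) == n;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read at most `cap` entries from `path` into `table`.
 * Returns the number of entries read, or -1 on failure. */
long checkpoint_read(const char *path, struct worker_entry *table, size_t cap)
{
    FILE *f;
    size_t n = 0;
    int ok;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    ok = fread(&n, sizeof n, 1, f) == 1 && n <= cap &&
         fread(table, sizeof *table, n, f) == n;
    fclose(f);
    return ok ? (long)n : -1;
}
```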

Reading from worker after ping must have timeout

A timeout must be set when reading the ping response. Otherwise, the master will be blocked indefinitely.
This is particularly a problem when a worker fails: the master then blocks forever, waiting for a response that will never arrive.
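One way to bound the wait on a POSIX system is to call poll() before read(). The helper below is a sketch under that assumption; read_with_timeout and its return codes are hypothetical and not part of PekarDrive.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: wait up to `timeout_ms` milliseconds for data on
 * `fd`, then read at most `len` bytes into `buf`.
 * Returns the byte count from read(), 0 on end of stream, -1 on error,
 * or -2 if the timeout expired before any data arrived. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(buf != NULL);
    ready = poll(&pfd, 1, timeout_ms);
    if (ready == 0)
        return -2; /* no response in time: the worker can be treated as failed */
    if (ready < 0)
        return -1; /* poll itself failed; errno holds the reason */
    return read(fd, buf, len);
}
```

With a helper like this, the master could drop a worker from its table when the ping read returns -2 instead of waiting indefinitely.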

Communication protocol wrapper

Create a wrapper for communication. Parameters passed are operation type from packet.h, data and data length.

Protocol hierarchy is as follows, where the top layer is the most abstract layer:

  1. Wrapper.
  2. transmission.
  3. comm.

Difficult to distinguish between logging macros

There is no clear difference between the logging macros LOG and DEBUG. Keep the macro DEBUG for debug code and add two new macros, VERBOSE_1 and VERBOSE_2, where the first indicates important logging information and the second covers everything.

Implement client

The client should be implemented as soon as possible, since it is a good tool for integration and system testing.

`comm.c` should send fixed-size chunks of data

This removes the need to ask for more data than was actually sent.

If less data is given than the chunk size, pad the chunk with empty bytes.

Remove the length parameter from the read function. The function just reads everything in chunk-sized pieces and appends them.
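The padding rule can be sketched as follows. CHUNK_SIZE, chunk_count and fill_chunk are hypothetical names, and the tiny chunk size is chosen only to keep the example readable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 8 /* hypothetical; a real value would be much larger */

/* Number of fixed-size chunks needed to carry `len` bytes. */
size_t chunk_count(size_t len)
{
    return (len + CHUNK_SIZE - 1) / CHUNK_SIZE;
}

/* Copy chunk number `index` of `data` into `out`, padding the tail of the
 * last chunk with zero bytes. Returns the number of payload bytes copied. */
size_t fill_chunk(const void *data, size_t len, size_t index,
                  unsigned char out[CHUNK_SIZE])
{
    size_t offset = index * CHUNK_SIZE;
    size_t n;

    assert(data != NULL && out != NULL && offset < len);
    n = len - offset < CHUNK_SIZE ? len - offset : CHUNK_SIZE;
    memset(out, 0, CHUNK_SIZE);
    memcpy(out, (const unsigned char *)data + offset, n);
    return n;
}
```

A receiver that only sees whole chunks cannot tell padding from payload, so the true payload length would still have to travel with the data, for example in a header field.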

Add token between master and workers

Add a token field to struct packet. This is to ensure that all packets to worker are from a master. Likewise, masters should ensure responses are from a worker.
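The check described above might look like the sketch below. The struct packet layout shown here is hypothetical (the real definition in packet.h may differ), as is the function packet_is_trusted; only the use of integer tokens comes from the build instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet layout; the real struct packet may differ. */
struct packet {
    int token;  /* integer token identifying the sender's role */
    int op;     /* operation type */
    size_t len; /* payload length in bytes */
};

/* Accept a packet only if it carries the token expected from the peer.
 * A worker would pass the master token, and the master the worker token. */
bool packet_is_trusted(const struct packet *p, int expected_token)
{
    assert(p != NULL);
    return p->token == expected_token;
}
```

A plain integer comparison only guards against stray or misdirected packets; it is not a cryptographic authentication scheme.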

Add usage guide to README

The README should contain a guide on how to use the library and the client terminal application.
Code examples should be included in the guide on how to use the library.
