gopherwood's Introduction

Gopherwood

Gopherwood is an embedded persistent caching library that offers effectively unlimited space by leveraging object storage. It provides unified filesystem APIs for locally cached files and transparently offloads data to an object storage service when the cache size exceeds the local disk volume. A block-based metadata system supports the unlimited caching space, multi-process access, data persistence, and cache recovery.

The project was created to help on-premise systems move to cloud-oriented architectures. Using an object
storage service helps solve data high-availability issues and lowers disk cost, because only a minimal local disk quota needs to be configured.

See the GitHub wiki for detailed documentation, developer guides, and FAQs.

Installation

Requirements

To build Gopherwood, the following libraries are needed.

cmake (2.8+)                    http://www.cmake.org/
boost (tested on 1.53+)         http://www.boost.org/
liboss                          binary integrated in the project; to be open-sourced later

To run tests, the following libraries are needed.

gtest (tested on 1.7.0)         already integrated in the source code
gmock (tested on 1.7.0)         already integrated in the source code

To run the code coverage test, the following tools are needed.

gcov (included in gcc distribution)
lcov (tested on 1.9)            http://ltp.sourceforge.net/coverage/lcov.php

Configuration, Build and Install

cd GOPHERWOOD_HOME
mkdir build
cd build
../bootstrap

Run "../bootstrap --help" for more configuration options.

make 
make install 

Configuration of the Object Storage Service

Edit $GOPHERWOOD_HOME/config.properties, then point GOPHERWOOD_CONF at it:
export GOPHERWOOD_CONF=$GOPHERWOOD_HOME/config.properties

Test

To run all test cases, run the command

make testAll

To run the function tests, run the command

make functiontest

To show the code coverage result, run the command below. The report is written to BUILD_DIR/CodeCoverageReport/index.html

make ShowCoverage

gopherwood's People

Contributors

chenbaggio, eric5553, neuyilan

gopherwood's Issues

Support multi-thread Preload

With the current simple implementation, read latency can be high because many blocks may need to be loaded before a read can be served.
A common way to improve read performance is to add a multi-threaded preload mechanism.
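As a rough illustration of the preload idea, the sketch below loads several blocks in parallel with std::async and collects them in request order. The names loadBlock and preloadBlocks are hypothetical and are not part of Gopherwood; loadBlock merely stands in for whatever routine fetches one block from object storage.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for fetching one block from object storage.
// A real loader would copy the block into the local cache instead.
inline std::string loadBlock(int blockId) {
    return "block-" + std::to_string(blockId);
}

// Load the requested blocks in parallel, one task per block, and return
// the results in the same order as blockIds.
inline std::vector<std::string> preloadBlocks(const std::vector<int>& blockIds) {
    std::vector<std::future<std::string>> pending;
    pending.reserve(blockIds.size());
    for (int id : blockIds) {
        // std::launch::async runs each load on its own thread.
        pending.push_back(std::async(std::launch::async, loadBlock, id));
    }

    std::vector<std::string> loaded;
    loaded.reserve(pending.size());
    for (auto& task : pending) {
        loaded.push_back(task.get());  // waits until that load completes
    }
    return loaded;
}
```

One task per block is the simplest form. A real implementation would more likely use a bounded worker pool so that a large read does not spawn an unbounded number of threads.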

bug - file write raises an assertion failure

  1. test code

#include <iostream>
#include <fstream>
#include <fcntl.h>
#include <sstream>
#include <string>
#include "gopherwood/gopherwood.h"

// Forward declaration; the definition follows main
void testGWWrite(std::string fileName);

int main(int agrInt, char **agrStr)
{
    for (int i = 0; i < 10; i++) {
        testGWWrite("TestReadWriteSeek-ReadEvictBlock");
    }

    return 0;
}

using namespace std;

void testGWWrite(std::string fileName) {
    AccessFileType type = randomType;

    gopherwoodFS gwFS = gwCreateContext((char *) fileName.c_str());
    gwFile file = gwOpenFile(gwFS, (char *) fileName.c_str(), O_CREAT);

    const int SIZE = 128;
    //3. construct the file name
    std::stringstream ss;
    ss << "/ssdfile/ssdkv/" << fileName;
    std::string filePath = ss.str();

    //4. read data from file
    std::ifstream infile;
    infile.open(filePath.c_str());

    int totalWriteLength = 0;
    // One buffer is reused for every chunk and released after the loop
    char *buf = new char[SIZE];
    infile.read(buf, SIZE);
    int readLengthIn = infile.gcount();
    while (readLengthIn > 0) {
        totalWriteLength += readLengthIn;
        std::cout << "totalWriteLength=" << totalWriteLength << ",readLength="
                  << readLengthIn << std::endl;
        // buf is not null-terminated, so print only the bytes that were read
        std::cout << "buf=" << std::string(buf, readLengthIn) << std::endl;
        //5. write data to the gopherwood
        gwWrite(gwFS, file, buf, readLengthIn);

        infile.read(buf, SIZE);
        readLengthIn = infile.gcount();
    }
    delete[] buf;

    gwCloseFile(gwFS, file);

    std::cout << "*******END OF WRITE*****, totalWriteLength=" << totalWriteLength << std::endl;
}

  2. pre-condition
code version:
branch: master
commit e385461
Author: houliang
Date: Wed Feb 7 16:58:10 2018 +0800

running env:
/ssdfile/ssdkv/TestReadWriteSeek-ReadEvictBlock 4096 bytes
/ssdfile/ssdkv/sharedMemory/smFile 40960 bytes, initialized to all zeros

  3. assert exception
    call stack:
    #0 0x00007ffff6c1a1f7 in raise () from /lib64/libc.so.6
    #1 0x00007ffff6c1b8e8 in abort () from /lib64/libc.so.6
    #2 0x00007ffff6c13266 in __assert_fail_base () from /lib64/libc.so.6
    #3 0x00007ffff6c13312 in __assert_fail () from /lib64/libc.so.6
    #4 0x00007ffff7b6f389 in Gopherwood::Internal::FileSystemImpl::acquireNewBlock (this=0x605a70, fileName=0x607368 "TestReadWriteSeek-ReadEvictBlock") at /opt/Gopherwood/src/core/FileSystemImpl.cpp:533
    #5 0x00007ffff7b84692 in Gopherwood::Internal::OutputStreamImpl::writeInternal (this=0x608b10, buf=0x60b900 "", size=128) at /opt/Gopherwood/src/core/OutputStreamImpl.cpp:101
    #6 0x00007ffff7b84413 in Gopherwood::Internal::OutputStreamImpl::write (this=0x608b10, buf=0x60b900 "", size=128) at /opt/Gopherwood/src/core/OutputStreamImpl.cpp:72
    #7 0x00007ffff7b83ae5 in Gopherwood::OutputStream::write (this=0x607330, buf=0x60b900 "", size=128) at /opt/Gopherwood/src/core/OutputStream.cpp:32
    #8 0x00007ffff7b8acc7 in gwWrite (fs=0x6072f0, file=0x607310, buffer=0x60b900, length=128) at /opt/Gopherwood/src/core/gopherwood.cpp:313
    #9 0x0000000000401633 in testGWWrite (fileName="TestReadWriteSeek-ReadEvictBlock") at TestGWAPI.cpp:48
    #10 0x0000000000401c2a in main (agrInt=2, agrStr=0x7fffffffe258) at TestGWAPI.cpp:128

[enhancement] archive the gopherwood SharedMemory identifier in SMLock file

Currently, if gopherwood is formatted, a previously opened gwContext may hit errors because it was not informed of the format operation.

A simple proposal is to archive the SharedMemory information and the corresponding identifier in the SMLock file. A checking mechanism could then be built on this information.
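A minimal sketch of that check, assuming the identifier is a single integer stored as text in the lock file. The helper names writeShmId and shmIdMatches, and the file layout, are illustrative assumptions rather than Gopherwood's actual SMLock format.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Record the identifier of the current shared-memory segment in the lock
// file. A format operation would call this with a fresh identifier.
inline bool writeShmId(const std::string& lockPath, uint64_t shmId) {
    std::ofstream out(lockPath, std::ios::trunc);
    out << shmId;
    out.flush();
    return static_cast<bool>(out);
}

// Returns true if the lock file still holds the identifier that this
// context saw when it was opened, i.e. no format has happened since.
inline bool shmIdMatches(const std::string& lockPath, uint64_t expectedId) {
    std::ifstream in(lockPath);
    uint64_t stored = 0;
    return (in >> stored) && stored == expectedId;
}
```

A context would remember the identifier it read at open time and call shmIdMatches before touching shared memory. A mismatch means the cache was formatted and the context has to be reopened.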

[BUG] create different OSS file prefix when evicting data

Currently, the OSS file name format is something like filepath_blocknum. This format causes conflicts when different hosts evict the same block of the same filepath.

To solve this problem, we need to add a unique OSS file prefix for each host.
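One possible shape for such a key is sketched below. The hostId parameter and the separator are assumptions made for illustration; the issue does not specify how the prefix should be derived (a hostname or an identifier generated at format time would both work).

```cpp
#include <cassert>
#include <string>

// Builds an object key of the form <hostId>/<filePath>_<blockNum>.
// hostId is a hypothetical per-host identifier and is not part of the
// existing filepath_blocknum scheme.
inline std::string makeOssKey(const std::string& hostId,
                              const std::string& filePath,
                              int blockNum) {
    return hostId + "/" + filePath + "_" + std::to_string(blockNum);
}
```

With this layout, two hosts evicting block 0 of the same file produce different keys, so neither upload overwrites the other.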

Log block offset when write finished

Currently, we support multiple writers, but one problem remains.

Suppose two writer ActiveStatus instances are open. Neither knows the other's Eof unless a gwFlush has been issued. To overcome this, we need to update the block Eof whenever a gwWrite operation finishes or the writer switches to the next block. Any thread that has the file open can then learn the updated Eof by checking the SharedMemBucket Eof field.
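The update rule can be sketched with an atomic field that only moves forward. BucketEof below is a hypothetical stand-in for the Eof field of a SharedMemBucket; a real cross-process version would additionally need the field to live in shared memory and be lock-free on the target platform.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Eof field of a shared-memory bucket.
struct BucketEof {
    std::atomic<int64_t> eof{0};
};

// Publish the end offset reached by a finished write. The Eof only moves
// forward, so a writer that finishes late cannot shrink a value already
// published by another writer.
inline void publishEof(BucketEof& bucket, int64_t newEof) {
    int64_t seen = bucket.eof.load();
    while (seen < newEof && !bucket.eof.compare_exchange_weak(seen, newEof)) {
        // On failure compare_exchange_weak reloads `seen`; retry.
    }
}

// Read the latest published Eof.
inline int64_t currentEof(const BucketEof& bucket) {
    return bucket.eof.load();
}
```

Each writer would call publishEof after every gwWrite and on every block switch, and readers would call currentEof instead of waiting for a gwFlush.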

feature-please provide 'cancelfile' interface

When gopherwood uses OSS, please provide a 'cancelfile' interface for performance. When a transaction is aborted, the caller would then not need to go through the
close->delete flow and could simply call cancelfile.

Note: cancelfile would call the OSS cancelfile operation; please refer to OSS.

deletefile should use file path

The current file delete interface is
int gwDeleteFile(gopherwoodFS fs, gwFile file);
It is inconvenient to use. Please redesign it to take a file path, as follows:
int gwDeleteFile(char* filePath);

[enhancement] Dynamically adjust the quota size

Currently we use a fixed formula to calculate the quota size when an ActiveStatus acquires new blocks.
As planned, dynamic adjustment of the quota size should be implemented once the function tests are fully written.

[BUG] should check the config values when initializing the context

`gopherwoodFS gwCreateContext(char *workDir, GWContextConfig *config) {
LOG(Gopherwood::Internal::INFO, "------------------gwCreateContext start------------------");
gopherwoodFS retVal = NULL;

if (config != NULL) {
    Configuration::NUMBER_OF_BLOCKS = config->numBlocks;
    Configuration::LOCAL_BUCKET_SIZE = config->blockSize;
    Configuration::CUR_CONNECTION = config->numPreDefinedConcurrency;
}`

When initializing the context, gwCreateContext should validate the values inside config, not only whether config itself is NULL. If any config.* field is unset, the context ends up in an invalid state.
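A minimal sketch of such a check follows. ExampleConfig is a hypothetical mirror of the three fields read in the snippet above; the real GWContextConfig may use different types, and treating non-positive values as invalid is an assumption.

```cpp
#include <cassert>

// Hypothetical mirror of the fields read by gwCreateContext; the real
// GWContextConfig may declare them differently.
struct ExampleConfig {
    int numBlocks;
    long blockSize;
    int numPreDefinedConcurrency;
};

// A null config is accepted because the built-in defaults are then kept.
// A non-null config is accepted only if every field is positive.
inline bool isValidConfig(const ExampleConfig* config) {
    if (config == nullptr) {
        return true;
    }
    return config->numBlocks > 0 &&
           config->blockSize > 0 &&
           config->numPreDefinedConcurrency > 0;
}
```

gwCreateContext could call a check like this before copying the fields into Configuration and return an error instead of building a context from bad values.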
