rocksdb-rs

A Rust version of RocksDB.

Why we need to build a Rust version of RocksDB

Clean and simple architecture

RocksDB is a general-purpose storage engine used by many kinds of databases. One of its most important applications is MyRocks, which replaces InnoDB as the storage engine in MySQL. Most users in the RocksDB community do not need a transactional engine for MySQL; we just want a simple, well-performing KV engine. RocksDB has merged many features that we may never enable, and they make the project hard to maintain. I want to build a simple database that is easy to maintain for simple KV applications.

Better support for asynchronous IO

RocksDB does not support asynchronous IO. Beyond reads and writes, other methods such as Ingest and CreateColumnFamily are also synchronous, so any call may block the user thread for a long time. In a cloud environment the problem can be worse, because the latency of a cloud disk is much higher than that of a local NVMe SSD.

Development Guide

Model And Architecture

Our engine has five main modules: WAL, MANIFEST, Version, Compaction, and Table.

  • The WAL module assigns a sequence number to every write and then appends the write to a write-ahead log file. It runs as an independent future task, and some other jobs, such as ingest, may also be processed in this module. You can think of it as a combination of write_thread and WriteToWAL in RocksDB. The file format is compatible with RocksDB, so this engine can be started on an existing RocksDB directory.
  • MANIFEST persists changes to the set of SST files, including the results of compaction and flush jobs.
  • The most important structures of the Version module are VersionSet and KernelNumberContext, which I split out of RocksDB's VersionSet. If an operation can be expressed as an atomic operation, its state is stored in KernelNumberContext; otherwise it must acquire the lock guard of Arc<Mutex<VersionSet>>. VersionSet manages the ColumnFamily metadata, and every ColumnFamily owns a SuperVersion, which holds the collection of memtables and the collection of SSTables. A SuperVersion consists of a MemtableList and a Version. Every time we switch the memtable of a ColumnFamily, we create a new SuperVersion with the new Memtable and the old Version. Every time a compaction or flush job finishes, we create a new SuperVersion with the old Memtable and the new Version.
  • The Compaction module contains all the code for compaction and flush.
  • The Table module contains the SSTable format and its read and write operations.
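
The split between KernelNumberContext and VersionSet described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the type names come from the text, but every field and method name (last_sequence, allocate_sequence, switch_memtable, install_version, and so on) is an assumption, and memtables and SST files are reduced to plain numeric ids.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// State that can be advanced with one atomic operation, so no lock is needed.
/// Field names are hypothetical.
#[derive(Default)]
pub struct KernelNumberContext {
    last_sequence: AtomicU64,
    next_file_number: AtomicU64,
}

impl KernelNumberContext {
    /// Reserve `count` consecutive sequence numbers and return the first one.
    pub fn allocate_sequence(&self, count: u64) -> u64 {
        self.last_sequence.fetch_add(count, Ordering::SeqCst) + 1
    }

    /// Hand out a fresh file number.
    pub fn new_file_number(&self) -> u64 {
        self.next_file_number.fetch_add(1, Ordering::SeqCst) + 1
    }
}

/// The SST files of one column family (ids only in this sketch).
#[derive(Debug)]
pub struct Version {
    pub sst_files: Vec<u64>,
}

/// The active memtable plus the immutable ones (ids only in this sketch).
#[derive(Debug)]
pub struct MemtableList {
    pub active: u64,
    pub immutable: Vec<u64>,
}

/// A read view that shares whichever half did not change.
pub struct SuperVersion {
    pub memtables: Arc<MemtableList>,
    pub version: Arc<Version>,
}

pub struct ColumnFamily {
    pub super_version: Arc<SuperVersion>,
}

pub struct VersionSet {
    pub column_families: Vec<ColumnFamily>,
}

/// Anything that is not a single atomic step goes through this lock.
pub type SharedVersionSet = Arc<Mutex<VersionSet>>;

impl VersionSet {
    /// Build a version set with one column family and no SST files.
    pub fn new_shared(first_memtable: u64) -> SharedVersionSet {
        let sv = SuperVersion {
            memtables: Arc::new(MemtableList { active: first_memtable, immutable: Vec::new() }),
            version: Arc::new(Version { sst_files: Vec::new() }),
        };
        let cf = ColumnFamily { super_version: Arc::new(sv) };
        Arc::new(Mutex::new(VersionSet { column_families: vec![cf] }))
    }

    /// Cheap snapshot for readers: clone the Arc and release the lock.
    pub fn current(&self, cf: usize) -> Arc<SuperVersion> {
        self.column_families[cf].super_version.clone()
    }

    /// Memtable switch: new MemtableList, old Version.
    pub fn switch_memtable(&mut self, cf: usize, new_active: u64) {
        let old = self.current(cf);
        let mut immutable = old.memtables.immutable.clone();
        immutable.push(old.memtables.active);
        self.column_families[cf].super_version = Arc::new(SuperVersion {
            memtables: Arc::new(MemtableList { active: new_active, immutable }),
            version: old.version.clone(),
        });
    }

    /// Flush or compaction finished: old MemtableList, new Version.
    /// (Follows the description above; dropping flushed memtables is left out.)
    pub fn install_version(&mut self, cf: usize, sst_files: Vec<u64>) {
        let old = self.current(cf);
        self.column_families[cf].super_version = Arc::new(SuperVersion {
            memtables: old.memtables.clone(),
            version: Arc::new(Version { sst_files }),
        });
    }
}
```

In this sketch a reader calls current() once and keeps the returned Arc<SuperVersion>, so a later memtable switch or version install never changes the view it is already using. Sequence and file numbers never touch the mutex at all.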

TODO List

Compaction

  • Refactor the compaction picking strategy and calculate the effect of deleted keys.

Table

  • Support the LZ4 and ZSTD compression algorithms.
  • Support a hash index for small data blocks.
  • Support a block cache.

IO

  • Support AIO for asynchronous IO. (I currently use user threads as independent IO threads, but I'm not sure whether that is a better solution than AIO.)
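
For readers unfamiliar with the "independent IO thread" approach mentioned above, here is a minimal sketch of the general idea using only the standard library. It is not the project's implementation and it is not AIO: the names IoThread, AppendRequest, and append are hypothetical, and completion is reported through a plain channel rather than a future.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::PathBuf;
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// One append request plus the channel on which the result is reported.
pub struct AppendRequest {
    pub data: Vec<u8>,
    pub done: Sender<io::Result<u64>>,
}

/// Handle to a dedicated thread that owns the file and performs all blocking IO.
pub struct IoThread {
    tx: Option<Sender<AppendRequest>>,
    handle: Option<JoinHandle<()>>,
}

/// The blocking part: write the bytes, then force them to disk.
fn append_and_sync(file: &mut File, data: &[u8]) -> io::Result<()> {
    file.write_all(data)?;
    file.sync_data()
}

impl IoThread {
    pub fn spawn(path: PathBuf) -> io::Result<IoThread> {
        let mut file = OpenOptions::new().create(true).append(true).open(path)?;
        let (tx, rx) = mpsc::channel::<AppendRequest>();
        let handle = thread::spawn(move || {
            // Bytes appended by this thread so far (not the file size).
            let mut written: u64 = 0;
            // The loop ends once every sender has been dropped.
            for req in rx {
                let result = append_and_sync(&mut file, &req.data).map(|_| {
                    written += req.data.len() as u64;
                    written
                });
                // The caller may have stopped waiting; ignore a failed send.
                let _ = req.done.send(result);
            }
        });
        Ok(IoThread { tx: Some(tx), handle: Some(handle) })
    }

    /// Queue an append and return immediately. The receiver later yields the
    /// total number of bytes this thread has appended, or the IO error.
    pub fn append(&self, data: Vec<u8>) -> Receiver<io::Result<u64>> {
        let (done, result) = mpsc::channel();
        let tx = self.tx.as_ref().expect("IO thread already shut down");
        tx.send(AppendRequest { data, done }).expect("IO thread exited");
        result
    }
}

impl Drop for IoThread {
    fn drop(&mut self) {
        // Dropping the sender closes the queue; then wait for pending writes.
        self.tx.take();
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}
```

The calling thread only pays for a channel send, and the write and sync latency is absorbed by the IO thread. The trade-off against kernel AIO is that each in-flight file needs a thread (or a shared pool) and requests to one file are serialized, which is acceptable for an append-only log but may not be for random reads.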

Contributors

jmpotato, little-wallace, w41ter
