Hatofi DB

What?

Hatofi, short for "Hashmap to Filesystem", is a key-value database that runs on Linux. It aims to reduce I/O latency compared with a traditional SQL database, at a lower financial cost than Redis, because data is stored on the filesystem instead of in RAM.

Usage

  • Create a database from a config file: hatofi [-d,--data-directory] <data_dir> gen [-c,--config] <config_file.txt>
  • Import data from an input file (use case: bulk import): hatofi [-d,--data-directory] <data_dir> load [-i,--input] <input.txt> [-f,--force] (/!\ --force will erase existing data)
  • Import inline data (use case: one-time import): hatofi [-d,--data-directory] <data_dir> load [-D,--dataclass] <dataclass> [-T,--text] <data_plaintext>
  • Query data (partial or exact match): hatofi [-d,--data-directory] <data_dir> query search [-D,--dataclass] <dataclass> [[-H,--hash] <data_md5> | [-T,--text] <data_plaintext>]
  • Get linked data: hatofi [-d,--data-directory] <data_dir> query links [-D,--dataclass] <dataclass> [-H,--hash] <data_md5>
  • Get data import logs: hatofi [-d,--data-directory] <data_dir> query logs [-D,--dataclass] <dataclass> [-H,--hash] <data_md5>
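
When calling the CLI from a script, it can help to assemble the argument list in one place. The following is a minimal sketch for the `query search` form listed above. The helper name `build_search_command` is hypothetical, and the sketch assumes that exactly one of `--hash` or `--text` is passed, which is how the `|` in the usage line reads.

```python
from typing import List, Optional


def build_search_command(data_dir: str, dataclass: str,
                         md5_hash: Optional[str] = None,
                         text: Optional[str] = None) -> List[str]:
    """Build the argv list for `hatofi ... query search` (hypothetical helper)."""
    # Assumption: the usage line offers --hash and --text as alternatives,
    # so this helper insists on exactly one of them.
    if (md5_hash is None) == (text is None):
        raise ValueError("pass exactly one of md5_hash or text")
    argv = ["hatofi", "--data-directory", data_dir,
            "query", "search", "--dataclass", dataclass]
    if md5_hash is not None:
        argv += ["--hash", md5_hash]
    else:
        argv += ["--text", text]
    return argv


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_search_command("db", "email",
                               md5_hash="20bacbe5082d09eb3ac96a4565c1dc33"))
```

The returned list can be handed to `subprocess.run(argv, check=True)`, assuming a `hatofi` executable is available on the `PATH`.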

Input Format

Input file format:

<entity_dataclass>:<entity_key>:<dataclass>:<base64_encoded_data>:<data_md5>

The example below assumes that the current date is 2022-01-01 and that the file's UUIDv4 is 693ad1be-c353-4562-b12a-930f2ed43b79.

Example input.txt (the first line is the file UUID):

693ad1be-c353-4562-b12a-930f2ed43b79
email:20bacbe5082d09eb3ac96a4565c1dc33:email:ZGF0YTFAbmV0LmNvbQ==:20bacbe5082d09eb3ac96a4565c1dc33
email:20bacbe5082d09eb3ac96a4565c1dc33:password:cGFzc3dvcmQ=:5f4dcc3b5aa765d61d8327deb882cf99
email:ae68135e4f74eed19a79fd982c7c4f98:email:ZGF0YTJAZW1haWwuY29t:ae68135e4f74eed19a79fd982c7c4f98
email:1e77fd2c7a59f06a6c8dc8ace3ebf221:email:ZGF0YTNAZW1haWwuY29t:1e77fd2c7a59f06a6c8dc8ace3ebf221
email:26af7d285fa312aa2f8d3857d0f00af4:email:ZGF0YTRAZG9tYWluLmNvbQ==:26af7d285fa312aa2f8d3857d0f00af4
email:26af7d285fa312aa2f8d3857d0f00af4:password:YW5hd2Vzb21lcGFzc3dvcmQ=:90282e03043af181c985c9891c52c00f

Checksums of input.txt:

$ md5sum input.txt
3dca08a5fb0b47ae9a09e35c4fad9dbf  input.txt
$ sha256sum input.txt
9976ed3c6a2c2ced0002f2ffcbc4bb6bcaa6a653cd7f15846e187664e97159a2  input.txt
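
To make the record format concrete, here is a minimal Python sketch that builds and parses one input line. The function names are hypothetical. It assumes the MD5 is computed over the UTF-8 bytes of the plaintext value, which matches the `password` record in the example above.

```python
import base64
import hashlib


def make_record_line(entity_dataclass: str, entity_key: str,
                     dataclass: str, plaintext: str) -> str:
    """Format one record as
    <entity_dataclass>:<entity_key>:<dataclass>:<base64_encoded_data>:<data_md5>."""
    raw = plaintext.encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    # Assumption: the hash covers the raw plaintext bytes, not the base64 text.
    digest = hashlib.md5(raw).hexdigest()
    return ":".join([entity_dataclass, entity_key, dataclass, encoded, digest])


def parse_record_line(line: str) -> dict:
    """Split a record line and check that the MD5 matches the decoded data."""
    # The standard base64 alphabet has no ':' so a plain split is safe here.
    entity_dataclass, entity_key, dataclass, encoded, digest = line.strip().split(":")
    raw = base64.b64decode(encoded)
    if hashlib.md5(raw).hexdigest() != digest:
        raise ValueError("data_md5 does not match the decoded data")
    return {
        "entity_dataclass": entity_dataclass,
        "entity_key": entity_key,
        "dataclass": dataclass,
        "data": raw.decode("utf-8"),
        "data_md5": digest,
    }


if __name__ == "__main__":
    line = make_record_line("email", "20bacbe5082d09eb3ac96a4565c1dc33",
                            "password", "password")
    print(line)
    print(parse_record_line(line))
```

The parser rejects lines whose hash does not match the decoded payload, which is one cheap way to catch a corrupted bulk file before loading it.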

Output Format

Desired filesystem output architecture:

> db/
    .dbmeta Database configuration file
    > data/
        > dataclasses/
            > email/
                > xx/xx/
                > 1e/77/
                    > 1e77fd2c7a59f06a6c8dc8ace3ebf221/
                        > 1e77fd2c7a59f06a6c8dc8ace3ebf221.md5 << md5:1e77fd2c7a59f06a6c8dc8ace3ebf221
                        > links/
                        > logs/
                            import-2022-01-01.log
                > 20/ba/
                    > 20bacbe5082d09eb3ac96a4565c1dc33
                        > 20bacbe5082d09eb3ac96a4565c1dc33.md5 << md5:20bacbe5082d09eb3ac96a4565c1dc33
                        > links/
                            > 5f4dcc3b5aa765d61d8327deb882cf99 -> ../../../../../password/5f/4d/5f4dcc3b5aa765d61d8327deb882cf99
                        > logs/
                            import.log << 20bacbe5082d09eb3ac96a4565c1dc33 2022-01-01 693ad1be-c353-4562-b12a-930f2ed43b79
                            import.log.1    # Old logs
                > 26/af/
                    > 26af7d285fa312aa2f8d3857d0f00af4
                        > 26af7d285fa312aa2f8d3857d0f00af4.md5 << md5:26af7d285fa312aa2f8d3857d0f00af4
                        > links/
                            > 90282e03043af181c985c9891c52c00f -> ../../../../../password/90/28/90282e03043af181c985c9891c52c00f
                        > logs/
                            import.log << 26af7d285fa312aa2f8d3857d0f00af4 2022-01-01 693ad1be-c353-4562-b12a-930f2ed43b79
                            import.log.1    # Old logs
                > ae/68/
                    > ae68135e4f74eed19a79fd982c7c4f98
                        > ae68135e4f74eed19a79fd982c7c4f98.md5 << md5:ae68135e4f74eed19a79fd982c7c4f98
                        > links/
                        > logs/
                            import.log << ae68135e4f74eed19a79fd982c7c4f98 2022-01-01 693ad1be-c353-4562-b12a-930f2ed43b79
                            import.log.1    # Old logs
                > xx/xx/
            > password/
                > 5f/4d/
                    > 5f4dcc3b5aa765d61d8327deb882cf99/
                        > 5f4dcc3b5aa765d61d8327deb882cf99.md5 << md5:5f4dcc3b5aa765d61d8327deb882cf99
                        > links/
                            > 20bacbe5082d09eb3ac96a4565c1dc33 -> ../../../../../email/20/ba/20bacbe5082d09eb3ac96a4565c1dc33
                        > logs/
                            import.log << 5f4dcc3b5aa765d61d8327deb882cf99 2022-01-01 693ad1be-c353-4562-b12a-930f2ed43b79
                            import.log.1    # Old logs
                > 90/28/
                    > 90282e03043af181c985c9891c52c00f/
                        > 90282e03043af181c985c9891c52c00f.md5 << md5:90282e03043af181c985c9891c52c00f
                        > links/
                            > 26af7d285fa312aa2f8d3857d0f00af4 -> ../../../../../email/26/af/26af7d285fa312aa2f8d3857d0f00af4
                        > logs/
                            import.log << 90282e03043af181c985c9891c52c00f 2022-01-01 693ad1be-c353-4562-b12a-930f2ed43b79
                            import.log.1    # Old logs
                > xx/xx/
                
        > file
            > xx/xx/
            > 99/22/
                > 99226b116e370a25130cce7e55fe3f813a0f3168c30e584de422e9f43b76fc1a.sha256 << md5 8998c19bb14b9af66ccdc79bed5818c4 2022-01-01 << sha256 99226b116e370a25130cce7e55fe3f813a0f3168c30e584de422e9f43b76fc1a 2022-01-01 << status started << status done
            > xx/xx/
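
The layout above shards each record by the first two byte pairs of its MD5 and links related records through relative symlinks. The following Python sketch shows how those paths could be derived. The helper names `record_dir` and `link_entry` are hypothetical, and the sketch only computes paths without touching the filesystem.

```python
import posixpath
from pathlib import PurePosixPath
from typing import Tuple


def record_dir(db_root: str, dataclass: str, md5_hex: str) -> PurePosixPath:
    """Directory of a record: <db>/data/dataclasses/<dataclass>/<aa>/<bb>/<md5>."""
    return PurePosixPath(db_root, "data", "dataclasses", dataclass,
                         md5_hex[0:2], md5_hex[2:4], md5_hex)


def link_entry(db_root: str, src_dataclass: str, src_md5: str,
               dst_dataclass: str, dst_md5: str) -> Tuple[PurePosixPath, str]:
    """Return (symlink path, relative target) for a link from src to dst."""
    links_dir = record_dir(db_root, src_dataclass, src_md5) / "links"
    target = record_dir(db_root, dst_dataclass, dst_md5)
    # A relative symlink resolves from the directory that contains it,
    # so the target is expressed relative to links/.
    relative_target = posixpath.relpath(str(target), start=str(links_dir))
    return links_dir / dst_md5, relative_target


if __name__ == "__main__":
    link, target = link_entry("db",
                              "email", "20bacbe5082d09eb3ac96a4565c1dc33",
                              "password", "5f4dcc3b5aa765d61d8327deb882cf99")
    print(link, "->", target)
```

Creating the link itself would then be a call such as `os.symlink(target, link)` once the parent directories exist.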

Main change in 2.1: support for full-text search optimization was removed and will be delegated to an external project.

Changelog

  • 2.3.x:
    • Added raw data loading from only a dataclass and a value
    • Data import logging is now optional (>= 2.3.1)
  • 2.2.x: Added back the base64-encoded data value for simplification purposes. Added "Get Links", "Get Logs" and "Search by Hash" features
  • 2.1.x: Removed support for partial data search: the database is now fully anonymized
  • v2.x: Builds a graph at import time and calculates heuristics to reduce the keyspace for partial search
  • v1.x: Basic organization of data in the filesystem

Contributors

xvanilor
