
flaxkv's Introduction


🗲 FlaxKV

A high-performance dictionary database.



FlaxKV provides a dictionary-like interface for interacting with high-performance key-value databases. More importantly, as a persistent database, it offers performance close to that of native in-memory dictionaries.
You can use it just like a Python dictionary, without worrying that database operations will block your user process.


Key Features

  • Always Up-to-date, Never Blocking: Designed from the ground up so that write operations never block the user process, while reads always return the most recently written data.

  • Ease of Use: Interacting with the database feels just like using a Python dictionary! You don't even have to worry about resource release.

  • Buffered Writing: Writes are buffered and flushed to the database on a schedule, reducing the overhead of frequent database writes.

  • High-Performance Database Backend: Uses the high-performance key-value database LevelDB as its default backend.

  • Atomic Operations: Ensures that write operations are atomic, safeguarding data integrity.

  • Thread-Safety: Employs only necessary locks to ensure safe concurrent access while balancing performance.
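
The buffered, non-blocking write behavior described above can be illustrated with a small toy model. This is only a sketch of the general write-behind idea, not flaxkv's actual implementation: the class name BufferedDict, its flush_interval parameter, and the plain dict used as the "backend" are all hypothetical stand-ins.

```python
import threading


class BufferedDict:
    """Toy write-behind store (hypothetical, not flaxkv's real code)."""

    def __init__(self, flush_interval=0.05):
        self._backend = {}   # stand-in for the on-disk database
        self._buffer = {}    # pending writes not yet in the backend
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = flush_interval
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def __setitem__(self, key, value):
        # Only the in-memory buffer is touched, so the caller returns quickly.
        with self._lock:
            self._buffer[key] = value

    def __getitem__(self, key):
        # Check the buffer first so reads always see the latest write.
        with self._lock:
            if key in self._buffer:
                return self._buffer[key]
            return self._backend[key]

    def flush(self):
        # Move all pending writes into the backend in one batch.
        with self._lock:
            self._backend.update(self._buffer)
            self._buffer.clear()

    def _run(self):
        # Flush periodically until close() sets the stop event.
        while not self._stop.wait(self._interval):
            self.flush()

    def close(self):
        self._stop.set()
        self._worker.join()
        self.flush()  # make sure nothing is left in the buffer


store = BufferedDict()
store["key"] = "value"
print(store["key"])  # readable whether or not the flush has happened yet
store.close()
```

Reads consult the buffer before the backend, which is why a value is visible immediately after it is written even though the backend write happens later on the worker thread.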


Quick Start

Installation

pip install flaxkv 
# Install with server version: pip install flaxkv[server]

Usage

from flaxkv import FlaxKV
import numpy as np
import pandas as pd

db = FlaxKV('test_db')
# Alternatively, start flaxkv as a server from a shell:
#   flaxkv run --port 8000
# and connect to it from a client:
#   db = FlaxKV('test_db', root_path_or_url='http://localhost:8000')

db[1] = 1
db[1.1] = 1 / 3
db['key'] = 'value'
db['a dict'] = {'a': 1, 'b': [1, 2, 3]}
db['a list'] = [1, 2, 3, {'a': 1}]
db[(1, 2, 3)] = [1, 2, 3]
db['numpy array'] = np.random.randn(100, 100)
db['df'] = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

db.setdefault('key', 'value_2')
assert db['key'] == 'value'

db.update({"key1": "value1", "key2": "value2"})

assert 'key2' in db

db.pop("key1")
assert 'key1' not in db

for key, value in db.items():
    print(key, value)

print(len(db))

Tips

  • As a persistent database, flaxkv provides performance close to native in-memory dictionary access (see the benchmark below).
  • The example above never calls db.close() to release resources, because flaxkv handles this automatically. You can still call db.close() manually to release resources immediately.

Benchmark

Test content: write N NumPy vectors (each 1000-dimensional), then traverse and read them back.

Execute the test:

cd benchmark/
pytest -s -v run.py
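
For readers who want a rough picture of what such a write-then-traverse measurement looks like, here is a minimal sketch. It is not the repository's benchmark/run.py: the function name time_write_read is hypothetical, plain Python lists stand in for NumPy arrays so the example has no dependencies, and a built-in dict is used as the store. Any object supporting item assignment and .items() could presumably be passed instead.

```python
import random
import time


def time_write_read(store, n=1000, dim=1000):
    """Time writing n vectors into `store`, then traversing them all."""
    # Plain lists of floats stand in for 1000-dimensional NumPy vectors.
    vectors = {f"vec_{i}": [random.random() for _ in range(dim)] for i in range(n)}

    start = time.perf_counter()
    for key, vec in vectors.items():
        store[key] = vec
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _key, _value in store.items():
        pass  # traversal only; each value is fetched but not used
    read_seconds = time.perf_counter() - start

    return write_seconds, read_seconds


write_s, read_s = time_write_read({}, n=100, dim=1000)
print(f"write: {write_s:.4f}s, read: {read_s:.4f}s")
```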

Use Cases

  • Key-Value Structure: Storing simple key-value data.
  • High-Frequency Writing: Well suited to scenarios that require high-frequency insertion or updating of data.
  • Machine Learning: Well suited to storing large collections of embeddings, images, text, and other key-value data used in machine learning.

Limitations

  • Because writes are delayed, in the current version one process in a multi-process environment cannot read data written by another process in real time (there is usually a delay of a few seconds). To write immediately, call the .write_immediately() method. This limitation does not apply within a single process.
  • By default, values of type tuple and set are not supported. If such values are stored anyway, they are deserialized as lists.
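
One possible workaround for the tuple and set limitation is to tag such values before storing them and restore the type after reading. The sketch below is an assumption-laden illustration: encode_value and decode_value are hypothetical helpers, not part of flaxkv, and the standard json module is used only as a stand-in serializer that shows the same tuple-to-list behavior. It handles top-level tuples and sets only, not nested ones.

```python
import json


def encode_value(value):
    """Tag tuples and sets so the type survives a list-only round trip."""
    if isinstance(value, tuple):
        return {"__type__": "tuple", "items": list(value)}
    if isinstance(value, set):
        return {"__type__": "set", "items": list(value)}
    return value


def decode_value(stored):
    """Reverse encode_value; untagged values are returned unchanged."""
    if isinstance(stored, dict) and "__type__" in stored:
        kind, items = stored["__type__"], stored["items"]
        return tuple(items) if kind == "tuple" else set(items)
    return stored


# json stands in for the database serializer: a bare tuple comes back as a list.
assert json.loads(json.dumps((1, 2, 3))) == [1, 2, 3]

# With tagging, the original type survives the round trip.
restored = decode_value(json.loads(json.dumps(encode_value((1, 2, 3)))))
assert restored == (1, 2, 3)
```

In use, the value would be passed through encode_value before assignment and through decode_value after lookup.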

Citation

If FlaxKV has been helpful to your research, please cite:

@misc{flaxkv,
    title={FlaxKV: An Easy-to-use and High Performance Key-Value Database Solution},
    author={K.Y},
    howpublished = {\url{https://github.com/KenyonY/flaxkv}},
    year={2023}
}

Contributions

Feel free to make contributions to this module by submitting pull requests or raising issues in the repository.

License

FlaxKV is licensed under the Apache-2.0 License.


flaxkv's Issues

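Two Python files, one writing to the db and one reading from it: real-time read/write problem

Both Python files open the db with db = FlaxKV(data_name, root_path_or_url, backend='lmdb').
file.a writes to the db and file.b reads from it.
After file.a writes with db['test'] = 'test_value',
file.b can read {'test': 'test_value'} through db.items(), but db.get('test') returns an empty result.
file.b can only use db.get('test'), pop('test'), and so on after reopening the db with db = FlaxKV(...).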

Feat: Retrieve data to memory

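Add the inverse of the write process: retrieve data from the backend into memory, where the backend can be a local store (lmdb or leveldb) or a remote database.

Cache module: add an extra cache to hold frequently accessed database data.
Cache control: specify the cache behavior through parameters at startup.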

Thanks

Special thanks. I happened to need a KV database recently, and this one works well.

Error when connecting remotely

Start the server:
(screenshot not preserved)

Then run:

from flaxkv import dictdb
import numpy as np

db = dictdb('http://localhost:8000', remote=True)
db[1] = 1

print(len(db))

The error is as follows:
(screenshot not preserved)

Versions:

  • Name: flaxkv Version: 0.1.6
  • Name: httpx Version: 0.25.2
  • Name: msgpack Version: 1.0.7
