atomic_store

Easier than a DBMS, but more fault-resistant than just a file.

Sometimes you need to manage a bit of state across executions. Sometimes, a full-blown database is just too much.

This library makes it easy to keep a store of stuff in a JSON file, in an atomic and fault-resistant manner.

Other formats (like pickle and bson) are also supported, and arbitrary formats are possible.

Install

Just run pip install atomic_store. Or, if you must, pip install -r requirements.txt.

Note that the only real dependency is atomicwrites, which has no dependencies.

(The dependency on bson is only really needed when you choose the BSON backend.)

Usage

By default, the store is encoded as JSON and written to a temporary file, which then atomically replaces the old file. When reading, if the file does not exist, a default value is used. Unless you specify otherwise, that default value is None.
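To illustrate the write-then-replace idea described above, here is a minimal sketch using only the standard library. It is not the library's implementation (atomic_store relies on atomicwrites); the helper names load_or_default and atomic_dump are made up for this example.

```python
import json
import os
import tempfile


def load_or_default(path, default=None):
    """Return the decoded JSON content of `path`, or `default` if it is missing."""
    try:
        with open(path, 'r', encoding='utf-8') as fp:
            return json.load(fp)
    except FileNotFoundError:
        return default


def atomic_dump(path, value):
    """Write `value` as JSON to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', encoding='utf-8') as fp:
            json.dump(value, fp)
            fp.flush()
            os.fsync(fp.fileno())
        # os.replace swaps the file in a single step (atomic on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file; the old file is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because readers only ever see the old file or the complete new file, a crash in the middle of a write cannot leave a half-written store behind.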

Context Manager

This program remembers all start times:

import atomic_store
import time

with atomic_store.open('runs.json', default=[]) as store:
    print('Previous executions:')
    print(store.value)
    new_entry = time.strftime('%Y-%m-%d %H:%M:%S%z')
    store.value.append(new_entry)

Leaving the context manager takes care of all writes. No intermediate values get written to disk.

This is ideal if the task is short-running and, in case of any error, you only want to keep the old state anyway.

For advanced uses, also see the subsection on reentrancy.

Manual control

This program records its state after each step:

import atomic_store

my_store = atomic_store.open('gathered.json', default=dict())

my_store.value['state'] = 'running'
my_store.value['thought'] = 'I would not eat green eggs and ham.'
my_store.commit()
# ... some calculations ...
my_store.value['state'] = 'done'
my_store.value['thought'] = 'I do so like Green eggs and ham!'
my_store.commit()

Only calls to commit() cause writes to the disk. Again, no intermediate values get written to disk.

This is ideal if you have a long-running job with clear steps, and each step's output is valuable.

Note that commit() is also available in the context manager.

Format tweaks

If you're using the json backend, and want to keep the JSON file as small as possible, you can call open with dump_kwargs=dict(separators=(',', ':')). The keyword load_kwargs also exists.
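To see what the separators tweak does, the following sketch calls the standard json module directly. It assumes that dump_kwargs is forwarded to the JSON encoder unchanged; the sample value is made up.

```python
import json

value = {'runs': ['2024-01-01 12:00:00', '2024-01-02 12:00:00'], 'count': 2}

# Default separators are ', ' and ': ', which add a space after each one.
default_text = json.dumps(value)
# Compact separators drop that whitespace.
compact_text = json.dumps(value, separators=(',', ':'))

print(len(default_text), len(compact_text))

# Both encodings decode to the same value.
assert json.loads(compact_text) == json.loads(default_text) == value
```

The saving is one byte per separator, so it only matters for stores with many small entries.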

Non-JSON formats

You can use arbitrary other formats, using the format keyword:

atomic_store.open('runs.json', default=[], format=MY_FORMAT)

Supported values are None (for JSON), 'json', 'pickle', 'bson' (requires bson to be installed), and any module or object providing dump/load or dumps/loads. This means you can pass the modules json, pickle, and bson as they are. By default, atomic_store assumes you operate on binary files, except when JSON is involved. To override this, set is_binary.

For convenience, you can also override the abstract classes atomic_store.AbstractFormatFile or atomic_store.AbstractFormatBstr.

In all cases, load_kwargs and dump_kwargs are still supported.
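As an illustration of "any object providing dumps/loads", here is a sketch of a hypothetical format object that stores zlib-compressed JSON. The class name CompressedJson is invented for this example, and the final commented-out call is an assumption based on the description above, not a verified invocation.

```python
import json
import zlib


class CompressedJson:
    """Hypothetical format object: JSON text, compressed with zlib."""

    @staticmethod
    def dumps(value, **kwargs):
        # Encode to JSON text, then compress; the result is bytes.
        return zlib.compress(json.dumps(value, **kwargs).encode('utf-8'))

    @staticmethod
    def loads(data, **kwargs):
        # Reverse of dumps: decompress, then decode the JSON text.
        return json.loads(zlib.decompress(data).decode('utf-8'), **kwargs)


blob = CompressedJson.dumps(['first run', 'second run'])
assert isinstance(blob, bytes)
assert CompressedJson.loads(blob) == ['first run', 'second run']

# Assumed usage, based on the description above (not verified here):
# atomic_store.open('runs.json.z', default=[], format=CompressedJson)
```

Since dumps returns bytes, this matches the default assumption of a binary file, so is_binary should not need to be set.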

Reentrancy

If the same atomic_store is used as a context manager more than once, the default behavior is to write the file every time a with block is exited, including inner ones:

# Assume `mystate.json` contains only `"before"`.
mngr = atomic_store.open('mystate.json', default=[])
with mngr as store:
    store.value = 'outer'
    # File contains `"before"`: We haven't exited any context manager yet.
    with mngr as store:
        store.value = 'inner'
        # File contains `"before"`: We haven't exited any context manager yet.
    # File now contains `"inner"`, because the inner `with`-statement wrote it.
    # See `ignore_inner_exits` below if you consider this undesired behavior.
# File now contains `"inner"`, because the outer `with`-statement wrote it again.

If you consider this behavior undesirable, you can either use multiple context managers (by calling atomic_store.open multiple times), or pass the keyword ignore_inner_exits=True, like this:

# Assume `mystate.json` contains only `"before"`.
mngr = atomic_store.open('mystate.json', default=[], ignore_inner_exits=True)
with mngr as store:
    store.value = 'outer'
    # File contains `"before"`: We haven't exited any context manager yet.
    with mngr as store:
        store.value = 'inner'
        # File contains `"before"`: We haven't exited any context manager yet.
    # File *still* contains `"before"`, as the manager detected that it is still active.
# File now contains `"outer"`, because the outer `with`-statement wrote it.

Atomic is not magic

This library is not magical.

If two threads (or two processes, or whatever) open a store, modify something, and then write concurrently, one of the results may be lost. However, the writes are guaranteed to be atomic, so the data is merely lost, but not corrupted.
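The lost-update scenario can be reproduced without any concurrency by interleaving two readers and writers by hand. This is a standard-library sketch, not atomic_store itself; read_state and write_state are made-up helpers, and the fixed temporary file name is only acceptable because the steps here run one after another.

```python
import json
import os
import tempfile


def read_state(path):
    with open(path, 'r', encoding='utf-8') as fp:
        return json.load(fp)


def write_state(path, value):
    # Write to a sibling temporary file, then replace the target in one step.
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as fp:
        json.dump(value, fp)
    os.replace(tmp_path, path)


path = os.path.join(tempfile.mkdtemp(), 'runs.json')
write_state(path, [])

# Two workers read the same starting state...
seen_by_a = read_state(path)
seen_by_b = read_state(path)

# ...then each appends its own entry and writes the result back.
write_state(path, seen_by_a + ['from A'])
write_state(path, seen_by_b + ['from B'])

final = read_state(path)
# final is ['from B']: A's entry was overwritten, but the file is valid JSON.
print(final)
```

If losing an update is not acceptable, the workers need some form of coordination on top of the atomic writes, such as a lock held across the whole read-modify-write cycle.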

TODOs

  • Figure out how to make bson optional

NOTDOs

Here are some things this project will not support:

  • Any DB backend.
  • Any multi-file backend.
  • More advanced semantics than just commit.
  • This includes rollback. It is not obvious which behavior is desired when the file does not exist (reuse the default value? What if it was modified, as happens with lists and dicts?), or with stacked context managers (should it roll back to the file's state, or to the beginning of the with?).

Contribute

Feel free to dive in! Open an issue or submit PRs.
