
theine's Issues

Type-safe and elegant decorator

With the help of type hints it's possible to constrain the key function's signature, the same way Cacheme does. Using a separate function avoids the awkward lambda or f-string in the original decorator:

from typing import Dict

from theine import Cache

@Cache("tlfu", 10000)
def get_user_info(user_id: int) -> Dict:
    return {}

# or use existing cache
cache = Cache("tlfu", 10000)
@cache
def get_user_info(user_id: int) -> Dict:
    return {}

# register the key function: the function name is unimportant, so just use _
@get_user_info.key
def _(user_id: int) -> str:
    return f"user:{user_id}"

Memory Leak with Thread creation

When a new Cache is initialised, a new Thread is spawned by the following code in the __init__ method:

self._maintainer = Thread(target=self.maintenance, daemon=True)
self._maintainer.start()

The thread is never stopped: even after the Cache itself is garbage collected, the thread keeps running and holds a reference to the cached data. Even if the data is cleared, the thread persists.

Minimal Reproducible Example

from theine import Cache

# Loop endlessly - this will crash after a few minutes
while True:
    # Create a new cache - which creates a new Thread
    c = Cache("tlfu", 10000)
    # Add some fake data. Not necessary for the crash, but it takes up memory
    c.set("data", ["data"] * 2048)
    # Remove the cache, just to be explicit.
    del c
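One possible mitigation (a sketch only, not theine's actual code): hand the maintenance thread only a weak reference to the cache, so the loop exits once the cache is garbage collected. The class below mirrors theine's Cache/maintainer naming but is otherwise hypothetical:

```python
import time
import weakref
from threading import Thread

class Cache:
    """Sketch of a cache whose maintenance thread stops on collection.

    Illustrative only; the names mirror theine's API, but the
    implementation is hypothetical.
    """

    def __init__(self) -> None:
        self._data: dict = {}
        # Pass a weak reference so the thread does not keep the cache alive.
        self._maintainer = Thread(
            target=Cache._maintain, args=(weakref.ref(self),), daemon=True
        )
        self._maintainer.start()

    @staticmethod
    def _maintain(ref: "weakref.ref[Cache]") -> None:
        while True:
            cache = ref()
            if cache is None:  # cache was garbage collected -> stop
                return
            # ... expire entries here ...
            del cache  # drop the strong reference before sleeping
            time.sleep(0.01)
```

With this shape, the loop in the reproducible example above no longer accumulates live threads, because each orphaned maintainer notices its cache is gone and returns.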

Proposal to Integrate SIEVE Eviction Algorithm

Hi there,

Our team (@1a1a11a) has developed a new cache eviction algorithm, called SIEVE. It’s simple, efficient, and scalable.

Why SIEVE could be a great addition:

  • Simplicity: Integrating SIEVE is straightforward, typically requiring fewer than 20 lines of code changes.
  • Efficiency: On skewed workloads, which are typical in web caching scenarios, SIEVE is top-notch.
  • Cache Primitive: SIEVE is not just another algorithm; it's a primitive that could enhance or replace LRU/FIFO queues in advanced systems like LeCaR, TwoQ, ARC, and S3-FIFO.

Details are available on our website, sievecache.com, and in our SIEVE blog post.

We would love to explore the possibility of integrating SIEVE into theine. We believe it could be a beneficial addition to the library and the community.

Looking forward to your feedback!
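For readers unfamiliar with the algorithm, here is a minimal, illustrative SIEVE sketch (not theine's code and not the reference implementation): a FIFO list with one visited bit per entry and a hand that moves from tail to head, evicting the first unvisited entry it finds. Hits only set the bit; entries are never moved, which is what makes integration so small:

```python
class _Node:
    __slots__ = ("key", "value", "visited", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.visited = False
        self.prev = self.next = None

class SieveCache:
    """Minimal SIEVE cache sketch (illustrative only)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.table: dict = {}
        self.head = self.tail = None  # head = newest, tail = oldest
        self.hand = None              # eviction hand

    def get(self, key):
        node = self.table.get(key)
        if node is None:
            return None
        node.visited = True           # lazy promotion: just mark the entry
        return node.value

    def set(self, key, value):
        node = self.table.get(key)
        if node is not None:
            node.value = value
            node.visited = True
            return
        if len(self.table) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.table[key] = node

    def _evict(self):
        # Move the hand from tail toward head, clearing visited bits,
        # until an unvisited entry is found; evict that entry in place.
        node = self.hand or self.tail
        while node.visited:
            node.visited = False
            node = node.prev or self.tail
        self.hand = node.prev         # hand resumes here next time
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        del self.table[node.key]
```

For example, with capacity 2, inserting "a" then "b", reading "a", and inserting "c" evicts "b": the hand skips the visited "a" and removes the oldest unvisited entry.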

OverflowError: cannot fit 'int' into an index-sized integer

Here is the minimal test case:

from theine import Cache

cache: Cache = Cache(policy="tlfu", size=10000)

def test1() -> None:
    print(len(cache))
    cache.clear()

def test2() -> None:
    print(len(cache))
    cache.clear()

I am running this test file with pytest. The second test fails with OverflowError: cannot fit 'int' into an index-sized integer. If I remove cache.clear(), both tests pass. If I change the policy to "clockpro", they also pass.

Environment: macOS 13.5.1, Python 3.8.16.

Feature Request: TTL only Cache

  1. Unlimited-size cache
@Memoize(Cache("unlimited"), timedelta(seconds=10 * 60))
def get():
    ...
  2. Zero-size cache (ignore the returned value)
@Memoize(Cache("empty"), timedelta(seconds=10 * 60))
def update_global_dict():
    value = query(..)
    global_dict[value.field2] = value.field1
    global_dict[value.field3] = value.field1
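The first variant could be approximated today with a plain decorator. Below is a minimal sketch of an unbounded, TTL-only memoize (positional args only for brevity); it is not theine's Memoize, whose API and semantics may differ:

```python
import time
from datetime import timedelta
from functools import wraps

def ttl_memoize(ttl: timedelta):
    """Sketch of an unbounded, TTL-only memoize decorator.

    Entries never count against a size limit; they are only replaced
    once their TTL has elapsed. Expired entries are refreshed lazily
    on the next call rather than removed in the background.
    """
    def decorator(fn):
        store: dict = {}  # args -> (expire_at, value)

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]
            value = fn(*args)
            store[args] = (now + ttl.total_seconds(), value)
            return value
        return wrapper
    return decorator
```

The zero-size variant would be the degenerate case: store only the expiry timestamp and discard the value, so the decorated function is simply throttled to one real execution per TTL window.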

Documentation: background threads behavior

In a Django backend, one of our request handlers (wrongly) triggered the instantiation of a new Cache instance, which was then garbage collected immediately after the request finished.

With this pattern, the CPU usage of our backend increased linearly with the number of requests served. We ultimately traced it to this erroneous creation of Cache instances.

It seems a new background thread is started for each instance, and these threads are not killed by garbage collection, resulting in millions of threads over time.

The theine library has multiple interfaces. Could you document which patterns start background threads? Does this affect the @Memoize decorator or the Django adapter? For instance, if a request handler applies @Memoize to a dynamically created function, wouldn't that also instantiate a new cache on each request without ever stopping the background threads?
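Until this is documented, one rough way to check whether a given call pattern leaves a background thread behind is to compare threading.active_count() before and after (this assumes the spawned thread outlives the call, which holds for daemon maintenance threads like the one described above):

```python
import threading

def spawns_thread(factory) -> bool:
    """Return True if calling `factory()` leaves a new live thread behind.

    A rough diagnostic only: it can be confused by unrelated threads
    starting or stopping concurrently.
    """
    before = threading.active_count()
    factory()
    return threading.active_count() > before
```

Calling it as, for example, spawns_thread(lambda: Cache("tlfu", 100)) (hypothetical usage) would show whether constructing a cache starts a thread, and the same probe can wrap a @Memoize-decorated call or a Django-adapter operation.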

Enhancements

  • Add Timing Wheel to make TTL more generic
  • Add Django cache backend adapter
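For the timing-wheel item, a minimal single-level wheel might look like the sketch below (an assumption about the intended design, not a proposed implementation). Expirations are bucketed by slot, so each tick scans only one bucket instead of every entry:

```python
class TimingWheel:
    """Minimal single-level timing wheel sketch (illustrative only).

    Each entry is placed in slot (current + ticks) % slots, with a
    lap counter for TTLs longer than one revolution of the wheel.
    """

    def __init__(self, tick: float = 1.0, slots: int = 60):
        self.tick = tick
        self.slots = slots
        self.wheel = [[] for _ in range(slots)]
        self.current = 0  # index of the slot whose deadline is "now"

    def schedule(self, key, ttl: float):
        ticks = max(1, int(ttl / self.tick))
        slot = (self.current + ticks) % self.slots
        rounds = (ticks - 1) // self.slots  # extra full laps to wait
        self.wheel[slot].append([key, rounds])

    def advance(self):
        """Move the wheel forward one tick; return the keys that expired."""
        self.current = (self.current + 1) % self.slots
        expired, remaining = [], []
        for entry in self.wheel[self.current]:
            if entry[1] == 0:
                expired.append(entry[0])
            else:
                entry[1] -= 1
                remaining.append(entry)
        self.wheel[self.current] = remaining
        return expired
```

This makes TTL handling O(1) per tick for typical workloads, at the cost of expiry granularity equal to the tick size, which is presumably the "more generic" TTL the enhancement refers to.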
