yiling-j / theine
High performance in-memory cache
License: BSD 3-Clause "New" or "Revised" License
With the help of type hints it's possible to constrain the key function's signature, just the same way Cacheme does. Using a separate function avoids the awkward lambda or f-string in the original decorator:
from typing import Dict

from theine import Cache

@Cache("tlfu", 10000)
def get_user_info(user_id: int) -> Dict:
    return {}

# or use an existing cache
cache = Cache("tlfu", 10000)

@cache
def get_user_info(user_id: int) -> Dict:
    return {}

# register the key function: its name is not important, so just use _ here
@get_user_info.key
def _(user_id: int) -> str:
    return f"user:{user_id}"
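As a sketch of how such a decorator pair could work, here is a minimal stand-in (not theine's actual implementation; the Memo class and its store and key_fn attributes are made-up names) showing the separate-key-function pattern:

```python
from typing import Any, Callable, Dict

class Memo:
    """Made-up memoizing decorator illustrating the proposed pattern:
    the wrapped function caches results, and .key registers a typed
    key builder instead of an inline lambda."""

    def __init__(self, fn: Callable) -> None:
        self.fn = fn
        # Default key: repr of the arguments, until a key function is registered.
        self.key_fn: Callable[..., str] = lambda *a, **k: repr((a, k))
        self.store: Dict[str, Any] = {}

    def key(self, key_fn: Callable[..., str]) -> Callable[..., str]:
        # Register the typed key builder; returning it keeps the decorator usable.
        self.key_fn = key_fn
        return key_fn

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        k = self.key_fn(*args, **kwargs)
        if k not in self.store:
            self.store[k] = self.fn(*args, **kwargs)
        return self.store[k]

@Memo
def get_user(user_id: int) -> dict:
    return {"id": user_id}

@get_user.key
def _(user_id: int) -> str:
    return f"user:{user_id}"

first = get_user(1)
second = get_user(1)  # cache hit: same object comes back
```

Because the key function is a plain named function with annotations, a type checker can verify that its signature matches the cached function's, which an inline lambda cannot offer.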
When a new Cache is initialised, a new Thread is spawned from the following code in the init method:
self._maintainer = Thread(target=self.maintenance, daemon=True)
self._maintainer.start()
The Thread is never stopped, so even after the Cache object dies, the thread remains alive and holds onto the cached data. Even if the data is cleared, the thread continues to persist.
Minimal Reproducible Example
from theine import Cache

# Loop endlessly - this will crash after a few minutes
while True:
    # Create a new cache - which creates a new Thread
    c = Cache("tlfu", 10000)
    # Add some fake data. Not necessary for the crash, but it takes up memory
    c.set("data", ["data"] * 2048)
    # Remove the cache, just to be explicit.
    del c
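The leak can be observed without theine installed. The following stand-in (the LeakyCache class is made up) mirrors the reported pattern: a daemon maintenance thread started in __init__ with nothing that ever stops it:

```python
import threading
import time

class LeakyCache:
    """Made-up minimal stand-in for a cache that spawns a daemon
    maintenance thread in __init__ and never stops it."""

    def __init__(self) -> None:
        self._maintainer = threading.Thread(target=self._maintenance, daemon=True)
        self._maintainer.start()

    def _maintenance(self) -> None:
        # Nothing ever signals this loop to exit, so the thread
        # outlives the object that created it.
        while True:
            time.sleep(0.05)

before = threading.active_count()
caches = [LeakyCache() for _ in range(10)]
del caches  # dropping the objects does NOT stop their daemon threads
leaked = threading.active_count() - before
```

After `del caches` the ten maintenance threads are still alive: the threading machinery keeps a reference to every running thread, so garbage collection cannot reclaim them, which is exactly why repeated Cache construction accumulates threads.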
Hi there,
Our team (@1a1a11a) has developed a new cache eviction algorithm called SIEVE. It's simple, efficient, and scalable.
Why SIEVE could be a great addition: it is simple to implement, efficient, and scalable. You are welcome to dive into the details on our website sievecache.com and on our SIEVE blog.
We would love to explore the possibility of integrating SIEVE into theine. We believe it could be a beneficial addition to the library and the community.
Looking forward to your feedback!
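For a feel of the algorithm's simplicity, here is a hedged sketch based on SIEVE's public description (a FIFO list, one visited bit per entry, and a moving hand), not theine's or the authors' code; the SieveCache class name is made up:

```python
class _Node:
    __slots__ = ("key", "value", "visited", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.visited = False
        self.prev = self.next = None

class SieveCache:
    """Illustrative SIEVE sketch: head = newest, tail = oldest."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.table = {}
        self.head = self.tail = None
        self.hand = None  # eviction hand; starts at the tail

    def get(self, key):
        node = self.table.get(key)
        if node is None:
            return None
        node.visited = True  # lazy promotion: just flip a bit, no list moves
        return node.value

    def set(self, key, value):
        node = self.table.get(key)
        if node is not None:
            node.value = value
            node.visited = True
            return
        if len(self.table) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.table[key] = node

    def _evict(self):
        obj = self.hand or self.tail
        while obj.visited:
            obj.visited = False           # second chance: clear the bit
            obj = obj.prev or self.tail   # move toward head, wrap to tail
        self.hand = obj.prev              # hand resumes just past the victim
        self._unlink(obj)
        del self.table[obj.key]

    def _unlink(self, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
```

The appeal is that a hit only sets a bit, so reads need no locking of the queue, while eviction does all reordering work with a single scanning hand.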
AFAIK nothing like this exists in Python. Maybe you can configure a shared address space or something so that multiple processes are able to share the same cache.
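As an illustration only (this is the standard library's multiprocessing, not anything theine provides; _worker and demo_shared_cache are made-up names), a Manager-backed dict is one way to let processes share a mapping:

```python
import multiprocessing as mp

def _worker(shared, key):
    # Every process mutates the same Manager-backed proxy dict.
    shared[key] = key.upper()

def demo_shared_cache():
    # "fork" keeps this runnable without an import guard (POSIX only).
    ctx = mp.get_context("fork")
    with ctx.Manager() as m:
        shared = m.dict()
        procs = [ctx.Process(target=_worker, args=(shared, k)) for k in ("a", "b")]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(shared)

result = demo_shared_cache()
```

Note that every access to a Manager proxy crosses a process boundary over IPC, so this is orders of magnitude slower than an in-process cache and not a drop-in substitute.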
Here is the minimal test case:
from theine import Cache

cache: Cache = Cache(policy="tlfu", size=10000)

def test1() -> None:
    print(len(cache))
    cache.clear()

def test2() -> None:
    print(len(cache))
    cache.clear()
I am running this test file with pytest. The second test fails with an OverflowError: cannot fit 'int' into an index-sized integer. If I remove the cache.clear() call, it works. If I change the policy to "clockpro", it also works.
Environment: macOS 13.5.1, Python 3.8.16.
Already done in Go: https://github.com/Yiling-J/theine-go#cache-persistence
Because Theine Python is a combination of Rust and CPython, it would be a little difficult to implement. But if you really need this feature, please leave a comment and I will consider adding it.
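Until then, one user-side workaround is to persist entries your application can enumerate itself and replay them on startup. This sketch uses a plain dict as a stand-in for the cache (the file path and the entries are made up), since no iteration API is documented here; with theine you would replay each pair via set():

```python
import os
import pickle
import tempfile

# Stand-in for entries the application itself tracks alongside the cache.
entries = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}

path = os.path.join(tempfile.mkdtemp(), "cache-snapshot.pkl")
with open(path, "wb") as f:
    pickle.dump(entries, f)  # snapshot to disk on shutdown

with open(path, "rb") as f:
    restored = pickle.load(f)  # reload and replay on startup
```

This loses TTL and frequency metadata that a native persistence feature (like theine-go's) would keep, which is why built-in support is the better long-term answer.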
@Memoize(Cache("unlimited"), timedelta(seconds=10 * 60))
def get():
    ...

@Memoize(Cache("empty"), timedelta(seconds=10 * 60))
def update_global_dict():
    value = query(..)
    global_dict[value.field2] = value.field1
    global_dict[value.field3] = value.field1
Both code and readme
In a Django backend, one of our request handlers (wrongly) triggered the instantiation of a new Cache instance, which was then garbage collected immediately after the request finished.
With this pattern, we noticed the CPU usage of our backend increase linearly with the number of past requests served. We ultimately traced it down to this erroneous creation of Cache instances.
I guess a new background thread is started for each instance, and these threads are not killed by garbage collection, hence resulting in millions of threads after some time.
The theine library has multiple interfaces. Could you document which patterns start new threads in the background? Would this affect the @Memoize decorator or the Django adapter? For instance, if inside a request handler we have a dynamic function decorated with @Memoize, wouldn't that also instantiate a new cache on each request without ever killing the background threads?