
django-cacheback's Introduction

About me

I'm the Head of Software Engineering at Octopus Energy.

I used to maintain several open-source projects but I do less open-source work these days. I'm the original author of django-oscar although I'm not active in the project any more.


django-cacheback's People

Contributors

alanjds, coagulant, codeinthehole, fjsj, geekfish, jacobh, jezdez, kennethlove, kevin-brown, kobold, lovemyliwu, lpomfrey, lukaslundgren, lukmdo, martinblech, mdomans, michaelkuty, mrgeislinger, mudetz, pomali, redsnapper8t8, stephrdev, svetlyak40wt, thedrow, thijstriemstra, thisisstephenbetts, tomwys, walison17


django-cacheback's Issues

Django 1.7 Support

Are you aware of any reasons that this wouldn't work with Django 1.7? When installing into a Django 1.7 environment, I noticed that it was intentionally scoped to 1.6.

Was really stoked to find this package. It's a terrific design pattern.

Consider adding support for other caches than default

Currently cacheback uses the default Django cache, with no way to use any alias other than default.

This would also be helpful for testing, where some tests are fine with Django's backends.dummy.DummyCache whereas others need something like backends.locmem.LocMemCache. My suggestion is to keep the existing behaviour as the default, but add a cache parameter to the cacheback.base.Job constructor (a rough sketch follows below).

@codeinthehole curious to hear your feedback
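
A minimal sketch of what that could look like, assuming Django's caches registry; the class and parameter names below are illustrative, not existing cacheback API:

# Sketch only: illustrative names, not existing cacheback API.
from django.core.cache import caches

from cacheback.base import Job


class AliasAwareJob(Job):
    # Which entry from settings.CACHES to use; 'default' keeps today's behaviour.
    cache_alias = 'default'

    def __init__(self, cache=None):
        super(AliasAwareJob, self).__init__()
        if cache is not None:
            self.cache_alias = cache
        # Swap in the aliased backend (e.g. a locmem or dummy cache in tests).
        self.cache = caches[self.cache_alias]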

@cacheback decorator and FunctionJob are incompatible with Celery's JSON serializer

This took quite some time to debug. I was using the JSON task serializer for my celery instance, and I observed that the cached data were never refreshed after the initial fetch.

It turns out that serializing tasks as JSON forces all of the strings to unicode. The FunctionJob produces the bytestring names of the module and function to run, and the base Job uses the hash of the arguments to create the cache key. The celery task received the unicode u'mymodule:myfunction' and it was hashing to a different value than the original bytes 'mymodule:myfunction'. This was causing the celery task to place the updated data into a different key than the webserver had originally placed the data after the first sync.
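
A minimal sketch of the mismatch, assuming (as described above) that the cache key is derived from a textual representation of the call arguments; the helper below is illustrative, not cacheback's actual key function:

import hashlib


def key_for(call_args):
    # Illustrative key scheme: digest a textual representation of the args.
    return hashlib.md5(repr(call_args).encode('utf-8')).hexdigest()


web_key = key_for((b'mymodule:myfunction',))     # bytestring, as built by FunctionJob
worker_key = key_for((u'mymodule:myfunction',))  # unicode, after the JSON round-trip

assert web_key != worker_key  # the refreshed data lands under a different key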

Support for Django 3.x?

django-cacheback is currently unusable with Django 3.x as cacheback uses django.utils.six which no longer exists. Any plans or ETA for a new release that supports this newer Django?

Generating cache keys doesn't work in python 3

Cache keys are generated using the builtin hash method. Python 3.2 introduced hash randomization: each process salts hashed strings with a random value. Therefore the key generated by the Celery worker is different from the one generated by the web worker.
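
A small sketch of the issue and one possible remedy: the built-in hash() is salted per process, whereas a digest over a deterministic representation of the arguments is stable across the web and Celery processes (illustrative only, not the project's actual fix):

import hashlib

args = ('latest-tweets', 'some_user')

# Built-in hash(): salted per process (see PYTHONHASHSEED), so the web worker
# and the Celery worker will generally compute different values for the same args.
process_local = hash(args)

# Stable alternative: digest a deterministic textual representation instead.
stable = hashlib.md5(repr(args).encode('utf-8')).hexdigest()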

`cache_alias` option is not respected in FunctionJob and cacheback decorator

Just that: if you pass the cache_alias option to a FunctionJob (and hence to the cacheback decorator), the resulting function won't respect it. The reason is that FunctionJob's super __init__ is called before self.cache_alias is set:

super(FunctionJob, self).__init__()
if lifetime is not None:
    self.lifetime = int(lifetime)
if fetch_on_miss is not None:
    self.fetch_on_miss = fetch_on_miss
if cache_alias is not None:
    self.cache_alias = cache_alias

But of course, FunctionJob's super __init__ depends on self.cache_alias, so it's too late to influence the initialization of the cache.

I'm actually responsible for this bug, so I'll also submit a PR. πŸ™‚
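
A sketch of the ordering fix being described, reordering the snippet above so the attributes are set before the base __init__ reads them (the actual PR may differ):

from cacheback.base import Job


class FunctionJob(Job):
    def __init__(self, lifetime=None, fetch_on_miss=None, cache_alias=None):
        # Set everything the base __init__ depends on *before* delegating to it,
        # so the correct cache backend is selected.
        if lifetime is not None:
            self.lifetime = int(lifetime)
        if fetch_on_miss is not None:
            self.fetch_on_miss = fetch_on_miss
        if cache_alias is not None:
            self.cache_alias = cache_alias
        super(FunctionJob, self).__init__()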

Refactor test suite to use py.test

I think moving the test suite to py.test would make it easier to integrate things like pep8, flakes and isort validation. In addition, coverage reporting could be added to improve the test suite in the long run.

Would it be worth working on it?

Scheduler for cache updates

I want the ability to schedule the cache update tasks.
For instance, as in the provided example, I want to always have the most recent tweets for a specific user, without the initial timeout of requesting the data the first time.
Do you think it would be a lot of work to create a management command in the django-cacheback app which could be started via cron every X minutes to kick off all available cache update tasks asynchronously?
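
A rough sketch of what such a management command could look like; the job registry below is app-specific and hand-maintained, and the only cacheback API assumed is Job.async_refresh(), called with the same arguments you would pass to get():

# myapp/management/commands/refresh_cacheback_jobs.py  (sketch only)
from django.core.management.base import BaseCommand

from myapp.jobs import UserTweetsJob  # hypothetical Job subclass

# (job class, fetch arguments) pairs this app wants kept warm.
JOBS_TO_WARM = [
    (UserTweetsJob, ('some_user',)),
]


class Command(BaseCommand):
    help = "Kick off async refreshes for all registered cacheback jobs."

    def handle(self, *args, **options):
        for job_class, job_args in JOBS_TO_WARM:
            # Queues the refresh task without blocking, so a cron entry
            # every X minutes keeps the cached items warm.
            job_class().async_refresh(*job_args)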

Possibility to get into perpetual "DEAD cache hit" - stale result state

I'm not positive this is a bug, but it's happened several times so far in development and I'm worried it might happen in production. Basically, something causes Celery to choke and it discards or loses the task to update the cache. From then on, the Job will perpetually return the stale result ("DEAD cache hit") until the cache is cleared.

I'm not familiar enough with the framework to know what the alternative would be, but it seems problematic: the "DEAD cache hit" message is only logged at DEBUG level, so one would be unlikely to realise this had happened. Thus it seems feasible that if Celery (or the task) chokes on something and loses or intentionally discards a cacheback task, the app would never refresh the Job until the cache item gets pushed out of memory (which could be never).

Assumption that cached object has "length"

I am using a cacheback Job in conjunction with a custom object. For this object, len(object) is meaningless and fails.

The fact that len(object) fails causes the refresh_cache method in the cacheback task to fail when it tries to log info about the successful fetch:

   logger.info("Fetched %s item%s in %.6f seconds", len(data),
                    's' if len(data) > 1 else '', duration)

Nothing in the docs that I can find indicates that Jobs have to operate on dict or list-like objects with length, and it seems to be working on my object, other than the failed logging.

Great app, btw. Should be a huge help.

Task cacheback.tasks.refresh_cache[1ba3f95a-4cd2-464b-9932-6e122da41cfd] raised exception: TypeError("object of type 'MenuMonth' has no len()",)
Traceback (most recent call last):
  File "/Users/ben/Envs/nutripublisher/lib/python2.7/site-packages/celery/task/trace.py", line 224, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/Users/ben/Envs/nutripublisher/lib/python2.7/site-packages/celery/task/trace.py", line 406, in __protected_call__
    return self.run(*args, **kwargs)
  File "/Users/ben/Envs/nutripublisher/lib/python2.7/site-packages/cacheback/tasks.py", line 46, in refresh_cache
    #        logger.info("Fetched %s item%s in %.6f seconds", len(data),
TypeError: object of type 'MenuMonth' has no len()
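
One defensive way the logging could be made length-tolerant (a sketch, not the project's actual patch):

import logging

logger = logging.getLogger('cacheback')


def log_fetch(data, duration):
    # Only report an item count when the fetched object actually supports len().
    try:
        count = len(data)
    except TypeError:
        logger.info("Fetched item in %.6f seconds", duration)
    else:
        logger.info("Fetched %s item%s in %.6f seconds",
                    count, 's' if count > 1 else '', duration)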

Invalidate job when using the decorator

There doesn't appear to be any way to manually invalidate the cache when using the cacheback decorator. It would be nice if the decorator also exposed an invalidate() method; something like this:

def _wrapper(fn):
    ...
    def __wrapper(*args, **kwargs):
        ...
    ...
    __wrapper.invalidate = lambda: job.invalidate()
    return __wrapper

(not sure if this will actually work).

BTW I think your caching scheme is awesome; thanks for releasing this package!

the WSGIRequest pickled failed when I use the decorator

I am using cacheback via the decorator, and I wrote my own key function in line with our business logic:

import hashlib

from django.utils.encoding import force_bytes, iri_to_uri

from cacheback.jobs import FunctionJob  # import path may vary between cacheback versions


class SthApiCacheJob(FunctionJob):
    def key(self, *args, **kwargs):
        """
        Return the cache key to use.
        Overrides Job's key.
        """
        prefix = "sth.api"
        if args and len(args) > 2 and hasattr(args[1], 'build_absolute_uri'):
            request = args[1]
            url = hashlib.md5(force_bytes(iri_to_uri(request.build_absolute_uri())))
            key = "%s.%s:%s" % (prefix, self.class_path, url.hexdigest())
            print "new unique key -------> %s" % (key)
            return key
        else:
            raise RuntimeError("Unable to generate unique cache key")

here is the use case:

@cacheback(lifetime=60, fetch_on_miss=True, job_class=SthApiCacheJob)
def v1_dep(request, user=None):
    size = parse_int(request.GET.get('size'), default=5)
    if size <= 0:
        return api_error('invalid_input', _('size is invalid'))
    # sth code....
    return {'a': a, 'b': b}

I got this error when running it:

sth.api.api_sth:v1_dep', <WSGIRequest: GET '/sth/api/v1/dep'>, <SimpleLazyObject: <function <lambda> at 0x10a2db5f0>>), {})
new unique key -------> sth.api.sth.helper.SthApiCacheJob:441fa73b67ae49a151a364a57e00b3b2
2016-05-17 10:01:19,808 cacheback [DEBUG] Job sth.helper.SthApiCacheJob with key 'sth.api.sth.helper.SthApiCacheJob:441fa73b67ae49a151a364a57e00b3b2' - STALE cache hit - triggering async refresh and returning stale result
--before run async_refresh print the call_args-----> type({'kwargs': {'klass_str': 'sth.helper.SthApiCacheJob', 'call_args': ('sth.api.api_sth:v1_dep', <WSGIRequest: GET '/sth/api/v1/dep'>, <SimpleLazyObject: <function <lambda> at 0x10a2db5f0>>), 'obj_kwargs': {'lifetime': 60}, 'obj_args': (), 'call_kwargs': {}}}) is <type 'dict'>
2016-05-17 10:01:19,815 cacheback [ERROR] Unable to trigger task asynchronously - failing over to synchronous refresh
Traceback (most recent call last):
  File "/Users/user/work/ENV/lib/python2.7/site-packages/cacheback/base.py", line 250, in async_refresh
    **self.task_options
  File "/Users/user/work/ENV/lib/python2.7/site-packages/cacheback/utils.py", line 76, in enqueue_task
    return celery_refresh_cache.apply_async(*args, **kwargs)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/celery/app/task.py", line 565, in apply_async
    **dict(self._get_exec_options(), **options)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/celery/app/base.py", line 354, in send_task
    reply_to=reply_to or self.oid, **options
  File "/Users/user/work/ENV/lib/python2.7/site-packages/celery/app/amqp.py", line 310, in publish_task
    **kwargs
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/messaging.py", line 165, in publish
    compression, headers)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/messaging.py", line 241, in _prepare
    body) = dumps(body, serializer=serializer)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/serialization.py", line 164, in dumps
    payload = encoder(data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/serialization.py", line 59, in _reraise_errors
    reraise(wrapper, wrapper(exc), sys.exc_info()[2])
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/serialization.py", line 55, in _reraise_errors
    yield
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/serialization.py", line 164, in dumps
    payload = encoder(data)
  File "/Users/user/work/ENV/lib/python2.7/site-packages/kombu/serialization.py", line 356, in pickle_dumps
    return dumper(obj, protocol=pickle_protocol)
EncodeError: Can't pickle <type 'cStringIO.StringO'>: attribute lookup cStringIO.StringO failed
new unique key -------> sth.api.sth.helper.SthApiCacheJob:441fa73b67ae49a151a364a57e00b3b2

my django version: 1.9.5;

I looked at the code and tried to write my own prepare_args function, but that approach conflicts with my key function.

Please help me. What can I do to solve the problem?
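
One way out (a sketch, reusing the helper names from the snippet above): move the cache boundary below the view, so that only picklable primitives reach the job instead of the WSGIRequest and the lazy user object:

# Sketch only: parse_int, api_error and _ are the helpers from the snippet above;
# the cacheback import path may differ between versions.
from cacheback.decorators import cacheback


@cacheback(lifetime=60, fetch_on_miss=True)
def fetch_dep_data(request_uri, size):
    # Only simple, serialisable values cross the Celery boundary here.
    # ... the expensive work previously done inside v1_dep ...
    return {'a': None, 'b': None}


def v1_dep(request, user=None):
    size = parse_int(request.GET.get('size'), default=5)
    if size <= 0:
        return api_error('invalid_input', _('size is invalid'))
    return fetch_dep_data(request.build_absolute_uri(), size)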

Error on readme

In the documentation, you say the lifetime default value is "600 seconds (5 minutes)" ... which does not make sense xD
Is it 600 seconds (10 minutes) or 300 seconds (5 minutes)?

Support for @method_decorator in Python 3.5 and later

At present using this inside a class based view:

@method_decorator(cacheback(lifetime=60))
def get_context_data(...):
    ...

Causes an error: AttributeError: 'functools.partial' object has no attribute '__module__'

The error arises in jobs.prepare_args, which uses ("%s:%s" % (fn.__module__, fn.__name__),) + args to serialise the function for the queue. The suggestion here: https://stackoverflow.com/questions/20594193/dynamically-created-method-and-decorator-got-error-functools-partial-object-h

is that in Python 3.5 and later a reference to the original function is maintained on the partial; you can access it as .func.
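
A sketch of that suggestion, unwrapping functools.partial objects before building the module:name string (illustrative, not necessarily how the project fixed it):

import functools


def describe_callable(fn):
    # method_decorator() hands prepare_args a functools.partial in Python 3.5+,
    # which has no __module__ / __name__; the wrapped function sits on .func.
    while isinstance(fn, functools.partial):
        fn = fn.func
    return "%s:%s" % (fn.__module__, fn.__name__)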

Informer template tag with cacheback

Example usage of informer_tag

In templatetags

@informer_tag(register, 'training/tags/event_tizer.html', 10)
def event_tizer(object_id):
    instance = Event.objects.get(pk=object_id)
    return {
        'object': instance
    }

In django template

{% event_tizer 1000 %}

Template tag code

# -*- coding: utf-8 -*-
from inspect import getargspec
from django.conf import settings
from django import template
from django.template.base import parse_bits
from django.template import loader
from django.utils import importlib
from cacheback.base import Job


class InformerJob(Job):

    def __init__(self, fn_string, template_name, lifetime=None, fetch_on_miss=None, cache_ttl=None):
        self.fn_string = fn_string
        self.template_name = template_name

        if lifetime is not None:
            self.lifetime = int(lifetime)

        if fetch_on_miss is not None:
            self.fetch_on_miss = fetch_on_miss

        if cache_ttl is not None:
            self.cache_ttl = cache_ttl

    def get_context_klass(self):
        return template.Context({
            'MEDIA_URL'   : settings.MEDIA_URL,
            'STATIC_URL'  : settings.STATIC_URL,
        })

    def get_constructor_args(self):
        return (self.fn_string, self.template_name)

    def get_constructor_kwargs(self):
        return { 'lifetime': self.lifetime, 'cache_ttl': self.cache_ttl }

    def fetch(self, *args, **kwargs):
        module_path, fn_name = self.fn_string.split(":")
        module = importlib.import_module(module_path)
        fn = getattr(module, fn_name)

        context = self.get_context_klass()
        context.update(fn(*args, **kwargs))
        t = loader.get_template(self.template_name)
        return t.render(context)

    def key(self, *args, **kwargs):
        key = super(InformerJob, self).key(self.template_name, *args, **kwargs)
        return '%s:%s' % (self.fn_string, key)



def informer_tag(register, template_name, lifetime=None, fetch_on_miss=None, cache_ttl=None, name=None):
    def decorator(func):
        params, varargs, varkw, defaults = getargspec(func)
        function_name = (name or getattr(func, '_decorated_function', func).__name__)

        class InformerNode(template.Node):
            def __init__(self, fn_string, template_name, *args, **kwargs):
                self.job_class = InformerJob(fn_string, template_name, lifetime, fetch_on_miss, cache_ttl)
                self.args = args
                self.kwargs = kwargs

            def get_resolved_arguments(self, context):
                resolved_args = [var.resolve(context) for var in self.args]
                resolved_kwargs = dict((k, v.resolve(context)) for k, v in self.kwargs.items())
                return resolved_args, resolved_kwargs

            def render(self, context):
                resolved_args, resolved_kwargs = self.get_resolved_arguments(context)
                return self.job_class.get(*resolved_args, **resolved_kwargs)

        def wrapper(parser, token):
            bits = token.split_contents()[1:]
            args, kwargs = parse_bits(parser, bits, params, varargs, varkw, defaults, False, function_name)
            return InformerNode('%s:%s' % (func.__module__, func.__name__), template_name, *args, **kwargs)

        wrapper.__doc__ = func.__doc__
        register.tag(function_name, wrapper)
        return func
    return decorator

Failure to update the cached value when async_refresh is called

I'm seeing a situation where my cached values are not being updated. I'm running 0.3

The only way I've found so far to get the cached value to update is to open up a shell, create an instance of my Job subclass and call its refresh() method. Calling the async_refresh() method fails to update the cached value. Unfortunately I'm only seeing this issue on a live server, so it's a little tricky getting debug information out.

Confusing results of to_bytestring method

Cacheback uses this code for the batteries-included version of FunctionJob.

Because it calls str() on the args, it can lead to wrong results if the value of that call is the same for two different objects.

An example of such a case is Django model instances, which can have the same str() return value for two different objects.

Therefore I suggest we either fix this (preferable solution) or expand the docs to document this behaviour.

I'm happy to contribute code on this via a PR - just wanted to touch base first and discuss the direction.
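
As a starting point for that discussion, here is one way a Job subclass can sidestep the ambiguity today (a sketch, not the proposed fix): map model instances to an unambiguous identity before the default key machinery sees them.

from cacheback.base import Job


class ModelAwareJob(Job):
    def key(self, *args, **kwargs):
        # Two instances that happen to share a str() value can no longer collide,
        # because the key is built from app label, model name and primary key.
        safe_args = tuple(
            '%s.%s:%s' % (a._meta.app_label, a._meta.model_name, a.pk)
            if hasattr(a, '_meta') else a
            for a in args
        )
        return super(ModelAwareJob, self).key(*safe_args, **kwargs)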

Wrong key generated from kwargs

This is a major bug!

cacheback does the following to generate the key for the cache:
self.hash(tuple(kwargs.keys()))

Dict keys are unordered in Python, and building the tuple in a different order produces a different hash and hence a different cache key.
This causes random cache misses.
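
A minimal sketch of the deterministic alternative: hash the kwargs as sorted (key, value) pairs so that dict ordering cannot change the cache key (illustrative, not the project's actual patch):

import hashlib


def stable_kwargs_digest(kwargs):
    # Sorting by key name makes {'a': 1, 'b': 2} and {'b': 2, 'a': 1} equivalent.
    items = tuple(sorted(kwargs.items()))
    return hashlib.md5(repr(items).encode('utf-8')).hexdigest()


assert stable_kwargs_digest({'a': 1, 'b': 2}) == stable_kwargs_digest({'b': 2, 'a': 1})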

What's missing to support Django 1.10?

Hi, this lib is awesome! I want to use it in a Django 1.10 project, but setup.py restricts support to Django < 1.10. Is there any major incompatibility with 1.10?

Cacheback templatetag

I would like a general-purpose cacheback template tag that works the same as the built-in cache template tag, except that it uses cacheback to update the cache asynchronously. I started a branch to implement such a tag, but there is a major problem: serializing the context to send to Celery for processing. I cannot get it to serialize with cPickle at all, and with regular pickle it took ~30 seconds, which is clearly unacceptable and would require some sort of custom serializer to use pickle instead of cPickle. Does anybody have any ideas how we can serialize the request context for Celery in a reasonable amount of time?

djcelery is a redundant dependency

Newer Celery versions do not need djcelery for Django integration. I recommend removing this dependency.

I'm trying to work on a patch for this, though it isn't clear how the tests cover this stuff, if at all.

'QuerySetFilterJob' object has no attribute 'filter'

I use the example

staff = QuerySetFilterJob(models.User).filter(is_staff=True)

When I executed the above code, I got this error:
'QuerySetFilterJob' object has no attribute 'filter'
I think it should be:

QuerySetFilterJob(models.User).get(is_staff=True)

django-rq Support

I'm working on supporting both Celery and django-rq.
Is it worth doing a pull request, or is there no interest in having django-rq support?

Newbie problems -- would appreciate help!

Hey there, thank you very much for this library!

I'm pretty sure I'm just not using it right: I'm trying to update my top/latest/etc. stories on my front page with an async event.

I just got this result on my celery daemon by running
python manage.py celeryd -l DEBUG

[2012-12-15 14:54:03,707: INFO/MainProcess] Refreshed cache in 22.853967 seconds
[2012-12-15 14:54:14,371: INFO/MainProcess] Task cacheback.tasks.refresh_cache[9a9a5f09-23d3-4ff5-8bf1-c132a9098f34] succeeded in 33.6776080132s: None

Here's some relevant view code + update functions

def fetch_cached_index(page, section, flags, order_by):
    cache_key = 'index.%s.%s.%s.%s' % (page, section, flags, order_by)

    item = cache.get(cache_key)

    if item is None:
        # Scenario 1: Cache miss - return empty result set and trigger a refresh
        filtered_objects = update_cached_index(page, section, flags, order_by)

    else:
        # item is tuple like so (data, expire time)
        filtered_objects, expiry = item

        print 'expiry, ', expiry
        print 'now, ', datetime.now()

        if expiry < datetime.now():
            print 'updating ze cache'

            # Scenario 2: Cached item is stale - return it but trigger a refresh
            async_updater = cacheback(lifetime=15)(update_cached_index)
            async_updater(page, section, flags, order_by)

    return filtered_objects

def update_cached_index(page, section, flags, order_by):

    print 'async updatin??'

    cache_key = 'index.%s.%s.%s.%s' % (page, section, flags, order_by)

    # Filters
    filtered_objects = Experience.objects.all()
    q = Q()

    if (flags & 1) == 1:
        q |= Q(story__gt=u'')
    if (flags & 2) == 2:
        q |= Q(screenshot__isnull=False)
    if (flags & 4) == 4:
        q |= Q(replay__isnull=False)

    # Section
    if section == 'hawt':
        order_by = '-score'
        #filtered_objects = filtered_objects.filter(date_published__gte=datetime.now()-timedelta(days=3))
    elif section == 'top':
        order_by = '-karma'
    elif section =='new':
        pass

    # Force evaluation for caching
    filtered_objects = list(filtered_objects.select_related().filter(q).order_by(order_by))

    now = datetime.now()

    # Lifetime of 15 seconds
    cache.set(cache_key, (filtered_objects, now + timedelta(seconds=15)), 2592000)

    return filtered_objects

Basically I'm following the main example using cacheback, but one big thing I'm doing differently is synchronously returning a first result instead of an empty set. Am I doing this right? Why did it take 20+ seconds to produce that result when I ran a load test like so:
siege -c20 http://mysite.org

If I just refresh the page it seems OK, dunno what's going on!

Sorry if I'm bothering.

EDIT:
Also, with the code:

        if expiry < datetime.now():
            print 'updating ze cache'

            # Scenario 2: Cached item is stale - return it but trigger a refresh
            async_updater = cacheback(lifetime=15)(update_cached_index)
            async_updater(page, section, flags, order_by)

How does that stop the "cache stampede"? I'll look through the decorator code more, but it seems like that could get called just as many times as the database code itself?!

EDIT:
Any requests for this user's tweets during the period that Celery is refreshing the cache will also return None. However Cacheback is aware of cache stampedes and does not trigger any additional jobs for refreshing the cached item.
I see my celery daemon running many tasks when it expires--I must be doing something wrong :)

Stale-while-error functionality

We use cacheback a lot for async 'always fast' fetching of data that is requested regularly. Example: Our API exposes weather info. This data is fetched from a 3rd party vendor API. Occasionally that API has quirks for a certain (usually short) period, which (within certain boundaries of course) we want to hide by returning stale data.

If I understand correctly, once a request is made outside of lifetime but within cache_ttl, the entry stored in the cache is replaced by one with a TTL of timeout.

So if the request somehow fails, then after 'timeout' the cached entry is gone as well, and the errors become visible.

I'd like to elegantly implement a stale-while-error mechanism in Cacheback.

Reference: Similar functionality in Fastly (basically varnish-as-a-service) and Nginx

Haven't looked into implementation but some first thoughts:

  • Similar to the fetch() method that needs to be implemented on a Job subclass, provide a method that can be implemented to handle errors, e.g. handle_error_while_stale() (see the sketch after this list).
  • If not using a custom Job class, some way to provide a callable that decorates fetch(), catches errors and re-raises something like AllowedWhileStaleError
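
A sketch of the first idea; the hook name, the error types and the stand-in vendor call below are all invented for illustration and are not existing cacheback API:

from cacheback.base import Job


class VendorAPIError(Exception):
    # Stand-in for whatever the 3rd-party weather client raises.
    pass


class AllowedWhileStaleError(Exception):
    # Hypothetical marker: "keep serving the stale cached entry for now".
    pass


def fetch_vendor_weather(city):
    # Stand-in for the real upstream call.
    raise VendorAPIError("vendor having a bad day")


class WeatherJob(Job):
    lifetime = 300      # refresh after 5 minutes
    cache_ttl = 86400   # but keep (stale) data cached for a day

    def fetch(self, city):
        try:
            return fetch_vendor_weather(city)
        except VendorAPIError as exc:
            return self.handle_error_while_stale(exc, city)

    def handle_error_while_stale(self, exc, city):
        # The imagined contract: raising AllowedWhileStaleError would tell the
        # refresh machinery to leave the existing cached entry untouched rather
        # than overwriting it or letting it expire at 'timeout'.
        raise AllowedWhileStaleError(str(exc))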

Do you see any value in this?

Job.delete tries to fetch the item before deleting

Job.delete does

item = self.cache.get(key)
if item is not None:
    self.cache.delete(key)

Why is this the expected behaviour? To me it seems like doing double work just to delete a key; we could just call self.cache.delete(key) and save a network call.
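
For comparison, the one-call version being suggested (a sketch on a Job subclass, not a patch):

from cacheback.base import Job


class OneCallDeleteJob(Job):
    def delete(self, *args, **kwargs):
        # Django cache backends treat deleting a missing key as a no-op, so the
        # preliminary get() round-trip can simply be dropped.
        self.cache.delete(self.key(*args, **kwargs))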
