Django caching a large list


Question


My Django application deals with 25 MB binary files. Each one contains about 100,000 "records" of 256 bytes.

It takes about 7 seconds to read a binary file from disk and decode it using Python's struct module. I turn the data into a list of about 100,000 items, where each item is a dictionary with values of various types (float, string, etc.).
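
For reference, here is a minimal sketch of the kind of decode loop involved; the record layout, field names, and format string are assumptions for illustration, not my actual format:

import struct

RECORD_SIZE = 256
# Hypothetical 256-byte layout: one double, a 32-byte string, 216 pad bytes.
RECORD_FORMAT = "<d32s216x"

def load_records(path):
    records = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                break
            value, name = struct.unpack(RECORD_FORMAT, chunk)
            records.append({
                "value": value,
                "name": name.rstrip(b"\x00").decode("ascii", "replace"),
            })
    return records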

My Django views need to search through this list. Clearly 7 seconds is too long.

I've tried using Django's low-level caching API to cache the whole list, but that won't work because there's a maximum size limit of 1 MB for any single cached item. I've tried caching the 100,000 list items individually, but that takes far longer than 7 seconds; most of the time is spent unpickling the items.
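
For reference, the low-level calls I mean look roughly like this (the cache key is just an example):

from django.core.cache import cache

# With memcached's default 1 MB item limit, this set() is silently rejected,
# so the later get() returns None.
cache.set("binary_file_records", records, 60 * 60)
cached_records = cache.get("binary_file_records")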

Is there a convenient way to store a large list in memory between requests? Can you think of another way to cache the object for use by my Django app?


Answer 1:


Edit the item size limit to be 10m (larger than the default 1m): add

-I 10m

to /etc/memcached.conf and restart memcached.
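
For example, assuming a Debian/Ubuntu-style /etc/memcached.conf and init script:

# /etc/memcached.conf
# Raise the maximum item size from the default 1m to 10m.
-I 10m

then restart with something like sudo service memcached restart.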

Also edit the MemcachedCache class in memcached.py (located in /usr/lib/python2.7/dist-packages/django/core/cache/backends) to look like this:

class MemcachedCache(BaseMemcachedCache):
    "An implementation of a cache binding using python-memcached"
    def __init__(self, server, params):
        import memcache
        memcache.SERVER_MAX_VALUE_LENGTH = 1024 * 1024 * 10  # raise the limit to accept 10 MB values
        super(MemcachedCache, self).__init__(server, params,
                                             library=memcache,
                                             value_not_found_exception=ValueError)



Answer 2:


I'm not able to add comments yet, but I wanted to share my quick fix for this, since I ran into the same issue of python-memcached behaving strangely when you change SERVER_MAX_VALUE_LENGTH at import time.

Besides the __init__ edit that FizxMike suggests, you can also override the _cache property in the same class. That way you instantiate the python-memcached Client yourself and pass server_max_value_length explicitly, like this:

from django.core.cache.backends.memcached import BaseMemcachedCache

DEFAULT_MAX_VALUE_LENGTH = 1024 * 1024

class MemcachedCache(BaseMemcachedCache):
    def __init__(self, server, params):
        # options from settings CACHES[alias]['OPTIONS']
        self._options = params.get("OPTIONS", {})
        import memcache
        memcache.SERVER_MAX_VALUE_LENGTH = self._options.get('SERVER_MAX_VALUE_LENGTH', DEFAULT_MAX_VALUE_LENGTH)

        super(MemcachedCache, self).__init__(server, params,
                                             library=memcache,
                                             value_not_found_exception=ValueError)

    @property
    def _cache(self):
        if getattr(self, '_client', None) is None:
            server_max_value_length = self._options.get("SERVER_MAX_VALUE_LENGTH", DEFAULT_MAX_VALUE_LENGTH)
            #one could optionally send more parameters here through the options settings,
            #I simplified here for brevity
            self._client = self._lib.Client(self._servers,
                server_max_value_length=server_max_value_length)

        return self._client

I also prefer to create a separate backend that inherits from BaseMemcachedCache and point Django at it, instead of editing Django's own code.
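
A minimal sketch of wiring such a subclass up in settings.py; the module path myproject.cache_backends and the 10 MB value are assumptions for illustration:

# settings.py
CACHES = {
    "default": {
        # Points at the MemcachedCache subclass above, saved e.g. in
        # myproject/cache_backends.py (hypothetical path).
        "BACKEND": "myproject.cache_backends.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",
        "OPTIONS": {
            "SERVER_MAX_VALUE_LENGTH": 1024 * 1024 * 10,  # 10 MB
        },
    },
}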

Here's the Django memcached backend module for reference: https://github.com/django/django/blob/master/django/core/cache/backends/memcached.py

Thanks for all the help on this thread!



Source: https://stackoverflow.com/questions/11874377/django-caching-a-large-list
