Background
Recently, after a major release, the memory usage of an online Python API service rose sharply: within a few hours of a restart, the machine would trigger 90% memory-consumption alarms. Analysis traced the problem to a memory leak caused by improper use of a local cache. This post records the analysis process.
Problem analysis
LocalCache Implementation Analysis
An approximate implementation of the cache is as follows (attribute names such as `self.weak` and `self.strong` are reconstructed from context):

```python
import time
import weakref
from collections import deque


class LocalCache():
    notFound = object()  # unique sentinel returned on a cache miss

    # list, dict, etc. do not support weak references themselves,
    # but their subclasses do, hence this wrapper class
    class Dict(dict):
        def __del__(self):
            pass

    def __init__(self, maxlen=10):
        # maxlen specifies the maximum number of objects to cache
        self.weak = weakref.WeakValueDictionary()  # weak references to cached dicts
        self.strong = deque(maxlen=maxlen)         # strong references to cached dicts

    # Look the key up in the cache dict; return notFound if it
    # has expired or does not exist
    def get_ex(self, key):
        value = self.weak.get(key, self.notFound)
        if value is not self.notFound:
            expire = value['expire']
            if time.time() > expire:
                return self.notFound
            else:
                return value['result']
        return self.notFound

    # Store the kv pair in the cache dict with an expiration time (in seconds)
    def set_ex(self, key, value, expire):
        self.weak[key] = strongRef = LocalCache.Dict(
            {'result': value, 'expire': time.time() + expire})
        self.strong.append(strongRef)
```
As the code above shows, the core of LocalCache is an object that stores weak references plus a deque that stores strong references (for an introduction to weak and strong references in Python, see the article "Exploring weak references and base type support in Python"). The maximum number of cached objects can be specified when LocalCache is instantiated. Use set_ex to set a new cache kv pair and get_ex to fetch the object for a given key; notFound is returned if the key does not exist or has expired.
When maxlen is reached, the deque removes queue elements in FIFO order, and once all strong references to an object are gone, the WeakValueDictionary guarantees that the corresponding weak entry is also removed from the dict. This yields a simple local cache that supports both an expiration time and a maximum number of cached objects.
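The interplay between the deque and the WeakValueDictionary can be seen in a minimal sketch: once the deque evicts its strong reference, the weak entry vanishes on its own (CPython's reference counting makes this immediate).

```python
import weakref
from collections import deque


class Dict(dict):
    """dict subclass so instances can be weakly referenced."""


weak = weakref.WeakValueDictionary()
strong = deque(maxlen=2)  # keep strong references to only 2 entries

for key in ('a', 'b', 'c'):
    entry = Dict(result=key.upper())
    weak[key] = entry
    strong.append(entry)  # appending the 3rd entry evicts the 1st

# 'a' was evicted from the deque; with no strong reference left,
# its weak entry disappears from the dictionary as well
print(sorted(weak.keys()))  # → ['b', 'c']
```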
Flawed Evaluation of LocalCache Memory Usage
According to the LocalCache principle above, in theory, as long as a reasonable expiration time and maxlen are set, memory usage should stay within bounds. However, the new release added two LocalCaches similar to the following:
```python
id_local_cache0 = LocalCache(500000)
id_local_cache1 = LocalCache(500000)
id_local_cache0.set_ex('user_id_012345678901', 'display_id_ABCDEFGH', 1800)
id_local_cache1.set_ex('display_id_ABCDEFGH', 'user_id_012345678901', 1800)
```

This defines two caches of 500,000 entries each, used within the business to cache the mapping between a user_id and the display_id visible in the user's app. The mapping is fixed when the user is created, so a long expiration time can be set. If the number of simultaneously valid objects exceeds maxlen, the LocalCache effectively becomes a fixed-size FIFO cache, and object release can rely entirely on the deque's FIFO elimination mechanism.
The following factors were considered when its memory footprint was first evaluated:
- A single kv pair: user_id up to 20 bytes, display_id up to 8 bytes, plus 8 bytes for the float expiration-time field, for a total of 20 + 8 + 8 = 36 bytes; with some extra overhead, at most 100 bytes
- Memory usage at the 500,000-entry limit: 500000 * 100 / 1024 / 1024 ≈ 47.6MB
- The online API service runs as a multi-process uWSGI deployment with 4 worker processes per machine, so the total is 47.6MB * 4 ≈ 190MB
- Memory used by the two LocalCaches: 190MB * 2 = 380MB
By this calculation, even if every process filled both caches with 500,000 objects, memory usage would grow by less than 400MB; moreover, the number of simultaneously valid cached objects was estimated to be far below 500,000, so the remaining memory seemed more than sufficient. In reality, this estimate was far smaller than the actual value.
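The initial (flawed) arithmetic can be reproduced directly; the 100 bytes/entry figure is the assumption that later turned out to be far too low:

```python
bytes_per_entry = 100   # assumed per-entry overhead (too low, as shown later)
maxlen = 500_000        # cache capacity
workers = 4             # uWSGI worker processes per machine
caches = 2              # two LocalCache instances

per_cache_mb = maxlen * bytes_per_entry / 1024 / 1024
total_mb = per_cache_mb * workers * caches
print(round(per_cache_mb, 1), round(total_mb))  # ≈ 47.7 MB per cache, ≈ 381 MB total
```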
Proper Evaluation of LocalCache Memory Usage
After the memory problem appeared online, I used tracemalloc to analyze the memory allocation of the running service and found that a large portion of memory was concentrated in the LocalCache code. Re-evaluating the memory footprint against reality revealed the following problems:
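The tracemalloc investigation can be sketched as follows; the dict comprehension is a stand-in for the real suspected code path, which in production would be the cache-population calls:

```python
import tracemalloc

tracemalloc.start(25)  # record allocation tracebacks up to 25 frames deep

# ... exercise the suspected code path; here a stand-in allocation ...
data = {str(i): 'x' * 100 for i in range(10_000)}

snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics('lineno')
for stat in top[:5]:
    print(stat)  # largest allocation sites by cumulative size
```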
The memory footprint of str and float was misjudged: even a str whose len is only 10 characters occupies far more than 10 bytes, and a float occupies not 8 but 24 bytes, as sys.getsizeof shows:

```python
In [20]: len('0123456789')
Out[20]: 10

In [21]: sys.getsizeof('0123456789')
Out[21]: 59

In [23]: sys.getsizeof(time.time())
Out[23]: 24
```
Even an empty dict occupies 64 bytes, and once a kv pair like the cached entry is stored it quickly grows to at least 232 bytes:

```python
In [24]: sys.getsizeof({})
Out[24]: 64

In [26]: sys.getsizeof({'result': {'user_id_012345678901': 'display_id_ABCDEFGH'}, 'expire': time.time()})
Out[26]: 232
```
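Note also that sys.getsizeof is shallow: the 232 bytes above exclude the keys, values, and nested dict. A rough recursive helper (a sketch; exact numbers vary by Python version) shows the real footprint of one cached entry:

```python
import sys


def deep_sizeof(obj, seen=None):
    """Rough recursive size: shallow size plus contained keys and values."""
    seen = set() if seen is None else seen
    if id(obj) in seen:  # avoid double-counting shared objects
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


entry = {'result': {'user_id_012345678901': 'display_id_ABCDEFGH'},
         'expire': 1700000000.0}
print(deep_sizeof(entry))  # several hundred bytes, far above the 36 first estimated
```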
Leaving expiration time aside, resource reclamation of cached objects depends entirely on the deque dropping its strong references to them: even an object whose expiration time has passed will not be reclaimed as long as the deque still holds it. Eventually the cache fills up to the configured maxlen and occupies its theoretical maximum memory.
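This retention behaviour is easy to observe with the same weak-dict-plus-deque structure: an entry whose expire timestamp is already in the past stays alive as long as the deque holds its strong reference.

```python
import time
import weakref
from collections import deque


class Dict(dict):
    """dict subclass so values can be weakly referenced."""


weak = weakref.WeakValueDictionary()
strong = deque(maxlen=500_000)

entry = Dict(result='display_id_ABCDEFGH', expire=time.time() - 1)  # already expired
weak['user_id_012345678901'] = entry
strong.append(entry)
del entry  # drop the local strong reference

# A get_ex-style lookup would report a miss because expire < now, yet the
# object is still alive: the deque keeps a strong reference until eviction.
print('user_id_012345678901' in weak)  # → True
```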
Combining the points above: although the configured expiration time is short and the number of simultaneously valid objects is far below 500,000, each LocalCache will still eventually fill up with 500,000 objects. The measured average size of an object stored in the LocalCache is 700~800 bytes, so the maximum memory the two caches will certainly reach on a single host becomes: 700 * 500000 * 4 * 2 / 1024 / 1024 ≈ 2670MB, about 2.6GB, roughly 7 times the earlier, incorrect estimate. No wonder the host ran out of memory.
Follow-up treatment
After correctly evaluating the memory footprint, the following principles for LocalCache usage were summarized:
- maxlen should be set to a reasonable value based on actual data, such as 1.1~2.0 times the maximum number of simultaneously valid objects, to prevent large numbers of expired objects from occupying memory for long periods without being released. A review of the online code confirmed several LocalCaches whose maxlen was 5~10 times the maximum number of valid objects.
- Cache large and small objects separately: for small objects of a few hundred bytes, a maxlen of 1,000, 10,000 or even 100,000 is reasonable, but for objects of a few MB to a dozen MB, a maxlen above 100 may already consume a great deal of memory.
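Beyond tuning maxlen, one possible variant (a sketch, not the fix actually deployed) is to purge already-expired entries from the head of the deque on each write; since insertion is FIFO and, with uniform TTLs, the oldest entries expire first, this frees expired objects without waiting for capacity eviction:

```python
import time
import weakref
from collections import deque


class LocalCache:
    notFound = object()

    class Dict(dict):
        """dict subclass so values can be weakly referenced."""

    def __init__(self, maxlen=10):
        self.weak = weakref.WeakValueDictionary()
        self.strong = deque(maxlen=maxlen)

    def set_ex(self, key, value, expire):
        now = time.time()
        # Proactive purge: drop strong references to entries at the head
        # of the queue that have already expired, instead of waiting for
        # FIFO eviction at maxlen. Assumes roughly uniform TTLs.
        while self.strong and self.strong[0]['expire'] <= now:
            self.strong.popleft()
        self.weak[key] = ref = LocalCache.Dict(result=value, expire=now + expire)
        self.strong.append(ref)

    def get_ex(self, key):
        value = self.weak.get(key, self.notFound)
        if value is self.notFound or value['expire'] < time.time():
            return self.notFound
        return value['result']
```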
After optimizing the multiple LocalCaches used by the API service according to these principles, its total memory usage dropped by more than 3GB.
Summary
When cache memory usage was evaluated for the first version, the estimate was made by intuition, without measuring the actual sizes of the types and objects involved, so the estimated value came out far below the actual value.
The object-reclamation mechanism of LocalCache was not understood in depth: it was taken for granted that an object would be reclaimed once its expiration time passed, without realizing that reclamation depends entirely on the deque.
Another problem caused by taking things for granted.
This concludes the analysis and resolution of the memory leak caused by improper use of a local cache in Python.