Redis Series (IV) -- Memory Eviction Mechanism (with Memory Optimization Suggestions for the Stand-alone Edition)

Keywords: Redis encoding less Database

Each Redis server has a limited amount of memory, and not all of it is used to store data. Moreover, Redis does not aggressively optimize its own memory footprint, so the implementers added mechanisms to control memory usage and keep it from becoming saturated.

This article is structured as follows: (1) memory policies; (2) how the memory release mechanism works; (3) how to apply eviction policies sensibly in a project; (4) points to note for memory optimization in stand-alone Redis.

This series:

(1) Redis Series (1) -- Installation, hello world, and understanding the configuration file

(2) Redis Series (2) -- Caching Design (Full Table Caching and Ranking Caching Scheme Implementation)

(3) Redis Series (3) -- Expiration Strategy

1. Memory policies: let's start with the official documentation.

The maximum memory is set with the `maxmemory <bytes>` directive. When the memory currently in use exceeds this limit, memory must be released, and some policy is needed to decide which stored objects to delete. Redis offers six policies (the default is noeviction in modern versions; some very old releases defaulted to volatile-lru).

In Redis, when memory exceeds the limit, key-value pairs are evicted according to the configured policy so that there is room for new data. When Redis decides to evict a key-value pair, it deletes the data and propagates the change both locally (for AOF persistence) and to slaves (over the master-slave link).

(1) volatile-lru: evict the least recently used key from the set of keys with an expiration time set (server.db[i].expires).

(2) volatile-ttl: evict the key closest to expiring from the set of keys with an expiration time set (server.db[i].expires).

(3) volatile-random: evict a random key from the set of keys with an expiration time set (server.db[i].expires).

(4) allkeys-lru: evict the least recently used key from the whole data set (server.db[i].dict).

(5) allkeys-random: evict a random key from the whole data set (server.db[i].dict).

(6) noeviction: never evict data; write commands that need memory fail instead.

There is also a related configuration item, maxmemory-samples (default 3 in older versions). The policies above are implemented as approximate algorithms: neither the LRU nor the TTL policy examines every key in the database, because scanning the whole keyspace would be far too slow when the database is large. Instead, the code samples maxmemory-samples keys and applies the policy to that sample. Details follow below.
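The sampling idea can be illustrated with a toy simulation (Python, not Redis source; the function and data below are hypothetical): instead of scanning every key for the exact LRU victim, pick maxmemory-samples random keys and evict the least recently used among them.

```python
import random

def approx_lru_victim(last_access, samples=3, rng=random):
    """Pick `samples` random keys and return the least recently used
    among them -- the sampling idea behind maxmemory-samples, which
    avoids scanning the whole keyspace for the exact LRU key."""
    candidates = rng.sample(list(last_access), min(samples, len(last_access)))
    return min(candidates, key=lambda k: last_access[k])

# toy keyspace: key "a" was accessed longest ago
last_access = {"a": 10, "b": 50, "c": 90, "d": 70}
victim = approx_lru_victim(last_access, samples=4)
# when the sample covers every key, this degenerates to exact LRU
```

With a small `samples` value the result is only probably the coldest key, which is exactly the accuracy/CPU trade-off maxmemory-samples controls.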

How does the replacement strategy work?

1) A client executes a command that adds data to the database (such as set key value)

2) Redis checks memory usage. If it exceeds maxmemory, some keys are deleted according to the replacement policy.

3) The new command then executes successfully

Note:

If we keep writing data, memory usage will reach or exceed the maxmemory limit, and the replacement policy will then bring usage back below the limit.

If a single operation needs a lot of memory at once (such as writing a large set in one command), Redis's memory usage may exceed the limit for a period of time.
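The three-step flow above can be sketched as a toy write path (Python; `TinyStore` is a made-up stand-in that counts keys instead of bytes and hard-codes the allkeys-random policy):

```python
import random

class TinyStore:
    """Toy cache that evicts a random key before each write once it
    holds `maxkeys` entries -- a stand-in for the maxmemory check that
    Redis performs before executing a command."""

    def __init__(self, maxkeys):
        self.maxkeys = maxkeys
        self.data = {}

    def set(self, key, value):
        # Step 2: free space according to the replacement policy
        # (here: allkeys-random) before the new command runs.
        while len(self.data) >= self.maxkeys and key not in self.data:
            victim = random.choice(list(self.data))
            del self.data[victim]
        # Step 3: the new command executes successfully.
        self.data[key] = value

store = TinyStore(maxkeys=2)
store.set("a", 1)
store.set("b", 2)
store.set("c", 3)   # forces one random eviction first
```

After the third `set`, the store still holds only two keys, and `"c"` is guaranteed to be one of them.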

2. The principle of memory release mechanism:

(1) Overview:

When mem_used exceeds the maxmemory setting, any read or write request triggers the redis.c/freeMemoryIfNeeded(void) function to clear the excess memory. Note that this cleanup blocks until enough memory has been freed, so if maxmemory has been reached and callers keep writing, the active cleanup may be triggered repeatedly, adding latency to requests.

During cleanup, Redis applies the configured maxmemory-policy (usually an LRU or TTL policy). The LRU or TTL policy does not consider all of Redis's keys; it samples maxmemory-samples keys (per the configuration file) as the candidate pool.

In redis-3.0.0, the default maxmemory-samples is 5. Increasing it improves the accuracy of the LRU or TTL approximation; the Redis author's tests show that with a value of 10 the results are very close to exact LRU, but larger values cost more CPU time during active cleanup. The suggestions are:

1) Avoid hitting maxmemory in the first place. Once mem_used reaches a certain proportion of maxmemory, it is better to consider increasing hz to speed up eviction, or to expand the cluster.

2) If you can control memory usage, there is no need to change maxmemory-samples. If the Redis instance is itself an LRU cache service (typically sitting at maxmemory for long periods, relying on Redis's automatic LRU eviction), you can raise maxmemory-samples appropriately.

(2) Memory management source code analysis (adapted from a reference blog):

Redis releases memory in the function freeMemoryIfNeeded. Redis handles each command in the processCommand function, which calls freeMemoryIfNeeded before the command proper is processed. This function checks whether the memory currently in use exceeds the maximum allowed; if it does, memory is released according to the configured eviction policy.

The freeMemoryIfNeeded function first calculates how much memory is currently used; note that the slave output buffers and the AOF buffers are not counted. The source is as follows:

int freeMemoryIfNeeded(void) {
    size_t mem_used, mem_tofree, mem_freed;
    int slaves = listLength(server.slaves);

    /* Remove the size of slaves output buffers and AOF buffer from the
     * count of used memory. 
     */ 
     //The slave output buffers and the AOF buffers are excluded from the used-memory count, so maxmemory should be set below the machine's actual memory to leave room for both buffers.
    mem_used = zmalloc_used_memory();
    if (slaves) {
        listIter li;
        listNode *ln;

        listRewind(server.slaves,&li);
        while((ln = listNext(&li))) {
            redisClient *slave = listNodeValue(ln);
            unsigned long obuf_bytes = getClientOutputBufferMemoryUsage(slave);
            if (obuf_bytes > mem_used)
                mem_used = 0;
            else
                mem_used -= obuf_bytes;
        }
    }
    if (server.appendonly) {
        mem_used -= sdslen(server.aofbuf);
        mem_used -= sdslen(server.bgrewritebuf);
    }
//Check whether used memory exceeds the limit; if not, return REDIS_OK.
    /* Check if we are over the memory limit. */
    if (mem_used <= server.maxmemory) return REDIS_OK;
//If the limit is exceeded, determine which eviction policy is configured and act accordingly.
//(1) First check for the noeviction policy; if it is set, return REDIS_ERR, and redis will no longer accept any write commands.
    if (server.maxmemory_policy == REDIS_MAXMEMORY_NO_EVICTION)
        return REDIS_ERR; /* We need to free memory, but policy forbids. */

    /* Compute how much memory we need to free. */
    mem_tofree = mem_used - server.maxmemory;
    mem_freed = 0;
    //(2) Next, determine whether the policy covers all keys or only keys with an expiration time set. For all keys, candidates are drawn from server.db[j].dict; for keys with an expiration time, from server.db[j].expires.
    while (mem_freed < mem_tofree) {
        int j, k, keys_freed = 0;

        for (j = 0; j < server.dbnum; j++) {
            long bestval = 0; /* just to prevent warning */
            sds bestkey = NULL;
            struct dictEntry *de;
            redisDb *db = server.db+j;
            dict *dict;
    //(3) Choose the candidate dictionary accordingly: allkeys-* policies sample from the whole keyspace (db[j].dict), volatile-* policies from the expires table (db[j].expires).
            if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
            {
                dict = server.db[j].dict;
            } else {
                dict = server.db[j].expires;
            }
            if (dictSize(dict) == 0) continue;
//Next, handle the random policies (allkeys-random and volatile-random)
            /* volatile-random and allkeys-random policy */
            if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
            {
                de = dictGetRandomKey(dict);
                bestkey = dictGetEntryKey(de);
            }//For random eviction, simply pick one key at random from the dict
//Next come the LRU policies (allkeys-lru and volatile-lru), which use the approximate LRU algorithm.
            /* volatile-lru and allkeys-lru policy */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
            {
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;
                    robj *o;

                    de = dictGetRandomKey(dict);
                    thiskey = dictGetEntryKey(de);
                    /* When policy is volatile-lru we need an additional lookup
                     * to locate the real key, as dict is set to db->expires. */
                    if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                        de = dictFind(db->dict, thiskey); //db->expires does not record the key's last access time, so look up the full entry in db->dict
                    o = dictGetEntryVal(de);
                    thisval = estimateObjectIdleTime(o);

                    /* Higher idle time is better candidate for deletion */
                    if (bestkey == NULL || thisval > bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }//To keep costs low, both the LRU and expire eviction algorithms are approximations rather than optimal: the LRU path samples maxmemory_samples keys (default 3) from the dict and evicts the least recently used among them.
            }
//Finally the ttl policy, which is simple: take maxmemory_samples keys, compare their expiration times, and pick the one that expires soonest as the key to delete.
            /* volatile-ttl */
            else if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_TTL) {
                for (k = 0; k < server.maxmemory_samples; k++) {
                    sds thiskey;
                    long thisval;

                    de = dictGetRandomKey(dict);
                    thiskey = dictGetEntryKey(de);
                    thisval = (long) dictGetEntryVal(de);

                    /* Expire sooner (minor expire unix timestamp) is better
                     * candidate for deletion */
                    if (bestkey == NULL || thisval < bestval) {
                        bestkey = thiskey;
                        bestval = thisval;
                    }
                }//Note that the ttl implementation, as above, only considers maxmemory_samples sampled candidates
            }
//Whichever policy applied, we now have the key to evict; delete the selected key-value pair.
            /* Finally remove the selected key. */
            if (bestkey) {
                long long delta;

                robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
                // Propagate the deletion, mainly for AOF persistence and slaves
                propagateExpire(db,keyobj); //Sends a DEL command to the AOF and slaves

    // Note that propagateExpire() may itself allocate memory, yet it is
    // executed before the measurement below, because redis only counts
    // the memory freed by dbDelete(). If the loop also accounted for the
    // space allocated by propagateExpire(), and that allocation ever
    // exceeded the space being freed, the loop might never terminate.
    // So the code below measures only the memory freed by dbDelete().
                /* We compute the amount of memory freed by dbDelete() alone.
                 * It is possible that actually the memory needed to propagate
                 * the DEL in AOF and replication link is greater than the one
                 * we are freeing removing the key, but we can't account for
                 * that otherwise we would never exit the loop.
                 *
                 * AOF and Output buffer memory will be freed eventually so
                 * we only care about memory used by the key space. */
              // Calculate only the size of dbDelete() released memory
                delta = (long long) zmalloc_used_memory();
                dbDelete(db,keyobj);
                delta -= (long long) zmalloc_used_memory();
                mem_freed += delta;
                server.stat_evictedkeys++;
                decrRefCount(keyobj);
                keys_freed++;

                /* When the memory to free starts to be big enough, we may
                 * start spending so much time here that is impossible to
                 * deliver data to the slaves fast enough, so we force the
                 * transmission here inside the loop. */
                 // Flush the output buffers to slaves promptly so replication keeps up
                if (slaves) flushSlavesOutputBuffers();
            }
        }//After looping over all DBs: if no key could be freed in a full pass while memory is still over the limit, fail
        if (!keys_freed) return REDIS_ERR; /* nothing to free... */
    }
    return REDIS_OK;
}

This function is called before each command executes and returns OK once the current memory footprint is below the limit. So Redis may well exceed maxmemory after the subsequent command executes: maxmemory is the limit Redis enforces before executing a command, not a hard cap on Redis's actual memory footprint (and it excludes the slave output buffers and AOF buffers).

TTL Data Elimination Mechanism:

In Redis's database structure, a table of key-value expiration times, redisDb.expires, is maintained.

Definition:

Several key-value pairs are chosen at random from the expiration-time table, and the one with the smallest expire timestamp (the one that will expire soonest) is evicted. Note that redis does not guarantee it removes the soonest-expiring pair in the whole table; it only removes the soonest-expiring pair among the random sample.

The TTL-related portion of freeMemoryIfNeeded:

//Select data that will expire
else if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_TTL) {
 // server.maxmemory_samples is the number of key-value pairs to sample
    // Randomly sample that many pairs and evict the one that will expire soonest
    for (k = 0; k < server.maxmemory_samples; k++) {
        sds thiskey;
        long thisval;

        de = dictGetRandomKey(dict);
        thiskey = dictGetKey(de);
        thisval = (long) dictGetVal(de);

        /* Expire sooner (minor expire unix timestamp) is better
         * candidate for deletion */
        if (bestkey == NULL || thisval < bestval) {
            bestkey = thiskey;
            bestval = thisval;
        }
    }
}
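The sampling loop in the snippet above can be reproduced in miniature (Python, not Redis source; the function and sample data are hypothetical): sample a few keys from the expires table and evict the one with the smallest expiry timestamp.

```python
import random

def ttl_victim(expires, samples=3, rng=random):
    """volatile-ttl in miniature: sample `samples` keys from the
    expires table and return the one with the smallest expire
    timestamp, i.e. the one that will expire soonest."""
    candidates = rng.sample(list(expires), min(samples, len(expires)))
    return min(candidates, key=lambda k: expires[k])

# toy expires table: session:2 has the earliest expiry timestamp
expires = {"session:1": 1000, "session:2": 400, "session:3": 700}
victim = ttl_victim(expires, samples=3)   # sample covers the whole table
```

As with LRU, a sample smaller than the table only finds the soonest-expiring key with some probability, not with certainty.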

LRU Data Elimination Mechanism:

The LRU clock server.lruclock is kept in the server state and is updated periodically by the timer function serverCron(); its value is derived from server.unixtime. In addition, as struct redisObject shows, each redis object carries its own lru field, and redisObject.lru is updated every time the data is accessed.
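A sketch of how the idle time is estimated from these two clocks (Python, not Redis source): the lru field is a 24-bit counter that can wrap around, and estimateObjectIdleTime must handle the wrapped case. The `resolution` parameter here is a stand-in, since the actual clock tick size differs across Redis versions.

```python
LRU_BITS = 24
LRU_CLOCK_MAX = (1 << LRU_BITS) - 1   # the 24-bit lru clock wraps here

def estimate_idle(lruclock, obj_lru, resolution=1):
    """Approximate idle time of an object: difference between the
    server's lru clock and the object's lru stamp, accounting for
    a wrapped clock (the object's stamp 'ahead' of the server's)."""
    if lruclock >= obj_lru:
        return (lruclock - obj_lru) * resolution
    # the clock wrapped around after the object was stamped
    return (LRU_CLOCK_MAX - obj_lru + lruclock) * resolution
```

For example, with the clock at 100 and an object stamped at 40, the idle time is 60 ticks; an object stamped just before wraparound still gets a small, correct idle time after the clock resets.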

Definition of LRU Data Elimination Mechanism:

Several key-value pairs are chosen at random from the data set, and the one with the smallest lru value (the least recently used) is evicted. Again, redis does not guarantee it removes the least recently used pair in the whole data set; it only removes the least recently used pair among the random sample.

The LRU-related portion of freeMemoryIfNeeded:

//Different strategies operate on different datasets
if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
    server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
{
    dict = server.db[j].dict;
} else {//volatile-* policies operate on the set of keys with an expiration time
    dict = server.db[j].expires;
}
if (dictSize(dict) == 0) continue;

/* volatile-random and allkeys-random policy */
//Random selection for elimination
if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
    server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
{
    de = dictGetRandomKey(dict);
    bestkey = dictGetKey(de);
}

/* volatile-lru and allkeys-lru policy */
//Specific LRU algorithm
else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
    server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
{
    struct evictionPoolEntry *pool = db->eviction_pool;

    while(bestkey == NULL) {
        //Sample random keys and use their idle times to fill the pool of eviction candidates
        evictionPoolPopulate(dict, db->dict, db->eviction_pool);
        /* Go backward from best to worst element to evict. */
        for (k = REDIS_EVICTION_POOL_SIZE-1; k >= 0; k--) {
            if (pool[k].key == NULL) continue;
            de = dictFind(dict,pool[k].key);
            sdsfree(pool[k].key);
            //Move the element after pool+k+1 forward by one unit
            memmove(pool+k,pool+k+1,
                sizeof(pool[0])*(REDIS_EVICTION_POOL_SIZE-k-1));
            /* Clear the element on the right which is empty
             * since we shifted one position to the left.  */
            pool[REDIS_EVICTION_POOL_SIZE-1].key = NULL;
            pool[REDIS_EVICTION_POOL_SIZE-1].idle = 0;
            //This key is the one chosen for eviction
            if (de) {
                bestkey = dictGetKey(de);
                break;
            } else {
                /* Ghost... */
                continue;
            }
        }
    }
}

The LRU selection itself is implemented in the evictionPoolPopulate function.
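A toy version of what evictionPoolPopulate does (Python, not Redis source; pool entries are (key, idle) tuples here instead of evictionPoolEntry structs): sample some keys, compute their idle times, and keep the pool sorted so the best eviction candidate sits at the end.

```python
import random

POOL_SIZE = 16  # mirrors REDIS_EVICTION_POOL_SIZE in redis 3.0

def eviction_pool_populate(pool, last_access, now, samples=5, rng=random):
    """Toy evictionPoolPopulate: sample keys, compute each one's idle
    time, and keep the pool sorted ascending by idle time so that the
    entry at the end is the best candidate to evict."""
    for key in rng.sample(list(last_access), min(samples, len(last_access))):
        if any(k == key for k, _ in pool):
            continue  # already a candidate
        pool.append((key, now - last_access[key]))
    pool.sort(key=lambda entry: entry[1])   # ascending idle time
    overflow = len(pool) - POOL_SIZE
    if overflow > 0:
        del pool[:overflow]   # when full, drop the least idle candidates

pool = []
last_access = {"a": 10, "b": 90, "c": 40}
eviction_pool_populate(pool, last_access, now=100, samples=3)
best = pool[-1][0]   # highest idle time -> evicted first
```

Because the pool persists across sampling rounds, good candidates found in earlier rounds are not lost, which is why this scheme approximates true LRU better than plain per-round sampling.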

3. How to rationally apply the elimination strategy in the project:

This part is adapted from another blogger's post.

(1) Setting up maxmemory reasonably:

There are two ways to modify it:

1. Set by CONFIG SET:

127.0.0.1:6379> CONFIG GET maxmemory
1) "maxmemory"
2) "0"
127.0.0.1:6379> CONFIG SET maxmemory 80MB
OK
127.0.0.1:6379> CONFIG GET maxmemory
1) "maxmemory"
2) "83886080"
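The value 83886080 shown above is just 80 * 1024 * 1024. As a quick check (a sketch, not Redis code; only the suffix semantics documented in redis.conf -- kb/mb/gb as powers of 1024, k/m/g as powers of 1000 -- are assumed), such human-readable sizes convert to bytes like this:

```python
def to_bytes(value):
    """Convert a redis.conf-style memory size (e.g. '80mb') to bytes.
    Per redis.conf, kb/mb/gb are powers of 1024 and k/m/g powers of 1000."""
    units = {"k": 1000, "kb": 1024,
             "m": 1000**2, "mb": 1024**2,
             "g": 1000**3, "gb": 1024**3}
    s = value.strip().lower()
    # try the longer suffixes first so 'mb' is not mistaken for 'b'-less 'm'
    for suffix in sorted(units, key=len, reverse=True):
        if s.endswith(suffix):
            return int(s[: -len(suffix)]) * units[suffix]
    return int(s)   # a bare number is already in bytes
```

For example, `to_bytes("80mb")` matches the 83886080 reported by CONFIG GET above.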

2. Modify the configuration file redis.conf: Configuration File Explanation

maxmemory 80mb

Note: on 64-bit systems, setting maxmemory to 0 means Redis's memory usage is unrestricted. On 32-bit systems there is an implicit 3GB upper limit on maxmemory.

When the Redis memory usage reaches the specified limit, a replacement strategy needs to be selected.

(2) Selection of replacement strategy:

When the Redis memory usage reaches maxmemory, the old data is replaced with the maxmemory-policy set.

The method of setting maxmemory-policy is similar to the method of setting maxmemory, which is dynamically modified by redis.conf or CONFIG SET.

If no key matching the policy's criteria can be deleted, the volatile-lru, volatile-random and volatile-ttl policies behave the same as noeviction -- no key is replaced.

Choosing an appropriate replacement policy matters, and the right choice depends on your application's access pattern. You can also change the policy dynamically and monitor the cache hit rate with the INFO command, then use that feedback to tune the policy.
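As a small illustration of that feedback loop, the hit rate can be derived from the keyspace_hits and keyspace_misses counters in the Stats section of INFO (the helper below and its sample numbers are illustrative, not from Redis):

```python
def hit_rate(info):
    """Compute the cache hit rate from the keyspace_hits and
    keyspace_misses counters reported by INFO's Stats section."""
    hits = info["keyspace_hits"]
    misses = info["keyspace_misses"]
    total = hits + misses
    return hits / total if total else 0.0

# counters as they might appear in parsed INFO output
stats = {"keyspace_hits": 900, "keyspace_misses": 100}
rate = hit_rate(stats)   # 0.9 -> 90% of lookups were served from cache
```

A falling hit rate after switching policies is a signal that the new policy fits the access pattern worse.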

Scenarios where each policy fits:

1) allkeys-lru: if the application's cache accesses follow a power-law distribution (i.e., some data is relatively hot), or if we are unsure what the access distribution looks like, choose the allkeys-lru policy, which replaces the least recently used keys. It is also the safe default when you are not sure which policy to use.

With allkeys-lru you do not have to set expire times at all; since setting expire itself costs some memory, this makes more efficient use of memory.

2) allkeys-random: if the application accesses all cache keys with roughly equal probability, use this policy to replace data.

3) volatile-ttl: this policy lets the application hint to Redis which keys are better candidates for eviction. If you know the data well enough to supply such hints (via expire/ttl), volatile-ttl is a good choice.

4) volatile-lru and volatile-random suit the case where a single Redis instance serves both as a cache and as persistent storage, although the same effect can be achieved with two separate Redis instances. It is worth repeating that setting expiration times on keys costs extra memory, so where possible the allkeys-lru policy makes more efficient use of memory.

4. Redis memory optimization attention points (stand-alone version first):

(1) Coding of Redis: This section refers to this blogger's blog

Summary:

Many data types can be optimized with special encodings. Hashes, lists, and sets composed only of integers can be stored in compact structures that occupy much less space, in some cases saving 9/10 of the memory.

These special encodings are completely transparent to Redis users; they are really a trade-off between CPU and memory. Tighter memory usage naturally costs more CPU when manipulating the data, and vice versa. Redis provides a set of configuration parameters for the thresholds that control these special encodings.

Threshold setting in detail:

    #If the number of fields in a hash is below this value, Redis uses the special encoding for that key's hash value.
    hash-max-zipmap-entries 64
    #If no field value in the hash exceeds this many bytes, Redis also keeps the special encoding for that key's hash value.
    hash-max-zipmap-value 512
    #The next two parameters mean the same as the two hash parameters above, but for the List type.
    list-max-ziplist-entries 512
    list-max-ziplist-value 64
    #If a set contains only integers and has no more than this many elements, Redis uses the intset encoding.
    set-max-intset-entries 512
    #If a zset has no more than 128 elements and no member longer than 64 bytes, redis uses the special encoding for it
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
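As a rough illustration of how these thresholds interact (a simplified Python sketch, not Redis's actual encoding logic; the function name and constants below are ours), a hash stays in the compact encoding only while both conditions hold:

```python
# Illustrative thresholds matching the config block above.
HASH_MAX_ZIPMAP_ENTRIES = 64
HASH_MAX_ZIPMAP_VALUE = 512

def hash_encoding(fields):
    """Decide, roughly as Redis does, whether a hash can keep the
    compact (zipmap/ziplist) encoding: it must be small in both the
    number of fields and the length of every field and value."""
    if len(fields) > HASH_MAX_ZIPMAP_ENTRIES:
        return "hashtable"
    if any(len(str(v)) > HASH_MAX_ZIPMAP_VALUE
           for kv in fields.items() for v in kv):
        return "hashtable"
    return "zipmap"
```

A small profile hash keeps the compact encoding, while a single oversized value (or too many fields) pushes the whole hash into the regular hash-table encoding.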

If a specially encoded value is modified so that it exceeds the limits in the configuration, Redis automatically converts it to the normal encoding format. This operation is very fast. Going the other way -- converting a large, normally encoded value into a special encoding -- is another matter: Redis's advice is to simply test the conversion's efficiency before doing it for real, because such conversion is often very inefficient.

(2) Use bit-level operations and byte byte-level operations to reduce unnecessary memory usage:

Bit-level operations: GETBIT and SETBIT

Byte-level operations: GETRANGE and SETRANGE

Redis provides four such commands for string-type key/values: GETRANGE/SETRANGE/GETBIT/SETBIT. With these, we can access a String value much like an array. For example, if an ID that uniquely identifies a user is just a substring of a String value, it can be extracted easily with GETRANGE/SETRANGE. Going further, a BITMAP can represent users' gender information, say 1 for male and 0 for female; representing the gender of 100,000,000 users this way takes only about 12 MB of storage, and traversing the data with SETBIT/GETBIT is also very efficient.
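The gender-bitmap idea can be sketched in Python (a toy stand-in for SETBIT/GETBIT, not a Redis client; the helper names are ours). One bit per user means 100,000,000 users need 12,500,000 bytes, about 12 MB:

```python
def setbit(buf, offset, value):
    """SETBIT in miniature: one bit per user id in a bytearray."""
    byte, bit = divmod(offset, 8)
    if value:
        buf[byte] |= 1 << (7 - bit)     # Redis addresses bits MSB-first
    else:
        buf[byte] &= ~(1 << (7 - bit))

def getbit(buf, offset):
    """GETBIT in miniature."""
    byte, bit = divmod(offset, 8)
    return (buf[byte] >> (7 - bit)) & 1

users = 100_000_000
genders = bytearray(users // 8)   # 12,500,000 bytes, about 12 MB
setbit(genders, 12345, 1)         # mark user 12345 as male
```

The same trick underlies Redis bitmaps for daily-active-user tracking and similar per-id flags.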

(3) Use hashes whenever possible, because small hashes are encoded into a very small space.

Because small hash-type values occupy relatively little space, we should prefer the hash type wherever possible in practice. Take user registration information, with fields such as name, gender, email, age and password: we could store each piece under its own key with the information as a String value, but Redis prefers that this be stored as a single hash, with the information expressed as field/value pairs.

Let's go one step further and justify this from Redis's storage mechanism. The special encoding mechanism was mentioned at the start of this post, including the two hash-related configuration parameters hash-max-zipmap-entries and hash-max-zipmap-value; their scope was described above and will not be repeated. Now suppose the number of fields stored in a hash value is below hash-max-zipmap-entries, and each element's length is below hash-max-zipmap-value. Then, whenever a new hash-type key/value is stored, Redis creates a fixed-length region for the hash value, whose maximum size in bytes is:

 total_bytes = hash-max-zipmap-entries * hash-max-zipmap-value

This reserves the positions of all fields in the hash, so field/value pairs can be accessed randomly like an array, at intervals of hash-max-zipmap-value bytes. Only when the number of fields in the hash value, or the length of a newly added element, exceeds the two thresholds above does Redis re-store it as a regular hash table; otherwise this efficient storage and access scheme is kept. Moreover, since each key carries some per-key system information (expiration time, LRU data, and so on), the hash type greatly reduces the number of keys compared with string-type key/values (most keys become hash fields instead), which further optimizes storage-space efficiency.

(4) Reasonable design of expiration strategy:

Please refer to Redis Series (3) -- expiration strategy.

(5) Reasonable memory allocation:

If maxmemory is not set, Redis keeps allocating memory as it sees fit and can eventually consume all of your available memory, so configuring a limit is generally recommended. You may also want to set maxmemory-policy, which defaults to noeviction (this was not the default in some older versions of Redis).

With that policy, once Redis reaches its limit it returns out-of-memory errors for write commands -- which may in turn surface as application errors, but will not take the machine down through memory starvation.

(6) Compress your data before saving it to Redis.

(7) Some real-world Redis memory designs will be covered later in this series, for example shared-object optimization and [string] -> [hash] -> [segment-hash].

That wraps up Redis Series (4) -- memory eviction mechanism (with memory optimization suggestions for the stand-alone version). These are my memory-optimization notes from real projects, listed here for you as a necessary step of accumulation. I will continue this series of articles and share my experience; feel free to point out mistakes below so we can learn together!! Your likes are my best support!!

For more content, visit Jack Frost's blog.

Posted by asurfaceinbetween on Fri, 28 Jun 2019 14:21:52 -0700