Redis lazyfree: good news for deleting big keys

Background

Heavy users of Redis have probably run into this: deleting a large key with DEL, or flushing a database that holds a large number of keys with FLUSHDB or FLUSHALL, blocks the server. Redis can also block while cleaning up expired data or evicting data because memory is over the limit, if it happens to hit large keys.

To solve these problems, Redis 4.0 introduced the lazyfree mechanism, which moves key deletion and database flushing into a background thread so that the server avoids blocking as far as possible.

Lazyfree mechanism

The principle behind lazyfree is easy to picture: when an object is deleted, the main thread only removes it logically and then hands the object to a background thread, which performs the real destruction, so a large object no longer blocks the server. That is how the Redis lazyfree implementation works. Let's walk through it command by command.
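The idea can be sketched in isolation before looking at the Redis source. The following is a minimal, single-threaded illustration, not Redis code: `pending`, `lazy_delete` and `drain_pending` are hypothetical names, and `drain_pending` stands in for the work a background thread would do. A real implementation would also need a mutex and condition variable around the queue.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types for illustration; not the Redis data structures. */
typedef struct pending {
    void *value;            /* detached value awaiting the real free */
    struct pending *next;
} pending;

static pending *pending_head = NULL;  /* values handed off for later freeing */
static size_t pending_count = 0;

/* Step 1 (cheap, done by the caller): detach the value from its slot so the
 * key is logically gone, and queue the value for later destruction. */
int lazy_delete(void **slot) {
    if (*slot == NULL) return 0;        /* nothing stored under this key */
    pending *p = malloc(sizeof(*p));
    assert(p != NULL);
    p->value = *slot;
    p->next = pending_head;
    pending_head = p;
    pending_count++;
    *slot = NULL;                       /* readers no longer see the value */
    return 1;
}

/* Step 2 (expensive, meant to run off the main thread): really free
 * everything that was queued. Returns the number of values freed. */
size_t drain_pending(void) {
    size_t freed = 0;
    while (pending_head != NULL) {
        pending *p = pending_head;
        pending_head = p->next;
        free(p->value);
        free(p);
        freed++;
    }
    pending_count = 0;
    return freed;
}
```

The caller pays only for a pointer swap and a small allocation, however large the value is. The cost that grows with the size of the value is deferred to `drain_pending`.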

1. UNLINK command

First, let's look at the new unlink command:

void unlinkCommand(client *c) {
    delGenericCommand(c, 1);
}

The entry point is simple: it calls delGenericCommand, and the second argument, 1, requests asynchronous deletion.

/* This command implements DEL and LAZYDEL. */
void delGenericCommand(client *c, int lazy) {
    int numdel = 0, j;

    for (j = 1; j < c->argc; j++) {
        expireIfNeeded(c->db,c->argv[j]);
        int deleted  = lazy ? dbAsyncDelete(c->db,c->argv[j]) :
                              dbSyncDelete(c->db,c->argv[j]);
        if (deleted) {
            signalModifiedKey(c->db,c->argv[j]);
            notifyKeyspaceEvent(REDIS_NOTIFY_GENERIC,
                "del",c->argv[j],c->db->id);
            server.dirty++;
            numdel++;
        }
    }
    addReplyLongLong(c,numdel);
}

delGenericCommand chooses synchronous or asynchronous deletion according to the lazy parameter. The synchronous path is essentially unchanged, so we focus on the new asynchronous path.

#define LAZYFREE_THRESHOLD 64
// Threshold for background deletion. Only an object with more elements than this is really handed to the background thread. For an object with few elements it is not worth it, because handing work to another thread has its own synchronization cost.
int dbAsyncDelete(redisDb *db, robj *key) {
    if (dictSize(db->expires) > 0) dictDelete(db->expires,key->ptr);
    //Clear the expiration time of the key to be deleted

    dictEntry *de = dictUnlink(db->dict,key->ptr);
    //dictUnlink removes the entry for key from the database dictionary and returns a pointer to it, without releasing any resources
    if (de) {
        robj *val = dictGetVal(de);
        size_t free_effort = lazyfreeGetFreeEffort(val);
        //lazyfreeGetFreeEffort returns the number of elements contained in the val object

        if (free_effort > LAZYFREE_THRESHOLD) {
            atomicIncr(lazyfree_objects,1);
            //Atomically add 1 to lazyfree_objects, so that the INFO command can report how many objects are waiting to be freed by the background thread
            bioCreateBackgroundJob(BIO_LAZY_FREE ,val,NULL,NULL);
            //At this point, the object val is really dropped into the task queue of the background thread.
            dictSetVal(db->dict,de,NULL);
            //Set the value pointer in the entry to NULL so that freeing the dictionary entry does not free val a second time
        }
    }

    if (de) {
        dictFreeUnlinkedEntry(db->dict,de);
        //Delete database dictionary entries and release resources
        return 1;
    } else {
        return 0;
    }
}

That is the whole asynchronous deletion path. It first clears the key's expiration time, then calls dictUnlink to detach the entry from the database dictionary, then checks the size of the value (a value that is too small is not worth sending to the background). If the value is large enough, it is handed to the background thread. Finally, the dictionary entry itself is freed.

As the logic above shows, when UNLINK is used on a large key, the actual freeing is handed to the background thread, so Redis is not blocked.
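The size check is the step that decides which path a value takes. Here is a small sketch of that decision, assuming (as the comments in the listing say) that the effort is the element count for aggregate values and 1 otherwise. `toy_obj`, `toy_free_effort` and `toy_should_free_async` are hypothetical names, not the Redis functions.

```c
#include <assert.h>
#include <stddef.h>

#define LAZYFREE_THRESHOLD 64   /* same value as in the listing above */

/* Hypothetical toy object: only what the decision needs. */
typedef enum { TOY_STRING, TOY_LIST, TOY_SET, TOY_ZSET, TOY_HASH } toy_type;

typedef struct {
    toy_type type;
    size_t nelems;   /* number of elements for aggregate types */
} toy_obj;

/* Rough cost of freeing the object: the element count for aggregate
 * types, 1 for anything freed in a single step (assumption). */
size_t toy_free_effort(const toy_obj *o) {
    assert(o != NULL);
    switch (o->type) {
    case TOY_LIST:
    case TOY_SET:
    case TOY_ZSET:
    case TOY_HASH:
        return o->nelems;
    default:
        return 1;
    }
}

/* Mirrors the `free_effort > LAZYFREE_THRESHOLD` test: strictly greater. */
int toy_should_free_async(const toy_obj *o) {
    return toy_free_effort(o) > LAZYFREE_THRESHOLD;
}
```

Under these assumptions a string is always freed inline regardless of its byte length, and a list with exactly 64 elements is still freed synchronously because the comparison is strict.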

2. FLUSHALL, FLUSHDB commands

Redis 4.0 adds an ASYNC option to the flush commands. When a flush command is followed by ASYNC, it takes the background deletion path. The code is as follows:

/* FLUSHDB [ASYNC]
 *
 * Flushes the currently SELECTed Redis DB. */
void flushdbCommand(client *c) {
    int flags;

    if (getFlushCommandFlags(c,&flags) == C_ERR) return;
    signalFlushedDb(c->db->id);
    server.dirty += emptyDb(c->db->id,flags,NULL);
    addReply(c,shared.ok);

    sds client = catClientInfoString(sdsempty(),c);
    serverLog(LL_NOTICE, "flushdb called by client %s", client);
    sdsfree(client);
}

/* FLUSHALL [ASYNC]
 *
 * Flushes the whole server data set. */
void flushallCommand(client *c) {
    int flags;

    if (getFlushCommandFlags(c,&flags) == C_ERR) return;
    signalFlushedDb(-1);
    server.dirty += emptyDb(-1,flags,NULL);
    addReply(c,shared.ok);
    ...
}

FLUSHDB and FLUSHALL follow essentially the same logic. Each calls getFlushCommandFlags to obtain flags (which records whether asynchronous deletion was requested) and then calls emptyDb to empty the database. FLUSHALL passes -1 as the first argument, which means that all databases are emptied.
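The body of getFlushCommandFlags is not shown here. As a rough sketch of what parsing an optional ASYNC argument could look like, the following works on plain C strings instead of Redis client objects. The `TOY_` constants, their values, and `toy_get_flush_flags` are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Illustrative constants; the values are assumed, not taken from Redis. */
#define TOY_EMPTYDB_NO_FLAGS 0
#define TOY_EMPTYDB_ASYNC    (1 << 0)
#define TOY_OK   0
#define TOY_ERR -1

/* Case-insensitive string equality, so "ASYNC" and "async" both match. */
static int equals_ignore_case(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* argv[0] is the command name; an optional argv[1] must be "async". */
int toy_get_flush_flags(int argc, const char **argv, int *flags) {
    assert(argc >= 1 && argv != NULL && flags != NULL);
    if (argc == 1) {
        *flags = TOY_EMPTYDB_NO_FLAGS;
        return TOY_OK;
    }
    if (argc == 2 && equals_ignore_case(argv[1], "async")) {
        *flags = TOY_EMPTYDB_ASYNC;
        return TOY_OK;
    }
    return TOY_ERR;   /* unknown option or too many arguments */
}
```

On an error the flags are left untouched, which matches the early return in the command handlers when getFlushCommandFlags reports C_ERR.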

long long emptyDb(int dbnum, int flags, void(callback)(void*)) {
    int j, async = (flags & EMPTYDB_ASYNC);
    long long removed = 0;

    if (dbnum < -1 || dbnum >= server.dbnum) {
        errno = EINVAL;
        return -1;
    }

    for (j = 0; j < server.dbnum; j++) {
        if (dbnum != -1 && dbnum != j) continue;
        removed += dictSize(server.db[j].dict);
        if (async) {
            emptyDbAsync(&server.db[j]);
        } else {
            dictEmpty(server.db[j].dict,callback);
            dictEmpty(server.db[j].expires,callback);
        }
    }
    return removed;
}

emptyDb starts with some sanity checks. Once they pass, the asynchronous path calls emptyDbAsync, while the synchronous path calls dictEmpty, which walks every object in the database and frees it (and can therefore easily block Redis). The function we care about here is emptyDbAsync.

/* Empty a Redis DB asynchronously. What the function does actually is to
 * create a new empty set of hash tables and scheduling the old ones for
 * lazy freeing. */
void emptyDbAsync(redisDb *db) {
    dict *oldht1 = db->dict, *oldht2 = db->expires;
    db->dict = dictCreate(&dbDictType,NULL);
    db->expires = dictCreate(&keyptrDictType,NULL);
    atomicIncr(lazyfree_objects,dictSize(oldht1));
    bioCreateBackgroundJob(BIO_LAZY_FREE,NULL,oldht1,oldht2);
}

It simply points db->dict and db->expires at two newly created empty dictionaries and then drops the two original dictionaries into the background thread's job queue. This is simple and efficient, and the flush no longer blocks Redis.
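The swap itself can be shown with a toy keyspace. In this sketch a linked list stands in for the dictionary, and all `toy_` names are hypothetical. `toy_empty_db_async` does the constant-time part, and `toy_table_release` is the linear-time part that would run in the background.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a keyspace: a singly linked list of entries. */
typedef struct toy_entry { struct toy_entry *next; } toy_entry;
typedef struct { toy_entry *head; size_t used; } toy_table;
typedef struct { toy_table *keys; } toy_db;

toy_table *toy_table_create(void) {
    toy_table *t = calloc(1, sizeof(*t));
    assert(t != NULL);
    return t;
}

void toy_table_add(toy_table *t) {
    toy_entry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->next = t->head;
    t->head = e;
    t->used++;
}

/* The O(n) part: walk and free every entry, then the table itself.
 * Returns the number of entries freed. */
size_t toy_table_release(toy_table *t) {
    size_t freed = 0;
    while (t->head != NULL) {
        toy_entry *e = t->head;
        t->head = e->next;
        free(e);
        freed++;
    }
    free(t);
    return freed;
}

/* The O(1) part: install a fresh empty table and return the old one,
 * which the caller hands off for later release. */
toy_table *toy_empty_db_async(toy_db *db) {
    toy_table *old = db->keys;
    db->keys = toy_table_create();
    return old;
}
```

After the swap the database is immediately usable and empty, while the old table is still fully populated and owned only by whoever will release it.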

Lazyfree thread

Next, let's look at the lazyfree thread that does the real work.

First, we need to clear up a misunderstanding. Many people describe Redis as a single-threaded in-memory database, but that is not quite true. Redis does handle network I/O and command execution on the main worker thread, but several bio background threads also work away diligently, handling heavy I/O operations such as closing files and flushing data to disk. The bio family now has a new member: the lazyfree thread.

void *bioProcessBackgroundJobs(void *arg) {
    ...
        if (type == BIO_LAZY_FREE) {
            /* What we free changes depending on what arguments are set:
             * arg1 -> free the object at pointer.
             * arg2 & arg3 -> free two dictionaries (a Redis DB).
             * only arg3 -> free the skiplist. */
            if (job->arg1)
                lazyfreeFreeObjectFromBioThread(job->arg1);
            else if (job->arg2 && job->arg3)
                lazyfreeFreeDatabaseFromBioThread(job->arg2, job->arg3);
            else if (job->arg3)
                lazyfreeFreeSlotsMapFromBioThread(job->arg3);
        }
    ...
}

Redis names the new lazyfree job type BIO_LAZY_FREE. A background thread identifies itself as the lazyfree thread by this type and then runs the appropriate function depending on which arguments are set in the bio_job.
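The dispatch depends only on which of the three job arguments are non-NULL. A minimal sketch of that rule, using a hypothetical `toy_job` struct and `toy_classify` function rather than the real bio_job:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of a job with three opaque arguments. */
typedef struct { void *arg1, *arg2, *arg3; } toy_job;

typedef enum {
    TOY_FREE_NOTHING,
    TOY_FREE_OBJECT,     /* arg1 set */
    TOY_FREE_DATABASE,   /* arg2 and arg3 set */
    TOY_FREE_SLOTS_MAP   /* only arg3 set */
} toy_action;

/* Same precedence as the if / else-if chain in the listing. */
toy_action toy_classify(const toy_job *job) {
    assert(job != NULL);
    if (job->arg1) return TOY_FREE_OBJECT;
    if (job->arg2 && job->arg3) return TOY_FREE_DATABASE;
    if (job->arg3) return TOY_FREE_SLOTS_MAP;
    return TOY_FREE_NOTHING;
}
```

This lines up with the two call sites shown earlier: dbAsyncDelete passes the value as the first argument and NULL for the rest, while emptyDbAsync passes NULL first and the two old dictionaries after it.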

To delete an object in the background, the thread calls decrRefCount to decrement the object's reference count. The resource is really released only when the count reaches 0.
```
 void lazyfreeFreeObjectFromBioThread(robj *o) {
     decrRefCount(o);
     atomicDecr(lazyfree_objects,1);
 }
```
In addition, since Redis 4.0 a key-value object stored in Redis has only two possible reference-count states: exactly 1, or shared. In other words, an object handed to the lazyfree thread always has a reference count of 1, which avoids races between threads.
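Here is a toy version of reference-counted release to make the "freed only by the last owner" rule concrete. The `toy_` names and the use of INT_MAX to mark a shared object are assumptions made for this sketch, and `toy_freed` exists only so that the behaviour can be observed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define TOY_SHARED_REFCOUNT INT_MAX   /* assumed marker for shared objects */

typedef struct { int refcount; void *payload; } toy_robj;

static int toy_freed = 0;   /* counts real frees, for demonstration only */

toy_robj *toy_create(size_t payload_size) {
    toy_robj *o = malloc(sizeof(*o));
    assert(o != NULL);
    o->refcount = 1;
    o->payload = malloc(payload_size);
    return o;
}

void toy_incr_ref(toy_robj *o) {
    if (o->refcount != TOY_SHARED_REFCOUNT) o->refcount++;
}

/* Drop one reference; the memory is released only by the last owner. */
void toy_decr_ref(toy_robj *o) {
    assert(o->refcount > 0);
    if (o->refcount == TOY_SHARED_REFCOUNT) return;  /* never freed */
    if (o->refcount == 1) {
        free(o->payload);
        free(o);
        toy_freed++;
    } else {
        o->refcount--;
    }
}
```

If the background thread is the only holder of an object whose count is 1, no other thread can touch the counter at the same time, so the plain non-atomic decrement is safe in that situation.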

To empty a database dictionary in the background, the thread calls dictRelease, which iterates over the dictionary and deletes every object.

 void lazyfreeFreeDatabaseFromBioThread(dict *ht1, dict *ht2) {
     size_t numkeys = dictSize(ht1);
     dictRelease(ht1);
     dictRelease(ht2);
     atomicDecr(lazyfree_objects,numkeys);
 }

The key-slots mapping table is also deleted in the background. Native Redis uses this function when running in cluster mode. Cloud Redis uses its own in-house cluster mode, so there the function is currently not invoked.
```
 void lazyfreeFreeSlotsMapFromBioThread(rax *rt) {
     size_t len = rt->numele;
     raxFree(rt);
     atomicDecr(lazyfree_objects,len);
 }
```

Expiration and eviction

Redis supports key expiration and eviction, and the deletions they trigger may block Redis.

So in addition to the UNLINK, FLUSHDB ASYNC and FLUSHALL ASYNC commands, Redis 4.0 adds four configuration options for background deletion:

slave-lazy-flush: option for clearing existing data after a slave receives the RDB file
lazyfree-lazy-eviction: option for eviction when memory is full
lazyfree-lazy-expire: option for deleting expired keys
lazyfree-lazy-server-del: option for internal deletions, for example RENAME oldkey newkey, which must delete newkey if it already exists

All four options default to synchronous deletion. Background deletion can be turned on with config set [parameter] yes.
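The "off by default, switched on by name" behaviour can be modelled as a small option table. This is only an illustration of the behaviour described above, not how Redis stores its configuration. `toy_option`, `toy_find_option` and `toy_config_set` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option table: the four option names from the text,
 * all disabled (synchronous deletion) by default. */
typedef struct { const char *name; int enabled; } toy_option;

static toy_option toy_options[] = {
    { "slave-lazy-flush",         0 },
    { "lazyfree-lazy-eviction",   0 },
    { "lazyfree-lazy-expire",     0 },
    { "lazyfree-lazy-server-del", 0 },
};

toy_option *toy_find_option(const char *name) {
    size_t n = sizeof(toy_options) / sizeof(toy_options[0]);
    for (size_t i = 0; i < n; i++)
        if (strcmp(toy_options[i].name, name) == 0) return &toy_options[i];
    return NULL;
}

/* Returns 0 on success, -1 for an unknown option or a value other
 * than "yes" or "no". */
int toy_config_set(const char *name, const char *value) {
    toy_option *opt = toy_find_option(name);
    assert(value != NULL);
    if (opt == NULL) return -1;
    if (strcmp(value, "yes") == 0) opt->enabled = 1;
    else if (strcmp(value, "no") == 0) opt->enabled = 0;
    else return -1;
    return 0;
}
```

Each option is independent, so enabling lazy expiry leaves eviction, replica flush and internal deletions on the synchronous path until they are enabled as well.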

The background deletion functions themselves are unchanged. At each place that used to delete synchronously, the code now consults the corresponding option to decide whether to call dbAsyncDelete or emptyDbAsync instead. The relevant code is:

slave-lazy-flush

 void readSyncBulkPayload(aeEventLoop *el, int fd, void *privdata, int mask) {
     ...
     if (eof_reached) {
         ...
         emptyDb(
             -1,
             server.repl_slave_lazy_flush ? EMPTYDB_ASYNC : EMPTYDB_NO_FLAGS,
             replicationEmptyDbCallback);
         ...
     }
     ...
 }

lazyfree-lazy-eviction

 int freeMemoryIfNeeded(long long timelimit) {
     ...
             /* Finally remove the selected key. */
             if (bestkey) {
                 ...
                 propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
                 if (server.lazyfree_lazy_eviction)
                     dbAsyncDelete(db,keyobj);
                 else
                     dbSyncDelete(db,keyobj);
                 ...
             }
     ...
 }

lazyfree-lazy-expire

 int activeExpireCycleTryExpire(redisDb *db, struct dictEntry *de, long long now) {
     ...
     if (now > t) {
         ...
         propagateExpire(db,keyobj,server.lazyfree_lazy_expire);
         if (server.lazyfree_lazy_expire)
             dbAsyncDelete(db,keyobj);
         else
             dbSyncDelete(db,keyobj);
         ...
     }
     ...
 }

lazyfree-lazy-server-del

 int dbDelete(redisDb *db, robj *key) {
     return server.lazyfree_lazy_server_del ? dbAsyncDelete(db,key) :
                                              dbSyncDelete(db,key);
 }

In addition, cloud redis has made minor improvements to expiration and eviction.

Expire and evict optimization

In idle time Redis enters activeExpireCycle to delete expired keys. Each cycle first computes a time budget for itself. Rather than traversing the whole database, Redis randomly samples some keys and checks whether they have expired, so sometimes the budget is not used up (asynchronous deletion also speeds up the cleanup of expired keys). The remaining time can then be handed to freeMemoryIfNeeded.

void activeExpireCycle(int type) {
    ...
afterexpire:
    if (!g_redis_c_timelimit_exit &&
        server.maxmemory > 0 &&
        zmalloc_used_memory() > server.maxmemory)
    {
        long long time_canbe_used = timelimit - (ustime() - start);
        if (time_canbe_used > 0) freeMemoryIfNeeded(time_canbe_used);
    }
}
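The arithmetic behind the hand-off is small enough to isolate. The sketch below assumes all times are in microseconds and uses hypothetical function names. It only restates the conditions from the listing above and does not model the cloud-specific g_redis_c_timelimit_exit flag.

```c
#include <assert.h>

/* Microseconds left from `timelimit_us` after running from start to now;
 * 0 when the budget is already used up. */
long long toy_remaining_budget_us(long long timelimit_us,
                                  long long start_us, long long now_us) {
    assert(now_us >= start_us);
    long long left = timelimit_us - (now_us - start_us);
    return left > 0 ? left : 0;
}

/* Eviction is attempted only when a memory limit is configured,
 * usage is over it, and there is budget left. */
int toy_should_evict(long long maxmemory, long long used_memory,
                     long long budget_left_us) {
    return maxmemory > 0 && used_memory > maxmemory && budget_left_us > 0;
}
```

For example, with a 25000 microsecond budget and an expire pass that took 10000 microseconds, 15000 microseconds remain and can be passed on as the eviction time limit.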

Posted by maddali on Sat, 11 May 2019 02:38:55 -0700

Programmer Group