[Redis 5 source code analysis] The DUMP command

Keywords: C Redis encoding

Grape

Official documents

DUMP key
Serializes the value stored at the given key and returns the serialized value, which can be deserialized back into a Redis key with the RESTORE command.
The values generated by serialization have the following characteristics:

  • It has a 64-bit checksum for error detection, and RESTORE checks the checksum before deserializing it.
  • The encoding format of the value is consistent with that of the RDB file.
  • The RDB version is encoded in the serialized value. If the RDB format is incompatible due to the different versions of Redis, Redis will refuse to deserialize the value.

The serialized value does not include any expiration (time-to-live) information.
Available Version: >= 2.6.0
Time complexity:
Finding the given key is O(1) and serializing it is O(N*M), where N is the number of Redis objects that make up the value and M is the average size of these objects.
If the serialized object is a relatively small string, the complexity is O(1).
Return value: If key does not exist, return nil. Otherwise, return the value after serialization.

redis> SET greeting "hello, dumping world!"
OK
redis> DUMP greeting
"\x00\x15hello, dumping world!\x06\x00E\xa0Z\x82\xd8r\xc1\xde"
redis> DUMP not-exists-key
(nil)
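
To make the layout of that reply concrete, the sketch below splits the example bytes into their fields. The names `example_dump`, `dump_fields` and `parse_short_string_dump` are hypothetical helpers written for this article, not Redis functions, and the parser assumes a raw string short enough for a 1-byte length prefix (top two bits of the length byte are zero).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The DUMP reply from the example above, spelled out byte by byte.
 * An array is used because "\x00E" inside a C string literal would be
 * read as a single over-long hex escape. */
static const unsigned char example_dump[] = {
    0x00,                                    /* RDB type byte: string */
    0x15,                                    /* length prefix: 21 */
    'h','e','l','l','o',',',' ','d','u','m','p','i','n','g',' ',
    'w','o','r','l','d','!',
    0x06, 0x00,                              /* RDB version, little endian */
    0x45, 0xa0, 0x5a, 0x82, 0xd8, 0x72, 0xc1, 0xde  /* CRC64, little endian */
};

/* Hypothetical container for the parsed fields. */
typedef struct {
    unsigned type;         /* RDB object type byte */
    unsigned len;          /* string length from the 6-bit length prefix */
    unsigned rdb_version;  /* 2-byte footer field */
    uint64_t crc;          /* 8-byte footer field */
} dump_fields;

/* Returns 0 on success, -1 if the buffer is not a short raw-string payload. */
static int parse_short_string_dump(const unsigned char *p, size_t n,
                                   dump_fields *out) {
    assert(p != NULL && out != NULL);
    if (n < 12) return -1;            /* type + length + 10-byte footer */
    if (p[1] & 0xC0) return -1;       /* only the 6-bit length form is handled */
    out->type = p[0];
    out->len = p[1] & 0x3F;
    if (n != 2 + (size_t)out->len + 10) return -1;

    /* Both footer fields are stored low byte first. */
    out->rdb_version = p[n - 10] | ((unsigned)p[n - 9] << 8);
    out->crc = 0;
    for (int i = 0; i < 8; i++)
        out->crc |= (uint64_t)p[n - 8 + i] << (8 * i);
    return 0;
}
```

Calling `parse_short_string_dump(example_dump, sizeof(example_dump), &f)` yields type 0, length 21 and RDB version 6: the reply is one type byte, a length-prefixed string, and the 10-byte footer.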

As we can see, the DUMP command serializes the value of a given key. So what is serialization? It is the process of converting an object's state into a form that can be stored or transmitted. During serialization, an object writes its current state to temporary or persistent storage. Later, the object can be recreated by reading that state back, which is deserialization. The purpose is to let objects be stored across platforms and transmitted over networks.
Using the command is simple: DUMP key. It is usually paired with RESTORE, one serializing and the other deserializing. If you want to know more about serialization, the recommended reading is the article "Serialization is easy to understand".

Source code analysis

First, we paste the source code:

/* DUMP keyname
 * DUMP is actually not used by Redis Cluster but it is the obvious
 * complement of RESTORE and can be useful for different applications. */
void dumpCommand(client *c) {
    robj *o, *dumpobj;
    rio payload;
    /* Check whether the key exists. */
    if ((o = lookupKeyRead(c->db,c->argv[1])) == NULL) {
        addReply(c,shared.nullbulk);
        return;
    }
    /* Create the DUMP payload. */
    createDumpPayload(&payload,o);
    /* Transfer it to the client. */
    dumpobj = createObject(OBJ_STRING,payload.io.buffer.ptr);
    addReplyBulk(c,dumpobj);
    decrRefCount(dumpobj);
    return;
}

Next, let's walk through it step by step.
The core of the DUMP command is creating the dump payload, so that is where we focus. The overall flow is: first, check whether the key to serialize exists; if it does, create the dump payload; finally, send the payload to the client.

  1. Check whether the key exists. This is done by lookupKeyRead, which looks the key up in the client's Redis DB. If the key exists, execution continues; otherwise a null reply is sent to the client. This is why DUMP on a non-existent key returns nil.
  2. Create the dump payload, which is the core of the DUMP command. The implementation is as follows:

     void createDumpPayload(rio *payload, robj *o) {
         unsigned char buf[2];
         uint64_t crc;

         /* Serialize the object in a RDB-like format. It consist of an object type
          * byte followed by the serialized object. This is understood by RESTORE. */
         rioInitWithBuffer(payload,sdsempty());
         /* Write the object type byte; abort on failure. */
         serverAssert(rdbSaveObjectType(payload,o));
         /* Write the serialized object itself; abort on failure. */
         serverAssert(rdbSaveObject(payload,o));

         /* Write the footer, this is how it looks like:
          * ----------------+---------------------+---------------+
          * ... RDB payload | 2 bytes RDB version | 8 bytes CRC64 |
          * ----------------+---------------------+---------------+
          * RDB version and CRC are both in little endian.
          */

         /* RDB version, stored in two bytes (0-65535), low byte first. */
         buf[0] = RDB_VERSION & 0xff;
         buf[1] = (RDB_VERSION >> 8) & 0xff;
         /* sdscatlen appends the given number of bytes to an sds string,
          * growing it if needed. */
         payload->io.buffer.ptr = sdscatlen(payload->io.buffer.ptr,buf,2);

         /* Compute the 8-byte CRC64 over everything written so far
          * (type byte, serialized object, and RDB version). */
         crc = crc64(0,(unsigned char*)payload->io.buffer.ptr,
                     sdslen(payload->io.buffer.ptr));
         /* memrev64ifbe reverses the 8 bytes only when the host is big endian,
          * so the CRC is always stored in little-endian order. The same family
          * of helpers (16-, 32-, and 64-bit variants) is used by the intset,
          * ziplist, and zipmap structures, which keeps the RDB format identical
          * on machines of either byte order. */
         memrev64ifbe(&crc);
         payload->io.buffer.ptr = sdscatlen(payload->io.buffer.ptr,&crc,8);
     }
     The resulting serialized value has the following layout:
     +-------------+---------------------+---------------+
     | RDB payload | 2 bytes RDB version | 8 bytes CRC64 |
     +-------------+---------------------+---------------+

  3. Send the result to the client. The serialized payload is wrapped in a string object and sent as a bulk reply. I will cover the reply mechanism in a separate article.
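
Returning to the footer from step 2, here is a minimal standalone sketch of how it could be written and then checked on the RESTORE side, without sds or rio. Everything in it is an assumption made for illustration: `SKETCH_RDB_VERSION` (set to 9, the value RDB_VERSION is assumed to have in Redis 5), `crc64_bitwise` (a bit-at-a-time CRC-64 with the Jones polynomial, meant as a stand-in for Redis's table-driven crc64() and not verified here against real DUMP output), and the helpers `append_footer` and `check_footer`. Bytes are written with shifts, so the result is little endian on any host, which is the effect memrev64ifbe achieves in the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RDB_VERSION 9  /* assumption: RDB_VERSION in Redis 5 */

/* Bit-at-a-time reflected CRC-64 (Jones polynomial). A short, slow
 * stand-in for Redis's crc64(); assumed, not verified, to match it. */
static uint64_t crc64_bitwise(uint64_t crc, const unsigned char *s, size_t len) {
    for (size_t i = 0; i < len; i++) {
        crc ^= s[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ 0x95AC9329AC4BC9B5ULL : crc >> 1;
    }
    return crc;
}

/* Appends the 10-byte footer after buf[0..body_len) and returns the new
 * length. The caller must provide room for body_len + 10 bytes. */
static size_t append_footer(unsigned char *buf, size_t body_len) {
    assert(buf != NULL);
    size_t n = body_len;

    /* 2-byte version, low byte first. */
    buf[n++] = SKETCH_RDB_VERSION & 0xff;
    buf[n++] = (SKETCH_RDB_VERSION >> 8) & 0xff;

    /* The checksum covers the body plus the version bytes. */
    uint64_t crc = crc64_bitwise(0, buf, n);
    for (int i = 0; i < 8; i++)
        buf[n++] = (unsigned char)(crc >> (8 * i));  /* low byte first */
    return n;
}

/* Returns 1 if the footer is acceptable, 0 otherwise: the payload must be
 * long enough, its version must not be newer than ours, and the stored
 * CRC must match the one recomputed over everything before it. */
static int check_footer(const unsigned char *p, size_t len) {
    assert(p != NULL);
    if (len < 10) return 0;
    const unsigned char *footer = p + (len - 10);

    unsigned version = footer[0] | ((unsigned)footer[1] << 8);
    if (version > SKETCH_RDB_VERSION) return 0;

    uint64_t stored = 0;
    for (int i = 0; i < 8; i++)
        stored |= (uint64_t)footer[2 + i] << (8 * i);
    return crc64_bitwise(0, p, len - 8) == stored;
}
```

For a 4-byte body, `append_footer` returns 14, and `check_footer` accepts the result until any byte of the body is changed. Because the checksum also covers the version bytes, a tampered version is caught by the same comparison.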

If you are interested, try stepping through this code in gdb yourself!

Posted by shinyo on Thu, 19 Sep 2019 20:31:49 -0700