redis source code analysis -- 6. Implementation of skip list

Keywords: C++ Database Redis source code list

The skip list is a very useful data structure that also comes up often in interviews. Its efficiency is roughly equivalent to that of a red-black tree, and it is much simpler to implement.

1. How to implement a dict by yourself

Unlike C++, Java, and other high-level languages with a built-in map, C does not provide a dict library, so if you want a dict you have to implement it yourself. So how do you implement one?

  1. Array method
    • This is the easiest method to implement. In short, allocate a large array of length N (usually a prime number), compute the hash value D of the key with a hash function, and use D % N as the array index (subscript) for that key. If the array were large enough and the hash function scattered keys well enough that different keys never mapped to the same index, this alone would give us a dict.
    • In practice this approach has serious limitations. First, we don't know how large the array should be: even if we allocate one million slots, that may not be enough as the business grows. Second, no hash function can guarantee that different keys never produce the same hash value.
    • For these reasons, an array-based dict usually has a dynamic length: we decide when to grow or shrink it based on the load factor (number of elements / array length). We also pick a hash function that scatters keys as evenly as possible. Finally, keys that still collide are handled with separate chaining (the "zipper" method), where each slot holds a linked list of entries.
  2. Red black tree method
    • The underlying implementation of map in C++ is typically a red-black tree. The core idea of a red-black tree is to keep the binary tree approximately balanced between left and right, which makes the implementation fairly complex because many cases have to be distinguished. No hash function is needed; a key comparison function is enough, so key collisions do not arise.
  3. Skip table method
    • If you want the behavior of a red-black tree without implementing one, a skip list is a good choice, and its efficiency is almost equivalent to that of a red-black tree. The zset in Redis uses a skip list under the hood; the specific implementation will be discussed later.
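To make the array method from item 1 concrete, here is a minimal sketch of a dict with string keys, int values, and separate chaining. Everything in it (the `toy_` names, the fixed bucket count of 7, the djb2-style hash) is an assumption chosen for illustration and has nothing to do with the Redis code; it also leaves out resizing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BUCKETS 7 /* small prime, fixed for this sketch (no resizing) */

typedef struct toy_entry {
    char *key;
    int val;
    struct toy_entry *next; /* next entry in the same bucket */
} toy_entry;

typedef struct toy_dict {
    toy_entry *table[TOY_BUCKETS]; /* each slot is the head of a chain */
    size_t used;
} toy_dict;

/* djb2-style string hash; any reasonably uniform hash would do. */
static unsigned long toy_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

static void toy_init(toy_dict *d) {
    memset(d, 0, sizeof(*d));
}

/* Return the entry for key, or NULL if it is absent. */
static toy_entry *toy_find(toy_dict *d, const char *key) {
    toy_entry *e = d->table[toy_hash(key) % TOY_BUCKETS];
    while (e) {
        if (strcmp(e->key, key) == 0) return e; /* same slot, so compare keys */
        e = e->next;
    }
    return NULL;
}

/* Insert or overwrite. Returns 1 if a new entry was created, 0 if updated. */
static int toy_put(toy_dict *d, const char *key, int val) {
    toy_entry *e = toy_find(d, key);
    if (e) { e->val = val; return 0; }
    unsigned long idx = toy_hash(key) % TOY_BUCKETS;
    e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = malloc(strlen(key) + 1);
    assert(e->key != NULL);
    strcpy(e->key, key);
    e->val = val;
    e->next = d->table[idx]; /* push at the head of the chain */
    d->table[idx] = e;
    d->used++;
    return 1;
}

/* Release every entry and leave the dict empty. */
static void toy_free(toy_dict *d) {
    for (size_t i = 0; i < TOY_BUCKETS; i++) {
        toy_entry *e = d->table[i];
        while (e) {
            toy_entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
        d->table[i] = NULL;
    }
    d->used = 0;
}
```

Calling `toy_put(&d, "a", 1)` and then `toy_find(&d, "a")` returns the stored entry; a second `toy_put` with the same key only overwrites the value. With a fixed bucket count the chains grow as more keys arrive, which is exactly the problem dynamic resizing addresses.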

2. Implementation of dict in redis

  • Redis implements dict with the first method above, that is, an array with separate chaining. For the hash function, Redis chose MurmurHash2 in older releases (as far as I know, 4.0 and later use SipHash instead). Let's look at the structure definitions.

    typedef struct dictEntry {
        void *key;
        union {
            void *val;
            uint64_t u64;
            int64_t s64;
            double d;
        } v;
        struct dictEntry *next;
    } dictEntry; /* An entry can be thought of as a node. Besides the value it also stores the key and a next pointer, because when a collision occurs we have to compare keys along the chain */
    typedef struct dictType {
        uint64_t (*hashFunction)(const void *key);
        void *(*keyDup)(void *privdata, const void *key);
        void *(*valDup)(void *privdata, const void *obj);
        int (*keyCompare)(void *privdata, const void *key1, const void *key2);
        void (*keyDestructor)(void *privdata, void *key);
        void (*valDestructor)(void *privdata, void *obj);
    } dictType; /* A set of callbacks that lets each dict plug in its own handling functions, which gives the effect of an interface */
    /* This is our hash table structure. Every dictionary has two of this as we
     * implement incremental rehashing, for the old to the new table. */
    typedef struct dictht {
        dictEntry **table; /* Why a pointer to a pointer? It is really an array of pointers: each element stores the head pointer of an entry linked list */
        unsigned long size;
        unsigned long sizemask;
        unsigned long used;
    } dictht; /* The hash table itself: table is an array of pointers, indexed by the slot computed from the key */
    typedef struct dict {
        dictType *type; /* Per-dict implementations of the callbacks above */
        void *privdata; /* Custom data passed to the callbacks */
        dictht ht[2]; /* Two tables; the second is used mainly during rehash */
        long rehashidx; /* rehashing not in progress if rehashidx == -1 */
        unsigned long iterators; /* number of iterators currently running */
    } dict;
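
The `sizemask` field is worth a short illustration. As far as I can tell, Redis keeps the table size at a power of two and sets `sizemask` to `size - 1`, so `hash & sizemask` gives the same result as `hash % size` without a division. The helper below is a hypothetical sketch of that idea, not a function from dict.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bucket index for a table whose size is a power of two.
 * sizemask is size - 1, so the AND keeps only the low bits of the hash.
 * toy_bucket_index is a made-up name for this sketch. */
static unsigned long toy_bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two only */
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For example, with `size = 8` the mask is `0b111`, so a hash of 13 (`0b1101`) lands in bucket 5, the same as `13 % 8`.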

3. Basic operation

  • Public operation

    • Find the index in the array according to the key

      /* Returns the index of a free slot that can be populated with
       * a hash entry for the given 'key'.
       * If the key already exists, -1 is returned
       * and the optional output parameter may be filled.
       * Note that if we are in the process of rehashing the hash table, the
       * index is always returned in the context of the second (new) hash table. */
      /* Compute the slot the key belongs in. If the key already exists, return -1.
       * The function takes both the key and its hash value, because collisions are
       * possible: the hash alone cannot tell us whether the key exists, so the keys
       * themselves must also be compared.
       * If 'existing' is not NULL, it receives the address of the existing node. */
      static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
      {
          unsigned long idx, table;
          dictEntry *he;
          if (existing) *existing = NULL;
          /* Expand the hash table if needed */
          if (_dictExpandIfNeeded(d) == DICT_ERR)
              return -1;
          for (table = 0; table <= 1; table++) {
              idx = hash & d->ht[table].sizemask;
              /* Search if this slot does not already contain the given key */
              he = d->ht[table].table[idx];
              while(he) {
                  if (key==he->key || dictCompareKeys(d, key, he->key)) {
                      if (existing) *existing = he;
                      return -1;
                  }
                  he = he->next;
              }
              /* The second table only needs to be searched while rehashing */
              if (!dictIsRehashing(d)) break;
          }
          return idx;
      }
    • Capacity expansion

      /* Expand the hash table if needed */
      static int _dictExpandIfNeeded(dict *d)
      {
          /* Incremental rehashing already in progress. Return. */
          if (dictIsRehashing(d)) return DICT_OK;
          /* If the hash table is empty expand it to the initial size. */
          /* An empty table is expanded directly to DICT_HT_INITIAL_SIZE, which defaults to 4 */
          if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
          /* If we reached the 1:1 ratio, and we are allowed to resize the hash
           * table (global setting) or we should avoid it but the ratio between
           * elements/buckets is over the "safe" threshold, we resize doubling
           * the number of buckets. */
          /* Expansion is triggered in two cases:
           * - the load factor is >= 1 (used >= size) and resizing is currently allowed;
           * - the load factor exceeds dict_force_resize_ratio (default 5), i.e. the
           *   average chain length is above 5, in which case expansion is forced. */
          if (d->ht[0].used >= d->ht[0].size &&
              (dict_can_resize ||
               d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
          {
              return dictExpand(d, d->ht[0].used*2);
          }
          return DICT_OK;
      }
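
The expansion decision above can be tried out in isolation. The sketch below restates the condition as a pure function; `toy_should_expand` and `toy_next_power` are hypothetical names, and the parameters `can_resize` and `force_ratio` stand in for the globals `dict_can_resize` and `dict_force_resize_ratio`. The rounding helper reflects my understanding that `dictExpand` rounds the requested size up to a power of two starting from 4; treat that part as an assumption.

```c
#include <assert.h>

/* Returns 1 if the table should grow, 0 otherwise. */
static int toy_should_expand(unsigned long used, unsigned long size,
                             int can_resize, unsigned long force_ratio) {
    if (size == 0) return 1;   /* empty table: always allocate */
    if (used < size) return 0; /* load factor below 1 */
    return can_resize || used / size > force_ratio;
}

/* Smallest power of two >= n, never smaller than the initial size 4. */
static unsigned long toy_next_power(unsigned long n) {
    assert(n <= (~0UL >> 1)); /* keep size *= 2 from overflowing */
    unsigned long size = 4;
    while (size < n) size *= 2;
    return size;
}
```

Note the integer division: with 20 elements in 4 buckets the ratio is exactly 5, which is not greater than 5, so a forced expansion only starts once the ratio reaches 6.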
    • rehash

      /* Almost every dict operation calls _dictRehashStep, which spreads the cost of
       * expansion over many operations. Migrating the whole table in one go could
       * take a long time. */
      static void _dictRehashStep(dict *d) {
          if (d->iterators == 0) dictRehash(d,1);
      }
      /* Performs N steps of incremental rehashing. Returns 1 if there are still
       * keys to move from the old to the new hash table, otherwise 0 is returned.
       * Note that a rehashing step consists in moving a bucket (that may have more
       * than one key as we use chaining) from the old to the new hash table, however
       * since part of the hash table may be composed of empty spaces, it is not
       * guaranteed that this function will rehash even a single bucket, since it
       * will visit at max N*10 empty buckets in total, otherwise the amount of
       * work it does would be unbound and the function may block for a long time. */
      int dictRehash(dict *d, int n) {
          int empty_visits = n*10; /* Max number of empty buckets to visit. */
          if (!dictIsRehashing(d)) return 0;
          while(n-- && d->ht[0].used != 0) {
              dictEntry *de, *nextde;
              /* Note that rehashidx can't overflow as we are sure there are more
               * elements because ht[0].used != 0 */
              assert(d->ht[0].size > (unsigned long)d->rehashidx);
              while(d->ht[0].table[d->rehashidx] == NULL) { /* Skip empty buckets */
                  d->rehashidx++;
                  /* Not present in 3.0: caps the number of empty buckets visited
                   * so the loop cannot stall on a long run of empty slots */
                  if (--empty_visits == 0) return 1;
              }
              de = d->ht[0].table[d->rehashidx];
              /* Move all the keys in this bucket from the old to the new hash HT */
              while(de) {
                  uint64_t h;
                  nextde = de->next;
                  /* Get the index in the new hash table */
                  h = dictHashKey(d, de->key) & d->ht[1].sizemask;
                  /* Insert the node at the head of its slot in the new table */
                  de->next = d->ht[1].table[h];
                  d->ht[1].table[h] = de;
                  d->ht[0].used--;
                  d->ht[1].used++;
                  de = nextde;
              }
              d->ht[0].table[d->rehashidx] = NULL;
              d->rehashidx++;
          }
          /* Check if we already rehashed the whole table... */
          /* If ht[0] has no elements left, the rehash is complete: the old ht[0]
           * is released and ht[1] takes its place */
          if (d->ht[0].used == 0) {
              zfree(d->ht[0].table);
              d->ht[0] = d->ht[1];
              _dictReset(&d->ht[1]);
              d->rehashidx = -1;
              return 0;
          }
          /* More to rehash... */
          return 1;
      }
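
To see incremental rehashing work end to end, here is a self-contained toy version. It is only a sketch under simplifying assumptions: keys are unsigned integers used directly as their own hash, every name with a `toy_` prefix is invented for this example, and the cap on empty-bucket visits is left out. It keeps the three ideas from the Redis code: one bucket is migrated per step, new keys go to the new table while a rehash is in progress, and lookups check both tables.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_node {
    unsigned long key;
    struct toy_node *next;
} toy_node;

typedef struct toy_table {
    toy_node **buckets;
    unsigned long size; /* power of two */
    unsigned long used;
} toy_table;

typedef struct toy_rdict {
    toy_table ht[2]; /* ht[0] = old table, ht[1] = new table */
    long rehashidx;  /* -1 when no rehash is in progress */
} toy_rdict;

static void toy_table_init(toy_table *t, unsigned long size) {
    t->buckets = calloc(size, sizeof(*t->buckets));
    assert(t->buckets != NULL);
    t->size = size;
    t->used = 0;
}

/* Insert into the table that currently accepts new keys. */
static void toy_insert(toy_rdict *d, unsigned long key) {
    toy_table *t = d->rehashidx != -1 ? &d->ht[1] : &d->ht[0];
    toy_node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->key = key;
    n->next = t->buckets[key & (t->size - 1)];
    t->buckets[key & (t->size - 1)] = n;
    t->used++;
}

/* Look in ht[0], and also in ht[1] while a rehash is in progress. */
static int toy_contains(const toy_rdict *d, unsigned long key) {
    for (int t = 0; t <= 1; t++) {
        const toy_table *tab = &d->ht[t];
        if (tab->size != 0) {
            for (toy_node *n = tab->buckets[key & (tab->size - 1)]; n; n = n->next)
                if (n->key == key) return 1;
        }
        if (d->rehashidx == -1) break;
    }
    return 0;
}

/* Begin migrating ht[0] into a fresh table of new_size buckets. */
static void toy_start_rehash(toy_rdict *d, unsigned long new_size) {
    toy_table_init(&d->ht[1], new_size);
    d->rehashidx = 0;
}

/* Move one non-empty bucket. Returns 1 if more work remains, else 0. */
static int toy_rehash_step(toy_rdict *d) {
    if (d->rehashidx == -1) return 0;
    if (d->ht[0].used != 0) {
        /* used != 0 guarantees a non-empty bucket at or after rehashidx */
        while (d->ht[0].buckets[d->rehashidx] == NULL) d->rehashidx++;
        toy_node *n = d->ht[0].buckets[d->rehashidx];
        while (n) {
            toy_node *next = n->next;
            unsigned long h = n->key & (d->ht[1].size - 1);
            n->next = d->ht[1].buckets[h];
            d->ht[1].buckets[h] = n;
            d->ht[0].used--;
            d->ht[1].used++;
            n = next;
        }
        d->ht[0].buckets[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) { /* done: the new table replaces the old one */
        free(d->ht[0].buckets);
        d->ht[0] = d->ht[1];
        d->ht[1].buckets = NULL;
        d->ht[1].size = d->ht[1].used = 0;
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

With keys 0 to 7 in a 4-bucket table, each call to `toy_rehash_step` moves two keys, and the fourth call finishes the migration and swaps the tables. Any key inserted in between goes straight into the new table, so it never has to be moved.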
  • create

    /* Create a new hash table */
    dict *dictCreate(dictType *type,
            void *privDataPtr)
    {
        dict *d = zmalloc(sizeof(*d));
        _dictInit(d,type,privDataPtr);
        return d;
    }
    /* Initialize the hash table */
    int _dictInit(dict *d, dictType *type,
            void *privDataPtr)
    {
        _dictReset(&d->ht[0]); /* All members are set to 0 or NULL */
        _dictReset(&d->ht[1]);
        d->type = type;
        d->privdata = privDataPtr;
        d->rehashidx = -1;
        d->iterators = 0;
        return DICT_OK;
    }
  • insert

    /* Create a new entry for the key. If the key already exists, NULL is returned */
    dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
    {
        long index;
        dictEntry *entry;
        dictht *ht;
        if (dictIsRehashing(d)) _dictRehashStep(d);
        /* Get the index of the new element, or -1 if
         * the element already exists. */
        if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
            return NULL;
        /* Allocate the memory and store the new entry.
         * Insert the element in top, with the assumption that in a database
         * system it is more likely that recently added entries are accessed
         * more frequently. */
        /* Following the locality principle, the new node is inserted at the head
         * of the chain. While rehashing, new nodes always go into ht[1]. */
        ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
        entry = zmalloc(sizeof(*entry));
        entry->next = ht->table[index];
        ht->table[index] = entry;
        ht->used++;
        /* Set the hash entry fields. */
        dictSetKey(d, entry, key);
        return entry;
    }
    /* Add an element to the target hash table */
    int dictAdd(dict *d, void *key, void *val)
    {
        /* First, create an entry for the key */
        dictEntry *entry = dictAddRaw(d,key,NULL);
        if (!entry) return DICT_ERR;
        /* Then set the value of the entry */
        dictSetVal(d, entry, val);
        return DICT_OK;
    }
  • delete

    /* Remove an element, returning DICT_OK on success or DICT_ERR if the
     * element was not found. */
    int dictDelete(dict *ht, const void *key) {
        return dictGenericDelete(ht,key,0) ? DICT_OK : DICT_ERR;
    }
    /* Search and remove an element. This is an helper function for
     * dictDelete() and dictUnlink(), please check the top comment
     * of those functions. */
    /* Delete the element that matches the key */
    static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
        uint64_t h, idx;
        dictEntry *he, *prevHe;
        int table;
        if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL; /* dict is empty: return early */
        if (dictIsRehashing(d)) _dictRehashStep(d);
        h = dictHashKey(d, key); /* Calculate the hash value */
        for (table = 0; table <= 1; table++) { /* Visit ht[0] and ht[1] in turn */
            idx = h & d->ht[table].sizemask; /* Slot of the key in this hash table */
            he = d->ht[table].table[idx]; /* Head of the entry list for this slot */
            prevHe = NULL;
            while(he) {
                if (key==he->key || dictCompareKeys(d, key, he->key)) { /* Found the key */
                    /* Unlink the element from the list */
                    if (prevHe)
                        prevHe->next = he->next; /* Ordinary linked-list removal */
                    else
                        d->ht[table].table[idx] = he->next; /* Removing the head of the chain */
                    if (!nofree) {
                        dictFreeKey(d, he);
                        dictFreeVal(d, he);
                        zfree(he);
                    }
                    d->ht[table].used--;
                    return he;
                }
                prevHe = he;
                he = he->next;
            }
            if (!dictIsRehashing(d)) break;
        }
        return NULL; /* not found */
    }
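
The unlink step is the only subtle part of deletion, so here it is again on a bare singly linked chain. This is a minimal sketch with made-up names (`toy_link`, `toy_push`, `toy_remove`) and plain int keys; it mirrors the `prevHe` logic above, where removing the head means updating the bucket slot itself.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_link {
    int key;
    struct toy_link *next;
} toy_link;

/* Push a new node at the head of the chain and return the new head. */
static toy_link *toy_push(toy_link *head, int key) {
    toy_link *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->key = key;
    n->next = head;
    return n;
}

/* Unlink and free the first node with the given key.
 * Returns 1 if a node was removed, 0 if the key was not found. */
static int toy_remove(toy_link **head, int key) {
    toy_link *prev = NULL;
    for (toy_link *n = *head; n; prev = n, n = n->next) {
        if (n->key != key) continue;
        if (prev)
            prev->next = n->next; /* middle or tail: bypass the node */
        else
            *head = n->next;      /* head: the bucket slot itself changes */
        free(n);
        return 1;
    }
    return 0;
}
```

Pushing 1, 2, 3 yields the chain 3 -> 2 -> 1, since each push goes to the head. Removing 2 takes the `prev` branch, while removing 3 takes the head branch and rewrites the slot pointer.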

Posted by DamienRoche on Wed, 27 Oct 2021 08:20:37 -0700