HashMap Source Reading Notes

Keywords: Java

1. Node class in HashMap:

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }
  1. hashCode() is overridden: it returns the XOR of the key's hash code and the value's hash code.
  2. equals() is overridden: two entries are equal when they are the same object, or when the other object is a Map.Entry with an equal key and an equal value.
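
The two overrides can be observed from outside HashMap. The following is a minimal sketch, not part of the JDK source: the class EntryContractDemo and its helper methods are hypothetical names introduced for illustration. It compares an entry taken from a HashMap with an AbstractMap.SimpleEntry holding the same key and value.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EntryContractDemo {
    // Hash a key/value pair the way Node.hashCode() does: key hash XOR value hash.
    static int entryHash(Object key, Object value) {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    // Returns the single entry of a one-element HashMap.
    static Map.Entry<String, Integer> onlyEntry(String key, Integer value) {
        Map<String, Integer> map = new HashMap<>();
        map.put(key, value);
        return map.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> node = onlyEntry("a", 1);

        // The entry's hash code is the XOR of the two hash codes.
        System.out.println(node.hashCode() == entryHash("a", 1));               // true
        // Equal key and equal value: equal entries, even across entry classes.
        System.out.println(node.equals(new AbstractMap.SimpleEntry<>("a", 1))); // true
        // Same key but a different value: not equal.
        System.out.println(node.equals(new AbstractMap.SimpleEntry<>("a", 2))); // false
    }
}
```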

2. Calculation of hash value

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
  1. The key's hashCode() is XORed with itself shifted right (unsigned) by 16 bits. This mixes the high 16 bits into the low 16 bits, so both halves influence the bucket index and the hash values are scattered more evenly. A null key always hashes to 0.
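
To see why the high bits matter, here is a small sketch (HashSpreadDemo, spread and index are hypothetical names, not JDK methods). It takes two raw hash codes that differ only in their high 16 bits and computes their bucket index in a 16-slot table, with and without the spreading step.

```java
class HashSpreadDemo {
    // Same spreading step as HashMap.hash(), applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int a = 0x10000; // a and b differ only in the high 16 bits
        int b = 0x20000;

        // Without spreading, only the low 4 bits are used: both land in bucket 0.
        System.out.println(index(a, n) + " " + index(b, n));                 // 0 0
        // With spreading, the high bits reach the low bits: buckets 1 and 2.
        System.out.println(index(spread(a), n) + " " + index(spread(b), n)); // 1 2
    }
}
```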

3. The get(Object key) method

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    
final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
  1. As you can see, get() looks up the value using the hash of the key and the key itself.
  2. In getNode(), after a series of null checks and assignments, the bucket for the key is located in table by index.
  3. Indexing method: (n - 1) & hash, so the index is always less than the table length n. Since n is a power of two, this is equivalent to taking the hash modulo n.
  4. The first node of the bucket is always checked first: if its hash and key match, it is returned. Otherwise, if the bucket is stored as a red-black tree (first is a TreeNode), the lookup continues in the tree.
  5. If not, the linked list is traversed node by node.
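
The indexing step can be checked in isolation. The sketch below (BucketIndexDemo is a hypothetical name) assumes a table length of 16 and compares (n - 1) & hash with Math.floorMod(hash, n) for a few hash values, including negative ones.

```java
class BucketIndexDemo {
    // Index computation used by getNode() and putVal(); n must be a power of two.
    static int index(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {0, 5, 16, 21, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int hash : hashes) {
            // Both columns agree, and the index always stays in the range 0..n-1.
            System.out.println(hash + " -> " + index(hash, n)
                    + " (floorMod: " + Math.floorMod(hash, n) + ")");
        }
    }
}
```

The bit mask avoids a division and also handles negative hash values, for which the plain % operator would return a negative remainder.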

4. The core method put(K key, V value)

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        // Check whether the table needs to be expanded
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
  1. put() simply calls putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict).
  2. The first step is initialization: if table is null or empty, resize() creates it.
  3. Then the bucket is located in table. If there is no collision (the bucket is empty), the new node is stored directly in the table.
  4. If there is a collision, the key of the head node is compared with the new key; if they are equal, the value of the existing node is overwritten and the old value is returned.
  5. Otherwise, putVal() checks whether the head node is an instance of TreeNode (the node type of the red-black tree), and if so, the entry is inserted into the tree with putTreeVal().
  6. If the bucket is not a red-black tree, the linked list is traversed: a node with an equal key has its value overwritten; otherwise the new node is appended at the tail of the list.
  7. HashMap has a constant called TREEIFY_THRESHOLD. When the linked list in a bucket grows to this length, treeifyBin() is called, which converts the list into a red-black tree (or, while the table is still small, resizes the table instead). When a tree bucket later becomes small enough, it is converted back to a linked list. So HashMap uses three data structures: array, linked list and red-black tree.
  8. Each time a new node is added, size is compared with threshold to decide whether the table needs to be expanded.
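
The steps above can be sketched with a much smaller map. This is a simplified illustration, not the JDK implementation: MiniChainedMap is a hypothetical class that keeps a fixed table of 16 buckets and omits resizing, treeification and modCount. It only shows lazy initialization, bucket lookup, overwrite on an equal key, and appending at the tail of the list.

```java
import java.util.Objects;

class MiniChainedMap<K, V> {
    // Simplified node with the same fields as HashMap.Node.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V>[] table;
    private int size;

    // Same spreading function as HashMap.hash().
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    @SuppressWarnings("unchecked")
    V put(K key, V value) {
        if (table == null) {
            table = (Node<K, V>[]) new Node[16]; // lazy initialization (fixed size in this sketch)
        }
        int h = hash(key);
        int i = (table.length - 1) & h;
        Node<K, V> p = table[i];
        if (p == null) {                         // no collision: store directly in the table
            table[i] = new Node<>(h, key, value, null);
            size++;
            return null;
        }
        while (true) {
            if (p.hash == h && Objects.equals(p.key, key)) {
                V old = p.value;                 // existing mapping: overwrite, return old value
                p.value = value;
                return old;
            }
            if (p.next == null) {                // end of list reached: append at the tail
                p.next = new Node<>(h, key, value, null);
                size++;
                return null;
            }
            p = p.next;
        }
    }

    V get(Object key) {
        if (table == null) {
            return null;
        }
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && Objects.equals(e.key, key)) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

A collision is easy to provoke for testing: the strings "Aa" and "BB" have the same hashCode(), so they always share a bucket and the second one is appended behind the first.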

5. Expansion mechanism: resize()

First, three member variables are involved:

  1. Capacity: the length of the table array.
  2. Load factor: loadFactor, normally a value between 0 and 1.
  3. Threshold: threshold = capacity * loadFactor. When size exceeds it, the table is expanded.

Some consequences:

  1. The load factor controls the collision rate of HashMap: a smaller value means fewer collisions at the cost of more memory.
  2. Each expansion doubles the capacity.
  3. Expansion rebuilds the table and redistributes the existing nodes into it, so it is expensive.
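
The relationship between capacity, load factor and threshold can be worked through with a few numbers. The sketch below is only arithmetic, not the real resize(): ResizeThresholdDemo and its methods are hypothetical names, and the values 16 and 0.75f are used on the assumption that the map was created with HashMap's default initial capacity and load factor.

```java
class ResizeThresholdDemo {
    // threshold = capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Doubles the capacity until `entries` no longer exceeds the threshold,
    // mirroring the "size > threshold, then double" rule.
    static int capacityFor(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        while (entries > threshold(capacity, loadFactor)) {
            capacity <<= 1; // each expansion doubles the capacity
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));        // 12
        System.out.println(capacityFor(12, 16, 0.75f));  // 16: 12 entries still fit
        System.out.println(capacityFor(13, 16, 0.75f));  // 32: the 13th entry triggers a doubling
        System.out.println(capacityFor(100, 16, 0.75f)); // 256: thresholds 12, 24, 48, 96 are exceeded
    }
}
```

Because every doubling rebuilds the table, passing a suitable initial capacity to the HashMap constructor when the number of entries is known in advance avoids the intermediate expansions.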

Posted by Rairay on Thu, 02 May 2019 00:20:36 -0700