A detailed look at the underlying implementation of the add method in HashSet

Keywords: Java

1. What you will learn in this chapter:

1. Member variables in HashSet: map, PRESENT

    private transient HashMap<E,Object> map;

2. The no-argument constructor of HashSet

3. The put() method in HashMap in detail

4. The role of putVal() and hash() inside the put() method

5. The purpose and return value of the resize() method called from putVal()

6. The newNode() method used to construct nodes in putVal()

7. The three situations handled by the second if in the putVal() method

2. The underlying code of add in detail

1. When creating a HashSet collection:

    public HashSet() {
        map = new HashMap<>();
    }

As shown above, creating a HashSet with new calls the no-argument constructor, which creates a HashMap object and assigns it to map (a member variable of HashSet).

2. Use the add method in HashSet:

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

You can see that add returns a boolean and delegates to the put() method of map. The argument e is an element of the set's generic type E, and PRESENT is a constant dummy object that is stored as the value for every key. The put method is analyzed below:

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
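The reason add() can be a one-line comparison is that HashMap.put() returns the previous value stored for the key, or null if there was none. A minimal sketch of that contract follows. It uses a plain HashMap, and the class name PutReturnDemo and the helper addTo are hypothetical names chosen for illustration, with PRESENT standing in for the constant in HashSet:

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Stand-in for HashSet's shared dummy value (illustrative only).
    static final Object PRESENT = new Object();

    // Mirrors the shape of HashSet.add: true only when the key was absent.
    static boolean addTo(Map<String, Object> map, String key) {
        return map.put(key, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        System.out.println(addTo(map, "Tom")); // first insertion: put returned null
        System.out.println(addTo(map, "Tom")); // key already present: put returned PRESENT
        System.out.println(map.size());        // still one entry
    }
}
```

The second call does not grow the map, which is what gives a set its no-duplicates behavior.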

1. The put method returns whatever putVal returns. Note that hash(key) is called to produce its first argument. The hash() method is shown below:

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

The return value of this method depends only on the return value of key.hashCode(): the upper 16 bits of the hash code are XORed into the lower 16 bits, so keys with different hashCode() values get different hash values, and a null key always gets 0. (Think: what does the hashCode() method do?)
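To see what the XOR with h >>> 16 does, the expression can be copied into a small standalone program. This is only a sketch: the class name HashSpreadDemo and the method name spread are hypothetical, and the 16-slot table used for the bucket comparison is an assumption based on HashMap's default capacity.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(), applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int h = "Tom".hashCode();
        System.out.println(Integer.toHexString(h));        // raw hash code in hex
        System.out.println(Integer.toHexString(h >>> 16)); // upper 16 bits shifted down
        System.out.println(spread(h));                     // value passed on to putVal

        // Two hash codes that differ only in their upper bits...
        int a = 0x10000, b = 0x20000;
        // ...share a bucket in a 16-slot table if only the low bits are used,
        System.out.println((a & 15) + " " + (b & 15));
        // but are separated once the high bits are folded into the low bits.
        System.out.println((spread(a) & 15) + " " + (spread(b) & 15));
    }
}
```

Folding the high bits down matters because small tables only look at the lowest bits of the hash when choosing a bucket.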

The toString() method is introduced next. By default (when a class does not override it), it returns the class name followed by "@" and the hash code in hexadecimal. Note: the String class overrides this method, so printing a String outputs its characters instead.

    public String toString() {
        return getClass().getName() + "@" + Integer.toHexString(hashCode());
    }

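From this we can see that the part after "@" is simply hashCode() converted to hexadecimal by Integer.toHexString(). For a class that does not override hashCode(), this value identifies the object and is often described as being derived from its address, although the JVM is not required to use the actual memory address: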

public class Test {
	public static void main(String[] args) {
		Test test=new Test();
		System.out.println(test.toString());
		System.out.println(Integer.toHexString(test.hashCode()));
	}
}

The results are as follows:

    moon.Test@52e922
    52e922
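The relationship between the default toString() and hashCode() can also be checked in code rather than by comparing printed values. The sketch below rebuilds the default string by hand; the class name IdentityHashDemo and the helper defaultToString are hypothetical, and the printed hash value will differ from run to run.

```java
class IdentityHashDemo {
    // Rebuilds what Object.toString() produces for an object that does not override it.
    static String defaultToString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(o.toString());
        System.out.println(defaultToString(o));
        // For a class that does not override hashCode(), it matches the identity hash code.
        System.out.println(o.hashCode() == System.identityHashCode(o));
    }
}
```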

Note: the String class overrides hashCode(), computing it from the characters of the string, so two String objects with the same content always return the same hashCode():

    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

This can be demonstrated as follows:

public class Test {
	public static void main(String[] args) {
		String str="Tom";
		String str2=new String("Tom");
		System.out.println(str==str2);
		System.out.println(str.hashCode());
		System.out.println(str2.hashCode());
	}
}

The result of execution is:

    false
    84274
    84274

The results show that the two hashCode() values are the same even though str and str2 refer to different objects (str == str2 is false).
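The value 84274 can be reproduced by applying the h = 31 * h + val[i] loop from String.hashCode() by hand. The following is a minimal sketch; the class name StringHashDemo and the method manualHash are hypothetical names for illustration.

```java
class StringHashDemo {
    // Recomputes the polynomial h = 31 * h + c over the characters of s.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'T' = 84, 'o' = 111, 'm' = 109
        // (84 * 31 + 111) * 31 + 109 = 84274
        System.out.println(manualHash("Tom"));
        System.out.println("Tom".hashCode());
    }
}
```

Because the result depends only on the characters, any two strings with the same content hash to the same value regardless of how the objects were created.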

2. putVal method:

First, take a look at the source code:

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
      }

At the first if():

    if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

(1) The resize() method: on the first call, it creates the bucket array and assigns it to the member variable table, and putVal stores the returned array in its local variable tab, so both refer to the same array of Node objects.

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

The lines that matter for the first call are:

        newCap = DEFAULT_INITIAL_CAPACITY;

        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

        table = newTab;

        return newTab;

        static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

You can see that, on the first call, resize() makes table and the local variable tab refer to the same array of Node objects and fixes the initial length of that array: n = 16.
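Alongside the capacity, resize() also sets the threshold at which the next resize happens. The arithmetic for the default case can be sketched as below. The class name ResizeArithmeticDemo is hypothetical, and the load factor 0.75f is HashMap's DEFAULT_LOAD_FACTOR, which is not shown in the excerpt above.

```java
class ResizeArithmeticDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16, as in HashMap
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // HashMap's default load factor

    // Threshold computed the way resize() does it for the default case.
    static int threshold(int capacity) {
        return (int) (DEFAULT_LOAD_FACTOR * capacity);
    }

    public static void main(String[] args) {
        int cap = DEFAULT_INITIAL_CAPACITY;
        for (int step = 0; step < 3; step++) {
            System.out.println("capacity=" + cap + " threshold=" + threshold(cap));
            cap <<= 1; // each later resize doubles the table
        }
    }
}
```

With the defaults, the table starts with 16 buckets and is resized once more than 12 entries have been stored.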

The second if() statement:

        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);

The condition checks whether the bucket at index (n - 1) & hash is still empty. We know that the value returned by hash() is derived from hashCode(), so a key whose hash maps to an unused bucket is new to the map; in that case the statement after the if() creates a new node in that bucket, and execution then reaches the end of the method:
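The index expression (n - 1) & hash can be tried out on its own. The sketch below assumes the default table length of 16 and uses the key "Tom" from the earlier examples; the class name BucketIndexDemo and the method names spread and indexFor are hypothetical.

```java
class BucketIndexDemo {
    // Same spreading step as HashMap.hash().
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a hash in a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int hash = spread("Tom".hashCode());
        System.out.println(indexFor(hash, n));
        // For a power-of-two n, the mask gives the same result as a non-negative modulo.
        System.out.println(Math.floorMod(hash, n));
    }
}
```

Masking with n - 1 only works as a modulo because the table length is always a power of two, which is why resize() grows the table by doubling.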

        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

This successfully adds the element: put() returns null, so add() returns true.

If the bucket is not empty, the node already stored there has been assigned to the variable p by the condition itself, so p is an existing node whose hash maps to the same bucket.

The following explores what happens when the hash of the new key is the same as that of p:

    else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
    }

There are three scenarios discussed below:

1. When adding strings directly:

public class Test {
	public static void main(String[] args) {
		HashSet<String> set =new HashSet<>();
		set.add("Tom");
		set.add("Tom");
	}
}

In this case both calls pass the same string literal, so the key reference and the hash are the same. The if inside the else branch is therefore true (p.key == key), p is assigned to e, and execution proceeds to the next step:

        if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
        }

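Both if() conditions are true at this point, so the old value is overwritten with the new value (both are PRESENT; the key itself is not replaced) and the old value, which is non-null, is returned. put() therefore returns non-null and add() returns false.

2. When the String is created with new: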

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<String> set =new HashSet<>();
		set.add("Tom");
		String name =new String("Tom");
		System.out.println(set.add(name));
	}
}

As mentioned above, the String class overrides the hashCode() method, so the hash is the same in this case. Let's look at this code again:

    else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
    }

The hash is the same as before, but the references are different, so k == key is false and key.equals(k) is evaluated. Because String also overrides equals() to compare content, the result is true, so e = p and the same block runs as before:

        if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
        }

In this case add() returns false and the program prints false.

3. When the elements are objects of a class that does not override hashCode() and equals():

import java.util.HashSet;
public class Test {
	public static void main(String[] args) {
		HashSet<Test> set =new HashSet<>();
		set.add(new Test());
		set.add(new Test());
	}
}

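In this case both objects are added, because two distinct objects almost always have different default hash codes, and even if they collide in the same bucket, the default equals() compares references and returns false.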
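The three scenarios can be checked side by side by printing the return value of add() in each case. This is a sketch under the assumption that Plain, a hypothetical empty class, stands in for any class that keeps Object's hashCode() and equals(); the class name ThreeCasesDemo is also hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class ThreeCasesDemo {
    // A class that keeps Object's hashCode() and equals() (illustrative only).
    static class Plain { }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>();
        System.out.println(names.add("Tom"));             // new key
        System.out.println(names.add("Tom"));             // same reference as the stored key
        System.out.println(names.add(new String("Tom"))); // different reference, equal content
        System.out.println(names.size());

        Set<Plain> plains = new HashSet<>();
        System.out.println(plains.add(new Plain()));
        System.out.println(plains.add(new Plain()));      // distinct object, not equal to the first
        System.out.println(plains.size());
    }
}
```

Only the first add of "Tom" succeeds, while both Plain objects are stored because nothing makes them equal to each other.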

3. Applying this in a project:

In your own classes, you can override hashCode() and equals() so that a HashSet treats objects with the same logical identity as duplicates, which is why we study the underlying code:

public class Student {
	public String id;
	public String name; 
	public Student(String id, String name) {
		this.id = id;
		this.name = name;
		
	}
	@Override
	public int hashCode() {
		return id.hashCode();
	}
	@Override
	public boolean equals(Object obj) {
		if(obj instanceof Student) {
			Student student =(Student)obj;
			return this.id.equals(student.id);
		}
		return false;
	}
}
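A short usage sketch shows the effect of these overrides on a HashSet. To keep the example self-contained it contains a condensed copy of the Student class above; the class name StudentSetDemo and the sample ids and names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class StudentSetDemo {
    // Condensed copy of the Student class above so the example compiles on its own.
    static class Student {
        final String id;
        final String name;

        Student(String id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Student && id.equals(((Student) obj).id);
        }
    }

    public static void main(String[] args) {
        Set<Student> students = new HashSet<>();
        System.out.println(students.add(new Student("001", "Tom")));
        // Same id, so hashCode() and equals() both match the stored element.
        System.out.println(students.add(new Student("001", "Thomas")));
        System.out.println(students.add(new Student("002", "Tom")));
        System.out.println(students.size());
    }
}
```

The second add is rejected even though the name differs, because only id takes part in hashCode() and equals(); the set keeps the Student that was added first.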

This concludes the section. Thank you.

Posted by BANDYCANDY on Sat, 10 Aug 2019 20:12:28 -0700