17. Set, HashSet, and TreeSet collections and their underlying implementations (HashMap and the red-black tree); Collection summary

ONE. The Set Collection

Characteristics of a Set
Unordered (the Set interface does not guarantee that insertion order is preserved) and unique (no duplicate elements)
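
As a quick illustration of the uniqueness property, here is a minimal sketch. The class name SetBasicsDemo and the helper countDistinct are made up for this example; only HashSet.add() and size() come from the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class SetBasicsDemo {
    // Adds each word to a HashSet and returns how many were actually stored.
    static int countDistinct(String... words) {
        Set<String> set = new HashSet<>();
        for (String w : words) {
            set.add(w); // add() returns false when an equal element is already present
        }
        return set.size();
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(set.add("hello")); // true: newly added
        System.out.println(set.add("world")); // true: newly added
        System.out.println(set.add("hello")); // false: duplicate rejected
        System.out.println(set.size());       // 2
    }
}
```

The boolean returned by add() is the easiest way to observe that a duplicate was rejected.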

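TWO. The HashSet Collection

1. The underlying data structure is a hash table (an array whose slots each hold a linked list of entries)

2. How does HashSet achieve the uniqueness of elements?

Take the case of adding strings to a HashSet and read the source code of add() to see why an equal string is not added a second time. (The simplified HashMap code below appears to come from a JDK 7 release; JDK 8 and later restructured put() and hash(), but the idea is the same.)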

interface Collection {
...
}

interface Set extends Collection {
...
}

class HashSet implements Set {
private static final Object PRESENT = new Object();
private transient HashMap<E,Object> map;
//1. The constructor shows that a HashSet is actually backed by a HashMap.
public HashSet() {
map = new HashMap<>();
}

public boolean add(E e) { //e=hello,world
//2. add() simply calls put() on the backing HashMap instance: e is the element being added, and PRESENT is the shared dummy value
//declared above as private static final Object PRESENT = new Object();
  return map.put(e, PRESENT)==null;
}
}

class HashMap implements Map {
//3. The put() method as implemented in HashMap
public V put(K key, V value) { //key=e=hello,world

  //4. If the hash table has not been allocated yet, allocate it.
  if (table == EMPTY_TABLE) {
      inflateTable(threshold);
  }

  //5. A null key is handled separately.
  if (key == null)
      return putForNullKey(value);

  //6_1. Call hash(). As that method shows, its return value is derived from the key's hashCode().
  int hash = hash(key); 

  //7. Compute the bucket index in the table from the hash value.
  int i = indexFor(hash, table.length);
  //8. The loop starts at table[i], the first entry in that bucket.
  //If the bucket is empty the loop body never runs; otherwise each entry in the bucket is compared in turn.
  for (Entry<K,V> e = table[i]; e != null; e = e.next) {
      Object k;
      //9. First compare the stored hash values (entries with different hashes can share a bucket, so this is a cheap filter).
      //If the hashes match and the keys are the same reference or equals() returns true, the key already exists and no entry is added.
      if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
          V oldValue = e.value;
          e.value = value;
          e.recordAccess(this);
          return oldValue;
          //No new element is added here; only the value is replaced.
      }
  }

  modCount++;
  //10. No equal key was found, so add a new entry.
  addEntry(hash, key, value, i); 
  return null;
}

transient int hashSeed = 0;

//6_2. This is the hash() method in HashMap. Its implementation shows that the only input that varies is the key's hashCode().

final int hash(Object k) { //k=key=e=hello,
  int h = hashSeed; //This value defaults to 0.
  if (0 != h && k instanceof String) {
      return sun.misc.Hashing.stringHash32((String) k);
  }

  h ^= k.hashCode(); //What is called here is the hashCode() method of the object

  // This function ensures that hashCodes that differ only by
  // constant multiples at each bit position have a bounded
  // number of collisions (approximately 8 at default load factor).
  h ^= (h >>> 20) ^ (h >>> 12);
  return h ^ (h >>> 7) ^ (h >>> 4);
}
}

HashSet is actually implemented with HashMap, which is an implementation class of the Map interface.
Calling add() on a HashSet actually calls put() on the backing HashMap, which involves two main steps.

1. The hash value of the object is obtained by calling hash(), which is computed from, and therefore controlled by, the object's hashCode().

2. The hash table is created if necessary, and each hash value is mapped to a bucket index in the table.
The comparison then proceeds as follows:

A. First comparison: look in the bucket for an entry with the same hash value as the current element (the value derived from hashCode()). If there is none, the object is added to the HashSet directly; if there is one, go on to the second comparison.

B. Second comparison: for an entry with the same hash value, check whether the two keys are the same reference (e.key == key) or key.equals(e.key) is true. Equal hash values alone do not make two objects the same element; the reference check and equals() are joined by ||, so as soon as either one succeeds the element is considered a duplicate and is not added.
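
The two comparisons A and B can be sketched as a tiny hash set. This is a simplified illustration, not the JDK code: the class TinyHashSet and its Node type are invented for this example, the table has a fixed size of 16 and never resizes, and the hash is used without the extra bit mixing that HashMap.hash() performs.

```java
class TinyHashSet<E> {
    // One node of a bucket's linked list (hypothetical, modeled loosely on HashMap's Entry).
    private static final class Node<E> {
        final int hash;
        final E key;
        final Node<E> next;

        Node(int hash, E key, Node<E> next) {
            this.hash = hash;
            this.key = key;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<E>[] table = (Node<E>[]) new Node[16]; // fixed size, no resizing
    private int size;

    // Returns true if the key was added, false if an equal key was already present.
    boolean add(E key) {
        int hash = (key == null) ? 0 : key.hashCode();
        int i = hash & (table.length - 1); // bucket index, always in 0..15
        for (Node<E> n = table[i]; n != null; n = n.next) {
            // Comparison A: same hash value? Comparison B: same reference or equals()?
            if (n.hash == hash && (n.key == key || (key != null && key.equals(n.key)))) {
                return false; // duplicate: nothing is added
            }
        }
        table[i] = new Node<>(hash, key, table[i]); // prepend to the bucket's list
        size++;
        return true;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        TinyHashSet<String> set = new TinyHashSet<>();
        System.out.println(set.add("hello"));             // true
        System.out.println(set.add(new String("hello"))); // false: equal content
        System.out.println(set.size());                   // 1
    }
}
```

Checking the cheap int comparison first means equals() is only called for candidates that already have the same hash value.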

3. Now we can answer the initial question: why is only one copy stored when equal strings are added? Because the String class overrides hashCode() and equals(), and both its hash code and its equals() result are determined by the content of the string.

Let's look at the hashCode() and equals() methods of the String class

1) public int hashCode() returns the hash code of this string.
The hash code of a String object is calculated according to the following formula:
s[0]*31^(n-1) + s[1]*31^(n-2) + … + s[n-1]
using int arithmetic, where s[i] is the i-th character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of an empty string is 0.)

2) public boolean equals(Object anObject) compares this string to the specified object.
The result is true if and only if the argument is not null and is a String object that represents the same sequence of characters as this object.

So when the elements are Strings with the same content, their hash values are equal, and either their references are equal (possible because of the string constant pool) or equals() returns true, so the second string is not added.
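
To check this against the formula, the sketch below evaluates the polynomial by hand (using Horner's rule, which is an equivalent rearrangement) and compares it with String.hashCode(). The class name StringHashDemo and the helper manualHash are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

class StringHashDemo {
    // Evaluates s[0]*31^(n-1) + ... + s[n-1] in int arithmetic via Horner's rule.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = new String("hello"); // a distinct object with the same characters

        System.out.println(manualHash(a) == a.hashCode()); // true
        System.out.println(a == b);                        // false: different references
        System.out.println(a.equals(b));                   // true: same content

        Set<String> set = new HashSet<>();
        set.add(a);
        set.add(b);
        System.out.println(set.size()); // 1: equal hash and equals() is true
    }
}
```

Even though a and b are different objects, the set keeps only one of them, because the hash values match and equals() returns true.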

4. If the HashSet holds objects of a custom class, how can we achieve uniqueness?
1) From the analysis of the String class above, we know that HashSet makes only two comparisons: first the hash value (which is controlled by hashCode()), and then, if the hash values match, the reference or equals(). So we can implement uniqueness for a custom class by overriding hashCode() and equals().
2) The key idea is that once equals() is overridden to compare the content of the objects, duplicates can be excluded.
The simplest approach is to let hashCode() return a constant, so that every hash value is equal and the decision is left entirely to equals().
But then every new object must be compared with all existing objects, which is inefficient. Instead we can imitate the way the String class overrides hashCode(): compute the hash code from the member variables of the object, and have equals() compare the same member variables.
In the end, Eclipse can generate the final version of both methods for a custom class:

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + age;
        result = prime * result + ((name == null) ? 0 : name.hashCode());
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Student other = (Student) obj;
        if (age != other.age)
            return false;
        if (name == null) {
            if (other.name != null)
                return false;
        } else if (!name.equals(other.name))
            return false;
        return true;
    }

3) In day-to-day development this code does not need to be written by hand; the IDE can generate it automatically.
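
The effect of overriding the two methods can be seen in a small sketch. The classes PlainStudent and ValueStudent are hypothetical stand-ins for the Student class, and Objects.equals()/Objects.hash() are used as a shorter alternative to the generated code above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class UniquenessDemo {
    // No overrides: inherits identity-based equals() and hashCode() from Object.
    static class PlainStudent {
        final String name;
        final int age;

        PlainStudent(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Overrides both methods so that equality is decided by the member variables.
    static class ValueStudent {
        final String name;
        final int age;

        ValueStudent(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            ValueStudent other = (ValueStudent) o;
            return age == other.age && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, age);
        }
    }

    static int plainCount() {
        Set<PlainStudent> set = new HashSet<>();
        set.add(new PlainStudent("lin", 27));
        set.add(new PlainStudent("lin", 27)); // different object, so treated as different
        return set.size();
    }

    static int valueCount() {
        Set<ValueStudent> set = new HashSet<>();
        set.add(new ValueStudent("lin", 27));
        set.add(new ValueStudent("lin", 27)); // same content, so rejected as a duplicate
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(plainCount()); // 2
        System.out.println(valueCount()); // 1
    }
}
```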

THREE. The TreeSet Collection

1: The underlying data structure is a red-black tree (a self-balancing binary search tree)

2: How does TreeSet guarantee the ordering and the uniqueness of its elements?
As before, let's look at the source code of TreeSet.

interface Collection {...}

interface Set extends Collection {...}

interface NavigableMap {

}

class TreeMap implements NavigableMap {
     public V put(K key, V value) {
        Entry<K,V> t = root;
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)
                throw new NullPointerException();
            Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }
}

class TreeSet implements Set {
    private transient NavigableMap<E,Object> m;

    public TreeSet() {
         this(new TreeMap<E,Object>());
    }

    public boolean add(E e) {
        return m.put(e, PRESENT)==null;
    }
}

//When no Comparator is supplied, the actual comparison is done by the element's compareTo() method, which is defined in Comparable.
//So to control the ordering, the element class has to implement the Comparable interface. This interface represents natural ordering.

From the source code, we know that TreeSet can achieve the uniqueness and ordering of its elements in two ways

a: Natural ordering (the elements themselves are comparable)
Let the class to which the elements belong implement the Comparable interface and override the compareTo() method

package cn.itcast_03;

public class Student implements Comparable<Student>{
    private String name;
    private int age;
    public Student() {
        super();
        // TODO Auto-generated constructor stub
    }
    public Student(String name, int age) {
        super();
        this.name = name;
        this.age = age;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public int getAge() {
        return age;
    }
    public void setAge(int age) {
        this.age = age;
    }
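    @Override
    public int compareTo(Student s) {
        // Primary key: age (subtraction is safe here because ages are small non-negative values)
        int num = this.age - s.age;
        // Secondary key: name, used only when the ages are equal
        int num2 = num == 0 ? this.name.compareTo(s.name) : num;
        return num2;
    }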

}
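
A short usage sketch of natural ordering follows. It declares its own nested Student class so that it is self-contained; the class name NaturalOrderDemo, the helper sortedNames, and the sample names and ages are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class NaturalOrderDemo {
    // Same ordering idea as above: by age first, then by name.
    static class Student implements Comparable<Student> {
        final String name;
        final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public int compareTo(Student s) {
            int byAge = Integer.compare(this.age, s.age);
            return byAge != 0 ? byAge : this.name.compareTo(s.name);
        }
    }

    // Adds four students (one duplicate) and returns the names in iteration order.
    static List<String> sortedNames() {
        TreeSet<Student> ts = new TreeSet<>(); // no Comparator: uses compareTo()
        ts.add(new Student("lin", 27));
        ts.add(new Student("zhang", 22));
        ts.add(new Student("wang", 24));
        ts.add(new Student("lin", 27)); // compareTo() returns 0, so it is not added
        List<String> names = new ArrayList<>();
        for (Student s : ts) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames()); // [zhang, wang, lin]
    }
}
```

Note that this Student does not override equals() or hashCode(); TreeSet never calls them, it relies only on the comparison result.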

b: Comparator ordering (the collection is given a way to compare)
Let the collection's constructor receive an object of a Comparator implementation class

This approach supplies the comparison through the set's parameterized constructor.
public TreeSet(Comparator<? super E> comparator) //comparator ordering

Comparator is an interface, and a parameter of an interface type really requires an object of a class that implements the interface.
The API documentation gives the signature of the method in this interface. When an implementation object is needed only as an argument, we often use an anonymous inner class.

TreeSet<Student> ts = new TreeSet<Student>(new Comparator<Student>() {
            public int compare(Student s1,Student s2) {
                //Operator precedence: assignment is lowest, == is highest, and the ternary operator sits between them, so no parentheses are needed.
                int num = s1.getName().length() - s2.getName().length();
                int num2 = num == 0 ? s1.getName().compareTo(s2.getName()) : num;
                int num3 = num2 == 0 ? s1.getAge() - s2.getAge() : num2;
                return num3;                    
            }
        });
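
On Java 8 or later (an assumption; the anonymous inner class above works on older versions too), the same three-level comparison can be written as a chain of Comparator factory methods. This is a sketch: the class name ComparatorChainDemo, the helper ordered, and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ComparatorChainDemo {
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Name length first, then name, then age: the same keys as the anonymous class.
    static final Comparator<Student> BY_LENGTH_NAME_AGE =
            Comparator.comparingInt((Student s) -> s.getName().length())
                    .thenComparing(Student::getName)
                    .thenComparingInt(Student::getAge);

    // Returns "name:age" strings in the set's iteration order.
    static List<String> ordered() {
        TreeSet<Student> ts = new TreeSet<>(BY_LENGTH_NAME_AGE);
        ts.add(new Student("wu", 30));
        ts.add(new Student("zhao", 20));
        ts.add(new Student("li", 25));
        ts.add(new Student("li", 25)); // all three keys equal: not added
        ts.add(new Student("li", 19)); // same name, different age: kept
        List<String> result = new ArrayList<>();
        for (Student s : ts) {
            result.add(s.getName() + ":" + s.getAge());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ordered()); // [li:19, li:25, wu:30, zhao:20]
    }
}
```

Each thenComparing step is consulted only when all earlier keys compare as equal, which is exactly what the chained ternary expressions do.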

3. Points to note

1) From the source code and the underlying red-black tree, we know that the core of the data structure is to place each new element by comparing it with existing nodes, starting from the root. In the tree, the smaller element goes to the left child and the larger to the right child. Our code compares member variables and returns an int: if the value is greater than 0 the element goes to the right, if it is less than 0 it goes to the left, and if it equals 0 the element is not added. This int value alone decides both the ordering and the uniqueness of the elements.

2) When computing this value we often chain ternary operators to fall through to the next field when the previous comparison is 0, so it is worth being fluent with the ternary operator.
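
Two consequences of relying on the sign of the int are worth seeing in code. The sketch below is illustrative only: the class name CompareResultDemo and its helpers are made up. It shows that a comparator which returns 0 for different elements silently drops one of them, and that computing the result by subtraction can overflow for values of large magnitude, which Integer.compare() avoids.

```java
import java.util.Comparator;
import java.util.TreeSet;

class CompareResultDemo {
    // Compares strings by length only, so "cat" and "dog" compare as equal (0).
    static int sizeByLengthOnly(String... words) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        TreeSet<String> ts = new TreeSet<>(byLength);
        for (String w : words) {
            ts.add(w); // returns false when the comparator yields 0 for an existing element
        }
        return ts.size();
    }

    // The subtraction idiom: fine for small values, wrong when a - b overflows.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        System.out.println(sizeByLengthOnly("cat", "dog", "horse")); // 2: "dog" is dropped

        int a = -2_000_000_000;
        int b = 2_000_000_000;
        System.out.println(bySubtraction(a, b) > 0);   // true: overflow makes a look larger than b
        System.out.println(Integer.compare(a, b) < 0); // true: the correct answer
    }
}
```

For fields such as ages or exam scores the subtraction idiom is safe in practice, but Integer.compare() is the more robust habit.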

//With the usual order the smaller value is output first. To output the higher score first, swap s1 and s2 in every comparison.
                //The total score is compared first, then the single subjects in the order Chinese, Mathematics, English.
                int num = s2.getSum() - s1.getSum();
                int num2 = num ==0 ? s2.getChineseScore() - s1.getChineseScore() : num;
                int num3 = num2 == 0 ? s2.getMathScore() - s1.getMathScore() : num2;
                int num4 = num3 == 0 ? s2.getEnglishScore() - s1.getEnglishScore() : num3;
                return num4;
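
Put into a complete comparator, the fragment above might look like the following sketch. The class ScoredStudent, the sample scores, and the final tie-break on the name are assumptions added for illustration; without some final tie-break, two different students with identical scores would compare as 0 and one of them would be dropped from the TreeSet.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ScoreRankingDemo {
    // Hypothetical student record; the getter names mirror those used in the fragment above.
    static class ScoredStudent {
        private final String name;
        private final int chinese;
        private final int math;
        private final int english;

        ScoredStudent(String name, int chinese, int math, int english) {
            this.name = name;
            this.chinese = chinese;
            this.math = math;
            this.english = english;
        }

        String getName() { return name; }
        int getChineseScore() { return chinese; }
        int getMathScore() { return math; }
        int getEnglishScore() { return english; }
        int getSum() { return chinese + math + english; }
    }

    // s2 is compared against s1 (swapped), so higher scores come first.
    static final Comparator<ScoredStudent> HIGH_TO_LOW = (s1, s2) -> {
        int num = Integer.compare(s2.getSum(), s1.getSum());
        int num2 = num == 0 ? Integer.compare(s2.getChineseScore(), s1.getChineseScore()) : num;
        int num3 = num2 == 0 ? Integer.compare(s2.getMathScore(), s1.getMathScore()) : num2;
        int num4 = num3 == 0 ? Integer.compare(s2.getEnglishScore(), s1.getEnglishScore()) : num3;
        // Assumed extra tie-break so that students with identical scores are both kept.
        return num4 == 0 ? s1.getName().compareTo(s2.getName()) : num4;
    };

    static List<String> rankedNames() {
        TreeSet<ScoredStudent> ts = new TreeSet<>(HIGH_TO_LOW);
        ts.add(new ScoredStudent("ann", 90, 80, 70)); // sum 240
        ts.add(new ScoredStudent("bob", 95, 85, 75)); // sum 255
        ts.add(new ScoredStudent("cid", 80, 90, 70)); // sum 240, lower Chinese score than ann
        List<String> names = new ArrayList<>();
        for (ScoredStudent s : ts) {
            names.add(s.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(rankedNames()); // [bob, ann, cid]
    }
}
```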

FOUR. Collection Summary (Mastery)

Collection
    List: ordered, duplicates allowed
        |--ArrayList
            The underlying data structure is an array: fast to query, slow to add or delete.
            Not thread-safe, high efficiency
        |--Vector
            The underlying data structure is an array: fast to query, slow to add or delete.
            Thread-safe, low efficiency
        |--LinkedList
            The underlying data structure is a linked list: slow to query, fast to add or delete.
            Not thread-safe, high efficiency
    Set: unordered, unique
        |--HashSet
            The underlying data structure is a hash table.
            How is the uniqueness of elements ensured?
                It depends on two methods: hashCode() and equals()
                These two methods can be generated automatically during development.
            |--LinkedHashSet
                The underlying data structure is a linked list plus a hash table
                The linked list preserves insertion order
                The hash table ensures element uniqueness
        |--TreeSet
            The underlying data structure is a red-black tree.
            How are the elements sorted?
                Natural ordering
                Comparator ordering
            How is the uniqueness of elements ensured?
                It depends on whether the comparison returns 0
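
The ordering differences between the three Set implementations in the summary can be seen with a small sketch. The class name IterationOrderDemo, the helper inOrder, and the sample words are made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class IterationOrderDemo {
    // Adds the same four words (one duplicate) and returns them in the set's iteration order.
    static List<String> inOrder(Set<String> set) {
        Collections.addAll(set, "pear", "apple", "fig", "apple");
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order.
        System.out.println(inOrder(new LinkedHashSet<String>())); // [pear, apple, fig]
        // TreeSet keeps the elements sorted (natural ordering of String here).
        System.out.println(inOrder(new TreeSet<String>()));       // [apple, fig, pear]
        // HashSet makes no promise about order; only the size is predictable.
        System.out.println(inOrder(new HashSet<String>()).size()); // 3
    }
}
```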

4: Which collection should we use?
Must the elements be unique?
    Yes: Set
        Do they need to be sorted?
            Yes: TreeSet
            No: HashSet
        If you know you need a Set but not which one, use HashSet.

    No: List
        Is thread safety required?
            Yes: Vector
            No: ArrayList or LinkedList
                Mostly queries: ArrayList
                Mostly additions and deletions: LinkedList
        If you know you need a List but not which one, use ArrayList.

If you know you need a Collection but not which one, use ArrayList.

If all you know is that you need some collection, use ArrayList.

5: Common data structures in collections (mastery)
ArrayXxx: The underlying data structure is an array; fast to query, slow to add or delete
LinkedXxx: The underlying data structure is a linked list; slow to query, fast to add or delete
HashXxx: The underlying data structure is a hash table. Depends on two methods: hashCode() and equals()
TreeXxx: The underlying data structure is a binary tree (a red-black tree). Two ways of ordering: natural ordering and comparator ordering

FIVE. Cases

A: Obtain random numbers without duplicates

B: Read students from the keyboard and output them from high to low by total score
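
Case A can be sketched with a HashSet, since the set rejects duplicates for us. The count of 10 and the range 1 to 20 are assumptions chosen for illustration (the case statement does not fix them), and the names UniqueRandomDemo and uniqueRandoms are made up.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class UniqueRandomDemo {
    // Returns `count` distinct random integers in the range 1..bound (inclusive).
    static Set<Integer> uniqueRandoms(int count, int bound, Random random) {
        if (count < 0 || count > bound) {
            throw new IllegalArgumentException("count must be between 0 and bound");
        }
        Set<Integer> result = new HashSet<>();
        while (result.size() < count) {
            // add() ignores a number that is already present, so the loop simply retries
            result.add(random.nextInt(bound) + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = uniqueRandoms(10, 20, new Random());
        System.out.println(numbers.size()); // 10
        System.out.println(numbers);        // ten distinct values between 1 and 20
    }
}
```

The argument check matters: asking for more distinct values than the range contains would otherwise loop forever.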

Posted by a.heresey on Tue, 11 Dec 2018 14:48:06 -0800
