In Java, classes stored in a HashSet should override the equals method and the hashCode method

Keywords: Java Algorithm

When objects of a user-defined class are added to a HashSet in Java, the class must override the equals method and the hashCode method, so that the collection keeps the characteristics of a Set (no guaranteed order and no duplicates).

First:

A Set contains no duplicate elements, and a HashSet guarantees no iteration order.

Why: it models a real-life collection (a mathematical set).

The duplication in question is duplication of objects.

What is a duplicate object: by default, it is the very same object.

What is the same object: by default, the same instance in memory (what == compares for references).

How is that identity represented: the default Object.hashCode() returns an identity-based hash code, often loosely described as the memory address (see the previous article).

Second:

What is the conflict between this definition and the default implementation?

In real life, as long as the attributes are the same, we consider two things to be the same object.

This differs from how Java compares objects by default (Object.equals compares references, and Object.hashCode is identity-based).

Therefore, we need to override both the equals method and the hashCode method (both, not just one) so that the program's results match real life.

String and the wrapper classes of the primitive types (Integer, Long, and so on) already override both methods.
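
As a minimal sketch of this mismatch, assume a hypothetical class PlainPoint (not part of the original article) that overrides neither method. Two instances with identical fields are both kept by a HashSet, while two equal String objects are treated as one element:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration: it overrides neither equals nor hashCode.
class PlainPoint {
    final int x;
    final int y;

    PlainPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class NoOverrideDemo {
    // Adds two PlainPoint objects with identical fields and returns the set size.
    static int sizeAfterAddingTwoEqualLookingPoints() {
        Set<PlainPoint> set = new HashSet<>();
        set.add(new PlainPoint(1, 2));
        set.add(new PlainPoint(1, 2)); // same attributes, different object
        return set.size();
    }

    public static void main(String[] args) {
        // The inherited Object.equals compares references, so both points are kept.
        System.out.println(sizeAfterAddingTwoEqualLookingPoints()); // prints 2

        // String overrides equals and hashCode, so equal contents count once.
        Set<String> names = new HashSet<>();
        names.add(new String("a"));
        names.add(new String("a"));
        System.out.println(names.size()); // prints 1
    }
}
```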

Third:

Why override the equals method and the hashCode method (how it works internally):

When an object is added to a HashSet, the hash code of the object is first computed with its hashCode method.

Comparison: (1) If the hash code differs from the hash code of every existing object in the collection, the object is not a duplicate and is added to the collection.

            (2) If an existing object has the same hash code, the equals method decides whether the two objects are equal (the criterion is whether their attributes are the same).

  1> Equal: not added.

  2> Not equal: added.
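
The decision procedure above can be sketched in a few lines. This is a deliberately simplified illustration, not the real java.util.HashSet source: the class name SimpleSetSketch is hypothetical, elements are kept in a flat list instead of hash buckets, and null elements are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the add logic; the real HashSet only scans one bucket.
class SimpleSetSketch<E> {
    private final List<E> elements = new ArrayList<>();

    // Returns true if the element was added, false if it was a duplicate.
    boolean add(E candidate) {
        int h = candidate.hashCode();          // step 1: compute the hash code
        for (E existing : elements) {
            if (existing.hashCode() != h) {
                continue;                      // case (1): different hash code, cannot be equal
            }
            if (existing.equals(candidate)) {  // case (2): same hash code, ask equals
                return false;                  // 1> equal: not added
            }
            // 2> same hash code but not equal: keep checking the remaining elements
        }
        elements.add(candidate);
        return true;
    }

    int size() {
        return elements.size();
    }
}

class SimpleSetSketchDemo {
    public static void main(String[] args) {
        SimpleSetSketch<String> set = new SimpleSetSketch<>();
        System.out.println(set.add("a")); // true
        System.out.println(set.add("b")); // true
        System.out.println(set.add("a")); // false: duplicate
        System.out.println(set.size());   // 2
    }
}
```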

Two questions:

1. Why can different objects have the same hash code?

2. Why is the equals method still needed after the hash codes have been compared?

  A: First of all:

The hashCode method inherited from the Object class usually returns different hash codes for different objects, although this is not guaranteed. (The default hash code is based on object identity, not on attributes.)

  Then:

The identity-based hash code returned by the hashCode method of the Object class cannot make the program's logic match real life, where objects with the same attributes are regarded as the same object. To make the logic match, subclasses of Object override the hashCode method. (String and the wrapper classes of the primitive types already override both methods; for user-defined classes, the programmer has to override them.)

Next:

What is the purpose of overriding? Overriding achieves this: different objects with the same attributes return the same hash code when their hashCode method is called.
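
A minimal sketch of that purpose, assuming a hypothetical value class Book (the name and fields are illustrative only). It builds the hash code with java.util.Objects.hash from the same fields that equals compares, so two distinct objects with the same attributes agree on both:

```java
import java.util.Objects;

// Hypothetical value class used only for this illustration.
class Book {
    private final String title;
    private final int year;

    Book(String title, int year) {
        this.title = title;
        this.year = year;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Book)) {
            return false;
        }
        Book other = (Book) obj;
        return year == other.year && Objects.equals(title, other.title);
    }

    @Override
    public int hashCode() {
        // Built only from the fields that equals compares.
        return Objects.hash(title, year);
    }
}

class BookDemo {
    public static void main(String[] args) {
        Book a = new Book("Java", 2021);
        Book b = new Book("Java", 2021);
        System.out.println(a == b);                       // false: two distinct objects
        System.out.println(a.hashCode() == b.hashCode()); // true: same attributes, same hash code
        System.out.println(a.equals(b));                  // true
    }
}
```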

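However:

When overriding, we find that practically no implementation can avoid one limitation: some objects with different attributes (which are certainly different objects) return the same hash code (a hash collision). A hash code is a 32-bit int, so a class with more than 2^32 possible attribute combinations cannot give each combination its own hash code.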

Finally:

To handle this: when the hash codes are the same, the equals method compares the corresponding attributes of the two objects, so that no mistake is made.

In this way, both questions above are answered.
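
A small illustration with a class from the standard library: the strings "Aa" and "BB" have the same hash code under the documented String.hashCode formula, yet a HashSet keeps both because equals tells them apart. The class name CollisionDemo and its helper method are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Hypothetical helper: adds both strings to a HashSet and returns its size.
    static int distinctCount(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());            // 2112 (65 * 31 + 97)
        System.out.println("BB".hashCode());            // 2112 (66 * 31 + 66)
        System.out.println("Aa".equals("BB"));          // false
        System.out.println(distinctCount("Aa", "BB"));  // 2: equals resolves the collision
        System.out.println(distinctCount("Aa", "Aa"));  // 1: a true duplicate is rejected
    }
}
```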

Example: different attributes but same hash code

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class Person {
    private String name;
    private int id;

    Person(String name, int id) {
        this.name = name;
        this.id = id;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setId(int id) {
        this.id = id;
    }

    public int getId() {
        return id;
    }

    @Override
    public int hashCode() {
        // Adds the String hash code to the id (an int serves as its own hash code).
        // This produces collisions, and in practice the collision rate is high.
        return name.hashCode() + id;
    }

    @Override
    public boolean equals(Object obj) {
        // instanceof tests whether obj is an instance of Person and yields a boolean.
        if (obj instanceof Person) {
            Person p = (Person) obj;
            return name.equals(p.name) && id == p.id;
        }
        return false; // an object of another type (or null) is never equal to a Person
    }
}

public class TestHashSet2 {
    public static void main(String[] args) {
        Person p1 = new Person("a", 1); // hash code: 97 + 1 = 98
        Person p2 = new Person("b", 0); // hash code: 98 + 0 = 98

        Set<Person> set = new HashSet<Person>();
        set.add(p1);
        set.add(p2);

        // Both names are printed: the hash codes collide, but equals returns false.
        Iterator<Person> it = set.iterator();
        while (it.hasNext()) {
            System.out.println(it.next().getName());
        }
    }
}
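
The hashCode comment in the example above points out that adding the two field hashes collides often. One common convention, shown here only as a sketch and not as the author's method, is to multiply the first field's hash by a small odd prime such as 31 before adding the next field. The class and method names below are hypothetical. Collisions remain possible with any scheme, so equals is still required.

```java
class HashCombineDemo {
    // The additive scheme used by Person.hashCode in the example.
    static int additiveHash(String name, int id) {
        return name.hashCode() + id;
    }

    // A common alternative: weight the first field by 31 before adding the second.
    static int weightedHash(String name, int id) {
        return 31 * name.hashCode() + id;
    }

    public static void main(String[] args) {
        // ("a", 1) and ("b", 0) collide under the additive scheme: 98 and 98.
        System.out.println(additiveHash("a", 1) == additiveHash("b", 0)); // true
        // The weighted scheme separates this pair: 3008 and 3038.
        System.out.println(weightedHash("a", 1) == weightedHash("b", 0)); // false
        // It still collides for other inputs, for example ("a", 31) and ("b", 0).
        System.out.println(weightedHash("a", 31) == weightedHash("b", 0)); // true
    }
}
```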
  • General idea: if the hash codes are different, the objects must be different. If the hash codes are the same, the equals() method decides whether they are equal.

  • The same issue applies to the elements of a HashSet and to the keys of a HashMap or Hashtable.
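
A hedged sketch of the same effect with HashMap keys, using two hypothetical key classes (RawKey without overrides, ValueKey with them). A lookup with a new but equal-looking key only succeeds when both methods are overridden:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class that overrides neither equals nor hashCode.
class RawKey {
    final String code;

    RawKey(String code) {
        this.code = code;
    }
}

// Hypothetical key class that overrides both methods based on its one field.
class ValueKey {
    final String code;

    ValueKey(String code) {
        this.code = code;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ValueKey && Objects.equals(code, ((ValueKey) obj).code);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(code);
    }
}

class MapKeyDemo {
    public static void main(String[] args) {
        Map<RawKey, String> raw = new HashMap<>();
        raw.put(new RawKey("x"), "found");
        // A different RawKey instance is a different key, so the lookup fails.
        System.out.println(raw.get(new RawKey("x"))); // null

        Map<ValueKey, String> byValue = new HashMap<>();
        byValue.put(new ValueKey("x"), "found");
        // An equal ValueKey has the same hash code and passes equals.
        System.out.println(byValue.get(new ValueKey("x"))); // found
    }
}
```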

Posted by eflopez on Wed, 10 Nov 2021 08:15:51 -0800