A Canonical equals() for Java

Keywords: Java Lambda

Original text: A Canonical equals() For Java

Even with the help of the Objects.equals() method introduced in Java 7, equals() methods are often written in a verbose and messy style. This article shows how to write a succinct equals() that is easy to inspect visually.

When you write a class, it automatically inherits from Object. If you do not override equals(), you get Object.equals(), which compares references (in effect, memory addresses). It returns true only when an object is compared with itself, which makes it the "most discriminating" form of equality.

// DefaultComparison.java

class DefaultComparison {
  private int i, j, k;
  public DefaultComparison(int i, int j, int k) {
    this.i = i;
    this.j = j;
    this.k = k;
  }
  public static void main(String[] args) {
    DefaultComparison
      a = new DefaultComparison(1, 2, 3),
      b = new DefaultComparison(1, 2, 3);
    System.out.println(a == a);
    System.out.println(a == b);
  }
}
/* Output:
true
false
*/

Usually you want to relax this restriction. Typically, two objects of the same type whose fields all hold the same values can be considered equivalent, although sometimes there are fields you don't want equals() to compare. Deciding this is part of the class design process.

A proper equals() method must satisfy five conditions:

  1. Reflexive: For any non-null x, x.equals(x) should return true.
  2. Symmetric: For any x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  3. Transitive: For any x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  4. Consistent: For any x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, as long as the information used in the comparison is not modified.
  5. For any non-null x, x.equals(null) should return false.
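
The five conditions can be spot-checked mechanically. The following is only a sketch: the Point class and the holds() helper are hypothetical names invented for illustration, and passing the check for one triple of objects does not prove that the contract holds in general.

```java
// ContractCheck.java
// Hypothetical sketch: Point and holds() are illustrative only.
import java.util.Objects;

class Point {
  private final int x, y;
  Point(int x, int y) { this.x = x; this.y = y; }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Point &&
      x == ((Point)rval).x &&
      y == ((Point)rval).y;
  }
  @Override
  public int hashCode() { return Objects.hash(x, y); }
}

class ContractCheck {
  // Spot-checks the five conditions for one particular triple
  static boolean holds(Object x, Object y, Object z) {
    boolean reflexive = x.equals(x);
    boolean symmetric = x.equals(y) == y.equals(x);
    // "If x equals y and y equals z, then x equals z"
    boolean transitive =
      !(x.equals(y) && y.equals(z)) || x.equals(z);
    // Two invocations with unchanged objects must agree
    boolean consistent = x.equals(y) == x.equals(y);
    boolean nonNull = !x.equals(null);
    return reflexive && symmetric && transitive &&
      consistent && nonNull;
  }
  public static void main(String[] args) {
    System.out.println(holds(
      new Point(1, 2), new Point(1, 2), new Point(1, 2)));
  }
}
/* Output:
true
*/
```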

The following tests satisfy those conditions and determine whether the object you are comparing against (called rval here) is equivalent to the current object:

  1. If rval is null, it is not equal.
  2. If rval is this (you are comparing an object with itself), the two are equal.
  3. If rval is not of the same class or a subclass, it is not equal.
  4. If all the above tests pass, decide which fields in rval are important (and consistent) and compare them.

Java 7 introduced the Objects class to help with this process, and we can use it to write a better equals() method.
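
As a quick illustration of what Objects.equals() contributes, here is a minimal sketch. The Label class is a hypothetical example, not one of the article's classes. It shows that a field that may be null can be compared without an explicit null check, and that primitive fields can go through the same call thanks to autoboxing.

```java
// NullSafeFields.java
// Hypothetical sketch: Label is illustrative only.
import java.util.Objects;

class Label {
  private final String text;  // May be null
  private final int count;
  Label(String text, int count) {
    this.text = text;
    this.count = count;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Label &&
      // Null-safe: true if both are null, false if only one is
      Objects.equals(text, ((Label)rval).text) &&
      // The int arguments are autoboxed to Integer
      Objects.equals(count, ((Label)rval).count);
  }
  @Override
  public int hashCode() { return Objects.hash(text, count); }
}

class NullSafeFields {
  public static void main(String[] args) {
    System.out.println(
      new Label(null, 1).equals(new Label(null, 1)));
    System.out.println(
      new Label(null, 1).equals(new Label("Monty", 1)));
    System.out.println(
      new Label("Monty", 1).equals(new Label("Monty", 1)));
  }
}
/* Output:
true
false
true
*/
```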

The following examples compare different versions of the Equality class. To prevent duplicated code, we build the test objects with the Factory Method pattern. The EqualityFactory interface simply provides a make() method that produces an Equality object, so different EqualityFactory implementations can produce different subtypes of Equality:

// EqualityFactory.java
import java.util.*;

interface EqualityFactory {
  Equality make(int i, String s, double d);
}

Now we define Equality, which contains three fields (all of which we consider important for comparison) and an equals() method that performs the four tests described above. The constructor announces its type name so we can verify that we are testing the right type:

// Equality.java
import java.util.*;

public class Equality {
  protected int i;
  protected String s;
  protected double d;
  public Equality(int i, String s, double d) {
    this.i = i;
    this.s = s;
    this.d = d;
    System.out.println("made 'Equality'");
  }
  @Override
  public boolean equals(Object rval) {
    if(rval == null)
      return false;
    if(rval == this)
      return true;
    if(!(rval instanceof Equality))
      return false;
    Equality other = (Equality)rval;
    if(!Objects.equals(i, other.i))
      return false;
    if(!Objects.equals(s, other.s))
      return false;
    if(!Objects.equals(d, other.d))
      return false;
    return true;
  }
  public void
  test(String descr, String expected, Object rval) {
    System.out.format("-- Testing %s --%n" +
      "%s instanceof Equality: %s%n" +
      "Expected %s, got %s%n",
      descr, descr, rval instanceof Equality,
      expected, equals(rval));
  }
  public static void testAll(EqualityFactory eqf) {
    Equality
      e = eqf.make(1, "Monty", 3.14),
      eq = eqf.make(1, "Monty", 3.14),
      neq = eqf.make(99, "Bob", 1.618);
    e.test("null", "false", null);
    e.test("same object", "true", e);
    // Integer.valueOf() replaces the deprecated Integer constructor
    e.test("different type", "false", Integer.valueOf(99));
    e.test("same values", "true", eq);
    e.test("different values", "false", neq);
  }
  public static void main(String[] args) {
    testAll( (i, s, d) -> new Equality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'Equality'
made 'Equality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

The testAll() method performs comparisons against every kind of object we can think of. It creates the Equality objects using the factory.

In main(), notice how simple the call to testAll() is. Because EqualityFactory has only a single method, a lambda expression can serve as the definition of make().
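
To make the role of the lambda concrete, the sketch below implements a single-method interface three ways. Greeter, GreeterFactory, and describe() are hypothetical names chosen for illustration; they stand in for Equality and EqualityFactory so that the example is self-contained.

```java
// LambdaFactory.java
// Hypothetical sketch: Greeter and GreeterFactory are illustrative only.

class Greeter {
  final String name;
  Greeter(String name) { this.name = name; }
}

interface GreeterFactory {
  Greeter make(String name);  // The single abstract method
}

class LambdaFactory {
  static String describe(GreeterFactory gf) {
    return "made " + gf.make("Monty").name;
  }
  public static void main(String[] args) {
    // Anonymous inner class: the pre-Java 8 way
    System.out.println(describe(new GreeterFactory() {
      @Override
      public Greeter make(String name) {
        return new Greeter(name);
      }
    }));
    // Lambda expression: the parameter type is inferred from make()
    System.out.println(describe(name -> new Greeter(name)));
    // Constructor reference: shorter still
    System.out.println(describe(Greeter::new));
  }
}
/* Output:
made Monty
made Monty
made Monty
*/
```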

The equals() method above is annoyingly verbose, but it turns out it can be simplified into a canonical form. Observe that:

  1. The instanceof type check eliminates the need for the null check.
  2. The comparison with this is redundant: a correctly written equals() works properly with self-comparison.
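
The first observation is easy to confirm. In this small sketch (isString() is a hypothetical helper), instanceof produces false for a null reference, so no separate null test is needed before it:

```java
// InstanceofNull.java
// Hypothetical sketch: isString() is illustrative only.

class InstanceofNull {
  static boolean isString(Object o) {
    // instanceof evaluates to false when o is null
    return o instanceof String;
  }
  public static void main(String[] args) {
    System.out.println(isString("Monty"));
    System.out.println(isString(null));
    System.out.println(isString(99));
  }
}
/* Output:
true
false
false
*/
```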

Because && is a short-circuiting operator, it stops and produces false the first time it encounters a failure. So, by chaining the checks together with &&, we can write equals() much more succinctly:

// SuccinctEquality.java
import java.util.*;

public class SuccinctEquality extends Equality {
  public SuccinctEquality(int i, String s, double d) {
    super(i, s, d);
    System.out.println("made 'SuccinctEquality'");
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof SuccinctEquality &&
      Objects.equals(i, ((SuccinctEquality)rval).i) &&
      Objects.equals(s, ((SuccinctEquality)rval).s) &&
      Objects.equals(d, ((SuccinctEquality)rval).d);
  }
  public static void main(String[] args) {
    Equality.testAll( (i, s, d) ->
      new SuccinctEquality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'SuccinctEquality'
made 'Equality'
made 'SuccinctEquality'
made 'Equality'
made 'SuccinctEquality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

For each SuccinctEquality, the base-class constructor is called before the derived-class constructor. The output shows that we still get the correct results. You can tell that short-circuiting happened for both the null test and the "different type" test; otherwise the comparisons further down the chain in equals() would throw an exception (a ClassCastException from the cast for a different type, or a NullPointerException from the field access for null).
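
The effect of short-circuiting can be isolated in a few lines. This sketch uses hypothetical safe() and unsafe() helpers: the first chains with &&, the second with the non-short-circuiting & operator, which always evaluates both operands and therefore attempts the cast even when the type test has already failed.

```java
// ShortCircuit.java
// Hypothetical sketch: safe() and unsafe() are illustrative only.

class ShortCircuit {
  // && skips the cast when the instanceof test fails
  static boolean safe(Object o) {
    return o instanceof String && ((String)o).isEmpty();
  }
  // & always evaluates both operands, so the cast is attempted
  static boolean unsafe(Object o) {
    return o instanceof String & ((String)o).isEmpty();
  }
  public static void main(String[] args) {
    Object notAString = Integer.valueOf(99);
    System.out.println(safe(notAString));
    try {
      unsafe(notAString);
    } catch(ClassCastException e) {
      System.out.println("ClassCastException");
    }
  }
}
/* Output:
false
ClassCastException
*/
```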

When you compose your class from other classes, Objects.equals() really shines:

// ComposedEquality.java
import java.util.*;

class Part {
  String ss;
  double dd;
  public Part(String ss, double dd) {
    this.ss = ss;
    this.dd = dd;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Part &&
      Objects.equals(ss, ((Part)rval).ss) &&
      Objects.equals(dd, ((Part)rval).dd);
  }
}

public class ComposedEquality extends SuccinctEquality {
  Part part;
  public ComposedEquality(int i, String s, double d) {
    super(i, s, d);
    part = new Part(s, d);
    System.out.println("made 'ComposedEquality'");
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof ComposedEquality &&
      super.equals(rval) &&
      Objects.equals(part, ((ComposedEquality)rval).part);
  }
  public static void main(String[] args) {
    Equality.testAll( (i, s, d) ->
      new ComposedEquality(i, s, d));
  }
}
/* Output:
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
made 'Equality'
made 'SuccinctEquality'
made 'ComposedEquality'
-- Testing null --
null instanceof Equality: false
Expected false, got false
-- Testing same object --
same object instanceof Equality: true
Expected true, got true
-- Testing different type --
different type instanceof Equality: false
Expected false, got false
-- Testing same values --
same values instanceof Equality: true
Expected true, got true
-- Testing different values --
different values instanceof Equality: true
Expected false, got false
*/

Note the call to super.equals() -- you don't have to reinvent the wheel (and you don't always have access to all the necessary components of the base class).

Comparison between subclasses

Inheritance suggests that objects of two different subtypes could be "equivalent" when upcast. Suppose you have a collection of Pet objects. This collection naturally accepts subtypes of Pet: in this example, Dog and Pig. Each Pet has a name and a size, as well as a unique internal id.

We define equals() and hashCode() in the canonical form using the Objects class, but only in the base class Pet, and we leave id out of both. From the standpoint of equals(), this means we only care that an object is a Pet, not which specific kind of Pet it is:

// SubtypeEquality.java
import java.util.*;

enum Size { SMALL, MEDIUM, LARGE }

class Pet {
  private static int counter = 0;
  private final int id = counter++;
  private final String name;
  private final Size size;
  public Pet(String name, Size size) {
    this.name = name;
    this.size = size;
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Pet &&
      // Objects.equals(id, ((Pet)rval).id) && // [1]
      Objects.equals(name, ((Pet)rval).name) &&
      Objects.equals(size, ((Pet)rval).size);
  }
  @Override
  public int hashCode() {
    return Objects.hash(name, size);
    // return Objects.hash(name, size, id);  // [2]
  }
  @Override
  public String toString() {
    return String.format("%s[%d]: %s %s %x",
      getClass().getSimpleName(), id,
      name, size, hashCode());
  }
}

class Dog extends Pet {
  public Dog(String name, Size size) {
    super(name, size);
  }
}

class Pig extends Pet {
  public Pig(String name, Size size) {
    super(name, size);
  }
}

public class SubtypeEquality {
  public static void main(String[] args) {
    Set<Pet> pets = new HashSet<>();
    pets.add(new Dog("Ralph", Size.MEDIUM));
    pets.add(new Pig("Ralph", Size.MEDIUM));
    pets.forEach(System.out::println);
  }
}
/* Output:
Dog[0]: Ralph MEDIUM a752aeee
*/

If we are only thinking about types, it does make sense, sometimes, to consider classes only from the standpoint of their base type, which is the foundation of the Liskov Substitution Principle. This code fits that principle well, because the derived types add no methods that are not in the base class. The derived types differ only in behavior, not in interface (which is certainly not the general case).

But when we supply two objects of different types with identical data and place them in a HashSet<Pet>, only one of them survives. This emphasizes that equals() is not a perfectly mathematical concept but (at least partly) a mechanical one. hashCode() and equals() must be defined consistently with each other for a type to work properly in a hashed data structure.

In the example above, the Dog and the Pig hash to the same bucket in the HashSet. At that point the HashSet falls back on equals() to tell the objects apart, but equals() declares the two objects equal. The HashSet does not add the Pig because it already contains an equal object.
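
The two-step lookup can be observed with a deliberately bad hash function. In this sketch, the SameHash class and the distinct() helper are hypothetical: every object reports the same hash code, so everything collides and only equals() decides whether a second object is stored.

```java
// BucketDemo.java
// Hypothetical sketch: SameHash and distinct() are illustrative only.
import java.util.*;

class SameHash {
  final String name;
  SameHash(String name) { this.name = name; }
  @Override
  public int hashCode() { return 42; }  // Always collides
  @Override
  public boolean equals(Object rval) {
    return rval instanceof SameHash &&
      Objects.equals(name, ((SameHash)rval).name);
  }
}

class BucketDemo {
  // Number of elements the HashSet actually keeps
  static int distinct(SameHash... items) {
    return new HashSet<>(Arrays.asList(items)).size();
  }
  public static void main(String[] args) {
    // Same hash, equals() is true: the second add is rejected
    System.out.println(distinct(
      new SameHash("Ralph"), new SameHash("Ralph")));
    // Same hash, equals() is false: both are stored
    System.out.println(distinct(
      new SameHash("Ralph"), new SameHash("Bob")));
  }
}
/* Output:
1
2
*/
```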

We can still make the example work by forcing uniqueness on otherwise-identical objects. Each Pet already has a unique id, so you can either uncomment the line marked [1] in equals() or switch hashCode() to the line marked [2]. In the canonical form you do both, so that every "invariant" field takes part in both operations ("invariant" meaning that equals() and hashCode() do not produce different values when an object is stored in a hashed data structure and when it is later retrieved). I put "invariant" in quotes because you must evaluate whether a field really never changes.
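
To see why the quotes around "invariant" matter, consider this minimal sketch. The Tag class and foundAfterChange() helper are hypothetical, and the field is intentionally left mutable. Once a field that feeds hashCode() changes after insertion, the HashSet searches using the new hash value and can no longer find the very object it holds.

```java
// MutableKey.java
// Hypothetical sketch: Tag and foundAfterChange() are illustrative only.
import java.util.*;

class Tag {
  String label;  // Not final: it can change after insertion
  Tag(String label) { this.label = label; }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Tag &&
      Objects.equals(label, ((Tag)rval).label);
  }
  @Override
  public int hashCode() { return Objects.hashCode(label); }
}

class MutableKey {
  // Stores a Tag, changes its field, then asks the set
  // whether it still contains that same object
  static boolean foundAfterChange(String before, String after) {
    Set<Tag> tags = new HashSet<>();
    Tag t = new Tag(before);
    tags.add(t);
    t.label = after;
    return tags.contains(t);
  }
  public static void main(String[] args) {
    System.out.println(foundAfterChange("a", "a"));
    System.out.println(foundAfterChange("a", "b"));
  }
}
/* Output:
true
false
*/
```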

Note: In hashCode(), if you only use one field, use Objects.hashCode(), and if you use multiple fields, use Objects.hash().
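
The two helpers are not interchangeable, which the following sketch tries to show. The one() and many() wrappers are hypothetical names. Objects.hashCode() returns the argument's own hash code (or 0 for null), while Objects.hash() combines its arguments as if they had been placed in an array and passed to Arrays.hashCode(), so even for a single field the two results differ.

```java
// HashHelpers.java
// Hypothetical sketch: one() and many() are illustrative only.
import java.util.*;

class HashHelpers {
  static int one(Object field) {
    // Null-safe: returns 0 for null, else field.hashCode()
    return Objects.hashCode(field);
  }
  static int many(Object... fields) {
    // Combines the fields the same way Arrays.hashCode() does
    return Objects.hash(fields);
  }
  public static void main(String[] args) {
    System.out.println(one(null));
    System.out.println(one("Ralph") == "Ralph".hashCode());
    System.out.println(many("Ralph", 3) ==
      Arrays.hashCode(new Object[]{ "Ralph", 3 }));
    // Not interchangeable, even for a single field:
    System.out.println(one("Ralph") == many("Ralph"));
  }
}
/* Output:
0
true
true
false
*/
```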

We can also solve the problem by defining equals() in the subclasses, following the canonical form (but still not including id):

// SubtypeEquality2.java
import java.util.*;

class Dog2 extends Pet {
  public Dog2(String name, Size size) {
    super(name, size);
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Dog2 &&
      super.equals(rval);
  }
}

class Pig2 extends Pet {
  public Pig2(String name, Size size) {
    super(name, size);
  }
  @Override
  public boolean equals(Object rval) {
    return rval instanceof Pig2 &&
      super.equals(rval);
  }
}

public class SubtypeEquality2 {
  public static void main(String[] args) {
    Set<Pet> pets = new HashSet<>();
    pets.add(new Dog2("Ralph", Size.MEDIUM));
    pets.add(new Pig2("Ralph", Size.MEDIUM));
    pets.forEach(System.out::println);
  }
}
/* Output:
Dog2[0]: Ralph MEDIUM a752aeee
Pig2[1]: Ralph MEDIUM a752aeee
*/

Note that the hashCode() values are identical, but because the two objects are no longer equal, both now appear in the HashSet. Also, super.equals() means we don't need access to the private fields of the base class.

One interpretation of this is that Java has separated substitutability from the definition of equals() and hashCode(). We can still place Dog and Pig objects into a Set<Pet> regardless of how equals() and hashCode() are defined, but the objects will not behave properly in hashed data structures unless those methods are defined with hashed structures in mind. Unfortunately, equals() is not used only in conjunction with hashCode(). This complicates things when you try to avoid defining it for certain classes, which is why the canonical form is worth following. However, matters are complicated further because there are times when you don't need to define either method.

Posted by compt on Tue, 26 Mar 2019 01:57:28 -0700