Meituan first-round interview: what is the relationship between hashCode and an object's memory address?

Keywords: Java

Source: juejin.cn/post/6971946031764209678

Let's start with the simplest possible print statement:

System.out.println(new Object());

This prints the fully qualified class name followed by a string of characters:

java.lang.Object@6659c656

What comes after the @ symbol? Is it the hashCode, the memory address of the object, or something else?

In fact, what follows the @ is simply the object's hashCode value, displayed in hexadecimal. Let's verify:

Object o = new Object();
int hashcode = o.hashCode();
// toString() output
System.out.println(o);
// hashCode in hexadecimal
System.out.println(Integer.toHexString(hashcode));
// hashCode in decimal
System.out.println(hashcode);
// identityHashCode also returns the object's hash code, but unlike Object.hashCode() it ignores any overriding hashCode() implementation
System.out.println(System.identityHashCode(o));

Output results:

java.lang.Object@6659c656
6659c656
1717159510
1717159510
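
The relationship between toString(), hashCode() and System.identityHashCode() can be checked with a small sketch. The class FixedHash and its constant 42 are made up for illustration; the string format used in describe() is the one documented for Object.toString().

```java
public class ToStringHashDemo {
    // Hypothetical class for illustration: overrides hashCode() but not toString().
    static class FixedHash {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    // Rebuilds the string that Object.toString() is documented to return.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object plain = new Object();
        FixedHash fixed = new FixedHash();

        // For a plain Object, toString() matches describe(), and both hash methods agree.
        System.out.println(plain + " / " + describe(plain));
        System.out.println(plain.hashCode() == System.identityHashCode(plain));

        // With an overridden hashCode(), toString() ends in 2a (42 in hex),
        // while identityHashCode still returns the JVM-generated value.
        System.out.println(fixed);
        System.out.println(System.identityHashCode(fixed));
    }
}
```

So the part after the @ follows whatever hashCode() returns, overridden or not; only identityHashCode is tied to the value the JVM generated.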

How is the hashcode of the object generated? Is it really a memory address?

This article is based on Java 8 HotSpot.

Generation logic of hashCode

Generating a hashCode in the JVM is not that simple. HotSpot provides several strategies, and each one produces different results.

Here is the core hashCode generation method in the OpenJDK source code:

static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations.  This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = intptr_t(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = intptr_t(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}

The source shows that the generation strategy is controlled by a global variable named hashCode, whose default value is 5. The variable is defined in another header file:

product(intx, hashCode, 5,                                            
         "(Unstable) select hashCode generation algorithm" )

The description in the source says it plainly: "(Unstable) select hashCode generation algorithm". Because it is declared as a product flag, it can be set through a JVM startup parameter. First confirm the default value:

java -XX:+PrintFlagsFinal -version | grep hashCode

intx hashCode                                  = 5                                   {product}
openjdk version "1.8.0_282"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_282-b08)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.282-b08, mixed mode)

We can therefore select a different hashCode generation algorithm with a JVM startup parameter and compare the results each algorithm produces:

-XX:hashCode=N

Now let's look at how each hashCode generation algorithm behaves.
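
To experiment with the flag, a small test program along these lines may help. It is only a sketch: it assumes a HotSpot JVM that exposes the flag through HotSpotDiagnosticMXBean, and it falls back to "unknown" otherwise. The class and method names are made up for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HashCodeFlagDemo {
    // Returns the value of the HotSpot "hashCode" flag, or "unknown" when the
    // running JVM does not expose it (assumption: this is not a HotSpot JVM,
    // or the flag is not visible through the diagnostic bean).
    static String hashCodeFlag() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("hashCode").getValue();
        } catch (RuntimeException | LinkageError e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("hashCode flag = " + hashCodeFlag());
        // Print a few identity hash codes in hex so runs with different flags can be compared.
        for (int i = 0; i < 3; i++) {
            System.out.println(Integer.toHexString(System.identityHashCode(new Object())));
        }
    }
}
```

On Java 8 it can be launched as, for example, java -XX:hashCode=3 HashCodeFlagDemo. Some newer JDK releases may treat the flag as experimental and require -XX:+UnlockExperimentalVMOptions as well.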

Algorithm 0

if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random();
  }

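This algorithm uses the Park-Miller random number generator. Note that the generator state is a single unguarded global: as the source comment points out, under high concurrency threads contend for it, which causes a lot of coherency traffic, and two threads can even race and obtain the same value.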

Algorithm 1

if (hashCode == 1) {
    // This variation has the property of being stable (idempotent)
    // between STW operations.  This can be useful in some of the 1-0
    // synchronization schemes.
    intptr_t addrBits = intptr_t(obj) >> 3 ;
    value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
}

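This algorithm really is derived from the object's memory address: it casts the object pointer to intptr_t, shifts it right by 3 bits, and XORs the result with a copy of itself shifted by a further 5 bits and with GVars.stwRandom. According to the source comment, the resulting value is stable between STW (stop-the-world) operations.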
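
The arithmetic can be followed in a minimal Java sketch. The address and the stwRandom value below are made-up inputs, and the 31-bit HASH_MASK is an assumption standing in for markOopDesc::hash_mask on a 64-bit JVM; real object addresses are not accessible from plain Java.

```java
public class AddressHashSketch {
    // Assumption: keeps the low 31 bits, standing in for markOopDesc::hash_mask.
    static final long HASH_MASK = 0x7FFFFFFFL;

    // Mirrors the arithmetic of the hashCode == 1 branch plus the final masking step.
    static long algorithm1(long address, long stwRandom) {
        long addrBits = address >> 3;                        // drop the low 3 bits
        long value = addrBits ^ (addrBits >> 5) ^ stwRandom; // scramble with a shifted copy and stwRandom
        value &= HASH_MASK;
        return value == 0 ? 0xBAD : value;                   // 0 is reserved for "no hash"
    }

    public static void main(String[] args) {
        long address = 0x76ab12340L; // hypothetical object address
        long stwRandom = 0x5eedL;    // hypothetical value of GVars.stwRandom
        System.out.println(Long.toHexString(algorithm1(address, stwRandom)));
        // A different stwRandom gives a different hash for the same address.
        System.out.println(Long.toHexString(algorithm1(address, stwRandom + 1)));
    }
}
```

With these inputs the result depends on both the address and stwRandom, so the hash is address-derived but not the address itself.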

Algorithm 2

if (hashCode == 2) {
    value = 1 ;            // for sensitivity testing
}

This one needs no explanation: it always returns 1 and is presumably meant for internal testing.

Interested readers can start the JVM with -XX:hashCode=2 and check whether every hashCode becomes 1.

Algorithm 3

if (hashCode == 3) {
    value = ++GVars.hcSequence ;
}

This algorithm is also very simple: an incrementing counter. All objects share the same counter for their hashCode. Let's try it:

System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());
System.out.println(new Object());

//output
java.lang.Object@144
java.lang.Object@145
java.lang.Object@146
java.lang.Object@147
java.lang.Object@148
java.lang.Object@149

Sure enough, the values increase by one each time. Interesting.

Algorithm 4

if (hashCode == 4) {
    value = intptr_t(obj) ;
}

This algorithm differs little from algorithm 1. Both are based on the object's address: algorithm 4 uses the address directly (before the final masking with markOopDesc::hash_mask), while algorithm 1 is a scrambled variant of it.
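
Even here the returned value is not necessarily the full address, because get_next_hash masks every result before returning it. The sketch below illustrates this with two made-up addresses; the 31-bit HASH_MASK is an assumption standing in for markOopDesc::hash_mask on a 64-bit JVM.

```java
public class MaskedAddressSketch {
    // Assumption: keeps the low 31 bits, standing in for markOopDesc::hash_mask.
    static final long HASH_MASK = 0x7FFFFFFFL;

    // Mirrors the hashCode == 4 branch followed by the common masking step.
    static long algorithm4(long address) {
        long value = address & HASH_MASK;
        return value == 0 ? 0xBAD : value; // 0 is reserved for "no hash"
    }

    public static void main(String[] args) {
        long first = 0x100001000L;  // hypothetical address
        long second = 0x200001000L; // hypothetical address that differs only in the high bits
        // Both print the same value, since the high bits are masked away.
        System.out.println(Long.toHexString(algorithm4(first)));
        System.out.println(Long.toHexString(algorithm4(second)));
    }
}
```

Under this assumption, two objects at different addresses can still end up with the same hash.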

Algorithm 5

Last comes the default algorithm. It is used whenever the hashCode flag is set to anything other than 0, 1, 2, 3 or 4:

else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

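This branch derives the hash from thread-local state using Marsaglia's xor-shift scheme. Notably, it never reads obj, so with the default algorithm the hashCode has nothing to do with the object's memory address. Because the state is per-thread, it avoids the contention on shared globals seen in the random (0) and incrementing (3) algorithms, though duplicate values may be somewhat more likely. But does a duplicate hashCode really matter?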
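
A rough Java port of the xor-shift step may make it easier to follow. This is a sketch, not HotSpot's code: the seed constants and the 31-bit mask are assumptions for illustration, and Java's >>> stands in for the unsigned right shift of the C++ original.

```java
import java.util.concurrent.ThreadLocalRandom;

public class XorShiftHashSketch {
    // Generator state, playing the role of _hashStateX.._hashStateW.
    private int x, y, z, w;

    XorShiftHashSketch(int seedX) {
        // Seed values are illustrative assumptions, not taken from the article.
        x = seedX;
        y = 842502087;
        z = 0x8767;
        w = 273326509;
    }

    // Produces the next hash. Note that no object (and so no address) is passed in.
    int next() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int value = v & 0x7FFFFFFF;        // assumed stand-in for markOopDesc::hash_mask
        return value == 0 ? 0xBAD : value; // 0 is reserved for "no hash"
    }

    // One generator per thread, so threads never share or contend on state.
    private static final ThreadLocal<XorShiftHashSketch> PER_THREAD =
            ThreadLocal.withInitial(
                    () -> new XorShiftHashSketch(ThreadLocalRandom.current().nextInt()));

    static int nextHashForCurrentThread() {
        return PER_THREAD.get().next();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(Integer.toHexString(nextHashForCurrentThread()));
        }
    }
}
```

The values depend only on the thread's generator state and on how many hashes that thread has already produced.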

The JVM does not guarantee that hash codes are unique. HashMap, for example, resolves hash collisions with separate chaining.
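
A quick way to see that duplicate hash codes are harmless is to put two keys with the same hashCode into a HashMap. The strings "Aa" and "BB" are a well-known colliding pair; the class name below is made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // Builds a map whose two keys share the same hashCode.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        // The two keys collide on hashCode...
        System.out.println("Aa".hashCode() == "BB".hashCode());
        // ...but equals() still tells them apart, so both entries survive.
        System.out.println(map.size());
        System.out.println(map.get("Aa") + " " + map.get("BB"));
    }
}
```

The map falls back on equals() to distinguish keys that land in the same bucket, which is why hashCode only needs to be well distributed, not unique.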

Posted by sac0o01 on Fri, 22 Oct 2021 17:04:34 -0700