Talk about automatic boxing and unboxing in Java

Keywords: Java

Introduction

The other day, a friend asked me what the following code would do.

public class Main {
    public static void main(String[] args) {
        Integer i1 = null;
        System.out.println(i1 == 1);
    }
}

I answered that it would throw an NPE (NullPointerException). Running it does throw a NullPointerException, but at the time I could not explain the root cause. Over the past few days I read up on the topic and debugged some code of my own, and this article records my understanding of auto-boxing and unboxing in Java.

Basic data types

There are eight basic data types (primitive types) in Java: four integer types, two floating-point types, the character type char for Unicode code units, and the boolean type for truth values. The integer and floating-point types are as follows:
Integer Type

type     Storage requirements   Range of values
int      4 bytes                -2 147 483 648 ~ 2 147 483 647
short    2 bytes                -32 768 ~ 32 767
long     8 bytes                -9 223 372 036 854 775 808 ~ 9 223 372 036 854 775 807
byte     1 byte                 -128 ~ 127

Floating Point Type

type     Storage requirements   Range of values
float    4 bytes                Approximately ±3.402 823 47E+38F
double   8 bytes                Approximately ±1.797 693 134 862 315 70E+308

Automatic boxing and unboxing

We often need to convert a basic type value such as an int into an object, as follows:

import java.util.ArrayList;

public class Main {
    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>();
        list.add(3);
    }
}

Here the add method of the ArrayList<Integer> object expects an Integer object. Why can we pass in an int value instead? This is where automatic boxing and unboxing in Java come in.
In Java, every basic data type has a corresponding class. For example, Integer corresponds to the basic type int. These classes are usually called wrappers. The wrapper classes have self-explanatory names: Integer, Long, Float, Double, Short, Byte, Character, Void, and Boolean.
In the code above, the call list.add(3) is automatically translated to list.add(Integer.valueOf(3)). This is called automatic boxing (auto-boxing): a basic data type value is automatically wrapped in its corresponding wrapper type.
Conversely, when an Integer object is assigned to an int variable, for example:

import java.util.ArrayList;

public class Main {
    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>();
        list.add(3);
        int index = 1;
        int i = list.get(index);
    }
}

It is automatically unboxed; that is, the compiler translates int i = list.get(index) into int i = list.get(index).intValue().
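To make the translation concrete, here is a minimal sketch that places the implicit form next to a hand-written explicit form. The class name ExplicitBoxingDemo and its two helper methods are hypothetical names chosen for illustration; the explicit version only approximates what the compiler generates.

```java
import java.util.ArrayList;
import java.util.List;

class ExplicitBoxingDemo {
    // What we normally write: relies on auto-boxing and auto-unboxing.
    static int implicitRoundTrip(int value) {
        List<Integer> list = new ArrayList<>();
        list.add(value);          // auto-boxing: int -> Integer
        int result = list.get(0); // auto-unboxing: Integer -> int
        return result;
    }

    // Roughly what the compiler produces: explicit valueOf and intValue calls.
    static int explicitRoundTrip(int value) {
        List<Integer> list = new ArrayList<>();
        list.add(Integer.valueOf(value));
        int result = list.get(0).intValue();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(implicitRoundTrip(3)); // prints 3
        System.out.println(explicitRoundTrip(3)); // prints 3
    }
}
```

Both methods behave the same; the only difference is whether the conversion calls are written by hand or inserted by the compiler.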
Automatic boxing and unboxing also apply in arithmetic expressions. For example, you can apply the increment operator to a wrapper reference:

public class Main {
    public static void main(String[] args) {
        Integer i = 1;
        i++;
    }
}

The compiler automatically inserts instructions to unbox the object, increment the resulting value, and box the result again.
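The following sketch spells out those three steps by hand. The class IncrementDemo and the method incrementExplicitly are hypothetical names used only for illustration. It also shows a consequence worth knowing: because Integer objects are immutable, i++ makes i refer to a different object and leaves the original one unchanged.

```java
class IncrementDemo {
    // Hand-written equivalent of "i++" on an Integer reference:
    // unbox, add one, then box the result again.
    static Integer incrementExplicitly(Integer i) {
        return Integer.valueOf(i.intValue() + 1);
    }

    public static void main(String[] args) {
        Integer i = 1;
        Integer before = i; // second reference to the same Integer object
        i++;                // i now refers to an Integer holding 2

        System.out.println(i);      // prints 2
        System.out.println(before); // prints 1: the original object was not modified
        System.out.println(incrementExplicitly(before)); // prints 2
    }
}
```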

About the '==' operator

The '==' operator can be applied to two wrapper type objects, to two basic data type values, or to one of each, as follows:

public class Main {
    public static void main(String[] args) {
        Integer i1 = 27;
        Integer i2 = 27;
        // A comparison between wrapper types comparing the addresses of object reference i1 and object reference i2
        System.out.println("i1 == i2" + ", " + (i1 == i2));

        int i3 = 27;
        int i4 = 27;
        // A comparison between basic data types (numerical types) comparing the values stored in i3 and i4
        System.out.println("i3 == i4" + ", " + (i3 == i4));
        
        // Compare the numeric type with the corresponding wrapper type. The wrapper type object is automatically unpacked as the basic data type before comparing.
        // So it's essentially a comparison between two basic data types
        System.out.println("i1 == i3" + ", " + (i1 == i3));

        Integer i5 = 200;
        Integer i6 = 200;
        int i7 = 200;
        System.out.println("i5 == i6" + ", " + (i5 == i6));
        System.out.println("i5 == i7" + ", " + (i5 == i7));
    }
}

As the comments in the code explain, when both operands of '==' are wrapper types, it compares whether the two references point to the same object. When both operands are basic data types, it compares whether the two values are equal. When one operand is a wrapper type and the other is a basic data type, the wrapper object is automatically unboxed first, so the comparison is effectively between two basic data type values.
With the default cache settings, the above code prints:

i1 == i2, true
i3 == i4, true
i1 == i3, true
i5 == i6, false
i5 == i7, true

Why do the comparisons between the wrapper objects i1 and i2, and between i5 and i6, give opposite results? As mentioned above, when both operands are wrapper types, the comparison checks whether the two references point to the same object. So i1 and i2 refer to the same Integer object, while i5 and i6 refer to different Integer objects. Why? Let's check the source code.
As described in the auto-boxing section, executing Integer i1 = 27 is equivalent to executing Integer i1 = Integer.valueOf(27). The code of Integer.valueOf(int i) is as follows:

public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}

As you can see, when i lies between IntegerCache.low and IntegerCache.high (inclusive), the method returns the object stored in the IntegerCache.cache array. The values of IntegerCache.low and IntegerCache.high are defined as follows:

    private static class IntegerCache {
        // low has a value of -128
        static final int low = -128;
        // The value of high can be configured with the "java.lang.Integer.IntegerCache.high" property
        // Default 127
        static final int high;
        static final Integer cache[];

        static {
            // high value may be configured by property
            int h = 127;
            String integerCacheHighPropValue =
                sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
            if (integerCacheHighPropValue != null) {
                try {
                    int i = parseInt(integerCacheHighPropValue);
                    i = Math.max(i, 127);
                    // Maximum array size is Integer.MAX_VALUE
                    h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
                } catch( NumberFormatException nfe) {
                    // If the property cannot be parsed into an int, ignore it.
                }
            }
            high = h;

            cache = new Integer[(high - low) + 1];
            int j = low;
            for(int k = 0; k < cache.length; k++)
                cache[k] = new Integer(j++);

            // range [-128, 127] must be interned (JLS7 5.1.7)
            assert IntegerCache.high >= 127;
        }

        private IntegerCache() {}
    }

You can see that the default values of IntegerCache.low and IntegerCache.high are -128 and 127, respectively. IntegerCache.high can be raised through JVM configuration (the java.lang.Integer.IntegerCache.high property), although I have never changed it. Therefore, when an int value in the range [-128, 127] is auto-boxed (that is, passed to Integer.valueOf(int i)), the fixed object stored in the cache is returned. I think the Java designers added this logic to avoid the overhead of frequently creating objects for commonly used values.
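The cache boundary can be observed directly with a small sketch. The class IntegerCacheDemo and the helper boxesToSameObject are hypothetical names for illustration. The results for -128 and 127 are guaranteed; the result for 128 assumes the default cache size, since a JVM configured with a larger cache would return true there as well.

```java
class IntegerCacheDemo {
    // True when two separate boxing calls for the same value return the same object.
    static boolean boxesToSameObject(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b; // reference comparison on purpose
    }

    public static void main(String[] args) {
        System.out.println(boxesToSameObject(-128)); // true: lower end of the cached range
        System.out.println(boxesToSameObject(127));  // true: upper end of the default cached range
        System.out.println(boxesToSameObject(128));  // false with the default cache settings
    }
}
```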

Note: The auto-boxing specification requires that boolean, byte, char ≤ 127, and short and int values between -128 and 127 be wrapped in fixed objects.

So i1 == i2 is true because i1 and i2 both refer to the same object in the cache.

And i5 == i6 is false because 200 lies outside the cached range, so i5 and i6 refer to two different objects.
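Since '==' on two wrapper objects compares references, a comparison by value needs either equals() or an explicit unboxing. The sketch below is not part of the original discussion; the class WrapperEqualsDemo and its helper methods are hypothetical names used for illustration.

```java
class WrapperEqualsDemo {
    // Compares the wrapped values, regardless of whether the objects are cached.
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    // Alternative: unbox both sides explicitly and compare the int values.
    static boolean sameValueUnboxed(Integer a, Integer b) {
        return a.intValue() == b.intValue();
    }

    public static void main(String[] args) {
        Integer i5 = 200;
        Integer i6 = 200;
        System.out.println(sameValue(i5, i6));        // prints true
        System.out.println(sameValueUnboxed(i5, i6)); // prints true
    }
}
```

Both helpers assume non-null arguments; passing null would throw a NullPointerException.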

About the initial NPE

Remember the code snippet my friend asked me about at the beginning?

public class Main {
    public static void main(String[] args) {
        Integer i1 = null;
        System.out.println(i1 == 1);
    }
}

Why is a NullPointerException thrown here? When one operand of '==' is a wrapper type and the other is a basic data type, the wrapper object is automatically unboxed before the comparison. So i1 == 1 is equivalent to i1.intValue() == 1, and calling intValue() on a null reference throws a NullPointerException.
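One way to avoid this exception is to check for null before the comparison, or to use java.util.Objects.equals, which accepts null on either side. The sketch below is an illustration rather than the only approach; the class NullSafeCompareDemo and its method names are hypothetical.

```java
import java.util.Objects;

class NullSafeCompareDemo {
    // Returns false for null instead of throwing during unboxing.
    static boolean isOne(Integer value) {
        return value != null && value == 1;
    }

    // Objects.equals handles null; the literal 1 is auto-boxed to an Integer.
    static boolean isOneViaObjects(Integer value) {
        return Objects.equals(value, 1);
    }

    // Reproduces the original failure and reports whether unboxing null threw.
    static boolean unboxingNullThrows() {
        Integer i1 = null;
        try {
            boolean ignored = (i1 == 1); // compiled as i1.intValue() == 1
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOne(null));           // prints false
        System.out.println(isOneViaObjects(null)); // prints false
        System.out.println(isOne(1));              // prints true
        System.out.println(unboxingNullThrows());  // prints true
    }
}
```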
In the previous sections I stressed that it is the compiler that does this work: auto-boxing and auto-unboxing happen at compile time, when the compiler inserts calls to boxing and unboxing methods such as Integer.valueOf(int i) and intValue() for us.
With the earlier note about fixed objects in mind, can you predict the output of the following code?

public class Main {
    public static void main(String[] args) {
        Byte bt1 = 127;
        Byte bt2 = null;
        byte bt3 = 127;
        // Different Object References
        System.out.println("bt1 == bt2" + ", " + (bt1 == bt2));
        // Compare after auto-unboxing
        System.out.println("bt1 == bt3" + ", " + (bt1 == bt3));


        Boolean b1 = true;
        Boolean b2 = null;
        boolean b3 = true;
        // Different Object References
        System.out.println("b1 == b2" + ", " + (b1 == b2));
        // Compare after auto-unboxing
        System.out.println("b1 == b3" + ", " + (b1 == b3));


        Character ch1 = 'a';
        Character ch2 = 'a';
        char ch3 = 'a';
        Character ch4 = null;
        // char is less than 127 wrapped in the same object, same object reference
        System.out.println("ch1 == ch2" + ", " + (ch1 == ch2));
        // Compare after auto-unboxing
        System.out.println("ch1 == ch3" + ", " + (ch1 == ch3));

        // Each comparison below unboxes null and throws NullPointerException (the program stops at the first one)
        System.out.println("bt2 == bt3" + ", " + (bt2 == bt3));
        System.out.println("b2 == b3" + ", " + (b2 == b3));
        System.out.println("ch3 == ch4" + ", " + (ch3 == ch4));

    }
}

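The first six comparisons print the results below, and then the bt2 == bt3 comparison throws a NullPointerException:

bt1 == bt2, false
bt1 == bt3, true
b1 == b2, false
b1 == b3, true
ch1 == ch2, true
ch1 == ch3, true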

Finally

String is not a basic data type. What does the following code print?

public class Main {
    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "a";
        String s3 = "abcd";
        String s4 = "abcd";
        String s5 = null;
        System.out.println("s1 == s2" + ", " + (s1 == s2));
        System.out.println("s3 == s4" + ", " + (s3 == s4));
        System.out.println("s1 == s5" + ", " + (s1 == s5));
    }
}

The execution results are:

s1 == s2, true
s3 == s4, true
s1 == s5, false

This is due to the string constant pool in Java. I will dissect it in another article when I have time.

Posted by munky334 on Sun, 05 Dec 2021 13:31:06 -0800