Analyzing Java bytecode (assembly) instructions with the javap command

Keywords: Java Eclipse architecture

1, Brief introduction to the javap command

javap is the disassembler that ships with the JDK. Given a class bytecode file, it prints the code area (assembly instructions), local variable table, exception table, line number table (the mapping between source lines and bytecode offsets), constant pool and other information for that class.
Some of this information is present only if the class file was compiled with the right javac options. By default javac records only line number and source file information, so a plain javac xx.java does not generate the local variable table (and with it the names of local variables and method parameters). If you use javac -g xx.java, all debugging information is generated. If you use Eclipse, it generates the local variable table and the line number table by default when compiling.
The disassembled code gives a deeper understanding of how Java code actually works. For example, for the statement i++; we can see that at run time the value of variable i is read first, then 1 is added to it, and finally the result is assigned back to i.
Through the local variable table we can see the scope, slot position and even slot reuse of local variables.
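
As a rough illustration of the increment described above, the sketch below writes out by hand the three steps that count++ on a field expands to. The class IncrementSketch and its method names are hypothetical and not part of the original examples. Note that for a local int variable javac typically emits a single iinc instruction instead, so the three-step form is most visible on fields.

```java
// Hypothetical illustration class; the names are not from the original article.
final class IncrementSketch {
    private int count = 0;

    // Shorthand form: on a field this compiles to a getfield / iadd / putfield sequence.
    void incrementShort() {
        count++;
    }

    // The same work written as three explicit steps.
    void incrementExplicit() {
        int tmp = this.count; // 1. read the current value
        tmp = tmp + 1;        // 2. add 1 to it
        this.count = tmp;     // 3. write the result back
    }

    int count() {
        return count;
    }

    public static void main(String[] args) {
        IncrementSketch s = new IncrementSketch();
        s.incrementShort();
        s.incrementExplicit();
        System.out.println(s.count()); // prints 2
    }
}
```

Both methods leave the field in the same state; only the spelling differs.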

Usage format of javap:
javap <options> <classes>
Here classes is the class (or classes) you want to disassemble.
Enter javap or javap -help on the command line to see the options of javap:

 -help  --help  -?        Print this usage message
 -version                 Version information (of the JDK that javap belongs to, not of the JDK that compiled the class)
 -v  -verbose             Print additional information (including line numbers, local variable table, disassembly and other details)
 -l                       Print line number and local variable tables
 -public                  Show only public classes and members
 -protected               Show protected/public classes and members
 -package                 Show package/protected/public classes and members (default)
 -p  -private             Show all classes and members
 -c                       Disassemble the code
 -s                       Print internal type signatures
 -sysinfo                 Show system information (path, size, date, MD5 hash) of the class being processed
 -constants               Show static final constants
 -classpath <path>        Specify where to find user class files
 -bootclasspath <path>    Override the location of bootstrap class files

The three options -v, -l and -c are the most commonly used.
javap -v classxx outputs not only the line numbers, local variable table and disassembled code, but also the constant pool and other information used by the class.
javap -l outputs the line number and local variable tables.
javap -c disassembles the class bytecode into assembly code.
To read the assembly code you need to know the JVM instructions. You can refer to the official documentation:
https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html
The same information can also be viewed with the jclasslib tool, which presents it graphically and is easier to read.
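
To check what the -g option changes, one possible experiment is to compile the same class with and without -g and compare what javap -l prints. The sketch below does this from Java using the standard javax.tools.JavaCompiler and java.util.spi.ToolProvider APIs (Java 9 or later, running on a full JDK). The sample class Hello, the helper class DebugInfoSketch and its method name are made up for this illustration.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.spi.ToolProvider;
import javax.tools.JavaCompiler;

// Hypothetical helper (not from the article): compiles a tiny class with or
// without -g and returns what "javap -l" prints for it.
final class DebugInfoSketch {

    // Made-up sample class used only for this experiment.
    private static final String SOURCE =
            "public class Hello {\n"
          + "    public int twice(int n) {\n"
          + "        int result = n * 2;\n"
          + "        return result;\n"
          + "    }\n"
          + "}\n";

    static String javapAfterCompile(boolean withDebugInfo) {
        try {
            // Temporary files are left behind; a real tool would clean them up.
            Path dir = Files.createTempDirectory("javap-sketch");
            Path src = dir.resolve("Hello.java");
            Files.write(src, SOURCE.getBytes(StandardCharsets.UTF_8));

            List<String> javacArgs = new ArrayList<>();
            if (withDebugInfo) {
                javacArgs.add("-g");
            }
            javacArgs.add("-d");
            javacArgs.add(dir.toString());
            javacArgs.add(src.toString());

            JavaCompiler javac = javax.tools.ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("no system Java compiler (JRE only?)");
            }
            int status = javac.run(null, null, null, javacArgs.toArray(new String[0]));
            if (status != 0) {
                throw new IllegalStateException("javac failed with status " + status);
            }

            ToolProvider javap = ToolProvider.findFirst("javap")
                    .orElseThrow(() -> new IllegalStateException("javap tool not found"));
            StringWriter buffer = new StringWriter();
            PrintWriter writer = new PrintWriter(buffer);
            // Same as running: javap -l -classpath <dir> Hello
            javap.run(writer, writer, "-l", "-classpath", dir.toString(), "Hello");
            writer.flush();
            return buffer.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("plain javac has LocalVariableTable: "
                + javapAfterCompile(false).contains("LocalVariableTable"));
        System.out.println("javac -g has LocalVariableTable:    "
                + javapAfterCompile(true).contains("LocalVariableTable"));
    }
}
```

With a standard javac, the LineNumberTable should appear in both outputs, while the LocalVariableTable should appear only for the -g build.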

2, javap test and explanation of the output

The contents of the javap output were described above. There is a lot of it, so here we focus on three parts: the code area (assembly instructions), the local variable table and the line number table.
If you need to analyze more information, you can use javap -v to view it.
To make the listings easier to follow, the instructions are not explained separately. Instead, the explanations are written as comments inside the disassembled code.

Let's write some code to test it.
Example 1: analyze the disassembly of the following code:

package com.justest.test;

import java.util.Date;

public class TestDate {
    
    private int count = 0;
    
    public static void main(String[] args) {
        TestDate testDate = new TestDate();
        testDate.test1();
    }
    
    public void test1(){
        Date date = new Date();
        String name1 = "wangerbei";
        test2(date,name1); 
        System.out.println(date+name1);
    }

    public void test2(Date dateP,String name2){
        dateP = null;
        name2 = "zhangsan";
    }

    public void test3(){
        count++;
    }
    
    public void  test4(){
        int a = 0;
        {
            int b = 0;
            b = a+1;
        }
        int c = a+1;
    }
}

The code above is compiled into a class file with javac -g, and the bytecode is then disassembled with the javap command:
$ javap -c -l TestDate
The output is as follows (the comments on the instructions are my own summary, written with reference to the official documentation):

Warning: Binary file TestDate contains com.justest.test.TestDate
Compiled from "TestDate.java"
public class com.justest.test.TestDate {
  //The default constructor. When it runs it performs initialization work, including the initialization and assignment of member variables
  public com.justest.test.TestDate();
    Code:
       0: aload_0 //Load the value at index 0 of the local variable table, that is, the this reference, and push it onto the operand stack
       1: invokespecial #10 // Method java/lang/Object."<init>":()V pops the this reference and calls the superclass constructor on it to initialize the object
       4: aload_0  // Offsets 4 to 6 execute this.count = 0, that is, assign 0 to count. Here the this reference is pushed onto the stack
       5: iconst_0 //Push the int constant 0 onto the operand stack
       6: putfield     //Pop the two values pushed above (the this reference and the constant 0) and assign 0 to the count field
       9: return
//The mapping between source lines and instruction offsets. In each row, the first number is the source line number and the second is the offset shown in front of the instruction in the Code section above
    LineNumberTable:
      line 5: 0
      line 7: 4
      line 5: 9
    //Local variable table. Start and Length give the range of bytecode offsets in which the variable is live (here this is live from offset 0 for a length of 10), Slot is the variable's slot in the local variable table (slots can be reused), Name is the variable name, and Signature describes the variable's type
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
         0      10     0  this   Lcom/justest/test/TestDate;
 
  public static void main(java.lang.String[]);
    Code:
// The new instruction creates an object of class com/justest/test/TestDate. new alone does not fully create the object; it is usable only after its initialization method has been called (that is, after the invokespecial instruction),
       0: new  //Create the object and push its reference onto the stack
       3: dup //Duplicate the value on top of the operand stack and push the copy. The stack now holds two copies of the reference
       4: invokespecial #20 // Pop one reference and call the constructor on it to complete the initialization of the object
       7: astore_1 //Pop the remaining reference and store it in the local variable testDate
       8: aload_1  //testDate.test1() is called on testDate, so aload_1 loads the value of testDate from the local variable table and pushes it onto the operand stack
       9: invokevirtual #21 // Method test1:()V pops the reference and calls the test1() method of testDate
      12: return //The whole main method ends and returns
    LineNumberTable:
      line 10: 0
      line 11: 8
      line 12: 12
    //In the local variable table, the life cycle of testDate starts only after it has been created and assigned
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
         0      13     0  args   [Ljava/lang/String;
         8       5     1 testDate   Lcom/justest/test/TestDate;
 
  public void test1();
    Code:
       0: new           #27 // Offsets 0 to 7 create a Date object and assign it to the date variable
       3: dup
       4: invokespecial #29                 // Method java/util/Date."<init>":()V
       7: astore_1
       8: ldc           #30 // String wangerbei: push the constant "wangerbei" onto the stack
      10: astore_2  //Pop "wangerbei" from the stack and store it in name1
      11: aload_0 //Offsets 11 to 14 correspond to test2(date,name1); the call is implicitly this.test2(...), so this is pushed first
      12: aload_1 //Load the date variable from the local variable table
      13: aload_2 //Load the name1 variable
      14: invokevirtual #32 // Method test2:(Ljava/util/Date;Ljava/lang/String;)V calls the test2 method
  // Offsets 17 to 38 correspond to System.out.println(date+name1);
      17: getstatic     #36                 // Field java/lang/System.out:Ljava/io/PrintStream;
  //Offsets 20 to 35 show how the compiler translates the + operator. When several strings are concatenated, no intermediate String object is created for each pair; a single StringBuilder is used to build the result
      20: new           #42                 // class java/lang/StringBuilder
      23: dup
      24: invokespecial #44                 // Method java/lang/StringBuilder."<init>":()V
      27: aload_1
      28: invokevirtual #45                 // Method java/lang/StringBuilder.append:(Ljava/lang/Object;)Ljava/lang/StringBuilder;
      31: aload_2
      32: invokevirtual #49                 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      35: invokevirtual #52                 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      38: invokevirtual #56 // Method java/io/PrintStream.println:(Ljava/lang/String;)V invokevirtual calls an instance method, selected according to the class of the object
      41: return
    LineNumberTable:
      line 15: 0
      line 16: 8
      line 17: 11
      line 18: 17
      line 19: 41
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      42     0  this   Lcom/justest/test/TestDate;
             8      34     1  date   Ljava/util/Date;
            11      31     2 name1   Ljava/lang/String;
 
  public void test2(java.util.Date, java.lang.String);
    Code:
       0: aconst_null //Push a null value onto the stack
       1: astore_1 //Assign null to dateP
       2: ldc           #66 // String zhangsan: load the string "zhangsan" from the constant pool and push it onto the stack
       4: astore_2 //Assign a string to name2
       5: return
    LineNumberTable:
      line 22: 0
      line 23: 2
      line 24: 5
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       6     0  this   Lcom/justest/test/TestDate;
             0       6     1 dateP   Ljava/util/Date;
             0       6     2 name2   Ljava/lang/String;
 
  public void test3();
    Code:
       0: aload_0 //Load this and push it onto the stack
       1: dup   //Duplicate the value on top of the operand stack and push the copy. The operand stack now holds two references to this
       2: getfield #12// Field count:I pops one this reference, reads its count field and pushes the value. The stack now holds this and the value of count
       5: iconst_1 //Push the int constant 1 onto the operand stack
       6: iadd  // Pop the count value and 1 from the stack, add them, and push the result
       7: putfield      #12 // Field count:I pops two values: first the result computed in the previous step, then this. The result is assigned to the count field of this
      10: return
    LineNumberTable:
      line 27: 0
      line 28: 10
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      11     0  this   Lcom/justest/test/TestDate;

  public void test4();
    Code:
       0: iconst_0
       1: istore_1
       2: iconst_0
       3: istore_2
       4: iload_1
       5: iconst_1
       6: iadd
       7: istore_2
       8: iload_1
       9: iconst_1
      10: iadd
      11: istore_2
      12: return
    LineNumberTable:
      line 33: 0
      line 35: 2
      line 36: 4
      line 38: 8
      line 39: 12
    //Below, b and c share the same slot. The scope of b is limited to the inner block; when the block ends, its slot in the local variable table is released and later variables can reuse it
    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      13     0  this   Lcom/justest/test/TestDate;
             2      11     1     a   I
             4       4     2     b   I
            12       1     2     c   I
}
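
The StringBuilder sequence at offsets 20 to 35 of test1() can be written out in source form, as in the sketch below. The class ConcatSketch and its method names are hypothetical. Note also that this listing reflects older compilers: javac in JDK 9 and later compiles string concatenation through an invokedynamic call by default, so javap output from a newer JDK may look different.

```java
import java.util.Date;

// Hypothetical illustration class; the names are not from the original article.
final class ConcatSketch {

    // What the source code says.
    static String withPlus(Date date, String name1) {
        return date + name1;
    }

    // What the bytecode at offsets 20 to 35 does, written out by hand.
    static String withBuilder(Date date, String name1) {
        return new StringBuilder()
                .append(date)   // append(Object): uses String.valueOf(date)
                .append(name1)  // append(String)
                .toString();
    }

    public static void main(String[] args) {
        Date date = new Date();
        String name1 = "wangerbei";
        System.out.println(withPlus(date, name1).equals(withBuilder(date, name1))); // prints true
    }
}
```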

Example 2: consider the following example.
First, there is a User class:

public class User {
    private String name;
    private int age;
 
    public String getName() {
        return name;
    }
 
    public void setName(String name) {
        this.name = name;
    }
 
    public int getAge() {
        return age;
    }
 
    public void setAge(int age) {
        this.age = age;
    }
}

Then write a test class that operates on the User object:

public class TestUser {
     
    private int count;
     
    public void test(int a){
        count = count + a;
    }
     
    public User initUser(int age,String name){
        User user = new User();
        user.setAge(age);
        user.setName(name);
        return user;
    }
     
    public void changeUser(User user,String newName){
        user.setName(newName);
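    }

    public static void main(String[] args) {
        TestUser tu = new TestUser();
        User user = tu.initUser(10,"wangerbei");
        tu.changeUser(user,"lisi");
    }
}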

First, compile the sources into class files with javac -g.
Then disassemble the TestUser class:
$ javap -c -l TestUser
The disassembly results are as follows:

Warning: Binary file TestUser contains com.justest.test.TestUser
Compiled from "TestUser.java"

public class com.justest.test.TestUser {

//default constructor 
  public com.justest.test.TestUser();

    Code:
       0: aload_0
       1: invokespecial #10                 // Method java/lang/Object."<init>":()V
       4: return

    LineNumberTable:
      line 3: 0

    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       5     0  this   Lcom/justest/test/TestUser;

  public void test(int);

    Code:
       0: aload_0 //Load the this reference and push it onto the operand stack
       1: dup //Duplicate the value on top of the stack and push the copy. The stack now holds two values, both references to this
       2: getfield      #18 // Pop one reference, read the count field through it, and push the value onto the stack
       5: iload_1 //Load the value of a from the local variable table and push it onto the stack
       6: iadd //Pop the count value and the value of a, add them, and push the result onto the stack
       7: putfield      #18 // After the previous step the stack holds two values: the result on top and the this reference below it. putfield pops both and assigns the result to the count field of the referenced object
      10: return //return void

    LineNumberTable:
      line 8: 0
      line 9: 10

    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      11     0  this   Lcom/justest/test/TestUser;
             0      11     1     a   I

  public com.justest.test.User initUser(int, java.lang.String);

    Code:
       0: new           #23 // class com/justest/test/User creates a User object and pushes its reference onto the stack
       3: dup //Duplicate the value on top of the stack and push the copy. The stack now holds two references to the User object
       4: invokespecial #25 // Method com/justest/test/User."<init>":()V calls the constructor to initialize the User object
       7: astore_3 //Pop the User reference from the stack and store it in the local variable user
       8: aload_3 //Load user from the local variable table, that is, the reference to the User object, and push it onto the stack
       9: iload_1 //Load the value of age from the local variable table and push it onto the stack. Note the difference between aload and iload: one loads an object reference, the other an int
      10: invokevirtual #26 // Method com/justest/test/User.setAge:(I)V pops two values from the operand stack, the User reference and the value of age, and calls setAge with that value. setAge modifies a field of the object on the heap
      13: aload_3 //Same as offset 8: push the User reference onto the stack
      14: aload_2 //Load name from the local variable table and push it onto the stack
      15: invokevirtual #29 // Method com/justest/test/User.setName:(Ljava/lang/String;)V pops two values from the operand stack, the User reference and the value of name, and calls setName with that value. setName modifies a field of the object on the heap
      18: aload_3 //Load the User reference from the local variable table and push it onto the stack
      19: areturn //areturn returns an object reference, here the User reference pushed in the previous step. objectref is popped from the operand stack of the current frame (§2.6, https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.6) and pushed onto the operand stack of the frame of the invoker

    LineNumberTable:
      line 12: 0
      line 13: 8
      line 14: 13
      line 15: 18

    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      20     0  this   Lcom/justest/test/TestUser;
             0      20     1   age   I
             0      20     2  name   Ljava/lang/String;
             8      12     3  user   Lcom/justest/test/User;

  public void changeUser(com.justest.test.User, java.lang.String);

    Code:
       0: aload_1 //Take the user from the local variable table, that is, the user object reference, and push it onto the stack
       1: aload_2 //Take newName from the local variable table and push it onto the stack
       2: invokevirtual #29 // Method com/justest/test/User.setName:(Ljava/lang/String;)V pops the newName value and the User reference, calls setName on that User, and passes newName to it
       5: return

    LineNumberTable:
      line 19: 0
      line 20: 5

    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0       6     0  this   Lcom/justest/test/TestUser;
             0       6     1  user   Lcom/justest/test/User;
             0       6     2 newName   Ljava/lang/String;

  public static void main(java.lang.String[]);

    Code:
       0: new      #1 // class com/justest/test/TestUser creates a TestUser object and pushes its reference onto the stack
       3: dup //Duplicate the reference and push the copy
       4: invokespecial #43 // Method "<init>":()V pops one reference and calls the constructor to initialize the object
       7: astore_1 //Pop the reference and store it in the local variable tu
       8: aload_1 //Load tu and push it onto the stack
       9: bipush    10 //Push the int value 10 onto the stack
      11: ldc           #44 // String wangerbei: load "wangerbei" from the constant pool and push it onto the stack
      13: invokevirtual #46    // Method initUser:(ILjava/lang/String;)Lcom/justest/test/User; pops three values (the tu reference, 10 and "wangerbei") and calls the initUser method of tu. The return value, that is, the User reference, is pushed onto the stack; see the areturn instruction in initUser above
      16: astore_2 //Pop the User reference and store it in the local variable user
      17: aload_1 //Load tu and push it onto the stack
      18: aload_2 //Load user and push it onto the stack
      19: ldc           #48 // String lisi: load "lisi" from the constant pool and push it onto the stack
      21: invokevirtual #50 // Method changeUser:(Lcom/justest/test/User;Ljava/lang/String;)V calls the changeUser method of tu, passing the user reference and "lisi"
      24: return //return void

    LineNumberTable:
      line 23: 0
      line 24: 8
      line 25: 17
      line 26: 24

    LocalVariableTable:
      Start  Length  Slot  Name   Signature
             0      25     0  args   [Ljava/lang/String;
             8      17     1    tu   Lcom/justest/test/TestUser;
            17       8     2  user   Lcom/justest/test/User;

}
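
The two listings contrast two ways of handling a reference parameter: test2() in Example 1 only overwrites its own local variable slots (astore_1, astore_2), while changeUser() follows the reference and modifies the object on the heap. A minimal sketch of the observable difference follows. The class ReferencePassingSketch, its nested User stand-in and the method names reassign and rename are hypothetical.

```java
// Hypothetical illustration class; the names are not from the original article.
final class ReferencePassingSketch {

    // Minimal stand-in for the article's User class.
    static final class User {
        private String name;

        String getName() {
            return name;
        }

        void setName(String name) {
            this.name = name;
        }
    }

    // Like test2(): only overwrites the local variable slot that holds the parameter.
    static void reassign(User user) {
        user = null;
    }

    // Like changeUser(): follows the reference and changes the object on the heap.
    static void rename(User user, String newName) {
        user.setName(newName);
    }

    public static void main(String[] args) {
        User user = new User();
        user.setName("wangerbei");

        reassign(user);
        System.out.println(user.getName()); // prints wangerbei

        rename(user, "lisi");
        System.out.println(user.getName()); // prints lisi
    }
}
```

The caller's variable is never affected by an assignment inside the callee, because the callee works on its own copy of the reference in its own frame.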

3, Summary

1. With the javap command you can view a Java class's disassembled code, constant pool, local variable table, line number table and so on.

2. Usually we care most about the sequence of instructions in the disassembly of each method. The instructions are executed in order, and the meaning of each one can be looked up in the official documentation, which is straightforward:

https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.areturn

3. From the instruction-by-instruction analysis of the two examples above, we can see that executing a method usually involves the following memory areas:

(1) Java stack: local variable table and operand stack. Operations here work on values.
(2) Java heap: accessed through object references.
(3) Constant pool.
(4) Other areas, such as the frame data area and the method area (per the JVM specification, the run-time constant pool is allocated from the method area), do not show up in these tests and are only mentioned here.

When performing value-related operations:
An instruction can obtain data from the local variable table, the constant pool, objects on the heap, method calls, system calls and so on. The data (which may be plain values or references to objects) is pushed onto the operand stack.
An instruction can also pop one or more values from the operand stack to perform assignment, arithmetic (addition, subtraction, multiplication, division), method argument passing, system calls and so on.
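
To make the push and pop behaviour concrete, the toy model below replays the instructions of test3() (count++) against a hand-made operand stack and local variable table. It is only a sketch of the idea and not how a real JVM is implemented. The class OperandStackSketch, the nested Counter class and the method name are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only, not how a real JVM is implemented.
final class OperandStackSketch {

    // Stand-in for an object on the heap with an int field named count.
    static final class Counter {
        int count;
    }

    // Replays the instructions of test3(), i.e. count++, step by step.
    static void runCountIncrement(Counter self) {
        Object[] locals = { self };               // local variable table, slot 0 = this
        Deque<Object> stack = new ArrayDeque<>(); // operand stack

        stack.push(locals[0]);                    // aload_0
        stack.push(stack.peek());                 // dup
        Counter ref = (Counter) stack.pop();      // getfield count: pop the reference...
        stack.push(ref.count);                    // ...and push the field value
        stack.push(1);                            // iconst_1
        int right = (Integer) stack.pop();        // iadd: pop two ints...
        int left = (Integer) stack.pop();
        stack.push(left + right);                 // ...and push their sum
        int value = (Integer) stack.pop();        // putfield count: pop the value...
        Counter target = (Counter) stack.pop();   // ...then the reference
        target.count = value;                     // write to the object on the heap
        // return: the operand stack is empty again at this point
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        runCountIncrement(c);
        System.out.println(c.count); // prints 1
    }
}
```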



 

Posted by nick_whitmarsh on Thu, 28 Oct 2021 22:29:38 -0700