Deep understanding of JVM class loading mechanism

Keywords: Programming Java Spring jvm JDK

Class loading process

A type's life cycle runs from the moment it is loaded into virtual machine memory until it is unloaded, and passes through seven stages: Loading, Verification, Preparation, Resolution, Initialization, Using, and Unloading. Verification, Preparation, and Resolution are collectively referred to as Linking.

The order of Loading, Verification, Preparation, Initialization, and Unloading is fixed, and these stages must begin in that order. Resolution is the exception: in some cases it can start after the Initialization phase.

Loading

In the loading phase, the Java virtual machine does three things:

  1. Obtains the binary byte stream that defines the class, using its fully qualified name.
  2. Converts the static storage structure represented by this byte stream into the runtime data structure of the method area.
  3. Generates a java.lang.Class object in memory that represents the class and serves as the access point to its data in the method area.

Note that "Loading" is only one stage of the overall "Class Loading" process. Do not confuse the two.

Verification

The purpose of verification is to ensure that the byte stream of the Class file satisfies all the constraints of the Java Virtual Machine Specification and that running it will not endanger the virtual machine itself. Broadly, verification consists of four checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

File format verification

Checks that the byte stream conforms to the Class file format specification and can be processed by the current version of the virtual machine, for example:

  • Whether it starts with the magic number 0xCAFEBABE
  • Whether the major and minor version numbers are within the range accepted by the current Java virtual machine
  • Whether the constant pool contains constants of an unsupported type (checked via the constant tag)
  • Whether any index that points to a constant refers to a nonexistent constant or to a constant of the wrong type
  • Whether any CONSTANT_Utf8_info constant contains data that is not valid UTF-8
  • Whether any part of the Class file, or the file itself, has had information removed or appended
  • ......
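
The first two checks in the list above can be sketched in a few lines. This is only an illustration, not how the virtual machine implements verification: the class name ClassHeaderDemo and the method readHeader are hypothetical, and the sketch assumes the class file of java.lang.Object can be read as a resource, which should hold on common JDKs.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderDemo {

    /** Returns {magic, minorVersion, majorVersion} read from the class file of a top-level class. */
    static long[] readHeader(Class<?> type) {
        // The resource name is resolved relative to the package of `type`.
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            long magic = data.readInt() & 0xFFFFFFFFL; // u4 magic
            int minor = data.readUnsignedShort();      // u2 minor_version
            int major = data.readUnsignedShort();      // u2 major_version
            return new long[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] header = readHeader(Object.class);
        System.out.printf("magic=0x%X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK that runs the example, so no fixed output is shown here.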

Metadata verification

Performs semantic analysis on the information described by the bytecode to ensure that it meets the requirements of the Java Language Specification, for example:

  • Whether the class has a parent class (every class except java.lang.Object must have one)
  • Whether the class extends a class that must not be extended (a class marked final)
  • If the class is not abstract, whether it implements all the methods that its parent class or interfaces require it to implement
  • Whether the fields and methods of the class conflict with the parent class (for example, overriding a final field of the parent class, or declaring an invalid overload such as a method with the same parameters but a different return type)
  • ......

Bytecode verification

Uses data flow analysis and control flow analysis to determine that the program semantics are legal and logical, for example:

  • Ensure that the data types on the operand stack always match the instruction sequence that operates on them. For example, it must never happen that an int is pushed onto the operand stack and is later stored into the local variable table as a long
  • Ensure that no jump instruction jumps to a bytecode instruction outside the method body
  • Ensure that type conversions in the method body are always valid. For example, assigning a child class object to a variable of the parent type is safe, but assigning a parent class object to a variable of the child type, or to an unrelated type, is illegal
  • ......

Symbolic reference verification

Symbolic reference verification takes place when the virtual machine converts symbolic references into direct references, that is, during the resolution phase. Its main purpose is to check whether the class is missing, or is denied access to, the external classes, methods, fields, and other resources it depends on, for example:

  • Whether the class named by the fully qualified name string in the symbolic reference can be found
  • Whether the specified class contains a method or field that matches the simple name and descriptor in the reference
  • Whether the accessibility (private, protected, public, <package>) of the class, field, or method in the symbolic reference allows access by the current class
  • ......

Preparation

The preparation stage allocates memory for the class variables (static variables, i.e. variables marked static) and sets their initial values, which are normally the zero values of their types. The exception is a class variable that is also final and is initialized with a compile-time constant: it is assigned its declared value directly in the preparation stage instead of the zero value, for example:

public static int a = 123;
public static final int B = 123;

The value of a in the preparation phase is 0, and the value of B is 123.

Here are the zero values of the basic data types:

| Type | Zero value |
|---|---|
| int | 0 |
| long | 0L |
| byte | (byte)0 |
| short | (short)0 |
| char | '\u0000' |
| float | 0.0f |
| double | 0.0d |
| boolean | false |
| reference | null |
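
A quick way to observe these zero values is to declare static fields without initializers, since nothing overwrites the values set in the preparation stage. The class name ZeroValueDemo below is hypothetical and the sketch is only meant as an illustration.

```java
class ZeroValueDemo {
    // No initializers: each field keeps the zero value assigned in the preparation stage.
    static int i;
    static long l;
    static byte b;
    static short s;
    static char c;
    static float f;
    static double d;
    static boolean flag;
    static Object ref;

    public static void main(String[] args) {
        System.out.println("int=" + i + " long=" + l + " byte=" + b + " short=" + s);
        System.out.println("char code=" + (int) c + " float=" + f + " double=" + d);
        System.out.println("boolean=" + flag + " reference=" + ref);
    }
}
```

The char field is printed as its numeric code because the zero character itself is not printable.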

Resolution

The resolution phase is the process in which the Java virtual machine replaces the symbolic references in the constant pool with direct references.

  • Symbolic reference: a set of symbols that describes the referenced target. The symbols can be any form of literal, as long as they identify the target unambiguously. The referenced target does not have to be loaded into virtual machine memory yet;
  • Direct reference: a pointer that points directly at the target, a relative offset, or a handle that locates the target indirectly. If a direct reference exists, the referenced target must already be present in virtual machine memory;

Initialization

The initialization stage is the process of executing the class constructor <clinit>(). This is the stage where the Java code written in the class actually starts to execute, for example assigning the declared values to class variables.

public static int a = 123;

After the initialization phase, the value of a is equal to 123.

  • The <clinit>() method is generated by the compiler, which collects the assignments to all class variables (static variables) and the statements in static {} blocks and merges them. The collection order is the order in which the statements appear in the source file;
  • The <clinit>() method is not mandatory. If the source file has no static block and no assignment to a static variable, the compiler does not need to generate a <clinit>() method;
  • In a multithreaded environment the virtual machine locks and synchronizes the <clinit>() method, so it is executed only once;
  • The <clinit>() method of the parent class is always executed before the <clinit>() method of the child class, so the <clinit>() method of java.lang.Object is the first one to be executed
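
Two of these rules, source order and parent before child, can be observed with a small experiment. The class names below are hypothetical. ForwardRead also shows that a static field still holds its preparation-stage zero value until its own assignment has run.

```java
class ClinitOrderDemo {

    static class Parent {
        static int a = 1;
        static {
            a = 2; // runs after the assignment above, in source order
        }
    }

    static class Child extends Parent {
        static int b = a; // Parent has already been initialized, so a is 2 here
    }

    static class ForwardRead {
        static int first = peekSecond(); // runs before `second` is assigned
        static int second = 123;

        static int peekSecond() {
            return second; // still the zero value at this point
        }
    }

    public static void main(String[] args) {
        System.out.println(Child.b);            // 2
        System.out.println(ForwardRead.first);  // 0
        System.out.println(ForwardRead.second); // 123
    }
}
```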

Class load time

Class initialization time

The Java Virtual Machine Specification does not dictate when a class must be loaded, but it strictly specifies when a class must be initialized. There are exactly six cases:

  1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered, for example:
    • Instantiating an object with the new keyword;
    • Reading or setting a static field of a type (except static final fields whose values were placed in the constant pool at compile time);
    • Calling a static method of a type;
  2. When a method of the java.lang.reflect package is used to make a reflective call on the type;
  3. When a class is initialized and its parent class has not been initialized yet, the initialization of the parent class is triggered first;
  4. When the virtual machine starts, the user specifies a main class to execute (the class that contains the main() method), and the virtual machine initializes this main class first;
  5. When the dynamic language support added in JDK 7 is used and a java.lang.invoke.MethodHandle resolves to a static field access, a static method call, or a constructor call on a type that has not been initialized;
  6. When an interface defines a default method (added in JDK 8, an interface method marked with the default keyword) and a class implementing that interface is initialized, the interface is initialized before the class.

These six cases are called active references. Every other way of referencing a type does not trigger initialization and is called a passive reference.
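
The difference between loading a class and initializing it can be made visible with Class.forName, whose three-argument form accepts an initialize flag. The sketch below uses hypothetical names (LoadVsInitDemo, Target) and records events in a list: the class is loaded first without running its static initializer, and a later static method call (invokestatic) then triggers initialization.

```java
import java.util.ArrayList;
import java.util.List;

class LoadVsInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Target {
        static {
            EVENTS.add("Target initialized");
        }

        static void touch() {
            // intentionally empty: calling it is what matters
        }
    }

    /** Runs the experiment once (a class is initialized only once) and returns the events. */
    static synchronized List<String> run() {
        if (EVENTS.isEmpty()) {
            try {
                String binaryName = LoadVsInitDemo.class.getName() + "$Target";
                // initialize=false: load the class but do not run its static initializer
                Class<?> loaded = Class.forName(binaryName, false, LoadVsInitDemo.class.getClassLoader());
                EVENTS.add("loaded " + loaded.getSimpleName());
                Target.touch(); // invokestatic is an active reference
                EVENTS.add("touch() returned");
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        // prints [loaded Target, Target initialized, touch() returned]
        System.out.println(run());
    }
}
```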

Passive references

Referencing a static field of a parent class through a child class does not initialize the child class:

/**
 * Passive reference demonstration 1:
 * Referencing a static field of a parent class through a child class does not initialize the child class
 **/
public class SuperClass {

    static {
        System.out.println("SuperClass init!");
    }

    public static int value = 123;
}

public class SubClass extends SuperClass {

    static {
        System.out.println("SubClass init!");
    }
}

/**
 * Entry point for passive reference demonstration 1
 **/
public class NotInitialization {

    public static void main(String[] args) {
        System.out.println(SubClass.value);
    }

}

Output:

SuperClass init!
123

Running the code prints only "SuperClass init!" and never "SubClass init!". For a static field, only the class that directly defines the field is initialized, so referencing a static field of the parent class through a child class triggers the initialization of the parent class but not of the child class.

Referencing a class through an array definition does not trigger the initialization of that class:

/**
 * Passive reference demonstration 2:
 * Referencing a class through an array definition does not trigger its initialization
 **/
public class NotInitialization2 {

    public static void main(String[] args) {
        SubClass[] subClasses = new SubClass[1];
    }
}

Nothing is output after running.

Constants are stored in the constant pool of the calling class at compile time. The calling class therefore holds no direct reference to the class that defines the constant, and using the constant does not trigger the initialization of the defining class:

/**
 * Passive reference demonstration 3:
 * Constants are stored in the constant pool of the calling class at compile time, so using them
 * does not trigger the initialization of the class that defines the constant
 **/
class ConstClass {

    static {
        System.out.println("ConstClass init!");
    }

    public static final String HELLOWORLD = "hello world";
}

/**
 * Entry point for passive reference demonstration 3
 **/
public class NotInitialization3 {

    public static void main(String[] args) {
        System.out.println(ConstClass.HELLOWORLD);
    }
}

Output:

hello world

Running javap -verbose on NotInitialization3 shows that hello world is already in the constant pool of the calling class:

PS E:\> javap -verbose NotInitialization3.class
...
Constant pool:
   #1 = Methodref          #7.#21         // java/lang/Object."<init>":()V
   #2 = Fieldref           #22.#23        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = Class              #24            // com/xiaolyuh/ConstClass
   #4 = String             #25            // hello world
  ...
  #25 = Utf8               hello world
 ...

Decompiling NotInitialization3.class shows that the reference to ConstClass.HELLOWORLD has been replaced by the literal at compile time:

package com.xiaolyuh;

public class NotInitialization3 {
   public static void main(String[] args) {
        System.out.println("hello world");
    }
}
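
Not every static final field behaves this way. Only compile-time constants are copied into the caller's constant pool, while a static final field whose value is computed at run time is still read with getstatic and therefore triggers initialization. The sketch below contrasts the two cases. The names ConstantFoldingDemo and Holder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ConstantFoldingDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Holder {
        static {
            EVENTS.add("Holder init");
        }

        static final String CONSTANT = "hello world";       // compile-time constant, inlined by javac
        static final String COMPUTED = String.valueOf(123); // assigned when Holder is initialized
    }

    /** Runs the experiment once (a class is initialized only once) and returns the events. */
    static synchronized List<String> run() {
        if (EVENTS.isEmpty()) {
            EVENTS.add("read CONSTANT = " + Holder.CONSTANT); // passive: Holder is not initialized
            EVENTS.add("before COMPUTED");
            String value = Holder.COMPUTED;                   // active: triggers Holder initialization
            EVENTS.add("read COMPUTED = " + value);
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        // prints [read CONSTANT = hello world, before COMPUTED, Holder init, read COMPUTED = 123]
        System.out.println(run());
    }
}
```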

Interface initialization time

The compiler still generates a <clinit>() class constructor for an interface, to initialize the member variables defined in the interface.

Unlike a class, an interface does not require all of its parent interfaces to be initialized when it is initialized. A parent interface is initialized only when it is actually used (for example, when a constant defined in it is referenced).
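
The same lazy behavior applies to the interfaces a class implements, with the default-method exception listed among the six initialization cases. In the sketch below (all names hypothetical), creating an Impl instance initializes WithDefault, which declares a default method, but leaves Plain untouched until its field is actually read. The field initializers call a method so that they are not compile-time constants.

```java
import java.util.ArrayList;
import java.util.List;

class InterfaceInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static int mark(String event) {
        EVENTS.add(event);
        return EVENTS.size();
    }

    interface Plain {
        int TOKEN = mark("Plain init"); // evaluated when Plain is initialized
    }

    interface WithDefault {
        int TOKEN = mark("WithDefault init");

        default void hello() {
        }
    }

    static class Impl implements Plain, WithDefault {
        static {
            mark("Impl init");
        }
    }

    /** Runs the experiment once and returns the recorded events. */
    static synchronized List<String> run() {
        if (EVENTS.isEmpty()) {
            new Impl();                      // initializes WithDefault, then Impl, but not Plain
            EVENTS.add("Impl created");
            int token = Plain.TOKEN;         // first real use of Plain
            EVENTS.add("Plain.TOKEN read");
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        // prints [WithDefault init, Impl init, Impl created, Plain init, Plain.TOKEN read]
        System.out.println(run());
    }
}
```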

Class loader

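For any class, its uniqueness within the Java virtual machine is established jointly by the class itself and the class loader that loaded it. Each class loader has its own independent class namespace, so two classes that come from the same Class file but were loaded by different class loaders are still different classes.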
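
To see this in practice, the same bytes can be defined by two different loaders. The sketch below avoids depending on any file on disk by assembling a minimal class file in memory (a public class with no members that extends java.lang.Object). The names LoaderIdentityDemo, DefiningLoader, and demo.generated.Hello are hypothetical, and the hand-written class file is an assumption made for illustration rather than something a real project would do.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LoaderIdentityDemo {

    /** Builds a minimal class file: a public class with the given internal name and no members. */
    static byte[] minimalClassFile(String internalName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                           // magic
            out.writeShort(0);                                  // minor_version
            out.writeShort(52);                                 // major_version (Java 8)
            out.writeShort(5);                                  // constant_pool_count = entries + 1
            out.writeByte(7); out.writeShort(2);                // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);       // #2 Utf8: this class name
            out.writeByte(7); out.writeShort(4);                // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass name
            out.writeShort(0x0021);                             // access_flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class = #1
            out.writeShort(3);                                  // super_class = #3
            out.writeShort(0);                                  // interfaces_count
            out.writeShort(0);                                  // fields_count
            out.writeShort(0);                                  // methods_count
            out.writeShort(0);                                  // attributes_count
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A loader that exposes defineClass so the same bytes can be defined by several loaders. */
    static class DefiningLoader extends ClassLoader {
        Class<?> define(String name, byte[] classFile) {
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    public static void main(String[] args) {
        byte[] classFile = minimalClassFile("demo/generated/Hello");
        Class<?> first = new DefiningLoader().define("demo.generated.Hello", classFile);
        Class<?> second = new DefiningLoader().define("demo.generated.Hello", classFile);
        System.out.println(first.getName().equals(second.getName())); // true: same qualified name
        System.out.println(first == second);                          // false: different classes
    }
}
```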

Class loader type

  • Bootstrap Class Loader: loads into virtual machine memory the class libraries that are stored in the <JAVA_HOME>\lib directory, or in the path specified by the -Xbootclasspath parameter, and that the Java virtual machine recognizes (libraries are recognized by file name, such as rt.jar and tools.jar; a library whose name does not match is not loaded even if it is placed in the lib directory);

  • Extension Class Loader: implemented in Java code by the class sun.misc.Launcher$ExtClassLoader. It loads all class libraries in the <JAVA_HOME>\lib\ext directory, or in the paths specified by the java.ext.dirs system property;

  • Application Class Loader: implemented by sun.misc.Launcher$AppClassLoader and also known as the "system class loader". It loads all class libraries on the user's ClassPath, and developers can use this class loader directly in code;
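
The loader chain of a running program can be inspected by following getParent() from the system class loader. The bootstrap class loader is not a Java object, so it shows up as null. The class name LoaderChainDemo is hypothetical. The loader class names printed depend on the JDK version: the sun.misc.Launcher classes named above are what a JDK 8 runtime reports, and later JDKs report different names.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    /** Walks from the given loader up to the bootstrap loader, which Java code sees as null. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes such as String are loaded by the bootstrap loader
        System.out.println(String.class.getClassLoader()); // null
    }
}
```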

Parents Delegation Model

The hierarchy formed by these class loaders is called the "parents delegation model". It requires every class loader except the top-level bootstrap class loader to have a parent class loader. The parent-child relationship between class loaders is usually implemented not by inheritance but by composition, so that a child loader reuses the code of its parent loader.

The parents delegation model works as follows: when a class loader receives a class loading request, it does not try to load the class itself first but delegates the request to its parent loader. Only when the parent loader reports that it cannot complete the request does the child loader try to load the class itself.

The biggest advantage of the parents delegation model is that Java classes acquire a priority hierarchy together with their class loaders, which ensures that a given class is loaded by the same loader no matter which loader receives the request.

The implementation of parent delegation model

// synchronized keeps concurrent loads of the same class thread-safe
protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    // First, check whether the requested class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) {
        try {
            if (parent != null) {
                c = parent.loadClass(name, false);
            } else {
                c = findBootstrapClassOrNull(name);
            }
        } catch (ClassNotFoundException e) {
            // The parent loader threw ClassNotFoundException,
            // which means it could not complete the load request
        }
        if (c == null) {
            // The parent loader failed,
            // so call this loader's own findClass method to load the class
            c = findClass(name);
        }
    }
    if (resolve) {
        resolveClass(c);
    }
    return c;
}

The method first checks whether the requested type has already been loaded. If not, it calls the loadClass() method of the parent loader; if the parent loader is null, the bootstrap class loader is used as the parent. If the parent loader fails and throws ClassNotFoundException, the loader calls its own findClass() method to try to load the class.
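
This flow can be observed from the outside with a loader that cannot load anything itself and merely records when its findClass() is reached. The names DelegationDemo and RecordingLoader are hypothetical, and no.such.Type is assumed not to exist on the class path.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {

    /** A loader that cannot load anything itself; it only records when findClass is reached. */
    static class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns true if the named class was loaded, false if every loader in the chain failed. */
    static boolean tryLoad(RecordingLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        System.out.println(tryLoad(loader, "java.lang.String")); // true: a parent loader handled it
        System.out.println(loader.findClassCalls);                // []: findClass was never reached
        System.out.println(tryLoad(loader, "no.such.Type"));      // false: no loader can find it
        System.out.println(loader.findClassCalls);                // [no.such.Type]: the fallback ran
    }
}
```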

Breaking the parents delegation model

The parents delegation model has been broken three times:

  1. Before JDK 1.2, custom class loaders were implemented by overriding only the loadClass() method, which bypassed the parents delegation model. JDK 1.2 introduced the findClass() method to solve this problem.
  2. The model cannot handle base types that need to call back into user code. For services such as JNDI, JDBC, JCE, JAXB, and JBI, the interfaces are defined in the base libraries but the implementations are supplied by individual vendors, so the base types must call vendor code. The Thread Context ClassLoader was introduced to solve this problem.
  3. The third break was caused by users' pursuit of dynamic behavior in programs. "Dynamic" here refers to some very popular terms: hot swapping of code, hot deployment of modules, and so on.
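
For the second case, the mechanism itself is small: every thread carries a context class loader that library code can read, and java.util.ServiceLoader.load(Class) uses it to look up providers. The sketch below only demonstrates swapping and restoring that loader. The class name ContextLoaderDemo and the empty anonymous loader are hypothetical stand-ins for a real vendor loader.

```java
class ContextLoaderDemo {

    /** Installs a custom context class loader for the duration of a call, then restores it. */
    static boolean swapAndRestore() {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        // Hypothetical stand-in for a loader that can see vendor implementation classes
        ClassLoader custom = new ClassLoader(original) { };
        thread.setContextClassLoader(custom);
        try {
            // Base-library code running here would find `custom` through the current thread
            return Thread.currentThread().getContextClassLoader() == custom;
        } finally {
            thread.setContextClassLoader(original); // always restore the previous loader
        }
    }

    public static void main(String[] args) {
        System.out.println(swapAndRestore()); // true
    }
}
```

Restoring the original loader in a finally block matters when threads are pooled, because the next task on the same thread would otherwise inherit the swapped loader.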

Custom class loader

A custom class loader must extend the ClassLoader class. To avoid breaking the parents delegation model, it is recommended to override the findClass() method rather than the loadClass() method. Below is a ClassLoader that I implemented to load encrypted class files, which prevents the core code from being decompiled.
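
Before the full implementation, here is a minimal sketch of the recommended shape: only findClass() is overridden, so the inherited loadClass() still delegates to the parent first. To stay self-contained it differs from the real code in two loudly stated ways: a single-byte XOR stands in for RSA, and the "encrypted" class file is a minimal one assembled in memory instead of being read from disk. All names (XorClassLoaderDemo, XorClassLoader, demo.generated.Secret) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class XorClassLoaderDemo {
    static final byte KEY = 0x5A; // hypothetical key; XOR is a stand-in, not real encryption

    /** Applying XOR twice with the same key restores the original bytes. */
    static byte[] xor(byte[] data) {
        byte[] result = data.clone();
        for (int i = 0; i < result.length; i++) {
            result[i] ^= KEY;
        }
        return result;
    }

    /** Builds a minimal class file: a public class with the given internal name and no members. */
    static byte[] minimalClassFile(String internalName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                           // magic
            out.writeShort(0);                                  // minor_version
            out.writeShort(52);                                 // major_version (Java 8)
            out.writeShort(5);                                  // constant_pool_count
            out.writeByte(7); out.writeShort(2);                // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);       // #2 Utf8: this class name
            out.writeByte(7); out.writeShort(4);                // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass name
            out.writeShort(0x0021);                             // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class
            out.writeShort(3);                                  // super_class
            for (int i = 0; i < 4; i++) {
                out.writeShort(0);                              // no interfaces, fields, methods, attributes
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overrides only findClass, so parent delegation in loadClass stays intact. */
    static class XorClassLoader extends ClassLoader {
        private final Map<String, byte[]> encoded;

        XorClassLoader(ClassLoader parent, Map<String, byte[]> encoded) {
            super(parent);
            this.encoded = encoded;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            byte[] data = encoded.get(name);
            if (data == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] classFile = xor(data); // decode before handing the bytes to the JVM
            return defineClass(name, classFile, 0, classFile.length);
        }
    }

    static Class<?> loadDemoClass() {
        String name = "demo.generated.Secret";
        Map<String, byte[]> store = new HashMap<>();
        store.put(name, xor(minimalClassFile(name.replace('.', '/'))));
        try {
            return new XorClassLoader(XorClassLoaderDemo.class.getClassLoader(), store).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> loaded = loadDemoClass();
        System.out.println(loaded.getName() + " loaded by " + loaded.getClassLoader().getClass().getSimpleName());
    }
}
```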

Encrypt Class file

Code to encrypt the class file:

/**
 * Encrypt Class file
 *
 * @author yuhao.wang3
 * @since 2020/1/20 10:39
 */
public class EncryptionClassFileTask extends RecursiveAction {

    Logger logger = LoggerFactory.getLogger(EncryptionClassFileTask.class);

    /**
     * Public key
     */
    String publicKey;

    /**
     * Directories requiring encryption
     */
    private File file;

    /**
     * Package name to be encrypted
     */
    private List<String> packages;

    /**
     * Class name to exclude
     */
    private List<String> excludeClass;


    /**
     * @param file      Directory where files need to be encrypted
     * @param packages  Package name to be encrypted
     * @param publicKey Public key
     */
    public EncryptionClassFileTask(File file, List<String> packages, String publicKey) {
        this(file, packages, null, publicKey);
    }

    /**
     * @param file         Directory where files need to be encrypted
     * @param packages     Package name to be encrypted
     * @param excludeClass Class name to exclude
     * @param publicKey    Public key
     */
    public EncryptionClassFileTask(File file, List<String> packages, List<String> excludeClass, String publicKey) {
        this.file = file;
        this.excludeClass = excludeClass;
        this.publicKey = publicKey;
        this.packages = new ArrayList<>();

        if (Objects.nonNull(packages)) {
            for (String packageName : packages) {
                this.packages.add(packageName.replace('.', File.separatorChar));
            }
        }

        if (Objects.isNull(excludeClass)) {
            this.excludeClass = new ArrayList<>();
        }

        this.excludeClass.add("RsaClassLoader");
    }

    @Override
    protected void compute() {
        if (Objects.isNull(file)) {
            return;
        }

        File[] files = file.listFiles();
        List<EncryptionClassFileTask> fileTasks = new ArrayList<>();
        if (Objects.nonNull(files)) {
            for (File f : files) {
                // Split task
                if (f.isDirectory()) {
                    fileTasks.add(new EncryptionClassFileTask(f, packages, excludeClass, publicKey));
                } else {
                    if (f.getAbsolutePath().endsWith(".class")) {
                        if (!CollectionUtils.isEmpty(excludeClass) && excludeClass.contains(f.getName().substring(0, f.getName().indexOf(".")))) {
                            continue;
                        }
                        // If packages is empty, encrypt every class file in the directory
                        if (CollectionUtils.isEmpty(packages)) {
                            encryptFile(f);
                            // continue (not return) so the remaining files and subdirectories are still processed
                            continue;
                        }
                        // Otherwise, encrypt only the class files under the specified packages
                        for (String packageName : packages) {
                            if (f.getPath().contains(packageName.replace('.', File.separatorChar))) {
                                encryptFile(f);
                                // break (not return) so the file is encrypted once and the outer loop goes on
                                break;
                            }
                        }
                    }
                }
            }
            // Run the subtasks; invokeAll returns only after all of them have completed,
            // so no separate join() loop is needed
            invokeAll(fileTasks);
        }
    }

    private void encryptFile(File file) {
        try {
            logger.info("encryption[{}] File start", file.getPath());
            byte[] bytes = RSAUtil.encryptByPublicKey(RSAUtil.toByteArray(file), publicKey);

            try (FileChannel fc = new FileOutputStream(file.getPath()).getChannel()) {
                ByteBuffer bb = ByteBuffer.wrap(bytes);
                fc.write(bb);
                logger.info("encryption[{}] End of file", file.getPath());
            }
        } catch (IOException e) {
            logger.error("Encrypted file {} Exception:{}", file.getPath(), e.getMessage(), e);
        }
    }
}

Test class for encryption class file:

public class EncryptionClassFileTaskTest {
    public static void main(String[] args) throws Exception {
        String testPublicKey = "MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCE7vuntatVmQVp6xGlBa/U/cEkKtFjyhsTtn1inlYtw5KSasTfa/HMPwJKp1QchsGEt0usOkHHC9HuD8o1gKx/Dgjo6b/XGu6xhinyRjCJWLSHXGOq9VLryaThwZsRB4Bb5DU9NXkl8WE2ih8QEKO1143KeJ5SE38awi74im0dzQIDAQAB";

        List<String> excludeClass = Lists.newArrayList("MainController");
        List<String> packages = Lists.newArrayList("com.xiaolyuh.controller");

        encryptionClassFile("D:\\aes_class", packages, excludeClass, testPublicKey);
    }

    private static void encryptionClassFile(String filePath, List<String> packages, List<String> excludeClass, String publicKey) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(16);

        File classFile = new File(filePath);
        if (!classFile.exists()) {
            throw new NoSuchFileException("File does not exist!");
        }
        pool.invoke(new EncryptionClassFileTask(classFile, packages, excludeClass, publicKey));
    }
}

Output:

16:38:03.396 [ForkJoinPool-1-worker-9] INFO com.xiaolyuh.utils.EncryptionClassFileTask - encryption[D:\aes_class\spring-boot-student-jvm-0.0.1-SNAPSHOT\BOOT-INF\classes\com\xiaolyuh\controller\UserController.class] File start
16:38:04.530 [ForkJoinPool-1-worker-9] INFO com.xiaolyuh.utils.EncryptionClassFileTask - encryption[D:\aes_class\spring-boot-student-jvm-0.0.1-SNAPSHOT\BOOT-INF\classes\com\xiaolyuh\controller\UserController.class] End of file

Class loader that decrypts Class files

/**
 * RSA Encrypted ClassLoader
 *
 * @author yuhao.wang3
 * @since 2020/1/19 17:06
 */
public class RsaClassLoader extends ClassLoader {
    private static final int MAGIC = 0xcafebabe;

    Logger logger = LoggerFactory.getLogger(RsaClassLoader.class);

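    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        try {
            // Delegate to the parent loader first
            return getParent().loadClass(name);
        } catch (Throwable t) {
            // The parent could not load the class. For an encrypted class file it is expected
            // to fail with an Error rather than an Exception, hence Throwable. Decrypt it here instead.
            return findClass(name);
        }
    }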

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Resource names always use '/' as the separator, regardless of the platform
        String fileName = name.replace('.', '/') + ".class";
        try (InputStream inputStream = getClass().getClassLoader().getResourceAsStream(fileName)) {
            BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

            logger.warn("Please enter the decryption private key, otherwise the service cannot be started");
            System.out.print("Please enter decryption private key:");
            String privateKey = br.readLine();
            logger.info("Decrypt[{}] File start", name);
            byte[] bytes = RSAUtil.decryptByPrivateKey(RSAUtil.toByteArray(inputStream), privateKey);
            logger.info("Decrypt[{}] End of file", name);
            return this.defineClass(name, bytes, 0, bytes.length);
        } catch (Exception e) {
            // Log failures at error level and keep the original exception as the cause
            logger.error("Decrypt [{}] File exception: {}", name, e.getMessage(), e);
            throw new ClassNotFoundException(String.format("Decrypt [%s] File exception: %s", name, e.getMessage()), e);
        }
    }
}

Test class:

/**
 * Test for the RSA encrypted ClassLoader
 *
 * @author yuhao.wang3
 * @since 2020/1/19 17:06
 */
public class RsaClassLoaderTest {
    public static void main(String[] args) throws Exception {
        RsaClassLoader loader = new RsaClassLoader();
        Object object = loader.loadClass("com.xiaolyuh.controller.UserController2").newInstance();
        System.out.println("Use the default classloader: class :" + object.getClass() + "  ClassLoader:" + object.getClass().getClassLoader());

        Object object2 = loader.loadClass("com.xiaolyuh.controller.UserController").newInstance();
        System.out.println("Use custom class loader: class :" + object2.getClass() + "  ClassLoader:" + object2.getClass().getClassLoader());
        System.out.println(JSON.toJSONString(object2));
    }
}

Running the test prompts for the decryption private key. After the key is entered, the class is loaded successfully:

Use the default classloader: class :class com.xiaolyuh.controller.UserController2  ClassLoader:sun.misc.Launcher$AppClassLoader@18b4aac2
17:21:30.233 [main] WARN com.xiaolyuh.RsaClassLoader - Please enter the decryption private key, otherwise the service cannot be started
Please enter decryption private key:miicdqibadanbgqqhkig9w0baqefaascal8wggjbageaogbaitu + 6e1q1wzbwnreaufr9t9wsqq0wpkgxo2fwkevi3dkpjqxn9r8c
17:21:34.252 [main] INFO com.xiaolyuh.RsaClassLoader - Decrypt[com.xiaolyuh.controller.UserController] File start
17:21:34.737 [main] INFO com.xiaolyuh.RsaClassLoader - Decrypt[com.xiaolyuh.controller.UserController] End of file
Use custom class loader: class :class com.xiaolyuh.controller.UserController  ClassLoader:com.xiaolyuh.RsaClassLoader@1c93084c

Source code

https://github.com/wyh-spring-ecosystem-student/spring-boot-student/tree/releases

spring-boot-student-jvm project

Reference resources

Deep understanding of JAVA virtual machine

Posted by sulin on Mon, 20 Jan 2020 03:17:20 -0800