Alibaba P7 interviewer: Could you briefly describe how the class loading mechanism works?

Keywords: Java

Interview question: how does the class loading mechanism work?

What the interviewer is assessing

Objective: to gauge the candidate's understanding of the JVM. This is one of the standard "eight-part essay" (rote-memorization) interview questions.

Target candidates: more than 3 years of work experience.

Technical background knowledge

Before answering this question, we need to understand what the class loading mechanism is.

Class loading mechanism

What is the class loading mechanism?

Simply put, class loading means reading the binary data in a class's .class file into memory, placing it in the method area of the runtime data area, and then creating a java.lang.Class object in the heap to encapsulate the class's data structures in the method area.

After the class loading process, we can create instances of the class in the program, call methods on those objects, and operate on them.
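As a minimal sketch of the description above, the following example asks the JVM for the java.lang.Class object of a class by name, creates an instance from it, and calls a method on that instance. The class name ReflectiveLoadDemo and the helper sizeOfFreshList are made up for illustration; java.util.ArrayList is used only because it is always available.

```java
import java.lang.reflect.Method;

public class ReflectiveLoadDemo {

    // Looks up the Class object by name (loading the class if necessary),
    // creates an instance through it, and calls size() on that instance.
    static int sizeOfFreshList() {
        try {
            Class<?> clazz = Class.forName("java.util.ArrayList");
            Object list = clazz.getDeclaredConstructor().newInstance();
            Method size = clazz.getMethod("size");
            return (Integer) size.invoke(list);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeOfFreshList()); // prints 0 for a new, empty list
    }
}
```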

The basic working principle is shown in the figure below.

The source code we write in .java files is compiled by the Java compiler (javac) into .class files.

The class loading mechanism loads these .class files into the JVM. The JVM's runtime data area is divided into the heap, the virtual machine stack, the metaspace, the native method stack, the program counter, and other areas. When a class is loaded, its data is stored in the corresponding area according to the JVM's memory rules.

Understanding class loaders

Let's think about it. In actual development, where can the classes needed to run a program be loaded from?

  • Directly from the local system, such as the JRE and the CLASSPATH

  • .class files downloaded over the network

  • .class files loaded from zip, jar, and other archive files

  • .class files extracted from a proprietary database

  • Java source files dynamically compiled into .class files (on a server)

Since class loaders are responsible for loading all the classes the system needs at run time, the JVM (up to JDK 8) provides three types of loaders for classes in different locations:

  1. Bootstrap class loader, BootStrapClassLoader: the top-level loader, which mainly loads the core class libraries, that is, rt.jar, resources.jar, charsets.jar, and the classes under %JRE_HOME%\lib. You can also change the directories it loads from by specifying -Xbootclasspath with a path when starting the JVM.
  2. Extension class loader, ExtClassLoader: loads the jar packages and class files in the %JRE_HOME%\lib\ext directory. It can also load the directories specified by the -Djava.ext.dirs option.
  3. Application class loader, AppClassLoader, also known as the system class loader: loads all the classes and jar packages on the classpath of the current application.
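To see where the three loaders look on a given machine, the search paths can be read from system properties. This is a small sketch with a made-up class name, LoaderPaths. It assumes the property names used by JDK 8 and earlier (sun.boot.class.path and java.ext.dirs); newer JDKs may not set the first two, which the helper reports instead of failing.

```java
public class LoaderPaths {

    // Formats one system property, or notes that this JDK does not set it.
    static String describe(String property) {
        String value = System.getProperty(property);
        return property + " = " + (value == null ? "(not set on this JDK)" : value);
    }

    public static void main(String[] args) {
        System.out.println(describe("sun.boot.class.path")); // bootstrap loader search path
        System.out.println(describe("java.ext.dirs"));       // extension loader directories
        System.out.println(describe("java.class.path"));     // application classpath
    }
}
```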

As the descriptions above show, each of the three class loaders is responsible for a different loading scope. When a class we define ourselves is loaded into memory, the class loaders work as shown in the figure below.

Since Java 2, class loading has followed the Parents Delegation Model (PDM), which better ensures the security of the Java platform. In this model, the JVM's BootStrapClassLoader is the root loader, and every other loader has exactly one parent loader. When a class is to be loaded, the request is first passed to the parent class loader. Only when the parent cannot load the class does the child class loader try to load it itself.
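A quick way to observe delegation is to ask the application class loader for a core class. In the sketch below (DelegationDemo and loadViaAppLoader are illustrative names), the request for java.lang.String is passed up the chain, so the result is the very same Class object that the rest of the program uses, and its defining loader is reported as null, which is how the bootstrap loader appears from Java code.

```java
public class DelegationDemo {

    // Asks the application (system) class loader to load a class by name.
    static Class<?> loadViaAppLoader(String name) {
        try {
            return ClassLoader.getSystemClassLoader().loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> c = loadViaAppLoader("java.lang.String");
        System.out.println(c == String.class);  // true: the same Class object
        System.out.println(c.getClassLoader()); // null: defined by the bootstrap loader
    }
}
```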

PDM is only a mechanism recommended by Java, not a mandatory one. You can extend java.lang.ClassLoader and implement your own class loader. If you want to preserve PDM, override findClass(name); if you want to break PDM, override loadClass(name). JDBC uses the thread context class loader to break PDM, because the JDK provides only the JDBC interfaces: the driver implementations live on the application classpath, where the loader of the core classes cannot see them.
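The JDBC case can be sketched with java.util.ServiceLoader, which looks up implementations of an interface through a class loader you pass in. The example below is only an illustration of the idea, not the JDK's own DriverManager code; ContextLoaderDemo and discoverDrivers are made-up names, and the returned list is simply empty when no JDBC driver is on the classpath.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ContextLoaderDemo {

    // Finds java.sql.Driver implementations through the thread context class
    // loader, which can see application jars that the core loaders cannot.
    static List<String> discoverDrivers() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        List<String> names = new ArrayList<>();
        for (Driver driver : ServiceLoader.load(Driver.class, contextLoader)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints the class names of whatever drivers are on the classpath, if any.
        System.out.println(discoverDrivers());
    }
}
```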

Demonstration of class loader

The following code demonstrates the loader used by the class.

public class ClassLoaderExample {

    public static void main(String[] args) {
        ClassLoader loader=ClassLoaderExample.class.getClassLoader();
        System.out.println(loader);  //case1
        System.out.println(loader.getParent()); //case2
        System.out.println(loader.getParent().getParent()); //case3
    }
}
  • Case 1 prints the class loader that loaded the ClassLoaderExample class.
  • Case 2 prints the parent loader of that class loader.
  • Case 3 prints the grandparent loader.

On JDK 8, the output is as follows:

sun.misc.Launcher$AppClassLoader@18b4aac2
sun.misc.Launcher$ExtClassLoader@29453f44
null

It is proved that ClassLoaderExample is loaded by AppClassLoader.

The last one should be the bootstrap class loader, but the output here is null because BootStrapClassLoader is written in C/C++ and built into the JVM itself, so it has no corresponding Java object. When the JVM starts, BootStrapClassLoader starts with it and loads the core class library. Once the core class library is loaded, instances of ExtClassLoader and AppClassLoader are created, and these two class loaders, implemented in Java, load the class libraries under their own paths. This process can be seen in sun.misc.Launcher.

Why design PDM

Why does Java use PDM to implement class loading? There are several reasons:

  1. To prevent multiple copies of the same bytecode from appearing in memory. Without PDM, each class loader would load classes by itself. If a user wrote a class with the same name as java.lang.Object and put it on the classpath, several class loaders could each load it, and several different Object classes would exist in the system. Comparisons between classes and the uniqueness of a class could then no longer be guaranteed, and the security of the virtual machine would also be put at risk.
  2. The parent delegation mechanism ensures that when several loaders are asked to load a class, it is ultimately loaded by a single loader, so the result is always the same.
  3. It ensures that the core system classes are loaded first. Even if you write your own version of a core class such as java.lang.System, the one provided by the Java platform is always used. Your own version never gets a chance to be loaded, which ensures security.

Loading principle of class

What happens while a class is being loaded, and how does it work?

From the moment a class is loaded into virtual machine memory until it is unloaded, its life cycle includes seven stages: loading, verification, preparation, resolution, initialization, use, and unloading. Their sequence is shown in the following figure:

The class loading process covers the first five stages: loading, verification, preparation, resolution, and initialization. Of these, loading, verification, preparation, and initialization always start in a fixed order, whereas resolution does not: in some cases it can start after initialization. Note also that these stages start in order rather than run or finish in order, because they are usually interleaved, with one stage invoking or activating another while it is still executing.

The work performed in each stage is shown in the figure below.

The following sections analyze in detail what happens at each stage of class loading.

Loading

Loading is the first stage of the class loading mechanism. In the loading stage, the virtual machine mainly does three things:

(1) Obtains the binary byte stream that defines the class, using the class's fully qualified name.

(2) Converts the static storage structure represented by this byte stream into the runtime data structure of the method area.

(3) Generates a Class object representing this class in the heap, which serves as the access point to the class's data in the method area.

Verification

The main purpose of verification is to ensure the correctness of the loaded class. It is also the first step of the linking phase. Put plainly, a loaded .class file must not be able to harm the virtual machine, so it is checked first. Verification consists of four stages:

(1) File format verification: checks whether the byte stream of the .class file conforms to the class file format specification and can be processed by the current version of the virtual machine. It mainly checks the magic number, the major version number, the constant pool, and so on. (The magic number and major version number are data stored in the .class file; a general understanding is enough here.)

(2) Metadata verification: performs semantic analysis on the information described by the bytecode to ensure that it meets the requirements of the Java language specification, for example whether the class has a parent class and whether the fields and methods in the class conflict with those of the parent class.

(3) Bytecode verification: this is the most complex stage of the whole verification process. Through data flow and control flow analysis, it determines that the program semantics are legal and logical. After the data types have been checked in the metadata verification stage, this stage analyzes the method bodies of the class to ensure that they will not do anything that endangers the security of the virtual machine.

(4) Symbolic reference verification: the last stage of verification, which takes place when the virtual machine converts symbolic references into direct references. It mainly checks information outside the class itself, and its purpose is to ensure that resolution can be completed.

For the class loading mechanism as a whole, verification is a very important but not strictly necessary stage. If the code being run is known to be trustworthy, verification can be skipped, since it takes a certain amount of time. In that case, -Xverify:none can be used to turn off most of the verification.
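The file format check in stage (1) can be pictured with a small sketch that inspects the first eight bytes of a class file: a 4-byte magic number (0xCAFEBABE), a 2-byte minor version, and a 2-byte major version. This is only an illustration of the header layout, not the JVM's verifier; ClassFileHeader and its methods are made-up names.

```java
public class ClassFileHeader {

    // True if the data starts with the class file magic number 0xCAFEBABE.
    static boolean hasMagic(byte[] data) {
        return data.length >= 4
                && (data[0] & 0xFF) == 0xCA && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA && (data[3] & 0xFF) == 0xBE;
    }

    // The major version is the unsigned 16-bit value at byte offsets 6 and 7.
    static int majorVersion(byte[] data) {
        if (!hasMagic(data) || data.length < 8) {
            throw new IllegalArgumentException("not a class file header");
        }
        return ((data[6] & 0xFF) << 8) | (data[7] & 0xFF);
    }

    public static void main(String[] args) {
        // Hand-written header bytes: magic, minor version 0, major version 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(hasMagic(header));     // true
        System.out.println(majorVersion(header)); // 52, the major version used by Java 8
    }
}
```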

Preparation

The preparation phase mainly allocates memory for class variables and sets their initial values. This memory is allocated in the method area. At this stage, we only need to pay attention to two key terms, class variable and initial value:

(1) Class variables (static) are allocated memory, but instance variables are not. Instance variables are allocated on the Java heap when objects are instantiated.

(2) The initial value here is the default value of the data type, not the value explicitly assigned in the code. For example, with public static int value = 1; the value after the preparation phase is 0, not 1. The assignment of 1 happens in the initialization phase.

As described above, a variable modified only by static is 0 after the preparation phase, but if it is modified by both static and final, it is 1 after the preparation phase. One way to understand this is that the compiler places a static final constant directly into the constant pool of the class that uses it at compile time.
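The difference between the default value and the assigned value can be observed from ordinary Java code by reading a static variable before its own initializer has run. In this sketch (PreparationDemo and the peek helpers are illustrative names), the first two fields are initialized by calling methods that look at fields declared further down.

```java
public class PreparationDemo {

    // These initializers run first, before `value` below has been assigned.
    static int seenValue = peekValue();
    static int seenConst = peekConst();

    static int value = 1;       // assigned during initialization, after the peeks above
    static final int CONST = 1; // compile-time constant

    static int peekValue() { return value; } // still sees the default value 0
    static int peekConst() { return CONST; } // the constant is inlined as 1

    public static void main(String[] args) {
        System.out.println(seenValue); // 0
        System.out.println(seenConst); // 1
        System.out.println(value);     // 1
    }
}
```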

Resolution

The resolution phase is mainly the process in which the virtual machine converts the symbolic references in the constant pool into direct references. What are symbolic references and direct references?

Symbolic reference: a set of symbols that describes the referenced target. It can be a literal in any form, as long as it locates the target without ambiguity. For example, in class the teacher can refer to you as Zhang San or by your student number, but either way these are only codes (symbols) that point to you (symbolic references).

Direct reference: a pointer to the target, a relative offset, or a handle that can locate the target directly or indirectly. Direct references depend on the memory layout of the virtual machine implementation, so the direct references of different virtual machines are generally different.

Resolution mainly applies to classes or interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers.

Initialization

A class is initialized in the following cases.

  1. An instance of the class is created, that is, an object is created with new

  2. A static variable of the class or interface is read or assigned

  3. A static method of the class is called

  4. Reflection is used (Class.forName("com.gupao.Example"))

  5. A subclass of the class is initialized (the parent class is initialized first)

  6. The class is the startup class specified when the JVM starts, that is, the class containing the main method

Class initialization steps:

  • If the class has not been loaded and linked, load and link it first

  • If the class has a direct parent class that has not been initialized, initialize the direct parent first (this does not apply to interfaces). Note that a class is initialized only once per class loader

  • If the class contains initialization statements (such as static variable initializers and static blocks), execute them in order.
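Case 2 in the list above has one well-known exception that is easy to check: reading a compile-time constant (a static final primitive or String with a constant initializer) does not initialize the class, while reading an ordinary static variable does. The sketch below records the order of events in a list; InitTriggerDemo, Holder, and RESULT are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

public class InitTriggerDemo {

    // Kept in the outer class so that reading it never touches Holder.
    static final List<String> events = new ArrayList<>();

    static class Holder {
        static final int CONSTANT = 42; // compile-time constant
        static int counter = 7;         // ordinary static variable
        static { events.add("Holder initialized"); }
    }

    // Computed once, so the recorded order does not depend on who asks first.
    static final List<String> RESULT = run();

    static List<String> run() {
        int a = Holder.CONSTANT; // inlined by the compiler: no initialization
        events.add("read CONSTANT=" + a);
        int b = Holder.counter;  // active use: triggers initialization
        events.add("read counter=" + b);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        // Prints: read CONSTANT=42, Holder initialized, read counter=7 (one per line)
        RESULT.forEach(System.out::println);
    }
}
```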

Extended knowledge points for class loading

The class loading mechanism leads to many further questions. We consolidate and analyze it through two extended variants:

  1. Why can't static methods call non-static methods and variables?
  2. The initialization order of static and non-static members in a program

Why can't static methods call non-static methods and variables

Everyone probably knows that non-static methods and variables cannot be called directly from a static method. Why?

Once you understand how classes are loaded, it is not hard to see that static methods become usable at a different time than instance members.

  1. Static methods belong to the class. They are available as soon as the class is loaded; with the entry address, they can be called directly through "ClassName.methodName".
  2. Non-static members (variables and methods) belong to objects of the class, so memory is allocated for them only when an object is created, and they are then accessed through that object.

This means that if a static method could use non-static members directly, it might be using members of an object that does not exist yet. Therefore, the compiler reports an error.

There are other variants of this question as well, for example static blocks.

public class ClassLoaderExample {
    
    static {
        //dosomething()
    }
}

When is a static block executed?

Static blocks in a class are executed during the initialization stage of the class loading process, not during the loading stage.

Initialization is the last stage of the class loading process. In this stage, the class constructor method <clinit>() is executed. The <clinit>() method is generated by the compiler, which automatically collects the assignments to all class variables (static variables) and the statements in static blocks. Once a class enters the initialization stage, its static blocks are bound to be executed. Therefore, static blocks are executed as part of the class loading process, but in its initialization stage rather than its loading stage.

<clinit>() is the class constructor: it initializes static variables and runs static code blocks, and the JVM calls it in the initialization stage, after loading, verification, preparation, and resolution. Instance field initializers and constructors, in contrast, are compiled into <init>().

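class Example {

   // static field initializer: compiled into <clinit>
   static java.util.logging.Logger log = java.util.logging.Logger.getLogger("Example");

   private int x = 1;   // instance field initializer: compiled into <init>

   Example(){
      // constructor body: <init>
   }

   static {
      // static block: <clinit>
   }

}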
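The statement that static blocks run at initialization rather than at loading can be checked with the three-argument Class.forName, whose second parameter says whether the class should be initialized. In this sketch (LazyInitDemo, Target, and RESULT are illustrative names), the class is first loaded without initialization and then requested again with initialization enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyInitDemo {

    static final List<String> events = new ArrayList<>();

    static class Target {
        static { events.add("static block ran"); }
    }

    // Computed once, so the recorded order does not depend on who asks first.
    static final List<String> RESULT = run();

    static List<String> run() {
        try {
            // A class literal does not initialize the class.
            String name = Target.class.getName();
            ClassLoader loader = Target.class.getClassLoader();

            Class.forName(name, false, loader); // load only, no initialization
            events.add("after load-only forName");

            Class.forName(name, true, loader);  // now initialize as well
            events.add("after initializing forName");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        // Prints: after load-only forName, static block ran, after initializing forName
        RESULT.forEach(System.out::println);
    }
}
```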

Initialization order of Java program

Given the following code, in what order is it initialized?

class Base {
    public Base() {
        System.out.println("Parent class construction method");
    }
  
    String b = "Parent class non static variable";
  
    {
        System.out.println(b);
        System.out.println("Non static code block of parent class");
    }
    static String a = "Parent static variable";
    static {
        System.out.println(a);
        System.out.println("Parent static code block");
    }
    public static void A() {
        System.out.println("Generic static method of parent class");
    }
}
class Derived extends Base {
    public Derived() {
        System.out.println("Subclass constructor");
    }
    String b = "Subclass non static variable";
    {
        System.out.println(b);
        System.out.println("Subclass non static code block");
    }
    static String a = "Subclass static variable";
    static {
        System.out.println(a);
        System.out.println("Subclass static block");
    }
    public static void A() {
        System.out.println("Subclass general static method");
    }
    public static void main(String[] args) {
        Base.A();
        Derived.A();
        new Derived();
    }
}

To answer this question, you need to understand the initialization order of classes. The rules are as follows.

  • Parent static variable

  • Parent static code block

  • Subclass static variable

  • Subclass static code block

  • Parent class non static variable

  • Non static code block of parent class

  • Parent class constructor

  • Subclass non static variable

  • Subclass non static code block

  • Subclass constructor

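In general, static members are initialized first, parent class before subclass, when the classes are initialized. Then, each time an object is created, the parent class's instance variables, instance blocks, and constructor run before those of the subclass. Static methods such as A() run only when they are called.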
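The ten-step order listed above can be confirmed with a compact variant of the same experiment that records each step in a list instead of printing it. The names InitOrderDemo, Parent, Child, and log are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {

    static final List<String> events = new ArrayList<>();

    // Records a step and returns it so it can be used as a field initializer.
    static String log(String event) { events.add(event); return event; }

    static class Parent {
        static String a = log("parent static variable");
        static { log("parent static block"); }
        String b = log("parent instance variable");
        { log("parent instance block"); }
        Parent() { log("parent constructor"); }
    }

    static class Child extends Parent {
        static String a = log("child static variable");
        static { log("child static block"); }
        String b = log("child instance variable");
        { log("child instance block"); }
        Child() { log("child constructor"); }
    }

    // Computed once: creating the first Child triggers both class initializations.
    static final List<String> RESULT = run();

    static List<String> run() {
        new Child();
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        RESULT.forEach(System.out::println); // ten lines, statics first, parent before child
    }
}
```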

Custom class loader

In addition to the three types of loaders provided by the system, we can also define our own class loaders.

To implement a custom class loader, extend the java.lang.ClassLoader class and override either the findClass method or the loadClass method.

1. If you don't want to break the parent delegation model, you only need to override the findClass method. The default implementation in ClassLoader is as follows.

protected Class<?> findClass(String name) throws ClassNotFoundException {
  throw new ClassNotFoundException(name);
}

The default implementation does nothing except throw ClassNotFoundException. Therefore, a custom class loader must override the findClass method.

2. If you want to break the parent delegation model, override the loadClass method. The default implementation in ClassLoader (JDK 8) is as follows.

protected Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException
{
    synchronized (getClassLoadingLock(name)) {
        // First, check if the class has already been loaded
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            long t0 = System.nanoTime();
            try {
                if (parent != null) {
                    c = parent.loadClass(name, false);
                } else {
                    c = findBootstrapClassOrNull(name);
                }
            } catch (ClassNotFoundException e) {
                // ClassNotFoundException thrown if class not found
                // from the non-null parent class loader
            }

            if (c == null) {
                // If still not found, then invoke findClass in order
                // to find the class.
                long t1 = System.nanoTime();
                c = findClass(name);

                // this is the defining class loader; record the stats
                sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                sun.misc.PerfCounter.getFindClasses().increment();
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}

The general flow of the loadClass method in ClassLoader is as follows:

  1. Check whether the class has already been loaded. If so, there is no need to load it again;
  2. If it has not been loaded, delegate to the parent loader (which recursively does the same), or to the bootstrap class loader if there is no parent;
  3. If the class is still not found, call the findClass method of this loader.
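The three steps above can be traced with a loader that keeps the inherited loadClass and only records when its findClass is reached. This is a sketch with made-up names (TracingLoader, run, and the deliberately nonexistent class no.such.Missing): a core class is resolved by the parent chain, so findClass is never called for it, while an unknown class falls through to findClass.

```java
import java.util.ArrayList;
import java.util.List;

public class TracingLoader extends ClassLoader {

    final List<String> trace = new ArrayList<>();

    public TracingLoader(ClassLoader parent) {
        super(parent);
    }

    // Step 3 of the flow: only reached when the parent chain found nothing.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        trace.add("findClass:" + name);
        throw new ClassNotFoundException(name);
    }

    static List<String> run() {
        TracingLoader loader = new TracingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass("java.lang.String"); // resolved by delegation
            loader.trace.add("loaded java.lang.String via parent");
            loader.loadClass("no.such.Missing");  // hypothetical name, exists nowhere
        } catch (ClassNotFoundException e) {
            loader.trace.add("not found:" + e.getMessage());
        }
        return loader.trace;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```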

Practice: a custom class loader that preserves parent delegation

Implementing a custom class loader takes three main steps:

  • Create a class that extends the abstract class ClassLoader

  • Override the findClass() method

  • Call defineClass() in the findClass() method.

Create a PrintClass.java file in the /tmp directory. The code is as follows.

public class PrintClass {
  public PrintClass(){
     System.out.println("PrintClass:"+getClass().getClassLoader());
     System.out.println("PrintClass Parent:"+getClass().getClassLoader().getParent());
  }
  public String print(){
    System.out.println("PrintClass method for print");
    return "PrintClass.print()";
  }
}

Run javac PrintClass.java to compile the source file and get the PrintClass.class file.

Next, create a custom class loader in the Java project. The code is as follows.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MyClassLoader extends ClassLoader {

    // Root directory in which this loader looks for .class files
    private final String classPath;

    public MyClassLoader(String classPath) {
        this.classPath = classPath;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        try {
            byte[] bytes = getClassBytes(name);
            // defineClass turns the raw bytes into a Class object defined by this loader
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            // Report a missing or unreadable file as "class not found"
            throw new ClassNotFoundException(name, e);
        }
    }

    // Reads <classPath>/<package path>/<ClassName>.class fully into memory
    private byte[] getClassBytes(String name) throws IOException {
        String fileName = name.replace('.', '/') + ".class";
        return Files.readAllBytes(Paths.get(classPath, fileName));
    }
}

MyClassLoader extends ClassLoader and overrides the findClass method, in which the .class file is loaded from the specified path.

Write the test code:

import java.lang.reflect.Method;

public class ClassLoaderMain {

    public static void main(String[] args) throws Exception {
        MyClassLoader mc = new MyClassLoader("/tmp");
        // Not on the application classpath, so the parents fail and findClass is used
        Class<?> clazz = mc.loadClass("PrintClass");
        Object o = clazz.getDeclaredConstructor().newInstance();
        Method print = clazz.getDeclaredMethod("print");
        print.invoke(o);
    }
}

The operation results are as follows:

PrintClass:org.example.cl.MyClassLoader@5cad8086
PrintClass Parent:sun.misc.Launcher$AppClassLoader@18b4aac2
PrintClass method for print

As you can see, the class loader of PrintClass.class is MyClassLoader.

Practice: a custom class loader that breaks parent delegation

By default, the loadClass method in the ClassLoader class implements the parent delegation mechanism. To break parent delegation, you only need to override the loadClass method.

In the MyClassLoader class, override the loadClass method. The code is as follows.

@Override
protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
  synchronized (getClassLoadingLock(name)) {
    // First, check if the class has already been loaded by this loader
    Class<?> c = findLoadedClass(name);
    if (c == null) {
      if (!name.equals("PrintClass")) {
        // Classes other than our own (such as java.lang.Object) still go through the parent
        c = this.getParent().loadClass(name);
      } else {
        // Our own class is loaded by this loader directly, without asking the parent first
        c = findClass(name);
      }
    }
    if (resolve) {
      resolveClass(c);
    }
    return c;
  }
}

Copy PrintClass.java to the /tmp/cl directory, modify the print method, and compile it again.

public class PrintClass {
  public PrintClass(){
     System.out.println("PrintClass:"+getClass().getClassLoader());
     System.out.println("PrintClass Parent:"+getClass().getClassLoader().getParent());
  }
  public String print(){
    System.out.println("PrintClass method for print NEW");  //Modified the print statement to distinguish the loaded classes
    return "PrintClass.print()";
  }
}

Write the test code:

import java.lang.reflect.Method;

public class ClassLoaderMain {

    public static void main(String[] args) throws Exception {
        MyClassLoader mc = new MyClassLoader("/tmp");
        Class<?> clazz = mc.loadClass("PrintClass");
        Object o = clazz.getDeclaredConstructor().newInstance();
        Method print = clazz.getDeclaredMethod("print");
        print.invoke(o);

        // Load the modified PrintClass.class from another directory with a second loader
        MyClassLoader mc1 = new MyClassLoader("/tmp/cl");
        Class<?> clazz1 = mc1.loadClass("PrintClass");
        Object o1 = clazz1.getDeclaredConstructor().newInstance();
        Method print1 = clazz1.getDeclaredMethod("print");
        print1.invoke(o1);
    }
}

The code above loads the PrintClass.class files in the /tmp and /tmp/cl directories respectively, creates an instance of each, and calls print(). The output is as follows.

PrintClass:org.example.cl.MyClassLoader@5cad8086
PrintClass Parent:sun.misc.Launcher$AppClassLoader@18b4aac2
PrintClass method for print
PrintClass:org.example.cl.MyClassLoader@610455d6
PrintClass Parent:sun.misc.Launcher$AppClassLoader@18b4aac2
PrintClass method for print NEW

Conclusion: by overriding the loadClass method, classes we write ourselves can be loaded directly by the custom loader without first delegating to the parent loader, which breaks parent delegation.

How does Tomcat implement the isolation of application jar packages?

Many readers have probably encountered this question in interviews.

Before thinking about this question, let's consider what problems Tomcat has to solve as a JSP/Servlet container.

  1. A web container may need to deploy two or more applications. Different applications may depend on different versions of the same third-party class library, so the container cannot require that only one copy of a class library exists on the server. It must therefore ensure that the class libraries of each application are independent and isolated from each other.
  2. Applications deployed in the same web container should be able to share the same version of the same class library. Otherwise, if the server hosts 10 applications, 10 copies of the same class library must be loaded into the virtual machine, which inevitably consumes too much memory.
  3. The web container also depends on its own class libraries, which must not be confused with those of the applications. For security reasons, the container's class libraries should be isolated from the applications' class libraries.

To achieve these goals, Tomcat cannot rely on the default class loading mechanism.

Reason: with the default class loading mechanism, you cannot load different versions of the same class library. The default class loaders only look at the fully qualified class name and keep a single copy of each class, regardless of version.

Therefore, Tomcat implements its own class loaders, which also break the parent delegation mechanism. The following figure shows Tomcat's class loading mechanism.

As the figure shows, the first three class loaders are the same as the default ones. CommonClassLoader, CatalinaClassLoader, SharedClassLoader, and WebappClassLoader are the class loaders defined by Tomcat. By default, the first three load the Java class libraries in ${TOMCAT_HOME}/lib, and WebappClassLoader loads those in /WebApp/WEB-INF/*.

There are usually multiple instances of the WebApp class loader and the JSP class loader: each web application has its own WebApp class loader, and each JSP file has its own JSP class loader.

  • commonLoader: Tomcat's most basic class loader. The classes on its loading path can be accessed by both the Tomcat container itself and every Webapp (web application);
  • catalinaLoader: the private class loader of the Tomcat container. The classes on its loading path are not visible to Webapps;
  • sharedLoader: the class loader shared by all Webapps. The classes on its loading path are visible to all Webapps, but not to the Tomcat container;
  • WebappClassLoader: the private class loader of each Webapp. The classes on its loading path are visible only to the current Webapp.

As can be seen from the delegation relationship in the figure:

All classes that CommonClassLoader can load can be used by CatalinaClassLoader and SharedClassLoader, which realizes the sharing of common class libraries, while the classes that CatalinaClassLoader and SharedClassLoader load themselves are isolated from each other.

WebAppClassLoader can use the classes loaded by SharedClassLoader, but the WebAppClassLoader instances are isolated from each other.

The loading scope of JasperLoader is only the .class file compiled from a single JSP file. Its purpose is to support the HotSwap function of JSP.

Clearly, in order to achieve isolation, Tomcat breaks parent delegation: each WebappClassLoader loads the class files in its own directory first instead of delegating to its parent first.
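The idea of looking in the application's own directory before asking the parent can be sketched as a "child-first" loader. The code below is a simplified illustration, not Tomcat's actual WebappClassLoader: ChildFirstLoader and localAttempts are made-up names, and the findClass here is a placeholder that records the attempt and finds nothing, where a real loader would read from the application's own class directory.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildFirstLoader extends ClassLoader {

    final List<String> localAttempts = new ArrayList<>();

    public ChildFirstLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name); // look in this loader's own repository first
                } catch (ClassNotFoundException e) {
                    // not available locally, fall through to the parent
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // normal parent delegation
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    // Placeholder repository lookup: records the attempt and finds nothing.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        localAttempts.add(name);
        throw new ClassNotFoundException(name);
    }

    static List<String> run() {
        ChildFirstLoader loader = new ChildFirstLoader(ChildFirstLoader.class.getClassLoader());
        try {
            loader.loadClass("java.util.ArrayList");            // core class: parent only
            loader.loadClass(ChildFirstLoader.class.getName()); // tried locally first
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.localAttempts;
    }
}
```

Core classes whose names start with java. are never looked up locally, which keeps the protection that parent delegation gives to the core library.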

Answering the question

Interview question: how does the class loading mechanism work?

Answer: class loading means reading the binary data in a class's .class file into memory, placing it in the method area of the runtime data area, and then creating a java.lang.Class object in the heap to encapsulate the class's data structures in the method area.

The class loading mechanism consists of five stages: loading, verification, preparation, resolution, and initialization.

  • Loading: loads the .class file into memory
  • Verification: ensures that the loaded class conforms to the JVM specification
  • Preparation: allocates memory for class variables and sets their default initial values
  • Resolution: converts the symbolic references in the constant pool into direct references
  • Initialization: executes the class constructor <clinit>(), which runs static variable assignments and static blocks.

Summary

A small interview question can involve a huge amount of technical knowledge.

In an interview, without systematic knowledge it is hard to find a starting point when answering such questions. With a broad question like this one, which has many possible starting points, the answer easily becomes disorganized.

Posted by Zoud on Sun, 31 Oct 2021 22:41:10 -0700