javaagent usage guide

Keywords: Java jvm Spring Maven

Today I'm going to write about the Java agent. At first I wasn't familiar with the concept. Later I heard other people talk about bytecode instrumentation, and after using BTrace and Arthas I gradually realized that Java itself provides such a facility.

Static Instrument Before JVM Start

What is a Java agent?

javaagent is a parameter of the java command. It specifies a jar package, and that jar package must meet two requirements:

  1. The MANIFEST.MF file of this jar package must specify the Premain-Class entry.
  2. The class specified by Premain-Class must implement the premain() method.

The premain method, as the name suggests, is a method that runs before the main function. When the Java virtual machine starts, the JVM runs the premain method of the Premain-Class in the jar package specified by -javaagent before it executes the main function.

If you run java on the command line with no arguments, its usage text lists the agent-related parameters:

    -agentlib:<libname>[=<options>]
                  load native agent library <libname>, e.g. -agentlib:hprof
                  see also, -agentlib:jdwp=help and -agentlib:hprof=help
    -agentpath:<pathname>[=<options>]
                  load native agent library by full pathname
    -javaagent:<jarpath>[=<options>]
                  load Java programming language agent, see java.lang.instrument

The -javaagent description above refers to java.lang.instrument. This is a package defined in rt.jar. Two types in this package matter most for this article: Instrumentation and ClassFileTransformer, both of which are used below.

The package provides tools that let developers dynamically modify classes in the system while a Java program is running. The Java agent is one of the key pieces for using the package. Judging by the name it seems to be some kind of Java proxy, but in fact it works more like a class converter: it accepts requests at runtime to modify classes.

In essence, a Java agent is a regular Java class that follows a strict set of conventions. As mentioned above, -javaagent requires a premain() method in the specified class, and the method signature must match one of the following two forms:

public static void premain(String agentArgs, Instrumentation inst)
    
public static void premain(String agentArgs)

The JVM first looks for the method with the Instrumentation parameter and falls back to the second form only if the first does not exist. This logic lives in the sun.instrument.InstrumentationImpl class.
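The lookup order can be illustrated with plain reflection. This is only a sketch of the idea, not the actual code in sun.instrument.InstrumentationImpl; PremainLookup, findPremain, and the two sample agent classes are names made up for this illustration.

```java
import java.lang.instrument.Instrumentation;
import java.lang.reflect.Method;

class PremainLookup {

    // Hypothetical agent that declares only the one-argument form.
    static class OneArgAgent {
        public static void premain(String agentArgs) { }
    }

    // Hypothetical agent that declares both forms.
    static class BothFormsAgent {
        public static void premain(String agentArgs, Instrumentation inst) { }
        public static void premain(String agentArgs) { }
    }

    // Prefer premain(String, Instrumentation); fall back to premain(String).
    static Method findPremain(Class<?> agentClass) {
        try {
            return agentClass.getDeclaredMethod("premain", String.class, Instrumentation.class);
        } catch (NoSuchMethodException preferredMissing) {
            try {
                return agentClass.getDeclaredMethod("premain", String.class);
            } catch (NoSuchMethodException bothMissing) {
                throw new IllegalArgumentException(
                        "no premain method in " + agentClass.getName(), bothMissing);
            }
        }
    }

    public static void main(String[] args) {
        // The two-argument form wins when both are present.
        System.out.println(findPremain(BothFormsAgent.class).getParameterCount()); // 2
        System.out.println(findPremain(OneArgAgent.class).getParameterCount());    // 1
    }
}
```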

The Instrumentation interface is defined as follows:

public interface Instrumentation {
    
    // Register a class file transformer, which can change the binary class data. The canRetransform parameter sets whether this transformer may be applied again on retransformation.
    void addTransformer(ClassFileTransformer transformer, boolean canRetransform);

    // Same as addTransformer(transformer, false). Once a transformer is registered, subsequent class loads are intercepted by it. For classes that are already loaded, call retransformClasses to trigger the transformer again. Modified bytecode stays in effect until the class is retransformed again.
    void addTransformer(ClassFileTransformer transformer);

    // Remove a class file transformer
    boolean removeTransformer(ClassFileTransformer transformer);

    boolean isRetransformClassesSupported();

    // Re-run the registered transformers on classes that are already loaded. This one is very important. It was added in 1.6.
    void retransformClasses(Class<?>... classes) throws UnmodifiableClassException;

    boolean isRedefineClassesSupported();

    // Replace the definition of already loaded classes. Each ClassDefinition carries the new class file bytes for one class.
    void redefineClasses(ClassDefinition... definitions)
        throws  ClassNotFoundException, UnmodifiableClassException;

    boolean isModifiableClass(Class<?> theClass);

    @SuppressWarnings("rawtypes")
    Class[] getAllLoadedClasses();

  
    @SuppressWarnings("rawtypes")
    Class[] getInitiatedClasses(ClassLoader loader);

    //Get the size of an object
    long getObjectSize(Object objectToSize);


   
    void appendToBootstrapClassLoaderSearch(JarFile jarfile);

    
    void appendToSystemClassLoaderSearch(JarFile jarfile);

    
    boolean isNativeMethodPrefixSupported();

    
    void setNativeMethodPrefix(ClassFileTransformer transformer, String prefix);
}

The most important methods are the ones with comments above; we will use them below.

How to use Java agent?

Using javaagent requires several steps:

  1. Define a MANIFEST.MF file. It must include the Premain-Class option, and the Can-Redefine-Classes and Can-Retransform-Classes options are usually added as well.
  2. Create the class named by Premain-Class. It contains the premain method, and the logic of that method is up to you.
  3. Package the premain class and the MANIFEST.MF file into a jar.
  4. Start the program to be instrumented with the parameter -javaagent:<path to the jar>.

After these steps, the JVM executes the premain method first, and most class loading then passes through the registered transformer. Note that this means most classes, not all. The main omissions are system classes, because many of them are loaded before the agent runs; the loading of user classes is always intercepted. In other words, the agent intercepts the loading of most classes from before the main method starts. Since it can intercept class loading, it can also rewrite classes, using third-party bytecode libraries such as ASM, javassist, or cglib.
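Because so many system classes pass through the transformer, a real agent usually filters by class name before doing anything. The following is a minimal sketch of such a filter; PrefixFilterTransformer and its matches method are hypothetical names, and the package prefix is only an example. It does not modify any bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

class PrefixFilterTransformer implements ClassFileTransformer {

    // Class names arrive in internal form, e.g. "com/rickiyang/learn/TestMain".
    private final String internalPrefix;

    PrefixFilterTransformer(String packageName) {
        this.internalPrefix = packageName.replace('.', '/') + "/";
    }

    // className may be null for some classes, so guard against that.
    boolean matches(String className) {
        return className != null && className.startsWith(internalPrefix);
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (!matches(className)) {
            return null; // null means "leave this class unchanged"
        }
        System.out.println("intercepted user class: " + className);
        // A real agent would pass classfileBuffer to ASM or javassist here
        // and return the modified bytes. This sketch changes nothing.
        return null;
    }
}
```

It would be registered from premain with something like inst.addTransformer(new PrefixFilterTransformer("com.rickiyang.learn")).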

Let's implement these steps in code. A javaagent demo needs two projects: one holds the agent class and is packaged as its own jar; the other holds the classes that the agent instruments. The agent does its work before the main method of the second project starts.

1. First, implement the Java agent project.

The directory structure of the project is as follows:

java-agent
|-- src
    |-- main
        |-- java
        |   |-- com/rickiyang/learn
        |       |-- PreMainTraceAgent.java
        |-- resources
            |-- META-INF
                |-- MANIFEST.MF

The first step is to create a class that contains the premain method:

package com.rickiyang.learn;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/**
 * @author: rickiyang
 * @date: 2019/8/12
 * @description:
 */
public class PreMainTraceAgent {

    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("agentArgs : " + agentArgs);
        inst.addTransformer(new DefineTransformer(), true);
    }

    static class DefineTransformer implements ClassFileTransformer{

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException {
            System.out.println("premain load Class:" + className);
            return classfileBuffer;
        }
    }
}

The class above implements the premain() method that takes an Instrumentation parameter. It calls addTransformer() so that class loading is intercepted from startup on.

Then create a META-INF directory under the resources directory, and create a MANIFEST.MF file in it:

Manifest-Version: 1.0
Can-Redefine-Classes: true
Can-Retransform-Classes: true
Premain-Class: com.rickiyang.learn.PreMainTraceAgent

Note that the file ends with a blank line (line 5). The manifest must end with a newline, otherwise the last entry may not be parsed.
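If hand-editing the file feels fragile, the same entries can also be produced from code with the standard java.util.jar.Manifest class, which writes the line endings itself. This is a minimal sketch; AgentManifest, build, and render are hypothetical names, and the order of the attributes after Manifest-Version is not guaranteed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class AgentManifest {

    // Build a manifest carrying the agent-related attributes.
    static Manifest build(String premainClass) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise the main section is not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.putValue("Premain-Class", premainClass);
        attrs.putValue("Can-Redefine-Classes", "true");
        attrs.putValue("Can-Retransform-Classes", "true");
        return manifest;
    }

    // Render the manifest as the text that would end up in META-INF/MANIFEST.MF.
    static String render(Manifest manifest) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(render(build("com.rickiyang.learn.PreMainTraceAgent")));
    }
}
```

The same Manifest object can be passed to the java.util.jar.JarOutputStream constructor when the jar is assembled in code.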

A word about the MANIFEST.MF file. If you do not provide one and simply package the project, a default MANIFEST.MF is generated in the jar:

Manifest-Version: 1.0
Implementation-Title: test-agent
Implementation-Version: 0.0.1-SNAPSHOT
Built-By: yangyue
Implementation-Vendor-Id: com.rickiyang.learn
Spring-Boot-Version: 2.0.9.RELEASE
Main-Class: org.springframework.boot.loader.JarLauncher
Start-Class: com.rickiyang.learn.LearnApplication
Spring-Boot-Classes: BOOT-INF/classes/
Spring-Boot-Lib: BOOT-INF/lib/
Created-By: Apache Maven 3.5.2
Build-Jdk: 1.8.0_151
Implementation-URL: https://projects.spring.io/spring-boot/#/spring-bo
 ot-starter-parent/test-agent

This default file contains version information for the build, the startup class of the project, and so on. Other attributes let you do more. The agent-related ones are:

Premain-Class: the class containing the premain method (fully qualified class name)

Agent-Class: the class containing the agentmain method (fully qualified class name)

Boot-Class-Path: a list of paths to be searched by the bootstrap class loader. The bootstrap class loader searches these paths after the platform-specific mechanism for locating a class has failed. Paths are searched in the order listed and are separated by one or more spaces. A path uses the path component syntax of a hierarchical URI. A path that begins with a slash character ("/") is absolute; otherwise it is relative and is resolved against the absolute path of the agent JAR file. Malformed and non-existent paths are ignored. If the agent is started some time after the VM has started, paths that do not represent a JAR file are ignored. (Optional)

Can-Redefine-Classes: true means this agent needs the ability to redefine classes; the default is false (optional)

Can-Retransform-Classes: true means this agent needs the ability to retransform classes; the default is false (optional)

Can-Set-Native-Method-Prefix: true means this agent needs the ability to set a native method prefix; the default is false (optional)

In other words, the file defines configuration for running the program, and the JVM checks these entries before the program runs.

There is no limit on the number of -javaagent parameters in a java command, so any number of agents can be added. All agents are executed in the order in which you specify them, for example:

java -javaagent:agent1.jar -javaagent:agent2.jar -jar MyProgram.jar

The sequence of program execution will be:

MyAgent1.premain -> MyAgent2.premain -> MyProgram.main
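Each agent can also receive its own option string through the =<options> suffix of -javaagent, which arrives in premain as agentArgs (null when nothing is passed). The JVM does not interpret this string, so its format is up to the agent. The sketch below assumes a comma-separated key=value format; that format and the AgentArgs class are assumptions made for illustration, not something the JVM defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AgentArgs {

    // Parse "key=value,flag" style options. A key without '=' is treated as "true".
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return options; // no options were given on the command line
        }
        for (String pair : agentArgs.split(",")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                options.put(trimmed, "true");
            } else {
                options.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Corresponds to: java -javaagent:agent1.jar=verbose,prefix=com.rickiyang ...
        System.out.println(parse("verbose,prefix=com.rickiyang")); // {verbose=true, prefix=com.rickiyang}
    }
}
```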

Back to the Java agent project above. When I packaged it into a jar, I found that my MANIFEST.MF had been replaced by the default one, so I manually replaced the file inside the jar with the configuration shown above. Watch out for this.

Another way, which avoids writing MANIFEST.MF by hand, is to use the Maven plugin:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.1.0</version>
    <configuration>
        <archive>
            <!--Automatic addition META-INF/MANIFEST.MF -->
            <manifest>
                <addClasspath>true</addClasspath>
            </manifest>
            <manifestEntries>
                <Premain-Class>com.rickiyang.learn.PreMainTraceAgent</Premain-Class>
                <Agent-Class>com.rickiyang.learn.PreMainTraceAgent</Agent-Class>
                <Can-Redefine-Classes>true</Can-Redefine-Classes>
                <Can-Retransform-Classes>true</Can-Retransform-Classes>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>

The plugin generates the file automatically.

The agent code is finished. Next, create a new project. It only needs a class with a main method.

public class TestMain {

    public static void main(String[] args) {
        System.out.println("main start");
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("main end");
    }
}

That's it. All that remains is to attach the agent to the test class. There are two ways:

If you use IntelliJ IDEA, open Run/Debug Configurations and specify your agent jar in the VM options.

The other way is to use the command line instead of the IDE. Compile the test class into a class file, then run it with the agent:

 #Compile this class into a class file
 > javac TestMain.java
 
 #Specify the agent program and run the class
 > java -javaagent:c:/alg.jar TestMain

Both of the above methods can be used to run, and the output results are as follows:

D:\soft\jdk1.8\bin\java.exe -javaagent:c:/alg.jar "-javaagent:D:\soft\IntelliJ IDEA 2019.1.1\lib\idea_rt.jar=54274:D:\soft\IntelliJ IDEA 2019.1.1\bin" -Dfile.encoding=UTF-8 -classpath D:\soft\jdk1.8\jre\lib\charsets.jar;D:\soft\jdk1.8\jre\lib\deploy.jar;D:\soft\jdk1.8\jre\lib\ext\access-bridge-64.jar;D:\soft\jdk1.8\jre\lib\ext\cldrdata.jar;D:\soft\jdk1.8\jre\lib\ext\dnsns.jar;D:\soft\jdk1.8\jre\lib\ext\jaccess.jar;D:\soft\jdk1.8\jre\lib\ext\jfxrt.jar;D:\soft\jdk1.8\jre\lib\ext\localedata.jar;D:\soft\jdk1.8\jre\lib\ext\nashorn.jar;D:\soft\jdk1.8\jre\lib\ext\sunec.jar;D:\soft\jdk1.8\jre\lib\ext\sunjce_provider.jar;D:\soft\jdk1.8\jre\lib\ext\sunmscapi.jar;D:\soft\jdk1.8\jre\lib\ext\sunpkcs11.jar;D:\soft\jdk1.8\jre\lib\ext\zipfs.jar;D:\soft\jdk1.8\jre\lib\javaws.jar;D:\soft\jdk1.8\jre\lib\jce.jar;D:\soft\jdk1.8\jre\lib\jfr.jar;D:\soft\jdk1.8\jre\lib\jfxswt.jar;D:\soft\jdk1.8\jre\lib\jsse.jar;D:\soft\jdk1.8\jre\lib\management-agent.jar;D:\soft\jdk1.8\jre\lib\plugin.jar;D:\soft\jdk1.8\jre\lib\resources.jar;D:\soft\jdk1.8\jre\lib\rt.jar;D:\workspace\demo1\target\classes;E:\.m2\repository\org\springframework\boot\spring-boot-starter-aop\2.1.1.RELEASE\spring-
...
...
...
1.8.11.jar;E:\.m2\repository\com\google\guava\guava\20.0\guava-20.0.jar;E:\.m2\repository\org\apache\commons\commons-lang3\3.7\commons-lang3-3.7.jar;E:\.m2\repository\com\alibaba\fastjson\1.2.54\fastjson-1.2.54.jar;E:\.m2\repository\org\springframework\boot\spring-boot\2.1.0.RELEASE\spring-boot-2.1.0.RELEASE.jar;E:\.m2\repository\org\springframework\spring-context\5.1.3.RELEASE\spring-context-5.1.3.RELEASE.jar com.springboot.example.demo.service.TestMain
agentArgs : null
premain load Class     :java/util/concurrent/ConcurrentHashMap$ForwardingNode
premain load Class     :sun/nio/cs/ThreadLocalCoders
premain load Class     :sun/nio/cs/ThreadLocalCoders$1
premain load Class     :sun/nio/cs/ThreadLocalCoders$Cache
premain load Class     :sun/nio/cs/ThreadLocalCoders$2
premain load Class     :java/util/jar/Attributes
premain load Class     :java/util/jar/Manifest$FastInputStream
...
...
...
premain load Class     :java/lang/Class$MethodArray
premain load Class     :java/lang/Void
main start
premain load Class     :sun/misc/VMSupport
premain load Class     :java/util/Hashtable$KeySet
premain load Class     :sun/nio/cs/ISO_8859_1$Encoder
premain load Class     :sun/nio/cs/Surrogate$Parser
premain load Class     :sun/nio/cs/Surrogate
...
...
...
premain load Class     :sun/util/locale/provider/LocaleResources$ResourceReference
main end
premain load Class     :java/lang/Shutdown
premain load Class     :java/lang/Shutdown$Lock

Process finished with exit code 0

From the output above we can see:

  1. The agent's premain method runs before the main method, and class loads before, during, and after main (for example java/lang/Shutdown after "main end") all pass through the transformer.
  2. ClassFileTransformer intercepts both system classes and the classes you write yourself.
  3. If you need to rewrite certain classes, you can do so with a bytecode library at the moment they are intercepted.

The following dynamically replaces a method using javassist:

package com.rickiyang.learn;

import javassist.*;

import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

/**
 * @author rickiyang
 * @date 2019-08-06
 * @Desc
 */
public class MyClassTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(final ClassLoader loader, final String className, final Class<?> classBeingRedefined,final ProtectionDomain protectionDomain, final byte[] classfileBuffer) {
        // Only operate on the Date class
        if ("java/util/Date".equals(className)) {
            try {
                // Get the CtClass object from the ClassPool
                final ClassPool classPool = ClassPool.getDefault();
                final CtClass clazz = classPool.get("java.util.Date");
                CtMethod convertToAbbr = clazz.getDeclaredMethod("convertToAbbr");
                // Rewrite java.util.Date.convertToAbbr(StringBuilder sb, String name), adding a print before the return.
                // setBody() does not know the original parameter names, so sb is $1 and name is $2.
                String methodBody = "{$1.append(Character.toUpperCase($2.charAt(0)));" +
                        "$1.append($2.charAt(1)).append($2.charAt(2));" +
                        "System.out.println($1.toString());" +
                        "return $1;}";
                convertToAbbr.setBody(methodBody);

                // Get the modified bytecode, then detach the CtClass object
                byte[] byteCode = clazz.toBytecode();
                // detach() removes the Date CtClass from javassist's ClassPool; javassist reloads it the next time it is requested
                clazz.detach();
                return byteCode;
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
        // If null is returned, the bytecode will not be modified
        return null;
    }
}

Dynamic Instrument after JVM Startup

The Instrumentation described above was introduced in JDK 1.5, where developers can only set up instrumentation before main runs. Java SE 6 added a new way to run an agent: agentmain, which can run after the main function has started.

As with premain, developers write a Java class containing an agentmain method:

// With the attach mechanism, the target VM may have been started long ago, and its classes are already loaded. In that case, call Instrumentation#retransformClasses(Class<?>... classes) to retransform the relevant classes, which triggers the callbacks of the registered ClassFileTransformers.
public static void agentmain(String agentArgs, Instrumentation inst)

public static void agentmain(String agentArgs)

Likewise, the agentmain method with the Instrumentation parameter takes priority over the one without it. The developer must set "Agent-Class" in the manifest file to specify the class containing the agentmain method.
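Since an agent loaded through agentmain usually arrives after its target classes are loaded, registering a transformer alone is not enough; the loaded classes have to be retransformed explicitly. Here is a minimal sketch of that step. RetransformLoaded and its retransform method are hypothetical names, and selecting classes by name prefix is just one possible policy.

```java
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.util.ArrayList;
import java.util.List;

class RetransformLoaded {

    // Retransform every loaded, modifiable class whose name starts with the prefix.
    // Returns the classes that were handed to retransformClasses.
    static List<Class<?>> retransform(Instrumentation inst, String namePrefix) {
        List<Class<?>> targets = new ArrayList<>();
        if (!inst.isRetransformClassesSupported()) {
            return targets; // e.g. Can-Retransform-Classes was not set to true
        }
        for (Class<?> c : inst.getAllLoadedClasses()) {
            if (c.getName().startsWith(namePrefix) && inst.isModifiableClass(c)) {
                targets.add(c);
            }
        }
        if (!targets.isEmpty()) {
            try {
                // Triggers transform() on transformers registered with canRetransform = true.
                inst.retransformClasses(targets.toArray(new Class<?>[0]));
            } catch (UnmodifiableClassException e) {
                throw new IllegalStateException(e);
            }
        }
        return targets;
    }
}
```

Inside agentmain it would be called after addTransformer(transformer, true), for example RetransformLoaded.retransform(inst, "com.rickiyang.learn.").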

Since Java 6, loading an agent after startup is done through the Attach API. The Attach API is simple, with only two main classes, both in the com.sun.tools.attach package:

  1. VirtualMachine represents a Java virtual machine, namely the target virtual machine that the program wants to monitor. It provides access to system information (such as memory dumps, thread dumps, and class statistics such as the number of loaded classes and instances), loadAgent, attach, and detach (the opposite of attach: it disconnects from the JVM). What it can do is very powerful. Passing a JVM PID (process id) to the attach method lets us connect to that JVM.

    Injecting an agent class is only one of its many functions. The loadAgent method registers an agent with the JVM. Inside the agent, an Instrumentation instance is available, which can change class bytecode before a class is loaded or reload it after the class is loaded. When the methods of the Instrumentation instance are invoked, the work is done by the methods of the ClassFileTransformer interface.

  2. VirtualMachineDescriptor is a container class that describes a virtual machine. It works together with the VirtualMachine class.

The principle of dynamic injection through attach is as follows:

With the attach(pid) method of the VirtualMachine class you can attach to a running Java process. You can then inject the agent jar into that process through loadAgent(agentJarPath), and the process will call the agentmain method.

Since this is communication between two processes, a connection has to be established first. VirtualMachine.attach is similar in spirit to the TCP three-way handshake: its purpose is to set up the attach communication channel. Later operations such as vm.loadAgent actually write data to the socket, and the target VM on the receiving end handles each kind of incoming data differently.
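The attach, loadAgent, detach sequence can be sketched as a small helper. AttachHelper and its method names are hypothetical; the sketch assumes a JDK where com.sun.tools.attach is available (on JDK 8 that means tools.jar on the classpath) and that the target JVM runs on the same machine.

```java
import com.sun.tools.attach.AgentInitializationException;
import com.sun.tools.attach.AgentLoadException;
import com.sun.tools.attach.AttachNotSupportedException;
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.io.IOException;
import java.util.Optional;

class AttachHelper {

    // Find the pid of a running JVM whose display name ends with the given text.
    static Optional<String> findPid(String displayNameSuffix) {
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            if (vmd.displayName().endsWith(displayNameSuffix)) {
                return Optional.of(vmd.id());
            }
        }
        return Optional.empty();
    }

    // Attach, load the agent jar, and always detach, even if loading fails.
    static void loadAgent(String pid, String agentJarPath)
            throws IOException, AttachNotSupportedException,
                   AgentLoadException, AgentInitializationException {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJarPath);
        } finally {
            vm.detach();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: AttachHelper <display-name-suffix> <agent-jar>");
            return;
        }
        Optional<String> pid = findPid(args[0]);
        if (pid.isPresent()) {
            loadAgent(pid.get(), args[1]);
        } else {
            System.err.println("no running JVM matches " + args[0]);
        }
    }
}
```

Putting detach in a finally block keeps the attaching process from holding the connection open when loadAgent throws.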

Let's test the use of agentmain:

The project structure is the same as in the premain test above. Write AgentMainTest, then package it with the Maven plugin so that MANIFEST.MF is generated.

package com.rickiyang.learn;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/**
 * @author rickiyang
 * @date 2019-08-16
 * @Desc
 */
public class AgentMainTest {

    public static void agentmain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new DefineTransformer(), true);
    }
    
    static class DefineTransformer implements ClassFileTransformer {

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer) throws IllegalClassFormatException {
            System.out.println("agentmain load Class:" + className);
            return classfileBuffer;
        }
    }
}

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <version>3.1.0</version>
  <configuration>
    <archive>
      <!--Automatic addition META-INF/MANIFEST.MF -->
      <manifest>
        <addClasspath>true</addClasspath>
      </manifest>
      <manifestEntries>
        <Agent-Class>com.rickiyang.learn.AgentMainTest</Agent-Class>
        <Can-Redefine-Classes>true</Can-Redefine-Classes>
        <Can-Retransform-Classes>true</Can-Retransform-Classes>
      </manifestEntries>
    </archive>
  </configuration>
</plugin>

After the agent is packaged, write the test main method. The steps are: from an attaching JVM, look for the target JVM and, if it exists, send agent.jar to it. I kept the test simple: it finds the current JVM and loads agent.jar into it. (On JDK 9 and later, attaching to the current JVM additionally requires -Djdk.attach.allowAttachSelf=true.)

package com.rickiyang.learn.job;

import com.sun.tools.attach.*;

import java.io.IOException;
import java.util.List;

/**
 * @author rickiyang
 * @date 2019-08-16
 * @Desc
 */
public class TestAgentMain {

    public static void main(String[] args) throws IOException, AttachNotSupportedException, AgentLoadException, AgentInitializationException {
        //Get all running virtual machines in the current system
        System.out.println("running JVM start ");
        List<VirtualMachineDescriptor> list = VirtualMachine.list();
        for (VirtualMachineDescriptor vmd : list) {
            // If the display name matches, this is the target virtual machine: attach to it by its id (the pid),
            // then load agent.jar into it
            System.out.println(vmd.displayName());
            if (vmd.displayName().endsWith("com.rickiyang.learn.job.TestAgentMain")) {
                VirtualMachine virtualMachine = VirtualMachine.attach(vmd.id());
                virtualMachine.loadAgent("/Users/yangyue/Documents/java-agent.jar");
                virtualMachine.detach();
            }
        }
    }

}

The list() method finds all running JVM processes on the current system. You can print vmd.displayName() to see which ones are running. Because the display name of the process is the current class name while its main function is executing, the current process can be found this way.

Note: with a JDK installed on Mac, the VirtualMachine class is found directly, but with a JDK installed on Windows it is not. If you run into this, manually add tools.jar from the JDK's lib directory to the libraries of the current project.

The output of the main method is:


You can see that a socket connection is actually set up to transfer agent.jar. The main method starts first and prints "running JVM start", and then execution enters the transform method of the agent class.

instrument principle

The underlying implementation of instrument relies on JVMTI (JVM Tool Interface), a set of interfaces that the JVM exposes for user extensions. JVMTI is event-driven: whenever the JVM reaches certain points in its execution, it calls back the interfaces registered for those events (if any), and developers can put their own logic there. A JVMTI agent is a dynamic library that uses the interfaces exposed by JVMTI and provides the Agent_OnLoad, Agent_OnAttach, and Agent_OnUnload functions. The instrument agent can be understood as one such JVMTI agent dynamic library. It is also known as JPLISAgent (Java Programming Language Instrumentation Services Agent), an agent dedicated to supporting instrumentation written in the Java language.

Loading the instrument agent at startup:
  1. Create and initialize JPLISAgent;

  2. Listen for VMInit events and do the following after the JVM initialization is complete:

    1. Create InstrumentationImpl objects;

    2. Listen for ClassFileLoadHook events;

    3. Call the loadClassAndCallPremain method of InstrumentationImpl, which calls the premain method of the Premain-Class class specified in MANIFEST.MF in javaagent.

  3. Parse the parameters of MANIFEST.MF file in javaagent, and set some contents in JPLISAgent according to these parameters.

Loading the instrument agent at runtime:

When the target JVM is asked to load an agent through the JVM attach mechanism, the process is as follows:

  1. Create and initialize JPLISAgent;
  2. Parse the parameters in MANIFEST.MF of the javaagent;
  3. Create InstrumentationImpl objects;
  4. Listen for ClassFileLoadHook events;
  5. Call the loadClassAndCallAgentmain method of InstrumentationImpl, which calls the agentmain method of the Agent-Class specified in MANIFEST.MF of the javaagent.

Limitations of Instrumentation

In most cases we use Instrumentation for its bytecode instrumentation ability, or more generally its class redefinition ability, but it has the following limitations:

  1. Both premain and agentmain modify bytecode when class files are loaded, that is, there must be a Class to work on; you cannot define a class that does not exist yet from a bytecode file and a custom class name.
  2. Modifying the bytecode of a class is called class transformation. It ultimately comes down to class redefinition via Instrumentation#redefineClasses(), which has the following restrictions:
    1. The new class must have the same parent class as the old one.
    2. The new class must implement the same interfaces as the old one, in the same number.
    3. The access modifiers of the new and old class must match, as must the number of fields and the field names.
    4. Methods that are added or removed must be private static or private final.
    5. Method bodies can be modified.

In addition to the approach above, if you want to redefine a class you can consider isolation based on class loaders: create a new custom class loader and define the new class from the new bytecode. This has its own limitation, though: the new class can only be called through reflection.
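The class loader approach can be sketched as follows. All names here (IsolatedDefine, Greeter, BytesLoader) are made up for illustration. To stay self-contained, the sketch reuses the bytes of a sample class that is already on the classpath; in practice the bytes would be the new version of the class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;

class IsolatedDefine {

    // Sample class standing in for the class to be redefined.
    public static class Greeter {
        public String greet() { return "hello"; }
    }

    // A loader that defines a class directly from bytes, without asking its parent for that name.
    static class BytesLoader extends ClassLoader {
        BytesLoader(ClassLoader parent) { super(parent); }
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Read the class file bytes of an already compiled class from the classpath.
    static byte[] bytesOf(Class<?> c) {
        String resource = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Define a second, independent copy of Greeter in a fresh loader.
    static Class<?> loadIsolatedCopy() {
        BytesLoader loader = new BytesLoader(IsolatedDefine.class.getClassLoader());
        return loader.define(Greeter.class.getName(), bytesOf(Greeter.class));
    }

    // The copy is a different type from Greeter, so it is called through reflection.
    static String greetReflectively(Class<?> greeterClass) {
        try {
            Object instance = greeterClass.getDeclaredConstructor().newInstance();
            Method greet = greeterClass.getMethod("greet");
            return (String) greet.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> copy = loadIsolatedCopy();
        System.out.println(copy == Greeter.class);   // false
        System.out.println(greetReflectively(copy)); // hello
    }
}
```

Casting the new instance to Greeter would fail with a ClassCastException, because the two classes come from different loaders; this is the reflection-only limitation mentioned above.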

Posted by Azarath on Sat, 17 Aug 2019 00:12:32 -0700