Invokedynamic - Java's Secret Weapon

Keywords: Java jvm Programming Oracle

The earliest work on invokedynamic dates back at least to 2007, and the first successful dynamic call was made on 26 August 2008. This predates Oracle's acquisition of Sun, so the feature has been in development for a long time by most developers' standards.

invokedynamic is notable for being the first new bytecode since Java 1.0. It joins the existing invoke bytecodes invokevirtual, invokestatic, invokeinterface and invokespecial. These four existing opcodes implement all the forms of method dispatch that Java developers are usually familiar with, specifically:

  • invokevirtual - the standard dispatch for instance methods
  • invokestatic - used to dispatch static methods
  • invokeinterface - used to call a method through an interface
  • invokespecial - used when non-virtual (i.e. "exact") dispatch is required

Some developers may wonder why the platform needs all four opcodes, so let's look at a simple example that uses different invoke opcodes to illustrate the difference between them:

package kathik;

import java.util.ArrayList;
import java.util.List;

public class InvokeExamples {
    public static void main(String[] args) {
        InvokeExamples sc = new InvokeExamples();
        sc.run();
    }

    private void run() {
        List<String> ls = new ArrayList<>();
        ls.add("Good Day");

        ArrayList<String> als = new ArrayList<>();
        als.add("Dydh Da");
    }
}

This will generate bytecode, which we can disassemble using the javap tool:

javap -c -p InvokeExamples.class

This produces the following output (the -p flag is needed for javap to show the private run() method):

public class kathik.InvokeExamples {
  public kathik.InvokeExamples();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: new           #2                  // class kathik/InvokeExamples
       3: dup
       4: invokespecial #3                  // Method "<init>":()V
       7: astore_1
       8: aload_1
       9: invokespecial #4                  // Method run:()V
      12: return

  private void run();
    Code:
       0: new           #5                  // class java/util/ArrayList
       3: dup
       4: invokespecial #6                  // Method java/util/ArrayList."<init>":()V
       7: astore_1
       8: aload_1
       9: ldc           #7                  // String Good Day
      11: invokeinterface #8,  2            // InterfaceMethod java/util/List.add:(Ljava/lang/Object;)Z
      16: pop
      17: new           #5                  // class java/util/ArrayList
      20: dup
      21: invokespecial #6                  // Method java/util/ArrayList."<init>":()V
      24: astore_2
      25: aload_2
      26: ldc           #9                  // String Dydh Da
      28: invokevirtual #10                 // Method java/util/ArrayList.add:(Ljava/lang/Object;)Z
      31: pop
      32: return
}

This shows three of the four invoke opcodes (the remaining one, invokestatic, is a trivial extension). First, we can see two calls (at bytes 11 and 28 of the run method):

ls.add("Good Day")

and

als.add("Dydh Da")

These look very similar in Java source code, but they are represented differently in bytecode.

To javac, the static type of the variable ls is List<String>, and List is an interface. The exact location of the add method in the runtime method table (commonly called the "vtable") therefore cannot be determined at compile time. So the source code compiler emits an invokeinterface instruction and defers the actual lookup of the method until runtime, when the real vtable of ls can be inspected and the location of the add method found.

In contrast, the receiver of the call als.add("Dydh Da") is als, whose static type is a class type, ArrayList<String>. This means that the location of the method in the vtable is known at compile time, so javac can emit an invokevirtual instruction for the exact vtable entry. The final choice of method is still made at runtime, because the method may be overridden, but the vtable slot is fixed at compile time.
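The point that an invokevirtual call is still resolved against the runtime class can be seen in a small sketch. ShoutingList is a hypothetical subclass invented for this illustration; it is not part of the original example.

```java
import java.util.ArrayList;
import java.util.Locale;

class VirtualDispatchDemo {
    // Stores s through a variable whose static type is ArrayList<String>.
    static String addThroughArrayListType(String s) {
        ArrayList<String> als = new ShoutingList();
        // javac emits invokevirtual against ArrayList.add here; the JVM
        // still selects ShoutingList's override when the call executes.
        als.add(s);
        return als.get(0);
    }

    public static void main(String[] args) {
        System.out.println(addThroughArrayListType("Dydh Da")); // prints DYDH DA
    }
}

// Hypothetical subclass that overrides add() to upper-case its argument.
class ShoutingList extends ArrayList<String> {
    @Override
    public boolean add(String s) {
        return super.add(s.toUpperCase(Locale.ROOT));
    }
}
```

The compile-time type of als only decides which opcode and which method slot are used; the object's actual class decides which implementation runs.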

Beyond that, the example also shows two of the possible use cases for invokespecial. This opcode is used when the dispatch target must be determined exactly, without a virtual lookup, and in particular when overriding is neither desired nor possible. The two cases demonstrated here are private methods and superclass calls (the constructor of Object); such methods are known at compile time and cannot be overridden.

Astute readers will notice that every call to a Java method is compiled into one of these four opcodes, which raises the question: what is invokedynamic for, and why is it useful to Java developers?

The main goal of the feature was to create a bytecode that handles a new kind of method dispatch, one that essentially lets application-level code decide which method a call will execute, and to do so only when the call is about to execute. This enables language and framework authors to support much more dynamic programming styles than the Java platform previously offered.

The intent is that user code determines dispatch at runtime using an API of methods, without suffering the performance penalties and security problems associated with reflection. In fact, the stated goal is for invokedynamic to be as fast as regular method dispatch (invokevirtual) once the feature has matured.

When Java 7 arrived, the JVM added support for executing the new bytecode. However, no matter what Java code it was given, javac would never produce bytecode containing invokedynamic. Instead, the feature was used only to support JRuby and other dynamic languages running on the JVM.

This changed in Java 8, where javac generates invokedynamic behind the scenes to implement lambda expressions and method references, and where it serves as the primary dispatch mechanism for Nashorn. However, Java application developers still have no direct way to perform fully dynamic method resolution: the Java language has no keyword or library for creating general-purpose invokedynamic call sites. As a result, despite its power, the mechanism remains obscure to most Java developers. Let's see how to use it in our code.

Introduction to Method Handles

The key concept that makes invokedynamic work is the method handle. This is a way of representing the method that should be called from an invokedynamic call site. The general idea is that each invokedynamic instruction is associated with a special method known as a bootstrap method, or BSM. When the interpreter reaches the invokedynamic instruction, the BSM is called, and it returns an object (containing a method handle) that indicates which method the call site should actually execute.

This is somewhat similar to reflection, but reflection has limitations that make it unsuitable for use with invokedynamic. Instead, java.lang.invoke.MethodHandle (and its subclasses) were added to the Java 7 API to represent the methods that invokedynamic can target. The MethodHandle class receives some special treatment from the JVM in order to operate correctly.

Method handles can be thought of as a safe, modern way of doing core reflection, with as much type safety as possible. They are necessary for invokedynamic, but they can also be used on their own.
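To make the comparison with reflection concrete, here is a minimal sketch that calls String.concat() both ways. The class and method names (ReflectionVsHandle, viaReflection, viaMethodHandle) are invented for this illustration, and the lookup calls it uses are explained in the sections that follow.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class ReflectionVsHandle {
    // Core reflection: arguments travel as Object and the result needs a cast.
    static String viaReflection(String a, String b) {
        try {
            Method m = String.class.getMethod("concat", String.class);
            return (String) m.invoke(a, b);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Method handle: the signature is an explicit MethodType, and
    // invokeExact() requires the call to match that type exactly.
    static String viaMethodHandle(String a, String b) {
        try {
            MethodType mt = MethodType.methodType(String.class, String.class);
            MethodHandle mh = MethodHandles.lookup().findVirtual(String.class, "concat", mt);
            return (String) mh.invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaReflection("Good ", "Day"));
        System.out.println(viaMethodHandle("Dydh ", "Da"));
    }
}
```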

Method Types

A Java method can be considered to consist of four basic parts:

  • Name
  • Signature (including return type)
  • The class where it is defined
  • The bytecode that implements the method

This means that if we want to refer to methods, we need a way to represent method signatures efficiently (rather than the horrible Class<?>[] hacks that reflection is forced to use).

In other words, the first building block needed for method handles is a way of representing the signature of the method to be looked up. In the method handles API introduced in Java 7, this role is filled by the java.lang.invoke.MethodType class, which uses immutable instances to represent signatures. To obtain a MethodType, use the methodType() factory method. This is a variadic method that takes Class objects as arguments.

The first parameter is the class object corresponding to the return type of the signature. The remaining parameters are class objects corresponding to the type of method parameters in the signature. For example:

// Signature of toString()
MethodType mtToString = MethodType.methodType(String.class);

// Signature of a setter method
MethodType mtSetter = MethodType.methodType(void.class, Object.class);

// Signature of compare() from Comparator<String>
MethodType mtStringComparator = MethodType.methodType(int.class, String.class, String.class);
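As a small sketch of what these objects hold, the following prints the JVM descriptor for the comparator signature and shows that deriving a new type leaves the original untouched. The class and method names here (MethodTypeDemo and its helpers) are invented for the illustration.

```java
import java.lang.invoke.MethodType;

class MethodTypeDemo {
    // The same signature as compare() on Comparator<String>, rendered
    // as the descriptor string that appears in class files.
    static String comparatorDescriptor() {
        return MethodType.methodType(int.class, String.class, String.class)
                .toMethodDescriptorString();
    }

    // MethodType instances are immutable: changeReturnType() yields a
    // different MethodType and does not modify the receiver.
    static boolean originalIsUnchanged() {
        MethodType getter = MethodType.methodType(String.class);
        MethodType widened = getter.changeReturnType(Object.class);
        return getter.returnType() == String.class
                && widened.returnType() == Object.class
                && !getter.equals(widened);
    }

    public static void main(String[] args) {
        System.out.println(comparatorDescriptor()); // prints (Ljava/lang/String;Ljava/lang/String;)I
        System.out.println(originalIsUnchanged());  // prints true
    }
}
```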

Now that we have a MethodType, we can use it, together with the name of the method and the class that defines it, to look up a method handle. To do this, we call the static MethodHandles.lookup() method. This gives us a "lookup context" based on the access rights of the currently executing method (that is, the method that called lookup()).

The lookup context object has a number of methods whose names begin with "find", such as findVirtual(), findConstructor() and findStatic(). These methods return the actual method handle, but only if the lookup context was created in a method that could access (call) the requested method. Unlike reflection, there is no way to subvert this access control. In other words, method handles have no equivalent of the setAccessible() method. For example:

public MethodHandle getToStringMH() {
    MethodHandle mh = null;
    MethodType mt = MethodType.methodType(String.class);
    MethodHandles.Lookup lk = MethodHandles.lookup();

    try {
        mh = lk.findVirtual(getClass(), "toString", mt);
    } catch (NoSuchMethodException | IllegalAccessException mhx) {
        throw (AssertionError)new AssertionError().initCause(mhx);
    }

    return mh;
}
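The access check can be observed directly. In the following sketch, Vault and AccessDemo are hypothetical classes made up for the illustration: a lookup context created in AccessDemo can find Vault's public method, but asking for the private one fails with IllegalAccessException.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class AccessDemo {
    // Reports whether a lookup context created in AccessDemo is allowed
    // to obtain a handle for the named static method of Vault.
    static boolean canFind(String name) {
        try {
            MethodHandles.lookup()
                    .findStatic(Vault.class, name, MethodType.methodType(String.class));
            return true;
        } catch (NoSuchMethodException | IllegalAccessException e) {
            // A private method of another class is rejected here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canFind("open"));   // prints true
        System.out.println(canFind("secret")); // prints false
    }
}

// Hypothetical class with one accessible and one private static method.
class Vault {
    public static String open() { return "visible"; }
    private static String secret() { return "hidden"; }
}
```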

MethodHandle has two methods that can be used to call a method handle: invoke() and invokeExact(). Both take the receiver and the call arguments as parameters, so their signatures are:

public final Object invoke(Object... args) throws Throwable;
public final Object invokeExact(Object... args) throws Throwable;

The difference between the two is that invokeExact() tries to call the method handle directly with exactly the arguments provided. invoke(), on the other hand, can alter the method arguments slightly if needed. invoke() performs an asType() conversion, which can convert arguments according to this set of rules:

  • Primitives will be boxed if necessary
  • Boxed primitives will be unboxed if necessary
  • Primitives will be widened if necessary
  • A void return value is converted to 0 where a primitive return type is expected, or to null where a reference type is expected
  • Null values are assumed to be correct and are passed through regardless of the static type

Let's look at a simple invocation example that takes these rules into account:

Object rcvr = "a";
try {
    MethodType mt = MethodType.methodType(int.class);
    MethodHandles.Lookup l = MethodHandles.lookup();
    MethodHandle mh = l.findVirtual(rcvr.getClass(), "hashCode", mt);

    int ret;
    try {
        ret = (int)mh.invoke(rcvr);
        System.out.println(ret);
    } catch (Throwable t) {
        t.printStackTrace();
    }
} catch (IllegalArgumentException | NoSuchMethodException | SecurityException e) {
    e.printStackTrace();
} catch (IllegalAccessException x) {
    x.printStackTrace();
}
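The difference between the two invocation styles can also be shown side by side. The following is a minimal sketch, with class and method names (InvokeVsInvokeExact, lenient, exactRejects) invented for the illustration: a handle of type (String)int is called with a loosely typed argument, which invoke() adapts and invokeExact() rejects.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.WrongMethodTypeException;

class InvokeVsInvokeExact {
    // Handle for String.length(); its type is (String)int.
    static MethodHandle lengthHandle() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // invoke() adapts the call: the Object argument is cast to String
    // and the int result is boxed to Integer.
    static Object lenient(Object s) {
        try {
            return lengthHandle().invoke(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // The same loosely typed call through invokeExact() has call-site
    // type (Object)Object, which does not match (String)int.
    static boolean exactRejects(Object s) {
        try {
            Object unused = lengthHandle().invokeExact(s);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lenient("hello"));      // prints 5
        System.out.println(exactRejects("hello")); // prints true
    }
}
```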

In more complex examples, method handles can provide a much clearer way to perform the same dynamic programming tasks as core reflection. Beyond that, method handles were designed from the start to work better with the JVM's low-level execution model, and they potentially offer better performance (although the performance story is still evolving).

Method Handles and Invokedynamic

Invokedynamic uses method handles through the bootstrap method mechanism. Unlike invokevirtual, an invokedynamic instruction has no receiver object. Instead, it behaves like invokestatic and uses a BSM to return an object of type CallSite. This object contains a method handle (called the "target") that represents the method to be executed as the result of the invokedynamic instruction.

When a class containing invokedynamic is loaded, its call sites are said to be in an "unlinked" state; after the BSM returns, the resulting CallSite and method handle are said to be "linked" to the call site.

The signature of BSM is as follows (note that BSM can have any name):

static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type);
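The sequence of events can be simulated by hand in plain Java. The sketch below does not contain an invokedynamic instruction; it only imitates what the JVM does at link time by calling a BSM with this signature and then invoking the target of the returned CallSite. The names ManualBootstrap, greet and callThroughSite are invented for the illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ManualBootstrap {
    // The method that the call site will end up executing.
    static String greet(String name) {
        return "Hello, " + name;
    }

    // A BSM with the standard signature: it looks up a static method with
    // the requested name and type and binds it permanently to the site.
    static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type)
            throws NoSuchMethodException, IllegalAccessException {
        MethodHandle target = caller.findStatic(ManualBootstrap.class, name, type);
        return new ConstantCallSite(target);
    }

    // Plays the role of the JVM: runs the BSM once, then calls the target.
    static String callThroughSite(String arg) {
        try {
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "greet", type);
            return (String) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughSite("World")); // prints Hello, World
    }
}
```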

If you want to create code that actually contains invokedynamic, you need to use a bytecode manipulation library (because the Java language does not contain the constructs required). In the rest of this article, we will use the ASM library to generate bytecode containing invokedynamic instructions. From the point of view of a Java application, the results appear as regular class files (although they of course have no Java source code representation). Java code treats them as "black boxes", but we can still call their methods and take advantage of invokedynamic and related features.

Let's look at an ASM-based class that uses invokedynamic to create a "Hello World".

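package kathik;

import java.io.File;
import java.io.FileOutputStream;
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.Handle;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import static org.objectweb.asm.Opcodes.*;

public class InvokeDynamicCreator {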

    public static void main(final String[] args) throws Exception {
        final String outputClassName = "kathik/Dynamic";
        try (FileOutputStream fos
                = new FileOutputStream(new File("target/classes/" + outputClassName + ".class"))) {
            fos.write(dump(outputClassName, "bootstrap", "()V"));
        }
    }

    public static byte[] dump(String outputClassName, String bsmName, String targetMethodDescriptor)
            throws Exception {
        final ClassWriter cw = new ClassWriter(0);
        MethodVisitor mv;

        // Setup the basic metadata for the bootstrap class
        cw.visit(V1_7, ACC_PUBLIC + ACC_SUPER, outputClassName, null, "java/lang/Object", null);

        // Create a standard void constructor
        mv = cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null);
        mv.visitCode();
        mv.visitVarInsn(ALOAD, 0);
        mv.visitMethodInsn(INVOKESPECIAL, "java/lang/Object", "<init>", "()V");
        mv.visitInsn(RETURN);
        mv.visitMaxs(1, 1);
        mv.visitEnd();

        // Create a standard main method
        mv = cw.visitMethod(ACC_PUBLIC + ACC_STATIC, "main", "([Ljava/lang/String;)V", null, null);
        mv.visitCode();
        MethodType mt = MethodType.methodType(CallSite.class, MethodHandles.Lookup.class, String.class,
                MethodType.class);
        Handle bootstrap = new Handle(Opcodes.H_INVOKESTATIC, "kathik/InvokeDynamicCreator", bsmName,
                mt.toMethodDescriptorString());
        mv.visitInvokeDynamicInsn("runDynamic", targetMethodDescriptor, bootstrap);
        mv.visitInsn(RETURN);
        mv.visitMaxs(0, 1);
        mv.visitEnd();

        cw.visitEnd();

        return cw.toByteArray();
    }

    private static void targetMethod() {
        System.out.println("Hello World!");
    }

    public static CallSite bootstrap(MethodHandles.Lookup caller, String name, MethodType type) throws NoSuchMethodException, IllegalAccessException {
        final MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Need to use lookupClass() as this method is static
        final Class<?> currentClass = lookup.lookupClass();
        final MethodType targetSignature = MethodType.methodType(void.class);
        final MethodHandle targetMH = lookup.findStatic(currentClass, "targetMethod", targetSignature);
        return new ConstantCallSite(targetMH.asType(type));
    }
}

The code is divided into two parts. The first part uses the ASM Visitor API to create a class file named kathik.Dynamic. Note the key call to visitInvokeDynamicInsn(). The second part contains the target method that will be bound to the call site, and the BSM that the invokedynamic instruction needs.

Note that these methods live in the InvokeDynamicCreator class and are not part of our generated class, kathik.Dynamic. This means that at runtime, InvokeDynamicCreator must be on the classpath alongside kathik.Dynamic, otherwise the methods will not be found.

When running InvokeDynamicCreator, it will create a new class file, Dynamic.class, which contains an invokedynamic instruction, as we can see using javap on the class:

public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=0, locals=1, args_size=1
         0: invokedynamic #20,  0             // InvokeDynamic #0:runDynamic:()V
         5: return

This example shows the simplest invokedynamic case, which uses the special case of a constant CallSite object (a ConstantCallSite). This means that the BSM (and the lookup) is executed only once, so subsequent calls are fast.

However, more sophisticated uses of invokedynamic can quickly become complex, especially when the call site and target method can change over the lifetime of the program.
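As a brief taste of a call site whose target changes, here is a minimal sketch using java.lang.invoke.MutableCallSite directly from Java, without any generated bytecode. The class and method names (RetargetDemo, first, second, beforeAndAfter) are invented for the illustration, and it runs on a single thread, so it sidesteps the visibility questions that arise when several threads share a call site.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class RetargetDemo {
    static String first() { return "first"; }
    static String second() { return "second"; }

    // Calls through the same invoker twice, swapping the target in between.
    static String[] beforeAndAfter() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class);

            MutableCallSite site =
                    new MutableCallSite(lookup.findStatic(RetargetDemo.class, "first", type));
            // The dynamic invoker always delegates to the site's current target.
            MethodHandle invoker = site.dynamicInvoker();

            String before = (String) invoker.invokeExact();
            site.setTarget(lookup.findStatic(RetargetDemo.class, "second", type));
            String after = (String) invoker.invokeExact();

            return new String[] { before, after };
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] results = beforeAndAfter();
        System.out.println(results[0] + " then " + results[1]); // prints first then second
    }
}
```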

In the next article, we'll look at some more advanced use cases and build some examples, as well as more in-depth details of invokedynamic.

Posted by Koobazaur on Mon, 07 Oct 2019 12:42:45 -0700