Dragon-slaying skills in Java: how to modify the syntax tree

Author: don't learn countless programmers

Source: https://my.oschina.net/u/4030990/blog/3211858

There is little API documentation online on how to modify Java's abstract syntax tree (AST), so this article records the key points for later reference.

Introduction to JCTree

JCTree is the base class of all syntax tree nodes. It contains an important field, pos, which records the position of the node in the source file. For this reason we should not create syntax tree nodes directly with the new keyword: a node created that way carries no meaningful position.

In addition, JCTree uses the visitor pattern to decouple the data structure from the operations performed on it. Part of the source code is as follows:

public abstract class JCTree implements Tree, Cloneable, DiagnosticPosition {

    public int pos = -1;

    ...

    public abstract void accept(JCTree.Visitor visitor);

    ...
}

We can see that JCTree is an abstract class. Here we focus on several of its subclasses:

  1. JCStatement: statement syntax tree node. Common subclasses are as follows

    • JCBlock: statement block syntax tree node
    • JCReturn: return statement syntax tree node
    • JCClassDecl: class definition syntax tree node
    • JCVariableDecl: field / variable definition syntax tree node
  2. JCMethodDecl: method definition syntax tree node

  3. JCModifiers: access flag syntax tree node

  4. JCExpression: expression syntax tree node. Common subclasses are as follows

    • JCAssign: assignment expression syntax tree node
    • JCIdent: identifier syntax tree node, which can be variable, type, keyword, etc
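
To make the visitor pattern mentioned above concrete, here is a minimal sketch. It is an assumption-laden illustration rather than the article's own code: it uses the public Compiler Tree API (com.sun.source, whose interfaces JCTree implements) instead of JCTree.Visitor, because the internal classes usually need extra compiler flags on recent JDKs. The class name MethodNameCollector and its method are hypothetical.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects method names by visiting a parsed syntax tree.
final class MethodNameCollector {

    /** Parses the given source text and returns the names of the declared methods. */
    static List<String> methodNames(String className, String source) {
        // Requires a JDK; on a plain JRE getSystemJavaCompiler() returns null.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));

        List<String> names = new ArrayList<>();
        // The scanner carries the operation; the tree nodes only carry data.
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            @Override
            public Void visitMethod(MethodTree node, Void unused) {
                names.add(node.getName().toString());
                return super.visitMethod(node, unused);
            }
        };
        try {
            for (CompilationUnitTree unit : task.parse()) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Calling MethodNameCollector.methodNames("A", "class A { void foo() {} }") parses the text and visits only the method nodes; adding another operation means writing another scanner, not touching the node classes.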

Introduction to TreeMaker

TreeMaker is used to create syntax tree nodes. As mentioned above, we should not create JCTree nodes directly with the new keyword. Instead, javac provides TreeMaker, which sets the pos field on every JCTree object it creates, so we must use a TreeMaker bound to the current compilation context to create syntax tree nodes.

For the full API, refer to the TreeMaker API documentation. The rest of this section covers the most commonly used methods.

TreeMaker.Modifiers

The TreeMaker.Modifiers method is used to create access flag syntax tree nodes (JCModifiers). The source code is as follows

public JCModifiers Modifiers(long flags) {
    return Modifiers(flags, List.< JCAnnotation >nil());
}

public JCModifiers Modifiers(long flags,
    List<JCAnnotation> annotations) {
        JCModifiers tree = new JCModifiers(flags, annotations);
        boolean noFlags = (flags & (Flags.ModifierFlags | Flags.ANNOTATION)) == 0;
        tree.pos = (noFlags && annotations.isEmpty()) ? Position.NOPOS : pos;
        return tree;
}

  1. flags: access flags
  2. annotations: list of annotations

The flags are constants defined in the class com.sun.tools.javac.code.Flags and can be combined. For example, the following call generates the access flags shown beneath it.

treeMaker.Modifiers(Flags.PUBLIC | Flags.STATIC | Flags.FINAL);

public static final
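
The flags can be combined because each one is a separate bit. The sketch below illustrates the idea with java.lang.reflect.Modifier from the standard library as a stand-in; to my knowledge javac's Flags uses the same bit values for PUBLIC, STATIC and FINAL, but treat that as an assumption. The class ModifierFlagsSketch is hypothetical.

```java
import java.lang.reflect.Modifier;

// Hypothetical illustration of bit-mask flags, using java.lang.reflect.Modifier
// as a stand-in for com.sun.tools.javac.code.Flags.
final class ModifierFlagsSketch {

    /** Combines three single-bit flags with bitwise OR. */
    static int publicStaticFinal() {
        return Modifier.PUBLIC | Modifier.STATIC | Modifier.FINAL;
    }

    /** Renders a flag set as source-level modifiers. */
    static String describe(int flags) {
        return Modifier.toString(flags);
    }

    /** Tests whether a particular flag bit is set. */
    static boolean has(int flags, int flag) {
        return (flags & flag) != 0;
    }
}
```

Bitwise OR is preferable to + here: adding the same flag twice would carry into a different bit, whereas OR-ing it twice is harmless.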

TreeMaker.ClassDef

TreeMaker.ClassDef is used to create a class definition syntax tree node (JCClassDecl). The source code is as follows:

public JCClassDecl ClassDef(JCModifiers mods,
    Name name,
    List<JCTypeParameter> typarams,
    JCExpression extending,
    List<JCExpression> implementing,
    List<JCTree> defs) {
        JCClassDecl tree = new JCClassDecl(mods,
                                     name,
                                     typarams,
                                     extending,
                                     implementing,
                                     defs,
                                     null);
        tree.pos = pos;
        return tree;
}

  1. mods: access flag, which can be created through TreeMaker.Modifiers
  2. name: class name
  3. typarams: generic parameter list
  4. extending: parent class
  5. implementing: interface implemented
  6. defs: detailed statements of class definitions, including definitions of fields and methods

TreeMaker.MethodDef

TreeMaker.MethodDef is used to create a method definition syntax tree node (JCMethodDecl). The source code is as follows

public JCMethodDecl MethodDef(JCModifiers mods,
    Name name,
    JCExpression restype,
    List<JCTypeParameter> typarams,
    List<JCVariableDecl> params,
    List<JCExpression> thrown,
    JCBlock body,
    JCExpression defaultValue) {
        JCMethodDecl tree = new JCMethodDecl(mods,
                                       name,
                                       restype,
                                       typarams,
                                       params,
                                       thrown,
                                       body,
                                       defaultValue,
                                       null);
        tree.pos = pos;
        return tree;
}

public JCMethodDecl MethodDef(MethodSymbol m,
    Type mtype,
    JCBlock body) {
        return (JCMethodDecl)
            new JCMethodDecl(
                Modifiers(m.flags(), Annotations(m.getAnnotationMirrors())),
                m.name,
                Type(mtype.getReturnType()),
                TypeParams(mtype.getTypeArguments()),
                Params(mtype.getParameterTypes(), m),
                Types(mtype.getThrownTypes()),
                body,
                null,
                m).setPos(pos).setType(mtype);
}

  1. mods: access flag
  2. name: method name
  3. restype: return type
  4. typarams: generic parameter list
  5. params: parameter list
  6. thrown: exception declaration list
  7. body: method body
  8. defaultValue: default value of an annotation element (the value after default in an annotation type); null for ordinary methods
  9. m: method symbol
  10. mtype: method type. It bundles several types: the generic parameter types, the method parameter types, the thrown exception types and the return type.

To declare a void return type, pass treeMaker.TypeIdent(TypeTag.VOID) as restype. A null restype is also treated as void; javac itself uses null for constructors.

TreeMaker.VarDef

TreeMaker.VarDef is used to create a field / variable definition syntax tree node (JCVariableDecl). The source code is as follows

public JCVariableDecl VarDef(JCModifiers mods,
    Name name,
    JCExpression vartype,
    JCExpression init) {
        JCVariableDecl tree = new JCVariableDecl(mods, name, vartype, init, null);
        tree.pos = pos;
        return tree;
}

public JCVariableDecl VarDef(VarSymbol v,
    JCExpression init) {
        return (JCVariableDecl)
            new JCVariableDecl(
                Modifiers(v.flags(), Annotations(v.getAnnotationMirrors())),
                v.name,
                Type(v.type),
                init,
                v).setPos(pos).setType(v.type);
}

  1. mods: access flag
  2. name: variable name
  3. vartype: type
  4. init: initialization expression
  5. v: variable symbol

TreeMaker.Ident

TreeMaker.Ident is used to create an identifier syntax tree node (JCIdent). The source code is as follows

public JCIdent Ident(Name name) {
        JCIdent tree = new JCIdent(name, null);
        tree.pos = pos;
        return tree;
}

public JCIdent Ident(Symbol sym) {
        return (JCIdent)new JCIdent((sym.name != names.empty)
                                ? sym.name
                                : sym.flatName(), sym)
            .setPos(pos)
            .setType(sym.type);
}

public JCExpression Ident(JCVariableDecl param) {
        return Ident(param.sym);
}

TreeMaker.Return

TreeMaker.Return is used to create a return statement (JCReturn). The source code is as follows

public JCReturn Return(JCExpression expr) {
        JCReturn tree = new JCReturn(expr);
        tree.pos = pos;
        return tree;
}

TreeMaker.Select

TreeMaker.Select is used to create a field access / method access syntax tree node (JCFieldAccess). For a method, Select only produces the name being accessed; the call itself requires TreeMaker.Apply. The source code is as follows

public JCFieldAccess Select(JCExpression selected,
    Name selector) 
{
        JCFieldAccess tree = new JCFieldAccess(selected, selector, null);
        tree.pos = pos;
        return tree;
}

public JCExpression Select(JCExpression base,
    Symbol sym) {
        return new JCFieldAccess(base, sym.name, sym).setPos(pos).setType(sym.type);
}

  1. selected: the expression to the left of the . operator
  2. selector: the name to the right of the . operator

For example, the call

treeMaker.Select(treeMaker.Ident(names.fromString("this")), names.fromString("name"));

generates the expression

this.name

TreeMaker.NewClass

TreeMaker.NewClass is used to create a new statement syntax tree node (JCNewClass). The source code is as follows:

public JCNewClass NewClass(JCExpression encl,
    List<JCExpression> typeargs,
    JCExpression clazz,
    List<JCExpression> args,
    JCClassDecl def) {
        JCNewClass tree = new JCNewClass(encl, typeargs, clazz, args, def);
        tree.pos = pos;
        return tree;
}

  1. encl: the enclosing instance for an inner class creation such as outer.new Inner(); null in the common case, which is why many examples pass null
  2. typeargs: type argument list
  3. clazz: type of the object to be created
  4. args: argument list
  5. def: class body of an anonymous class; null otherwise

TreeMaker.Apply

TreeMaker.Apply is used to create a method call syntax tree node (JCMethodInvocation). The source code is as follows:

public JCMethodInvocation Apply(List<JCExpression> typeargs,
    JCExpression fn,
    List<JCExpression> args) {
        JCMethodInvocation tree = new JCMethodInvocation(typeargs, fn, args);
        tree.pos = pos;
        return tree;
}

  1. typeargs: type argument list
  2. fn: the method being called, usually a JCIdent or a JCFieldAccess created with TreeMaker.Select
  3. args: argument list

TreeMaker.Assign

TreeMaker.Assign is used to create an assignment syntax tree node (JCAssign). The source code is as follows:

public JCAssign Assign(JCExpression lhs,
    JCExpression rhs) {
        JCAssign tree = new JCAssign(lhs, rhs);
        tree.pos = pos;
        return tree;
}

  1. lhs: left expression of assignment statement
  2. rhs: expression on the right of assignment statement

TreeMaker.Exec

TreeMaker.Exec is used to create an executable statement syntax tree node (JCExpressionStatement). The source code is as follows:

public JCExpressionStatement Exec(JCExpression expr) {
        JCExpressionStatement tree = new JCExpressionStatement(expr);
        tree.pos = pos;
        return tree;
}

The expressions created by TreeMaker.Apply and TreeMaker.Assign must be wrapped in TreeMaker.Exec to obtain a JCExpressionStatement before they can be used as statements.
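
One way to see why the Exec wrapper is needed is to look at what javac's own parser produces for such statements. The following sketch is only an illustration under assumptions: it uses the public Compiler Tree API rather than TreeMaker, and the class ExpressionStatementSketch is hypothetical. It reports, for every expression statement, the kind of the statement node and the kind of the expression inside it.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ExpressionStatementTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: shows that assignments and calls sit inside a statement wrapper.
final class ExpressionStatementSketch {

    static List<String> statementShapes(String className, String source) {
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        // Requires a JDK; getSystemJavaCompiler() returns null on a plain JRE.
        JavacTask task = (JavacTask) ToolProvider.getSystemJavaCompiler()
                .getTask(null, null, null, null, null, List.of(file));

        List<String> shapes = new ArrayList<>();
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            @Override
            public Void visitExpressionStatement(ExpressionStatementTree node, Void unused) {
                // Outer node is the statement wrapper, inner node is the expression.
                shapes.add(node.getKind() + " -> " + node.getExpression().getKind());
                return super.visitExpressionStatement(node, unused);
            }
        };
        try {
            for (CompilationUnitTree unit : task.parse()) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return shapes;
    }
}
```

For a method body such as a = 1; g(); the parser wraps both the assignment and the call in an expression statement node, which is the same shape that Exec(Assign(...)) and Exec(Apply(...)) build by hand.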

TreeMaker.Block

TreeMaker.Block is used to create a statement block syntax tree node (JCBlock). The source code is as follows:

public JCBlock Block(long flags,
    List<JCStatement> stats) {
        JCBlock tree = new JCBlock(flags, stats);
        tree.pos = pos;
        return tree;
}

  1. flags: block flags; 0 for an ordinary block, Flags.STATIC for a static initializer block
  2. stats: statement list

Introduction to com.sun.tools.javac.util.List

When operating on the abstract syntax tree we often have to work with lists, but the List involved is not the familiar java.util.List; it is com.sun.tools.javac.util.List. This List is unusual: it is a linked structure made of a head and a tail, where head is the first element and tail is itself a List holding the remaining elements. Here is the part we need to know.

public class List<A> extends AbstractCollection<A> implements java.util.List<A> {
    public A head;
    public List<A> tail;
    private static final List<?> EMPTY_LIST = new List<Object>((Object)null, (List)null) {
        public List<Object> setTail(List<Object> var1) {
            throw new UnsupportedOperationException();
        }

        public boolean isEmpty() {
            return true;
        }
    };

    List(A head, List<A> tail) {
        this.tail = tail;
        this.head = head;
    }

    public static <A> List<A> nil() {
        return EMPTY_LIST;
    }

    public List<A> prepend(A var1) {
        return new List(var1, this);
    }

    public List<A> append(A var1) {
        return of(var1).prependList(this);
    }

    public static <A> List<A> of(A var0) {
        return new List(var0, nil());
    }

    public static <A> List<A> of(A var0, A var1) {
        return new List(var0, of(var1));
    }

    public static <A> List<A> of(A var0, A var1, A var2) {
        return new List(var0, of(var1, var2));
    }

    public static <A> List<A> of(A var0, A var1, A var2, A... var3) {
        return new List(var0, new List(var1, new List(var2, from(var3))));
    }

    ...
}
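
The practical consequence of this head/tail design is that prepend and append return a new list instead of changing the receiver. The sketch below is a simplified, hypothetical stand-in (ConsList is not the real javac class) that mirrors only the parts shown above.

```java
import java.util.ArrayList;

// Hypothetical stand-in for com.sun.tools.javac.util.List; not the real class.
final class ConsList<A> {
    final A head;            // first element (null for the empty list)
    final ConsList<A> tail;  // remaining elements (null for the empty list)

    private static final ConsList<Object> EMPTY = new ConsList<>(null, null);

    private ConsList(A head, ConsList<A> tail) {
        this.head = head;
        this.tail = tail;
    }

    @SuppressWarnings("unchecked")
    static <A> ConsList<A> nil() {
        return (ConsList<A>) EMPTY;
    }

    boolean isEmpty() {
        return tail == null;
    }

    /** O(1): allocates one cell and shares the receiver as its tail. */
    ConsList<A> prepend(A x) {
        return new ConsList<>(x, this);
    }

    /** O(n): rebuilds every cell; the receiver is left unchanged. */
    ConsList<A> append(A x) {
        return isEmpty()
                ? ConsList.<A>nil().prepend(x)
                : tail.append(x).prepend(head);
    }

    /** Copies the elements into a java.util.List for inspection. */
    java.util.List<A> toJavaList() {
        java.util.List<A> out = new ArrayList<>();
        for (ConsList<A> l = this; !l.isEmpty(); l = l.tail) {
            out.add(l.head);
        }
        return out;
    }
}
```

With the real javac List the same rule applies: the result of prepend or append has to be assigned back (for example to the defs field of a class node), otherwise the change is lost.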

com.sun.tools.javac.util.ListBuffer

com.sun.tools.javac.util.List is inconvenient to use directly, so javac wraps it in another class, ListBuffer. Working with ListBuffer feels much like working with the java.util.List we are used to.

public class ListBuffer<A> extends AbstractQueue<A> {

    public static <T> ListBuffer<T> of(T x) {
        ListBuffer<T> lb = new ListBuffer<T>();
        lb.add(x);
        return lb;
    }

    /** The list of elements of this buffer.
     */
    private List<A> elems;

    /** A pointer pointing to the last element of 'elems' containing data,
     *  or null if the list is empty.
     */
    private List<A> last;

    /** The number of element in this buffer.
     */
    private int count;

    /** Has a list been created from this buffer yet?
     */
    private boolean shared;

    /** Create a new initially empty list buffer.
     */
    public ListBuffer() {
        clear();
    }

    /** Append an element to buffer.
     */
    public ListBuffer<A> append(A x) {
        x.getClass(); // null check
        if (shared) copy();
        List<A> newLast = List.<A>of(x);
        if (last != null) {
            last.tail = newLast;
            last = newLast;
        } else {
            elems = last = newLast;
        }
        count++;
        return this;
    }
    ...
}
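
The append method above is cheap because the buffer keeps a pointer to its last cell. Here is a minimal sketch of that idea, assuming a simplified design: BufferSketch is a hypothetical class, and it leaves out the shared flag and the copy step of the real ListBuffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for ListBuffer; omits the real class's copy-on-share logic.
final class BufferSketch<A> {

    private static final class Cell<A> {
        final A head;
        Cell<A> tail;

        Cell(A head) {
            this.head = head;
        }
    }

    private Cell<A> first; // start of the chain
    private Cell<A> last;  // last cell holding data, or null if empty
    private int count;

    /** O(1): links a new cell after the current last cell. */
    BufferSketch<A> append(A x) {
        Objects.requireNonNull(x);
        Cell<A> cell = new Cell<>(x);
        if (last != null) {
            last.tail = cell;
        } else {
            first = cell;
        }
        last = cell;
        count++;
        return this;
    }

    int size() {
        return count;
    }

    /** Copies the buffered elements into a java.util.List in insertion order. */
    List<A> toList() {
        List<A> out = new ArrayList<>(count);
        for (Cell<A> c = first; c != null; c = c.tail) {
            out.add(c.head);
        }
        return out;
    }
}
```

This is the usual pattern in the examples later in the article: collect statements in a buffer with append, then call toList() once the sequence is complete.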

Introduction to com.sun.tools.javac.util.Names

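This is a utility class for creating names. The names of classes, methods and parameters all need to be created through it. Its most frequently used method is fromString(). Names has no no-argument constructor; an instance is obtained from the compiler Context, as follows.

// context is the javac Context, e.g. ((JavacProcessingEnvironment) processingEnv).getContext()
Names names = Names.instance(context);
Name setName = names.fromString("setName");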

Hands-on practice

We have now seen how to operate on the abstract syntax tree. Next, we will work through several real cases to deepen our understanding.

Working with variables

Variables are what we operate on most often in a class, so how do we use the abstract syntax tree to work with them? The following examples show some common operations on variables.

Generating variables

For example, to generate a field such as private String age;, use the VarDef method described above.

// Generate a field, for example: private String age;
treeMaker.VarDef(treeMaker.Modifiers(Flags.PRIVATE), names.fromString("age"), treeMaker.Ident(names.fromString("String")), null);

Initializing a variable

For example, to generate private String name = "BuXueWuShu";, we again use the VarDef method, this time passing an initializer.

// private String name = "BuXueWuShu";
treeMaker.VarDef(treeMaker.Modifiers(Flags.PRIVATE), names.fromString("name"), treeMaker.Ident(names.fromString("String")), treeMaker.Literal("BuXueWuShu"));

Concatenating two literals

For example, to generate the assignment add = "a" + "b";, combine the Exec and Assign methods described above with Binary.

// add = "a" + "b";
treeMaker.Exec(treeMaker.Assign(treeMaker.Ident(names.fromString("add")), treeMaker.Binary(JCTree.Tag.PLUS, treeMaker.Literal("a"), treeMaker.Literal("b"))));

The += operator

For example, to generate add += "test";, the approach is similar to the literal example above, using Assignop.

// add += "test";
treeMaker.Exec(treeMaker.Assignop(JCTree.Tag.PLUS_ASG, treeMaker.Ident(names.fromString("add")), treeMaker.Literal("test")));

The ++ operator

For example, to generate ++i;

// ++i;
treeMaker.Exec(treeMaker.Unary(JCTree.Tag.PREINC, treeMaker.Ident(names.fromString("i"))));

Working with methods

Once we can work with variables, we usually need to generate methods as well. How do we generate and manipulate them? The following examples demonstrate.

No parameters, no return value

We can use the MethodDef method described above.

/*
    Generate a method with no parameters and no return value
    public void test(){

    }
 */
// Define method body
ListBuffer<JCTree.JCStatement> testStatement = new ListBuffer<>();
JCTree.JCBlock testBody = treeMaker.Block(0, testStatement.toList());

JCTree.JCMethodDecl test = treeMaker.MethodDef(
        treeMaker.Modifiers(Flags.PUBLIC), // Access modifiers
        names.fromString("test"), // Method name
        treeMaker.Type(new Type.JCVoidType()), // Return type
        com.sun.tools.javac.util.List.nil(),
        com.sun.tools.javac.util.List.nil(),
        com.sun.tools.javac.util.List.nil(),
        testBody,	// Method body
        null
);

With parameters, no return value

We can again use the MethodDef method described above.

/*
    Generate a method with a parameter and no return value
    public void test2(String name){
        name = "xxxx";
    }
 */
ListBuffer<JCTree.JCStatement> testStatement2 = new ListBuffer<>();
testStatement2.append(treeMaker.Exec(treeMaker.Assign(treeMaker.Ident(names.fromString("name")),treeMaker.Literal("xxxx"))));
JCTree.JCBlock testBody2 = treeMaker.Block(0, testStatement2.toList());

// Generate input parameters
JCTree.JCVariableDecl param = treeMaker.VarDef(treeMaker.Modifiers(Flags.PARAMETER), names.fromString("name"),treeMaker.Ident(names.fromString("String")), null);
com.sun.tools.javac.util.List<JCTree.JCVariableDecl> parameters = com.sun.tools.javac.util.List.of(param);

JCTree.JCMethodDecl test2 = treeMaker.MethodDef(
        treeMaker.Modifiers(Flags.PUBLIC), // Access modifiers
        names.fromString("test2"), // Method name
        treeMaker.Type(new Type.JCVoidType()), // Return type
        com.sun.tools.javac.util.List.nil(),
        parameters, // Input parameter
        com.sun.tools.javac.util.List.nil(),
        testBody2,
        null
);

With parameters and a return value

/*
    Generate a method with a parameter and a return value
    public String test3(String name){
       return name;
    }
 */

ListBuffer<JCTree.JCStatement> testStatement3 = new ListBuffer<>();
testStatement3.append(treeMaker.Return(treeMaker.Ident(names.fromString("name"))));
JCTree.JCBlock testBody3 = treeMaker.Block(0, testStatement3.toList());

// Generate input parameters
JCTree.JCVariableDecl param3 = treeMaker.VarDef(treeMaker.Modifiers(Flags.PARAMETER), names.fromString("name"),treeMaker.Ident(names.fromString("String")), null);
com.sun.tools.javac.util.List<JCTree.JCVariableDecl> parameters3 = com.sun.tools.javac.util.List.of(param3);

JCTree.JCMethodDecl test3 = treeMaker.MethodDef(
        treeMaker.Modifiers(Flags.PUBLIC), // Access modifiers
        names.fromString("test3"), // Method name
        treeMaker.Ident(names.fromString("String")), // Return type
        com.sun.tools.javac.util.List.nil(),
        parameters3, // Input parameter
        com.sun.tools.javac.util.List.nil(),
        testBody3,
        null
);

Special statements

After learning how to define variables and methods, there are still many statements to cover, such as how to generate new expressions, method calls and if statements. The following examples cover these.

Creating an object with new

// Generate: CombatJCTreeMain combatJCTreeMain = new CombatJCTreeMain();
JCTree.JCNewClass combatJCTreeMain = treeMaker.NewClass(
        null,
        com.sun.tools.javac.util.List.nil(),
        treeMaker.Ident(names.fromString("CombatJCTreeMain")),
        com.sun.tools.javac.util.List.nil(),
        null
);
JCTree.JCVariableDecl jcVariableDecl1 = treeMaker.VarDef(
        treeMaker.Modifiers(0), // No modifiers: this is a local variable, not a method parameter
        names.fromString("combatJCTreeMain"),
        treeMaker.Ident(names.fromString("CombatJCTreeMain")),
        combatJCTreeMain
);

Method call (no parameters)

// Generate the method call combatJCTreeMain.test();
JCTree.JCExpressionStatement exec = treeMaker.Exec(
        treeMaker.Apply(
                com.sun.tools.javac.util.List.nil(),
                treeMaker.Select(
                        treeMaker.Ident(names.fromString("combatJCTreeMain")), // Expression to the left of the .
                        names.fromString("test") // Name to the right of the .
                ),
                com.sun.tools.javac.util.List.nil()
        )
);

Method call (with parameters)

// Generate the method call combatJCTreeMain.test2("hello world!");
JCTree.JCExpressionStatement exec2 = treeMaker.Exec(
        treeMaker.Apply(
                com.sun.tools.javac.util.List.nil(),
                treeMaker.Select(
                        treeMaker.Ident(names.fromString("combatJCTreeMain")), // Expression to the left of the .
                        names.fromString("test2") // Name to the right of the .
                ),
                com.sun.tools.javac.util.List.of(treeMaker.Literal("hello world!")) // Method arguments
        )
);

if statement

/*
    Create an if statement
    if("BuXueWuShu".equals(name)){
        add = "a" + "b";
    }else{
        add += "test";
    }
 */
// "BuXueWuShu".equals(name)
JCTree.JCMethodInvocation apply = treeMaker.Apply(
        com.sun.tools.javac.util.List.nil(),
        treeMaker.Select(
                treeMaker.Literal("BuXueWuShu"), // Expression to the left of the .
                names.fromString("equals") // Name to the right of the .
        ),
        com.sun.tools.javac.util.List.of(treeMaker.Ident(names.fromString("name")))
);
//  add = "a" + "b"
JCTree.JCExpressionStatement exec3 = treeMaker.Exec(treeMaker.Assign(treeMaker.Ident(names.fromString("add")), treeMaker.Binary(JCTree.Tag.PLUS, treeMaker.Literal("a"), treeMaker.Literal("b"))));
//  add += "test"
JCTree.JCExpressionStatement exec1 = treeMaker.Exec(treeMaker.Assignop(JCTree.Tag.PLUS_ASG, treeMaker.Ident(names.fromString("add")), treeMaker.Literal("test")));

JCTree.JCIf anIf = treeMaker.If(
        apply, // Condition of the if statement
        exec3, // Statement executed when the condition is true
        exec1  // Statement executed when the condition is false (else branch)
);
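
When it is unclear which nodes a statement needs, one option is to parse a small piece of source and inspect the tree javac builds for it. The sketch below does this for if statements. It is an illustration under assumptions: it relies on the public Compiler Tree API rather than TreeMaker, and the class IfShapeSketch is hypothetical.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.IfTree;
import com.sun.source.tree.ParenthesizedTree;
import com.sun.source.tree.StatementTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the node kinds that make up each if statement.
final class IfShapeSketch {

    static List<String> ifShapes(String className, String source) {
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        // Requires a JDK; getSystemJavaCompiler() returns null on a plain JRE.
        JavacTask task = (JavacTask) ToolProvider.getSystemJavaCompiler()
                .getTask(null, null, null, null, null, List.of(file));

        List<String> shapes = new ArrayList<>();
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            @Override
            public Void visitIf(IfTree node, Void unused) {
                ExpressionTree cond = node.getCondition();
                // Unwrap a parentheses node around the condition, if there is one.
                while (cond instanceof ParenthesizedTree) {
                    cond = ((ParenthesizedTree) cond).getExpression();
                }
                StatementTree elsePart = node.getElseStatement(); // null when there is no else
                shapes.add("cond=" + cond.getKind()
                        + " then=" + node.getThenStatement().getKind()
                        + " else=" + (elsePart == null ? "none" : elsePart.getKind()));
                return super.visitIf(node, unused);
            }
        };
        try {
            for (CompilationUnitTree unit : task.parse()) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return shapes;
    }
}
```

For the braced if/else in the comment above, both branches come back as block nodes. The TreeMaker.If call in the article passes the expression statements directly, which corresponds to branches without braces; wrapping each one in treeMaker.Block(0, ...) would reproduce the braced form.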

Source code: https://github.com/modouxiansheng/Doraemon

Summary

What you learn on paper always feels shallow; to truly understand something you have to practice it yourself.

I hope you will try this out on your own machine after reading this article.

Define a few fields yourself and, following Lombok's example, generate get and set methods for them. Although this knowledge is rarely needed in daily development, knowing what others do not is how a gap gradually opens up.

All the code in this article is on GitHub. After cloning the repository, search globally for the CombatJCTreeProcessor class.

Posted by horstuff on Wed, 24 Nov 2021 03:08:02 -0800