Fastjson source code analysis -- deserialization

Keywords: Java JSON RESTful

2021SC@SDUSC

This article is also published on my personal blog, Redbit's personal journey.

Overview

The previous article, Fastjson source code analysis -- deserialization (6), compared the two parseArray APIs of fastjson in depth and analyzed their API design and internal logic through their similarities and differences. This article continues with how fastjson deserializes JSON arrays of objects.
We start with the deserialization path for a JSON array whose elements all share a single type.

1. parseArray(Class<?>, Collection) method

    public void parseArray(Class<?> clazz, @SuppressWarnings("rawtypes") Collection array) {
        parseArray((Type) clazz, array);
    }

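This method casts the clazz parameter from Class<?> to Type and delegates to another overload. Class implements the Type interface, so the cast changes nothing at runtime; its only job is to select the parseArray(Type, Collection) overload. This is typical of how fastjson chains its internal calls.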
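The role of the cast can be shown with a minimal, self-contained sketch. It does not use fastjson: the class name OverloadDelegation, the calls list, and the method bodies are hypothetical and exist only to make the call chain visible. The two signatures mirror the shape of the overloads discussed here.

```java
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class OverloadDelegation {
    // Records which overload ran, so the call chain can be inspected.
    static final List<String> calls = new ArrayList<>();

    // Convenience overload: accepts a Class and forwards to the Type overload.
    // Without the (Type) cast, overload resolution would pick this same
    // method again and the call would recurse forever.
    static void parseArray(Class<?> clazz, Collection<Object> array) {
        calls.add("Class");
        parseArray((Type) clazz, array);
    }

    // General overload: works for any Type, which includes every Class.
    static void parseArray(Type type, Collection<Object> array) {
        calls.add("Type");
        array.add(type.getTypeName());
    }

    public static void main(String[] args) {
        List<Object> out = new ArrayList<>();
        parseArray(String.class, out);
        System.out.println(calls); // [Class, Type]
        System.out.println(out);   // [java.lang.String]
    }
}
```

The caller gets the convenient Class-based signature, while all real work stays in one place behind the more general Type-based signature.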

The fastjson developers consistently follow one principle: keep the API easy to use while keeping the internal logic simple and free of duplication. This layering is a mature design, and it is also one of the most basic ways to improve code efficiency and readability.

Now for the next layer.

2. parseArray(Type, Collection) method

    public void parseArray(Type type, Collection array) {
        parseArray(type, array, null); // null is passed as the fieldName argument
    }

Again the method is a thin wrapper: it forwards the call to a more general overload.
The null passed here fills the fieldName parameter. In the three-argument method shown in the next section, fieldName names the field that the array belongs to; it appears in the exception message and is handed to setContext. This is a separate mechanism from binding a property to a differently named JSON key (for example, binding an object's cid attribute to the id key in the JSON), which fastjson users request by adding a one-line annotation before the property. Because the array in this call is not tied to any named field, null is passed.
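To make the effect of the null argument concrete, here is a small sketch of the same delegation pattern. It is not fastjson code: the class FieldNameDefault, the string-based token, and the use of IllegalStateException are assumptions made for illustration. Only the shape of the error message is taken from the three-argument method quoted in section 3.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class FieldNameDefault {
    // Two-argument overload: the array has no owning field, so pass null.
    static void parseArray(String firstToken, Collection<Object> array) {
        parseArray(firstToken, array, null);
    }

    // Three-argument overload: in this sketch fieldName only labels the
    // array in the error message.
    static void parseArray(String firstToken, Collection<Object> array, Object fieldName) {
        if (!"[".equals(firstToken)) {
            throw new IllegalStateException(
                    "field " + fieldName + " expect '[', but " + firstToken);
        }
        array.add("parsed"); // placeholder for the real element loop
    }

    public static void main(String[] args) {
        List<Object> out = new ArrayList<>();
        parseArray("[", out);
        System.out.println(out); // [parsed]
        try {
            parseArray("{", out, "items");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // field items expect '[', but {
        }
    }
}
```

When the two-argument overload hits the same error, the message simply reads "field null expect '[', but {", which is what a missing field name looks like to the caller.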

3. parseArray(Type, Collection, Object) method

This is the core method that actually deserializes an array of single-type objects. The code shows how the parseArray API uses tokens, how typed data is deserialized (by the deserializer instance for that specific type), and how an illegal JSON string is reported through an exception. The comments in the code below explain each step.

    @SuppressWarnings({ "unchecked", "rawtypes" })
    public void parseArray(Type type, Collection array, Object fieldName) {
        int token = lexer.token(); //Get token
        if (token == JSONToken.SET || token == JSONToken.TREE_SET) {
            lexer.nextToken(); //The array is prefixed with a Set/TreeSet marker: skip it and read the next token
            token = lexer.token();
        }

        if (token != JSONToken.LBRACKET) {
            //Checking the token is, in effect, checking that the incoming JSON string is legal.
            //If the array's opening '[' is missing, the input is either not JSON or is malformed,
            //so an exception is thrown to abort deserialization.
            throw new JSONException("field " + fieldName + " expect '[', but " + JSONToken.name(token) + ", " + lexer.info());
        }

        //Declared through the ObjectDeserializer interface; the concrete instance is assigned below
        ObjectDeserializer deserializer = null;

        //The two most common element types, int and String, are recognized directly from the
        //type argument, and their deserializer instances are used without a lookup
        if (int.class == type) { //Check int type data
            deserializer = IntegerCodec.instance;
            lexer.nextToken(JSONToken.LITERAL_INT);
        } else if (String.class == type) { //Check String type
            deserializer = StringCodec.instance;
            lexer.nextToken(JSONToken.LITERAL_STRING);
        } else {
            //For any other type, obtain the deserializer for that type from the configuration.
            //The type may already be registered in fastjson (see the introduction to parseObject for details),
            //or it may be unregistered, in which case a deserializer is generated for the type,
            //registered, and returned here.
            deserializer = config.getDeserializer(type);
            lexer.nextToken(deserializer.getFastMatchToken());
        }
        //The code below uses the deserializer to deserialize each element and stores the results in the collection
        ParseContext context = this.context;
        this.setContext(array, fieldName);
        try {
            for (int i = 0;; ++i) { //Deserialize the array elements one at a time
                if (lexer.isEnabled(Feature.AllowArbitraryCommas)) { //If AllowArbitraryCommas is enabled, skip any run of extra commas
                    while (lexer.token() == JSONToken.COMMA) {
                        lexer.nextToken();
                        continue;
                    }
                }

                if (lexer.token() == JSONToken.RBRACKET) {
                    break; //The closing bracket ']' means every element has been read, so leave the loop
                }

                if (int.class == type) { //Deserialize int type
                    Object val = IntegerCodec.instance.deserialze(this, null, null);
                    array.add(val); //Add to the returned data
                } else if (String.class == type) { //Deserialize String
                    String value;
                    if (lexer.token() == JSONToken.LITERAL_STRING) {
                        value = lexer.stringVal();
                        lexer.nextToken(JSONToken.COMMA);
                    } else {
                        Object obj = this.parse();
                        if (obj == null) {
                            value = null;
                        } else {
                            value = obj.toString();
                        }
                    }

                    array.add(value);
                } else { //Deserialize normal objects
                    Object val;
                    if (lexer.token() == JSONToken.NULL) {
                        //A null element is added as null instead of being passed to the deserializer,
                        //which would otherwise try to build a non-null object and fail
                        lexer.nextToken();
                        val = null;
                    } else {
                        val = deserializer.deserialze(this, type, i);
                    }
                    array.add(val);
                    checkListResolve(array);
                }

                if (lexer.token() == JSONToken.COMMA) { //On a comma ',', advance to the next token and continue the loop
                    lexer.nextToken(deserializer.getFastMatchToken());
                    continue;
                }
            }
        } finally {
            this.setContext(context);
        }

        lexer.nextToken(JSONToken.COMMA); //Move past ']' to the next token so that the caller can continue parsing
    }

The method is long but readable. It lays out the order in which an array of single-type JSON objects is deserialized: read the token, determine the element type, assign the concrete deserializer for that type to the ObjectDeserializer variable, and rely on polymorphism to deserialize each element of the JSON array. The results are stored in the Collection passed in as a parameter, and control returns to the caller.
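The order of operations described above can be sketched without fastjson. The following is a deliberately simplified model, not the library's implementation: the class SingleTypeArraySketch, the converterFor helper, and the character-based scanning are all assumptions made for illustration. It only handles flat arrays of int or String values, does not support commas or escapes inside strings, and always tolerates repeated commas (in fastjson that behavior depends on Feature.AllowArbitraryCommas).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

public class SingleTypeArraySketch {
    // Choose one converter up front, based only on the element type.
    static Function<String, Object> converterFor(Class<?> type) {
        if (type == int.class) {
            return Integer::valueOf;
        }
        if (type == String.class) {
            // Strip surrounding quotes; any other raw text is kept as-is.
            return raw -> raw.length() >= 2 && raw.startsWith("\"") && raw.endsWith("\"")
                    ? raw.substring(1, raw.length() - 1)
                    : raw;
        }
        throw new IllegalArgumentException("unsupported element type: " + type);
    }

    static void parseArray(String json, Class<?> type, Collection<Object> out) {
        String text = json.trim();
        // Step 1: the input must start with '[' or it is rejected.
        if (text.isEmpty() || text.charAt(0) != '[') {
            throw new IllegalArgumentException("expect '[', but got: " + text);
        }
        // Step 2: a single converter serves every element.
        Function<String, Object> converter = converterFor(type);
        int pos = 1;
        while (true) {
            // Skip whitespace and any run of commas between elements.
            while (pos < text.length()
                    && (text.charAt(pos) == ',' || Character.isWhitespace(text.charAt(pos)))) {
                pos++;
            }
            if (pos >= text.length()) {
                throw new IllegalArgumentException("array is not closed with ']'");
            }
            // Step 3: ']' ends the array.
            if (text.charAt(pos) == ']') {
                break;
            }
            // Step 4: read one raw element up to the next ',' or ']'.
            int end = pos;
            while (end < text.length() && text.charAt(end) != ',' && text.charAt(end) != ']') {
                end++;
            }
            String raw = text.substring(pos, end).trim();
            // A null element is stored as null and never reaches the converter.
            out.add(raw.equals("null") ? null : converter.apply(raw));
            pos = end;
        }
    }

    public static void main(String[] args) {
        List<Object> ints = new ArrayList<>();
        parseArray("[1, 2, null, 3]", int.class, ints);
        System.out.println(ints); // [1, 2, null, 3]

        List<Object> strings = new ArrayList<>();
        parseArray("[\"a\",, \"b\"]", String.class, strings);
        System.out.println(strings); // [a, b]
    }
}
```

As in the fastjson method, the converter is selected once before the loop and the results are written into the collection supplied by the caller rather than returned.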

Looking at the order in which fastjson deserializes an array, it is easy to see that parseArray does not introduce element-level logic of its own that differs from parseObject. In fact, it reuses the object deserialization logic and calls it in a loop, once for each JSON value in the array.
Because this API targets arrays whose elements share one type, a single deserializer is enough for every element. In its implementation, the API avoids sharing code with the parseArray API that takes a type for each element, and keeps its code as specific to the single-type case as possible.
This has clear advantages. Any code shared by the two parseArray APIs would have to be general enough to serve both, which suits neither the single-type API nor the per-element-type API. Such common code would either be extracted into a shared method, which disturbs the structure of the API call chain, or be repeated in both methods, which increases redundancy and hurts maintainability. In either case, general-purpose code tends to run less efficiently than code written for one specific case.
Therefore, the fastjson developers wrote deserialization code dedicated to single-type arrays here, which reduces coupling and improves efficiency. The code is highly specific where specificity is needed and general where shared code is needed.
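The contrast between the two styles can be sketched as follows. This is a hypothetical model rather than fastjson code: the class SingleVersusPerElement and both method names are invented for illustration, and java.util.function.Function stands in for a deserializer. It only shows why the single-type case can pick its converter once, outside the loop, while the per-element case must look one up for every index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SingleVersusPerElement {
    // Single-type array: one converter is chosen once and reused for every element.
    static List<Object> parseSameType(List<String> rawElements,
                                      Function<String, Object> converter) {
        List<Object> out = new ArrayList<>();
        for (String raw : rawElements) {
            out.add(converter.apply(raw));
        }
        return out;
    }

    // Per-element types: the converter is looked up by index for each element.
    static List<Object> parsePerElement(List<String> rawElements,
                                        List<Function<String, Object>> converters) {
        if (rawElements.size() != converters.size()) {
            throw new IllegalArgumentException("one converter is required per element");
        }
        List<Object> out = new ArrayList<>();
        for (int i = 0; i < rawElements.size(); i++) {
            out.add(converters.get(i).apply(rawElements.get(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        Function<String, Object> toInt = Integer::valueOf;
        Function<String, Object> toBool = Boolean::valueOf;
        Function<String, Object> asIs = s -> s;

        System.out.println(parseSameType(Arrays.asList("1", "2", "3"), toInt));
        // [1, 2, 3]
        System.out.println(parsePerElement(Arrays.asList("7", "true", "x"),
                Arrays.asList(toInt, toBool, asIs)));
        // [7, true, x]
    }
}
```

Merging the two loops into one shared helper would force the single-type path to carry an index lookup it never needs, which is the kind of generality the paragraphs above argue against.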

Summary

This article walked through the internal logic of the single-type parseArray() API and, through a simple comparison, clarified how the API layers wrap and call one another. Starting from the outermost parseArray(Class<?>, Collection), we saw how each API delegates to a more general one, and then examined parseArray(Type, Collection, Object) in detail. There you can see how fastjson uses tokens to check the array and its elements, selects a concrete deserializer for the element type (the same logic as parseObject), and deserializes each comma-separated element of the JSON string.
Along the way, we saw how the fastjson developers handle code structure and redundancy. These techniques apply to everyday development: concise code and an easy-to-understand API structure are basic ways to improve code readability.
The next article will focus on parseArray(String, Type[]), the JSON array deserialization API that specifies a type for each element. It will cover that API's implementation logic and structure and compare it with the structure discussed here, to make the difficult parts easier to understand.
Thank you for reading; corrections are welcome!

Posted by weneedmoregold on Sun, 05 Dec 2021 19:22:58 -0800