Java coding practice of functional programming: using laziness to write high-performance, abstract code

Keywords: Java Back-end

Introduction: This article uses lazy loading as a running example to introduce functional programming concepts step by step. Readers need no background in functional programming, only a little familiarity with Java 8.

Author: Xuan Heng
Source: Ali technical official account

I. Does abstraction necessarily degrade code performance?

Programmers dream of writing "high cohesion, low coupling" code, but experience suggests that more abstract code often means lower performance. Assembly, which maps most directly onto machine instructions, offers the most room for performance, followed by C, while Java tends to be slower because of its higher level of abstraction. Business systems follow the same pattern: the low-level interfaces that create, read, update, and delete data are the fastest, while the business interfaces built on top of them are slower because they add business validation, message sending, and so on.

Performance concerns in turn discourage programmers from abstracting modules in more reasonable ways.

Let's look at a common abstraction. "User" is an entity found in most systems. To unify the "user" abstraction across the system, we define a general domain model, User. Besides the user's id, it contains the department, the user's supervisor, and so on: attributes that are often aggregated and used together.

import java.util.Set;

public class User {
    // User id
    private Long uid;
    // The user's department; a plain string keeps the example simple
    // Must be fetched via a remote call to the address book system
    private String department;
    // The user's supervisor; represented by an id to keep the example simple
    // Must be fetched via a remote call to the address book system
    private Long supervisor;
    // Permissions held by the user
    // Must be fetched via a remote call to the permission system
    private Set<String> permission;
}

This looks great: all the commonly used attributes of "User" are gathered in one entity. As long as a User is passed as a method parameter, the method rarely needs to query any other user information. Once implemented, however, a problem appears. The department and supervisor must be fetched by calling the address book system remotely, and the permissions must be fetched by calling the permission system remotely. Every construction of a User pays the cost of these two remote calls, even when some of the information is never used. The following method, which checks whether one User is the supervisor of another, illustrates the situation:

public boolean isSupervisor(User u1, User u2) {
    return Objects.equals(u1.getSupervisor(), u2.getUid());
}

To use the general User entity as the parameter of the method above, we pay an extra price: a remote call fetches permission information that is never used. If the permission system has a problem, it also affects the stability of interfaces that have nothing to do with permissions.

With this in mind, we may be tempted to give up the general entity, let bare uids spread through the system, and scatter user information queries all over the code base.

In fact, the abstraction above can be kept with a small improvement. It is enough to turn department, supervisor and permission into lazily loaded fields that make the external call only when needed. This has several advantages:

  • Business modeling only needs to fit the business and does not have to consider underlying performance problems, which truly decouples the business layer from the physical layer
  • Business logic is separated from external calls. However the external interfaces change, there is always an adaptation layer that keeps the core logic stable
  • Business logic looks like pure entity operations, which makes it easy to write unit tests and to ensure the correctness of the core logic

In practice, however, some problems come up along the way. This article combines a few Java and functional programming techniques to implement a lazy loading utility class.

II. Strictness and laziness: the essence of Supplier in Java 8

Java 8 introduced a new functional interface, Supplier. To a veteran Java programmer it is just an interface that can supply an arbitrary value, and a lambda is just syntactic sugar for implementing it. That is the view from the language rather than from computation. Once you understand the difference between strictness and laziness, you may arrive at a view closer to the essence of computation.

Because Java and C are strict languages, we are used to variables being computed where they are defined. There is, however, another family of languages, such as Haskell, that compute a variable only when it is used.

The essence of Supplier, then, is that it brings a mechanism for lazy evaluation into Java. An equivalent lazy computation can be written in Java as follows:

Supplier<Integer> a = () -> 10 + 1;
int b = a.get() + 1;
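To make the deferral visible, the sketch below counts how often the lambda body actually runs. The class name SupplierDemo and the helper countedSum are hypothetical names introduced only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SupplierDemo {
    // Hypothetical helper: a Supplier that records every evaluation in the given counter.
    static Supplier<Integer> countedSum(AtomicInteger counter) {
        return () -> {
            counter.incrementAndGet();
            return 10 + 1;
        };
    }

    public static void main(String[] args) {
        AtomicInteger evaluations = new AtomicInteger();
        Supplier<Integer> a = countedSum(evaluations);
        // Defining the Supplier computes nothing.
        System.out.println(evaluations.get()); // prints 0

        int b = a.get() + 1;
        int c = a.get() + 1;
        // Each get() runs the lambda body again.
        System.out.println(b + " " + c + " " + evaluations.get()); // prints "12 12 2"
    }
}
```

The counter stays at zero until the first get, which is the lazy part; it then grows with every call, which is the part a plain Supplier does not handle.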

III. Further Optimization of Supplier: Lazy

Supplier still has a problem: it recomputes the value every time get is called. True lazy evaluation should cache the value after the first get. Wrapping Supplier slightly is enough:

import java.util.function.Supplier;

/**
 * To interoperate with the standard Java functional interfaces, Lazy also implements Supplier
 */
public class Lazy<T> implements Supplier<T> {

    private final Supplier<? extends T> supplier;

    // The value field caches the result computed by the supplier
    private T value;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
    }

    public static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        // null marks "not computed yet", so the supplier must never produce null
        if (value == null) {
            T newValue = supplier.get();

            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }

            value = newValue;
        }

        return value;
    }
}

The earlier lazy computation, rewritten with Lazy:

Lazy<Integer> a = Lazy.of(() -> 10 + 1);
int b = a.get() + 1;
// get will not recalculate, but directly use the cached value
int c = a.get();
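The caching can be checked the same way, by counting evaluations. This is a minimal sketch: LazyCacheDemo is a hypothetical name, and the Lazy class is repeated in compact form only so that the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyCacheDemo {
    public static void main(String[] args) {
        AtomicInteger evaluations = new AtomicInteger();
        Lazy<Integer> a = Lazy.of(() -> {
            evaluations.incrementAndGet();
            return 10 + 1;
        });

        int b = a.get() + 1; // first get: runs the supplier
        int c = a.get();     // second get: served from the cached value
        System.out.println(b + " " + c + " " + evaluations.get()); // prints "12 11 1"
    }
}

// Compact copy of the Lazy class from the article.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }
}
```

Note that this version uses no synchronization, so under concurrent access the supplier may run more than once; whether that matters depends on how the entity is shared.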

With this lazy loading utility class, the general User entity from earlier can be improved:

public class User {
    // User id
    private Long uid;
    // The user's department; a plain string keeps the example simple
    // Must be fetched via a remote call to the address book system
    private Lazy<String> department;
    // The user's supervisor; represented by an id to keep the example simple
    // Must be fetched via a remote call to the address book system
    private Lazy<Long> supervisor;
    // Permissions held by the user
    // Must be fetched via a remote call to the permission system
    private Lazy<Set<String>> permission;

    public Long getUid() {
        return uid;
    }

    public void setUid(Long uid) {
        this.uid = uid;
    }

    public String getDepartment() {
        return department.get();
    }

    /**
     * Because department is a lazily loaded attribute, the setter must take the computation rather than a concrete value
     */
    public void setDepartment(Lazy<String> department) {
        this.department = department;
    }
    // ... remaining getters and setters follow the same pattern and are omitted
}

A simple example of constructing a User entity is as follows:

Long uid = 1L;
User user = new User();
user.setUid(uid);
// departmentService is an rpc call
user.setDepartment(Lazy.of(() -> departmentService.getDepartment(uid)));
// ....

This looks good, but with further use a problem appears: two of the user's attributes, department and supervisor, are related. The department is fetched through one rpc interface, and the supervisor is then fetched through another rpc interface using the department. The code is as follows:

String department = departmentService.getDepartment(uid);
Long supervisor = supervisorService.getSupervisor(department);

But now department is no longer a computed value: it is a lazily evaluated Lazy object. How should the code above be written? The "functor" exists to solve this problem.

IV. Lazy as a functor

Quick understanding: this is similar to the map method of the stream API or of Optional in Java. A functor can be thought of as an interface, and map as a method of that interface.

1 What a functor operates on

Collection<T>, Optional<T>, and the Lazy<T> we have just implemented share one feature: each has exactly one generic type parameter. In this article we will call such types boxes and write them as Box<T>, because each is like a universal container that can hold any type.

2 Definition of functor

A functor operation takes a function mapping T to S and applies it to a Box<T>, turning it into a Box<S>. For example, mapping a number-to-string function over a Box<Integer> produces a Box<String>.

The example is stated in terms of types rather than concrete values such as 1 and "1" because a box does not necessarily hold a single value: it may be a collection, or an even more complex multi-value mapping.

Note that defining a method with the signature Box<S> map(Function<T, S> function) is not enough to make Box<T> a functor. Here is a counterexample:

// Counterexample: not a functor, because this method ignores the function
// and so does not faithfully reflect its mapping inside the box
public <S> Box<S> map(Function<T, S> function) {
    return new Box<>(null);
}

A functor is therefore a stricter notion than a map method. map must also obey the following laws, known as the functor laws (in essence, they ensure that map faithfully reflects the mapping defined by its function argument):

  • Identity law: applying the identity function to a Box<T> leaves its value unchanged, that is, box.equals(box.map(Function.identity())) is always true (equals here stands for mathematical equivalence)
  • Composition law: for any two functions f1 and f2, map(x -> f2(f1(x))) and map(f1).map(f2) are always equivalent

The map defined for Lazy in the next subsection satisfies both laws.

3 Lazy functor

After all that theory, the implementation is very simple:

    public <S> Lazy<S> map(Function<? super T, ? extends S> function) {
        return Lazy.of(() -> function.apply(get()));
    }

It is easy to verify that this satisfies the functor laws.
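The two laws can be spot-checked on sample values. The sketch below is only a sanity check, not a proof: FunctorLawCheck and its two helper methods are hypothetical names, equality is tested on the forced values (matching the "mathematical equivalence" reading of equals), and the Lazy class is repeated in compact form so the example compiles on its own.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class FunctorLawCheck {
    // Identity law: mapping the identity function leaves the value unchanged.
    static <T> boolean identityHolds(Supplier<T> source) {
        Lazy<T> box = Lazy.of(source);
        return box.get().equals(box.map(Function.<T>identity()).get());
    }

    // Composition law: map(f1).map(f2) yields the same value as map(x -> f2(f1(x))).
    static <T, A, B> boolean compositionHolds(Supplier<T> source,
                                              Function<T, A> f1,
                                              Function<A, B> f2) {
        B chained = Lazy.of(source).map(f1).map(f2).get();
        B composed = Lazy.of(source).map(x -> f2.apply(f1.apply(x))).get();
        return chained.equals(composed);
    }

    public static void main(String[] args) {
        System.out.println(identityHolds(() -> 42));
        System.out.println(compositionHolds(() -> 42, x -> x + 1, x -> "value " + x));
    }
}

// Compact copy of Lazy with the map method from the article.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }

    <S> Lazy<S> map(Function<? super T, ? extends S> function) {
        return Lazy.of(() -> function.apply(get()));
    }
}
```

Both checks are expected to print true for these inputs. Because this Lazy rejects null, the laws are only meaningful for functions that never return null.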

With map we can easily solve the earlier problem. The function passed to map can be written as if the department information had already been fetched:

Lazy<String> departmentLazy = Lazy.of(() -> departmentService.getDepartment(uid));
Lazy<Long> supervisorLazy = departmentLazy.map(
    department -> supervisorService.getSupervisor(department)
);

4 A more difficult situation

Now we can not only construct lazy values but also compute one lazy value from another, which looks perfect. With further use, however, a harder problem appears.

Suppose the permission system must be called with two parameters, department and supervisor, and both are lazy values. Try nesting map calls first:

Lazy<Lazy<Set<String>>> permissions = departmentLazy.map(department ->
         supervisorLazy.map(supervisor -> getPermissions(department, supervisor))
);

The return type looks a little strange. We expected Lazy<Set<String>>, but what we get has one more layer: Lazy<Lazy<Set<String>>>. Moreover, each additional level of nested map adds another level of Lazy to the type. A three-parameter example:

Lazy<Long> param1Lazy = Lazy.of(() -> 2L);
Lazy<Long> param2Lazy = Lazy.of(() -> 2L);
Lazy<Long> param3Lazy = Lazy.of(() -> 2L);
Lazy<Lazy<Lazy<Long>>> result = param1Lazy.map(param1 ->
        param2Lazy.map(param2 ->
                param3Lazy.map(param3 -> param1 + param2 + param3)
        )
);

Solving this requires the monad operation described next.

V. Lazy as a monad

Quick understanding: this is similar to the flatMap method of the stream API or of Optional in Java.

1 Definition of monad

The major difference between a monad and a functor lies in the function they accept. The function given to a functor returns a plain value, while the function given to a monad returns a boxed value. If a function that returns a box is passed to map instead of flatMap, the result is a Russian doll: a box nested inside a box.

Monads also have their own laws, but they are more complex than the functor laws and will not be explained here. Their role is similar: to ensure that flatMap faithfully reflects the mapping of its function.

2 Lazy monad

The implementation is also simple:

    public <S> Lazy<S> flatMap(Function<? super T, Lazy<? extends S>> function) {
        return Lazy.of(() -> function.apply(get()).get());
    }

Using flatMap to solve the earlier problem:

Lazy<Set<String>> permissions = departmentLazy.flatMap(department ->
         supervisorLazy.map(supervisor -> getPermissions(department, supervisor))
);

Three parameters:

Lazy<Long> param1Lazy = Lazy.of(() -> 2L);
Lazy<Long> param2Lazy = Lazy.of(() -> 2L);
Lazy<Long> param3Lazy = Lazy.of(() -> 2L);
Lazy<Long> result = param1Lazy.flatMap(param1 ->
        param2Lazy.flatMap(param2 ->
                param3Lazy.map(param3 -> param1 + param2 + param3)
        )
);

The rule is: use map for the last value and flatMap for all the others.
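The following sketch runs the three-parameter case end to end and counts evaluations, to show that nothing is computed until the result is requested and that each parameter is computed only once. FlatMapDemo, sum3 and counted are hypothetical names; the Lazy class is repeated in compact form so the example compiles on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

class FlatMapDemo {
    // flatMap for every value except the last, map for the last one.
    static Lazy<Long> sum3(Lazy<Long> p1, Lazy<Long> p2, Lazy<Long> p3) {
        return p1.flatMap(a -> p2.flatMap(b -> p3.map(c -> a + b + c)));
    }

    // Hypothetical helper: a lazy constant that records each evaluation.
    static Lazy<Long> counted(AtomicInteger counter, Long value) {
        return Lazy.of(() -> {
            counter.incrementAndGet();
            return value;
        });
    }

    public static void main(String[] args) {
        AtomicInteger evaluations = new AtomicInteger();
        Lazy<Long> result = sum3(counted(evaluations, 1L),
                                 counted(evaluations, 2L),
                                 counted(evaluations, 3L));
        System.out.println(evaluations.get()); // prints 0: building the chain evaluates nothing
        System.out.println(result.get());      // prints 6
        result.get();
        System.out.println(evaluations.get()); // prints 3: each parameter was evaluated once
    }
}

// Compact copy of Lazy with map and flatMap from the article.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
    }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }

    <S> Lazy<S> map(Function<? super T, ? extends S> function) {
        return Lazy.of(() -> function.apply(get()));
    }

    <S> Lazy<S> flatMap(Function<? super T, Lazy<? extends S>> function) {
        return Lazy.of(() -> function.apply(get()).get());
    }
}
```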

3 Digression: monad syntax in functional languages

After reading the examples above, you may find lazy computation cumbersome: every lazy value has to go through a chain of flatMap and map calls. This is a compromise that comes from Java's lack of native support for this style of functional programming. Haskell offers do notation to simplify monad operations. Written in Haskell, the three-parameter example above looks like this:

do
    param1 <- param1Lazy
    param2 <- param2Lazy
    param3 <- param3Lazy
    -- Note: return in do notation means something quite different from return in Java.
    -- It wraps a value in the box;
    -- the equivalent Java is Lazy.of(() -> param1 + param2 + param3)
    return (param1 + param2 + param3)

Although Java has no such syntactic sugar, when one door closes a window opens: in Java you can see exactly what each step does and understand the principle. Having read this far, you can see that do notation is simply a chain of flatMap calls.

VI. The final Lazy code

So far, the Lazy code we have written is as follows:

import java.util.function.Function;
import java.util.function.Supplier;

public class Lazy<T> implements Supplier<T> {

    private final Supplier<? extends T> supplier;

    // Caches the computed value; null means "not computed yet"
    private T value;

    private Lazy(Supplier<? extends T> supplier) {
        this.supplier = supplier;
    }

    public static <T> Lazy<T> of(Supplier<? extends T> supplier) {
        return new Lazy<>(supplier);
    }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();

            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }

            value = newValue;
        }

        return value;
    }

    // Functor: apply a function to the value inside the box
    public <S> Lazy<S> map(Function<? super T, ? extends S> function) {
        return Lazy.of(() -> function.apply(get()));
    }

    // Monad: apply a function that itself returns a box, without nesting boxes
    public <S> Lazy<S> flatMap(Function<? super T, Lazy<? extends S>> function) {
        return Lazy.of(() -> function.apply(get()).get());
    }
}

VII. Constructing an entity that optimizes performance automatically

Using Lazy, we write a factory that constructs the general User entity:

@Component
public class UserFactory {

    // Department service, rpc interface
    @Resource
    private DepartmentService departmentService;

    // Supervisor service, rpc interface
    @Resource
    private SupervisorService supervisorService;

    // Permission service, rpc interface
    @Resource
    private PermissionService permissionService;

    public User buildUser(long uid) {
        Lazy<String> departmentLazy = Lazy.of(() -> departmentService.getDepartment(uid));
        // Obtain the supervisor through the department
        // department -> supervisor
        Lazy<Long> supervisorLazy = departmentLazy.map(
            department -> supervisorService.getSupervisor(department)
        );
        // Obtain the permissions through the department and the supervisor
        // department, supervisor -> permission
        Lazy<Set<String>> permissionsLazy = departmentLazy.flatMap(department ->
            supervisorLazy.map(
                supervisor -> permissionService.getPermissions(department, supervisor)
            )
        );

        User user = new User();
        user.setUid(uid);
        user.setDepartment(departmentLazy);
        user.setSupervisor(supervisorLazy);
        user.setPermissions(permissionsLazy);
        return user;
    }
}

The factory class is in effect constructing an evaluation tree, and it makes the evaluation dependencies between User attributes explicit. At the same time, the User object optimizes its own performance at runtime: once a node is evaluated, the values of all attributes on its dependency path are cached.
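The effect on remote calls can be observed with in-memory stand-ins for the rpc services. Everything in this sketch is an assumption made for illustration: StubServices, its return values, and the simplified User with public final fields are hypothetical, and the stubs only count calls instead of going over the network.

```java
import java.util.Collections;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

class EvaluationTreeDemo {
    // Hypothetical stand-in for the three rpc services; each method counts its calls.
    static final class StubServices {
        final AtomicInteger departmentCalls = new AtomicInteger();
        final AtomicInteger supervisorCalls = new AtomicInteger();
        final AtomicInteger permissionCalls = new AtomicInteger();

        String getDepartment(long uid) { departmentCalls.incrementAndGet(); return "dept-" + uid; }
        Long getSupervisor(String department) { supervisorCalls.incrementAndGet(); return 100L; }
        Set<String> getPermissions(String department, Long supervisor) {
            permissionCalls.incrementAndGet();
            return Collections.singleton("read");
        }
    }

    // Simplified User: fields only, no getters or setters.
    static final class User {
        final Long uid;
        final Lazy<String> department;
        final Lazy<Long> supervisor;
        final Lazy<Set<String>> permission;

        User(Long uid, Lazy<String> department, Lazy<Long> supervisor, Lazy<Set<String>> permission) {
            this.uid = uid;
            this.department = department;
            this.supervisor = supervisor;
            this.permission = permission;
        }
    }

    static User buildUser(StubServices s, long uid) {
        Lazy<String> department = Lazy.of(() -> s.getDepartment(uid));
        Lazy<Long> supervisor = department.map(s::getSupervisor);
        Lazy<Set<String>> permission =
            department.flatMap(d -> supervisor.map(sup -> s.getPermissions(d, sup)));
        return new User(uid, department, supervisor, permission);
    }

    static boolean isSupervisor(User u1, User u2) {
        return Objects.equals(u1.supervisor.get(), u2.uid);
    }

    public static void main(String[] args) {
        StubServices s = new StubServices();
        User employee = buildUser(s, 1L);
        User boss = buildUser(s, 100L);

        System.out.println(isSupervisor(employee, boss)); // prints true
        // Only the employee's department and supervisor were fetched; no permission call.
        System.out.println(s.departmentCalls + " " + s.supervisorCalls + " " + s.permissionCalls); // prints "1 1 0"

        employee.permission.get();
        // Department and supervisor come from the cache; only the permission call is new.
        System.out.println(s.departmentCalls + " " + s.supervisorCalls + " " + s.permissionCalls); // prints "1 1 1"
    }
}

// Compact copy of Lazy with map and flatMap from the article.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) { this.supplier = supplier; }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) { return new Lazy<>(supplier); }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }

    <S> Lazy<S> map(Function<? super T, ? extends S> function) {
        return Lazy.of(() -> function.apply(get()));
    }

    <S> Lazy<S> flatMap(Function<? super T, Lazy<? extends S>> function) {
        return Lazy.of(() -> function.apply(get()).get());
    }
}
```

In this run the isSupervisor check never touches the permission stub, which is the situation the article set out to fix.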

VIII. Exception handling

Although laziness makes user.getDepartment() look like a pure in-memory operation, it is actually a remote call, so all kinds of unexpected exceptions can occur, such as timeouts.

Exception handling must not be left to the business logic: that would compromise its purity and waste all the effort so far. The ideal place for it is the Supplier that loads the lazy value. The Supplier's computation should account for the various exceptions and either retry or throw. Throwing exceptions may not be very "functional", but it fits Java programming habits, and when a key value cannot be obtained, the business logic should be stopped by an exception.
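One possible shape for such a loading Supplier is a small retry wrapper. This is a sketch under stated assumptions, not a recommended policy: the helper name retrying, the fixed attempt count, the absence of any backoff, and the choice of IllegalStateException for the final failure are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetryingSupplierDemo {
    // Hypothetical helper: retries the call up to maxAttempts times,
    // then throws with the last failure as the cause.
    static <T> Supplier<T> retrying(Supplier<? extends T> remoteCall, int maxAttempts) {
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return remoteCall.get();
                } catch (RuntimeException e) {
                    last = e;
                }
            }
            throw new IllegalStateException(
                "remote call failed after " + maxAttempts + " attempts", last);
        };
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        // Simulated rpc call that times out twice before succeeding.
        Supplier<String> flaky = () -> {
            if (attempts.incrementAndGet() < 3) {
                throw new IllegalStateException("timeout");
            }
            return "engineering";
        };

        Lazy<String> department = Lazy.of(retrying(flaky, 3));
        System.out.println(department.get() + " after " + attempts.get() + " attempts"); // prints "engineering after 3 attempts"
        System.out.println(department.get() + " after " + attempts.get() + " attempts"); // same line again: the value is cached
    }
}

// Compact copy of the Lazy class from the article.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) { this.supplier = supplier; }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) { return new Lazy<>(supplier); }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }
}
```

Because Lazy caches only successful results, a get that ends in an exception leaves the value unset, so a later get will run the supplier again.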

IX. Summary

An entity constructed this way can hold every attribute that business modeling calls for. Business modeling only needs to fit the business and does not have to consider underlying performance problems, which truly decouples the business layer from the physical layer.

At the same time, UserFactory is essentially an adaptation layer for the external interfaces. When an external interface changes, only the adaptation layer needs to be modified, which protects the stability of the core business code.

Because the core business code makes far fewer external calls and is closer to pure computation, it is easy to write unit tests for it, and those tests help keep the core code correct and stable.

X. Digression: currying and applicative functors, which Java lacks

Think about it: everything so far has been aimed at applying a function with the signature C f(A, B), unmodified, to the boxed types Box<A> and Box<B> to produce a Box<C>. Functional languages offer a more convenient tool for this: the applicative functor.

The concept of an applicative functor is simple: apply a boxed function to a boxed value to obtain a boxed result. It can be implemented in Lazy as follows:

    // Note that the function here is itself wrapped in a Lazy
    public <S> Lazy<S> apply(Lazy<Function<? super T, ? extends S>> function) {
        return Lazy.of(() -> function.get().apply(get()));
    }

However, implementing this in Java is of limited use, because Java has no built-in support for currying.

Currying lets us fix some of a function's parameters to obtain a new function. Given a function with the signature f(a, b), a language that supports currying allows calling f(a) directly, and the return value is then a function that takes only b.

With currying, an ordinary function can be applied to boxed types simply by using the applicative operation several times in a row. A Haskell-style example follows (<*> is Haskell's operator for applicative application, and f is a function with the signature C f(A, B); the syntax is not strictly correct and only conveys the idea):

-- Note: the result is box c
box f <*> box a <*> box b
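For comparison, the same idea can be approximated in Java by currying the function by hand as a Function that returns a Function. This is only a sketch: ApplicativeDemo and liftedAdd are hypothetical names, and the apply method below uses a slightly more permissive parameter type (Lazy<? extends Function<...>>) than the one shown above, an assumption made so that a concretely typed Lazy<Function<Long, Long>> can be passed in directly.

```java
import java.util.function.Function;
import java.util.function.Supplier;

class ApplicativeDemo {
    // Hand-curried version of C f(A, B): takes a, returns a function awaiting b.
    static final Function<Long, Function<Long, Long>> ADD = a -> b -> a + b;

    // Java spelling of: box f <*> box a <*> box b
    static Lazy<Long> liftedAdd(Lazy<Long> boxA, Lazy<Long> boxB) {
        Lazy<Function<Long, Function<Long, Long>>> boxF = Lazy.of(() -> ADD);
        Lazy<Function<Long, Long>> partial = boxA.apply(boxF); // box f <*> box a
        return boxB.apply(partial);                            // ... <*> box b
    }

    public static void main(String[] args) {
        Lazy<Long> boxC = liftedAdd(Lazy.of(() -> 2L), Lazy.of(() -> 3L));
        System.out.println(boxC.get()); // prints 5
    }
}

// Compact copy of Lazy with an apply method (parameter type widened, see the note above).
final class Lazy<T> implements Supplier<T> {
    private final Supplier<? extends T> supplier;
    private T value;

    private Lazy(Supplier<? extends T> supplier) { this.supplier = supplier; }

    static <T> Lazy<T> of(Supplier<? extends T> supplier) { return new Lazy<>(supplier); }

    @Override
    public T get() {
        if (value == null) {
            T newValue = supplier.get();
            if (newValue == null) {
                throw new IllegalStateException("Lazy value can not be null!");
            }
            value = newValue;
        }
        return value;
    }

    // Applies a boxed function to this boxed value, producing a boxed result.
    <S> Lazy<S> apply(Lazy<? extends Function<? super T, ? extends S>> function) {
        return Lazy.of(() -> function.get().apply(get()));
    }
}
```

The receiver and argument are swapped relative to Haskell (the boxed value calls apply on the boxed function), and every intermediate type has to be spelled out, which illustrates why the article considers this approach awkward in Java.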

References

  • The Java functional library VAVR provides a similar Lazy implementation. If this class is all you need, however, pulling in the whole library is somewhat heavy, and you can implement it yourself along the lines of this article
  • Advanced functional programming (Juejin, "Nuggets"): this article borrows its box analogy to some extent
  • Haskell fundamentals of functional programming
  • Java functional programming

This article is the original content of Alibaba cloud and cannot be reproduced without permission.  

Posted by cnl83 on Sun, 07 Nov 2021 17:50:56 -0800