Introduction to practical functional Java (PFJ)

Keywords: Java Functional Programming

[note] this article is translated from: Introduction To Pragmatic Functional Java - DZone Java

Practical functional Java is a modern, very concise yet readable Java coding style based on functional programming concepts.

Practical functional Java (PFJ) attempts to define a new idiomatic Java coding style. This style makes full use of the features of current and upcoming Java versions and involves the compiler in helping to write concise yet reliable and readable code.
Although this style can be used even with Java 8, it looks much cleaner and more concise with Java 11. It becomes more expressive with Java 17 and benefits from every new Java language feature.
But PFJ is not a free lunch. It requires significant changes in developers' habits and approaches. Changing habits is not easy, and traditional imperative habits are especially hard to change.
Is it worth it? Definitely! PFJ code is concise, expressive, and reliable. It is easy to read and maintain, and in most cases, if the code compiles, it works.

Elements of practical functional Java

PFJ is derived from the wonderful book Effective Java, with some additional concepts and conventions, in particular from Functional Programming (FP). Note that although FP concepts are used, PFJ does not try to enforce FP-specific terminology (although we also provide references for those interested in exploring these concepts further).
PFJ focuses on:

  • Reducing mental overhead.
  • Improving code reliability.
  • Improving long-term maintainability.
  • Involving the compiler to help write correct code.
  • Making correct code easy and natural to write; writing incorrect code remains possible, but it should take effort.

Despite the ambitious goals, there are only two key PFJ rules:

  • Avoid null as much as possible.
  • No business exceptions.

Each key rule is discussed in more detail below:

Avoid null as much as possible (ANAMAP rule)

Nullable variables are one of the special states. They are a well-known source of runtime errors and boilerplate code. To eliminate these problems and represent values that may be missing, PFJ uses the Option<T> container. This covers all situations where such values can occur: return values, input parameters, or fields.
In some cases, for example for performance or for compatibility with existing frameworks, classes may use null internally. These cases must be clearly documented and invisible to class users, that is, all class APIs should use Option<T>.
This approach has several advantages:

  • Nullable variables are immediately visible in code. There is no need to read the documentation, check the source code, or rely on comments.
  • The compiler distinguishes nullable and non-nullable variables and prevents incorrect assignment between them.
  • All boilerplate required for null checking is eliminated.
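The rule about internal null usage can be illustrated with a minimal sketch. It assumes a heavily simplified, hypothetical Option<T> (not the actual PFJ class) and a made-up Customer class; only the idea of hiding null behind Option<T> in the API comes from the text above.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical, heavily simplified Option<T>; the real PFJ container is richer.
final class Option<T> {
    private final T value; // null marks the empty state and never leaks to callers

    private Option(T value) {
        this.value = value;
    }

    // Wraps a reference that may be null.
    static <T> Option<T> option(T nullable) {
        return new Option<>(nullable);
    }

    // Transforms the value if it is present; an empty Option stays empty.
    <R> Option<R> map(Function<? super T, ? extends R> mapper) {
        if (value == null) {
            return new Option<>(null);
        }
        return new Option<>(mapper.apply(value));
    }

    // Extracts the value or falls back to the replacement.
    T or(T replacement) {
        return value == null ? replacement : value;
    }
}

// Illustrative class: null is used internally, but the API exposes only Option<T>.
final class Customer {
    private final String name;
    private final String middleName; // null when absent; an internal detail only

    private Customer(String name, String middleName) {
        this.name = Objects.requireNonNull(name);
        this.middleName = middleName;
    }

    static Customer customer(String name) {
        return new Customer(name, null);
    }

    static Customer customer(String name, String middleName) {
        return new Customer(name, Objects.requireNonNull(middleName));
    }

    // The return type tells the caller that the value may be missing.
    Option<String> middleName() {
        return Option.option(middleName);
    }

    String displayName() {
        return middleName()
            .map(middle -> name + " " + middle)
            .or(name);
    }
}
```

A caller of middleName() cannot use the value without going through map() or or(), so the missing case is handled by construction.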

No business exceptions (NBE rule)

PFJ uses exceptions only to represent fatal, unrecoverable (technical) failures. Such exceptions may be intercepted only for the purpose of logging and/or graceful shutdown of the application. All other exceptions and their interception are discouraged and should be avoided as much as possible.
Business exceptions are another case of special states. To propagate and handle business-level errors, PFJ uses the Result<T> container. Again, this covers all situations where errors can occur: return values, input parameters, or fields. Practice shows that fields rarely, if ever, need to use this container.
There is no justification for using business-level exceptions. Interaction with existing Java libraries and legacy code goes through dedicated wrapping methods. The Result<T> container contains an implementation of these wrapping methods.
The no business exceptions rule has the following advantages:

  • Methods that can return errors are immediately visible in the code. There is no need to read the documentation, check the source code, or analyze the call tree to see what exceptions can be thrown and under what conditions.
  • The compiler enforces correct error handling and propagation.
  • There is almost no boilerplate for error handling and propagation.
  • Code can be written for the happy day scenario, with errors handled at the most convenient point. This was the original intent of exceptions, which was never actually achieved.
  • The code remains composable and easy to read and reason about, with no hidden breaks or unexpected transitions in the execution flow: what you read is what will be executed.
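To make these advantages concrete, here is a minimal sketch. It assumes a hypothetical, stripped-down Result<T> and Cause (the real PFJ types offer much more), and the AgeParser class is invented purely for illustration.

```java
import java.util.function.Function;

// Hypothetical minimal types for illustration; the real PFJ Result<T> and Cause are richer.
interface Cause {
    String message();
}

final class Result<T> {
    private final T value;     // set for a success
    private final Cause cause; // set for a failure, null otherwise

    private Result(T value, Cause cause) {
        this.value = value;
        this.cause = cause;
    }

    static <T> Result<T> success(T value) {
        return new Result<>(value, null);
    }

    static <T> Result<T> failure(Cause cause) {
        return new Result<>(null, cause);
    }

    // Transforms a success value; a failure is passed through untouched.
    <R> Result<R> map(Function<? super T, ? extends R> mapper) {
        if (cause != null) {
            return failure(cause);
        }
        return success(mapper.apply(value));
    }

    // Handles both outcomes; the caller cannot forget the failure branch.
    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }
}

final class AgeParser {
    // The signature itself states that parsing can fail; no exception escapes.
    static Result<Integer> parseAge(String text) {
        try {
            int age = Integer.parseInt(text.trim());
            if (age < 0) {
                return Result.failure(() -> "Age must not be negative: " + age);
            }
            return Result.success(age);
        } catch (NumberFormatException e) {
            return Result.failure(() -> "Not a number: " + text);
        }
    }

    // Happy day logic in map(); the error is handled once, at the end.
    static String describe(String text) {
        return parseAge(text)
            .map(age -> age + 1)
            .fold(cause -> "Error: " + cause.message(),
                  next -> "Next birthday: " + next);
    }
}
```

The only try/catch sits at the boundary with the exception-throwing JDK call; everything downstream works with Result<Integer>.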

Convert legacy code to PFJ style code

OK, the key rules look good and useful, but what will real code look like?
Let's start with some very typical back-end code:

public interface UserRepository {
    User findById(User.Id userId);
}

public interface UserProfileRepository {
    UserProfile findById(User.Id userId);
}

public class UserService {
    private final UserRepository userRepository;
    private final UserProfileRepository userProfileRepository;

    public UserWithProfile getUserWithProfile(User.Id userId) {
        User user = userRepository.findById(userId);
        if (user == null) {
            throw new UserNotFoundException("User with ID " + userId + " not found");
        }
        UserProfile details = userProfileRepository.findById(userId);
        return UserWithProfile.of(user, details == null ? UserProfile.defaultDetails() : details);
    }
}

The interfaces at the beginning of the example are provided for context. The main point of interest is the getUserWithProfile method. Let's analyze it step by step.

  • The first statement retrieves the user variable from the user repository.
  • Since the user may not exist in the repository, the user variable may be null. The null check that follows verifies whether this is the case and, if so, throws a business exception.
  • The next step is to retrieve the user profile details. Missing details are not considered an error. Instead, when details are missing, default values are used for the profile.

There are several problems with the code above. First, returning null when the value is not present in the repository is not obvious from the interface. We need to check the documentation, study the implementation, or guess how these repositories work.
Sometimes annotations are used to provide a hint, but this still does not guarantee the behavior of the API.
To solve this problem, let's apply the ANAMAP rule to the repositories:

public interface UserRepository {
    Option<User> findById(User.Id userId);
}

public interface UserProfileRepository {
    Option<UserProfile> findById(User.Id userId);
}

No guesswork is needed now: the API explicitly states that the returned value may be missing.
Now let's look at the getUserWithProfile method again. The second thing to note is that the method may either return a value or throw an exception. This is a business exception, so we can apply the NBE rule. The main goal of the change is to make explicit the fact that the method may return a value or an error:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {

OK, now that the API is cleaned up, we can start changing the code. The first change is caused by the fact that userRepository now returns Option<User>:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<User> user = userRepository.findById(userId);
}

Now we need to check whether the user is present and, if not, return an error. With the traditional imperative approach, the code looks like this:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<User> user = userRepository.findById(userId);
   
    if (user.isEmpty()) {
        return Result.failure(Causes.cause("User with ID " + userId + " not found"));
    }

}
The code doesn't look very attractive, but it's no worse than the original, so keep it as it is for the time being.
The next step is to try to convert the rest of the code:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<User> user = userRepository.findById(userId);
   
    if (user.isEmpty()) {
        return Result.failure(Causes.cause("User with ID " + userId + " not found"));
    }

    Option<UserProfile> details = userProfileRepository.findById(userId);
   
}

Here's the problem: the details and the user are stored inside Option<T> containers, so to assemble UserWithProfile, we need to extract the values somehow. There are different possible approaches, for example, using the Option.fold() method. The resulting code will certainly not be pretty, and it is likely to violate the rules.
There is another approach: use the fact that Option<T> is a container with special properties.
In particular, values inside Option<T> can be transformed using the Option.map() and Option.flatMap() methods. In addition, we know that the details value will either be provided by the repository or replaced with a default. For this, we can use the Option.or() method to extract the details from the container. Let's try these approaches:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<User> user = userRepository.findById(userId);
   
    if (user.isEmpty()) {
        return Result.failure(Causes.cause("User with ID " + userId + " not found"));
    }

    UserProfile details = userProfileRepository.findById(userId).or(UserProfile.defaultDetails());
   
    Option<UserWithProfile> userWithProfile =  user.map(userValue -> UserWithProfile.of(userValue, details));
   
}

Now we need to write the final step, converting the userWithProfile container from Option<T> to Result<T>:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<User> user = userRepository.findById(userId);
   
    if (user.isEmpty()) {
        return Result.failure(Causes.cause("User with ID " + userId + " not found"));
    }

    UserProfile details = userProfileRepository.findById(userId).or(UserProfile.defaultDetails());

    Option<UserWithProfile> userWithProfile =  user.map(userValue -> UserWithProfile.of(userValue, details));

    return userWithProfile.toResult(Causes.cause(""));
}

Let's leave the error cause in the return statement empty for a moment and look at the code again.
We can easily spot a problem: we know for sure that userWithProfile is always present, because the case when user is not present is already handled above. How can we fix this?
Note that we can call user.map() without checking whether the user is present. The transformation is applied only if the user is present and is skipped otherwise. This way, we can eliminate the if (user.isEmpty()) check. Let's move the retrieval of the user details and the transformation into UserWithProfile inside the lambda passed to user.map():

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<UserWithProfile> userWithProfile = userRepository.findById(userId).map(userValue -> {
        UserProfile details = userProfileRepository.findById(userId).or(UserProfile.defaultDetails());
        return UserWithProfile.of(userValue, details);
    });
   
    return userWithProfile.toResult(Causes.cause(""));
}

Now the last line needs to change, because userWithProfile may be missing. The error is the same as in the previous version, because userWithProfile is missing only if the value returned by userRepository.findById(userId) is missing:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    Option<UserWithProfile> userWithProfile = userRepository.findById(userId).map(userValue -> {
        UserProfile details = userProfileRepository.findById(userId).or(UserProfile.defaultDetails());
        return UserWithProfile.of(userValue, details);
    });
   
    return userWithProfile.toResult(Causes.cause("User with ID " + userId + " not found"));
}

Finally, we can inline details and userWithProfile because they are only used once after creation:

public Result<UserWithProfile> getUserWithProfile(User.Id userId) {
    return userRepository.findById(userId)
        .map(userValue -> UserWithProfile.of(userValue, userProfileRepository.findById(userId)
                                                                             .or(UserProfile.defaultDetails())))
        .toResult(Causes.cause("User with ID " + userId + " not found"));
}

Notice how indentation helps group the code into logically linked parts.
Let's analyze the resulting code:

  • The code is more concise and written for the happy day scenario, with no explicit error or null checks and nothing distracting from the business logic.
  • There is no simple way to skip or avoid error or null checks. Writing correct and reliable code is direct and natural.

Less obvious observations:

  • All types are inferred automatically. This simplifies refactoring and removes unnecessary clutter. If necessary, types can still be added.
  • If at some point the repositories start returning Result<T> instead of Option<T>, the code will remain unchanged, except that the last transformation (toResult) will be removed.
  • Apart from replacing the ternary operator with the Option.or() method, the resulting code looks as if we had moved the code from the original return statement into the lambda passed to the map() method.

The last observation is very useful for starting to write PFJ style code easily (reading it is usually not a problem). It can be summarized as the following rule of thumb: look for the value on the right side. Compare:

User user = userRepository.findById(userId); // <-- the value is to the left of the expression

and

return userRepository.findById(userId)
    .map(user -> ...); // <-- the value is to the right of the expression

This observation helps with the transition from the legacy imperative code style to PFJ.
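For readers who want to run the final version, the following self-contained sketch puts the pieces together. Everything except the shape of getUserWithProfile is an assumption made for illustration: Option<T>, Result<T>, and Cause are minimal stand-ins for the PFJ types, the repositories are replaced by in-memory maps, and a plain String is used instead of User.Id.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical minimal stand-ins for the PFJ types used in the article.
interface Cause {
    String message();
}

final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }
}

final class Option<T> {
    private final T value; // null marks the empty state

    private Option(T value) { this.value = value; }

    static <T> Option<T> option(T nullable) { return new Option<>(nullable); }

    <R> Option<R> map(Function<? super T, ? extends R> mapper) {
        if (value == null) {
            return new Option<>(null);
        }
        return new Option<>(mapper.apply(value));
    }

    T or(T replacement) { return value == null ? replacement : value; }

    // An empty Option becomes a failure carrying the given cause.
    Result<T> toResult(Cause cause) {
        if (value == null) {
            return Result.failure(cause);
        }
        return Result.success(value);
    }
}

final class User {
    final String name;
    User(String name) { this.name = name; }
}

final class UserProfile {
    final String theme;
    UserProfile(String theme) { this.theme = theme; }
    static UserProfile defaultDetails() { return new UserProfile("default"); }
}

final class UserWithProfile {
    final User user;
    final UserProfile profile;
    private UserWithProfile(User user, UserProfile profile) { this.user = user; this.profile = profile; }
    static UserWithProfile of(User user, UserProfile profile) { return new UserWithProfile(user, profile); }
}

final class UserService {
    // In-memory stand-ins for the two repositories, keyed by a plain String ID.
    private final Map<String, User> users;
    private final Map<String, UserProfile> profiles;

    UserService(Map<String, User> users, Map<String, UserProfile> profiles) {
        this.users = users;
        this.profiles = profiles;
    }

    Result<UserWithProfile> getUserWithProfile(String userId) {
        return Option.option(users.get(userId))
            .map(user -> UserWithProfile.of(user, Option.option(profiles.get(userId))
                                                        .or(UserProfile.defaultDetails())))
            .toResult(() -> "User with ID " + userId + " not found");
    }

    // Example consumer: both outcomes are turned into a printable string.
    String describe(String userId) {
        return getUserWithProfile(userId)
            .fold(Cause::message, found -> found.user.name + " (" + found.profile.theme + ")");
    }
}
```

With users "u1" and "u2" and a profile only for "u1", describe() yields the stored profile for the first user, the default profile for the second, and the "not found" message for any other ID.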

Interaction with legacy code

Needless to say, existing code does not follow the PFJ approach. It throws exceptions, returns null, and so on. Sometimes it is possible to rewrite this code to make it PFJ-compatible, but often this is not the case. This is especially true for external libraries and frameworks.

Calling legacy code

There are two main problems with calling legacy code. Each of them is related to a violation of the corresponding PFJ rule:

Handling business exceptions

Result<T> contains a helper method called lift(), which covers most use cases. The method signature looks like this:

static <R> Result<R> lift(FN1<? extends Cause, ? super Throwable> exceptionMapper, ThrowingSupplier<R> supplier)

The first parameter is a function that converts an exception into a Cause instance (which, in turn, is used to create a Result<T> instance in case of failure). The second parameter is a lambda that wraps the call to the actual code that needs to be made PFJ-compatible.
The simplest such function, fromThrowable(), is provided in the Causes utility class and converts an exception into a Cause instance. It can be used with Result.lift() as follows:

public static Result<URI> createURI(String uri) {
    return Result.lift(Causes::fromThrowable, () -> URI.create(uri));
}
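How lift() might work internally can be sketched as follows. This is not the library's implementation: FN1, ThrowingSupplier, Causes, and Result are hypothetical minimal versions whose shapes are guessed from the signature above, and the message format produced by fromThrowable() is invented.

```java
import java.net.URI;
import java.util.function.Function;

interface Cause {
    String message();
}

// Assumed shape: result type first, then the parameter type, as the signature above suggests.
@FunctionalInterface
interface FN1<R, T1> {
    R apply(T1 param1);
}

// Assumed shape: a supplier that is allowed to throw anything.
@FunctionalInterface
interface ThrowingSupplier<T> {
    T get() throws Throwable;
}

final class Causes {
    private Causes() {}

    // Hypothetical conversion; the message format is made up for this sketch.
    static Cause fromThrowable(Throwable throwable) {
        return () -> throwable.getClass().getSimpleName() + ": " + throwable.getMessage();
    }
}

final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    // Runs the supplier and converts anything it throws into a failure.
    // Note that this sketch also catches Error subclasses, which real code may not want.
    static <R> Result<R> lift(FN1<? extends Cause, ? super Throwable> exceptionMapper,
                              ThrowingSupplier<R> supplier) {
        try {
            return success(supplier.get());
        } catch (Throwable throwable) {
            return failure(exceptionMapper.apply(throwable));
        }
    }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }
}

final class UriParser {
    // URI.create() throws IllegalArgumentException for malformed input.
    static Result<URI> createURI(String uri) {
        return Result.lift(Causes::fromThrowable, () -> URI.create(uri));
    }
}
```

The try/catch exists in exactly one place, so callers of createURI() only ever see a Result<URI>.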

Handling null return values

This case is quite simple: if the API can return null, just wrap the call in Option<T> using the Option.option() method.
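As a small sketch of this wrapping, consider System.getProperty(), a JDK method that returns null for unknown keys. The Option<T> below is a hypothetical minimal version, and the Settings class and the "app.port" key are invented for the example.

```java
import java.util.function.Function;

// Hypothetical minimal Option<T>; only the methods needed for this sketch.
final class Option<T> {
    private final T value; // null marks the empty state

    private Option(T value) { this.value = value; }

    // Wraps a reference that may be null.
    static <T> Option<T> option(T nullable) { return new Option<>(nullable); }

    <R> Option<R> map(Function<? super T, ? extends R> mapper) {
        if (value == null) {
            return new Option<>(null);
        }
        return new Option<>(mapper.apply(value));
    }

    T or(T replacement) { return value == null ? replacement : value; }
}

final class Settings {
    // The null returned for unknown keys is wrapped right at the call site.
    static Option<String> property(String key) {
        return Option.option(System.getProperty(key));
    }

    // Falls back to 8080 when the property is not set.
    // A non-numeric value would still throw NumberFormatException in this sketch.
    static int port() {
        return property("app.port")
            .map(Integer::parseInt)
            .or(8080);
    }
}
```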

Providing a legacy API

Sometimes it is necessary to allow legacy code to call code written in PFJ style. In particular, this often happens when some smaller subsystem is converted to PFJ style while the rest of the system is still written in the old style and the API needs to be preserved. The most convenient way is to split the implementation into two parts: a PFJ style API and an adapter that only adapts the new API to the old one. A simple helper method such as the following can be very useful here:

public static <T> T unwrap(Result<T> value) {
    return value.fold(
        cause -> { throw new IllegalStateException(cause.message()); },
        content -> content
    );
}

Result<T> provides no ready-made helper method for this, for the following reasons:

  • There may be different use cases and different types of exceptions can be thrown (checked and unchecked).
  • Converting Cause to a different specific exception depends largely on the specific use case.
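One possible way to cope with both points is to let the adapter decide which exception to throw. The sketch below is only an illustration of that idea: Result<T> and Cause are hypothetical minimal versions, and the PortApi class with its two methods is invented.

```java
import java.util.function.Function;

interface Cause {
    String message();
}

// Hypothetical minimal Result<T>; only what this sketch needs.
final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }
}

final class PortApi {
    // New PFJ style API: failure is part of the return type.
    static Result<Integer> parsePort(String text) {
        try {
            return Result.success(Integer.parseInt(text));
        } catch (NumberFormatException e) {
            return Result.failure(() -> "Invalid port: " + text);
        }
    }

    // Adapter helper: the caller chooses which unchecked exception represents a failure.
    static <T> T unwrap(Result<T> value,
                        Function<? super Cause, ? extends RuntimeException> toException) {
        return value.fold(
            cause -> { throw toException.apply(cause); },
            content -> content);
    }

    // Legacy style facade kept for old callers: throws instead of returning Result<T>.
    static int legacyParsePort(String text) {
        return unwrap(parsePort(text), cause -> new IllegalArgumentException(cause.message()));
    }
}
```

Checked exceptions would need a separate variant of unwrap(), which is one more reason a single built-in helper is hard to get right.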

Manage variable scope

This section focuses on various practical cases that arise when writing PFJ style code.
The examples below assume the use of Result<T>, but this is largely irrelevant, because all considerations apply equally to Option<T>. In addition, the examples assume that the functions invoked in them have been converted to return Result<T> instead of throwing exceptions.

Nested scope

Functional style code makes extensive use of lambdas to perform computations and transformations of values inside Option<T> and Result<T> containers. Each lambda implicitly creates a scope for its parameters: they are accessible inside the lambda body but not outside it.
This is usually a useful property, but it is unusual for traditional imperative code and may feel inconvenient at first. Fortunately, there is a simple technique to overcome the perceived inconvenience.
Let's look at the following imperative code:

var value1 = function1(...); // function1() may throw an exception
var value2 = function2(value1, ...); // function2() may throw an exception
var value3 = function3(value1, value2, ...); // function3() may throw an exception

The variable value1 must remain accessible for the calls to function2() and function3(). This means that a straightforward conversion to PFJ style will not work:

function1(...)
    .flatMap(value1 -> function2(value1, ...))
    .flatMap(value2 -> function3(value1, value2, ...)); // <-- error, value1 is not accessible

To keep the value accessible, we need to use nested scopes, that is, nest the calls as follows:

function1(...)
    .flatMap(value1 -> function2(value1, ...)
        .flatMap(value2 -> function3(value1, value2, ...)));

The second call to flatMap() is made on the value returned by function2(), not on the value returned by the first flatMap(). This way, value1 stays in scope and is accessible to function3().
Although nested scopes can be arbitrarily deep, multiple levels of nesting are usually harder to read and follow. In such cases, it is strongly recommended to extract the deeper scopes into dedicated functions.
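The nested scope technique can be tried out with the following runnable sketch. The Result<T> and Cause types are hypothetical minimal versions, and the three lookup functions with their data are invented stand-ins for function1(), function2(), and function3().

```java
import java.util.Map;
import java.util.function.Function;

interface Cause {
    String message();
}

// Hypothetical minimal Result<T>; only what this sketch needs.
final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    // Chains a computation that may itself fail; a failure short-circuits the chain.
    <R> Result<R> flatMap(Function<? super T, Result<R>> mapper) {
        if (cause != null) {
            return failure(cause);
        }
        return mapper.apply(value);
    }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }
}

final class Statements {
    private static final Map<String, String> ACCOUNTS = Map.of("42", "ACC-42");
    private static final Map<String, Integer> BALANCES = Map.of("ACC-42", 100);

    // Plays the role of function1().
    static Result<String> findAccount(String customerId) {
        String account = ACCOUNTS.get(customerId);
        if (account == null) {
            return Result.failure(() -> "Unknown customer: " + customerId);
        }
        return Result.success(account);
    }

    // Plays the role of function2(value1).
    static Result<Integer> findBalance(String account) {
        Integer balance = BALANCES.get(account);
        if (balance == null) {
            return Result.failure(() -> "No balance for: " + account);
        }
        return Result.success(balance);
    }

    // Plays the role of function3(value1, value2).
    static Result<String> formatStatement(String account, int balance) {
        return Result.success(account + ": " + balance);
    }

    // The inner flatMap() is nested, so 'account' is still in scope for formatStatement().
    static Result<String> statement(String customerId) {
        return findAccount(customerId)
            .flatMap(account -> findBalance(account)
                .flatMap(balance -> formatStatement(account, balance)));
    }
}
```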

Parallel scope

Another frequently observed situation is the need to calculate / retrieve several independent values and then call or build an object. Let's look at the following example:

var value1 = function1(...);    // function1() may throw an exception
var value2 = function2(...);    // function2() may throw an exception
var value3 = function3(...);    // function3() may throw an exception
return new MyObject(value1, value2, value3);

At first glance, the conversion to PFJ style could be done exactly as for nested scopes. The visibility of each value would be the same as in the imperative code. Unfortunately, this results in deeply nested scopes, especially when many values need to be obtained.
For such cases, Option<T> and Result<T> provide a set of all() methods. These methods perform a "parallel" computation of all values and return a dedicated version of the MapperX<...> interface. This interface has only three methods: id(), map(), and flatMap(). The map() and flatMap() methods work exactly like the corresponding methods in Option<T> and Result<T>, except that they accept lambdas with a different number of parameters. Let's see how this works in practice and convert the imperative code above to PFJ style:

return Result.all(
          function1(...),
          function2(...),
          function3(...)
        ).map(MyObject::new);

Besides being compact and flat, this approach has some further advantages. First, it explicitly expresses the intent to compute all values before using them. Imperative code does this sequentially, hiding the original intent. Second, the computation of each value is isolated and does not bring unnecessary values into scope. This reduces the context required to understand and reason about each function call.
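A three-argument all() could be built roughly as sketched below. This is an assumption-laden illustration, not the library code: Result<T>, Cause, FN3, and Mapper3 are hypothetical minimal versions (Mapper3 has only map() here), and the Rgb example is invented.

```java
import java.util.function.Function;

interface Cause {
    String message();
}

// Assumed shape of a three-parameter function, with the result type listed first.
@FunctionalInterface
interface FN3<R, T1, T2, T3> {
    R apply(T1 p1, T2 p2, T3 p3);
}

// Simplified stand-in for the dedicated mapper interface; only map() is sketched.
interface Mapper3<T1, T2, T3> {
    <R> Result<R> map(FN3<R, T1, T2, T3> mapper);
}

final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    <R> Result<R> flatMap(Function<? super T, Result<R>> mapper) {
        if (cause != null) {
            return failure(cause);
        }
        return mapper.apply(value);
    }

    <R> Result<R> map(Function<? super T, ? extends R> mapper) {
        return flatMap(v -> success(mapper.apply(v)));
    }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }

    // All three results are already computed by the caller; the mapper runs only
    // if every one of them is a success. The first failure (in order) wins.
    static <T1, T2, T3> Mapper3<T1, T2, T3> all(Result<T1> r1, Result<T2> r2, Result<T3> r3) {
        return new Mapper3<T1, T2, T3>() {
            @Override
            public <R> Result<R> map(FN3<R, T1, T2, T3> mapper) {
                return r1.flatMap(v1 -> r2.flatMap(v2 -> r3.map(v3 -> mapper.apply(v1, v2, v3))));
            }
        };
    }
}

final class Rgb {
    final int red;
    final int green;
    final int blue;

    Rgb(int red, int green, int blue) { this.red = red; this.green = green; this.blue = blue; }

    String hex() { return String.format("#%02x%02x%02x", red, green, blue); }

    static Result<Integer> component(String text) {
        try {
            int value = Integer.parseInt(text);
            if (value < 0 || value > 255) {
                return Result.failure(() -> "Out of range: " + text);
            }
            return Result.success(value);
        } catch (NumberFormatException e) {
            return Result.failure(() -> "Not a number: " + text);
        }
    }

    // Flat and compact: no nesting even though three values are combined.
    static Result<Rgb> parse(String red, String green, String blue) {
        return Result.all(component(red), component(green), component(blue)).map(Rgb::new);
    }
}
```

Internally the sketch still nests flatMap() calls, but that nesting is written once inside all() instead of at every call site.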

Alternative scope

A less frequent but still important case is when we need to retrieve a value and, if it is not available, use an alternative source for it. Cases with more than one alternative are even less frequent, and they are more painful when error handling is involved.
Let's look at the following imperative code:

MyType value;

try {
    value = function1(...);
} catch (MyException e1) {
    try {
        value = function2(...);    
    } catch(MyException e2) {
        try {
            value = function3(...);
        } catch(MyException e3) {
            ... // repeat as many times as there are alternatives
        }
    }
}

The code is somewhat contrived, because the nested cases are usually hidden inside other methods. Nevertheless, the overall logic is far from simple, mainly because we need to handle errors in addition to choosing the value. Error handling clutters the code and buries the original intent, choosing the first available alternative, inside it.
Switching to PFJ style makes the intent very clear:

var value = Result.any(
        function1(...),
        function2(...),
        function3(...)
    );

Unfortunately, there is one important difference: the original imperative code evaluates the second and subsequent alternatives only when necessary. In some cases this is not a problem, but in many cases it is highly undesirable. Fortunately, Result.any() has a lazy version. Using it, we can rewrite the code as follows:

var value = Result.any(
        function1(...),
        () -> function2(...),
        () -> function3(...)
    );

The converted code now behaves exactly as its imperative counterpart.
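The lazy behavior can be demonstrated with a small sketch. The any() below is a guess at how such a method could be written, on top of a hypothetical minimal Result<T>; the Lookup class and its call counter exist only to make the laziness observable.

```java
import java.util.function.Function;
import java.util.function.Supplier;

interface Cause {
    String message();
}

// Hypothetical minimal Result<T>; only what this sketch needs.
final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    boolean isSuccess() { return cause == null; }

    <R> R fold(Function<? super Cause, ? extends R> onFailure,
               Function<? super T, ? extends R> onSuccess) {
        return cause != null ? onFailure.apply(cause) : onSuccess.apply(value);
    }

    // Lazy variant: each alternative is evaluated only if everything before it failed.
    // If all alternatives fail, the last failure is returned.
    @SafeVarargs
    static <T> Result<T> any(Result<T> first, Supplier<Result<T>>... alternatives) {
        Result<T> current = first;
        for (Supplier<Result<T>> alternative : alternatives) {
            if (current.isSuccess()) {
                return current;
            }
            current = alternative.get();
        }
        return current;
    }
}

final class Lookup {
    static int calls = 0; // counts how many sources were actually queried

    static Result<String> source(String name, boolean available) {
        calls++;
        if (!available) {
            return Result.failure(() -> name + " unavailable");
        }
        return Result.success(name);
    }

    static Result<String> firstAvailable() {
        return Result.any(
            source("cache", false),
            () -> source("database", true),
            () -> source("remote", true));
    }
}
```

In firstAvailable() the "remote" source is never queried, because "database" already succeeded.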

Brief technical overview of Option<T> and Result<T>

In functional programming terms, these two containers are monads.
Option<T> is a straightforward implementation of the Option/Optional/Maybe monad.
Result<T> is an intentionally simplified and specialized version of Either<L, R>: the left type is fixed and must implement the Cause interface. Specialization makes the API very similar to that of Option<T> and eliminates a lot of unnecessary typing, at the price of losing generality.
This particular implementation focuses on two things:

  • Interoperability with existing JDK classes such as Optional<T> and Stream<T>
  • API for explicit expression of intent

The last point deserves further explanation.
Each container has several core methods:

  • Factory method(s).
  • The map() transformation method, which transforms the value but does not change the special state: a present Option<T> remains present, and a success Result<T> remains a success.
  • The flatMap() transformation method, which, besides transforming the value, can also change the special state: it can turn a present Option<T> into an empty one or a success Result<T> into a failure.
  • The fold() method, which handles both cases (present/empty for Option<T> and success/failure for Result<T>) at once.

In addition to the core methods, there are a number of helper methods that are useful in frequently observed use cases.
Among them, one group of methods is explicitly designed to produce side effects.
Option<T> has the following methods for side effects:

Option<T> whenPresent(Consumer<? super T> consumer);
Option<T> whenEmpty(Runnable action);
Option<T> apply(Runnable emptyValConsumer, Consumer<? super T> nonEmptyValConsumer);

Result<T> has the following methods for side effects:

Result<T> onSuccess(Consumer<T> consumer);
Result<T> onSuccessDo(Runnable action);
Result<T> onFailure(Consumer<? super Cause> consumer);
Result<T> onFailureDo(Runnable action);
Result<T> apply(Consumer<? super Cause> failureConsumer, Consumer<? super T> successConsumer);

These methods hint to the reader that the code deals with side effects rather than transformations.
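The difference between transformation and side effect methods can be seen in the following sketch. The Result<T> here is a hypothetical minimal version that implements only onSuccess() and onFailure() from the list above (with a slightly widened Consumer<? super T> parameter), and the AuditDemo class is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

interface Cause {
    String message();
}

// Hypothetical minimal Result<T>; only two of the side effect methods are sketched.
final class Result<T> {
    private final T value;
    private final Cause cause; // null for a success

    private Result(T value, Cause cause) { this.value = value; this.cause = cause; }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(Cause cause) { return new Result<>(null, cause); }

    <R> Result<R> map(Function<? super T, ? extends R> mapper) {
        if (cause != null) {
            return failure(cause);
        }
        return success(mapper.apply(value));
    }

    // Runs the consumer for a success and returns the same instance for chaining.
    Result<T> onSuccess(Consumer<? super T> consumer) {
        if (cause == null) {
            consumer.accept(value);
        }
        return this;
    }

    // Runs the consumer for a failure and returns the same instance for chaining.
    Result<T> onFailure(Consumer<? super Cause> consumer) {
        if (cause != null) {
            consumer.accept(cause);
        }
        return this;
    }
}

final class AuditDemo {
    // map() changes the value; onSuccess()/onFailure() only observe it.
    static List<String> run(Result<Integer> result) {
        List<String> log = new ArrayList<>();
        result.map(value -> value * 2)
              .onSuccess(value -> log.add("stored " + value))
              .onFailure(cause -> log.add("failed: " + cause.message()));
        return log;
    }
}
```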

Other useful tools

In addition to Option<T> and Result<T>, PFJ uses some other general-purpose classes. Each of them is described in more detail below.

Functions

The JDK provides many useful functional interfaces. Unfortunately, the functional interfaces for general-purpose functions are limited to two versions: the single-parameter Function<T, R> and the two-parameter BiFunction<T, U, R>.
Obviously, this is not enough in many practical cases. In addition, for some reason, the type parameters of these interfaces are in the reverse order of how functions are declared in Java: the result type is listed last, whereas in a function declaration it comes first.
PFJ uses a consistent set of functional interfaces for functions with 1 to 9 parameters. For brevity, they are called FN1...FN9. So far, there have been no use cases for functions with more parameters (and this is usually a code smell). However, the list can be extended further if necessary.

Tuples

Tuples are special containers that can be used to store several values of different types in a single variable. Unlike classes or records, the values stored inside have no names. This makes them an indispensable tool for capturing an arbitrary set of values while preserving their types. A good example of this use case is the implementation of the Result.all() and Option.all() sets of methods.
In a sense, a tuple can be considered a frozen set of parameters prepared for a function call. From this perspective, the decision to make the internal values of a tuple accessible only through the map() method sounds reasonable. However, the tuple with 2 parameters has additional accessors, so that Tuple2<T1, T2> can be used as a replacement for various Pair<T1, T2> implementations.
PFJ uses a consistent set of tuple implementations with 0 to 9 values. Tuples with 0 and 1 values are provided for consistency.
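The "frozen set of parameters" view can be sketched in a few lines. Both FN3 and Tuple3 below are hypothetical shapes assumed for this example (result type first in FN3, values reachable only through map()); the actual PFJ interfaces may differ in detail.

```java
// Assumed shape of a three-parameter function, with the result type listed first.
@FunctionalInterface
interface FN3<R, T1, T2, T3> {
    R apply(T1 p1, T2 p2, T3 p3);
}

// Hypothetical three-value tuple: the values are only reachable through map().
interface Tuple3<T1, T2, T3> {
    <R> R map(FN3<R, T1, T2, T3> mapper);

    static <T1, T2, T3> Tuple3<T1, T2, T3> tuple(T1 v1, T2 v2, T3 v3) {
        return new Tuple3<T1, T2, T3>() {
            @Override
            public <R> R map(FN3<R, T1, T2, T3> mapper) {
                // The captured values act as frozen arguments for the function call.
                return mapper.apply(v1, v2, v3);
            }
        };
    }
}

final class TupleDemo {
    static String describe() {
        Tuple3<String, Integer, Boolean> row = Tuple3.tuple("Alice", 30, true);

        // The lambda gives the unnamed values local names at the point of use.
        return row.map((name, age, active) ->
            name + ", " + age + (active ? ", active" : ", inactive"));
    }
}
```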

Conclusion

Practical functional Java is a modern, very concise yet readable Java coding style based on functional programming concepts. Compared with the traditional idiomatic Java coding style, it provides a number of benefits:

  • PFJ involves the Java compiler to help write reliable code:

    • Code that compiles usually works
    • Many errors move from runtime to compile time
    • Some classes of errors, such as NullPointerException or unhandled exceptions, are virtually eliminated
  • PFJ significantly reduces the amount of boilerplate code related to error propagation and handling, as well as null checks
  • PFJ focuses on clear expression of intent and on reducing mental overhead

Posted by tbales on Fri, 05 Nov 2021 16:05:27 -0700