Pitfalls of traversing with parallelStream and how to avoid problems in parallel operations

Keywords: Java Programming

Java 8 has been around for a long time, and Java is now at version 12.

At my previous company, lambdas were recommended for traversing collections when writing code.

For example:

        List<Integer>  list  =  new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        list.stream().forEach(value -> {
            System.out.println(value);
        });

This is about as efficient as a traditional for-each or indexed for loop: if you open the source of the forEach method, you will find that the collection is ultimately traversed with an ordinary for loop internally.

So there is no real difference in efficiency between the two. Of course, programming habits differ from company to company, and some people prefer the traditional for loop.

Because the traversal above does not improve efficiency, there is another method that can: parallelStream()

        List<Integer>  list  =  new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        list.parallelStream().forEach(value -> {
            System.out.println(value);
        });

The traversal above runs in parallel: the elements are processed concurrently on several threads.

Because the work is done in parallel, the elements are visited out of order. The corresponding advantage is that the traversal can be faster, so it is a good fit when the result does not depend on ordering and the amount of data is large.
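
If the order does matter, one option is forEachOrdered, which still runs on a parallel stream but performs the action one element at a time in the list's own order. The following is a minimal sketch; the class and method names are made up for illustration and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical class and method names, chosen for this illustration only.
class OrderedTraversalDemo {

    // Builds the list 0..n-1, like the loops in the examples above.
    static List<Integer> numbers(int n) {
        return IntStream.range(0, n).boxed().collect(Collectors.toList());
    }

    // Traverses a parallel stream but receives the elements in encounter order.
    static List<Integer> visitInOrder(List<Integer> source) {
        List<Integer> seen = new ArrayList<>();
        // forEachOrdered processes one element at a time, in order,
        // so a plain ArrayList is safe to use as the target here.
        source.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> list = numbers(1000);
        // The elements arrive in the same order as in the source list.
        System.out.println(visitInOrder(list).equals(list)); // prints true
    }
}
```

The trade-off is that keeping the order gives up much of the benefit of running in parallel, so this is mainly useful when the per-element action itself must be ordered.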

However, there are trade-offs. Because the work is spread across several threads, thread safety has to be considered: is the result really what you want?

Run the following example:

import java.util.ArrayList;
import java.util.List;

// Class name chosen for illustration
public class ParallelStreamLossDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        System.out.println("Length of the initial collection: " + list.size());
        // Collecting into a plain ArrayList from a parallel stream can lose data
        for (int i = 0; i < 10; i++) {
            List<Integer> parseList = new ArrayList<>();
            list.parallelStream().forEach(integer -> {
                parseList.add(integer);
            });
            System.out.println("Collection length after each traversal: " + parseList.size());
        }
    }
}

I first created a collection of 1000 elements and then traversed it several times. However, the collection produced by a traversal was sometimes missing a few elements, and over many runs an array index out of bounds exception was occasionally thrown.

Because I had run into these two problems with parallelStream before, I always thought parallelStream itself was not safe enough to use. While putting this post together today, I suddenly realized that the real problem is that the list receiving the traversal results is not thread-safe.

If you only traverse and check the output, each traversal still visits every element, consistent with the original collection.

So the blame can only fall on the list that receives the data.

The fix is to wrap that list in a synchronized wrapper:

List<Integer> synchronizedList = Collections.synchronizedList(parseList);

Here is the modified code:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Class name chosen for illustration
public class ParallelStreamSynchronizedDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        System.out.println("Length of the initial collection: " + list.size());
        for (int i = 0; i < 10; i++) {
            List<Integer> parseList = new ArrayList<>();
            // The synchronized wrapper makes concurrent add() calls safe
            List<Integer> synchronizedList = Collections.synchronizedList(parseList);
            list.parallelStream().forEach(integer -> {
                synchronizedList.add(integer);
            });
            System.out.println("Collection length after each traversal: " + synchronizedList.size());
        }
    }
}

With this change, every traversal produces the same result (1000 elements each time), and for large workloads the parallel traversal can still be faster than a sequential one.
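
As an alternative to wrapping the target list, the stream can build the result itself with collect, which avoids sharing a mutable list between threads at all. This is only a sketch of that alternative, not the approach used above, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class and method names, chosen for this illustration only.
class CollectDemo {

    // Copies the source through a parallel stream without any shared mutable list.
    static List<Integer> copyInParallel(List<Integer> source) {
        // Each worker thread fills its own partial list; the partial lists
        // are then merged in the source's encounter order.
        return source.parallelStream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        for (int i = 0; i < 10; i++) {
            // Prints 1000 on every iteration.
            System.out.println(copyInParallel(list).size());
        }
    }
}
```

Unlike the forEach version, the collected list also keeps the elements in their original order.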

In the same way, other collections such as sets and maps can be wrapped in their thread-safe counterparts, which avoids new bugs where the code clearly looks right but the results are not what you want.
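
A minimal sketch of the same wrapping idea for a map and a set, using Collections.synchronizedMap and Collections.synchronizedSet. The class name, method names, and the values stored are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class and method names, chosen for this illustration only.
class SynchronizedMapDemo {

    // Fills a map (value -> value squared) from a parallel stream.
    static Map<Integer, Integer> squares(List<Integer> source) {
        // A plain HashMap is not thread-safe; the wrapper synchronizes every put().
        Map<Integer, Integer> result = Collections.synchronizedMap(new HashMap<>());
        source.parallelStream().forEach(value -> result.put(value, value * value));
        return result;
    }

    // Collects the last digit of each value into a set from a parallel stream.
    static Set<Integer> lastDigits(List<Integer> source) {
        // The same idea for a set: every add() goes through the synchronized wrapper.
        Set<Integer> result = Collections.synchronizedSet(new HashSet<>());
        source.parallelStream().forEach(value -> result.add(value % 10));
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        System.out.println(squares(list).size());    // prints 1000
        System.out.println(lastDigits(list).size()); // prints 10
    }
}
```

java.util.concurrent.ConcurrentHashMap is another thread-safe map that could be used here instead of the wrapper.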

Likewise, when appending to a string inside a parallel stream, use StringBuffer rather than StringBuilder, because the former is thread-safe.
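
A short sketch of the StringBuffer case. The class and method names are hypothetical, and the example appends one digit per element only so that the expected length is easy to check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class and method names, chosen for this illustration only.
class StringBufferDemo {

    // Appends the last digit of every element to one shared buffer.
    static String appendLastDigits(List<Integer> source) {
        // StringBuffer synchronizes each append(), so no characters are lost.
        // The order of the characters is still not guaranteed.
        StringBuffer buffer = new StringBuffer();
        source.parallelStream().forEach(value -> buffer.append(value % 10));
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int j = 0; j < 1000; j++) {
            list.add(j);
        }
        // One character per element, so the length matches the list size.
        System.out.println(appendLastDigits(list).length()); // prints 1000
    }
}
```

With a StringBuilder in the same position, the length could come out shorter than 1000 or an exception could be thrown, for the same reason as with the plain ArrayList above.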

Questions and additions are welcome.

Posted by bulldorc on Fri, 06 Sep 2019 00:49:31 -0700