What Java 8 brought me: the concepts of streams and collectors

Keywords: Java

I still do not dare to give a precise definition of a stream. Conceptually it ought to be a sequence of data elements, but in the earlier code examples it looks more like a combination of data-processing operations. That makes a formal definition hard for me to pin down, so I will not offer one; readers can form their own understanding.
Before streams existed, an explicit iterator was usually used to process the data in a collection. Let's reuse the student example from before. The goal is to get the names of the first two students whose score is greater than 5.

package com.aomi;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import static java.util.stream.Collectors.toList;

public class Main {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        List<Student> stus = getSources();

        Iterator<Student> ite = stus.iterator();

        List<String> names = new ArrayList<>();
        int limit = 2;
        while (ite.hasNext() && limit > 0) {

            Student stu = ite.next();

            if (stu.getScore() > 5) {

                names.add(stu.getName());
                limit--;
            }
        }

        for (String name : names) {
            System.out.println(name);
        }

    }

    public static List<Student> getSources() {
        List<Student> students = new ArrayList<>();

        Student stu1 = new Student();

        stu1.setName("lucy");
        stu1.setSex(0);
        stu1.setPhone("13700227892");
        stu1.setScore(9);

        Student stu2 = new Student();
        stu2.setName("lin");
        stu2.setSex(1);
        stu2.setPhone("15700227122");
        stu2.setScore(9);

        Student stu3 = new Student();
        stu3.setName("lili");
        stu3.setSex(0);
        stu3.setPhone("18500227892");
        stu3.setScore(8);

        Student stu4 = new Student();

        stu4.setName("dark");
        stu4.setSex(1);
        stu4.setPhone("16700555892");
        stu4.setScore(6);

        students.add(stu1);
        students.add(stu2);
        students.add(stu3);
        students.add(stu4);

        return students;
    }

}

Here is the same task written with a stream.

public static void main(String[] args) {
        // TODO Auto-generated method stub

        List<Student> stus = getSources();

        List<String> names = stus.stream()
                .filter(st -> st.getScore() > 5)
                .limit(2)
                .map(st -> st.getName())
                .collect(toList());
        for (String name : names) {
            System.out.println(name);
        }

    }

The point of comparing these two pieces of code is to illustrate one concept. The earlier approach is external iteration, which shows most clearly in the fact that the author has to declare the names collection outside the loop and fill it by hand. In the stream version the iteration is internal: the stream iterates for you, and all we need to do is pass in the relevant functions to get the desired result. As for the benefits of internal iteration, I cannot feel much personally, other than the code becoming simpler and clearer. The official line, though, is that the Stream library can optimize internal iteration for us, for example by running operations in parallel. So I will take their word for it.

While using the stream we called several methods, such as limit and filter. These are called stream operations. Whatever the operation, there must be a data source. To summarize:

  • Data source: the data used to generate the stream, such as a collection.
  • Stream operation: methods such as limit and filter.

Streams have another characteristic: some stream operations are not executed right away. Usually nothing runs until a method such as collect is called, and only then is each function executed. So the stream operations can be subdivided:

  • Data source: the data used to generate the stream, such as a collection.
  • Intermediate operation: methods such as limit and filter. These operations are chained together, which gives the stream a pipeline flavor.
  • Terminal operation: executes the chain of operations above, for example the collect function.
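The deferred execution described above can be observed directly. The following is a minimal sketch, not code from the original article: the class name LazyStreamDemo, the score values, and the log messages are all made up for illustration. It records when the filter lambda actually runs relative to the moment the pipeline is built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {

    // Returns a log showing the order in which things actually happen.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        List<Integer> scores = Arrays.asList(9, 3, 8, 6); // hypothetical data

        // Only intermediate operations here, so the filter lambda does not run yet.
        Stream<Integer> pipeline = scores.stream()
                .filter(s -> {
                    log.add("filter " + s);
                    return s > 5;
                })
                .limit(2);
        log.add("pipeline built");

        // The terminal operation triggers the whole chain.
        List<Integer> result = pipeline.collect(Collectors.toList());
        log.add("result = " + result);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first log entry is "pipeline built": no "filter ..." entry appears before it, because nothing is evaluated until collect is called.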

From the explanation above, a stream first collects the operations you want to perform. In other words, you plan what you want to do first and then give a final command to carry it out. In our example that command is the collect function. This is very similar to LINQ in .NET. Also remember that a stream can be executed only once: after a terminal operation has run, the stream cannot be used again.
Here are the functions I have used so far.

forEach: terminal
collect: terminal
count: terminal
limit: intermediate
filter: intermediate
map: intermediate
sorted: intermediate
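The "only once" rule mentioned above is easy to check. This is a small sketch under assumed names (SingleUseStreamDemo and the word list are hypothetical): it runs one terminal operation and then tries a second one on the same stream. In the JDK implementation the second call is rejected with an IllegalStateException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class SingleUseStreamDemo {

    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean secondUseFails() {
        List<String> words = Arrays.asList("I", "am", "aomi"); // hypothetical data
        Stream<String> stream = words.stream();

        stream.forEach(w -> { });   // first terminal operation consumes the stream

        try {
            stream.count();          // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;             // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        System.out.println("second use rejected: " + secondUseFails());
    }
}
```

To process the data again, ask the data source for a fresh stream (for example by calling words.stream() a second time).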

So far we have always built streams from collections, and I have not talked about any other way. Now let's look at some other ways to build a stream.
The Stream interface has a static method called of.

package com.aomi;

import java.util.Optional;
import java.util.stream.Stream;

public class Main {

	public static void main(String[] args) {
		// TODO Auto-generated method stub

		Stream stream = Stream.of("I", "am", "aomi");

		Optional<String> firstWord = stream.findFirst();
		
		if(firstWord.isPresent())
		{
			System.out.println("First word:"+firstWord.get());
		}
	}

}

Output:

First word:I

Look at the source of the of method:

public static<T> Stream<T> of(T... values) {
        return Arrays.stream(values);
    }

This shows that we can specify a type when building the stream. The declaration above can be changed to:

Stream<String> stream = Stream.of("I", "am", "aomi");

The findFirst function returns the first value. But the data source might be empty, in which case there is no first value to return. For that reason findFirst returns an Optional, a class that represents a value that may be absent. We can then use the methods of Optional to handle the result safely, for example checking whether a value exists with isPresent().
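To make the empty case concrete, here is a minimal sketch (the class name EmptyStreamDemo, the helper firstOrDefault, and the placeholder text "(none)" are assumptions for illustration). It calls findFirst on both a non-empty and an empty stream and uses Optional.orElse to supply a fallback instead of checking isPresent() by hand.

```java
import java.util.Optional;
import java.util.stream.Stream;

class EmptyStreamDemo {

    // Returns the first word, or a placeholder when the stream has no elements.
    static String firstOrDefault(Stream<String> words) {
        Optional<String> first = words.findFirst();
        return first.orElse("(none)");
    }

    public static void main(String[] args) {
        System.out.println(firstOrDefault(Stream.of("I", "am", "aomi")));
        System.out.println(firstOrDefault(Stream.empty()));
    }
}
```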
Next I wanted to build a stream of int values so that I could try the code above, but the compiler reported an error.

What if I change int to Integer? No problem. Generic type parameters must be reference types, so be careful to use the wrapper type. For int, that is Integer.

package com.aomi;

import java.util.Optional;
import java.util.stream.Stream;

public class Main {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        Stream<Integer> stream = Stream.of(1, 2, 9);

        Optional<Integer> firstWord = stream.findFirst();

        if(firstWord.isPresent())
        {
            System.out.println("First word:"+firstWord.get());
        }

    }

}

Output:

First word:1

What if you do want to work with the int type directly? Change the code to use IntStream.

package com.aomi;

import java.util.OptionalInt;
import java.util.stream.IntStream;

public class Main {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        IntStream stream = IntStream.of(1, 2, 9);

        OptionalInt firstWord = stream.findFirst();

        if(firstWord.isPresent())
        {
            System.out.println("First word:"+firstWord.getAsInt());
        }

    }

}

Output:

First word:1

From the example above we can guess: for the double type, is it enough to switch to DoubleStream? Let's try.

package com.aomi;

import java.util.OptionalDouble;
import java.util.stream.DoubleStream;

public class Main {

    public static void main(String[] args) {
        // TODO Auto-generated method stub

        DoubleStream stream = DoubleStream.of(1.3, 2.3, 9.5);

        OptionalDouble firstWord = stream.findFirst();

        if(firstWord.isPresent())
        {
            System.out.println("First word:"+firstWord.getAsDouble());
        }

    }

}

Output:

First word:1.3

Our guess was right. So if the stream you are working with holds int or double values, build it with the matching XxxStream (IntStream, DoubleStream) where possible. That way no boxing and unboxing happens while the stream runs, which helps performance. Let's see how we can generate streams if the data source is an array.
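For an array data source, one option is the Arrays.stream method from the standard library. The sketch below is illustrative only: the class name ArrayStreamDemo, the helper methods, and the sample values are assumptions, not code from the original article. Note that Arrays.stream on an int[] yields an IntStream, so the primitive values are not boxed.

```java
import java.util.Arrays;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ArrayStreamDemo {

    // int[] becomes an IntStream, so no Integer objects are created.
    static int sumOfScores(int[] scores) {
        IntStream stream = Arrays.stream(scores);
        return stream.sum();
    }

    // An array of objects becomes an ordinary Stream<T>.
    static long countLongWords(String[] words) {
        Stream<String> stream = Arrays.stream(words);
        return stream.filter(w -> w.length() > 2).count();
    }

    public static void main(String[] args) {
        System.out.println(sumOfScores(new int[] {9, 9, 8, 6}));
        System.out.println(countLongWords(new String[] {"I", "am", "aomi"}));
    }
}
```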

Here is the source of the joining method of the Collectors class:

public static Collector<CharSequence, ?, String> joining(CharSequence delimiter,
                                                         CharSequence prefix,
                                                         CharSequence suffix) {
    return new CollectorImpl<>(
            () -> new StringJoiner(delimiter, prefix, suffix),
            StringJoiner::add, StringJoiner::merge,
            StringJoiner::toString, CH_NOID);
}
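As a quick illustration of how the three-argument joining collector is used, here is a minimal sketch. The class name JoiningDemo, the helper bracketed, and the chosen delimiter, prefix and suffix are assumptions made for the example.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JoiningDemo {

    // Joins the words with ", " between them and wraps the result in brackets.
    static String bracketed(Stream<String> words) {
        return words.collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Stream.of("I", "am", "aomi"))); // prints [I, am, aomi]
    }
}
```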

Now look at the code of the toList function.

public static <T>
Collector<T, ?, List<T>> toList() {
    return new CollectorImpl<>((Supplier<List<T>>) ArrayList::new, List::add,
                               (left, right) -> { left.addAll(right); return left; },
                               CH_ID);
}

We can see that both of them return a Collector. From the above, its job is to process the final data, which is why it is called a collector. Let's look at the code of the Collector interface.

public interface Collector<T, A, R> {

    Supplier<A> supplier();

    BiConsumer<A, T> accumulator();

    BinaryOperator<A> combiner();

    Function<A, R> finisher();

    Set<Characteristics> characteristics();
}

Look at the first four methods: do they feel familiar? To explain the role of these five methods we first need to understand the concept of parallel reduction. As I said earlier, a stream iterates internally, and the Stream library does a lot of optimization for us. One of these optimizations is parallelism, which uses the fork/join framework introduced in Java 7. That is to say, the stream is split recursively into many sub-streams, and the sub-streams are then processed in parallel. Finally, the results of the sub-streams are merged pairwise into one final result. This merging is the reduction. It is shown in the figure below, quoted from Java 8 in Action.

 

 

Let's follow the figure. Two methods of the Collector interface are called while each sub-stream is processed.

  • supplier method: creates the container in which the data is stored.
  • accumulator method: called for every item as the sub-stream is traversed, so this is where per-item work goes.

After the sub-streams finish, their results are merged. This is the merge stage in the figure.

  • combiner method: receives the result containers of two sub-streams and merges them, so we can do some work here.
  • finisher method: converts the fully merged container into the final result, where you can further process it. It is applied once at the end, not once per sub-stream.
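The division of labor between these four methods can be imitated by hand. The sketch below is only a simplified model of a parallel reduction, not how the Stream library is actually implemented: it splits the data into two fixed parts, runs them one after the other instead of on separate threads, and the names ManualReductionDemo, accumulate and collectInTwoParts are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class ManualReductionDemo {

    // Sub-stream phase: create a container (supplier) and feed every item to it (accumulator).
    static <T, A> A accumulate(Collector<T, A, ?> collector, List<? extends T> part) {
        A container = collector.supplier().get();
        BiConsumer<A, T> accumulator = collector.accumulator();
        for (T item : part) {
            accumulator.accept(container, item);
        }
        return container;
    }

    // Merge phase: combine the two partial containers, then apply the finisher once.
    static <T, A, R> R collectInTwoParts(Collector<T, A, R> collector,
                                         List<? extends T> left,
                                         List<? extends T> right) {
        A leftResult = accumulate(collector, left);
        A rightResult = accumulate(collector, right);
        A merged = collector.combiner().apply(leftResult, rightResult);
        return collector.finisher().apply(merged);
    }

    public static void main(String[] args) {
        String joined = collectInTwoParts(Collectors.joining("-"),
                Arrays.asList("I", "am"), Arrays.asList("aomi"));
        System.out.println(joined); // prints I-am-aomi
    }
}
```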

That leaves the characteristics method, and it is not there for nothing. It declares which optimizations the collector allows, so that when the stream is executed the library knows how it may run the reduction, for example in parallel.
Characteristics is an enum. Its values are as follows.

  • UNORDERED: the result of the reduction is not affected by the order in which the items are traversed and accumulated.
  • CONCURRENT: the accumulator method can be called from multiple threads on the same result container, so the collector can reduce in parallel without merging separate containers. If the collector is not also marked UNORDERED, this applies only when the data source itself is unordered.
  • IDENTITY_FINISH: the finisher is the identity function, meaning the accumulated container is already the final result, so the finisher call can be skipped.
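The characteristics of the built-in collectors can be inspected at run time. The sketch below simply prints what the characteristics method reports for a few collectors from the Collectors class; the class name CharacteristicsDemo and the helper flagsOf are hypothetical. The exact sets depend on the JDK implementation, although the documentation does describe toSet as unordered and toConcurrentMap as concurrent and unordered.

```java
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;
import java.util.stream.Collectors;

class CharacteristicsDemo {

    // Returns the flags a collector declares about itself.
    static Set<Characteristics> flagsOf(Collector<?, ?, ?> collector) {
        return collector.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toList:          " + flagsOf(Collectors.toList()));
        System.out.println("toSet:           " + flagsOf(Collectors.toSet()));
        System.out.println("joining:         " + flagsOf(Collectors.joining()));
        System.out.println("toConcurrentMap: " + flagsOf(
                Collectors.<String, String, Integer>toConcurrentMap(w -> w, String::length)));
    }
}
```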

With the explanation above, let's write our own Collector, one that removes duplicate words.
DistinctWordCollector class:

package com.aomi;

import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class DistinctWordCollector implements Collector<String, List<String>, List<String>> {

    /**
     * Creates the container that holds the words seen so far.
     */
    @Override
    public Supplier<List<String>> supplier() {
        return () -> new ArrayList<String>();
    }

    /**
     * Called for each item of a sub-stream: adds the word only if it is new.
     */
    @Override
    public BiConsumer<List<String>, String> accumulator() {
        return (List<String> src, String val) -> {

            if (!src.contains(val)) {
                src.add(val);
            }
        };
    }

    /**
     * Merges the results of two sub-streams, again skipping duplicates.
     */
    @Override
    public BinaryOperator<List<String>> combiner() {
        return (List<String> src1, List<String> src2) -> {
            for (String val : src2) {
                if (!src1.contains(val)) {
                    src1.add(val);
                }
            }
            return src1;
        };
    }

    /**
     * The container is already the final result, so nothing needs converting.
     */
    @Override
    public Function<List<String>, List<String>> finisher() {
        return Function.identity();
    }

    @Override
    public Set<Characteristics> characteristics() {
        // CONCURRENT is not declared: ArrayList is not thread-safe, so the
        // accumulator must not be called on one shared list from several threads.
        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));
    }

}
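The original listing does not show the calling code, so here is a self-contained sketch of how such a collector might be used. To keep the block standalone it rebuilds an equivalent collector with Collector.of instead of referencing the class above; the class name DistinctWordsDemo, the helper names, and the input words are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class DistinctWordsDemo {

    // Same logic as the hand-written collector, assembled with Collector.of.
    // This overload of Collector.of implies an identity finisher.
    static Collector<String, List<String>, List<String>> distinctWords() {
        return Collector.of(
                ArrayList::new,
                (list, word) -> {
                    if (!list.contains(word)) {
                        list.add(word);
                    }
                },
                (left, right) -> {
                    for (String word : right) {
                        if (!left.contains(word)) {
                            left.add(word);
                        }
                    }
                    return left;
                });
    }

    static List<String> distinct(Stream<String> words) {
        return words.collect(distinctWords());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("I", "am", "aomi", "aomi"); // hypothetical input
        System.out.println(distinct(input.stream())); // prints [I, am, aomi]
    }
}
```

With the class from the article, the call would be stream.collect(new DistinctWordCollector()) instead of stream.collect(distinctWords()).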

Running it gives exactly the result we want: the duplicate aomi is eliminated.

Posted by devork on Thu, 02 May 2019 02:50:37 -0700