Two Types of Java Concurrency: Computing-intensive and IO-intensive

Keywords: Java JDK Programming network

Computing-intensive and IO-intensive workloads are the two typical cases in Java concurrent programming. Here I (Pineapple Elephant) share my own understanding of them. This article is fairly basic and only suitable for readers who are just getting started, so please go easy on it.
Computing-intensive
A computing-intensive application, as the name implies, needs a lot of CPU computing resources. In the multi-core era, we should let every CPU core take part in the computation and make full use of the CPU's performance; only then is the server's hardware not wasted. Running a single-threaded program on a well-configured server is a great waste. A computing-intensive application relies entirely on the CPU cores to do its work, so to get the most out of them while avoiding excessive thread context switching, the ideal setting is:
Threads = CPU core + 1
It can also be set to CPU core * 2, depending on the JDK version and the CPU configuration (for example, whether the server CPU has hyper-threading). JDK 1.8 added support for parallel computation, and for it the ideal number of threads for computing-intensive work is CPU core * 2.
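To see the rule on a task that uses nothing but the CPU, here is a minimal sketch that counts primes by trial division on a fixed pool of CPU core + 1 threads. The class `PrimeCount` and its methods are made up for this illustration, and it assumes Java 8 or later for the lambda syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: pure CPU work spread over a fixed thread pool.
class PrimeCount {

    // Trial division: deliberately CPU-heavy and free of any IO.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts the primes in [from, to) on the calling thread.
    static int countRange(int from, int to) {
        int count = 0;
        for (int n = from; n < to; n++) {
            if (isPrime(n)) count++;
        }
        return count;
    }

    // Splits [0, limit) into one chunk per thread and sums the partial counts.
    static int countPrimes(int limit, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            int chunk = (limit + poolSize - 1) / poolSize; // ceiling division
            List<Future<Integer>> parts = new ArrayList<>();
            for (int start = 0; start < limit; start += chunk) {
                final int from = start;
                final int to = Math.min(limit, start + chunk);
                parts.add(pool.submit(() -> countRange(from, to)));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("prime counting failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int poolSize = cores + 1; // the rule of thumb from the text
        System.out.println("Pool size: " + poolSize);
        System.out.println("Primes below 1,000,000: " + countPrimes(1_000_000, poolSize));
    }
}
```

Note that `availableProcessors()` reports logical processors, so on a hyper-threaded CPU it may already be twice the number of physical cores.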
Computing the size of a folder is a typical example. The code below is simple, so I won't explain it in detail.

import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/**
 * Calculate folder size
 * @author Pineapple elephant
 */
public class FileSizeCalc {

    static class SubDirsAndSize {
        public final long size;
        public final List<File> subDirs;

        public SubDirsAndSize(long size, List<File> subDirs) {
            this.size = size;
            this.subDirs = Collections.unmodifiableList(subDirs);
        }
    }
    
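    // Sums the sizes of the files directly inside the given directory and
    // collects its subdirectories so the caller can scan them in the next round.
    private SubDirsAndSize getSubDirsAndSize(File file) {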
        long total = 0;
        List<File> subDirs = new ArrayList<File>();
        if (file.isDirectory()) {
            File[] children = file.listFiles();
            if (children != null) {
                for (File child : children) {
                    if (child.isFile())
                        total += child.length();
                    else
                        subDirs.add(child);
                }
            }
        }
        return new SubDirsAndSize(total, subDirs);
    }
    
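    // Walks the directory tree level by level, scanning each level's directories in parallel.
    private long getFileSize(File file) throws Exception{
        final int cpuCore = Runtime.getRuntime().availableProcessors();
        final int poolSize = cpuCore+1; // Threads = CPU core + 1
        ExecutorService service = Executors.newFixedThreadPool(poolSize);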
        long total = 0;
        List<File> directories = new ArrayList<File>();
        directories.add(file);
        SubDirsAndSize subDirsAndSize = null;
        try{
            while(!directories.isEmpty()){
                List<Future<SubDirsAndSize>> partialResults= new ArrayList<Future<SubDirsAndSize>>();
                for(final File directory : directories){
                    partialResults.add(service.submit(new Callable<SubDirsAndSize>(){
                        @Override
                        public SubDirsAndSize call() throws Exception {
                            return getSubDirsAndSize(directory);
                        }
                    }));
                }
                // Gather this level's results; the subdirectories found become the next level.
                directories.clear();
                for(Future<SubDirsAndSize> partialResultFuture : partialResults){
                    subDirsAndSize = partialResultFuture.get(100,TimeUnit.SECONDS);
                    total += subDirsAndSize.size;
                    directories.addAll(subDirsAndSize.subDirs);
                }
            }
            return total;
        } finally {
            service.shutdown();
        }
    }
    
    public static void main(String[] args) throws Exception {
        for(int i=0;i<10;i++){
            final long start = System.currentTimeMillis();
            long total = new FileSizeCalc().getFileSize(new File("e:/m2"));
            final long end = System.currentTimeMillis();
            System.out.format("Folder size: %dMB%n" , total/(1024*1024));
            System.out.format("Time spent: %.3fs%n" , (end - start)/1.0e3);
        }
    }
}


After 10 executions, the results were as follows: [output not preserved in this copy]

In the example above, the thread pool is set to CPU core + 1, and the results come from my work computer (CPU: G630, memory: 4 GB, JDK 1.7.0_51). If you enlarge the thread pool, for example to 100, you will find that it takes more time. In my test the longest run took 0.297 seconds, which is 0.079 seconds (79 milliseconds) more than the earlier minimum of 0.218 seconds. Of course, this extra time seems like nothing to us, only a few dozen milliseconds, but for the CPU it is quite long, because the CPU works on a nanosecond scale and 1 millisecond = 1,000,000 nanoseconds. So enlarging the thread pool increases the cost of CPU context switching, and sometimes program optimization is built up from small details like this.
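If you want to repeat this kind of comparison on your own machine, the following sketch times the same CPU-only work on two pool sizes. `PoolSizeBenchmark`, `busyWork`, and the task counts are hypothetical names and values chosen for illustration. The numbers it prints depend on the hardware and the JVM, and there is no warm-up phase, so treat them as rough indications rather than a proper benchmark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical example class: times identical CPU-bound tasks on pools of different sizes.
class PoolSizeBenchmark {

    // Pure arithmetic with no IO: sums (i * i) % 7 for i = 1..iterations.
    static long busyWork(int iterations) {
        long acc = 0;
        for (int i = 1; i <= iterations; i++) {
            acc += ((long) i * i) % 7;
        }
        return acc;
    }

    // Runs `tasks` copies of busyWork on a pool of the given size
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeWithPool(int poolSize, int tasks, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                jobs.add(() -> busyWork(iterations));
            }
            long start = System.nanoTime();
            for (Future<Long> result : pool.invokeAll(jobs)) {
                result.get(); // wait for every task to finish
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("benchmark run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int tasks = 200;            // assumed workload size
        int iterations = 2_000_000; // assumed work per task
        System.out.println("cores + 1 threads: " + timeWithPool(cores + 1, tasks, iterations) + " ms");
        System.out.println("100 threads:       " + timeWithPool(100, tasks, iterations) + " ms");
    }
}
```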
IO-intensive
IO-intensive applications are easy to understand: most of what we develop today is web applications, which involve a large amount of network transmission. Beyond that, interacting with databases and caches also involves IO. Once IO occurs, the thread enters a waiting state, and it continues executing only when the IO finishes and the data is ready. So for IO-intensive applications we can put more threads in the thread pool: while some threads wait on IO, others can do useful work, which improves the efficiency of concurrent processing.
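The effect can be simulated without any real network. In this rough sketch, `Thread.sleep` stands in for a blocking IO call, and the same 20 tasks run first on a small pool and then on a larger one. The class name `BlockingDemo`, the 50 ms delay, and the pool sizes are assumptions made for the demo, not measurements from a real application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical example class: simulated IO tasks on pools of different sizes.
class BlockingDemo {

    // Runs `tasks` simulated IO calls on a pool of the given size
    // and returns the elapsed wall-clock time in milliseconds.
    static long runBlockingTasks(int poolSize, int tasks, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Void>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                jobs.add(() -> {
                    Thread.sleep(blockMillis); // stands in for a network or database call
                    return null;
                });
            }
            long start = System.nanoTime();
            for (Future<Void> done : pool.invokeAll(jobs)) {
                done.get(); // wait for every task to finish
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulated IO run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // With 2 threads the 20 waits queue up behind each other;
        // with 20 threads they overlap.
        System.out.println("2 threads:  " + runBlockingTasks(2, 20, 50) + " ms");
        System.out.println("20 threads: " + runBlockingTasks(20, 20, 50) + " ms");
    }
}
```

Because the threads spend almost all of their time waiting rather than computing, the larger pool should finish much sooner even on a machine with only a couple of cores.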
Can the number of threads in this pool be set arbitrarily? Of course not. Remember that thread context switching comes at a cost. For IO-intensive applications, the following formula has been summarized:
Threads = CPU core /(1-blocking factor)
The blocking factor is generally between 0.8 and 0.9; in practice you can simply take 0.8 or 0.9. With 0.9 on a dual-core CPU, the ideal number of threads is 2 / (1 - 0.9) = 20. Of course, this is not absolute and needs to be adjusted to the actual situation and the actual workload.
    final int poolSize = (int)(cpuCore/(1-0.9));
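Wrapped in a small helper, the formula might look like the sketch below. `IoPoolSizing` and `ioPoolSize` are hypothetical names, and the blocking factor is an assumption you pass in and tune, not something the JVM can report. The helper rounds with `Math.round` instead of truncating, so that a floating-point result landing slightly below a whole number is not cut down by one.

```java
// Hypothetical helper: Threads = CPU core / (1 - blocking factor).
class IoPoolSizing {

    // cores: number of CPU cores (at least 1)
    // blockingFactor: assumed fraction of time a task spends blocked, in [0, 1)
    static int ioPoolSize(int cores, double blockingFactor) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be at least 1");
        }
        if (!(blockingFactor >= 0.0 && blockingFactor < 1.0)) {
            throw new IllegalArgumentException("blockingFactor must be in [0, 1)");
        }
        return (int) Math.round(cores / (1.0 - blockingFactor));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores);
        System.out.println("pool size at 0.8: " + ioPoolSize(cores, 0.8));
        System.out.println("pool size at 0.9: " + ioPoolSize(cores, 0.9));
    }
}
```

With 2 cores and a blocking factor of 0.9 this returns 20, matching the figure above; with 0.8 it returns 10. A blocking factor of 0 gives back the core count itself.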
I have only briefly covered these two types of concurrency, hoping to draw out better ideas from others and to give beginners in concurrent programming a basic understanding. If anything here is wrong, please point it out.
Having rambled on about the above, one more word about the JDK version: every Java upgrade brings some improvement in the performance of the virtual machine and the GC, so JDK 1.7 is faster than JDK 1.6 at concurrent processing. If you care about multithreaded performance, add the -server option and concurrency will perform better. Now that JDK 1.8 has been out for so long, isn't it time to upgrade your JDK?
This is an original article by Pineapple Elephant. If you reproduce it, please cite the source: http://www.blogjava.net/bolo

Posted by cutups on Sat, 20 Apr 2019 22:24:33 -0700