sql optimization: query optimization with too much data

Keywords: Database SQL

1. Scenario: Data with large data need to be put on the first page of statistics, generally aggregation, grouping and so on, query and statistics according to the month and day of the year.If the amount of data is small, hundreds of thousands of data are fine.However, as the amount of data increases, queries become slower and slower.It's time to optimize ~

In the beginning, my idea was this: If you use multi-threading, because the amount of data queried each day is very small, can you use multi-threading? If each thread queries one day, 30 days a month and 30 threads a month, will it be faster?

Thus, it is implemented in a multi-threaded way.The code is as follows:

 

private ExecutorService executorService = new ThreadPoolExecutor(30,30,1, TimeUnit.MILLISECONDS,new LinkedBlockingDeque<>());
public List<Map> getCiServiceBadEvaNumStatistic(SAASIndexQuery saasIndexQuery) throws InvocationTargetException, IllegalAccessException {
        String startDate = saasIndexQuery.getStartDate();
        String endDate = saasIndexQuery.getEndDate();
        int days = DateUtil.getDatebetweenOfDayNum(DateUtil.parseDate(startDate,DateUtil.dateFormatPattern),DateUtil.parseDate(endDate,DateUtil.dateFormatPattern));

        CompletionService<List<CiOrderStatisticSection>> completionService = new ExecutorCompletionService<List<CiOrderStatisticSection>>(executorService);

        List<CiOrderStatisticSection> allList = new ArrayList<>();

        long start = System.currentTimeMillis();
        logger.info("Test Asynchronous Time start:" + System.currentTimeMillis());
        //CountDownLatch countDownLatch = new CountDownLatch(days);
        SAASIndexQuery everyDaySaas = new SAASIndexQuery();
        BeanUtils.copyProperties(everyDaySaas,saasIndexQuery);
        for(int i = 0;i<days;i++){
            everyDaySaas.setStartDate(DateUtil.afterNDay(saasIndexQuery.getStartDate(),i,DateUtil.dateFormatPattern));
            everyDaySaas.setEndDate(DateUtil.afterNDay(everyDaySaas.getStartDate(),1,DateUtil.dateFormatPattern));
            //countDownLatch.countDown();
            int finalI = i;
            completionService.submit(new Callable<List<CiOrderStatisticSection>>() {
                @Override
                public List<CiOrderStatisticSection> call() throws Exception {
                    //allList.addAll(biSaasCiDeviceDayExMapper.getCiServiceNegativeRate(saasIndexQuery));
                    //countDownLatch.countDown();
                    System.out.println("====="+ finalI +"=====");
                    return biSaasCiDeviceDayExMapper.getCiServiceNegativeRate(saasIndexQuery);
                }
            });
        }
        System.out.println("==============" + (System.currentTimeMillis()-start) + "Millisecond");
        long t = System.currentTimeMillis();
        for (int i = 0;i<days;i++){
            System.out.println("for Cycle time==============+"+i + (System.currentTimeMillis()-t) + "Millisecond");

            try {
                Future<List<CiOrderStatisticSection>> future = completionService.take();
                List<CiOrderStatisticSection>  ciList = future.get();
                allList.addAll(ciList);
            } catch (InterruptedException e) {
                e.printStackTrace();
            } catch (ExecutionException e) {
                e.printStackTrace();
            }
        }
        long end = System.currentTimeMillis();
        logger.info("Test Asynchronous Time end:" + (end-start) + "Millisecond");
        System.out.println("Test Asynchronous Time end:" + (end-start) + "Millisecond");

}

After testing, we found that there was a blocking problem with multithreaded take every time. We didn't know where the blocking was.Whether it is a thread pool, a LinkedBlockingDeque or a for-loop take has been blocked for some time and it has not been understood that each for-loop takes more than 200 milliseconds and 30 cycles more than 6 seconds.Well, come on, it's not as fast as it used to be.

 

 

2. Yesterday, I changed my thinking: Do a good job of control when querying directly from the database.Each query first queries the maximum and minimum values of IDS based on month and year, and then queries in sql with IDS in the range of maximum and minimum.Perhaps the id ea is this: well, it turned out to be OK.

Posted by mrchuckles2002 on Thu, 16 Apr 2020 16:58:48 -0700