Experience of QL optimization with twists and turns

background

Recently, I received an SQL tuning task, which performs nearly five seconds in a month of data statistics in the development environment. Originally thought it was a smooth optimization, but I didn't expect that in order to get the best result, it went through twists and turns.

First impression

Initial slow SQL

select case when sum(n_xsajs) is null then 0 else sum(n_xsajs) end as value ,ay.c_aymc as name,ay.c_aydm as id, case when tbdata.tb is null then 0 else 
        round( cast ( sum(n_xsajs) - tbdata.tb as numeric )/ cast( tbdata.tb as numeric),2)*100 end as tb, case when sum(n_xsajs) < tbdata.tb then -1 when sum(n_xsajs) 
            > tbdata.tb then 1 else 0 end as tbTag, case when hbdata.hb is null then 0 else round( cast ( sum(n_xsajs) - hbdata.hb as numeric )/ cast( hbdata.hb as 
            numeric),2)*100 end as hb, case when sum(n_xsajs) < hbdata.hb then -1 when sum(n_xsajs) > hbdata.hb then 1 else 0 end as hbTag 

from db_zntsfx.t_xsaj_ay aj 
right join ( select c_aydm,c_aymc from db_zntsfx.d_ay ay WHERE ay.c_aylb = '14' ) ay 
        on aj.c_aydm = ay.c_aydm 
left join ( select sum(n_xsajs) tb,ay.c_aydm 
                    from db_zntsfx.t_xsaj_ay aj 
                        right join ( select c_aydm,c_aymc from db_zntsfx.d_ay ay WHERE ay.c_aylb = '14' ) ay 
                        on aj.c_aydm = ay.c_aydm  and substr(aj.c_fydm,0,2)= '3' and aj.c_tjq >= '201608' and aj.c_tjq <= '201608' 
                    GROUP BY ay.c_aymc,ay.c_aydm ) tbdata   
        on aj.c_aydm = tbdata.c_aydm 
left join ( select sum(n_xsajs) hb,ay.c_aydm 
                    from db_zntsfx.t_xsaj_ay aj 
                    right join ( select c_aydm,c_aymc from db_zntsfx.d_ay ay WHERE ay.c_aylb = '14' ) ay 
                    on aj.c_aydm = ay.c_aydm  and substr(aj.c_fydm,0,2) = '3' and aj.c_tjq >= '201707' and aj.c_tjq <= '201707' 
                            GROUP BY ay.c_aymc,ay.c_aydm ) hbdata 
        on aj.c_aydm = hbdata.c_aydm  and substr(aj.c_fydm,0,2) = '3' and aj.c_tjq >= '201708' and aj.c_tjq <= '201708' 
GROUP BY ay.c_aymc,ay.c_aydm,tbdata.tb,hbdata.hb limit 17 offset 1

The original slow SQL was initially formatted.
But there are three reasons for manual formatting:

The original formatted level of SQL is still unclear
Manual formatting is the process of understanding the meaning of SQL business
Tool automatic formatting is only indented according to keywords, which is not conducive to the presentation of the SQL level.

After SQL Formatting

After formatting, it can be clearly seen that the business of SQL is "to count the number of new criminal cases received by type 14 in August and to compare the same number of cases with each other".

The main problems of SQL are:

The execution speed is slow and the development environment is 4 seconds.
The SQL body has a lot of repetition, repetitive sub-queries and group by.
The index column uses a function, substr(aj.c_fydm,0,2) ='3'.
The results of SQL statistics are incorrect, and the cases aggregated according to the cases are repetitive.

--Verification SQL

select  t.id,t.name,count(*) from (
--primarySQL
) t group by t.id,t.name having count(*)>1

Verification results:

id	name	count
0E0401540000	Disputes over rural land contract	2
0E1014000000	The third party withdraws the lawsuit	2
0E0601010500	Disputes over recourse for labour remuneration	2
0E1003010000	Apply for declaration of citizen's incapacity for civil conduct	2
0E0901000000	Disputes over tort liability	2
0E0401551500	Disputes over property service contracts	2
0E0802080000	Disputes over Share Transfer	2
0E0201030000	Property disputes after divorce	2
0E0201020000	divorce dispute	2
0E1013010000	Execution of objection by outsiders	2
0E0401350900	Contracts Disputes in Rural Housing Construction	2

First tuning

View the database version

db_zntsfx=# select version();
                                                 version                                                  
---------------------------------------------------------------------------------------------------
 PostgreSQL 9.5.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)

With statement rewrites SQL, because abase 3.5.3 is used, so I want to try to use with grammar to solve the problem of duplication of SQL structure.
Rewritten SQL

View execution plans and speed

Result
1. The structure of the SQL statement is still unclear, and the complexity is almost the same as before.
2.SQL runs slower than before, nearly 6 seconds. The main reason is that the result set of with statement can not be filtered by month, and the result set of subsequent operation is too large.

Second tuning

Rewrite SQL

View the execution plan

Result
1. The level of SQL statements is much clearer and the execution speed is the same as before.
2. According to "Database Selection Index Exploration I", combination or coverage index will speed up the execution of SQL and add coverage index.

create index i_ay_zh01 on db_zntsfx.d_ay(c_aydm, c_aymc);
create index i_xsay_zh01 on db_zntsfx.t_xsaj_ay(c_aydm, c_tjq, c_fydm, n_xsajs);

Look at the implementation plan again

The execution time of SQL is 100ms

Correctness of preliminary investigation results

select id,name,count(*) from (
   --Second rewriteSQL
) t group by t.id,t.name having count(*) >1

Verification results:

id	name	count

No duplicate case statistics

Third tuning

Doubts about the correctness of statistical results

Just when I thought I had found a perfect solution to both clarity and efficiency of SQL, Liu Guoming, the DBA team, questioned the correctness of the query results.
Brother Guoming thought that group by should be followed by left join.

Verification process

The data of current, year-on-year and ring-on-ring cases of one kind of cause are taken separately.

1. Computing the comparative ring data of "disputes over the right to use homestead" with SQL after the second transformation.

2. New SQL calculates the comparative ring data of "disputes over the right to use homestead" separately.

Verification results

The statistical results of "Perfect SQL" after rewriting are incorrect

Reflect

Why is the rewritten SQL statistics wrong?
Answer: Looking at the execution plan of SQL, we find that the reason for the statistical errors in SQL rewriting is that we assume that the year-to-year, ring ratio and current monthly data are connected to the main table d_ay separately, but they are not. In fact, the main table first does Cartesian product with the current monthly result set, then Cartesian product with the same comparative data, and Cartesian product with the new set and the ring ratio data. Causes data duplication.

Correct SQL

Correct implementation plan

SQL execution time: less than 10ms (development environment)

summary

In the process of SQL optimization, it is necessary to format SQL, understand business, view execution plan, try index, rewrite SQL and even redesign table structure. It takes one or more times to achieve the required performance standards.
For the join order of tables, don't take it for granted that SQL is executed in the order it was written. At the same time, the platform of SQLFX finds that there are more than four tables joined in the SQL of many projects. With the increase of data, the Cartesian product of these SQLs will increase exponentially, resulting in the shortage of CPU and IO resources. Hope that the project team can pay attention to these potential performance problems of SQL, hope that SYBASE switch to ABASE, the performance improvement brought by database platform upgrade will not be offset by slow SQL in the program, hope that "Andy Bill Theorem" will not happen around us.

Posted by Floetic on Mon, 27 May 2019 12:23:51 -0700

Programmer Group

Experience of QL optimization with twists and turns

Experience of QL optimization with twists and turns

background

First impression

First tuning

Second tuning

Third tuning

Correct SQL

Correct implementation plan

SQL execution time: less than 10ms (development environment)

summary

Hot Keywords