Over() parsing function
Note: aggregate functions (such as sum(), max(), etc.) can calculate some aggregate value based on a group, but aggregate functions can only return one row of records for a group. If you want to return multiple rows of records for a group, you need to use an analysis function.
rank()/dense_rank over(partition by ... order by ...)
Note: over() is on what conditions
partition by which field to group;
order by which field;
Be careful:
(1) when using rank() / deny_rank(), you must bring order by, otherwise it is illegal
(2) the difference of rank() / deny﹐rank():
rank(): jump sort. If there are two first levels, the next is the third level.
Deny_rank(): if there are two first levels, the next level is still the second level.
Example: query the highest wage employee information of each department
General writing
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO FROM SCOTT.EMP E, (SELECT E.DEPTNO, MAX(E.SAL) SAL FROM SCOTT.EMP E GROUP BY E.DEPTNO) ME WHERE E.DEPTNO = ME.DEPTNO AND E.SAL = ME.SAL;
Using the Over() function
Method 1:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO FROM (SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) RANK --On the basis of division by department, wages are graded from high to low RANK Increment from 1 FROM EMP E) E WHERE E.RANK = 1 ;
Method two:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO FROM (SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, DENSE_RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) RANK FROM EMP E) E WHERE E.RANK = 1;
Comparison query result: on the left is the query result by general method, and on the right is the query result by analysis function (the results of the two methods are the same)
min()/max() over(partition by ...)
When querying employee information, query the difference between the employee's salary and the minimum and maximum salary of the Department
General writing:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, E.SAL - ME.MIN_SAL DIFF_MIN_SAL, ME.MAX_SAL - E.SAL DIFF_MAX_SAL FROM SCOTT.EMP E, (SELECT E.DEPTNO, MIN(E.SAL) MIN_SAL, MAX(E.SAL) MAX_SAL FROM SCOTT.EMP E GROUP BY E.DEPTNO) ME WHERE E.DEPTNO = ME.DEPTNO ORDER BY E.DEPTNO, E.SAL;
Use the analysis function:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, MIN(E.SAL) OVER(PARTITION BY E.DEPTNO) MIN_SAL, MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) MAX_SAL, NVL(E.SAL - MIN(E.SAL) OVER(PARTITION BY E.DEPTNO), 0) DIFF_MIN_SAL, NVL(MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) - E.SAL, 0) DIFF_MAX_SAL FROM EMP E;
Note: there is no sorting condition here. If order by is added,
Max() over (partition by.. order by.. desc) collation can only be desc, otherwise it will not work. The corresponding value of the field with the highest sorting value so far will be queried
Min() over (partition by.. order by.. asc) collation can only be asc, otherwise it will not work. The corresponding value of the field with the lowest sorting value so far will be queried,
As follows:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, MIN(E.SAL) OVER(PARTITION BY E.DEPTNO) MIN_SAL01, MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) MAX_SAL01, MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) MIN_SAL02, MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) MAX_SAL02, --No effect MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) MIN_SAL03, --No effect MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) MAX_SAL03, MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ASC) MIN_SAL04, MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ASC) MAX_SAL04, --No effect NVL(E.SAL - MIN(E.SAL) OVER(PARTITION BY E.DEPTNO), 0) DIFF_MIN_SAL, NVL(MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) - E.SAL, 0) DIFF_MAX_SAL FROM EMP E;
The results are as follows:
Lead() / lag() over (partition by... Order by...) takes the n-th line record before / after
Explain:
Lead (column name, n,m): the value of < column name > of the record in the nth row after the current record; if there is no value, the default value is m; if there is no parameter n,m, the value of < column name > of the record in the first row after the current record is found, and if there is no value, the default value is null.
Lag (column name, n,m): the value of < column name > of the record in front of the nth row of the current record; if there is no value, the default value is m; if there is no parameter n,m, the value of < column name > of the record in front of the first row of the current record is found, and if there is no value, the default value is null.
Example: query the difference between an individual's salary and the salary that is one and the first higher than yourself
Using analysis functions
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, LEAD(E.SAL, 1, 0) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) LEAD_SAL, --Record the following n Row record LAG(E.SAL, 1, 0) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) LAG_SAL, --Record the first N Row record NVL(LEAD(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) - E.SAL, 0) DIFF_LEAD_SAL, NVL(E.SAL - LEAD(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL), 0) DIFF_LAG_SAL FROM EMP E;
Query results:
First value / last value() over (partition by...) takes the first and last record
Example:
SELECT E.EMPNO, E.ENAME, E.JOB, E.MGR, E.SAL, E.DEPTNO, FIRST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO) FIRST_SAL, LAST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO) LAST_SAL FROM EMP E;
Query results:
ROW_NUMBER() OVER(PARTITION BY.. ORDER BY..) sorting (application: paging)
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO,E.ROW_NUM FROM (SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, ROW_NUMBER() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) ROW_NUM FROM EMP E) E WHERE E.ROW_NUM > 3;
Query results:
Supplement:
Similar paging operations can also be implemented with rownum and fetch (new features after oracle12c)
sum/avg/count() over(partition by ..)
Example 1:
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, SUM(E.SAL) OVER(PARTITION BY E.DEPTNO) SUM_SAL, --Count totals in a group AVG(E.SAL) OVER(PARTITION BY E.DEPTNO) AVG_SAL, --Count the average in a group COUNT(E.SAL) OVER(PARTITION BY E.DEPTNO) COUNT_SAL --Group by a column and count the number of records in the group FROM EMP E;
Query results:
Example 2 (full statistics): it is the data set that counts the total sales of departments, the total sales of the whole company, the average sales of departments, and the average sales of the whole company
SELECT A.DEPT_ID, A.SALE_DATE, A.GOODS_TYPE, A.SALE_CNT, SUM(A.SALE_CNT) OVER(PARTITION BY A.DEPT_ID) DEPT_TOTAL, --Total department sales SUM(A.SALE_CNT) OVER() CMP_TOTAL, --Total sales of the company AVG(A.SALE_CNT) OVER(PARTITION BY A.DEPT_ID) DEPT_AVG, --Department sales average AVG(A.SALE_CNT) OVER() CMP_AVG --Company sales average FROM LEARN_FUN_KEEP A;
rows/range between … preceding and … following up and down range evaluation
Description: unbounded: uncontrolled, unlimited
Before: before
following: after
rows between ... preceding and ... following
Example 1: display the salary of employees in each department, with the maximum salary of the Department
SELECT E.DEPTNO, E.EMPNO, E.ENAME, E.SAL, LAST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ROWS /*MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ROWS*/ --unbounded preceding and unbouned following For the previous record and the next record of all current records, that is, all records in the table --unbounded: Uncontrolled, unlimited --preceding: stay...before --following: stay...after BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) MAX_SAL FROM EMP E;
The results are as follows:
Writing method II;
SELECT E.DEPTNO, E.EMPNO, E.ENAME, E.SAL, MAX(E.SAL) OVER(PARTITION BY E.DEPTNO /*ORDER BY E.SAL*/) MAX_SAL FROM EMP E;
Example 2: Group departments with a summary showing the first line to the current line
SELECT EMPNO, ENAME, DEPTNO, SAL, --Be careful ROWS BETWEEN unbounded preceding AND current row The summary from the first line to the current line SUM(SAL) OVER(PARTITION BY DEPTNO ORDER BY ENAME ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_sal FROM SCOTT.EMP;
The results are as follows:
Example 3: summary from current line to last line
SELECT EMPNO, ENAME, DEPTNO, SAL, --Be careful ROWS BETWEEN current row AND unbounded following It refers to the summary from the current line to the last line SUM(SAL) OVER(PARTITION BY DEPTNO ORDER BY ENAME ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) max_sal FROM SCOTT.EMP;
The results are as follows:
Example 4: summary from the previous row (rownum-1) of the current row to the current row
SELECT EMPNO, ENAME, DEPTNO, SAL, --Be careful ROWS BETWEEN 1 preceding AND current row Refers to the previous line of the current line(rownum-1)Summary to current line SUM(SAL) OVER(PARTITION BY DEPTNO ORDER BY ENAME ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) max_sal FROM SCOTT.EMP;
The results are as follows:
Example 5: summary of the previous row (rownum-1) to the next two rows (rownum+2) of the current row
SELECT EMPNO, ENAME, DEPTNO, SAL, --Be careful ROWS BETWEEN 1 preceding AND 1 following Refers to the previous line of the current line(rownum-1)Next line to current line(rownum+2)Summary SUM(SAL) OVER(PARTITION BY DEPTNO ORDER BY ENAME ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING) max_sal FROM SCOTT.EMP;
The results are as follows:
Other
NULLS FIRST/LAST put the null value field record in the first or last display
Explain:
The records are arranged in full and grouped by RANK(), deny ﹣ RANK(), row ﹣ number (), but sometimes there is a null value, which will affect the correctness of the result
NULLS FIRST/LAST can help us to put the record of null value field in the first or last display in the process of sorting arrangement with null value, and help us get the desired result.
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL NULLS LAST) FROM EMP E;
The results are as follows:
NTILE(n)
SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO, NTILE(3) OVER(ORDER BY E.SAL DESC NULLS LAST) ALL_CMP, --If only the first third is taken, all_cmp=1 Yes, if only the middle third is taken, all_cmp=2 that will do NTILE(3) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC NULLS LAST) ALL_DEPT --Each department is divided into three parts FROM EMP E
The results are as follows:
keep(dense_rank first/last)
Explain:
1. What's the meaning of "keep" (deny [rank first / last)?
Keep literally means' keep ', that is to say, keep records that meet the conditions in the brackets of keep()
Here we can imagine that there will be multiple records, that is, there will be multiple last or first)
Deny? Rank is a sort strategy
first/last is a screening strategy
About question 2:
The reason to use min is to make the final result unique, because sometimes there are multiple last or first situations.
3. Why don't you use deny? Rank?
ORA-02000: missing deny \ rank keyword
After changing to rank, an error is reported directly. As for the reason, I understand that rank cannot represent the relative order of records sorting
For example: record rank deny
100 1 1
100 1 1
95 3 2
The relative position between the third record and the first and second records should be 1, but it cannot be expressed by rank.
Example:
View the record with the earliest sales record and the smallest sales volume in Department D02.
SELECT A.DEPT_ID, MIN(A.SALE_CNT) KEEP(DENSE_RANK FIRST ORDER BY A.SALE_DATE) MIN_EARLY_DATE FROM LEARN_FUN_KEEP A WHERE A.DEPT_ID = 'D02' GROUP BY A.DEPT_ID;