Oracle analysis function Over()

Keywords: Oracle

Over() parsing function

Note: aggregate functions (such as sum(), max(), etc.) can calculate some aggregate value based on a group, but aggregate functions can only return one row of records for a group. If you want to return multiple rows of records for a group, you need to use an analysis function.

rank()/dense_rank over(partition by ... order by ...)

Note: over() is on what conditions

partition by which field to group;

order by which field;

Be careful:

(1) when using rank() / deny_rank(), you must bring order by, otherwise it is illegal

(2) the difference of rank() / deny﹐rank():

rank(): jump sort. If there are two first levels, the next is the third level.
Deny_rank(): if there are two first levels, the next level is still the second level.

Example: query the highest wage employee information of each department

General writing

SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO
  FROM SCOTT.EMP E,
       (SELECT E.DEPTNO, MAX(E.SAL) SAL FROM SCOTT.EMP E GROUP BY E.DEPTNO) ME
 WHERE E.DEPTNO = ME.DEPTNO
   AND E.SAL = ME.SAL;

Using the Over() function

Method 1:

SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO
  FROM (SELECT E.ENAME,
               E.JOB,
               E.SAL,
               E.DEPTNO,
               RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) RANK  --On the basis of division by department, wages are graded from high to low RANK Increment from 1
          FROM EMP E) E
 WHERE E.RANK = 1 ;

Method two:

SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO
  FROM (SELECT E.ENAME,
               E.JOB,
               E.SAL,
               E.DEPTNO,
               DENSE_RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) RANK
          FROM EMP E) E
 WHERE E.RANK = 1;

Comparison query result: on the left is the query result by general method, and on the right is the query result by analysis function (the results of the two methods are the same)

 

 

min()/max() over(partition by ...)

When querying employee information, query the difference between the employee's salary and the minimum and maximum salary of the Department

General writing:

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       E.SAL - ME.MIN_SAL DIFF_MIN_SAL,
       ME.MAX_SAL - E.SAL DIFF_MAX_SAL
  FROM SCOTT.EMP E,
       (SELECT E.DEPTNO, MIN(E.SAL) MIN_SAL, MAX(E.SAL) MAX_SAL
          FROM SCOTT.EMP E
         GROUP BY E.DEPTNO) ME
 WHERE E.DEPTNO = ME.DEPTNO
 ORDER BY E.DEPTNO, E.SAL;

Use the analysis function:

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       MIN(E.SAL) OVER(PARTITION BY E.DEPTNO) MIN_SAL,
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) MAX_SAL,
       NVL(E.SAL - MIN(E.SAL) OVER(PARTITION BY E.DEPTNO), 0) DIFF_MIN_SAL,
       NVL(MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) - E.SAL, 0) DIFF_MAX_SAL
  FROM EMP E;

Note: there is no sorting condition here. If order by is added,

Max() over (partition by.. order by.. desc) collation can only be desc, otherwise it will not work. The corresponding value of the field with the highest sorting value so far will be queried
Min() over (partition by.. order by.. asc) collation can only be asc, otherwise it will not work. The corresponding value of the field with the lowest sorting value so far will be queried,

As follows:

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       MIN(E.SAL) OVER(PARTITION BY E.DEPTNO) MIN_SAL01,
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) MAX_SAL01,
       MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) MIN_SAL02,
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) MAX_SAL02, --No effect
       MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) MIN_SAL03, --No effect
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC) MAX_SAL03,
       MIN(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ASC) MIN_SAL04,
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ASC) MAX_SAL04, --No effect
       NVL(E.SAL - MIN(E.SAL) OVER(PARTITION BY E.DEPTNO), 0) DIFF_MIN_SAL,
       NVL(MAX(E.SAL) OVER(PARTITION BY E.DEPTNO) - E.SAL, 0) DIFF_MAX_SAL
  FROM EMP E;

The results are as follows:

 

 

Lead() / lag() over (partition by... Order by...) takes the n-th line record before / after

Explain:

Lead (column name, n,m): the value of < column name > of the record in the nth row after the current record; if there is no value, the default value is m; if there is no parameter n,m, the value of < column name > of the record in the first row after the current record is found, and if there is no value, the default value is null.
Lag (column name, n,m): the value of < column name > of the record in front of the nth row of the current record; if there is no value, the default value is m; if there is no parameter n,m, the value of < column name > of the record in front of the first row of the current record is found, and if there is no value, the default value is null.

Example: query the difference between an individual's salary and the salary that is one and the first higher than yourself

Using analysis functions

SELECT E.ENAME,
        E.JOB,
        E.SAL,
        E.DEPTNO,
        LEAD(E.SAL, 1, 0) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) LEAD_SAL, --Record the following n Row record
        LAG(E.SAL, 1, 0) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) LAG_SAL, --Record the first N Row record
        NVL(LEAD(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) - E.SAL,  
            0) DIFF_LEAD_SAL,
        NVL(E.SAL - LEAD(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL),
            0) DIFF_LAG_SAL
   FROM EMP E;

Query results:

 

 

First value / last value() over (partition by...) takes the first and last record

Example:

 SELECT E.EMPNO,
        E.ENAME,
        E.JOB,
        E.MGR,
        E.SAL,
        E.DEPTNO,
        FIRST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO) FIRST_SAL,
        LAST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO) LAST_SAL
   FROM EMP E;

Query results:

ROW_NUMBER() OVER(PARTITION BY.. ORDER BY..) sorting (application: paging)

 SELECT E.ENAME, E.JOB, E.SAL, E.DEPTNO,E.ROW_NUM
   FROM (SELECT E.ENAME,
                E.JOB,
                E.SAL,
                E.DEPTNO,
                ROW_NUMBER() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL) ROW_NUM
           FROM EMP E) E
  WHERE E.ROW_NUM > 3;

Query results:

Supplement:

Similar paging operations can also be implemented with rownum and fetch (new features after oracle12c)

sum/avg/count() over(partition by ..)

Example 1:

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       SUM(E.SAL) OVER(PARTITION BY E.DEPTNO) SUM_SAL,     --Count totals in a group
       AVG(E.SAL) OVER(PARTITION BY E.DEPTNO) AVG_SAL,     --Count the average in a group
       COUNT(E.SAL) OVER(PARTITION BY E.DEPTNO) COUNT_SAL  --Group by a column and count the number of records in the group
  FROM EMP E;

Query results:

 

 

Example 2 (full statistics): it is the data set that counts the total sales of departments, the total sales of the whole company, the average sales of departments, and the average sales of the whole company

SELECT A.DEPT_ID,
       A.SALE_DATE,
       A.GOODS_TYPE,
       A.SALE_CNT,
       SUM(A.SALE_CNT) OVER(PARTITION BY A.DEPT_ID) DEPT_TOTAL, --Total department sales
       SUM(A.SALE_CNT) OVER() CMP_TOTAL,  --Total sales of the company
       AVG(A.SALE_CNT) OVER(PARTITION BY A.DEPT_ID) DEPT_AVG, --Department sales average
       AVG(A.SALE_CNT) OVER() CMP_AVG  --Company sales average
  FROM LEARN_FUN_KEEP A;

rows/range between … preceding and … following up and down range evaluation

Description: unbounded: uncontrolled, unlimited
Before: before
following: after

 rows between ... preceding and ... following

Example 1: display the salary of employees in each department, with the maximum salary of the Department

SELECT E.DEPTNO,
       E.EMPNO,
       E.ENAME,
       E.SAL,
       LAST_VALUE(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ROWS
       /*MAX(E.SAL) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL ROWS*/
       --unbounded preceding and unbouned following For the previous record and the next record of all current records, that is, all records in the table
       --unbounded: Uncontrolled, unlimited
       --preceding: stay...before
       --following: stay...after
        BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) MAX_SAL
  FROM EMP E;

The results are as follows:

 

 

Writing method II;

SELECT E.DEPTNO,
       E.EMPNO,
       E.ENAME,
       E.SAL,
       MAX(E.SAL) OVER(PARTITION BY E.DEPTNO
       /*ORDER BY E.SAL*/) MAX_SAL
  FROM EMP E;

Example 2: Group departments with a summary showing the first line to the current line

SELECT EMPNO,
       ENAME,
       DEPTNO,
       SAL,
       --Be careful ROWS BETWEEN unbounded preceding AND current row  The summary from the first line to the current line
       SUM(SAL) OVER(PARTITION BY DEPTNO 
                     ORDER BY ENAME 
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_sal
  FROM SCOTT.EMP;

The results are as follows:

 

 

Example 3: summary from current line to last line

SELECT EMPNO,
       ENAME,
       DEPTNO,
       SAL,
       --Be careful ROWS BETWEEN current row AND unbounded following It refers to the summary from the current line to the last line
       SUM(SAL) OVER(PARTITION BY DEPTNO 
                     ORDER BY ENAME 
                     ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) max_sal
  FROM SCOTT.EMP;

The results are as follows:

 

 

Example 4: summary from the previous row (rownum-1) of the current row to the current row

SELECT EMPNO,
       ENAME,
       DEPTNO,
       SAL,
       --Be careful ROWS BETWEEN 1 preceding AND current row Refers to the previous line of the current line(rownum-1)Summary to current line 
       SUM(SAL) OVER(PARTITION BY DEPTNO 
                     ORDER BY ENAME ROWS 
                     BETWEEN 1 PRECEDING AND CURRENT ROW) max_sal
  FROM SCOTT.EMP;

The results are as follows:

 

Example 5: summary of the previous row (rownum-1) to the next two rows (rownum+2) of the current row

SELECT EMPNO,
       ENAME,
       DEPTNO,
       SAL,
       --Be careful ROWS BETWEEN 1 preceding AND 1 following Refers to the previous line of the current line(rownum-1)Next line to current line(rownum+2)Summary
       SUM(SAL) OVER(PARTITION BY DEPTNO 
                     ORDER BY ENAME 
                     ROWS BETWEEN 1 PRECEDING AND 2 FOLLOWING) max_sal
  FROM SCOTT.EMP;

The results are as follows:

 

Other

NULLS FIRST/LAST put the null value field record in the first or last display

Explain:

The records are arranged in full and grouped by RANK(), deny ﹣ RANK(), row ﹣ number (), but sometimes there is a null value, which will affect the correctness of the result
NULLS FIRST/LAST can help us to put the record of null value field in the first or last display in the process of sorting arrangement with null value, and help us get the desired result.

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       RANK() OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL NULLS LAST)
  FROM EMP E;

The results are as follows:

NTILE(n)

SELECT E.ENAME,
       E.JOB,
       E.SAL,
       E.DEPTNO,
       NTILE(3) OVER(ORDER BY E.SAL DESC NULLS LAST) ALL_CMP, --If only the first third is taken, all_cmp=1 Yes, if only the middle third is taken, all_cmp=2 that will do
       NTILE(3) OVER(PARTITION BY E.DEPTNO ORDER BY E.SAL DESC NULLS LAST) ALL_DEPT  --Each department is divided into three parts
  FROM EMP E

The results are as follows:

keep(dense_rank first/last)

Explain:

1. What's the meaning of "keep" (deny [rank first / last)?
Keep literally means' keep ', that is to say, keep records that meet the conditions in the brackets of keep()
Here we can imagine that there will be multiple records, that is, there will be multiple last or first)
Deny? Rank is a sort strategy
first/last is a screening strategy

About question 2:
The reason to use min is to make the final result unique, because sometimes there are multiple last or first situations.
3. Why don't you use deny? Rank?
ORA-02000: missing deny \ rank keyword
After changing to rank, an error is reported directly. As for the reason, I understand that rank cannot represent the relative order of records sorting
For example: record rank deny
100 1 1
100 1 1
95 3 2
The relative position between the third record and the first and second records should be 1, but it cannot be expressed by rank.

Example:

View the record with the earliest sales record and the smallest sales volume in Department D02.

SELECT A.DEPT_ID,
       MIN(A.SALE_CNT) KEEP(DENSE_RANK FIRST ORDER BY A.SALE_DATE) MIN_EARLY_DATE
  FROM LEARN_FUN_KEEP A
 WHERE A.DEPT_ID = 'D02'
 GROUP BY A.DEPT_ID;

Posted by BruceRowe on Sun, 08 Mar 2020 23:43:43 -0700