- 1,GROUP BY Standard Grouping
- 2,GROUP BY Extended Grouping
- 3,GROUP BY Extension Function
- 3.1,GROUPING FUNCTION
- 3.2,GROUPING_ID Function
- 3.3,GROUP_ID function
- 4,summary
1. GROUP BY Standard Grouping
For the sake of narration, I will call the grouping of simple GROUP BY clause as standard grouping, and the column (or expression) appearing in GROUP BY clause as grouping column.
1.1. Overview of GROUP BY
In grouping queries, the GROUP BY clause is used to group selected row sets according to the value of a specified column or expression, and return a row of data collected from the group for each group. Basic grammar:
SELECT expression1, expression2, ... expression_n, aggregate_function (aggregate_expression) FROM tables [WHERE conditions] GROUP BY expression1, expression2, ... expression_n [HAVING having_condition];
Simple examples:
SELECT t.dept_code, MAX(t.post_salary) max_salary, -- Department's highest post salary MIN(t.post_salary) min_salary, -- Minimum Wage for Departments AVG(t.post_salary) avg_salary, -- Average post salary of Department SUM(t.post_salary) sum_salary, -- The sum of department post wages COUNT(t.post_salary) cnt_salary -- Departmental Wages FROM demo.t_staff t GROUP BY t.dept_code HAVING AVG(t.post_salary)>3500 ORDER BY t.dept_code;
Matters needing attention:
- 1. Grouping columns or aggregation functions or constants can only appear in SELECT clauses.
- 2. In the HAVING clause, only grouping columns or aggregation functions or constants can appear.
- 3. In GROUP BY clause, only columns or constants of scalar primitive types (such as VARCHAR2, NUMBER, DATE, etc.) can appear, but columns of BLOB, CLOB and other types cannot appear.
- 4. In fact, grouped columns may not appear in SELECT clauses, but this may make the meaning of query results unclear.
- 5. In SELECT, HAVING, GROUP BY clauses, there are also some special functions, such as SYSDATE (feeling meaningless).
1.2. The difference between WHERE and HAVING?
Let me start by saying, "What's the difference between WHERE and HAVING?" This is definitely a problem! Because there is no comparability between the two, in fact, people who ask such questions often have a weak SQL foundation.
In the query sentence containing GROUP BY clause, the function of WHERE clause is to filter the row data before grouping the query results, and remove the unqualified rows; while the function of HAVING clause is to filter the group data after grouping the query results, and remove the unqualified groups. In other words, because aggregation functions provide information about groups, it is impossible to provide information about groups before grouping. That is to say, aggregation functions cannot be used in WHERE clauses, which is why HAVING clauses appear. With the HAVING clause, we can easily filter the group data after grouping. Generally speaking, what can be filtered by WHERE should not be filtered by HAVING!
2. GROUP BY Extended Grouping
In the actual business development, only standard grouping may not be enough, often need more dimensions of subtotal, total. For this kind of requirement, Oracle provides abundant extended grouping functions; although UNION ALL can generally achieve similar results, it is not flexible enough and has low performance.
2.1. ROLLUP grouping
ROLLUP is an extension of the GROUP BY clause, which allows the calculation of subtotals and totals of standard groupings and some dimensions. Syntax:
GROUP BY ROLLUP(grouping_column_reference_list)
The result of ROLLUP calculation is related to the order of specified columns in parentheses, because the grouping process of ROLLUP is directional. First the standard grouping is calculated, then the higher subtotal is calculated progressively from right to left until all columns are calculated, and finally the total is calculated. When n columns are specified in ROLLUP, there are n+1 grouping modes in the whole calculation process. For example, the grouping process of GROUP BY ROLLUP(A,B) is equivalent to that of GROUP BY(A,B) grouping in the first step, GROUP BY(A,NULL) grouping in the second step and GROUP BY (NULL) grouping in the third step.
Example 1:
SELECT t.dept_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY ROLLUP(t.dept_code);
Result:
DEPT_CODE SUM_SALARY MAX_SALARY -------------------------------------------------- ---------- ---------- 010102 13500 7500 010103 7850 5050 21350 7500
Example 2:
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY ROLLUP(t.dept_code,t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY ---------------------------------------------- -------------------------------------------- ---------- ---------- 010102 P40 7500 7500 010102 P50 6000 6000 010102 13500 7500 010103 P40 5050 5050 010103 P50 2800 2800 010103 7850 5050 21350 7500
Example 3, Partial ROLLUP grouping (without some subtotals, which can be used for total timing):
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY t.dept_code,ROLLUP(t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY --------------------------------------------- -------------------------------------------- ---------- ---------- 010102 P40 7500 7500 010102 P50 6000 6000 010102 13500 7500 010103 P40 5050 5050 010103 P50 2800 2800 010103 7850 5050
2.2. CUBE grouping
CUBE is an extension of GROUP BY clause, which allows calculation of standard grouping and subtotals and totals of all dimensions. Syntax:
GROUP BY CUBE(grouping_column_reference_list)
CUBE counts all possible groupings and generates cross reports. CUBE has more groupings than ROLLUP, and contains ROLLUP statistical results, and the results are independent of the order of grouping columns, but if the order of columns is different, the default ranking of results will be different. When n columns are specified in CUBE, there are two n-th power grouping modes in the whole calculation process. For example, GROUP BY CUBE(A,B) is equivalent to grouping by GROUP BY(A,B), grouping by GROUP BY(A,NULL), grouping by GROUP BY(NULL,B) and grouping by GROUP BY(NULL,NULL).
Example 1:
SELECT t.dept_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY CUBE(t.dept_code);
Result:
DEPT_CODE SUM_SALARY MAX_SALARY -------------------------------------------------- ---------- ---------- 21350 7500 010102 13500 7500 010103 7850 5050
Example 2:
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY CUBE(t.dept_code,t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY --------------------------------------------- ------------------------------------------ ---------- ---------- 21350 7500 P40 12550 7500 P50 8800 6000 010102 13500 7500 010102 P40 7500 7500 010102 P50 6000 6000 010103 7850 5050 010103 P40 5050 5050 010103 P50 2800 2800
Example 3, Partial CUBE grouping (without some subtotals, which can be used for total timing):
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY t.dept_code,CUBE(t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY -------------------------------------------- ---------------------------------------- ---------- ---------- 010102 13500 7500 010102 P40 7500 7500 010102 P50 6000 6000 010103 7850 5050 010103 P40 5050 5050 010103 P50 2800 2800
GROUPING SETS grouping
GROUPING SETS is an extension of the GROUP BY clause, which allows the subtotal of multiple standard groups to be calculated at one time. Syntax:
GROUP BY GROUPING SETS(grouping_column_reference_list)
The calculation results of GROUPING SETS have nothing to do with the order of grouping columns, and the ranking of result sets has nothing to do with the order of grouping columns. When n columns are specified in GROUPING SETS, there are n grouping methods in the whole calculation process. For example, GROUPING SETS(A,B,C) is equivalent to the results of UNION ALL of GROUP BY A, GROUP BY B and GROUP BY C.
Example 1:
SELECT t.dept_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY GROUPING SETS(t.dept_code);
Result:
DEPT_CODE SUM_SALARY MAX_SALARY -------------------------------------------------- ---------- ---------- 010102 13500 7500 010103 7850 5050
Example 2:
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY GROUPING SETS(t.dept_code,t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY --------------------------------------------- ------------------------------------------ ---------- ---------- 010102 13500 7500 010103 7850 5050 P50 8800 6000 P40 12550 7500
Example 3, partial GROUPING SETS grouping:
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY t.dept_code,GROUPING SETS(t.post_code);
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY -------------------------------------------- ----------------------------------------- ---------- ---------- 010103 P40 5050 5050 010102 P40 7500 7500 010102 P50 6000 6000 010103 P50 2800 2800
Example 4: GROUPING SETS can accept ROLLUP and CUBE as its parameters; GROUPING SETS only groups single columns, but does not provide aggregate functions. If GROUPING SETS is required to provide aggregate functions, ROLLUP or CUBE can be used as parameters. (Note that ROLLUP and CUBE do not accept GROUPING SETS as parameters, nor do ROLLUP and CUBE take each other as parameters.) Yes):
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary,MAX(t.post_salary) max_salary FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY GROUPING SETS(ROLLUP(t.dept_code),ROLLUP(t.post_code));
Result:
DEPT_CODE POST_CODE SUM_SALARY MAX_SALARY -------------------------------------------- ------------------------------------------ ---------- ---------- 010102 13500 7500 010103 7850 5050 P50 8800 6000 P40 12550 7500 21350 7500 21350 7500
2.4. Complex grouping (combination column grouping, duplicate column grouping, connection grouping)
Simply put: Combining column grouping allows multiple columns or column combinations in ROLLUP, CUBE and GROUPING SETS; repeating column grouping allows group columns to repeat after GROUP BY; connecting grouping allows multiple ROLLUP, CUBE or GROUPING SETS after GROUP BY.
Combination column grouping has the function of filtering some subtotals or calculating some additional subtotals. The former part of ROLLUP and part of CUBE are not totaled. Using combination column grouping can not only realize the function of part of ROLLUP or part of CUBE, but also can totalize. For example, ROLLUP(A,(B,C)) can filter the subtotals of B and C and calculate the total of AB C.
The grouping level of connection grouping is a Cartesian product of the levels of all ROLLUP, CUBE or GROUPING SETS groupings. For example, ROLLUP(A,B),ROLLUP(C,D,E) has a grouping level of (2 + 1) * (3 + 1) = 12, CUBE(A,B),CUBE(C,D,E) has a grouping level of (4) * (8) = 32, CUBE(A,B),CUBE(C,D,E) has a grouping level of (4) * (8) = 32.
In my understanding, complex grouping is nothing more than the comprehensive application of standard extended grouping. In practical development, we may encounter some requirements that can not be achieved only by using standard extended grouping. At this time, we can consider flexible use of standard extended grouping, through complex grouping.
3. GROUP BY Extension Function
3.1. GROUPING Function
GROUPING Grammar: GROUPING(expr). Because there may be NULL in the original data, and the subtotal or aggregate value may also be NULL, the data will be confused. When the function appears in the SELECT clause, if the expr of the aggregated row is NULL, it returns 1; if the expr of the regular row is NULL, it returns 0. Usually, a grouping column is taken as a parameter of the function, and then aggregated rows and regular rows are distinguished by judging its return value, so as to further beautify or filter the result set. Example:
SELECT DECODE(GROUPING(t.dept_code),1,'Total',t.dept_code) dept_code, SUM(t.post_salary) sum_salary,GROUPING(t.dept_code) gd FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY ROLLUP(t.dept_code);
Result:
DEPT_CODE SUM_SALARY GD -------------------------------------------------- ---------- ---------- 010102 13500 0 010103 7850 0 //Total 213501
3.2. GROUPING_ID Function
GROUPING_ID Grammar: GROUPING_ID(expr [, expr]...). When the function appears in the SELECT clause, it returns the value corresponding to the GROUPING bit vector associated with the row. The GROUPING_ID function is calculated in the order from left to right. If this column is a grouped column, it is 0. If it is a subtotal or a total, it is 1. Then the result is composed into a binary sequence (bit vector) in the order of columns. Finally, the bit vector is converted into a decimal number. The GROUPING_ID function is functionally equivalent to the result of multiple GROUPING functions. With GROUPING_ID, it is no longer necessary to write multiple GROUPING functions, and the line filtering conditions are easier to express. This function is particularly useful when query results have multiple aggregation levels, and can sort and filter the result set by its return value. Example:
SELECT t.dept_code,t.post_code,SUM(t.post_salary) sum_salary, GROUPING_ID(t.dept_code) gd, GROUPING_ID(t.post_code) gp, GROUPING_ID(t.dept_code,t.post_code) gdp, GROUPING_ID(t.post_code,t.dept_code) gpd FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY CUBE(t.dept_code,t.post_code) ORDER BY GROUPING_ID(t.dept_code,t.post_code);
Result (the result set corresponds to an ascending order of GDP):
DEPT_CODE POST_CODE SUM_SALARY GD GP GDP GPD ----------------------------- ------------------------ ---------- ---------- ---------- ---------- ---------- 010102 P40 7500 0 0 0 0 010103 P40 5050 0 0 0 0 010102 P50 6000 0 0 0 0 010103 P50 2800 0 0 0 0 010103 7850 0 1 1 2 010102 13500 0 1 1 2 P50 8800 1 0 2 1 P40 12550 1 0 2 1 21350 1 1 3 3
3.3. GROUP_ID Function
GROUP_ID Syntax: GROUP_ID(). When the function appears in the SELECT clause, if there are n duplicates in the result set, it will return numbers ranging from 0 to n-1, which is very useful for eliminating duplicate groupings from the query results. Example (typically eliminating all duplicate rows by HAVING GROUP_ID()<1):
SELECT t.dept_code,SUM(t.post_salary) sum_salary,GROUP_ID() group_id FROM demo.t_staff t WHERE t.dept_code IN('010102','010103') GROUP BY ROLLUP(t.dept_code),CUBE(t.dept_code);
Result:
DEPT_CODE SUM_SALARY GROUP_ID -------------------------------------------------- ---------- ---------- 010102 13500 0 010103 7850 0 010102 13500 2 010103 7850 2 010102 13500 1 010103 7850 1 21350 0
4, summary
This paper mainly describes GROUP BY related knowledge points in Oracle, such as standard grouping, extended grouping, extended function, etc.
Links to this article: http://www.cnblogs.com/hanzongze/p/Oracle-Group-By.html
Copyright Statement: This article is a blogger of Blog Garden Han Chung se Originality, the author reserves the right of signature! You are welcome to use this article through reprinting, deduction or other means of dissemination, but you must give the author's signature and link to this article in a clear place! My first blog, level is limited, if there are inappropriate, please criticize and correct, thank you!