PostgreSQL aggregate expressions: using FILTER, ORDER BY, and WITHIN GROUP


Tags

PostgreSQL, aggregation, filter, order, within group

Background

PostgreSQL has powerful analytical features, including multi-dimensional analysis, four categories of aggregate functions, window queries, and recursive queries.

For the four categories of aggregate functions, see:

<PostgreSQL aggregate function 1 : General-Purpose Aggregate Functions>

<PostgreSQL aggregate function 2 : Aggregate Functions for Statistics>

<PostgreSQL aggregate function 3 : Aggregate Functions for Ordered-Set>

<PostgreSQL aggregate function 4 : Hypothetical-Set Aggregate Functions>

For multi-dimensional analysis, see:

Greenplum Best Practices - Use of Multidimensional Analysis (CUBE, ROLLUP, GROUPING SETS in GreenPlum and Oracle)

<PostgreSQL 9.5 new feature - Support GROUPING SETS, CUBE and ROLLUP.>

For window queries, see:

Accelerated Analysis and Implementation of Sequential Data Merging Scenarios - Composite Index, Accelerated Window Grouping Query, Abnormal Recursive Acceleration

Quick Start PostgreSQL Application Development and Management - 4 Advanced SQL Usage

For recursive queries, see:

Quick Start PostgreSQL Application Development and Management - 3 Access Data

This article focuses on advanced usage of aggregate expressions, which take one of the following syntactic forms:

aggregate_name (expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name (ALL expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name ( * ) [ FILTER ( WHERE filter_clause ) ]  
  
aggregate_name ( [ expression [ , ... ] ] ) WITHIN GROUP ( order_by_clause ) [ FILTER ( WHERE filter_clause ) ]  
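
To make the syntax forms above concrete, here is a minimal sketch that exercises several of them in a single query. The inline data and the names t, grp, and val are hypothetical and exist only for illustration; they are not part of the original article.

```sql
-- Hypothetical inline data; t, grp and val are illustrative names only.
WITH t(grp, val) AS (
  VALUES ('a', 3), ('a', 1), ('a', 3), ('b', 2)
)
SELECT
  count(*)                                         AS all_rows,       -- aggregate_name ( * )
  count(DISTINCT val)                              AS distinct_vals,  -- DISTINCT form
  array_agg(val ORDER BY val DESC)                 AS vals_desc,      -- order_by_clause inside the call
  sum(val) FILTER (WHERE grp = 'a')                AS sum_a,          -- FILTER clause
  percentile_disc(0.5) WITHIN GROUP (ORDER BY val) AS median_disc     -- WITHIN GROUP form
FROM t;
```

Note that for ordinary aggregates the order_by_clause goes inside the parentheses, whereas ordered-set aggregates such as percentile_disc take it in the separate WITHIN GROUP clause.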

Example

1. For each group, return both the total row count and the count of rows that satisfy a condition.

postgres=# create table test(id int, c1 int);  
CREATE TABLE  
postgres=# insert into test select generate_series(1,10000), random()*10;  
INSERT 0 10000  
  
postgres=# select * from test limit 10;  
 id | c1   
----+----  
  1 | 10  
  2 |  4  
  3 |  6  
  4 |  1  
  5 |  4  
  6 |  9  
  7 |  9  
  8 |  7  
  9 |  5  
 10 |  4  
(10 rows)  
postgres=# select count(*), count(*) filter (where id<1000) from test group by c1;  
 count | count   
-------+-------  
  1059 |   118  
   998 |   109  
   999 |   101  
  1010 |    95  
   468 |    48  
   544 |    43  
   964 |   107  
   956 |   103  
  1021 |    87  
   977 |   101  
  1004 |    87  
(11 rows)  
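
A FILTER clause restricts which rows are fed to one aggregate without affecting the others in the same query. The sketch below, which uses a small hypothetical data set rather than the test table above, compares it with the older CASE-based formulation. It also selects and orders by the grouping column so that each output row can be matched to its group, which the query above does not do.

```sql
-- Hypothetical inline data standing in for the test table.
WITH t(id, c1) AS (
  VALUES (1, 0), (2, 0), (3, 1), (4, 1), (5, 1)
)
SELECT
  c1,
  count(*)                            AS total,            -- all rows in the group
  count(*) FILTER (WHERE id < 4)      AS matching_filter,  -- only rows with id < 4
  count(CASE WHEN id < 4 THEN 1 END)  AS matching_case     -- same result without FILTER
FROM t
GROUP BY c1
ORDER BY c1;
```

With this data, group 0 should report 2 total rows and 2 matching rows, and group 1 should report 3 total rows and 1 matching row, with both formulations agreeing.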

2. Aggregate multiple rows into a string or array in a specified order. A FILTER clause can also be added so that only rows satisfying a condition are aggregated.

postgres=# select string_agg(id::text, '-' order by id) filter (where id<100) from test group by c1;  
                string_agg                   
-------------------------------------------  
 35-65-74-97  
 4-12-19-31-36-40-85-89-90-98-99  
 17-18-22-42-43-44-58-59-64-70-75-83-84  
 11-14-15-16-21-30-41-54-62-67-73-80-81-94  
 2-5-10-51-79-93-96  
 9-26-45-46-47-61  
 3-27-28-37-48-55-56-68-69-77-92  
 8-20-24-33-34-49-50-60-63-66-78-91  
 25-39-53-57-71-76-82-87-95  
 6-7-29-32-38-72-86-88  
 1-13-23-52  
(11 rows)  
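
The same pattern applies when the target is an array instead of a string. The following sketch, again on hypothetical inline data, combines an ORDER BY inside array_agg with a FILTER clause, and shows string_agg alongside it for comparison.

```sql
-- Hypothetical inline data; column names mirror the test table above.
WITH t(id, c1) AS (
  VALUES (5, 1), (2, 1), (9, 1), (7, 2), (1, 2)
)
SELECT
  c1,
  array_agg(id ORDER BY id DESC) FILTER (WHERE id < 9) AS ids_desc,   -- ordered, filtered array
  string_agg(id::text, '-' ORDER BY id)                AS ids_joined  -- ordered, unfiltered string
FROM t
GROUP BY c1
ORDER BY c1;
```

One point worth keeping in mind: if the filter removes every row in a group, array_agg and string_agg return NULL for that group rather than an empty array or empty string.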

3. Compute the median of a column within each group.

postgres=# select percentile_cont(0.5) within group (order by id) from test group by c1;  
 percentile_cont   
-----------------  
          4911.5  
            5210  
            4698  
          4699.5  
            4955  
          5061.5  
            5115  
            5176  
          4897.5  
            5087  
            4973  
(11 rows)  
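
The fractional results above (for example 4911.5) arise because percentile_cont interpolates between the two middle values when a group has an even number of rows. The sketch below contrasts it with percentile_disc, which always returns a value that actually occurs in the input. The four-row data set is hypothetical.

```sql
-- Hypothetical inline data with an even number of rows.
WITH t(v) AS (
  VALUES (10), (20), (30), (40)
)
SELECT
  percentile_cont(0.5) WITHIN GROUP (ORDER BY v) AS median_cont,  -- interpolated: 25
  percentile_disc(0.5) WITHIN GROUP (ORDER BY v) AS median_disc   -- existing value: 20
FROM t;
```

Which of the two is appropriate depends on whether the result must be one of the stored values (percentile_disc) or whether an interpolated value is acceptable (percentile_cont).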

4. Compute the median within each group over only the rows that satisfy a filter condition.

postgres=# select percentile_cont(0.5) within group (order by id) filter (where id<100) from test group by c1;  
 percentile_cont   
-----------------  
            69.5  
              40  
              58  
            47.5  
              51  
            45.5  
              55  
            49.5  
              71  
              35  
              18  
(11 rows)  
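
The FILTER clause is applied before the ordered-set aggregate sees its input, so the median is taken over the surviving rows only. A minimal sketch on hypothetical inline data shows the filtered and unfiltered medians side by side, including a group in which no row passes the filter.

```sql
-- Hypothetical inline data; group 1 has no row with id < 100.
WITH t(id, c1) AS (
  VALUES (1, 0), (3, 0), (200, 0), (150, 1)
)
SELECT
  c1,
  percentile_cont(0.5) WITHIN GROUP (ORDER BY id)                         AS median_all,
  percentile_cont(0.5) WITHIN GROUP (ORDER BY id) FILTER (WHERE id < 100) AS median_filtered
FROM t
GROUP BY c1
ORDER BY c1;
```

For group 0 the unfiltered median is 3 and the filtered median is 2 (the midpoint of 1 and 3). For group 1 the filtered median is NULL because no rows remain after filtering.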

Summary

PostgreSQL's analytical features are comprehensive. Readers are encouraged to work through the articles listed in the Background section to get more out of them.

References

https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SYNTAX-AGGREGATES

Posted by thirdeye on Mon, 11 Feb 2019 17:09:19 -0800