Label
PostgreSQL, BRIN block-level index, scanning method, data structure, pages_per_range algorithm
background
BRIN is a new block-level index interface for PostgreSQL 9.5, which stores the boundary values (maximum, minimum) and other statistical information of the indexed fields at the block level.
When a field needs to be retrieved, the whole BRIN index needs to be scanned (this is the point worth optimizing at the core level of the BRIN index in the future). Then skip the HEAP PAGE that does not meet the requirement and scan the HEAP PAGE that does not meet the requirement. To achieve the purpose of data filtering.
As a result of the principle, the better the correlation is, the better the filterability of BRIN index is.
BRIN also supports a variety of types, multi-column fields and so on.
1, multiple columns
2, single row
3. Spatial data types
If you still don't know about BRIN, you can read some of my case articles.
PostgreSQL 10.0 preview Enhancement - BRIN Index Update smooth ness
<PostgreSQL 9.5 new feature - BRIN (block range index) index>
How many blocks do pages_per_range, the parameter supported by BRIN index, count the boundaries at a time?
This paper will take the following scenario as an example to explain the setting algorithm of pages_per_range parameter of BRIN index, the optimization of BRIN index column and the idea of optimization of BRIN index core.
brin Scanning Principle
The scanning principle of BRIN index is very simple. It scans the metadata of BRIN, compares the metadata with the user input conditions, filters the HEAP PAGE that does not meet the requirements, and only scans the HEAP PAGE that needs to be scanned.
Relevance optimization of BRIN index columns
Because BRIN is a block-level index, if the boundaries of blocks are large, or the overlap between blocks is high, then the filter of BRIN index is poor.
So BRIN is only suitable for storing columns with good linear correlation with values.
pg_stats.correlation can observe the linear correlation of columns.
Of course, we can also modify its storage artificially, change its linear correlation (sorted storage is the simplest method), and even change the local linear correlation. If you want to know more about the underlying principles, please refer to the following article.
Points Optimized by Multiple Conditional Scanning
When our query condition is multiple query conditions, PostgreSQL combines the scans of multiple indexes into one, skipping unqualified ones. This is bitmapAnd, bitmapOr.
<PostgreSQL bitmapAnd, bitmapOr, bitmap index scan, bitmap heap scan>
However, because the scanning of PostgreSQL BRIN index needs to scan the whole BRIN index, so every condition needs to be scanned once. When BRIN itself is large, the condition will multiply in more than one time.
The scanning mode of BRIN index is the key point that PostgreSQL can optimize at the future kernel level, such as organizing the boundary of BRIN by tree, without sweeping it all at once (too violent). PostgreSQL hasn't been done at present. Perhaps no one has built a BRIN index of page s_per_range=1 on tens of billions of tables.
In fact, I'll show you this later. Everyone knows why I want to optimize the pages_per_range parameter.
BRIN Index Parameter pages_per_range Selection Recommendation Algorithms
pages_per_range is granular and defaults to 128 (which means counting boundaries every 128 data blocks), which determines two things.
1. The accuracy of BRIN index. pages_per_range=1, indicating that the boundary is accurate to a block of data. The smaller the pages_per_range, the higher the accuracy, the better the filtering (note that the better the filtering depends on the linear correlation of the column, otherwise it is blind).
2. The size of BRIN index itself. The smaller pages_per_range, the larger the BRIN index itself. The bigger the BRIN is, the higher the cost of scanning BRIN blocks with a single BRIN index is.
So how appropriate is pages_per_range set?
According to my experience, 311GB table, set to 512 is a good choice. The smaller the table, the smaller the pages_per_range setting can be.
For 311GB tables, if pages_per_range=1, the BRIN index itself is as large as 1.6GB. It's hard to sweep. When set to 512, it's only about a few MB. Sweep it quickly (although the filterability may be poor, BRIN is swept once for every condition).
DEMO
1,pages_per_range=1
postgres=# \d bi_user_tmall_vis1 Unlogged table "public.bi_user_tmall_vis1" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- uid | bigint | | | bid | bigint | | | cnt | integer | | | Indexes: "idx_bi_user_tmall_vis1" brin (bid, cnt) WITH (pages_per_range='1') public | idx_bi_user_tmall_vis1 | index | postgres | bi_user_tmall_vis1 | 1644 MB | postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where bid=1 and cnt between 1 and 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=264463.65..274155.70 rows=7351 width=20) (actual time=8213.046..8213.057 rows=4 loops=1) Output: uid, bid, cnt Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Rows Removed by Index Recheck: 153 Heap Blocks: lossy=1 Buffers: shared hit=269675 -> Bitmap Index Scan on idx_bi_user_tmall_vis1 (cost=0.00..264461.81 rows=7379 width=0) (actual time=8213.023..8213.023 rows=10 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=269674 Planning time: 0.046 ms Execution time: 8213.080 ms (11 rows)
2,pages_per_range=128
postgres=# \d bi_user_tmall_vis1 Unlogged table "public.bi_user_tmall_vis1" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- uid | bigint | | | bid | bigint | | | cnt | integer | | | Indexes: "idx_bi_user_tmall_vis1" brin (bid, cnt) WITH (pages_per_range='128') public | idx_bi1 | index | postgres | bi_user_tmall_vis1 | 13 MB | postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=2071.47..28408.93 rows=7351 width=20) (actual time=61.110..62.974 rows=4 loops=1) Output: uid, bid, cnt Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Rows Removed by Index Recheck: 20092 Heap Blocks: lossy=128 Buffers: shared hit=2236 -> Bitmap Index Scan on idx_bi1 (cost=0.00..2069.63 rows=20096 width=0) (actual time=61.100..61.100 rows=1280 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=2108 Planning time: 0.072 ms Execution time: 62.994 ms (11 rows) postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) or (bid=12000 and cnt <10000); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=6324.38..242299.15 rows=153721 width=20) (actual time=184.909..191.652 rows=138 loops=1) Output: uid, bid, cnt Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) OR ((bi_user_tmall_vis1.bid = 12000) AND (bi _user_tmall_vis1.cnt < 10000))) Rows Removed by Index Recheck: 60150 Heap Blocks: lossy=384 Buffers: shared hit=6708 -> BitmapOr (cost=6324.38..6324.38 rows=180864 width=0) (actual time=184.896..184.896 rows=0 loops=1) Buffers: shared hit=6324 -> Bitmap Index Scan on idx_bi1 (cost=0.00..2069.63 rows=20096 width=0) (actual time=61.600..61.600 rows=1280 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=2108 -> Bitmap Index Scan on idx_bi1 (cost=0.00..2069.73 rows=80384 width=0) (actual time=61.522..61.522 rows=1280 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=2108 -> Bitmap Index Scan on idx_bi1 (cost=0.00..2069.73 rows=80384 width=0) (actual time=61.773..61.773 rows=1280 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=2108 Planning time: 0.091 ms Execution time: 191.684 ms (19 rows)
3, pages_per_range=256
public | idx_bi | index | postgres | bi_user_tmall_vis1 | 6624 kB | postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where bid=1 and cnt between 1 and 100; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=1038.00..53587.92 rows=7351 width=20) (actual time=30.259..33.742 rows=4 loops=1) Output: uid, bid, cnt Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Rows Removed by Index Recheck: 40188 Heap Blocks: lossy=256 Buffers: shared hit=1310 -> Bitmap Index Scan on idx_bi (cost=0.00..1036.16 rows=40192 width=0) (actual time=30.251..30.251 rows=2560 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=1054 Planning time: 0.061 ms Execution time: 33.759 ms (11 rows) postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) or (bid=12000 and cnt <10000); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=3223.91..265138.74 rows=153721 width=20) (actual time=90.760..105.509 rows=138 loops=1) Output: uid, bid, cnt Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) OR ((bi_user_tmall_vis1.bid = 12000) AND (bi _user_tmall_vis1.cnt < 10000))) Rows Removed by Index Recheck: 120438 Heap Blocks: lossy=768 Buffers: shared hit=3930 -> BitmapOr (cost=3223.91..3223.91 rows=200960 width=0) (actual time=90.746..90.746 rows=0 loops=1) Buffers: shared hit=3162 -> Bitmap Index Scan on idx_bi (cost=0.00..1036.16 rows=40192 width=0) (actual time=30.838..30.838 rows=2560 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=1054 -> Bitmap Index Scan on idx_bi (cost=0.00..1036.23 rows=80384 width=0) (actual time=29.966..29.966 rows=2560 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=1054 -> Bitmap Index Scan on idx_bi (cost=0.00..1036.23 rows=80384 width=0) (actual time=29.940..29.940 rows=2560 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=1054 Planning time: 0.131 ms Execution time: 105.555 ms (19 rows)
4,pages_per_range=512
public | idx_bi | index | postgres | bi_user_tmall_vis1 | 3336 kB | postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where bid=1 and cnt between 1 and 100; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=521.47..105255.40 rows=7351 width=20) (actual time=16.024..25.791 rows=4 loops=1) Output: uid, bid, cnt Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Rows Removed by Index Recheck: 80380 Heap Blocks: lossy=512 Buffers: shared hit=529 read=511 -> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=16.010..16.010 rows=5120 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=528 Planning time: 0.238 ms Execution time: 25.822 ms (11 rows) postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) or (bid=12000 and cnt <10000); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=1674.17..315338.06 rows=153721 width=20) (actual time=47.115..78.014 rows=138 loops=1) Output: uid, bid, cnt Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) OR ((bi_user_tmall_vis1.bid = 12000) AND (bi _user_tmall_vis1.cnt < 10000))) Rows Removed by Index Recheck: 241014 Heap Blocks: lossy=1536 Buffers: shared hit=2608 read=512 -> BitmapOr (cost=1674.17..1674.17 rows=241151 width=0) (actual time=47.099..47.099 rows=0 loops=1) Buffers: shared hit=1584 -> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=16.167..16.167 rows=5120 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=528 -> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=15.494..15.494 rows=5120 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=528 -> Bitmap Index Scan on idx_bi (cost=0.00..519.63 rows=80384 width=0) (actual time=15.437..15.437 rows=5120 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=528 Planning time: 0.145 ms Execution time: 78.062 ms (19 rows)
5,pages_per_range=sqrt(pg_class.relpages)=6384
public | idx_bi | index | postgres | bi_user_tmall_vis1 | 312 kB | postgres=# create index idx_bi on bi_user_tmall_vis1 using brin (bid,cnt) WITH (pages_per_range='6384'); CREATE INDEX postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) or (bid=12000 and cnt <10000); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=252.98..3620468.00 rows=153721 width=20) (actual time=4.027..138.993 rows=138 loops=1) Output: uid, bid, cnt Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) OR ((bi_user_tmall_vis1.bid = 12000) AND (bi _user_tmall_vis1.cnt < 10000))) Rows Removed by Index Recheck: 1002150 Heap Blocks: lossy=6384 Buffers: shared hit=1662 read=4848 -> BitmapOr (cost=252.98..252.98 rows=3006577 width=0) (actual time=4.010..4.010 rows=0 loops=1) Buffers: shared hit=126 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.373..1.373 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=42 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.325..1.325 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=42 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.310..1.310 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 12000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=42 Planning time: 0.307 ms Execution time: 139.046 ms (19 rows) postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100) or (bid=2000 and cnt <10000) ; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=132.06..2459840.30 rows=80537 width=20) (actual time=2.735..112.409 rows=65 loops=1) Output: uid, bid, cnt Recheck Cond: (((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) OR ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000))) Rows Removed by Index Recheck: 1002223 Heap Blocks: lossy=6384 Buffers: shared hit=6468 -> BitmapOr (cost=132.06..132.06 rows=2004385 width=0) (actual time=2.720..2.720 rows=0 loops=1) Buffers: shared hit=84 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.401..1.401 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=42 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.318..1.318 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 2000) AND (bi_user_tmall_vis1.cnt < 10000)) Buffers: shared hit=42 Planning time: 0.126 ms Execution time: 112.449 ms (16 rows) postgres=# explain (analyze,timing,costs,buffers,verbose) select * from bi_user_tmall_vis1 where (bid=1 and cnt between 1 and 100); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on public.bi_user_tmall_vis1 (cost=47.73..1258330.06 rows=7351 width=20) (actual time=1.381..97.717 rows=4 loops=1) Output: uid, bid, cnt Recheck Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Rows Removed by Index Recheck: 1002284 Heap Blocks: lossy=6384 Buffers: shared hit=6426 -> Bitmap Index Scan on idx_bi (cost=0.00..45.90 rows=1002192 width=0) (actual time=1.368..1.368 rows=63840 loops=1) Index Cond: ((bi_user_tmall_vis1.bid = 1) AND (bi_user_tmall_vis1.cnt >= 1) AND (bi_user_tmall_vis1.cnt <= 100)) Buffers: shared hit=42 Planning time: 0.109 ms Execution time: 97.744 ms (11 rows)
Contrast of different pages_per_range
Index accuracy | Single Table Data Volume | Single table size | Index size | 1 conditions | 2 conditions | 3 conditions |
---|---|---|---|---|---|---|
pages_per_range=1 | 6 billion 400 million | 311GB | 1.6GB | 8.2 seconds | - | - |
pages_per_range=128 | 6 billion 400 million | 311GB | 13MB | 62 milliseconds | - | 191 milliseconds |
pages_per_range=256 | 6 billion 400 million | 311GB | 6MB | 33 milliseconds | - | 105 milliseconds |
pages_per_range=512 | 6 billion 400 million | 311GB | 3MB | 25 milliseconds | - | 78 milliseconds |
pages_per_range=sqrt(pg_class.relpages)=6384 | 6 billion 400 million | 311GB | 300KB | 97 milliseconds | 112 milliseconds | 139 milliseconds |
Although the precision is high, the bigger the index itself is, the higher the cost of scanning the index itself is. 8.2 seconds is the case.
When the precision is set to 512, the single condition becomes 25 milliseconds, while the index size is only 3 MB.
It's not unexpected to be unhappy.
Reflections on BRIN Kernel Optimization
In order to reduce the scanning overhead of BRIN index itself, we can regenerate the boundary of BRIN index into a tree, scan through the tree, improve the speed, not the way of full scan.
In the future, we just need to consider the accuracy = 1. Because this filter is the best, and the scanning cost of BRIN index itself is very low. Thus, the efficiency of BRIN index is greatly improved under the condition of massive data.
Summary
This paper mainly talks about the principle of BRIN index, the principle of scanning, the principle of precision parameters, and how to select precision parameters. It also talks about how to reduce the cost of BRIN index itself by adjusting the method of optimizing BRIN index scanning.
Well, have a good time. You know how much PostgreSQL, she can give you how much, PG is a very playable enterprise-level open source database, refueling.
The characteristics of BRIN index can be used to support massive data filtering at trillion level or even larger, while the storage of index and the impact on writing are almost zero. Absolutely belongs to the characteristics of black technology level.
Reference resources
PostgreSQL 10.0 preview Enhancement - BRIN Index Update smooth ness
<PostgreSQL 9.5 new feature - BRIN (block range index) index>