Label
PostgreSQL , pgbench , tpcb
background
pgbench is a test tool for PG. The built-in test CASE is tpcb test. It also supports users to write test CASE by themselves.
A large number of custom CASE references
https://github.com/digoal/blog/blob/master/201711/readme.md
When we use tpcb to test CASE, if the amount of data generated is too large, for example, I recently generated 1 trillion CASE, we can consider using partition tables. However, when the performance of PG built-in partition tables is very large, using PREPARED STATEMENT will lead to performance degradation.
It is recommended to use the pg_pathman partition plug-in to eliminate this performance problem.
Or you can change the SQL of pgbench and use udf and dynamic SQL to achieve the same performance.
Implementation of UDF Dynamic SQL for pgbench tpcb Partition Table
1. Primary QUERY is as follows
Read and write case
\set aid random(1, 100000 * :scale) \set bid random(1, 1 * :scale) \set tid random(1, 10 * :scale) \set delta random(-5000, 5000) BEGIN; UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; SELECT abalance FROM pgbench_accounts WHERE aid = :aid; UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); END;
Read-only case
\set aid random(1, 100000 * :scale) SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
Reconstruction of tpcb pgbench_accounts partition table
1. Definition structure
For example, 1 trillion
pgbench -i -I dt --tablespace=tbs1 -s 10000000
2. Using partitions
create table p (like pgbench_accounts) partition by RANGE ( aid ) tablespace tbs1;
3. Create 4097 partitions with 244140625 records per partition
do language plpgsql $$ declare i_rows_perpartition int8 := 244140625; begin for i in 0..4096 loop execute format ('create table pgbench_accounts%s partition of p for values from (%s) to (%s) tablespace tbs1', i, i*i_rows_perpartition, (i+1)*i_rows_perpartition); end loop; end; $$;
4. Rename table
drop table pgbench_accounts; alter table p rename to pgbench_accounts; -- alter table pgbench_accounts add constraint pk_pgbench_accounts_aid primary key (aid) using index tablespace tbs2;
5. New Task Table and Parallel Loading of Data
drop table task; create table task(id int primary key); insert into task select i from generate_series(0,4095) t(i);
6. Create a new basic data table and write 244140625 pieces of data.
create table init_accounts(aid int8); insert into init_accounts select generate_series(0,244140624);
7. Create UDF, call it once and write 244140625 pieces of data to a single partition.
create or replace function tpcb_init_accounts() returns void as $$ declare v_id int; begin with tmp as (select * from task limit 1 for update skip locked), tmp1 as (delete from task using tmp where task.id=tmp.id) select id into v_id from tmp; if found then execute format ('insert into pgbench_accounts%s select aid+%s*244140625::int8, ((aid+%s*244140625::int8)-1)/100000 + 1, 0 from init_accounts on conflict do nothing', v_id, v_id, v_id); end if; end; $$ language plpgsql strict;
8. Using pgbench, 64 concurrencies are opened to generate 1 trillion tpcb pgbench_accounts test data.
vi test.sql select tpcb_init_accounts(); nohup pgbench -M prepared -n -r -f ./test.sql -c 64 -j 64 -t 100 >./init.log 2>&1 &
9. Create task tables to generate pgbench_accounts partition table PK
drop table task; create table task(id int primary key); insert into task select i from generate_series(0,4095) t(i);
10. New UDF for generating pgbench_accounts partition table PK
create or replace function tpcb_init_accounts_pkey() returns void as $$ declare v_id int; begin with tmp as (select * from task limit 1 for update skip locked), tmp1 as (delete from task using tmp where task.id=tmp.id) select id into v_id from tmp; if found then execute format ('analyze pgbench_accounts%s', v_id); execute format ('alter table pgbench_accounts%s add constraint pk_pgbench_accounts%s_aid primary key (aid) using index tablespace tbs2', v_id, v_id); end if; end; $$ language plpgsql strict;
11. Generating pgbench_accounts partition table PK
vi test.sql select tpcb_init_accounts_pkey(); nohup pgbench -M prepared -n -r -f ./test.sql -c 64 -j 64 -t 100 >./init.log 2>&1 &
If you use the pg_pathman plug-in, don't bother.
1. Using pgbench to generate structure
pgbench -i -I dt --tablespace=tbs1 -s 10000000
2. Convert to partition table
https://github.com/postgrespro/pg_pathman
PostgreSQL 9.5 + Efficient Partition Table Implementation - pg_pathman
3. Using pgbench to generate data
pgbench -i -I g -n --tablespace=tbs1 -s 10000000
4, add PK
Table level parallel analyze surface pgbench -i -I p -n -s 10000000 --index-tablespace=tbs2
tpcb UDF
1. Read and write
create or replace function tpcb_rw( i_aid int8, i_bid int4, i_tid int4, i_delta int4, i_rows_perpartition int8 ) returns int as $$ declare i_suffix int := i_aid/i_rows_perpartition; res int; begin execute format('execute p1_%s(%s,%s)', i_suffix, i_delta, i_aid); execute format('execute p2_%s(%s)', i_suffix, i_aid) into res; UPDATE pgbench_tellers SET tbalance = tbalance + i_delta WHERE tid = i_tid; UPDATE pgbench_branches SET bbalance = bbalance + i_delta WHERE bid = i_bid; INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (i_tid, i_bid, i_aid, i_delta, CURRENT_TIMESTAMP); return res; exception when others then execute format('prepare p1_%s(int,int8) as UPDATE pgbench_accounts%s SET abalance = abalance + $1 WHERE aid = $2', i_suffix, i_suffix); execute format('prepare p2_%s(int8) as SELECT abalance FROM pgbench_accounts%s WHERE aid = $1', i_suffix, i_suffix); execute format('execute p1_%s(%s,%s)', i_suffix, i_delta, i_aid); execute format('execute p2_%s(%s)', i_suffix, i_aid) into res; return res; end; $$ language plpgsql strict;
2, read only
create or replace function tpcb_ro( i_aid int8, i_rows_perpartition int8 ) returns int as $$ declare i_suffix int := i_aid/i_rows_perpartition; res int; begin execute format('execute p2_%s(%s)', i_suffix, i_aid) into res; return res; exception when others then execute format('prepare p2_%s(int8) as SELECT abalance FROM pgbench_accounts%s WHERE aid = $1', i_suffix, i_suffix); execute format('execute p2_%s(%s)', i_suffix, i_aid) into res; return res; end; $$ language plpgsql strict;
3. Change to a custom script
Read and write CASE
vi rw.sql \set aid random(1, 100000 * :scale) \set bid random(1, 1 * :scale) \set tid random(1, 10 * :scale) \set delta random(-5000, 5000) SELECT tpcb_rw(:aid,:bid,:tid,:delta,244140625);
Read-only case
vi ro.sql \set aid random(1, 100000 * :scale) SELECT tpcb_ro(:aid,244140625);
Reference resources
https://github.com/digoal/blog/blob/master/201711/readme.md
https://github.com/postgrespro/pg_pathman
PostgreSQL 9.5 + Efficient Partition Table Implementation - pg_pathman