A recent project requires configuring a four-node PostgreSQL cluster on shared storage so that the cluster can provide load balancing.
However, PostgreSQL does not support running multiple instances from the same data directory: when you execute pg_ctl start, if another instance is already running against the specified data directory, the new instance will fail to start. We therefore chose a pgpool-II + repmgr strategy to implement load balancing on top of NAS shared storage. The architecture is illustrated below:
There are many ways to build on shared disks; this article mainly explains how to do it with repmgr:
(0) Preparations
In this example, four virtual machines are prepared, with the following IP addresses:
192.168.100.1 node1
192.168.100.2 node2
192.168.100.3 node3
192.168.100.4 node4
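All of the commands below reach the machines by hostname, so every node must be able to resolve node1 through node4. A minimal sketch, assuming the names are mapped in /etc/hosts rather than DNS (run as root on every node; skip if DNS already resolves them):

cat >> /etc/hosts <<EOF
192.168.100.1 node1
192.168.100.2 node2
192.168.100.3 node3
192.168.100.4 node4
EOF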
In the NAS storage, create the directories data1, data2, data3, and data4 and mount them on the corresponding data nodes. Note that the directory permissions must be 700.
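A sketch of the directory preparation, assuming the NAS export is mounted at /opt/data on every node (the mount point matches the data_directory settings used later; ownership should be the account that runs PostgreSQL):

[Harvest@node1 ~] mkdir -p /opt/data/data1 /opt/data/data2 /opt/data/data3 /opt/data/data4
[Harvest@node1 ~] chmod 700 /opt/data/data1 /opt/data/data2 /opt/data/data3 /opt/data/data4
[Harvest@node1 ~] ls -ld /opt/data/data*    # each entry should show drwx------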
Install PostgreSQL on all nodes. Note that initdb is executed only on node1; node2, node3, and node4 are not initialized (their data directories will be populated later by standby clone).
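A sketch of the one-time initialization on node1, assuming the PostgreSQL binaries are on the PATH:

[Harvest@node1 ~] initdb -D /opt/data/data1
# do NOT run initdb on node2/node3/node4 -- 'repmgr standby clone' will populate data2, data3, and data4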
Add the following settings to postgresql.conf on node1:
hot_standby = on
wal_level = 'replica'
max_wal_senders = 50
Start the database, then create the repmgr user and database (createuser and createdb need a running server):
[Harvest@node1 ~] pg_ctl start
[Harvest@node1 ~] createuser -s repmgr
[Harvest@node1 ~] createdb repmgr -O repmgr
Configure pg_hba.conf, adding the following rules:
local   replication   repmgr                      trust
host    replication   repmgr   127.0.0.1/32       trust
host    replication   repmgr   192.168.100.0/24   trust
local   repmgr        repmgr                      trust
host    repmgr        repmgr   127.0.0.1/32       trust
host    repmgr        repmgr   192.168.100.0/24   trust
Reload the server so the new pg_hba.conf rules take effect, then test whether you can connect through repmgr from node2 (or node3, node4):
[Harvest@node1 ~] pg_ctl reload
[Harvest@node2 ~] psql 'host=node1 user=repmgr dbname=repmgr connect_timeout=2'
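On success, psql should drop you to a prompt on the repmgr database, roughly like this (the version line will vary with your installation):

psql (12.4)
Type "help" for help.

repmgr=#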
(1) Install repmgr
First, unpack the repmgr source and install it:
[Harvest@node1 opt] cd repmgr-x.x.x
[Harvest@node1 repmgr-x.x.x] ./configure && make && make install  # configure requires both sed and pg_config; sed ships with essentially every Linux system, and pg_config is provided by the PostgreSQL installation
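If configure fails, first check that both prerequisites are actually visible on the PATH; a quick sanity check:

[Harvest@node1 repmgr-x.x.x] which sed pg_config
[Harvest@node1 repmgr-x.x.x] pg_config --version    # shows the PostgreSQL version repmgr will build against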
(2) Configuring node1
Create the configuration file repmgr1.conf on node1:
node_id=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/data/data1'
Next, register node1 as the primary server:
[Harvest@node1 ~] repmgr -f repmgr1.conf primary register
If the following output appears, the registration succeeded:
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (id: 1) registered
Verify the cluster state:
[Harvest@node1 ~] repmgr -f repmgr1.conf cluster show
The output is as follows:
 ID | Name  | Role    | Status    | Upstream | Connection string
----+-------+---------+-----------+----------+---------------------------------------------------------
 1  | node1 | primary | * running |          | host=node1 dbname=repmgr user=repmgr connect_timeout=2
(3) Adding node2
Configure repmgr2.conf as described above:
node_id=2
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/opt/data/data2'
Clone node2 from node1; under the hood this uses the pg_basebackup command to copy node1's data directory into the data2 directory:
[Harvest@node2 ~] repmgr -h node1 -U repmgr -d repmgr -f repmgr2.conf standby clone
The output below indicates that the backup succeeded:
NOTICE: using configuration file "repmgr2.conf"
NOTICE: destination directory "/opt/data/data2" provided
INFO: connecting to source node
NOTICE: checking for available walsenders on source node (2 required)
INFO: sufficient walsenders available on source node (2 required)
INFO: creating directory "/opt/data/data2"...
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing: pg_basebackup -l "repmgr base backup" -D /opt/data/data2 -h node1 -U repmgr -X stream
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /opt/data/data2 start
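Note that the clone does not start the instance automatically; before registering node2, start it exactly as the HINT above suggests:

[Harvest@node2 ~] pg_ctl -D /opt/data/data2 start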
Register node2:
[Harvest@node2 ~] repmgr -f repmgr2.conf standby register
The output is as follows:
NOTICE: standby node "node2" (ID: 2) successfully registered
Verify the cluster state:
[Harvest@node1 ~] repmgr -f repmgr1.conf cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------
 1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
 2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
(4) Registering node3 and node4
Node3 and node4 are configured the same way as node2, except that the parameter '--upstream-node-id' is added, which specifies the upstream node the standby streams from. Here node3 streams from node2:
[Harvest@node3 ~] repmgr -h node2 -U repmgr -d repmgr -f repmgr3.conf standby clone --upstream-node-id=2
[Harvest@node3 ~] repmgr -f repmgr3.conf standby register --upstream-node-id=2
Similarly, node4 streams from node3:
[Harvest@node4 ~] repmgr -h node3 -U repmgr -d repmgr -f repmgr4.conf standby clone --upstream-node-id=3
[Harvest@node4 ~] repmgr -f repmgr4.conf standby register --upstream-node-id=3
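As with node2, start each cloned instance (pg_ctl -D /opt/data/data3 start, and likewise for data4) before registering it. Once all four nodes are registered, the topology can be verified from any node; a sketch of the expected output, following the same format as above (columns vary slightly between repmgr versions):

[Harvest@node1 ~] repmgr -f repmgr1.conf cluster show

 ID | Name  | Role    | Status    | Upstream | Location | Connection string
----+-------+---------+-----------+----------+----------+--------------------------------------
 1  | node1 | primary | * running |          | default  | host=node1 dbname=repmgr user=repmgr
 2  | node2 | standby |   running | node1    | default  | host=node2 dbname=repmgr user=repmgr
 3  | node3 | standby |   running | node2    | default  | host=node3 dbname=repmgr user=repmgr
 4  | node4 | standby |   running | node3    | default  | host=node4 dbname=repmgr user=repmgr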
This completes repmgr's cascading replication configuration.
Reference resources
1. repmgr manual: https://repmgr.org/docs/current/index.html
2. repmgr manual in Chinese: http://citusdb.cn/?p=1068