PostgreSQL 10.0 preview enhancements - libpq supports multiple host connections (failover,LB) to make database HA and applications more closely coordinated

Keywords: PostgreSQL JDBC git github

Label

PostgreSQL , 10.0 , libpq , jdbc , failover , loadbalance , multi host , target_session_attrs

background

The word database should be familiar in the Internet. But how can master-backup switching and applications work together seamlessly? You may have such questions.

1. What type of QUERY is sent to the main library and what type of QUERY is sent to the standby?

2. How does the client connection cooperate after the role switching between the main library and the standby?

There are ways in which the industry can answer these two questions.

1. Cluster software is usually used, and VIP is used to represent the role of main and backup repositories. No matter how to switch, VIP1 is always the main repository and VIP2 is always the backup repository.

For the client, it is judged by VIP whether it is the main library or the backup library. The program itself controls which SQL is sent to the main library and which SQL is sent to the backup library.

A typical example

https://github.com/digoal/PostgreSQL_HA_with_primary_standby_2vip

2. Using VIP is a way to make database and application form a tacit understanding. Is there any better way? For example, can the database driver cooperate with the role judgment?

Oracle does a good job, Oracle's client can cooperate with RAC, automatically identify failover, switchover.

PostgreSQL jdbc driver also achieves similar functions, supporting failover, load balance.

https://jdbc.postgresql.org/documentation/94/connect.html

The parameters of PostgreSQL jdbc are as follows

1.

targetServerType = {any, master, slave, preferSlave}  
  
Allows opening connections to only servers with required state, the allowed values are any, master, slave and preferSlave.   
The master/slave distinction is currently done by observing if the server allows writes.   
The value preferSlave tries to connect to slaves if any are available, otherwise allows falls back to connecting also to master.  
  
any Represents connecting to any node  
preferSlave Represents preferential connection slave node  
master or slave Represents a connection master or slave Nodes.  

2.

hostRecheckSeconds = int  
  
Controls how long in seconds the knowledge about a host state is cached in JVM wide global cache.   
The default value is 10 seconds.  
  
Time interval for re-detection of node status  

3.

loadBalanceHosts = boolean  
  
In default mode (disabled) hosts are connected in the given order.   
If enabled hosts are chosen randomly from the set of suitable candidates.  
  
Whether to select nodes randomly or not, load balancing  

4.

Connection Fail-over  
  
To support simple connection fail-over it is possible to define multiple endpoints (host and port pairs) in the connection url separated by commas.   
The driver will try to once connect to each of them in order until the connection succeeds.   
If none succeed, a normal connection exception is thrown.  
  
//When multiple target nodes are configured, JDBC will connect the target sequentially until the connection is successful.  
  
jdbc Connection string grammar  
  
The syntax for the connection url is:  
jdbc:postgresql://host1:port1,host2:port2/database  
  
The simple connection fail-over is useful when running against a high availability postgres installation that has identical data on each node.   
For example streaming replication postgres or postgres-xc cluster.  

5.

Typically, an application can configure two connection pools, one for master, to which write requests are sent. The other configuration is slave priority, while turning on load balancing parameters and sending read requests to this connection pool.

For example an application can create two connection pools.   
One data source is for writes, another for reads.   
The write pool limits connections only to master node:  
  
jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master .   
  
And read pool balances connections between slaves nodes, but allows connections also to master if no slaves are available:  
jdbc:postgresql://node1,node2,node3/accounting?targetServerType=preferSlave&loadBalanceHosts=true  

PostgreSQL 10.0 libpq adds multiple connections

PostgreSQL 10.0 libpq layer also adds multi-connection function. The target_session_attrs parameter is introduced in the design, which can be set to read-write or any. Different target_session_attrs configurations correspond to different node detection mechanisms.

target_session_attrs=read-write, uses show transaction_read_only to detect the node, returns on to indicate that it is a read-only node, and off to indicate that it is a read-write node. (standby returns on, while default_transaction_read_only allows the master to return on as well).

target_session_attrs=any, which means no detection.

The commit information for the two patch es is as follows.

libpq: Allow connection strings and URIs to specify multiple hosts.  
  
author	Robert Haas <rhaas@postgresql.org>	  
Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400)  
committer	Robert Haas <rhaas@postgresql.org>	  
Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400)  
commit	274bb2b3857cc987cfa21d14775cae9b0dababa5  
tree	488b5fd46e2cb4acdab7fb2dd30c4e4d1d4bb7d1	tree | snapshot  
parent	770671062f130a830aa89100c9aa2d26f8d4bf32	commit | diff  
libpq: Allow connection strings and URIs to specify multiple hosts.  
  
It's also possible to specify a separate port for each host.  
  
Previously, we'd loop over every address returned by looking up the  
host name; now, we'll try every address for every host name.  
  
Patch by me.  Victor Wagner wrote an earlier patch for this feature,  
which I read, but I didn't use any of his code.  Review by Mithun Cy.  
libpq: Add target_session_attrs parameter.  
  
author	Robert Haas <rhaas@postgresql.org>	  
Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500)  
committer	Robert Haas <rhaas@postgresql.org>	  
Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500)  
Commit 274bb2b3857cc987cfa21d14775cae9b0dababa5 made it possible to  
specify multiple IPs in a connection string, but that's not good  
enough for the case where you have a read-write master and a bunch of  
read-only standbys and want to connect to whichever server is the  
master at the current time.  This commit allows that, by making it  
possible to specify target_session_attrs=read-write as a connection  
parameter.  
  
There was extensive discussion of the best name for the connection  
parameter and its values as well as the best way to distinguish master  
and standbys.  For now, adopt the same solution as JDBC: if the user  
wants a read-write connection, issue 'show transaction_read_only' and  
rejection the connection if the result is 'on'.  In the future, we  
could add additional values of this new target_session_attrs parameter  
that issue different queries; or we might have some way of  
distinguishing the server type without resorting to an SQL query; but  
right now, we have this, and that's (hopefully) a good start.  
  
Victor Wagner and Mithun Cy.  Design review by Álvaro Herrera, Catalin  
Iacob, Takayuki Tsunakawa, and Craig Ringer; code review by me.  I  
changed Mithun's patch to skip all remaining IPs for a host if we  
reject a connection based on this new parameter, rewrote the  
documentation, and did some other cosmetic cleanup.  
  
Discussion: http://postgr.es/m/CAD__OuhqPRGpcsfwPHz_PDqAGkoqS1UvnUnOnAB-LBWBW=wu4A@mail.gmail.com  

Introduction to the usage of libpq

URI format

postgresql://[user[:password]@][netloc][:port][,...][/dbname][?param1=value1&...]  
  
postgresql://  
postgresql://localhost  
postgresql://localhost:5433  
postgresql://localhost/mydb  
postgresql://user@localhost  
postgresql://user:secret@localhost  
postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp  
postgresql://host1:123,host2:456/somedb?target_session_attrs=any&application_name=myapp  

Configure multiple target nodes, and host:port is separated by commas.

host  
  
Comma-separated list of host names.   
If a host name begins with a slash, it specifies Unix-domain communication rather than TCP/IP communication;   
the value is the name of the directory in which the socket file is stored.   
If multiple host names are specified, each will be tried in turn in the order given.   
The default behavior when host is not specified is to connect to a Unix-domain socket in /tmp (or whatever socket directory was specified when PostgreSQL was built).   
On machines without Unix-domain sockets, the default is to connect to localhost.  
  
port  
  
Port number to connect to at the server host, or socket file name extension for Unix-domain connections.   
If the host parameter included multiple, comma-separated hosts, this parameter may specify a list of ports of equal length,   
or it may specify a single port number to be used for all hosts.  
  
target_session_attrs  
  
If this parameter is set to read-write, only a connection in which read-write transactions are accepted by default is considered acceptable.   
The query show transaction_read_only will be sent upon any successful connection;   
if it returns on, the connection will be closed.   
If multiple hosts were specified in the connection string, any remaining servers will be tried just as if the connection attempt had failed.   
The default value of this parameter, any, regards all connections as acceptable.  

With libpq, you can also achieve the same effect as jdbc (load balancing, automatically find master).

HA only needs to be responsible for switching roles, and no longer need to be responsible for switching IP addresses. You can be more careful.

For this discussion of patch, see Mail Group, URL at the end of this article.

PostgreSQL community style is very rigorous, a patch may be discussed in the mail group for several months or even years, according to your opinion repeated amendments, patch merged into master is very mature, so the stability of PostgreSQL is well known.

Reference resources

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=274bb2b3857cc987cfa21d14775cae9b0dababa5

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=721f7bd3cbccaf8c07cad2707826b83f84694832

https://www.postgresql.org/docs/devel/static/libpq-connect.html#libpq-connstring

https://jdbc.postgresql.org/documentation/94/connect.html

Posted by qiangxd on Sun, 06 Jan 2019 20:09:10 -0800