Label
PostgreSQL , 10.0 , libpq , jdbc , failover , loadbalance , multi host , target_session_attrs
background
The word database should be familiar in the Internet. But how can master-backup switching and applications work together seamlessly? You may have such questions.
1. What type of QUERY is sent to the main library and what type of QUERY is sent to the standby?
2. How does the client connection cooperate after the role switching between the main library and the standby?
There are ways in which the industry can answer these two questions.
1. Cluster software is usually used, and VIP is used to represent the role of main and backup repositories. No matter how to switch, VIP1 is always the main repository and VIP2 is always the backup repository.
For the client, it is judged by VIP whether it is the main library or the backup library. The program itself controls which SQL is sent to the main library and which SQL is sent to the backup library.
A typical example
https://github.com/digoal/PostgreSQL_HA_with_primary_standby_2vip
2. Using VIP is a way to make database and application form a tacit understanding. Is there any better way? For example, can the database driver cooperate with the role judgment?
Oracle does a good job, Oracle's client can cooperate with RAC, automatically identify failover, switchover.
PostgreSQL jdbc driver also achieves similar functions, supporting failover, load balance.
https://jdbc.postgresql.org/documentation/94/connect.html
The parameters of PostgreSQL jdbc are as follows
1.
targetServerType = {any, master, slave, preferSlave} Allows opening connections to only servers with required state, the allowed values are any, master, slave and preferSlave. The master/slave distinction is currently done by observing if the server allows writes. The value preferSlave tries to connect to slaves if any are available, otherwise allows falls back to connecting also to master. any Represents connecting to any node preferSlave Represents preferential connection slave node master or slave Represents a connection master or slave Nodes.
2.
hostRecheckSeconds = int Controls how long in seconds the knowledge about a host state is cached in JVM wide global cache. The default value is 10 seconds. Time interval for re-detection of node status
3.
loadBalanceHosts = boolean In default mode (disabled) hosts are connected in the given order. If enabled hosts are chosen randomly from the set of suitable candidates. Whether to select nodes randomly or not, load balancing
4.
Connection Fail-over To support simple connection fail-over it is possible to define multiple endpoints (host and port pairs) in the connection url separated by commas. The driver will try to once connect to each of them in order until the connection succeeds. If none succeed, a normal connection exception is thrown. //When multiple target nodes are configured, JDBC will connect the target sequentially until the connection is successful. jdbc Connection string grammar The syntax for the connection url is: jdbc:postgresql://host1:port1,host2:port2/database The simple connection fail-over is useful when running against a high availability postgres installation that has identical data on each node. For example streaming replication postgres or postgres-xc cluster.
5.
Typically, an application can configure two connection pools, one for master, to which write requests are sent. The other configuration is slave priority, while turning on load balancing parameters and sending read requests to this connection pool.
For example an application can create two connection pools. One data source is for writes, another for reads. The write pool limits connections only to master node: jdbc:postgresql://node1,node2,node3/accounting?targetServerType=master . And read pool balances connections between slaves nodes, but allows connections also to master if no slaves are available: jdbc:postgresql://node1,node2,node3/accounting?targetServerType=preferSlave&loadBalanceHosts=true
PostgreSQL 10.0 libpq adds multiple connections
PostgreSQL 10.0 libpq layer also adds multi-connection function. The target_session_attrs parameter is introduced in the design, which can be set to read-write or any. Different target_session_attrs configurations correspond to different node detection mechanisms.
target_session_attrs=read-write, uses show transaction_read_only to detect the node, returns on to indicate that it is a read-only node, and off to indicate that it is a read-write node. (standby returns on, while default_transaction_read_only allows the master to return on as well).
target_session_attrs=any, which means no detection.
The commit information for the two patch es is as follows.
libpq: Allow connection strings and URIs to specify multiple hosts. author Robert Haas <rhaas@postgresql.org> Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400) committer Robert Haas <rhaas@postgresql.org> Thu, 3 Nov 2016 21:25:20 +0800 (09:25 -0400) commit 274bb2b3857cc987cfa21d14775cae9b0dababa5 tree 488b5fd46e2cb4acdab7fb2dd30c4e4d1d4bb7d1 tree | snapshot parent 770671062f130a830aa89100c9aa2d26f8d4bf32 commit | diff libpq: Allow connection strings and URIs to specify multiple hosts. It's also possible to specify a separate port for each host. Previously, we'd loop over every address returned by looking up the host name; now, we'll try every address for every host name. Patch by me. Victor Wagner wrote an earlier patch for this feature, which I read, but I didn't use any of his code. Review by Mithun Cy.
libpq: Add target_session_attrs parameter. author Robert Haas <rhaas@postgresql.org> Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500) committer Robert Haas <rhaas@postgresql.org> Wed, 30 Nov 2016 01:18:31 +0800 (12:18 -0500) Commit 274bb2b3857cc987cfa21d14775cae9b0dababa5 made it possible to specify multiple IPs in a connection string, but that's not good enough for the case where you have a read-write master and a bunch of read-only standbys and want to connect to whichever server is the master at the current time. This commit allows that, by making it possible to specify target_session_attrs=read-write as a connection parameter. There was extensive discussion of the best name for the connection parameter and its values as well as the best way to distinguish master and standbys. For now, adopt the same solution as JDBC: if the user wants a read-write connection, issue 'show transaction_read_only' and rejection the connection if the result is 'on'. In the future, we could add additional values of this new target_session_attrs parameter that issue different queries; or we might have some way of distinguishing the server type without resorting to an SQL query; but right now, we have this, and that's (hopefully) a good start. Victor Wagner and Mithun Cy. Design review by Álvaro Herrera, Catalin Iacob, Takayuki Tsunakawa, and Craig Ringer; code review by me. I changed Mithun's patch to skip all remaining IPs for a host if we reject a connection based on this new parameter, rewrote the documentation, and did some other cosmetic cleanup. Discussion: http://postgr.es/m/CAD__OuhqPRGpcsfwPHz_PDqAGkoqS1UvnUnOnAB-LBWBW=wu4A@mail.gmail.com
Introduction to the usage of libpq
URI format
postgresql://[user[:password]@][netloc][:port][,...][/dbname][?param1=value1&...] postgresql:// postgresql://localhost postgresql://localhost:5433 postgresql://localhost/mydb postgresql://user@localhost postgresql://user:secret@localhost postgresql://other@localhost/otherdb?connect_timeout=10&application_name=myapp postgresql://host1:123,host2:456/somedb?target_session_attrs=any&application_name=myapp
Configure multiple target nodes, and host:port is separated by commas.
host Comma-separated list of host names. If a host name begins with a slash, it specifies Unix-domain communication rather than TCP/IP communication; the value is the name of the directory in which the socket file is stored. If multiple host names are specified, each will be tried in turn in the order given. The default behavior when host is not specified is to connect to a Unix-domain socket in /tmp (or whatever socket directory was specified when PostgreSQL was built). On machines without Unix-domain sockets, the default is to connect to localhost. port Port number to connect to at the server host, or socket file name extension for Unix-domain connections. If the host parameter included multiple, comma-separated hosts, this parameter may specify a list of ports of equal length, or it may specify a single port number to be used for all hosts. target_session_attrs If this parameter is set to read-write, only a connection in which read-write transactions are accepted by default is considered acceptable. The query show transaction_read_only will be sent upon any successful connection; if it returns on, the connection will be closed. If multiple hosts were specified in the connection string, any remaining servers will be tried just as if the connection attempt had failed. The default value of this parameter, any, regards all connections as acceptable.
With libpq, you can also achieve the same effect as jdbc (load balancing, automatically find master).
HA only needs to be responsible for switching roles, and no longer need to be responsible for switching IP addresses. You can be more careful.
For this discussion of patch, see Mail Group, URL at the end of this article.
PostgreSQL community style is very rigorous, a patch may be discussed in the mail group for several months or even years, according to your opinion repeated amendments, patch merged into master is very mature, so the stability of PostgreSQL is well known.
Reference resources
https://www.postgresql.org/docs/devel/static/libpq-connect.html#libpq-connstring