HAProxy Scheduling Algorithm

Keywords: Web Server Session iptables Nginx

HAProxy selects the scheduling algorithm for back-end servers with the balance directive, which can be set in a listen or backend section.
HAProxy's scheduling algorithms are divided into static and dynamic algorithms; some algorithms can be either static or dynamic depending on their parameters.

1. Static algorithms

Static algorithms: requests are scheduled according to predefined rules, regardless of the back-end servers' current load, connection count, or response speed. Weights cannot be modified at runtime; a new weight takes effect only after HAProxy is restarted.

1. static-rr

Weighted round robin. Runtime weight adjustment and slow start of back-end servers are not supported, and there is no limit on the number of back-end hosts.

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance static-rr
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 2 check inter 3000 fall 2 rise 5
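
The weight ratio in the configuration above can be illustrated with a minimal Python sketch. It is not HAProxy's implementation: the function name static_rr is hypothetical, and the exact order HAProxy produces may differ. Only the 1:2 proportion between web1 and web2 is the point.

```python
from itertools import cycle

def static_rr(servers):
    """Return an endless iterator over server names in weighted order.

    servers: list of (name, weight) pairs, mirroring the server lines above.
    """
    # Repeat each server name once per unit of weight, then loop forever.
    slots = [name for name, weight in servers for _ in range(weight)]
    return cycle(slots)

picker = static_rr([("web1", 1), ("web2", 2)])
first_six = [next(picker) for _ in range(6)]
print(first_six)
```

Over any six consecutive picks web2 is chosen twice as often as web1, which matches weight 1 versus weight 2.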

2. first

Servers are used in the order in which they appear in the list, from top to bottom. New requests are assigned to the next server only when the number of connections to the first server reaches its maximum (maxconn), so the servers' weight settings are ignored.

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance first
    server web1 192.168.7.103:80 maxconn 2 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5
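
A rough Python sketch of the selection rule described above, assuming connections never close during the run. The names pick_first, servers, and active are hypothetical and only mirror the maxconn 2 setting on web1.

```python
def pick_first(servers, active):
    """Return the first server in list order that still has a free slot.

    servers: list of (name, maxconn) pairs; maxconn None means no limit.
    active:  dict mapping server name to its current connection count.
    """
    for name, maxconn in servers:
        if maxconn is None or active.get(name, 0) < maxconn:
            return name
    return None  # every server is at its connection limit

servers = [("web1", 2), ("web2", None)]
active = {}
order = []
for _ in range(4):
    chosen = pick_first(servers, active)
    active[chosen] = active.get(chosen, 0) + 1
    order.append(chosen)
print(order)
```

web1 takes the first two connections; web2 is used only once web1 has reached its limit of 2.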

2. Dynamic algorithms:

Dynamic algorithms: scheduling is adjusted according to the state of the back-end servers, for example by preferring servers with a low current load, and weights can be adjusted while HAProxy is running, without a restart.

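Adjusting server weights at runtime

Weights are read and changed through the HAProxy socket. The session below ends with the error returned when the backend uses a static algorithm: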

[root]# yum install socat
[root]# echo "show info" | socat stdio /var/lib/haproxy/haproxy.sock
[root]# echo "get weight web_host/web1" | socat stdio /var/lib/haproxy/haproxy.sock
1 (initial 1)

[root]# echo "set weight web_host/web1 2" | socat stdio /var/lib/haproxy/haproxy.sock
Backend is using a static LB algorithm and only accepts weights '0%' and '100%'.

1. roundrobin

Weighted round robin, a dynamic algorithm that supports runtime weight adjustment; it is not identical to rr (round robin) in LVS. roundrobin in HAProxy supports slow start (a newly added server receives a gradually increasing share of requests), and it supports a maximum of 4095 real servers in each backend.

roundrobin is the default scheduling algorithm and supports dynamic adjustment of real server weights.

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance roundrobin
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 2 check inter 3000 fall 2 rise 5

Dynamically adjust weights:

[root]# echo "get weight web_host/web1" | socat stdio /var/lib/haproxy/haproxy.sock1
1 (initial 1)

[root]# echo "set weight web_host/web1 3" | socat stdio /var/lib/haproxy/haproxy.sock1
[root]# echo "get weight web_host/web1" | socat stdio /var/lib/haproxy/haproxy.sock1
3 (initial 1)

#Manually take a back-end server offline
[root]# echo "disable server web_host/web1" | socat stdio /var/lib/haproxy/haproxy.sock1

2. leastconn

Weighted least connections, a dynamic algorithm that supports runtime weight adjustment and slow start. New client connections are scheduled to the back-end server that currently has the fewest connections, which makes it better suited to long-lived connections such as MySQL.

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance leastconn
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5
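
The idea behind leastconn can be sketched in a few lines of Python. This is an illustration under the assumption that the connection count is simply divided by the weight; pick_leastconn is a hypothetical name and HAProxy's internal bookkeeping is more involved.

```python
def pick_leastconn(servers, active):
    """Pick the server with the lowest connections-to-weight ratio.

    servers: list of (name, weight) pairs with weight > 0.
    active:  dict mapping server name to its current connection count.
    """
    return min(servers, key=lambda s: active.get(s[0], 0) / s[1])[0]

# Equal weights: the server with fewer connections wins.
equal = pick_leastconn([("web1", 1), ("web2", 1)], {"web1": 3, "web2": 1})

# Unequal weights: web2 holds more connections but has twice the weight.
weighted = pick_leastconn([("web1", 1), ("web2", 2)], {"web1": 2, "web2": 3})
print(equal, weighted)
```

In the second call web2 is still chosen because 3 connections at weight 2 is a lighter relative load than 2 connections at weight 1.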

3. Hybrid algorithm

Some algorithms can be either static or dynamic.

1. source

Source address hashing: the request is forwarded to a back-end server chosen by hashing the user's source address, so subsequent requests from the same source address are forwarded to the same back-end web server. The default is the static map-based (modulo) mode, which can be changed through the options supported by hash-type. This algorithm suits scenarios such as session persistence and caching.

Source hashing can select the back-end server for a client request in two ways: map-based (modulo) hashing and consistent hashing.

map-based (modulo)
The hash is taken modulo the total server weight. This method is static: it does not support online weight adjustment or slow start, and it distributes requests evenly across the back-end servers. The disadvantage is that when the total server weight changes, that is, when a server goes online or offline, the overall scheduling result changes.

Map-based (modulo) configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode tcp
    log global
    balance source
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5
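
The disadvantage of the modulo method can be made visible with a small Python sketch. Several things here are assumptions made for illustration: the integer value of the IPv4 address stands in for HAProxy's real hash function, a third server web3 is added to the example, and map_based_pick is a hypothetical name.

```python
import ipaddress

def map_based_pick(client_ip, servers):
    """Pick a server by taking the key modulo the total weight.

    servers: list of (name, weight) pairs.
    """
    # One table slot per unit of weight, so len(table) is the total weight.
    table = [name for name, weight in servers for _ in range(weight)]
    key = int(ipaddress.IPv4Address(client_ip))  # stand-in for a real hash
    return table[key % len(table)]

clients = [f"10.0.0.{i}" for i in range(1, 101)]
three = [("web1", 1), ("web2", 1), ("web3", 1)]
two = three[:2]  # web3 goes offline

# Clients whose server changes at all.
moved = sum(map_based_pick(ip, three) != map_based_pick(ip, two) for ip in clients)
# Clients that were NOT on web3 but are moved anyway.
needless = sum(
    map_based_pick(ip, three) != "web3"
    and map_based_pick(ip, three) != map_based_pick(ip, two)
    for ip in clients
)
print(f"{moved} of {len(clients)} clients changed server, {needless} of them needlessly")
```

Clients that were on web3 have to move in any case, but the modulo change also reshuffles clients between web1 and web2, which breaks session persistence for them.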

Consistent hash:
Consistent hashing is dynamic and supports online weight adjustment and slow start. Its advantage is that when the total server weight changes, the effect on the scheduling result is local and does not cause large changes. The hash is computed as hash(o) mod n.

[Figure: hash objects and the mapping of hash objects to back-end servers]

[Figure: consistent hash diagram, showing scheduling when back-end servers go online and offline]

Consistent hash configuration example:

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode tcp
    log global
    balance source
    hash-type consistent
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5
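
Why the effect of a server change stays local can be shown with a generic hash ring in Python. This is a textbook sketch, not HAProxy's code: MD5, 100 points per unit of weight, the third server web3, and the names HashRing and ring_hash are all assumptions made for the example.

```python
import bisect
import hashlib

def ring_hash(value):
    """Map a string to a 32-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, servers, points_per_weight=100):
        # Each server owns weight * points_per_weight positions on the ring.
        self.ring = sorted(
            (ring_hash(f"{name}#{i}"), name)
            for name, weight in servers
            for i in range(weight * points_per_weight)
        )
        self.positions = [position for position, _ in self.ring]

    def pick(self, client_ip):
        # First ring position at or after the key, wrapping around at the end.
        idx = bisect.bisect_left(self.positions, ring_hash(client_ip))
        return self.ring[idx % len(self.ring)][1]

clients = [f"10.0.0.{i}" for i in range(1, 101)]
ring3 = HashRing([("web1", 1), ("web2", 1), ("web3", 1)])
ring2 = HashRing([("web1", 1), ("web2", 1)])  # web3 goes offline

# Clients that stay on the same server after web3 is removed.
kept = sum(ring3.pick(ip) == ring2.pick(ip) for ip in clients)
print(f"{kept} of {len(clients)} clients kept their server")
```

Only clients that were mapped to web3 are redistributed; every client on web1 or web2 keeps its server, because the ring positions of the remaining servers do not change.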

2. uri

Hashes the URI requested by the user and forwards the request to the selected back-end server. Either map-based (modulo) or consistent hashing can be used, as defined by the hash-type values map-based and consistent.

uri map-based (modulo) configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance uri
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

uri consistent hash configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance uri
    hash-type consistent
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

3. url_param:

url_param hashes the value of the named parameter in the query string of the URL requested by the user; the hash is divided by the total server weight and the request is dispatched to the selected server. It is usually used to track users, so that requests from the same user always go to the same real server.

Suppose:
    url = http://www.magedu.com/foo/bar/index.php?k1=v1&k2=v2
Then:
    host = "www.magedu.com"
    url_param = "k1=v1&k2=v2"
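
The same split can be reproduced with Python's standard urllib.parse module. This is only a way to see which part of the request each hash key comes from; it says nothing about how HAProxy parses requests internally.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.magedu.com/foo/bar/index.php?k1=v1&k2=v2"
parts = urlsplit(url)
params = parse_qs(parts.query)  # each value is a list, one entry per occurrence

print(parts.hostname)   # the host
print(parts.path)       # the path part of the URI
print(parts.query)      # the full query string
print(params["k1"][0])  # the value of a single named parameter
```

With balance url_param k1, the value v1 would be the hash key, so two requests with the same k1 value land on the same server.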

url_param map-based (modulo) configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance url_param name #Hashes the value of the url parameter "name"
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

url_param consistent hash configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance url_param name #Hashes the value of the url parameter "name"
    hash-type consistent
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

4. hdr

Hashes a specified header in each HTTP request from the user. The HTTP header given by name is extracted and hashed, the hash is divided by the total server weight, and the request is dispatched to the selected server. If the header is missing or has no valid value, the default round robin scheduling is used instead.

hdr map-based (modulo) configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance hdr(User-Agent)
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

hdr consistent hash configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance hdr(User-Agent)
    hash-type consistent
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

5. rdp-cookie

rdp-cookie load-balances Remote Desktop (RDP) traffic and uses the RDP cookie to keep a session on the same server.

rdp-cookie map-based (modulo) configuration example

listen RDP
    bind 192.168.7.101:3389
    balance rdp-cookie
    mode tcp
    server rdp0 172.18.132.20:3389 check fall 3 rise 5 inter 2000 weight 1

rdp-cookie consistent hash configuration example

listen RDP
    bind 192.168.7.101:3389
    balance rdp-cookie
    hash-type consistent
    mode tcp
    server rdp0 172.18.132.20:3389 check fall 3 rise 5 inter 2000 weight 1

Implementation based on iptables

[root]# vim /etc/sysctl.conf    #Set net.ipv4.ip_forward = 1
[root]# sysctl -p
net.ipv4.ip_forward = 1

[root]# iptables -t nat -A PREROUTING -d 192.168.7.101 -p tcp --dport 3389 -j DNAT --to-destination 172.18.139.20:3389
[root]# iptables -t nat -A POSTROUTING -s 192.168.0.0/21 -j SNAT --to-source 192.168.7.101

6. random

Version 1.9 added a load balancing algorithm called random, which uses a random number as the key for the consistent hash. Random load balancing is useful for large server farms or when servers are frequently added or removed, because it avoids the hammering effect that roundrobin or leastconn can cause in this situation.

random configuration example

listen web_host
    bind 192.168.7.101:80,:8801-8810,192.168.7.101:9001-9010
    mode http
    log global
    balance random
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5

Summary of algorithms

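Algorithm     TCP/HTTP    Dynamic/Static
static-rr     tcp/http    static
first         tcp/http    static
roundrobin    tcp/http    dynamic
leastconn     tcp/http    dynamic
random        tcp/http    dynamic
source        tcp/http    depends on whether hash-type is consistent
uri           http        depends on whether hash-type is consistent
url_param     http        depends on whether hash-type is consistent
hdr           http        depends on whether hash-type is consistent
rdp-cookie    tcp         depends on whether hash-type is consistent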

Algorithm usage scenarios

Algorithm     Use scenario
first         Rarely used
static-rr     Web cluster with shared sessions
roundrobin
random
leastconn     Databases
source        Session persistence based on the client's public IP
uri           Cache servers, CDN providers (ChinaCache, Baidu, Aliyun, Tencent)
url_param
hdr           Further processing based on the client request header
rdp-cookie    Rarely used

Difference between Layer 4 and Layer 7

Layer 4: IP+PORT Forwarding
Layer 7: Protocol + Content Exchange

Layer 4 load balancing:
A Layer 4 load balancer rewrites the destination address of the packets sent by the client (originally the IP address of the load balancing device) to the IP address of the web server chosen by its selection rules, so that the client establishes a TCP connection with the server directly and sends its data.

Layer 7 proxy:
A Layer 7 load balancer acts as a reverse proxy server. The client first completes a TCP three-way handshake with the load balancer and sends its request to it. The load balancer then selects a specific web server according to the configured balancing rules and establishes a second TCP connection with that web server through another three-way handshake. The web server sends the requested data to the load balancer, and the load balancer sends the data on to the client. The Layer 7 load balancer therefore acts as a proxy server.

Client IP pass-through:

The web server needs to record the client's real IP address for scenarios such as access statistics, security protection, behavior analysis, and regional rankings.

Layer 4 IP pass-through

haproxy configuration:

listen web_prot_http_nodes
    bind 192.168.7.101:80
    mode tcp
    balance roundrobin
    server web1 blogs.studylinux.net:80 send-proxy check inter 3000 fall 3 rise 5

nginx configuration:

server {
    listen 80 proxy_protocol;
    #listen 80;
    server_name blogs.studylinux.net;
......
}
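
With send-proxy, HAProxy prepends a PROXY protocol version 1 text line to the TCP stream, which is why nginx needs proxy_protocol on its listen directive. The following Python sketch parses such a line to show what the back end receives. The function name parse_proxy_v1 and the sample addresses are made up for the example, and only the TCP4/TCP6 and UNKNOWN forms are handled.

```python
def parse_proxy_v1(line):
    """Parse one PROXY protocol v1 header line (bytes) into a dict.

    Returns None for "PROXY UNKNOWN"; raises ValueError on malformed input.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    fields = line[:-2].decode("ascii").split(" ")
    if fields[0] != "PROXY":
        raise ValueError("not a PROXY protocol v1 header")
    if len(fields) >= 2 and fields[1] == "UNKNOWN":
        return None
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY header")
    _, family, src, dst, src_port, dst_port = fields
    return {
        "family": family,
        "src": src,            # the real client address
        "dst": dst,            # the address the client connected to
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }

header = b"PROXY TCP4 203.0.113.7 192.168.7.101 51234 80\r\n"
info = parse_proxy_v1(header)
print(info["src"], info["src_port"])
```

The src field is what nginx exposes as the client address once proxy_protocol is enabled on the listener.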

Layer 7 IP pass-through:

How to pass the client's real IP address to the back-end server when HAProxy works at Layer 7:

defaults
    option forwardfor
#Or:
    option forwardfor header X-Forwarded-xxx #Custom header name for the client IP; the back-end web server must then log X-Forwarded-xxx

#With plain option forwardfor, the header logged by the back-end web server is X-Forwarded-For
#listen configuration:
listen web_host
    bind 192.168.7.101:80
    mode http
    log global
    balance random
    server web1 192.168.7.103:80 weight 1 check inter 3000 fall 2 rise 5
    server web2 192.168.7.104:80 weight 1 check inter 3000 fall 2 rise 5
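
On the back end, an application that reads the header itself has to decide which address to trust. The sketch below assumes that HAProxy is the only proxy in front of the server, so the right-most address is the one HAProxy added; last_forwarded_ip is a hypothetical helper name.

```python
def last_forwarded_ip(xff_values):
    """Return the right-most address from one or more X-Forwarded-For values.

    xff_values: list of header values, one entry per header line received.
    """
    addresses = [
        part.strip()
        for value in xff_values
        for part in value.split(",")
        if part.strip()
    ]
    return addresses[-1] if addresses else None

# A client-supplied value followed by the address appended by the proxy.
single_line = last_forwarded_ip(["198.51.100.9, 203.0.113.7"])
# The same information arriving as two separate header lines.
two_lines = last_forwarded_ip(["198.51.100.9", "203.0.113.7"])
print(single_line, two_lines)
```

Earlier entries in the header can be set freely by the client, so they should not be used for access control.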

Web server log format configuration
Configure the web server to log the client IP address passed on by the load balancer

#apache configuration:
LogFormat "%{X-Forwarded-For}i %a %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

#tomcat configuration:
pattern='%{X-Forwarded-For}i %l %T %t "%r" %s %b "%{User-Agent}i"'/>

#nginx log format:
log_format main '"$http_x_forwarded_for" - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" ';

Posted by hbalagh on Thu, 22 Aug 2019 18:19:26 -0700