Introduction to varnish basics for web caching systems

Keywords: Linux Session shell socket

Previously, we covered the cache control mechanisms in the HTTP protocol and introduced the components of the varnish architecture; see https://www.cnblogs.com/qiuhom-1874/p/12620538.html for a refresher. Today, let's look at how to configure and use varnish.

As mentioned earlier, varnish has two configuration files. The first is /etc/varnish/varnish.params, which defines the runtime parameters of the varnishd master process, such as the sockets varnishd listens on and the secret file used to authenticate connections to its management interface. The second is /etc/varnish/default.vcl, the default cache policy file named in varnish.params; it holds the cache-related policies and is written in VCL, varnish's own configuration language. Let's look at varnish.params first.

[root@test_node1-centos7 ~]# vim /etc/varnish/varnish.params 
  
# Varnish environment configuration description. This was derived from
# the old style sysconfig/defaults settings

# Set this to 1 to make systemd reload try to switch VCL without restart.
RELOAD_VCL=1

# Main configuration file. You probably want to change it.
VARNISH_VCL_CONF=/etc/varnish/default.vcl

# Default address and port to bind to. Blank address means all IPv4
# and IPv6 interfaces, otherwise specify a host name, an IPv4 dotted
# quad, or an IPv6 address in brackets.
# VARNISH_LISTEN_ADDRESS=192.168.1.5
VARNISH_LISTEN_PORT=6081

# Admin interface listen address and port
VARNISH_ADMIN_LISTEN_ADDRESS=127.0.0.1
VARNISH_ADMIN_LISTEN_PORT=6082

# Shared secret file for admin interface
VARNISH_SECRET_FILE=/etc/varnish/secret

# Backend storage specification, see Storage Types in the varnishd(5)
# man page for details.
VARNISH_STORAGE="malloc,256M"

# User and group for the varnishd worker processes
VARNISH_USER=varnish
VARNISH_GROUP=varnish

# Other options, see the man page varnishd(1)
#DAEMON_OPTS="-p thread_pool_min=5 -p thread_pool_max=500 -p thread_pool_timeout=300"

Tip: the parameters in this file have the following meanings.

RELOAD_VCL: whether varnish may switch to a new VCL configuration without restarting; 1 enables this and 0 disables it. It is normally set to 1. Do not restart varnish casually in production: with the usable storage types described below, a restart invalidates every cached object, and the front-end load then falls directly on the back-end real servers.

VARNISH_VCL_CONF: the cache policy configuration file, default.vcl by default. We can edit this file directly and then compile it under different configuration names with the VCL compiler.

VARNISH_LISTEN_PORT: the port on which varnishd serves clients. When no other proxy sits in front of varnish, this port is usually changed to 80 or 443.

VARNISH_ADMIN_LISTEN_ADDRESS: the listening address of the varnish management interface. For security, the local loopback address is normally used so that remote connections are not possible.

VARNISH_ADMIN_LISTEN_PORT: the port the management interface listens on. It usually does not need to be changed, because the management interface is used only by administrators.

VARNISH_SECRET_FILE: the authentication file used for connections to the management interface. It usually does not need to be changed.

VARNISH_STORAGE: the cache storage type. There are three types:
    malloc: memory storage, with the syntax malloc[,size]; all cached objects are lost on restart.
    file: file storage, with the syntax file[,path[,size[,granularity]]]; in general we only specify the path and the file size. The file is a black box, and all cached objects are lost on restart.
    persistent: disk file storage which, unlike file, keeps cached objects valid across a restart, but it is still experimental in varnish 4.0.
In practice only two types are usable, memory storage and black-box file storage, and both lose every cached object on restart. This is why a varnish cache server must not be restarted casually.

VARNISH_USER and VARNISH_GROUP: the user and group that the varnishd processes run as.

DAEMON_OPTS: varnish runtime parameters. Each parameter is given with -p, which can be repeated to set several parameters; -r marks the named parameters as read-only.

Note that reloading the VCL configuration is done with the dedicated reload command shipped with varnish (varnish_reload_vcl in this package) rather than by restarting the service.

Tip: because port 80 on this machine is already occupied by httpd, I use port 8000 as the example port for serving clients. I also switch the varnish cache to black-box file storage with a file size of 500M.

Tip: varnish is now listening on the client-facing port, but accessing port 8000 from a browser returns 503, meaning the backend server could not be reached. This is because the backend configured for varnish by default is 127.0.0.1:8080. The backend host is configured in default.vcl.
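For reference, the backend definition in the stock default.vcl of a varnish 4.0 package looks roughly like this (a sketch; the comments are mine, not the package's):

```vcl
vcl 4.0;

# Default backend: a web server expected on the same host as varnish.
backend default {
    .host = "127.0.0.1";   # nothing listens here in this setup, hence the 503
    .port = "8080";
}
```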

Tip: the stock default.vcl sets the backend web server to address 127.0.0.1, port 8080. Normally we edit this file to point at the real backend host and port, then recompile the configuration and load it.
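A minimal sketch of the edited backend definition, using the address and port that appear in the vcl.show output later in this article (192.168.0.99:80); substitute your own back-end host:

```vcl
vcl 4.0;

# Point varnish at the real web server instead of 127.0.0.1:8080.
backend default {
    .host = "192.168.0.99";   # address of the back-end httpd host
    .port = "80";             # port httpd listens on
}
```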

Tip: after modifying the configuration file this way, connect to the command line interface provided by varnish with the varnishadm tool, compile the configuration file, and then load it. First, let's see how to use varnishadm.

[root@test_node1-centos7 ~]# varnishadm --help
varnishadm: invalid option -- '-'
usage: varnishadm [-n ident] [-t timeout] [-S secretfile] -T [address]:port command [...]
        -n is mutually exlusive with -S and -T
[root@test_node1-centos7 ~]# 

Tip: the above is the usage message of the varnishadm command. The command is easy to use: -S (uppercase) specifies the secret file and -T specifies the address and port on which the varnish management interface listens. Connecting this way opens an interactive session in which we type commands to operate varnish. If you don't want an interactive session, append the command to the end of the line, much like the mysql client. Next, let's connect to the varnish management interface with varnishadm.

[root@test_node1-centos7 ~]# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
200        
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-693.el7.x86_64,x86_64,-sfile,-smalloc,-hcritbit
varnish-4.0.5 revision 07eff4c29

Type 'help' for command list.
Type 'quit' to close CLI session.

quit
500        
Closing CLI connection
[root@test_node1-centos7 ~]# 

Tip: if you see the interface above, varnishadm has connected to the varnish management interface successfully. Enter quit to leave the management interface, or help to list the available commands.

[root@test_node1-centos7 ~]# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
200        
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-693.el7.x86_64,x86_64,-sfile,-smalloc,-hcritbit
varnish-4.0.5 revision 07eff4c29

Type 'help' for command list.
Type 'quit' to close CLI session.

help
200        
help [<command>]
ping [<timestamp>]
auth <response>
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname>
vcl.discard <configname>
vcl.list
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show [-v] <configname>
backend.list [<backend_expression>]
backend.set_health <backend_expression> <state>
ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
ban.list

Tip: once connected with varnishadm, every command returns a numeric status similar to an HTTP status code; 200 means the command ran successfully and its output follows. The help list above shows the basic syntax of each command. Typing help by itself prints the command list and formats, while help <command> shows the usage of a single command. For example, type help ping in the command line interface to see how the ping command is used.

Tip: according to its help text, the main purpose of the ping command is to check whether varnish is alive. If running ping in the varnish shell returns 200 and a line such as PONG 1585892735 1.0, varnish is alive on the host.

Back to the topic: we have modified default.vcl, so how do we compile it? Let's check the command help again.

Tip: the vcl.load command compiles and loads a VCL file. Its usage is vcl.load <configname> <filename>, where the configuration name is chosen by us (any legal name will do) and the file name is the VCL file to compile.

Tip: here the default.vcl configuration file is compiled and named test1. Typing vcl.list in the varnish shell lists the VCL configurations currently loaded.

Tip: there are now two configurations. One is named boot and its status is active, meaning it is the one currently in use. The other is test1, which we just compiled; its status is available, meaning it is loaded and we can switch to it with vcl.use. Next, let's look at the usage of vcl.use and switch to the newly compiled configuration.

Tip: the vcl.use command switches varnish to the configuration with the given name. From the returned result, test1 is now in the active state, which means varnish is applying the test1 configuration. Next, we can try to access the client-facing port of varnish from a browser.

Tip: accessing port 8000 now returns the response from the back-end httpd server, so the back-end host IP and port we configured are correct. This also shows that varnish is itself reverse proxy software. varnish can be used as a reverse proxy, but its scheduling algorithms are very simple, essentially round-robin and weighted selection. It offers so few because its strength is not reverse proxying but caching: ideally it answers client requests from the cache and only rarely fetches resources from the back end.
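To make the scheduling remark concrete, here is a hedged sketch of round-robin scheduling across two backends with the directors VMOD that ships with varnish 4.0. The second backend (192.168.0.100) and the names web1, web2 and rr are hypothetical, chosen only for illustration:

```vcl
vcl 4.0;

import directors;   # VMOD providing the round_robin director

backend web1 {
    .host = "192.168.0.99";    # the back end used in this article
    .port = "80";
}

backend web2 {
    .host = "192.168.0.100";   # hypothetical second back end
    .port = "80";
}

sub vcl_init {
    # Build the director once, when the VCL is loaded.
    new rr = directors.round_robin();
    rr.add_backend(web1);
    rr.add_backend(web2);
}

sub vcl_recv {
    # Requests that reach a back end alternate between web1 and web2.
    set req.backend_hint = rr.backend();
}
```

Only cache misses and passed requests reach a backend at all, which fits the point that varnish is a cache first and a proxy second.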

With the backend web host configured, let's learn the syntax of VCL.

VCL (Varnish Configuration Language) is a domain-specific configuration language used mainly to write cache policies. VCL has multiple state engines, which are related to one another yet isolated from one another. Each state engine uses return(X) to name the next engine, and each corresponds to one configuration section in the VCL file, that is, a subroutine. A request therefore flows through the subroutines according to what each one returns. For example, vcl_hash ends with return (lookup), and when the lookup finds the object in the cache the next subroutine to run is vcl_hit.
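As an illustration of how return() selects the next state, the following sketch sends some requests around the cache and the rest through it. The /admin URL prefix is a hypothetical example; the backend is the one used in this article:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "^/admin") {
        # Next state is vcl_pass: the request bypasses the cache.
        return (pass);
    }
    # Next state is vcl_hash, followed by the cache lookup.
    return (hash);
}
```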

The main points of VCL syntax in varnish 4.0 are:

1) a VCL file must begin with the line vcl 4.0;

2) //, # and /* ... */ all mark comments: the first two are single-line comments and the last is a multi-line comment. Everything that is not a comment is effective configuration;

3) subroutines are declared with the sub keyword; for example, sub vcl_recv { ... } declares a subroutine named vcl_recv;

4) there are no loops, and each state engine is limited to its own built-in variables;

5) return(action) ends a subroutine; its argument is the keyword naming the next operation, which is how the state engine is switched.
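The five points can be seen together in one small file. This is only a sketch: the backend is the one used in this article and the POST rule is an arbitrary example.

```vcl
vcl 4.0;                          // 1) the version marker comes first

# 2) a single-line comment
// 2) another single-line comment
/* 2) a comment that
      spans several lines */

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_recv {                    # 3) subroutines are declared with sub
    if (req.method == "POST") {   # 4) branching with if, but no loops
        return (pass);            # 5) return(action) names the next state
    }
}
```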

Characteristics of the VCL finite state machine (FSM):

1) handle each request separately;

2) each request is independent of other requests at any given time;

3) the states are related, but isolated;

4) return (action) exits a state and tells varnish which state to enter next;

5) the built-in VCL code is always present and is appended to your own VCL. In other words, even if we write no VCL code at all, varnish still runs its built-in VCL code, and that code is always the same. In the varnish shell, vcl.show -v <configname> displays the complete effective configuration (our own VCL code followed by the built-in VCL code), as follows

[root@test_node1-centos7 ~]# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
200        
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-693.el7.x86_64,x86_64,-sfile,-smalloc,-hcritbit
varnish-4.0.5 revision 07eff4c29

Type 'help' for command list.
Type 'quit' to close CLI session.


varnish> vcl.list 
200        
active          0 boot


varnish> vcl.show -v boot
200        
// VCL.SHOW 0 1221 input
#
# This is an example VCL file for Varnish.
#
# It does not do anything by default, delegating control to the
# builtin VCL. The builtin VCL is called when there is no explicit
# return statement.
#
# See the VCL chapters in the Users Guide at https://www.varnish-cache.org/docs/
# and http://varnish-cache.org/trac/wiki/VCLExamples for more examples.

# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
}

sub vcl_backend_response {
    # Happens after we have read the response headers from the backend.
    #
    # Here you clean the response headers, removing silly Set-Cookie headers
    # and other mistakes your backend does.
}

sub vcl_deliver {
    # Happens when we have all the pieces we need, and are about to send the
    # response to the client.
    #
    # You can do accounting or modifying the final object here.
}

// VCL.SHOW 1 5479 Builtin
/*-
 * Copyright (c) 2006 Verdens Gang AS
 * Copyright (c) 2006-2014 Varnish Software AS
 * All rights reserved.
 *
 * Author: Poul-Henning Kamp <phk@phk.freebsd.dk>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *

 *
 * The built-in (previously called default) VCL code.
 *
 * NB! You do NOT need to copy & paste all of these functions into your
 * own vcl code, if you do not provide a definition of one of these
 * functions, the compiler will automatically fall back to the default
 * code from this file.
 *
 * This code will be prefixed with a backend declaration built from the
 * -b argument.
 */

vcl 4.0;

#######################################################################
# Client side


sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
      req.method != "HEAD" &&
      req.method != "PUT" &&
      req.method != "POST" &&
      req.method != "TRACE" &&
      req.method != "OPTIONS" &&
      req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}

sub vcl_pipe {
    # By default Connection: close is set on all piped requests, to stop
    # connection reuse from sending future requests directly to the
    # (potentially) wrong backend. If you do want this to happen, you can undo
    # it here.
    # unset bereq.http.connection;
    return (pipe);
}

sub vcl_pass {
    return (fetch);
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_purge {
    return (synth(200, "Purged"));
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A pure unadultered hit, deliver it
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        // Object is in grace, deliver it
        // Automatically triggers a background fetch
        return (deliver);
    }
    // fetch & deliver once we get the result
    return (fetch);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_deliver {
    return (deliver);
}

/*
 * We can come here "invisibly" with the following errors: 413, 417 & 503
 */
sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    set resp.http.Retry-After = "5";
    synthetic( {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + resp.status + " " + resp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + resp.status + " " + resp.reason + {"</h1>
    <p>"} + resp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + req.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"} );
    return (deliver);
}

#######################################################################
# Backend Fetch

sub vcl_backend_fetch {
    return (fetch);
}

sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
      beresp.http.Set-Cookie ||
      beresp.http.Surrogate-control ~ "no-store" ||
      (!beresp.http.Surrogate-Control &&
        beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
      beresp.http.Vary == "*") {
        /*
        * Mark as "Hit-For-Pass" for the next 2 minutes
        */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}

sub vcl_backend_error {
    set beresp.http.Content-Type = "text/html; charset=utf-8";
    set beresp.http.Retry-After = "5";
    synthetic( {"<!DOCTYPE html>
<html>
  <head>
    <title>"} + beresp.status + " " + beresp.reason + {"</title>
  </head>
  <body>
    <h1>Error "} + beresp.status + " " + beresp.reason + {"</h1>
    <p>"} + beresp.reason + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + bereq.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"} );
    return (deliver);
}

#######################################################################
# Housekeeping

sub vcl_init {
    return (ok);
}

sub vcl_fini {
    return (ok);
}



varnish> 

Note: the listing above shows that even though we have not written any VCL code ourselves, the built-in VCL code is there by default. Judging from that listing, the VCL configuration language has three main kinds of syntax.

The first kind defines a subroutine. The format is

sub subroutine {
	...
}

The second kind is the if/else conditional branch; the condition must be enclosed in parentheses. The format is

if (CONDITION) {
	...
} else {
	...
}
			

The third kind is the return statement, which ends a subroutine and names the next state. A subroutine that finishes without an explicit return falls through to the built-in code for that subroutine.
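The fall-through behaviour is easy to miss, so here is a hedged sketch that combines all three kinds of syntax. The list of file extensions is an arbitrary example:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "\.(png|jpg|css|js)$") {
        # Static files: drop the cookie so the request becomes cacheable.
        unset req.http.Cookie;
        # No return here, so the built-in vcl_recv runs next and
        # decides between pass, pipe and hash.
    } else {
        # Everything else skips the cache in this sketch.
        return (pass);
    }
}
```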

With the basic syntax covered, let's look at the built-in functions and keywords of VCL.

Functions: regsub(str,regex,sub) is VCL's built-in search-and-replace function; it replaces only the first match and leaves any further matches untouched. regsuball(str,regex,sub) differs in just one respect: it replaces every match. ban(boolean expression) invalidates the cached objects that match the expression. hash_data(input) adds its input to the hash used as the cache key. synthetic(str) builds the body of a synthetic response from a string, as in the built-in vcl_synth shown above.
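A short sketch of these functions in use. The BAN request method, the purgers ACL and the rewrite rules are illustrative assumptions, not part of the setup described in this article:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

# Hypothetical ACL: only the local host may invalidate cached objects.
acl purgers {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "BAN") {
        if (client.ip !~ purgers) {
            return (synth(405, "Not allowed"));
        }
        # Invalidate every cached object whose URL matches the requested URL.
        ban("req.url ~ " + req.url);
        return (synth(200, "Ban added"));
    }

    # regsub: only the first match is replaced (strip the query string).
    set req.url = regsub(req.url, "\?.*$", "");

    # regsuball: every match is replaced (collapse repeated slashes).
    set req.url = regsuball(req.url, "//+", "/");
}
```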

Keywords: call subroutine, return(action), new, set, unset

Operators: ==, !=, ~, >, >=, <, <=; logical operators: &&, ||, !; variable assignment: =
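The keywords and operators combine as in the sketch below. The subroutine tag_browser and the header X-Browser are hypothetical names invented for this example:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

# A custom subroutine; its name must not start with vcl_.
sub tag_browser {
    if (req.http.User-Agent ~ "(?i)chrome") {
        set req.http.X-Browser = "chrome";   # set assigns a value
    } else {
        unset req.http.X-Browser;            # unset removes the header
    }
}

sub vcl_recv {
    if (req.method != "GET" && req.method != "HEAD") {
        return (pass);
    }
    call tag_browser;                        # call runs a custom subroutine
}
```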

There are five groups of built-in variables. req.* relates to the request message sent by the client; req.http.* refers to request headers, so req.http.User-Agent is the value of the User-Agent header in the HTTP request and req.http.Referer is the value of the Referer header. bereq.* relates to the HTTP request varnish sends to the back-end host; bereq.http.*, for example, refers to the headers of that request, following the same logic as req.http.*. beresp.* relates to the response message sent from the back-end (BE) host to varnish. resp.* relates to the response message varnish sends to the client. These four groups follow the same logic, and in each of them http.* refers to the headers of the message concerned. obj.* refers to the attributes of the cache objects stored in the cache space.

Common variables:

bereq.*, req.*:
    bereq.http.HEADERS
    bereq.method: request method;
    bereq.url: requested URL;
    bereq.proto: protocol version of the request;
    bereq.backend: the back-end host to call;

    req.http.Cookie: value of the Cookie header in the client's request message;
    req.http.User-Agent ~ "chrome"

beresp.*, resp.*:
    beresp.http.HEADERS
    beresp.status: response status code;
    beresp.proto: protocol version;
    beresp.backend.name: name of the BE host;
    beresp.ttl: remaining cacheable time of the content returned by the BE host;

obj.*:
    obj.hits: number of times this object has been served from the cache;
    obj.ttl: TTL value of the object;

server.*:
    server.ip
    server.hostname

client.*:
    client.ip

Variables are assigned with the set statement; unset deletes them.
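As a sketch of these variables with set and unset, the fragment below caches successful responses for static files for one hour. The file extensions and the 1h lifetime are arbitrary example values:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_backend_response {
    if (beresp.status == 200 && bereq.url ~ "\.(png|jpg|css|js)$") {
        # A Set-Cookie header would make the built-in code mark
        # the response as uncacheable, so remove it first.
        unset beresp.http.Set-Cookie;
        # Keep the object in the cache for one hour.
        set beresp.ttl = 1h;
    }
}
```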

Example: add a response header named X-Cache. On a cache hit, set its value to "HIT via" + the server IP address; on a cache miss, set it to "MISS via" + the server IP address.

 

Note: this logic must be written in vcl_deliver, which handles every response message varnish sends to a client.
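A minimal sketch of such a vcl_deliver, assuming the X-Cache header name used in the test below and the backend from this article. It relies on obj.hits being greater than zero only when the object was served from the cache:

```vcl
vcl 4.0;

backend default {
    .host = "192.168.0.99";
    .port = "80";
}

sub vcl_deliver {
    if (obj.hits > 0) {
        # The object came from the cache.
        set resp.http.X-Cache = "HIT via " + server.ip;
    } else {
        # The object was just fetched from the back end.
        set resp.http.X-Cache = "MISS via " + server.ip;
    }
}
```

After editing default.vcl, compile the file with vcl.load and activate it with vcl.use as described earlier.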

To test, visit the site in a browser and check the X-Cache response header to see whether the request was served from the cache.

The first access cannot be a cache hit, because nothing has been cached yet, so on the first access X-Cache should read "MISS via 192.168.0.99".

Tip: the first visit is indeed a miss. Will the second and subsequent visits also miss?

Tip: on the second access, the X-Cache response header becomes "HIT via 192.168.0.99", showing that the request was served from the cache.

Posted by zarathu on Sat, 04 Apr 2020 17:38:21 -0700