Detailed Explanation of Apache's Common Functions

Keywords: Apache PHP vim curl

Apache is one of the most widely used web servers, and it is the A in LAMP. It is widely adopted because it is open source, stable and secure. The previous article documented how to build a LAMP stack, which is only the first step. The Apache service is the most important part and the core of LAMP. The following is a record of the features I have used most often since I started using Apache.

I. Three working modes of Apache

Apache has three stable MPM modes: prefork, worker and event. The default MPM of httpd 2.2 is prefork, and the default of httpd 2.4 is event. You can check which one is in use with httpd -V.

[root@linuxblogs ~]# httpd -V | grep -i "server mpm"   
Server MPM:     Prefork

At compile time, you can choose the MPM with a configure parameter:

--with-mpm=prefork|worker|event

1. Prefork working mode

At startup, Apache forks a number of child processes in advance and then waits for requests to come in. This reduces the overhead of frequently creating and destroying processes. Each child process has only one thread and can handle only one request at a time.

Advantages: mature and stable, and compatible with all old and new modules. There is also no need to worry about thread safety.

Disadvantages: each process takes up more system resources and consumes more memory, and the mode is not good at handling highly concurrent requests.

2. worker working mode

This mode mixes multiple processes with multiple threads. It also forks several child processes in advance (a relatively small number), and each child process then creates a number of threads, including a listener thread. Each request is assigned to a thread. Threads are lighter than processes because they share the memory space of their parent process, so the memory footprint is somewhat smaller. Under high concurrency, performance is better than prefork because more threads are available.

Advantages: lower memory use and better performance under high concurrency.
Disadvantages: thread safety must be considered.

3. Event working mode

It is very similar to the worker mode. The biggest difference is that it solves the problem of threads being tied up for a long time, and their resources wasted, in the keep-alive scenario. In the event MPM, a dedicated thread manages these keep-alive connections. When a real request arrives, it is passed to a worker thread, which is released again once the request has been handled. This improves request handling under high concurrency.

HTTP uses keep-alive to reduce the number of TCP connections, but because each connection has to be bound to a server thread or process, a busy server can run out of threads. The event MPM is a newer model that solves this problem by separating the serving threads from the connections. When the server handles requests quickly and the request rate is very high, the number of available threads is the key resource limit. The event MPM is the most effective choice in that situation, but it does not work for HTTPS access (a limitation noted for older httpd releases).

 

II. Apache User Authentication

Sometimes we need to set up user authentication for certain sensitive pages to increase security. For example, a personal website usually has an administration back end. Although the back end has its own password, we can add a layer of user authentication in front of it to be more secure.

1. Editing configuration files

vim /usr/local/apache2/conf/extra/httpd-vhosts.conf

In the corresponding virtual host configuration, add the following (the <Directory> block is the added part):

<VirtualHost *:80>   
    DocumentRoot "/usr/local/apache2/htdocs"   
    ServerName www.123.com
    ServerAlias www.abc.com
    <Directory /usr/local/apache2/htdocs/admin.php>   
        AllowOverride AuthConfig   
        AuthName "Please input you acount."   
        AuthType Basic   
        AuthUserFile /usr/local/apache2/htdocs/.htpasswd   
        require valid-user   
    </Directory>   
</VirtualHost>

Explanation: the <Directory> line specifies what to protect, AuthName is a custom prompt string, and AuthUserFile gives the location of the user password file.

2. Creating the encrypted user name and password file

htpasswd -c /usr/local/apache2/htdocs/.htpasswd liwei   
htpasswd -m /usr/local/apache2/htdocs/.htpasswd admin

The -c option creates the .htpasswd file, so use it only when creating the first user. The -m option encrypts the password with MD5; without -c, the user is added to the existing file. In both cases, enter the password at the prompt.

3. Restart apache service

apachectl -t   
apachectl graceful

First check that the configuration is correct, then run graceful, which reloads the configuration without fully restarting the Apache service; the effect is the same. To test, open www.123.com/admin.php in a browser and enter the user name and password.
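With AuthType Basic, the browser sends the credentials in an Authorization header as the Base64 encoding of user:password. The following Python sketch shows only that encoding step. The function name is made up for illustration, and the credentials are the well-known example pair from the HTTP Basic authentication RFC, not accounts from this article.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the Authorization header value a client sends for Basic auth."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Example credentials from the RFC, used here only for illustration.
print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so this extra layer protects the password on the wire only when the site is served over HTTPS.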

 

III. Setting up the default virtual host

The default virtual host is the first virtual host in the configuration file. Its special feature is that any domain name that resolves to this server but is not configured in the configuration file will be served by it. Accessing the server directly by IP also reaches this site. To keep other people from pointing arbitrary domain names at the server, the default (that is, the first) virtual host should be blocked. Here we block it with allow and deny statements.

1. Configure default virtual host

vim /usr/local/apache2/conf/extra/httpd-vhosts.conf

Add a virtual host record:

<VirtualHost *:80>   
    DocumentRoot "/var/123"   
    ServerName xxxxx.com.cn   
    <Directory /var/123>   
        Order allow,deny   
        Deny from all   
    </Directory>   
</VirtualHost>

Create the /var/123 directory and set its permissions to 600 so that the daemon user cannot access it:

mkdir /var/123
chmod -R 600 /var/123

2. Restart apache server

apachectl -t   
apachectl graceful

If the server is accessed by IP or by any other domain name that resolves to it, the response is:

Forbidden
You don't have permission to access / on this server.
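The fallback behaviour described in this section can be sketched in Python as follows. This is a simplified model that assumes exact, case-insensitive matching on ServerName and ServerAlias only (no wildcards, ports or IP-based hosts). The dictionary layout and the function name are hypothetical; the host names and paths are the ones used in this article.

```python
def select_vhost(vhosts, host):
    """Return the vhost whose ServerName or ServerAlias equals host.

    Falls back to the first vhost in the list (the default virtual host)
    when no name matches.
    """
    host = host.lower()
    for vhost in vhosts:
        names = [vhost["server_name"], *vhost.get("aliases", [])]
        if host in (name.lower() for name in names):
            return vhost
    return vhosts[0]


# Order matters: the first entry plays the role of the default virtual host.
vhosts = [
    {"server_name": "xxxxx.com.cn", "root": "/var/123"},
    {"server_name": "www.123.com", "aliases": ["www.abc.com"],
     "root": "/usr/local/apache2/htdocs"},
]

print(select_vhost(vhosts, "www.abc.com")["root"])   # configured alias
print(select_vhost(vhosts, "192.168.0.8")["root"])   # unknown name, default vhost
```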

IV. Domain Name 301 Redirect

A site often has more than one domain name, and one of them must be the primary. For example, my website can be reached through two domain names, www.itepub.cn and www.linuxblogs.cn, but whichever one you use, you end up at www.linuxblogs.cn. This behavior is called a domain name redirect. 301 is simply the status code: besides 301 there is also 302. 301 is a permanent redirect and 302 is a temporary one. A site should use 301, which is friendlier to search engines.

1. Configuring the domain name redirect

# vim /usr/local/apache2/conf/extra/httpd-vhosts.conf   
<IfModule mod_rewrite.c>   
    RewriteEngine on   
    RewriteCond %{HTTP_HOST} ^www\.abc\.com$
    RewriteRule ^/(.*)$ http://www.123.com/$1 [R=301,L]   
</IfModule>

Explanation: requests for www.abc.com are redirected to the www.123.com site.

2. Configuring redirects for multiple domain names

<IfModule mod_rewrite.c>   
    RewriteEngine on   
    RewriteCond %{HTTP_HOST} ^www\.abc\.com$ [OR]
    RewriteCond %{HTTP_HOST} ^www\.abcd\.com$
    RewriteRule ^/(.*)$ http://www.123.com/$1 [R=301,L]   
</IfModule>

3. Restart the server and test it

apachectl -t   
apachectl graceful

Test:

# curl -x192.168.0.8:80  www.abc.com -I   
HTTP/1.1 301 Moved Permanently   
Date: Tue, 25 Oct 2016 15:48:10 GMT   
Server: Apache/2.2.31 (Unix) PHP/5.5.38   
Location: http://www.123.com/   
# curl -x192.168.0.8:80  www.abcd.com -I   
HTTP/1.1 301 Moved Permanently   
Date: Tue, 25 Oct 2016 15:48:49 GMT   
Server: Apache/2.2.31 (Unix) PHP/5.5.38   
Location: http://www.123.com/   
Content-Type: text/html; charset=iso-8859-1

The tests above show that both www.abc.com and www.abcd.com redirect to www.123.com; access through a browser behaves the same way.
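As a rough illustration of what the rewrite rules in this section do, the Python sketch below applies the same two steps: test the Host header against the conditions, then capture the path and append it to the target. It is a model of the logic, not of mod_rewrite itself; the constant and function names are hypothetical, and the dots in the host patterns are escaped so that they match literal dots only.

```python
import re

# The two RewriteCond patterns, combined with [OR].
HOST_PATTERNS = [r"^www\.abc\.com$", r"^www\.abcd\.com$"]
# The RewriteRule pattern and substitution target.
PATH_PATTERN = r"^/(.*)$"
TARGET = "http://www.123.com/"


def redirect(host, path):
    """Return (status, location) if the request is redirected, else None."""
    if not any(re.search(pattern, host) for pattern in HOST_PATTERNS):
        return None
    match = re.search(PATH_PATTERN, path)
    if match is None:
        return None
    # $1 in the rule is the captured part of the path after the leading slash.
    return 301, TARGET + match.group(1)


print(redirect("www.abc.com", "/"))
print(redirect("www.abcd.com", "/blog/1.html"))
print(redirect("www.123.com", "/"))
```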

 

V. Apache Log Rotation

Every visit to a website writes several log entries, provided that logging has been set up. If the logs are not managed, the log files grow larger and larger over time. How can we avoid ending up with such a large log file? Apache can be configured to archive logs as needed, for example a new log file every day or every hour.

1. First, simply set the path and name of the logs

vim /usr/local/apache2/conf/extra/httpd-vhosts.conf

Edit it and add the following:

ErrorLog "logs/error.log"   
CustomLog "logs/access.log" combined

This stores the logs in the /usr/local/apache2/logs directory as error.log and access.log respectively. combined is the log format. The formats are defined in the configuration file httpd.conf, as follows:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined   
LogFormat "%h %l %u %t \"%r\" %>s %b" common

2. Setting up Apache log rotation

Again, edit the configuration file httpd-vhosts.conf:

ErrorLog "|/usr/local/apache2/bin/rotatelogs -l /usr/local/apache2/logs/aaa-error_%Y%m%d.log 86400"   
CustomLog "|/usr/local/apache2/bin/rotatelogs -l /usr/local/apache2/logs/aaa-access_%Y%m%d.log 86400" combined

ErrorLog is the error log and CustomLog is the access log. The | is the pipe character, which means the log output is handed to rotatelogs, the log rotation tool that ships with Apache. The -l option makes rotatelogs use local time instead of UTC as the basis for rotation (Beijing time in this setup). 86400 is in seconds, exactly one day, so the log is rotated once a day. The combined at the end is the log format, which is defined in httpd.conf.
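To make the effect of the 86400-second interval and the -l option concrete, here is a simplified Python model of how a rotated file name could be derived. It assumes that the name is formatted from the start of the current rotation interval and that local time can be represented by a fixed UTC offset. Both are assumptions of this sketch, and the function and parameter names are made up for illustration.

```python
from datetime import datetime, timezone


def rotated_log_name(pattern, epoch_seconds, rotation_seconds=86400,
                     utc_offset_seconds=0):
    """Format the log file name for the interval containing epoch_seconds.

    utc_offset_seconds=0 models the default UTC behaviour; a non-zero
    value models local time (for example 8 * 3600 for Beijing time).
    """
    shifted = epoch_seconds + utc_offset_seconds
    # Align to the start of the rotation interval.
    interval_start = shifted - shifted % rotation_seconds
    stamp = datetime.fromtimestamp(interval_start, tz=timezone.utc)
    return stamp.strftime(pattern)


# 2016-10-26 20:00:00 UTC, which is already 2016-10-27 04:00 in Beijing.
moment = 1477512000
print(rotated_log_name("aaa-access_%Y%m%d.log", moment))
print(rotated_log_name("aaa-access_%Y%m%d.log", moment,
                       utc_offset_seconds=8 * 3600))
```

The two calls differ only in the offset, which shows why the local-time option matters for a server whose day does not start at midnight UTC.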

 

VI. Excluding Specified File Types from the Log

A website with a lot of traffic produces a lot of access log entries, but some of them can be ignored, such as requests for images and for static objects like js and css. Requests for these files are often very numerous, and logging them is of little use. How can we avoid logging them?

1. Configuring the log to skip image requests

vim /usr/local/apache2/conf/extra/httpd-vhosts.conf

Relevant configuration:

SetEnvIf Request_URI ".*\.gif$" image-request   
SetEnvIf Request_URI ".*\.jpg$" image-request   
SetEnvIf Request_URI ".*\.png$" image-request   
SetEnvIf Request_URI ".*\.bmp$" image-request   
SetEnvIf Request_URI ".*\.swf$" image-request   
SetEnvIf Request_URI ".*\.js$"  image-request   
SetEnvIf Request_URI ".*\.css$" image-request   
CustomLog "|/usr/local ... _%Y%m%d.log 86400" combined env=!image-request

Explanation: on top of the original log configuration, several image-request definitions are added, marking requests that end in gif, jpg, png, bmp, swf, js or css as image-request. The env=!image-request condition on the CustomLog line then excludes those requests from the log.
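The decision made by the SetEnvIf lines and the env=!image-request condition can be modelled with a short Python sketch. The seven patterns are merged into one regular expression here for brevity, which is a choice of this sketch and not something the configuration requires. The names are hypothetical.

```python
import re

# One pattern equivalent to the seven SetEnvIf lines:
# the request URI ends in one of the listed extensions (case-sensitive).
STATIC_REQUEST = re.compile(r"\.(gif|jpg|png|bmp|swf|js|css)$")


def should_log(request_uri):
    """Mirror env=!image-request: log only when the variable is NOT set."""
    image_request = STATIC_REQUEST.search(request_uri) is not None
    return not image_request


for uri in ("/index.php", "/static/image/common/online_admin.gif", "/js/app.js"):
    print(uri, should_log(uri))
```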

 

VII. Configuring Static Caching in Apache

Static files here means images, js, css and similar files. When users visit a site, most of the elements they load are in fact images, js, css and so on. The client's browser caches these static files on the local computer so that they do not have to be downloaded from the server again on the next request, which speeds up access and improves the user experience. But static files cannot stay cached forever; they go out of date at some point, so an expiration time has to be set.

1. Configuring static caching

# vim /usr/local/apache2/conf/extra/httpd-vhosts.conf   
<IfModule mod_expires.c>   
    ExpiresActive on   
    ExpiresByType image/gif "access plus 1 days"   
    ExpiresByType image/jpeg "access plus 24 hours"   
    ExpiresByType image/png "access plus 24 hours"   
    ExpiresByType text/css "now plus 2 hour"   
    ExpiresByType application/x-javascript "now plus 2 hours"   
    ExpiresByType application/javascript "now plus 2 hours"   
    ExpiresByType application/x-shockwave-flash "now plus 2 hours"   
    ExpiresDefault "now plus 0 min"   
</IfModule>

Alternatively, use the mod_headers module:

<IfModule mod_headers.c>
    # Cache htm, html and txt files for one hour
    <FilesMatch "\.(html|htm|txt)$">
        Header set Cache-Control "max-age=3600"
    </FilesMatch>

    # Cache css, js and swf files for one week
    <FilesMatch "\.(css|js|swf)$">
        Header set Cache-Control "max-age=604800"
    </FilesMatch>

    # Cache jpg, gif, jpeg, png, ico, flv and pdf files for about one year (29030400 s = 336 days)
    <FilesMatch "\.(ico|gif|jpg|jpeg|png|flv|pdf)$">
        Header set Cache-Control "max-age=29030400"
    </FilesMatch>
</IfModule>

Explanation: the time unit can be days, hours or even minutes. Two different methods are shown: mod_expires first and mod_headers after it. To use these modules, Apache must have been built with them or have them loaded. To see whether they are available, run:

# /usr/local/apache2/bin/apachectl -M

2. Restart the server and verify it

apachectl -t   
apachectl graceful

Verification:

# curl -x127.0.0.1:80 'http://www.123.com/static/image/common/online_admin.gif' -I   
HTTP/1.1 200 OK   
Date: Wed, 26 Oct 2016 03:51:26 GMT   
Server: Apache/2.2.31 (Unix) PHP/5.5.38   
Last-Modified: Tue, 31 May 2016 03:08:36 GMT   
ETag: "46891b-16b-5341ab0597500"   
Accept-Ranges: bytes   
Content-Length: 363   
Cache-Control: max-age=86400   
Expires: Thu, 27 Oct 2016 03:51:26 GMT   
Content-Type: image/gif
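The relationship between the access time, the configured lifetime and the two response headers in the output above can be checked with a small Python sketch. It only reproduces the arithmetic; the function name is hypothetical, and the date is the one from the Date header in the test output.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(accessed_at, lifetime):
    """Compute Cache-Control and Expires for an "access plus <lifetime>" rule."""
    max_age = int(lifetime.total_seconds())
    expires = accessed_at + lifetime
    return {
        "Cache-Control": f"max-age={max_age}",
        "Expires": format_datetime(expires, usegmt=True),
    }


# Date: Wed, 26 Oct 2016 03:51:26 GMT, with "access plus 1 days" for image/gif.
accessed = datetime(2016, 10, 26, 3, 51, 26, tzinfo=timezone.utc)
headers = expiry_headers(accessed, timedelta(days=1))
print(headers)

# The largest max-age used in the mod_headers variant, expressed in days.
print(timedelta(seconds=29030400).days)
```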

VIII. Configuring Hotlink Protection in Apache

Suppose your website has many attractive pictures, your domain name is www.123.com, and a picture's address is www.123.com/image/111.jpg. Someone else can put this address directly on their own website. Their users then view the picture on their site while it is actually served from yours, and the bandwidth this consumes brings you no benefit. We should therefore restrict access to these pictures. How do we configure the server so that third-party websites cannot use pictures from your site?

1. Configuring hotlink protection

# vim /usr/local/apache2/conf/extra/httpd-vhosts.conf   
SetEnvIfNoCase Referer "^http://.*\.123\.com" local_ref   
SetEnvIfNoCase Referer ".*\.abc\.com" local_ref   
SetEnvIfNoCase Referer "^$" local_ref   
<filesmatch "\.(txt|doc|mp3|zip|rar|jpg|gif)">   
    Order Allow,Deny   
    Allow from env=local_ref   
</filesmatch>

Explanation: the referer is the address of the page that linked to the requested resource, that is, the page the visitor came from. The configuration restricts access based on this source link: if the source is not one we accept, the request is refused. This is the principle of hotlink protection. It applies not only to images but also to mp3, rar, zip and other files. The configuration above rejects everything by default except the referers in the defined list.
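The referer check can be modelled in Python as below. The sketch reuses the three Referer patterns and the file pattern from the configuration; a missing Referer header is represented by an empty string, which is an assumption of this model. The function and constant names are hypothetical.

```python
import re

# The three SetEnvIfNoCase patterns that set local_ref.
LOCAL_REFERERS = [
    re.compile(r"^http://.*\.123\.com", re.IGNORECASE),
    re.compile(r".*\.abc\.com", re.IGNORECASE),
    re.compile(r"^$"),  # empty Referer, for example a direct visit
]
# The filesmatch pattern for protected file types.
PROTECTED = re.compile(r"\.(txt|doc|mp3|zip|rar|jpg|gif)")


def hotlink_allowed(filename, referer):
    """Protected files are served only when the Referer sets local_ref."""
    if PROTECTED.search(filename) is None:
        return True  # the restriction does not apply to this file
    return any(pattern.search(referer) for pattern in LOCAL_REFERERS)


print(hotlink_allowed("111.jpg", "http://www.123.com/index.html"))
print(hotlink_allowed("111.jpg", "http://www.other.com/page.html"))
print(hotlink_allowed("111.jpg", ""))
```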

 

IX. Apache Access Control

We can also control access to Apache by setting up a whitelist or a blacklist. When you edited httpd.conf earlier, you already saw two keywords, allow and deny. Let's first look at the rules for allow and deny.

1. Example 1

Order deny,allow   
deny from all   
allow from 127.0.0.1

The rules are evaluated as follows:

  • Look at what follows Order to see which keyword comes first and which comes last.
  • If deny comes first, the deny from statements are evaluated first and the allow from statements after them.
  • All rules are evaluated in that order, and a later match overrides an earlier one. In this example everything is denied first and 127.0.0.1 is then allowed, so only 127.0.0.1 can get through.

2. Example 2

Order allow,deny   
deny from all   
allow from 127.0.0.1

This denies everyone, including 127.0.0.1, because the order is allow and then deny. Although allow matched 127.0.0.1 first, the deny from all that is evaluated afterwards overrides it.

3. Example 3

Order allow,deny   
deny from all

These rules mean that no client can get through.

4. Example 4

Order deny,allow
deny from all

These rules also mean that no client can get through.

Order deny,allow

With only the Order line and no specific rules, all clients can get through, because allow comes last and is therefore the default.

Order allow,deny

With this line alone, no client can get through, because deny comes last and is therefore the default.
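The four examples can be checked against a small Python model of the Order logic. It is a sketch under simplifying assumptions: a rule matches only the keyword all or an exact IP address (no networks or host names), and the function names are hypothetical.

```python
def matches(rules, client_ip):
    """True when any 'from' value is 'all' or equals the client IP."""
    return any(rule == "all" or rule == client_ip for rule in rules)


def access_allowed(order, allow, deny, client_ip):
    """Evaluate allow/deny lists the way the Order keyword prescribes."""
    allowed = matches(allow, client_ip)
    denied = matches(deny, client_ip)
    if order == "deny,allow":
        # Deny is checked first, allow last: allow wins, default is allow.
        return True if allowed else not denied
    if order == "allow,deny":
        # Allow is checked first, deny last: deny wins, default is deny.
        return False if denied else allowed
    raise ValueError(f"unknown order: {order}")


# Example 1: only 127.0.0.1 gets through.
print(access_allowed("deny,allow", ["127.0.0.1"], ["all"], "127.0.0.1"))
print(access_allowed("deny,allow", ["127.0.0.1"], ["all"], "192.168.0.8"))
# Example 2: even 127.0.0.1 is denied.
print(access_allowed("allow,deny", ["127.0.0.1"], ["all"], "127.0.0.1"))
```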

5. Restricting a directory

Suppose a directory is very important and only our company's IP may access it. The directory can also be the root directory of the website, that is, the entire site.

<Directory /usr/local/apache2/htdocs>   
    Order deny,allow   
    Deny from all    
    Allow from 127.0.0.1
</Directory>

6. Restrict the URL for the request

<filesmatch "(.*)admin(.*)">   
    Order deny,allow   
    Deny from all    
    Allow from 127.0.0.1   
</filesmatch>

The filesmatch syntax is used here; it applies the rules to files whose names match the regular expression.

7. Validation

# curl -x192.168.0.8:80 www.123.com/admin.php -I      
HTTP/1.1 403 Forbidden      
Date: Wed, 26 Oct 2016 06:24:54 GMT      
Server: Apache/2.2.31 (Unix) PHP/5.5.38      
Content-Type: text/html; charset=iso-8859-1
# curl -x127.0.0.1:80 www.123.com/admin.php -I   
HTTP/1.1 401 Authorization Required   
Date: Wed, 26 Oct 2016 06:25:03 GMT   
Server: Apache/2.2.31 (Unix) PHP/5.5.38   
WWW-Authenticate: Basic realm="Please input you acount."   
Content-Type: text/html; charset=iso-8859-1

X. Disabling PHP Parsing

Disabling PHP parsing for a directory is very useful and is used a lot in website security. For example, some directories accept uploaded files. To keep an uploaded file containing a Trojan from being executed, we disable PHP parsing in that directory.

1. Configuration to disable PHP parsing

<Directory /usr/local/apache2/htdocs/data>   
    php_admin_flag engine off    
    <filesmatch "(.*)php">   
        Order deny,allow   
        Deny from all    
    </filesmatch>   
</Directory>

Note: the statement php_admin_flag engine off disables PHP parsing, but on its own it is not enough. With only this setting, users can still request PHP files; the files are not parsed, but they can be downloaded, and letting users download PHP source is also inappropriate. That is why access to them is denied as well.

 

XI. Blocking Specified user_agents

The user_agent is the browser's identification string. Common clients today include IE, Chrome, Firefox, 360, Safari on the iPhone, Android phones, the Baidu search engine, the Google search engine and so on, and each has its own user_agent. Blocking specified user_agents avoids bandwidth being wasted on useless search engines or crawlers.

<IfModule mod_rewrite.c>   
    RewriteEngine on   
    RewriteCond %{HTTP_HOST} ^www\.abc\.com$ [OR]
    RewriteCond %{HTTP_HOST} ^www\.abcd\.com$
    RewriteRule ^/(.*)$ http://www.123.com/$1 [R=301,L]   

    RewriteCond %{HTTP_USER_AGENT} ".*Firefox.*" [NC,OR]   
    RewriteCond %{HTTP_USER_AGENT} ".*Tomato Bot.*" [NC]   
    RewriteRule .* - [F]    
</IfModule>

The rewrite module is also used to restrict the specified user_agents. Here, RewriteRule .* - [F] forbids access outright, RewriteCond matches against the user_agent, NC means case-insensitive, and OR means "or", joining the next condition. To block Baidu's search engine, we can add a rule like this:

RewriteCond %{HTTP_USER_AGENT} ^.*Baiduspider/2.0.* [NC]   
RewriteRule .* - [F] 
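How the NC and OR flags combine can be seen in a short Python model of the user_agent conditions. The patterns are the ones from the rules above; the function name and the sample user_agent strings are illustrative only, and 403 stands for the response produced by the F flag.

```python
import re

# RewriteCond patterns; re.IGNORECASE plays the role of the NC flag.
BLOCKED_AGENTS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r".*Firefox.*", r".*Tomato Bot.*", r"^.*Baiduspider/2.0.*")
]


def status_for(user_agent):
    """Return 403 when any condition matches (conditions joined by OR)."""
    if any(pattern.search(user_agent) for pattern in BLOCKED_AGENTS):
        return 403
    return 200


print(status_for("Mozilla/5.0 (X11; Linux x86_64; rv:49.0) Gecko/20100101 Firefox/49.0"))
print(status_for("curl/7.29.0"))
```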

Restricting a directory

We can use allow and deny to restrict a subdirectory under the root directory of the current website. The same can also be done with rewrite. The configuration is as follows:

<IfModule mod_rewrite.c>   
    RewriteEngine on   
    RewriteCond %{REQUEST_URI} ^.*/tmp/* [NC]   
    RewriteRule .* - [F]   
</IfModule>

This configuration forbids every request whose URI contains /tmp (the trailing * makes the final slash optional).
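The exact reach of the pattern is easy to test with Python's re module, whose syntax agrees with Apache's for this simple expression. In the sketch below, STRICT_PATTERN is a hypothetical stricter variant that is not part of the original configuration, and the function name is made up for illustration.

```python
import re

# Pattern from the RewriteCond, with [NC] modelled by re.IGNORECASE.
TMP_PATTERN = re.compile(r"^.*/tmp/*", re.IGNORECASE)
# Hypothetical stricter variant that requires the slash after tmp.
STRICT_PATTERN = re.compile(r"^.*/tmp/", re.IGNORECASE)


def forbidden(request_uri, pattern=TMP_PATTERN):
    """True when the rule would answer the request with 403 Forbidden."""
    return pattern.search(request_uri) is not None


for uri in ("/tmp/a.php", "/data/TMP/x.txt", "/tmpfile.html", "/index.php"):
    print(uri, forbidden(uri), forbidden(uri, STRICT_PATTERN))
```

Because the * applies to the final slash, a URI such as /tmpfile.html is caught by the original pattern as well; the stricter variant avoids that at the cost of no longer matching a bare /tmp.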


This is originally created by Zhang Dashen. Please note consciously that:
Reprinted please indicate from Zhang Chenyun's Personal Website - Cloud Coded Note Address of this article: http://www.itzcy.com/blog/1299.html
Unless noted, Zhang Chenyun's personal website - Cloud Coding Note articles are original, reproduced please indicate the origin and link!


Posted by pelegk2 on Fri, 12 Apr 2019 22:24:33 -0700