Logstash configuration file details

Keywords: Apache, Elasticsearch, Nginx, REST

Detailed application configuration of Logstash

background

The business goal is to analyze the daily logs generated by Nginx and Apache, monitor URLs, IPs, REST interfaces and other information, and send the resulting data to an Elasticsearch service.

config

input

Read log entries from a file:

file {
    path => "/home/keepgostudio/demo/logs/test.log"
    start_position => "beginning"
}       
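Since the output index below interpolates %{type} and %{host}, it helps to tag events as they are read in; the file input's type option does this (the value "apache" here is illustrative):

file {
    path => "/home/keepgostudio/demo/logs/test.log"
    start_position => "beginning"
    type => "apache"
}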

filter

grok

Grok is currently the best way in Logstash to turn non-standardized log data into structured, searchable fields. Logstash ships with about 120 built-in patterns, covering formats such as Java stack traces and Apache access logs.

If there are no special requirements, the default Apache log pattern achieves the desired effect, as follows.

grok {
    match => { "message" => ["%{COMBINEDAPACHELOG}"] }
}

But if we want to monitor more information, such as the query parameters on the URL, the default pattern cannot meet our needs. In that case we need to write expressions that fit our business and tell Logstash how to parse the data the way we want.

First, create a patterns folder in the root directory of Logstash; it does not exist by default.

Next, create the file test_pattern in the patterns folder. (For convenience, the file here is not named after the function of its patterns; in practice it is better to name it that way, for obvious reasons.) In the test_pattern file you can define custom regular expressions, one per line, in the format "NAME regular-expression", for use in grok.
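For example, a single line in a pattern file binds a name to a regular expression; the pattern below is purely illustrative:

SESSION_ID [0-9a-f]{32}

It can then be referenced from grok as %{SESSION_ID:session}, which captures the match into a session field on the event.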

Finally, make sure to set the patterns_dir parameter when you use grok, otherwise Logstash cannot find the regular expressions you customized.

grok {
    patterns_dir => ["/home/keepgostudio/download/logstash-5.2.0/patterns"]
    match => {
        "message" => ["%{PARAMS_APACHELOG}", "%{NO_PARAMS_APACHELOG}"]
    }
    remove_field => ["host", "timestamp", "httpversion", "@version"]
}

kv

Converts the source data into key-value pairs and creates a field for each pair. For example, when "a=111&b=2222&c=3333" is passed in, the fields a, b and c are created on the event. The advantage is that when a parameter needs to be queried, it can be queried directly, instead of first retrieving a whole string and parsing it.

kv {
    source => "field_name"
    field_split => "&?"
}
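Applied to the example above, an event whose source field contains "a=111&b=2222&c=3333" gains three new fields (a sketch of the resulting event fields, assuming default kv settings):

"a" => "111"
"b" => "2222"
"c" => "3333"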

geoip

As the name suggests, this filter looks up geographical information for an IP address, such as city, region, country, latitude and longitude. The lookup is done against a database bundled with Logstash, not over the network. With target => "location", the looked-up attributes are nested under the location field rather than placed at the top level of the event.

geoip {
    source => "field_name"
    fields => ["country_name", "region_name", "city_name", "latitude", "longitude"]
    target => "location"
}

drop

drop skips log entries that should not be counted. When an entry matches the if condition, it is discarded from the output and Logstash moves straight on to parsing the next entry.

if [field_name] == "value" {
    drop {}
}
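For instance, to discard requests for static resources before they reach Elasticsearch (the url field comes from the custom grok patterns in the appendix; the value matched here is illustrative):

if [url] == "/favicon.ico" {
    drop {}
}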

output

When Logstash outputs to Elasticsearch, an id is generated randomly for each record by default. When first getting started it is worth setting the id manually; combined with inspecting the documents in Elasticsearch itself, this makes the whole pipeline easier to understand.

elasticsearch {
    hosts => ["192.168.1.44:9200"]      
    index => "logstash-test-%{type}-%{host}"        
}
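A minimal sketch of setting the id manually via the elasticsearch output's document_id option (the choice of %{ip} as the id source is illustrative; in practice pick a field combination that is unique per record):

elasticsearch {
    hosts => ["192.168.1.44:9200"]
    index => "logstash-test-%{type}-%{host}"
    document_id => "%{ip}"
}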

appendix

test.config

input {
    stdin {}
}

filter {
    grok {
        patterns_dir => ["/home/keepgostudio/download/logstash-5.2.0/patterns"]
        match => {
            "message" => ["%{PARAMS_APACHELOG}", "%{NO_PARAMS_APACHELOG}"]
        }
        remove_field => ["host", "timestamp", "httpversion", "@version"]
    }

    kv {
        source => "params"
        field_split => "&?"
    }

    geoip {
        source => "ip"
        fields => ["country_name", "region_name", "city_name", "latitude", "longitude"]
        target => "location"
    }
}

output {
    elasticsearch {
        hosts => ["192.168.1.44:9200"]      
        index => "logstash-test-%{type}-%{host}"        
    }

}

test_pattern

HTTP_URL \S+(?=\?)
HTTP_URL_WITH_PARAMS "(?:%{WORD:method} %{HTTP_URL:url}\?%{NOTSPACE:params}(?: HTTP/%{NUMBER:httpversion}))"
HTTP_URL_WITHOUT_PARAMS "(?:%{WORD:method} %{NOTSPACE:url}(?: HTTP/%{NUMBER:httpversion}))"
NO_PARAMS_APACHELOG %{IPV4:ip} %{USERNAME} %{USERNAME} \[%{HTTPDATE:timestamp}\] %{HTTP_URL_WITHOUT_PARAMS} %{NUMBER:response} (?:%{NUMBER:bytes}|-) "%{NOTSPACE:referrer}" %{QS:agent}
PARAMS_APACHELOG %{IPV4:ip} %{USERNAME} %{USERNAME} \[%{HTTPDATE:timestamp}\] %{HTTP_URL_WITH_PARAMS} %{NUMBER:response} (?:%{NUMBER:bytes}|-) "%{NOTSPACE:referrer}" %{QS:agent}

Posted by don_s on Sun, 14 Apr 2019 17:27:31 -0700