Collecting container logs in Kubernetes

Keywords: Web Server Java Kubernetes Docker Nginx

Requirements

The files under /var/log/containers are actually soft links.

The real log files live in /var/lib/docker/containers.
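You can confirm this on a node with ls -l. The pod and container IDs below are only illustrative, and depending on the Kubernetes and Docker versions the link may go through /var/log/pods first:

ls -l /var/log/containers
# lrwxrwxrwx 1 root root ... nginx-xxx_default_nginx-4f02d9xxx.log -> /var/lib/docker/containers/4f02d9xxx/4f02d9xxx-json.log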

Options:

  1. Logstash (uses a lot of memory; try not to use it)
  2. fluentd
  3. filebeat
  4. Do not use docker-driver

Log format

/var/log/containers

{
    "log": "17:56:04.176 [http-nio-8080-exec-5] INFO  c.a.goods.proxy.GoodsGetServiceProxy - ------ request_id=514136795198259200,zdid=42,gid=108908071,Getting data from the cache:fail ------\n",
    "stream": "stdout",
    "time": "2018-11-19T09:56:04.176713636Z"
}

{
    "log": "[][WARN ][2018-11-19 18:13:48,896][http-nio-10080-exec-2][c.h.o.p.p.s.impl.PictureServiceImpl][[msg:Pictures do not meet the requirements:null];[code:400.imageUrl.invalid];[params:https://img.alicdn.com/bao/uploaded/i2/2680224805/TB2w5C9bY_I8KJjy1XaXXbsxpXa_!!2680224805.jpg];[stack:{\"requestId\":\"514141260156502016\",\"code\":\"400.imageUrl.invalid\",\"msg\":\"\",\"stackTrace\":[],\"suppressedExceptions\":[]}];]\n",
    "stream": "stdout",
    "time": "2018-11-19T10:13:48.896892566Z"
}
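Note that Docker's json-file driver writes one JSON object per output line, so a multi-line Java stack trace arrives as several separate records; this is why the multiline options appear in the Filebeat configuration below. A synthetic example (the exception and class are made up for illustration):

{"log":"java.lang.NullPointerException\n","stream":"stdout","time":"2018-11-19T09:56:04.180713636Z"}
{"log":"\tat c.a.goods.proxy.GoodsGetServiceProxy.get(GoodsGetServiceProxy.java:42)\n","stream":"stdout","time":"2018-11-19T09:56:04.180713637Z"}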

Logstash

  • filebeat.yml
filebeat:
  prospectors:
  - type: log
    # Whether this input is enabled for collection
    enabled: true
    paths:  # Paths to collect logs from; these are paths inside the container.
    - /var/log/elkTest/error/*.log
    # Merge multi-line log entries into a single event
    multiline.pattern: '^\['
    multiline.negate: true
    multiline.match: after
    # Tag each entry so that logs in different formats can be told apart later
    tags: ["java-logs"]
  # Records how far each log file has been read, so that after a container
  # restart collection resumes from the recorded position.
  registry_file: /usr/share/filebeat/data/registry

output:
  # Output to Logstash
  logstash:
    hosts: ["0.0.0.0:5044"]

Note: For Filebeat 6.0 and above, filebeat.yml needs to be mounted at /usr/share/filebeat/filebeat.yml, and the /usr/share/filebeat/data/registry file should be mounted as well, so that logs are not collected again from scratch after the Filebeat container restarts.
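When Filebeat runs as a plain Docker container, the mounts could look roughly like this (the host paths are placeholders, and the log-directory mount assumes the application writes to /var/log/elkTest on the host):

docker run -d --name filebeat \
  -v /srv/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml \
  -v /srv/filebeat/data:/usr/share/filebeat/data \
  -v /var/log/elkTest:/var/log/elkTest:ro \
  docker.elastic.co/beats/filebeat:6.6.0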

  • logstash.conf
input {
    beats {
        port => 5044
    }
}
filter {
    
   if "java-logs" in [tags]{ 
     grok {
        match => {
           "message" => "(?<date>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3})\]\[(?<level>[A-Z]{4,5})\]\[(?<thread>[A-Za-z0-9/-]{4,40})\]\[(?<class>[A-Za-z0-9/.]{4,40})\]\[(?<msg>.*)"
        }
        remove_field => ["message"]
     }
    }
    #if ([message] =~ "^\[") {
    #    drop {}
    #}
    # !~ means the field does not match the regex; =~ means it matches
    if [level] !~ "(ERROR|WARN|INFO)" {
        drop {}
    }
}

## Add your filters / logstash plugins configuration here

output {
    elasticsearch {
        hosts => "0.0.0.0:9200"
    }
}
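Before sending traffic through it, the configuration can be checked with Logstash's built-in config test (the path to the logstash binary depends on how it was installed):

bin/logstash -f logstash.conf --config.test_and_exit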

fluentd

fluentd-es-image image

Kubernetes-Unified Log Management Based on EFK

Docker Logging via EFK (Elasticsearch + Fluentd + Kibana) Stack with Docker Compose

filebeat+ES pipeline

Define pipeline

  • Define java-specific pipelines

PUT /_ingest/pipeline/java
{
    "description": "[0]java[1]nginx[last]General rules",
    "processors": [{
        "grok": {
            "field": "message",
            "patterns": [
                "\\[%{LOGLEVEL:level}\\s+?\\]\\[(?<date>\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3})\\]\\[(?<thread>[A-Za-z0-9/-]+?)\\]\\[%{JAVACLASS:class}\\]\\[(?<msg>[\\s\\S]*?)\\]\\[(?<stack>.*?)\\]"
            ]
        }
    }, {
        "remove": {
            "field": "message"
        }
    }]
}
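A pipeline can be sanity-checked with the _simulate API before any Filebeat traffic is routed to it. The message below is synthetic, but follows the bracketed layout the grok pattern above expects:

POST /_ingest/pipeline/java/_simulate
{
    "docs": [{
        "_source": {
            "message": "[INFO ][2018-11-19 18:13:48,896][http-nio-8080-exec-2][c.a.goods.proxy.GoodsGetServiceProxy][request ok][-]"
        }
    }]
}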

PUT /_ingest/pipeline/nginx
{
    "description": "[0]java[1]nginx[last]General rules",
    "processors": [{
        "grok": {
            "field": "message",
            "patterns": [
                "%{IP:client} - - \\[(?<date>.*?)\\] \"(?<method>[A-Za-z]+?) (?<url>.*?)\" %{NUMBER:statuscode} %{NUMBER:duration} \"(?<refer>.*?)\" \"(?<user-agent>.*?)\""
            ]
        }
    }, {
        "remove": {
            "field": "message"
        }
    }]
}

PUT /_ingest/pipeline/default
{
    "description": "[0]java[1]nginx[last]General rules",
    "processors": []
}

PUT /_ingest/pipeline/all
{
    "description": "[0]java[1]nginx[last]General rules",
    "processors": [{
        "grok": {
            "field": "message",
            "patterns": [
                "\\[%{LOGLEVEL:level}\\s+?\\]\\[(?<date>\\d{4}-\\d{2}-\\d{2}\\s\\d{2}:\\d{2}:\\d{2},\\d{3})\\]\\[(?<thread>[A-Za-z0-9/-]+?)\\]\\[%{JAVACLASS:class}\\]\\[(?<msg>[\\s\\S]*?)\\]\\[(?<stack>.*?)\\]",
                
                "%{IP:client} - - \\[(?<date>.*?)\\] \"(?<method>[A-Za-z]+?) (?<url>.*?)\" %{NUMBER:statuscode} %{NUMBER:duration} \"(?<refer>.*?)\" \"(?<user-agent>.*?)\"",
                
                ".+"
            ]
        }
    }]
}

filebeat.yml

apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
        inputs:
          # Mounted `filebeat-inputs` configmap:
          path: ${path.config}/inputs.d/*.yml
          # Reload inputs configs as they change:
          reload.enabled: false
        modules:
          path: ${path.config}/modules.d/*.yml
          # Reload module configs as they change:
          reload.enabled: false
    setup.template.settings:
        index.number_of_replicas: 0

    # https://www.elastic.co/guide/en/beats/filebeat/6.5/filebeat-reference-yml.html
    # https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover.html
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          templates:
            - config:
                - type: docker
                  containers.ids:
                    # - "${data.kubernetes.container.id}"
                    - "*"
                  enabled: true
                  processors:
                    - add_kubernetes_metadata:
                        # include_annotations:
                        #   - annotation_to_include
                        in_cluster: true
                    - add_cloud_metadata:

    # cloud.id: ${ELASTIC_CLOUD_ID}
    # cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output:
      elasticsearch:
        hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
        # username: ${ELASTICSEARCH_USERNAME}
        # password: ${ELASTICSEARCH_PASSWORD}
        # pipelines:          
        #   - pipeline: "nginx"
        #     when.contains:
        #       kubernetes.container.name: "nginx-"
        #   - pipeline: "java"
        #     when.contains:
        #       kubernetes.container.name: "java-"              
        #   - pipeline: "default"  
        #     when.contains:
        #       kubernetes.container.name: ""
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      tolerations:
        - key: "elasticsearch-exclusive"
          operator: "Exists"
          effect: "NoSchedule"         
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
        - name: filebeat
          imagePullPolicy: Always          
          image: 'filebeat:6.6.0'
          args: [
            "-c", 
            "/etc/filebeat.yml",
            "-e",
          ]         
          env:
          - name: ELASTICSEARCH_HOST
            value: 0.0.0.0
          - name: ELASTICSEARCH_PORT
            value: "9200"
          # - name: ELASTICSEARCH_USERNAME
          #   value: elastic
          # - name: ELASTICSEARCH_PASSWORD
          #   value: changeme
          # - name: ELASTIC_CLOUD_ID
          #   value:
          # - name: ELASTIC_CLOUD_AUTH
          #   value:
          securityContext:
            runAsUser: 0
            # If using Red Hat OpenShift uncomment this:
            #privileged: true
          resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
          - name: config
            mountPath: /etc/filebeat.yml
            readOnly: true
            subPath: filebeat.yml
          - name: data
            mountPath: /usr/share/filebeat/data
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
            readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
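
Assuming the manifests above are saved in a single file such as filebeat-kubernetes.yaml (the file name is arbitrary), they can be applied and checked like this:

kubectl apply -f filebeat-kubernetes.yaml
kubectl get pods -n kube-system -l k8s-app=filebeat -o wide
kubectl logs -n kube-system -l k8s-app=filebeat --tail=20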

If the output is a single-node Elasticsearch, you can modify the index template so that the filebeat* indices are created with 0 replicas:

curl -X PUT "10.10.10.10:9200/_template/template_log" -H 'Content-Type: application/json' -d'
{
    "index_patterns" : ["filebeat*"],
    "order" : 0,
    "settings" : {
        "number_of_replicas" : 0
    }
}
'
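The stored template can be read back to confirm the change:

curl -X GET "10.10.10.10:9200/_template/template_log?pretty"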

Reference link:

  1. running-on-kubernetes
  2. Elaboration of ELK+Filebeat Centralized Logging Solution
  3. filebeat.yml (Chinese configuration details)
  4. Elasticsearch Pipeline
  5. es number_of_shards and number_of_replicas

Other approaches

Some of these use the sidecar model, which allows finer-grained, per-application control; a minimal sketch of that model follows the list below.

  1. Using filebeat to collect application logs of kubernetes
  2. Use Logstash to collect Kubernetes application logs
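
A rough sketch of the sidecar model, assuming a hypothetical application that writes log files to /app/logs and a Filebeat sidecar reading them from a shared emptyDir (names, image tags and paths are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-filebeat-sidecar
spec:
  containers:
  - name: app
    image: my-java-app:latest          # hypothetical application image
    volumeMounts:
    - name: app-logs
      mountPath: /app/logs             # the app writes its log files here
  - name: filebeat-sidecar
    image: docker.elastic.co/beats/filebeat:6.6.0
    args: ["-c", "/etc/filebeat.yml", "-e"]
    volumeMounts:
    - name: app-logs
      mountPath: /app/logs             # same files, read-only for the shipper
      readOnly: true
    - name: filebeat-config
      mountPath: /etc/filebeat.yml
      subPath: filebeat.yml
  volumes:
  - name: app-logs
    emptyDir: {}
  - name: filebeat-config
    configMap:
      name: app-filebeat-config        # hypothetical ConfigMap holding filebeat.yml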

Aliyun's (Alibaba Cloud) solution

  1. Kubernetes Log Collection Process

Start with docker

  1. docker driver
# List the fluentd-es pods
kubectl get po -l k8s-app=fluentd-es -n kube-system
# Grab a pod name and tail its logs
pod=`kubectl get po -l k8s-app=fluentd-es -n kube-system | grep -Eoi 'fluentd-es-([a-z]|-|[0-9])+'` && kubectl logs $pod -n kube-system
# Check recent events for that pod
kubectl get events -n kube-system | grep $pod
# Delete the pod to force a restart (the DaemonSet recreates it)
kubectl delete po $pod -n kube-system
