Summary
Prometheus Operator is arguably the best practice for running a monitoring system on Kubernetes. It bootstraps the entire monitoring stack in one step and lets you configure it in a non-intrusive, declarative way: monitoring data sources, alerting rules, automatic recovery of failed components, highly available alerting, and so on.
However, there is still a small learning curve for beginners. This article shows the right way to use Prometheus Operator, using Envoy monitoring as a running example.
How to write alerting rules or compose Prometheus queries is not the focus here and will be covered in follow-up articles; this article concentrates on how to use Prometheus Operator itself.
prometheus operator installation
The sealyun offline installation package already ships with Prometheus Operator, so it can be used directly after the cluster is installed.
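Whatever way the operator was installed, a quick check that the stack is up (assuming the default kube-prometheus layout in the monitoring namespace) looks like this:

kubectl get pods -n monitoring
kubectl get crd | grep monitoring.coreos.com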
Configuration of monitoring data sources
Principle: the operator discovers monitoring data sources (Services) through its CRDs, most importantly ServiceMonitor.
Start envoy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: envoy
  labels:
    app: envoy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: envoy
  template:
    metadata:
      labels:
        app: envoy
    spec:
      volumes:
      - hostPath:               # Mount the envoy configuration file from the host for convenience
          path: /root/envoy
          type: DirectoryOrCreate
        name: envoy
      containers:
      - name: envoy
        volumeMounts:
        - mountPath: /etc/envoy
          name: envoy
          readOnly: true
        image: envoyproxy/envoy:latest
        ports:
        - containerPort: 10000  # Data port
        - containerPort: 9901   # Admin port; metrics are exposed through it
---
kind: Service
apiVersion: v1
metadata:
  name: envoy
  labels:
    app: envoy                  # Label the Service; the operator finds Services by label
spec:
  selector:
    app: envoy
  ports:
  - protocol: TCP
    port: 80
    targetPort: 10000
    name: user
  - protocol: TCP               # The port through which the Service exposes metrics
    port: 81
    targetPort: 9901
    name: metrics               # The name matters: the ServiceMonitor references ports by name
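Save the two manifests above to a file, for example envoy.yaml (the file name is just an example), and apply it:

kubectl apply -f envoy.yaml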
envoy configuration file:
The admin listener address must be set to 0.0.0.0 (not 127.0.0.1), otherwise the metrics cannot be scraped through the Service.
/root/envoy/envoy.yaml
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0          # Must be 0.0.0.0 here, not 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 10000
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                route:
                  host_rewrite: sealyun.com
                  cluster: service_sealyun
          http_filters:
          - name: envoy.router
  clusters:
  - name: service_sealyun
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    # Comment out the following line to test on v6 networks
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts:
    - socket_address:
        address: sealyun.com
        port_value: 443
    tls_context: { sni: sealyun.com }
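A quick sanity check that the metrics are reachable through the Service, using a throwaway curl pod (a sketch; the image is only an example, and envoy.default:81 follows from the Service defined above):

kubectl run curl --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -s http://envoy.default:81/stats/prometheus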
Using Service Monitor
envoyServiceMonitor.yaml:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: envoy
  name: envoy
  namespace: monitoring         # Does not have to be in the same namespace as the Service
spec:
  endpoints:
  - interval: 15s
    port: metrics               # Port name of the envoy Service
    path: /stats/prometheus     # Path where the metrics are exposed
  namespaceSelector:
    matchNames:                 # Namespace where the envoy Service is located
    - default
  selector:
    matchLabels:
      app: envoy                # Select the envoy Service by label
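Create the ServiceMonitor:

kubectl apply -f envoyServiceMonitor.yaml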
Once the ServiceMonitor is created successfully, the envoy target shows up as a data source in Prometheus:
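If no ingress is set up for Prometheus yet, port-forwarding the prometheus-k8s Service (the default name in kube-prometheus) is the quickest way to reach the targets page:

kubectl port-forward -n monitoring svc/prometheus-k8s 9090
# then open http://localhost:9090/targets in a browser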
Then the Envoy metrics are visible:

From there you can build dashboards in Grafana; Prometheus query usage itself is outside the scope of this article.
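As a quick example only (metric names vary with the Envoy version and the stat_prefix), a query like the following in the Prometheus UI plots the downstream request rate handled by Envoy:

rate(envoy_http_downstream_rq_total[1m])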
Alert configuration
Alertmanager configuration
[root@dev-86-201 envoy]# kubectl get secret -n monitoring
NAME                TYPE     DATA   AGE
alertmanager-main   Opaque   1      27d
We can see the secret; inspect it to see the details:
[root@dev-86-201 envoy]# kubectl get secret alertmanager-main -o yaml -n monitoring
apiVersion: v1
data:
  alertmanager.yaml: Imdsb2JhbCI6IAogICJyZXNvbHZlX3RpbWVvdXQiOiAiNW0iCiJyZWNlaXZlcnMiOiAKLSAibmFtZSI6ICJudWxsIgoicm91dGUiOiAKICAiZ3JvdXBfYnkiOiAKICAtICJqb2IiCiAgImdyb3VwX2ludGVydmFsIjogIjVtIgogICJncm91cF93YWl0IjogIjMwcyIKICAicmVjZWl2ZXIiOiAibnVsbCIKICAicmVwZWF0X2ludGVydmFsIjogIjEyaCIKICAicm91dGVzIjogCiAgLSAibWF0Y2giOiAKICAgICAgImFsZXJ0bmFtZSI6ICJEZWFkTWFuc1N3aXRjaCIKICAgICJyZWNlaXZlciI6ICJudWxsIg==
kind: Secret
Base64-decode the alertmanager.yaml key to see the default configuration:
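For example, this one-liner extracts the key and decodes it (note the escaped dot in the jsonpath expression):

kubectl get secret alertmanager-main -n monitoring \
  -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d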
"global": "resolve_timeout": "5m" "receivers": - "name": "null" "route": "group_by": - "job" "group_interval": "5m" "group_wait": "30s" "receiver": "null" "repeat_interval": "12h" "routes": - "match": "alertname": "DeadMansSwitch" "receiver": "null"
So configuring Alertmanager is very simple: just recreate this secret with your own alertmanager.yaml.
For example, alertmanager.yaml:
global:
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '474785153@qq.com'
  smtp_auth_username: '474785153@qq.com'
  smtp_auth_password: 'xxx'   # The authorization code generated after SMTP is enabled; how to get it is described below
  smtp_require_tls: false
route:
  group_by: ['alertmanager','cluster','service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: 'fanux'
  routes:
  - receiver: 'fanux'
receivers:
- name: 'fanux'
  email_configs:
  - to: '474785153@qq.com'
    send_resolved: true
Delete the old secret and recreate it from the new configuration file:
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
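The operator's config reloader should pick up the new secret automatically; to confirm, check the Alertmanager logs (pod and container names here assume the default kube-prometheus layout):

kubectl logs alertmanager-main-0 -n monitoring -c alertmanager | tail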
Mailbox configuration, using QQ Mail as an example
Enable the SMTP/POP3 service in the mailbox settings.
Follow the prompts; the authorization code that gets generated is what goes into smtp_auth_password in the configuration above.
Then you can receive alert emails:
Alert Rule Configuration
Prometheus Operator defines the PrometheusRule CRD to describe alerting rules.
[root@dev-86-202 shell]# kubectl get PrometheusRule -n monitoring
NAME                   AGE
prometheus-k8s-rules   6m
You can edit this rule object directly, or create a separate PrometheusRule of your own (a standalone example is sketched after the snippet below).
kubectl edit PrometheusRule prometheus-k8s-rules -n monitoring
For example, add an alert to the groups:
spec:
  groups:
  - name: ./example.rules
    rules:
    - alert: ExampleAlert
      expr: vector(1)
  - name: k8s.rules
    rules:
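Alternatively, here is a sketch of a standalone PrometheusRule for the Envoy metrics. It assumes the default kube-prometheus ruleSelector labels (prometheus: k8s, role: alert-rules); the metric name and threshold are only illustrative:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: envoy-rules
  namespace: monitoring
  labels:
    prometheus: k8s       # Must match the ruleSelector of the Prometheus CR
    role: alert-rules
spec:
  groups:
  - name: envoy.rules
    rules:
    - alert: EnvoyHighRequestRate
      expr: rate(envoy_http_downstream_rq_total[1m]) > 100   # Illustrative metric and threshold
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: Envoy downstream request rate is unusually high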
Restart the Prometheus pods:
kubectl delete pod prometheus-k8s-0 prometheus-k8s-1 -n monitoring
Then the new rule appears in the Prometheus web UI:
For discussion, join QQ group: 98488045