Extras: why insist on using k8s for the existing business? The strongest driver was that the original virtual-machine deployment of the Elasticsearch cluster could no longer keep up with the growth of the business; building and operating a large Elasticsearch cluster on virtual machines is a nightmare. We therefore decided to move to a containerized deployment. Containers not only give us fast deployment, they also let k8s simplify day-to-day operation and maintenance of the Elasticsearch cluster. Of course, deploying the Elasticsearch cluster (a stateful workload) after the k8s cluster is up is the harder part: it involves service exposure and persistent storage, which this article works through step by step.
Brief introduction
Kubernetes Operator
An Operator is an application-specific controller, originally proposed by CoreOS, that extends the Kubernetes API. It is used to create, configure and manage complex stateful applications such as databases, caches and monitoring systems. An Operator builds on Kubernetes' resource and controller concepts, but also encodes application-specific domain knowledge. The key to creating an Operator is the design of its CRD (Custom Resource Definition).
Kubernetes introduced the concept of custom controllers in version 1.7. It lets developers extend Kubernetes with new functionality, update existing functionality, and automate management tasks, and these custom controllers behave just like native Kubernetes components. Operators are developed directly against the Kubernetes API: they watch the cluster and, following the custom rules written into the controller, change Pods/Services and scale the applications they manage up and down.
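For example, the ECK operator introduced below registers CRDs for Elasticsearch, Kibana and APM Server; once it is installed (see the installation section), a quick way to confirm they are present, assuming kubectl access to the cluster, is:

```bash
# List the custom resource definitions registered by the ECK operator;
# they should include elasticsearches.elasticsearch.k8s.elastic.co,
# kibanas.kibana.k8s.elastic.co and apmservers.apm.k8s.elastic.co
kubectl get crd | grep k8s.elastic.co
```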
ECK
Elastic Cloud on Kubernetes (ECK) extends Kubernetes' basic orchestration capabilities to support provisioning and managing Elasticsearch, Kibana and APM Server on Kubernetes. ECK simplifies all the critical operations:
- Manage and monitor multiple clusters
- Scale clusters up and down
- Change cluster configuration
- Schedule backups
- Secure clusters with TLS certificates
- Implement hot-warm-cold architectures with zone awareness
Installing the ECK Operator
Online installation
```bash
kubectl apply -f https://download.elastic.co/downloads/eck/1.0.1/all-in-one.yaml
```
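After applying the manifest, the operator runs as a StatefulSet named elastic-operator in the elastic-system namespace; a quick check (the log command is the one from the official quickstart):

```bash
# Verify the operator Pod is running and follow its logs
kubectl -n elastic-system get pods
kubectl -n elastic-system logs -f statefulset.apps/elastic-operator
```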
Offline installation
First, prepare the offline files and images. Download the all-in-one.yaml file and the operator image eck-operator:1.0.1 (the current version at the time of writing). While you are at it, also prepare the Elasticsearch and Kibana images; an APM Server image is available as well, but I don't need it here, so I didn't download it. The images can be found and pulled from Docker Hub or Elastic's registry, then exported with `docker save <image_name:version> -o <export_name.tar>`.
If there is no image registry, note that the default manifests do not pin workloads to particular physical nodes, so in an offline environment without a private registry the images above have to be loaded onto every node with `docker load -i <image_filename.tar>`.
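A minimal sketch of the export/import round trip (image names follow the versions used in this article; adjust to whatever you actually pulled):

```bash
# On a host with internet access: pull and export the images
docker pull docker.elastic.co/eck/eck-operator:1.0.1
docker pull docker.elastic.co/elasticsearch/elasticsearch:7.6.0
docker pull docker.elastic.co/kibana/kibana:7.6.0
docker save docker.elastic.co/eck/eck-operator:1.0.1 -o eck-operator-1.0.1.tar
# ...repeat docker save for the other images

# Copy the tar files into the offline environment, then on every node:
docker load -i eck-operator-1.0.1.tar
```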
Note: the official Elastic images are amd64 only!
Suggestion: to make future deployments easier, it is better to set up a private image registry. I will cover building a private registry in Kubernetes cluster practice (07).
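If you do have a registry (mine is nexus.internal.test:8082, the address referenced in the manifests below), the usual tag-and-push flow is all that is needed; a sketch:

```bash
# Retag the loaded images for the private registry and push them
docker tag docker.elastic.co/elasticsearch/elasticsearch:7.6.0 nexus.internal.test:8082/amd64/elasticsearch:7.6.0
docker push nexus.internal.test:8082/amd64/elasticsearch:7.6.0
docker tag docker.elastic.co/kibana/kibana:7.6.0 nexus.internal.test:8082/amd64/kibana:7.6.0
docker push nexus.internal.test:8082/amd64/kibana:7.6.0
```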
Deploy Elasticsearch
In my existing physical environment I have a Huawei E9000 blade chassis with 16 CH121 V5 compute nodes. Each node has only two 2.5-inch drive bays and cannot provide much storage, so I carved the space of a nearby IP SAN into 16 LUNs and mounted one to each of the 16 CH121 V5 nodes. (Because the 16 compute nodes do not form a clustered file system, mounting one large shared LUN was not an option.) Each CH121 V5 then exposes its mounted space as a local PV to provide persistent storage for the ES data nodes. Why not have the Pods use iSCSI volumes directly? I simply haven't tried that myself :(
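For reference, preparing one of those mounted LUNs as a local PV directory looks roughly like this (the device name and filesystem are placeholders; adapt them to your own iSCSI setup):

```bash
# On each CH121 V5 node: format the LUN, mount it, and create the directory the local PV points to
mkfs.xfs /dev/sdb               # /dev/sdb is a placeholder for the iSCSI LUN device
mkdir -p /vol/eck
mount /dev/sdb /vol/eck         # also add an fstab entry (e.g. with _netdev) so it survives reboots
mkdir -p /vol/eck/data-pv00     # path referenced by the data PV defined below
```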
Configure PV
Edit the file 1-test-pv.yaml
```yaml
# Namespace for this Elasticsearch cluster
apiVersion: v1
kind: Namespace
metadata:
  name: test
---
# master-eligible storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nf5270m4-es-master
  namespace: test
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
# master pv0
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-es-master-pv0
  namespace: test
  labels:
    pvname: test-es-master-pv0
spec:
  capacity:
    storage: 32Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle   # Retain can be used for persistent storage, but note how to bring a Released PV back to Available when it is reused
  storageClassName: nf5270m4-es-master
  local:
    path: /home/elastic-data/es-master-pv0
  # Node affinity pins this PV to the node labeled kubernetes.io/hostname=nf5270m4 (an Inspur server)
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - nf5270m4
---
# master pv1 defines the PV of the other master-eligible node; it is identical to pv0 except pv0 becomes pv1, so it is omitted to save space
...
---
# coordinate storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nf5270m4-es-coordinate
  namespace: test
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
# coordinate pv0
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-es-coordinate-pv0
  namespace: test
  labels:
    pvname: test-es-coordinate-pv0
spec:
  capacity:
    storage: 32Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle   # Recycle clears the PV data when the claim is deleted and makes the PV Available again
  storageClassName: nf5270m4-es-coordinate
  local:
    path: /home/elastic-data/es-coordinate-pv0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - nf5270m4
---
# The storage class, PV and PVC of the ingest node are configured in the same way
...
---
# data storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: e9k-es-data
  namespace: test
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
# data pv00
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-es-data-pv00
  namespace: test
  labels:
    pvname: test-es-data-pv00
spec:
  capacity:
    storage: 2Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: e9k-es-data
  local:
    path: /vol/eck/data-pv00   # Host directory backed by the iSCSI mount
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - e9k-1-01
---
# The other data PVs are similar and are not listed one by one
...
```
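Apply the file and check that the storage classes and volumes are registered; with volumeBindingMode: WaitForFirstConsumer the PVs stay Available until a Pod claims them:

```bash
kubectl apply -f 1-test-pv.yaml
kubectl get sc
kubectl get pv
```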
Configure PVC
Edit the file 2-test-pvc.yaml. The PV and PVC definitions are kept in separate files so they can be operated on independently (for example, deleting only the PVCs when the PVs use the Retain reclaim policy).
```yaml
# master pvc0
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-es-master-pvc0
  namespace: test
spec:
  resources:
    requests:
      storage: 32Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: nf5270m4-es-master
  volumeName: test-es-master-pv0
  selector:
    matchLabels:
      pvname: test-es-master-pv0
---
# The configuration of master pvc1 is similar to pvc0
...
---
# coordinate pvc0
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-es-coordinate-pvc0
  namespace: test
spec:
  resources:
    requests:
      storage: 32Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: nf5270m4-es-coordinate
  volumeName: test-es-coordinate-pv0
  selector:
    matchLabels:
      pvname: test-es-coordinate-pv0
---
# The storage class, PV and PVC of the ingest node are configured in the same way
...
---
# data pvc00
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-es-data-pvc00
  namespace: test
spec:
  resources:
    requests:
      storage: 2Ti
  accessModes:
    - ReadWriteOnce
  storageClassName: e9k-es-data
  volumeName: test-es-data-pv00
  selector:
    matchLabels:
      pvname: test-es-data-pv00
---
# The PVC configuration of the other data nodes is similar
```
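Apply the claims and check them; because the storage classes use volumeBindingMode: WaitForFirstConsumer, it is normal for the PVCs to stay Pending until the Elasticsearch Pods that consume them are scheduled:

```bash
kubectl apply -f 2-test-pvc.yaml
kubectl get pvc -n test
```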
The master-voting node is a voting-only master: it participates in master elections but can never be elected master itself, so it is not given persistent storage (an emptyDir is used instead in the manifest below).
Configure ES and Kibana nodes
Edit 3-test-eck.yaml
```yaml
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: test
  namespace: test
spec:
  version: 7.6.0
  image: nexus.internal.test:8082/amd64/elasticsearch:7.6.0   # My private registry address
  imagePullPolicy: IfNotPresent
  updateStrategy:
    changeBudget:
      maxSurge: 2        # Default is -1, meaning all new Pods are created immediately (a large instantaneous resource spike) before replacing the old ones
      maxUnavailable: 1  # Default is 1
  podDisruptionBudget:
    spec:
      minAvailable: 1    # Default is 1
      selector:
        matchLabels:
          elasticsearch.k8s.elastic.co/cluster-name: test   # The value of metadata.name
  nodeSets:
    # Define the master-eligible nodes
    - name: master-eligible
      count: 2
      config:
        node.master: true
        node.data: false
        node.ingest: false
        node.ml: false
        node.store.allow_mmap: false
        xpack.ml.enabled: false
        cluster.remote.connect: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 32Gi
            storageClassName: nf5270m4-es-master
      podTemplate:
        metadata:
          labels:
            app: master-eligible
        spec:
          # Node selection and taint toleration: nf5270m4 (an Inspur server) hosts the private registry and normally does not schedule Pods
          nodeSelector:
            "kubernetes.io/hostname": nf5270m4
          tolerations:
            - key: "node-role.kubernetes.io/node"
              operator: "Exists"
              effect: "PreferNoSchedule"
          containers:
            # Define resource requests and limits
            - name: elasticsearch
              resources:
                requests:
                  cpu: 2        # Not set by default
                  memory: 16Gi  # Default is 2Gi
                limits:
                  # cpu:        # Not defined here (and not defined by default), so CPU is unlimited
                  memory: 24Gi  # Default is 2Gi
              env:
                - name: ES_JAVA_OPTS   # Default heap is 1Gi
                  value: -Xms10g -Xmx10g
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    # Define the voting-only master node
    - name: master-voting
      count: 1
      config:
        node.master: true
        node.voting_only: true   # Default is false
        node.data: false
        node.ingest: false
        node.ml: false
        node.store.allow_mmap: false
        xpack.ml.enabled: false
        cluster.remote.connect: false
      podTemplate:
        metadata:
          labels:
            app: master-voting
        spec:
          nodeSelector:
            "kubernetes.io/hostname": nf5270m4
          tolerations:
            - key: "node-role.kubernetes.io/node"
              operator: "Exists"
              effect: "PreferNoSchedule"
          containers:
            - name: elasticsearch
              resources:
                requests:
                  cpu: 1        # Not set by default
                  memory: 2Gi   # Default is 2Gi
                limits:
                  cpu: 1        # Not set by default
                  memory: 2Gi   # Default is 2Gi
              env:
                - name: ES_JAVA_OPTS   # Default heap is 1Gi
                  value: -Xms1g -Xmx1g
          volumes:
            - name: elasticsearch-data
              emptyDir: {}   # Use an empty directory; no persistent storage is needed
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    # Define the ingest node
    - name: ingest
      count: 1
      config:
        node.master: false
        node.data: false
        node.ingest: true
        node.ml: false
        node.store.allow_mmap: false
        cluster.remote.connect: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 32Gi
            storageClassName: nf5270m4-es-ingest
      podTemplate:
        metadata:
          labels:
            app: ingest
        spec:
          # Node selection and taint toleration: nf5270m4 hosts the private registry and normally does not schedule Pods
          nodeSelector:
            "kubernetes.io/hostname": nf5270m4
          tolerations:
            - key: "node-role.kubernetes.io/node"
              operator: "Exists"
              effect: "PreferNoSchedule"
          containers:
            # Define resource requests and limits
            - name: elasticsearch
              resources:
                requests:
                  cpu: 1        # Not set by default
                  memory: 8Gi   # Default is 2Gi
                limits:
                  # cpu:        # Not defined, so CPU is unlimited
                  memory: 16Gi  # Default is 2Gi
              env:
                - name: ES_JAVA_OPTS   # Default heap is 1Gi
                  value: -Xms10g -Xmx10g
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    # Define the coordinating node
    - name: coordinate
      count: 1
      config:
        node.master: false
        node.data: false
        node.ingest: false
        node.ml: false
        node.store.allow_mmap: false
        cluster.remote.connect: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 32Gi
            storageClassName: nf5270m4-es-coordinate
      podTemplate:
        metadata:
          labels:
            app: coordinate
        spec:
          nodeSelector:
            "kubernetes.io/hostname": nf5270m4
          tolerations:
            - key: "node-role.kubernetes.io/node"
              operator: "Exists"
              effect: "PreferNoSchedule"
          containers:
            # Define resource requests and limits
            - name: elasticsearch
              resources:
                requests:
                  cpu: 4        # Not set by default
                  memory: 32Gi  # Default is 2Gi
                limits:
                  # cpu:        # Not defined, so CPU is unlimited
                  memory: 48Gi  # Default is 2Gi
              env:
                - name: ES_JAVA_OPTS   # Default heap is 1Gi
                  value: -Xms16g -Xmx16g
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    # Define the data nodes
    - name: data
      count: 64
      config:
        node.master: false
        node.data: true
        node.ingest: false
        node.ml: false
        node.store.allow_mmap: false
        cluster.remote.connect: false
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 2Ti
            storageClassName: e9k-es-data
      podTemplate:
        metadata:
          labels:
            app: data
        spec:
          # Prefer spreading the data Pods across different physical nodes
          affinity:
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  podAffinityTerm:
                    labelSelector:
                      matchLabels:
                        elasticsearch.k8s.elastic.co/cluster-name: test
                    topologyKey: kubernetes.io/hostname
          containers:
            # Define resource requests and limits
            - name: elasticsearch
              resources:
                requests:
                  cpu: 2        # Not set by default
                  memory: 48Gi  # Default is 2Gi
                limits:
                  # cpu:        # Not defined, so CPU is unlimited
                  memory: 64Gi  # Default is 2Gi
              env:
                - name: ES_JAVA_OPTS   # Default heap is 1Gi
                  value: -Xms31g -Xmx31g
          initContainers:
            - name: sysctl
              securityContext:
                privileged: true
              command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: test
  namespace: test
spec:
  version: 7.6.0
  image: nexus.internal.test:8082/amd64/kibana:7.6.0   # My private registry address
  imagePullPolicy: IfNotPresent
  count: 1
  elasticsearchRef:        # Name of the Elasticsearch cluster to connect to
    name: "test"
  http:
    tls:
      selfSignedCertificate:
        disabled: true     # Access over plain HTTP
  podTemplate:
    spec:
      nodeSelector:
        "kubernetes.io/hostname": nf5270m4
      tolerations:
        - key: "node-role.kubernetes.io/node"
          operator: "Exists"
          effect: "PreferNoSchedule"
      containers:
        - name: kibana
          resources:
            requests:
              cpu: 1
              memory: 2Gi
            limits:
              memory: 64Gi
```
Note: for the official installation steps, refer to https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-quickstart.html
Configure service exposure
For service exposure, reuse the Traefik instance set up earlier. First edit the original Traefik manifest to map the new port directly onto the host (hostPort, which avoids extra NAT overhead):
```yaml
spec:
  template:
    spec:
      containers:
        - name: traefik
          ports:
            # Added port
            - name: elasticsearch
              containerPort: 9200
              hostPort: 9200
          ...
          args:
            # Added entryPoint
            - --entrypoints.elasticsearch.Address=:9200
```
Kibana simply reuses the existing web entryPoint on port 80 for access.
Edit 4-test-route.yaml
```yaml
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: test-kibana-route
  namespace: test
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`kibana`, `kibana.internal.pla95929`)
      kind: Rule
      services:
        - name: test-kb-http   # Backend Service name
          port: 5601           # Service port inside k8s
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: test-es-route
  namespace: test
spec:
  entryPoints:
    - elasticsearch
  routes:
    - match: Host(`es`, `es.internal.pla95929`)
      kind: Rule
      services:
        - name: test-es-http   # Backend Service name
          port: 9200           # Service port inside k8s
```
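Apply the routes and test access through Traefik. This assumes the hostnames resolve to the node(s) exposing the Traefik hostPorts, and that PASSWORD holds the elastic user's password retrieved earlier:

```bash
kubectl apply -f 4-test-route.yaml

# Kibana through the web (port 80) entryPoint
curl -I http://kibana.internal.pla95929/

# Elasticsearch through the dedicated 9200 entryPoint
# (note: ECK leaves TLS enabled on the Elasticsearch HTTP layer by default,
#  so the backend may expect HTTPS unless it is disabled as was done for Kibana)
curl -u "elastic:$PASSWORD" http://es.internal.pla95929:9200
```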
Note: the backend Service names and ports can be checked with the following command:
```bash
kubectl get svc -n test
```