Distributed ELK platform

Keywords: curl ElasticSearch yum Database

What is ELK?

  • ELK is a complete log-management solution; the name is an acronym of three software products
  • ELK stands for
    • Elasticsearch: responsible for log retrieval and storage
    • Logstash: responsible for the collection, analysis and processing of logs
    • Kibana: responsible for the visualization of logs
  • All three are open-source products, usually used together, and all of them now belong to Elastic.co, hence the short name ELK

The role of ELK

The ELK components solve common problems in the operation and maintenance of massive log systems:

  • Centralized query and management of distributed log data
  • System monitoring, including monitoring of system hardware and application components
  • Troubleshooting
  • Security information and incident management
  • Report function

Elasticsearch

  • Elasticsearch is a Lucene-based search server. It provides a distributed, multi-user full-text search engine with a RESTful web API
  • Elasticsearch is developed in Java and released as open source under the Apache license. It is a popular enterprise search engine designed for cloud environments: it offers real-time search and is stable, reliable, fast, and easy to install and use

Main features

  • Real-time analysis
  • Distributed real-time document storage, with every field indexed
  • Document oriented: all objects are documents
  • High availability, easy to scale, cluster, Shards and replicas support
  • Interface friendly, JSON supported
  • No traditional transaction support in Elasticsearch
  • Elasticsearch is a document oriented database
  • Elasticsearch does not provide authorization and authentication features

Related concepts

It is a non-relational (NoSQL) database. Its core concepts are listed below (a URL-addressing sketch follows the list):

  • Node: a single server running an Elasticsearch instance
  • Cluster: a cluster composed of multiple nodes
  • Document: a searchable basic information unit
  • Index: a collection of documents with similar characteristics
  • Type: one or more types can be defined in an index
  • Field: the smallest unit in ES, comparable to a column of data
  • Shards: index shards; an index is split into multiple shards, each holding part of the data
  • Replicas: copies of the index shards
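
To make the addressing of these concepts concrete, a single document is reached by the path /<index>/<type>/<id>. The line below is an illustrative sketch only, using the school/teacher sample names that appear later in this article and the address of a node from the cluster deployed below:

curl http://192.168.1.51:9200/school/teacher/1?pretty   #<index>/<type>/<id> addresses one document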

ES cluster installation

  • Required packages
elasticsearch-2.3.4.rpm    
kibana-4.5.2-1.x86_64.rpm
filebeat-1.2.3-x86_64.rpm  
logstash-2.3.4-1.noarch.rpm
[root@yum ~] Upload files to yum source
[root@yum ~] createrepo --update /var/ftp/localrepo   #Update yum source
  • Cluster architecture
Hostname   IP              Role
es1        192.168.1.51    Distributed database
es2        192.168.1.52    Distributed database
es3        192.168.1.53    Distributed database
es4        192.168.1.54    Distributed database
es5        192.168.1.55    Distributed database
kibana     192.168.1.56    Log visualization
logstash   192.168.1.57    Log collection, analysis and processing
yum        192.168.1.252   Provides the yum repository

1) Installation method

  • Set hostname resolution
  • Resolve dependencies and install the packages
  • Modify the configuration file
  • Start the service
  • Verify the service
[root@es1 ~] vim /etc/hosts
192.168.1.51 es1
[root@es1 ~] yum -y install java-1.8.0-openjdk
[root@es1 ~] yum -y install elasticsearch
[root@es1 ~] vim /etc/elasticsearch/elasticsearch.yml
 54 network.host: 0.0.0.0
[root@es1 ~] systemctl start elasticsearch
[root@es1 ~] systemctl enable elasticsearch
[root@es1 ~] ss -lntup
[root@es1 ~] curl 192.168.1.51:9200
{
  "name" : "es1",
  "cluster_name" : "elk",
  "version" : {
    "number" : "2.3.4",
    "build_hash" : "e455fd0c13dceca8dbbdbb1665d068ae55dabe3f",
    "build_timestamp" : "2016-06-30T11:24:31Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}

2) Using Ansible to deploy the cluster

  • A playbook is used to deploy all nodes in the experiment
  • When deploying the cluster, the configuration file needs the additional settings below; the other steps are the same as the single-node method above
vim /etc/elasticsearch/elasticsearch.yml
 17 cluster.name: elk	#Cluster name; must be the same on all servers
 23 node.name: {{ ansible_hostname }}  
 # The playbook's template module fills in each host's name; it must match the entries in /etc/hosts
 54 network.host: 0.0.0.0
 68 discovery.zen.ping.unicast.hosts: ["es1", "es2", "es3", "es4", "es5"]  #Lists hosts that belong to the cluster
  • All nodes in the cluster must be able to ping each other
  • Configure the hostname-to-IP mapping in /etc/hosts on every machine in the cluster
  • A Java environment must be installed on every machine in the cluster
  • cluster.name must be exactly the same on every node
  • node.name identifies the current node and should be set to the machine's own hostname
  • discovery.zen.ping.unicast.hosts lists cluster member machines; it does not need to include every node
  • Start the service on all nodes after configuration

Verification:

curl 192.168.1.51:9200/_cluster/health?pretty
{
  "cluster_name" : "elk",
  "status" : "green",	#Cluster status: green is healthy, yellow means some replica shards are unassigned, red means primary shards are missing
  "timed_out" : false,
  "number_of_nodes" : 5,	#Number of cluster nodes
  "number_of_data_nodes" : 5,	#Number of cluster data nodes
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
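
Besides the cluster-health query, a quick per-node sanity check can confirm that every node is up and has joined under the same cluster name. This is a sketch that assumes the five data-node IPs from the architecture table above:

for ip in 192.168.1.51 192.168.1.52 192.168.1.53 192.168.1.54 192.168.1.55; do
    curl -s http://$ip:9200/ | grep -E '"name"|"cluster_name"'   #Each node should report its own name and the cluster name "elk"
done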

Use of ES plug-in

1) Common plug-ins

head plug-in

  • It shows the topology of ES cluster, and it can be used for index and Node level operations
  • It provides a set of query APIs for the cluster and returns results in JSON and table form
  • It provides some shortcut menus to show the various states of the cluster

kopf plug-in

  • A management tool for Elasticsearch
  • It provides an API for ES cluster operation

bigdesk plug-in

  • A cluster monitoring tool for Elasticsearch
  • It can be used to view various states of the ES cluster, such as CPU and memory usage, index data, searches, HTTP connections, etc.

2) ES plug-in installation and viewing

Install on one node of the cluster

[root@es5 ~] ln -s /usr/share/elasticsearch/bin/plugin /usr/local/bin/   #Create a symbolic link so the plugin command is on the PATH
[root@es5 ~] plugin install <plugin URL>	#Install a plug-in
[root@es5 ~] plugin list   #List installed plug-ins
[root@es5 ~] plugin remove <plugin name>  #Remove a plug-in

Deploy ftp on the yum server to provide plug-ins

[root@yum ~] ls /var/ftp/elk/    #Providing ftp services
elasticsearch-head-master.zip  bigdesk-master.zip
elasticsearch-kopf-master.zip

Install the plug-in:

[root@es5 ~] plugin install ftp://192.168.1.252/elk/elasticsearch-kopf-master.zip
[root@es5 ~] plugin install ftp://192.168.1.252/elk/bigdesk-master.zip
[root@es5 ~] plugin install ftp://192.168.1.252/elk/elasticsearch-head-master.zip

#################################################
#When the plug-in file is local:
[root@es5 ~] plugin install file:///root/bigdesk-master.zip   #file:// URL for a local package
  • When installing a plug-in, you must specify the URL address of the plug-in

3) Using plug-ins

Access with browser:

http://192.168.1.55:9200/_plugin/head
http://192.168.1.55:9200/_plugin/kopf
http://192.168.1.55:9200/_plugin/bigdesk

4) Extension RESTful API

HTTP protocol introduction

  • An HTTP request consists of three parts:
    the request line, the message headers, and the request body
  • The request line starts with a method name, followed by the request URI and the protocol version, separated by spaces. The format is as follows (see the curl -v sketch after this list):
    Method Request-URI HTTP-Version CRLF
  • HTTP request methods
    • Common methods: GET, POST, HEAD
    • Other methods: OPTIONS, PUT, DELETE, TRACE and CONNECT
  • Methods commonly used by ES
    • PUT: create
    • DELETE: delete
    • POST: update
    • GET: query
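
A quick way to see the request line described above is curl's verbose mode. The following is an illustrative sketch against one of the ES nodes from the examples above; the exact headers vary with the curl version, and the output shown is abridged:

curl -v -o /dev/null http://192.168.1.51:9200/ 2>&1 | grep '^>'
> GET / HTTP/1.1        #The request line: Method Request-URI HTTP-Version
> Host: 192.168.1.51:9200
> Accept: */*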

curl introduction:

  • In Linux, curl is a command-line file transfer tool that works with URL syntax. It is a very powerful HTTP command-line tool: it supports multiple request methods, custom request headers and many other features
  • Common parameters (a combined example follows this list):
    • -A: set the request User-Agent
    • -X: specify the request method
    • -i: include the response headers in the output
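
A small illustrative sketch that combines the three options against one of the ES nodes above (the agent string "elk-admin" is arbitrary):

curl -i -X GET -A "elk-admin" http://192.168.1.51:9200/_cat/health?v
#-i prints the status line and response headers before the body
#-X GET sets the request method explicitly; -A sends "elk-admin" as the User-Agent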

RESTful API call

Elasticsearch provides a series of RESTful APIs to:

  • Check the health, status, and statistics of clusters, nodes, and indexes
  • Manage data and metadata of cluster, node and index
  • CRUD and query index
  • Perform other advanced operations such as paging, sorting, filtering, etc

Use json format for POST or PUT data

curl 192.168.1.52:9200/_cat/
curl 192.168.1.52:9200/_cat/health?v
curl 192.168.1.52:9200/_cat/master 	#View the master node
curl 192.168.1.52:9200/_cat/master?v 	#?v: display details
curl 192.168.1.52:9200/_cat/master?help 	#?help: show help information
curl 192.168.1.52:9200/_cat/nodes	#View nodes
curl 192.168.1.52:9200/_cat/indices	 	#View indexes
curl 192.168.1.52:9200/_cat/shards		#View shards

1. Create (PUT):

curl -X PUT 192.168.1.52:9200/school -d '   
> {"settings": {
>   "index": {
>     "number_of_shards": 5,  
>     "number_of_replicas": 1
>   }
> }
> }'
# Create a new index named school, with 5 shards and 1 replica
  • Call format: server address/index/type/id
curl -X PUT http://192.168.1.51:9200/school/teacher/1 -d '   
{
  "Full name": "dc",
  "hobby": "Hot head",
  "stage": "1.0"
}'
# Add a document under the 'school' index: 'teacher' is the type and '1' is the document id

2. Update (POST)

  • To modify data, you must use the _update keyword
  • Call format: server address/index/type/id/_update
curl -X POST http://192.168.1.52:9200/school/teacher/1/_update -d '
{"doc":{"hobby": "stroke a cat"}}'
# doc is fixed format

3. Query (GET)

  • To query multiple documents at once, use the _mget keyword with a JSON body (a sketch follows the single-document example below)
curl -X GET http://192.168.1.53:9200/school/teacher/1?pretty  
# -X GET can be omitted because GET is the default method; ?pretty formats the output for readability
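
A minimal _mget sketch, assuming documents with ids 1 and 2 exist under the school/teacher example above:

curl -X GET http://192.168.1.53:9200/school/teacher/_mget?pretty -d '
{
  "ids": ["1", "2"]
}'
# The response contains a "docs" array with one entry per requested id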

4. Delete (DELETE)

  • Documents can be deleted, indexes can be deleted, but types cannot be deleted
curl -X DELETE http://192.168.1.53:9200/school/teacher/1
curl -X DELETE http://192.168.1.53:9200/school   #Delete index
curl -X DELETE http://192.168.1.53:9200/*   #Delete all

Import data:

  • Call _bulk to import data in batches
  • The method is POST, the data is in JSON format, and the file is passed with --data-binary

Import the json file (each record in it already carries its target index information):

gzip -d logs.jsonl.gz
curl -X POST 192.168.1.51:9200/_bulk --data-binary @logs.jsonl
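
After the import completes, the _cat interface shown earlier gives a quick way to confirm the new indexes and their document counts:

curl 192.168.1.51:9200/_cat/indices?v   #Each imported index should appear with its docs.count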