Elasticsearch concept and query syntax

Keywords: ElasticSearch search engine

Elasticsearch (ES) is an open source search engine written in Java. It uses Lucene internally for indexing and searching. By wrapping Lucene, it hides Lucene's complexity and exposes a simple, consistent RESTful API instead.

However, Elasticsearch is more than Lucene, and more than just a full-text search engine.

It can be described more accurately as follows:

A distributed real-time document store in which every field can be indexed and searched.
A distributed search engine with real-time analytics.
Capable of scaling out to hundreds of server nodes and handling petabytes (PB) of structured or unstructured data.

The official website describes Elasticsearch as a distributed, scalable, near-real-time search and data analysis engine.

1. ES query string search

1.1 View the index list

http://xx.xx.xxx.xx:9200/_cat/indices

1.2 View all data in an index

For example, to view all data in indicator_ip:

http://xx.xx.xxx.xx:9200/indicator_ip/_search?q=*

1.3 pretty: pretty-print the JSON output

http://xx.xx.xxx.xx:9200/indicator_ip/_search?q=*&pretty

1.4 sort: sort by a specified field

For example, to sort the data in indicator_ip by modified, newest first:

http://xx.xx.xxx.xx:9200/indicator_ip/_search?sort=modified:desc&pretty

1.5 *value: wildcard search for data whose field matches a pattern

For example, to find all data in indicator_ip whose values field matches *61.161, where the leading * stands for any run of characters before 61.161:

http://xx.xx.xxx.xx:9200/indicator_ip/_search?q=values:*61.161&pretty

1.6 _source: return only the specified fields

For example, to return only the modified field of indicator_ip:

http://xx.xx.xxx.xx:9200/indicator_ip/_search?_source=modified&pretty
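
The parameters shown so far can be combined in one request. As a rough sketch, the URL can be assembled with Python's standard library; the host localhost and the helper name search_url are assumptions for illustration and do not come from the setup described here.

```python
from urllib.parse import urlencode, quote

def search_url(host, index, **params):
    """Build a URI-search URL for the given index (host and port 9200 are assumed)."""
    # Keep '*' and ':' readable; other unsafe characters are percent-encoded.
    query = urlencode(params, safe="*:", quote_via=quote)
    return f"http://{host}:9200/{index}/_search?{query}"

# Sort by modified (newest first), return only the modified field, pretty-print.
url = search_url("localhost", "indicator_ip",
                 sort="modified:desc", _source="modified", pretty="true")
print(url)
```

Here pretty=true is used instead of the bare pretty flag; Elasticsearch accepts both forms.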

1.7 Combined query

For example, to query data in indicator_ip where threat_level is 1 and threat_types is 9:

http://xx.xx.xxx.xx:9200/indicator_ip/_search?q=threat_level:1 AND threat_types:9&pretty
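
A browser encodes the spaces in this URL automatically, but a script or command-line client usually has to do so itself. A minimal sketch with Python's standard library, assuming a host of localhost:

```python
from urllib.parse import quote

# Query-string expression: both conditions must hold.
expression = "threat_level:1 AND threat_types:9"

# Percent-encode the expression; ':' is kept as-is for readability.
encoded = quote(expression, safe=":")
url = f"http://localhost:9200/indicator_ip/_search?q={encoded}&pretty"
print(url)
```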

1.8 Range query

For example, to query data in indicator_ip whose threat_level is between 1 and 5, excluding 1 and 5 (curly braces are exclusive bounds; square brackets, as in [1 TO 5], are inclusive):

http://xx.xx.xxx.xx:9200/indicator_ip/_search?q=threat_level:{1 TO 5}&pretty
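
The bracket style decides whether the bounds are included. The hypothetical helper below (range_expr is an illustrative name, not an Elasticsearch API) builds both forms and shows the percent-encoded version that a non-browser client would send:

```python
from urllib.parse import quote

def range_expr(field, low, high, inclusive=False):
    """Return a query-string range: [low TO high] is inclusive, {low TO high} exclusive."""
    left, right = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{left}{low} TO {high}{right}"

exclusive = range_expr("threat_level", 1, 5)
inclusive = range_expr("threat_level", 1, 5, inclusive=True)
print(exclusive)
print(inclusive)

# Braces and spaces must be percent-encoded outside a browser.
print(quote(exclusive, safe=":"))
```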

2. ES structured search

Recommended Firefox extension for ES: Elasticvue
Recommended Chrome extension for ES: ElasticSearch Head

2.1 term filtering

term is mainly used for exact matches on values such as numbers, dates, Booleans, or not_analyzed strings (text that is stored without analysis; keyword fields in current Elasticsearch versions):

{ "term": { "age": 26 }}
{ "term": { "date": "2014-09-01" }}
{ "term": { "public": true }}
{ "term": { "tag": "full_text" }}

For example, to query data in indicator_ip whose threat_level is 1:

{ 
  "query": { 
    "term": { 
      "threat_level": 1 
    } 
  } 
}
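
Query bodies like this one are sent as JSON to the _search endpoint of the index. The sketch below wraps a term query on threat_level in a request object using only Python's standard library; the host localhost and the helper name search_request are assumptions, and the request is built but not sent.

```python
import json
from urllib.request import Request

def search_request(host, index, body):
    """Wrap a query body in a POST request to the _search endpoint (not sent here)."""
    return Request(
        url=f"http://{host}:9200/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

body = {"query": {"term": {"threat_level": 1}}}
request = search_request("localhost", "indicator_ip", body)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it, pass the request to urllib.request.urlopen against a reachable cluster.
```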

2.2 terms filtering

terms is similar to term, but it lets you specify multiple values to match. A document matches if the field contains any of the listed values.
Template:

{ 
  "terms": { 
    "tag": [ "search", "full_text", "nosql" ] 
  } 
}

For example, to query data in indicator_ip whose threat_level is 1, 3, or 5:

{ 
  "query": { 
    "terms": { 
      "threat_level": [1,3,5] 
    } 
  } 
}
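
To make the "any of the listed values" behavior concrete, here is a small local model in Python. It only imitates the matching rule on in-memory dictionaries; the sample documents and the function name matches_terms are invented for illustration.

```python
def matches_terms(doc, field, values):
    """Local model of terms: true if the document's field holds any listed value."""
    actual = doc.get(field)
    # A field may hold a single value or a list of values.
    actual_values = actual if isinstance(actual, list) else [actual]
    return any(v in values for v in actual_values)

docs = [
    {"id": "a", "threat_level": 1},
    {"id": "b", "threat_level": 2},
    {"id": "c", "threat_level": 5},
]
hits = [d["id"] for d in docs if matches_terms(d, "threat_level", [1, 3, 5])]
print(hits)
```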

2.3 range filtering

Range filtering finds data whose field value falls within a specified range.
The range operators are:
gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to

For example, to query data in the indicator_ip index whose modified value is between 0:00 on November 17, 2021 (UTC) and the current time:

{
	"query": {
		"range": {
			"modified": {
			    "gt":"2021-11-17T00:00:00.000Z",
			    "lt":"now"
			}
		}
	},
	"size": 10,
	"from": 0,
	"sort": []
}
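
When the lower bound comes from application code rather than a hand-written string, the timestamp has to be formatted the same way as in the query above. A minimal Python sketch, where es_timestamp and modified_since are hypothetical helper names:

```python
from datetime import datetime, timezone

def es_timestamp(moment):
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")

def modified_since(start, size=10):
    """Build a range query for documents modified after `start` and before now."""
    return {
        "query": {"range": {"modified": {"gt": es_timestamp(start), "lt": "now"}}},
        "size": size,
        "from": 0,
    }

body = modified_since(datetime(2021, 11, 17, tzinfo=timezone.utc))
print(body["query"]["range"]["modified"])
```

Swapping gt for gte in the helper would include documents modified exactly at the start time.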

2.4 bool filtering and compound queries: must, should, must_not, filter

bool filtering combines multiple query conditions with Boolean logic. It supports the following operators:
must: every clause must match; equivalent to AND.
must_not: no clause may match; equivalent to NOT.
should: at least one clause should match; equivalent to OR.
filter: matches in the same way as must, but must contributes to the relevance score while filter only filters, so filter performs better.
bool: a bool can contain must, should, must_not, and filter at its next level, and these can in turn contain nested bool clauses. A bool matches a document when the clauses at its next level are satisfied.

For example, to query data in the indicator_ip index where threat_level is 1, modified is between November 15, 2021 and the current time, and either confidence is 80 or credit_level is 1:

{
	"query": {
		"bool": {
			"must": [
				{
					"term": {
						"threat_level": 1
					}
				},
				{
					"range": {
						"modified": {
							"gt": "2021-11-15T06:27:11.000Z",
							"lt": "now"
						}
					}
				},
				{
					"bool": {
						"should": [
							{
								"term": {
									"confidence": 80
								}
							},
							{
								"term": {
									"credit_level": 1
								}
							}
						]
					}
				}
			]
		}
	},
	"size": 10,
	"from": 0,
	"sort": []
}
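
Deeply nested bodies like this are easy to get wrong by hand. One option is to compose them from small helpers, as in the Python sketch below; term, date_range, and bool_query are illustrative names defined here, not functions of any Elasticsearch client.

```python
import json

def term(field, value):
    """Exact-match clause on one field."""
    return {"term": {field: value}}

def date_range(field, **bounds):
    """Range clause; bounds are operators such as gt, gte, lt, lte."""
    return {"range": {field: bounds}}

def bool_query(**clauses):
    """bool clause; keys are must, should, must_not, filter, each a list of queries."""
    return {"bool": clauses}

query = bool_query(must=[
    term("threat_level", 1),
    date_range("modified", gt="2021-11-15T06:27:11.000Z", lt="now"),
    # Nested bool: at least one of the two should clauses has to match.
    bool_query(should=[term("confidence", 80), term("credit_level", 1)]),
])
body = {"query": query, "size": 10, "from": 0, "sort": []}
print(json.dumps(body, indent=2))
```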

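Note that should does not act as a required condition when it sits next to must or filter in the same bool. In that case the should clauses become optional and only influence the score, because minimum_should_match defaults to 0. To require at least one should clause, nest the should in its own bool inside must, as in the example above, or set minimum_should_match to 1.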
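
The effect can be reproduced with a simplified local model. The Python sketch below is not Elasticsearch; it handles only term clauses on in-memory dictionaries and assumes the documented default for minimum_should_match (1 when a bool has should but no must or filter, otherwise 0).

```python
def term_matches(clause, doc):
    """True if the document's field equals the value of a term clause."""
    (field, value), = clause["term"].items()
    return doc.get(field) == value

def bool_matches(bool_body, doc):
    """Simplified model of bool matching for term clauses only."""
    must = bool_body.get("must", [])
    filt = bool_body.get("filter", [])
    should = bool_body.get("should", [])
    must_not = bool_body.get("must_not", [])
    # should is optional by default when must or filter is present.
    default_msm = 0 if (must or filt) else 1
    msm = bool_body.get("minimum_should_match", default_msm)
    if not all(term_matches(c, doc) for c in must + filt):
        return False
    if any(term_matches(c, doc) for c in must_not):
        return False
    if not should:
        return True
    return sum(term_matches(c, doc) for c in should) >= msm

doc = {"threat_level": 1, "confidence": 50, "credit_level": 3}
flat = {
    "must": [{"term": {"threat_level": 1}}],
    "should": [{"term": {"confidence": 80}}, {"term": {"credit_level": 1}}],
}
strict = dict(flat, minimum_should_match=1)
print(bool_matches(flat, doc))    # should is ignored for matching
print(bool_matches(strict, doc))  # at least one should clause is required
```

The document satisfies must but neither should clause, so it matches the flat query and is rejected once minimum_should_match is set to 1.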

Posted by sane993 on Sun, 28 Nov 2021 21:27:50 -0800