1. Introduction
Elasticsearch introduced Cross Cluster Search (CCS) in version 5.3 to replace the tribe node, which is being deprecated. Like the tribe node, Cross Cluster Search is used to search data across clusters: it lets you run a single search request against one or more remote clusters. For example, you can use Cross Cluster Search to filter and analyze log data stored in clusters in different data centers.
Because a single query can span multiple clusters, this feature gives us more freedom in designing our architecture.
In distributed systems or microservices, the system is divided into multiple modules, and each module (A, B, C) is typically owned by a different team. Usually each team only queries its own module's logs, but when a problem involves several modules, the logs of all of them have to be searched together. Cross cluster query solves exactly this problem. In terms of architecture, we store the logs of module A, module B and module C in separate clusters.
In this way, mutual interference between clusters is avoided. A large volume of logs from module A only affects writes to cluster A, so the load is handled by divide and conquer.
2. Configure Cross Cluster Search
Suppose we have two ES clusters:
Node | Address | Port | Transport Port | Cluster |
---|---|---|---|---|
elasticsearch01 | 127.0.0.1 | 9200 | 9300 | America |
elasticsearch02 | 127.0.0.1 | 9201 | 9301 | America |
elasticsearch03 | 127.0.0.1 | 9202 | 9302 | Europe |
elasticsearch04 | 127.0.0.1 | 9203 | 9303 | Europe |
There are two ways to configure CCS:
1) Configure elasticsearch.yml
    search:
      remote:
        america:
          seeds:
            - 127.0.0.1:9300
            - 127.0.0.1:9301
        europe:
          seeds:
            - 127.0.0.1:9302
            - 127.0.0.1:9303
Note: with this method, the remote clusters must be running at configuration time. For example, when configuring the "america" cluster, the "europe" cluster needs to be running; otherwise the node cannot start successfully.
2) Configure using the Cluster Settings API
    curl -XPUT -H 'Content-Type: application/json' localhost:9200/_cluster/settings -d '
    {
      "persistent": {
        "search.remote": {
          "america": {
            "skip_unavailable": "true",
            "seeds": ["127.0.0.1:9300", "127.0.0.1:9301"]
          },
          "europe": {
            "skip_unavailable": "true",
            "seeds": ["127.0.0.1:9302", "127.0.0.1:9303"]
          }
        }
      }
    }'
Using the API is recommended, because the seeds and other settings of a remote cluster can then be changed conveniently.
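When several remote clusters have to be registered, the request body can be generated instead of typed by hand. The following is a minimal sketch: `ccs_register_body` is a hypothetical helper name, not part of Elasticsearch, and it assumes the `search.remote` prefix used in this article (newer Elasticsearch releases renamed it to `cluster.remote`).

```shell
#!/bin/sh
# Hypothetical helper (not part of Elasticsearch): print the JSON body that
# registers one remote cluster under the "search.remote" prefix used above.
# Usage: ccs_register_body ALIAS SEED [SEED...]
ccs_register_body() {
  alias_name=$1
  shift
  seeds=""
  for seed in "$@"; do
    # Join the seeds as a comma-separated list of JSON strings.
    seeds="${seeds:+$seeds,}\"$seed\""
  done
  printf '{"persistent":{"search.remote":{"%s":{"skip_unavailable":"true","seeds":[%s]}}}}\n' \
    "$alias_name" "$seeds"
}

# Example against a running node (commented out so the sketch has no side effects):
# ccs_register_body europe 127.0.0.1:9302 127.0.0.1:9303 |
#   curl -XPUT -H 'Content-Type: application/json' \
#     localhost:9200/_cluster/settings -d @-
```

Printing the body separately from sending it makes it easy to inspect the JSON before it reaches the cluster.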
3. Validate Cross Cluster Search
1) Use _remote/info to view the CCS connection status:
    [root@localhost elasticsearch01]# curl -XGET -H 'Content-Type: application/json' localhost:9201/_remote/info?pretty
    {
      "america" : {
        "seeds" : [ "127.0.0.1:9300", "127.0.0.1:9301" ],
        "http_addresses" : [ "127.0.0.1:9200", "127.0.0.1:9201" ],
        "connected" : true,
        "num_nodes_connected" : 2,
        "max_connections_per_cluster" : 3,
        "initial_connect_timeout" : "30s"
      },
      "europe" : {
        "seeds" : [ "127.0.0.1:9302", "127.0.0.1:9303" ],
        "http_addresses" : [ "127.0.0.1:9202", "127.0.0.1:9203" ],
        "connected" : true,
        "num_nodes_connected" : 2,
        "max_connections_per_cluster" : 3,
        "initial_connect_timeout" : "30s"
      }
    }
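The connection check can also be scripted, for example in a monitoring job. The sketch below assumes the `"connected"` field layout shown in the `_remote/info` output; `ccs_count_disconnected` is a hypothetical helper name, and a real JSON parser would be more robust than this text match.

```shell
#!/bin/sh
# Hypothetical helper: read a _remote/info response on stdin and print how
# many remote clusters report "connected": false.
ccs_count_disconnected() {
  # gsub returns the number of matches on each line; replacing with "&"
  # leaves the text unchanged, so this only counts.
  awk '{ n += gsub(/"connected"[ \t]*:[ \t]*false/, "&") } END { print n + 0 }'
}

# Example against a running node (commented out):
# curl -s 'localhost:9201/_remote/info' | ccs_count_disconnected
```

A result greater than zero means at least one configured remote cluster is currently unreachable.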
2) Use cross cluster search:
Query the data of two clusters at the same time:
curl -H 'Content-Type:application/json' -XGET 'http://localhost:9200/cluster_name:index,cluster_name:index/_search'
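Query the data of all remote clusters at the same time (a wildcard in the cluster alias matches every configured remote cluster; an index name without a cluster prefix searches only the local cluster):
curl -H 'Content-Type:application/json' -XGET 'http://localhost:9200/*:test/_search'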
Example: querying a single remote cluster
Cluster: america
Index: test
curl -H 'Content-Type:application/json' -XGET 'http://localhost:9200/america:test/_search'
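The `cluster_alias:index` syntax above can be assembled programmatically when the list of clusters varies. This is a minimal sketch; `ccs_index_expr` is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the index part of a cross cluster search URL.
# Usage: ccs_index_expr INDEX ALIAS [ALIAS...]
ccs_index_expr() {
  index=$1
  shift
  expr_out=""
  for cluster in "$@"; do
    # Each entry has the form alias:index; entries are comma-separated.
    expr_out="${expr_out:+$expr_out,}${cluster}:${index}"
  done
  printf '%s\n' "$expr_out"
}

# Example against a running node (commented out):
# curl -H 'Content-Type:application/json' -XGET \
#   "http://localhost:9200/$(ccs_index_expr test america europe)/_search"
```

For the clusters in this article, `ccs_index_expr test america europe` prints `america:test,europe:test`.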
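Java API example:
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Requests;
// Query all remote clusters for indices starting with "appIndex-" (es is the author's own client wrapper)
SearchRequest searchRequest = Requests.searchRequest("*:appIndex-*");
SearchResponse response = es.getClient().search(searchRequest).get();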
4. Disable Cross Cluster Search
Set the remote cluster settings to null through the API:
    curl -XPUT -H 'Content-Type: application/json' localhost:9201/_cluster/settings -d '
    {
      "persistent": {
        "search.remote": {
          "america": {
            "skip_unavailable": null,
            "seeds": null
          },
          "europe": {
            "skip_unavailable": null,
            "seeds": null
          }
        }
      }
    }'
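The removal request follows the same pattern for every alias, so its body can be generated as well. The sketch below uses a hypothetical helper name, `ccs_remove_body`, and assumes the same `search.remote` prefix as the rest of this article.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON body that clears the remote cluster
# settings for every alias given as an argument.
# Usage: ccs_remove_body ALIAS [ALIAS...]
ccs_remove_body() {
  entries=""
  for alias_name in "$@"; do
    # Setting a persistent value to null removes it.
    entries="${entries:+$entries,}\"$alias_name\":{\"skip_unavailable\":null,\"seeds\":null}"
  done
  printf '{"persistent":{"search.remote":{%s}}}\n' "$entries"
}

# Example against a running node (commented out):
# ccs_remove_body america europe |
#   curl -XPUT -H 'Content-Type: application/json' \
#     localhost:9201/_cluster/settings -d @-
```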
5. Configuration of CCS
search.remote.${cluster_alias}.skip_unavailable: whether a remote cluster that cannot be reached is skipped during a query. The default is false; setting it to true is recommended.
search.remote.connect: the default is true, meaning that any node can connect to remote clusters and act as a cross cluster client. Cross cluster search requests must be sent to a node that acts as a cross cluster client.
search.remote.node.attr: a node attribute used to select the eligible nodes in the remote cluster. For example, with search.remote.node.attr: gateway, only remote nodes configured with node.attr.gateway: true are connected to and used for CCS queries.