Elasticsearch is composed of a number of modules, each responsible for part of its functionality. These modules have two types of settings:
Static settings: These must be configured in the config file (elasticsearch.yml) before Elasticsearch is started. To change them, you need to update every affected node in the cluster.
Dynamic settings: These can be changed on a live, running Elasticsearch cluster.
We will discuss the different modules of Elasticsearch in the following sections of this chapter.
Cluster-level settings determine how shards are allocated to nodes and how shards are redistributed to rebalance the cluster. The following settings control shard allocation and rebalancing.
Settings | Possible values | Description |
---|---|---|
cluster.routing.allocation.enable | all, primaries, new_primaries, none | all (default): allows allocation for all kinds of shards. primaries: allows allocation only for primary shards. new_primaries: allows allocation only for the primary shards of new indices. none: does not allow any shard allocation. |
cluster.routing.allocation.node_concurrent_recoveries | Numeric value (default is 2) | Limits the number of concurrent shard recoveries on a node. |
cluster.routing.allocation.node_initial_primaries_recoveries | Numeric value (default is 4) | Limits the number of parallel initial primary recoveries on a node. |
cluster.routing.allocation.same_shard.host | Boolean value (default is false) | When true, prevents multiple copies of the same shard from being allocated to the same physical host. |
indices.recovery.concurrent_streams | Numeric value (default is 3) | Controls the number of network streams each node opens when recovering shards from peers. |
indices.recovery.concurrent_small_file_streams | Numeric value (default is 2) | Controls the number of streams each node opens for small files (under 5 MB) during shard recovery. |
cluster.routing.rebalance.enable | all, primaries, replicas, none | all (default): allows rebalancing for all kinds of shards. primaries: allows rebalancing only for primary shards. replicas: allows rebalancing only for replica shards. none: does not allow any shard rebalancing. |
cluster.routing.allocation.allow_rebalance | always, indices_primaries_active, indices_all_active | always (default): always allows rebalancing. indices_primaries_active: allows rebalancing only when all primary shards in the cluster are allocated. indices_all_active: allows rebalancing only when all primary and replica shards are allocated. |
cluster.routing.allocation.cluster_concurrent_rebalance | Numeric value (default is 2) | Limits the number of concurrent shard rebalancing operations in the cluster. |
cluster.routing.allocation.balance.shard | Floating-point value (default is 0.45f) | Defines the weight factor for the number of shards allocated on each node. |
cluster.routing.allocation.balance.index | Floating-point value (default is 0.55f) | Defines the weight factor for the number of shards per index allocated on a specific node. |
cluster.routing.allocation.balance.threshold | Non-negative floating-point value (default is 1.0f) | The minimum optimization value a rebalancing operation must achieve in order to be performed. |
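As an illustration of how a dynamic setting such as cluster.routing.allocation.enable might be changed on a running cluster, the sketch below builds the JSON body for Elasticsearch's cluster update settings endpoint (`PUT /_cluster/settings`). The helper name `cluster_settings_body` is hypothetical, and the code only constructs the payload; sending it is left to whichever HTTP client you use.

```python
import json


def cluster_settings_body(settings, persistent=True):
    """Build the JSON body for PUT /_cluster/settings.

    `persistent` settings survive a full cluster restart;
    `transient` ones do not. (Hypothetical helper for illustration.)
    """
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: settings}, sort_keys=True)


# Restrict allocation to primary shards only, e.g. before maintenance.
body = cluster_settings_body({"cluster.routing.allocation.enable": "primaries"})
print(body)
```

Setting the value back to `all` with the same kind of request would re-enable allocation for every shard type.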
Settings | Possible values | Description |
---|---|---|
cluster.routing.allocation.disk.threshold_enabled | Boolean value (default is true) | Enables or disables the disk-based allocation decision process. |
cluster.routing.allocation.disk.watermark.low | String value (default is 85%) | The disk usage above which no further shards are allocated to that disk. |
cluster.routing.allocation.disk.watermark.high | String value (default is 90%) | The maximum disk usage tolerated during allocation; once this point is reached, Elasticsearch relocates shards to another disk. |
cluster.info.update.interval | String value (default is 30s) | The interval between two disk usage checks. |
cluster.routing.allocation.disk.include_relocations | Boolean value (default is true) | Determines whether shards that are currently being relocated are taken into account when calculating disk usage. |
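The two watermarks in the table can be read as a simple three-way decision. The following is a minimal sketch of that logic, not Elasticsearch's actual allocation decider: the function names and the returned labels are assumptions made for illustration, and only percentage-style watermark values are handled.

```python
def parse_percent(value):
    """Convert a watermark string such as '85%' to a fraction (0.85)."""
    return float(value.rstrip("%")) / 100.0


def disk_decision(used_fraction, low="85%", high="90%"):
    """Classify a disk's usage against the low and high watermarks.

    Returns one of three illustrative labels (not Elasticsearch terms):
      'ok'            - below the low watermark, shards may be allocated
      'no_new_shards' - above low, no further shards are allocated here
      'relocate'      - above high, shards should move elsewhere
    """
    if used_fraction >= parse_percent(high):
        return "relocate"
    if used_fraction >= parse_percent(low):
        return "no_new_shards"
    return "ok"


for usage in (0.50, 0.87, 0.95):
    print(usage, disk_decision(usage))
```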
The discovery module helps the cluster discover and maintain the state of all of its nodes. The cluster state changes whenever a node is added to or removed from the cluster. The cluster name setting is used to create a logical separation between different clusters. Several discovery modules are available, some of which use the APIs provided by cloud service providers:
Azure discovery
EC2 discovery
Google Compute Engine discovery
Zen discovery
The gateway module preserves the cluster state and shard data across full cluster restarts. The following are the static settings of this module:
Settings | Possible values | Description |
---|---|---|
gateway.expected_nodes | Numeric value (default is 0) | The number of nodes expected to be in the cluster before local shards are recovered. |
gateway.expected_master_nodes | Numeric value (default is 0) | The number of master nodes expected to be in the cluster before recovery starts. |
gateway.expected_data_nodes | Numeric value (default is 0) | The number of data nodes expected to be in the cluster before recovery starts. |
gateway.recover_after_time | String value (default is 5m) | The time the recovery process waits before starting, regardless of how many nodes have joined the cluster. |
gateway.recover_after_nodes | Numeric value | The number of nodes that must have joined the cluster before recovery can start. |
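The interplay between gateway.expected_nodes and gateway.recover_after_time can be sketched as follows. This is a deliberately simplified model under stated assumptions, not the gateway's real implementation: the function name is hypothetical, the default of 300 seconds mirrors the 5m default above, and settings such as the master-node and data-node variants are ignored.

```python
def should_start_recovery(joined_nodes, expected_nodes,
                          waited_seconds, recover_after_seconds=300):
    """Decide whether local shard recovery may begin (simplified model).

    Recovery starts as soon as the expected number of nodes has joined;
    otherwise it starts once the recover_after_time timeout has elapsed,
    regardless of how many nodes are present.
    """
    if joined_nodes >= expected_nodes:
        return True
    return waited_seconds >= recover_after_seconds


print(should_start_recovery(3, 3, waited_seconds=10))   # all expected nodes joined
print(should_start_recovery(2, 3, waited_seconds=10))   # still waiting
print(should_start_recovery(2, 3, waited_seconds=300))  # timeout reached
```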
The HTTP module manages communication between HTTP clients and the Elasticsearch API. This module can be disabled by setting http.enabled to false.
The following settings control this module (configured in elasticsearch.yml):
Serial number | Setting | Description |
---|---|---|
1 | http.port | The port used to access Elasticsearch; it is taken from the range 9200-9300. |
2 | http.publish_port | The port that HTTP clients should use to reach the node; useful when the node sits behind a firewall. |
3 | http.bind_host | The host address that the HTTP service binds to. |
4 | http.publish_host | The host address that HTTP clients should use to reach the node. |
5 | http.max_content_length | The maximum size of the content of an HTTP request. The default value is 100mb. |
6 | http.max_initial_line_length | The maximum size of the URL. The default value is 4kb. |
7 | http.max_header_size | The maximum size of the HTTP headers. The default value is 8kb. |
8 | http.compression | Enables or disables support for compression. The default value is false. |
9 | http.pipelining | Enables or disables HTTP pipelining. |
10 | http.pipelining.max_events | Limits the number of events that can be queued before the HTTP request is closed. |
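Size settings such as http.max_content_length are written as a number followed by a unit. The sketch below shows one way such a value could be interpreted and applied to an incoming request size. It assumes the units are powers of 1024, as in Elasticsearch's byte size units; the helper names `parse_size` and `request_allowed` are hypothetical and only a few units are covered.

```python
import re

# Byte multipliers; Elasticsearch byte units are powers of 1024.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_size(text):
    """Convert a size string such as '100mb' or '4kb' to a byte count."""
    match = re.fullmatch(r"(\d+)\s*(b|kb|mb|gb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size value: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def request_allowed(content_length, max_content_length="100mb"):
    """Return True if a request body of this many bytes fits the limit."""
    return content_length <= parse_size(max_content_length)


print(parse_size("100mb"))
print(request_allowed(200 * 1024 ** 2))
```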
The indices module maintains the global settings that apply to every index. The following settings are mainly related to memory usage:
The circuit breaker is used to prevent operations from causing an OutOfMemoryError. It works mainly by limiting how much of the JVM heap an operation may use. For example, the indices.breaker.total.limit setting defaults to 70% of the JVM heap.
The fielddata cache is used mainly when aggregating on a field. It is recommended to have enough memory available to hold it. The indices.fielddata.cache.size setting controls the amount of memory used for the fielddata cache.
The node query cache is used to cache query results. It uses a least recently used (LRU) eviction policy. The indices.queries.cache.size setting controls the memory size of this cache.
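To make the least recently used (LRU) eviction policy concrete, here is a minimal, generic LRU cache. It illustrates the policy only and is not how Elasticsearch implements its query cache; in particular it bounds the cache by entry count, whereas indices.queries.cache.size bounds it by memory.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny LRU cache bounded by number of entries (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        """Return the cached value, or None, marking the key as recently used."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("query_a", "result_a")
cache.put("query_b", "result_b")
cache.get("query_a")              # query_a is now the most recently used
cache.put("query_c", "result_c")  # evicts query_b
print(cache.get("query_b"))       # None
```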
The indexing buffer stores newly indexed documents and flushes them when the buffer is full. Settings such as indices.memory.index_buffer_size control the amount of heap allocated to this buffer.
The shard request cache stores the local search results of each shard. It can be enabled when the index is created and disabled for an individual request by sending a URL parameter.
Disable cache: ?request_cache=false
Enable cache: "index.requests.cache.enable": true
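As a small illustration of the per-request parameter, the sketch below builds a search URL that opts in or out of the shard request cache (in the Elasticsearch search API, `request_cache=false` skips the cache for that request). The helper name, the base URL, and the index name `my_index` are assumptions chosen for the example.

```python
from urllib.parse import urlencode


def search_url(base, index, use_request_cache):
    """Build a _search URL with an explicit request_cache parameter."""
    flag = "true" if use_request_cache else "false"
    query = urlencode({"request_cache": flag})
    return f"{base}/{index}/_search?{query}"


# Hypothetical local node and index name.
print(search_url("http://localhost:9200", "my_index", use_request_cache=False))
```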
The indices recovery settings control the resources used during the recovery process. The following settings are provided:
Settings | Default Value |
---|---|
indices.recovery.concurrent_streams | 3 |
indices.recovery.concurrent_small_file_streams | 2 |
indices.recovery.file_chunk_size | 512kb |
indices.recovery.translog_ops | 1000 |
indices.recovery.translog_size | 512kb |
indices.recovery.compress | true |
indices.recovery.max_bytes_per_sec | 40mb |
The TTL (time to live) of a document defines how long the document lives before it is deleted. The following dynamic settings control the process that removes expired documents:
Settings | Default Value |
---|---|
indices.ttl.interval | 60s |
indices.ttl.bulk_size | 1000 |
Each node can be configured to be a data node or not. This is controlled by the node.data setting; setting it to false means the node is not a data node.