# OpenSearch

Plugin: go.d.plugin

Module: elasticsearch

## Overview

This collector monitors the performance and health of the Elasticsearch cluster.

It uses [Cluster APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster.html) to collect metrics.

Used endpoints:

| Endpoint               | Description          | API                                                                                                         |
|------------------------|----------------------|-------------------------------------------------------------------------------------------------------------|
| `/`                    | Node info            |                                                                                                             |
| `/_nodes/stats`        | Nodes metrics        | [Nodes stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html) |
| `/_nodes/_local/stats` | Local node metrics   | [Nodes stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html) |
| `/_cluster/health`     | Cluster health stats | [Cluster health API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html)   |
| `/_cluster/stats`      | Cluster metrics      | [Cluster stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-stats.html)     |

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

### Default Behavior

#### Auto-Detection

By default, it detects instances running on localhost by attempting to connect to port 9200:

- http://127.0.0.1:9200
- https://127.0.0.1:9200

#### Limits

By default, this collector monitors only the node it is connected to. To monitor all cluster nodes, set the `cluster_mode` configuration option to `yes`.

#### Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

## Metrics

Metrics grouped by *scope*.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

### Per node

These metrics refer to the cluster node.

Labels:

| Label | Description |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |
| node_name | Human-readable identifier for the node. Based on the [Node name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#node-name). |
| host | Network host for the node, based on the [Network host setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#network.host). |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.node_indices_indexing | index | operations/s |
| elasticsearch.node_indices_indexing_current | index | operations |
| elasticsearch.node_indices_indexing_time | index | milliseconds |
| elasticsearch.node_indices_search | queries, fetches | operations/s |
| elasticsearch.node_indices_search_current | queries, fetches | operations |
| elasticsearch.node_indices_search_time | queries, fetches | milliseconds |
| elasticsearch.node_indices_refresh | refresh | operations/s |
| elasticsearch.node_indices_refresh_time | refresh | milliseconds |
| elasticsearch.node_indices_flush | flush | operations/s |
| elasticsearch.node_indices_flush_time | flush | milliseconds |
| elasticsearch.node_indices_fielddata_memory_usage | used | bytes |
| elasticsearch.node_indices_fielddata_evictions | evictions | operations/s |
| elasticsearch.node_indices_segments_count | segments | segments |
| elasticsearch.node_indices_segments_memory_usage_total | used | bytes |
| elasticsearch.node_indices_segments_memory_usage | terms, stored_fields, term_vectors, norms, points, doc_values, index_writer, version_map, fixed_bit_set | bytes |
| elasticsearch.node_indices_translog_operations | total, uncommitted | operations |
| elasticsearch.node_indices_translog_size | total, uncommitted | bytes |
| elasticsearch.node_file_descriptors | open | fd |
| elasticsearch.node_jvm_heap | inuse | percentage |
| elasticsearch.node_jvm_heap_bytes | committed, used | bytes |
| elasticsearch.node_jvm_buffer_pools_count | direct, mapped | pools |
| elasticsearch.node_jvm_buffer_pool_direct_memory | total, used | bytes |
| elasticsearch.node_jvm_buffer_pool_mapped_memory | total, used | bytes |
| elasticsearch.node_jvm_gc_count | young, old | gc/s |
| elasticsearch.node_jvm_gc_time | young, old | milliseconds |
| elasticsearch.node_thread_pool_queued | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_thread_pool_rejected | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_cluster_communication_packets | received, sent | pps |
| elasticsearch.node_cluster_communication_traffic | received, sent | bytes/s |
| elasticsearch.node_http_connections | open | connections |
| elasticsearch.node_breakers_trips | requests, fielddata, in_flight_requests, model_inference, accounting, parent | trips/s |

### Per cluster

These metrics refer to the cluster.

Labels:

| Label | Description |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.cluster_health_status | green, yellow, red | status |
| elasticsearch.cluster_number_of_nodes | nodes, data_nodes | nodes |
| elasticsearch.cluster_shards_count | active_primary, active, relocating, initializing, unassigned, delayed_unaasigned | shards |
| elasticsearch.cluster_pending_tasks | pending | tasks |
| elasticsearch.cluster_number_of_in_flight_fetch | in_flight_fetch | fetches |
| elasticsearch.cluster_indices_count | indices | indices |
| elasticsearch.cluster_indices_shards_count | total, primaries, replication | shards |
| elasticsearch.cluster_indices_docs_count | docs | docs |
| elasticsearch.cluster_indices_store_size | size | bytes |
| elasticsearch.cluster_indices_query_cache | hit, miss | events/s |
| elasticsearch.cluster_nodes_by_role_count | coordinating_only, data, data_cold, data_content, data_frozen, data_hot, data_warm, ingest, master, ml, remote_cluster_client, voting_only | nodes |

### Per index

These metrics refer to the index.

Labels:

| Label | Description |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |
| index | Name of the index. |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.node_index_health | green, yellow, red | status |
| elasticsearch.node_index_shards_count | shards | shards |
| elasticsearch.node_index_docs_count | docs | docs |
| elasticsearch.node_index_store_size | store_size | bytes |

## Alerts

The following alerts are available:

| Alert name | On metric | Description |
|:------------|:----------|:------------|
| [elasticsearch_node_indices_search_time_query](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_indices_search_time | search performance is degraded, queries run slowly. |
| [elasticsearch_node_indices_search_time_fetch](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_indices_search_time | search performance is degraded, fetches run slowly. |
| [elasticsearch_cluster_health_status_red](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.cluster_health_status | cluster health status is red. |
| [elasticsearch_cluster_health_status_yellow](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.cluster_health_status | cluster health status is yellow. |
| [elasticsearch_node_index_health_red](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_index_health | node index $label:index health status is red. |

## Setup

### Prerequisites

No action required.

### Configuration

#### File

The configuration file name for this integration is `go.d/elasticsearch.conf`.

You can edit the configuration file using the `edit-config` script from the Netdata [config directory](/docs/netdata-agent/configuration/README.md#the-netdata-config-directory).

```bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/elasticsearch.conf
```

#### Options

The following options can be defined globally: update_every, autodetection_retry.

| Name | Description | Default | Required |
|:----|:-----------|:-------|:--------:|
| update_every | Data collection frequency. | 5 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| url | Server URL. | http://127.0.0.1:9200 | yes |
| cluster_mode | Controls whether to collect metrics for all nodes in the cluster or only for the local node. | false | no |
| collect_node_stats | Controls whether to collect nodes metrics. | true | no |
| collect_cluster_health | Controls whether to collect cluster health metrics. | true | no |
| collect_cluster_stats | Controls whether to collect cluster stats metrics. | true | no |
| collect_indices_stats | Controls whether to collect indices metrics. | false | no |
| timeout | HTTP request timeout. | 2 | no |
| username | Username for basic HTTP authentication. | | no |
| password | Password for basic HTTP authentication. | | no |
| proxy_url | Proxy URL. | | no |
| proxy_username | Username for proxy basic HTTP authentication. | | no |
| proxy_password | Password for proxy basic HTTP authentication. | | no |
| method | HTTP request method. | GET | no |
| body | HTTP request body. | | no |
| headers | HTTP request headers. | | no |
| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
| tls_ca | Certification authority that the client uses when verifying the server's certificates. | | no |
| tls_cert | Client TLS certificate. | | no |
| tls_key | Client TLS key. | | no |
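
To illustrate how several of these options can be combined in one job, here is a minimal sketch. The job name and the chosen values are assumptions made for illustration, not recommended settings; the option names come from the table above.

```yaml
jobs:
  - name: local                 # hypothetical job name
    url: http://127.0.0.1:9200
    update_every: 10            # illustrative value; the default is 5
    timeout: 5                  # illustrative value; the default is 2
    collect_indices_stats: yes  # indices metrics are not collected by default
```

Options left out of a job keep the defaults listed in the table.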

#### Examples

##### Basic single node mode

A basic example configuration.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
```

##### Cluster mode

Cluster mode example configuration.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    cluster_mode: yes
```

##### HTTP authentication

Basic HTTP authentication.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    username: username
    password: password
```
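
If the instance is reachable only through an HTTP proxy, the proxy options from the options table can presumably be combined with basic authentication in the same job. The following is a sketch under that assumption; the proxy address and all credentials are placeholders.

```yaml
jobs:
  - name: remote
    url: http://192.0.2.1:9200
    username: username
    password: password
    proxy_url: http://proxy.example.com:3128  # hypothetical proxy address
    proxy_username: proxy_username            # placeholder credentials
    proxy_password: proxy_password
```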

##### HTTPS with self-signed certificate

Elasticsearch with enabled HTTPS and self-signed certificate.

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_skip_verify: yes
```
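
When the certificate authority file is available, the `tls_ca` option can be used to verify the server rather than skipping the check. The sketch below assumes PEM files at hypothetical paths; `tls_cert` and `tls_key` are included only for the case where the server requires a client certificate.

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_ca: /path/to/ca.pem        # hypothetical path to the CA certificate
    tls_cert: /path/to/client.pem  # hypothetical; only if client certificates are required
    tls_key: /path/to/client.key   # hypothetical; key matching tls_cert
```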

##### Multi-instance

> **Note**: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200

  - name: remote
    url: http://192.0.2.1:9200
```
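
Each job carries its own options, so instances with different requirements can sit side by side. As a sketch, the configuration below keeps the local job in single node mode while the remote job uses cluster mode and basic authentication; the credentials are placeholders.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200

  - name: remote
    url: http://192.0.2.1:9200
    cluster_mode: yes    # collect metrics for all nodes of the remote cluster
    username: username   # placeholder credentials
    password: password
```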

## Troubleshooting

### Debug Mode

**Important**: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the `elasticsearch` collector, run the `go.d.plugin` with the debug option enabled. The output should give you clues as to why the collector isn't working.

- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.

  ```bash
  cd /usr/libexec/netdata/plugins.d/
  ```

- Switch to the `netdata` user.

  ```bash
  sudo -u netdata -s
  ```

- Run the `go.d.plugin` to debug the collector:

  ```bash
  ./go.d.plugin -d -m elasticsearch
  ```

### Getting Logs

If you're encountering problems with the `elasticsearch` collector, follow these steps to retrieve logs and identify potential issues:

- **Run the command** specific to your system (systemd, non-systemd, or Docker container).
- **Examine the output** for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

#### System with systemd

Use the following command to view logs generated since the last Netdata service restart:

```bash
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep elasticsearch
```

#### System without systemd

Locate the collector log file, typically at `/var/log/netdata/collector.log`, and use `grep` to filter for the collector's name:

```bash
grep elasticsearch /var/log/netdata/collector.log
```

**Note**: This method shows logs from all restarts. Focus on the **latest entries** for troubleshooting current issues.

#### Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

```bash
docker logs netdata 2>&1 | grep elasticsearch
```