<!--startmeta
custom_edit_url: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/hdfs/README.md"
meta_yaml: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/hdfs/metadata.yaml"
sidebar_label: "Hadoop Distributed File System (HDFS)"
learn_status: "Published"
learn_rel_path: "Collecting Metrics/Storage, Mount Points and Filesystems"
most_popular: True
message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
endmeta-->

# Hadoop Distributed File System (HDFS)


<img src="https://netdata.cloud/img/hadoop.svg" width="150"/>


Plugin: go.d.plugin
Module: hdfs

<img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />

## Overview

This collector monitors HDFS nodes.

Netdata accesses HDFS metrics over `Java Management Extensions` (JMX) through the web interface of an HDFS daemon.


This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.


### Default Behavior

#### Auto-Detection

This integration doesn't support auto-detection.

#### Limits

The default configuration for this integration does not impose any limits on data collection.

#### Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.
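Since the collector scrapes the daemon's built-in JMX JSON servlet, you can check that the endpoint is reachable before configuring Netdata. A minimal probe, assuming a NameNode on the default web port (the `qry` parameter of Hadoop's JMX servlet narrows the output to matching MBeans; the bean name shown is just an example):

```bash
# Query the NameNode JMX servlet directly; it returns metrics as JSON.
# Drop the qry parameter to dump every available MBean.
curl "http://127.0.0.1:9870/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"
```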
## Metrics

Metrics grouped by *scope*.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.


### Per Hadoop Distributed File System (HDFS) instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit | DataNode | NameNode |
|:------|:----------|:----|:---:|:---:|
| hdfs.heap_memory | committed, used | MiB | • | • |
| hdfs.gc_count_total | gc | events/s | • | • |
| hdfs.gc_time_total | ms | ms | • | • |
| hdfs.gc_threshold | info, warn | events/s | • | • |
| hdfs.threads | new, runnable, blocked, waiting, timed_waiting, terminated | num | • | • |
| hdfs.logs_total | info, error, warn, fatal | logs/s | • | • |
| hdfs.rpc_bandwidth | received, sent | kilobits/s | • | • |
| hdfs.rpc_calls | calls | calls/s | • | • |
| hdfs.open_connections | open | connections | • | • |
| hdfs.call_queue_length | length | num | • | • |
| hdfs.avg_queue_time | time | ms | • | • |
| hdfs.avg_processing_time | time | ms | • | • |
| hdfs.capacity | remaining, used | KiB | | • |
| hdfs.used_capacity | dfs, non_dfs | KiB | | • |
| hdfs.load | load | load | | • |
| hdfs.volume_failures_total | failures | events/s | | • |
| hdfs.files_total | files | num | | • |
| hdfs.blocks_total | blocks | num | | • |
| hdfs.blocks | corrupt, missing, under_replicated | num | | • |
| hdfs.data_nodes | live, dead, stale | num | | • |
| hdfs.datanode_capacity | remaining, used | KiB | • | |
| hdfs.datanode_used_capacity | dfs, non_dfs | KiB | • | |
| hdfs.datanode_failed_volumes | failed volumes | num | • | |
| hdfs.datanode_bandwidth | reads, writes | KiB/s | • | |


## Alerts

The following alerts are available:

| Alert name | On metric | Description |
|:------------|:----------|:------------|
| [ hdfs_capacity_usage ](https://github.com/netdata/netdata/blob/master/src/health/health.d/hdfs.conf) | hdfs.capacity | summary datanodes space capacity utilization |
| [ hdfs_missing_blocks ](https://github.com/netdata/netdata/blob/master/src/health/health.d/hdfs.conf) | hdfs.blocks | number of missing blocks |
| [ hdfs_stale_nodes ](https://github.com/netdata/netdata/blob/master/src/health/health.d/hdfs.conf) | hdfs.data_nodes | number of datanodes marked stale due to delayed heartbeat |
| [ hdfs_dead_nodes ](https://github.com/netdata/netdata/blob/master/src/health/health.d/hdfs.conf) | hdfs.data_nodes | number of datanodes which are currently dead |
| [ hdfs_num_failed_volumes ](https://github.com/netdata/netdata/blob/master/src/health/health.d/hdfs.conf) | hdfs.num_failed_volumes | number of failed volumes |


## Setup

### Prerequisites

No action required.

### Configuration

#### File

The configuration file name for this integration is `go.d/hdfs.conf`.

You can edit the configuration file using the `edit-config` script from the
Netdata [config directory](/docs/netdata-agent/configuration/README.md#the-netdata-config-directory).

```bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/hdfs.conf
```

#### Options

The following options can be defined globally: update_every, autodetection_retry (see the sketch after the table below).

<details open><summary>Config options</summary>

| Name | Description | Default | Required |
|:----|:-----------|:-------|:--------:|
| update_every | Data collection frequency. | 1 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| url | Server URL. | http://127.0.0.1:9870/jmx | yes |
| timeout | HTTP request timeout. | 1 | no |
| username | Username for basic HTTP authentication. | | no |
| password | Password for basic HTTP authentication. | | no |
| proxy_url | Proxy URL. | | no |
| proxy_username | Username for proxy basic HTTP authentication. | | no |
| proxy_password | Password for proxy basic HTTP authentication. | | no |
| method | HTTP request method. | GET | no |
| body | HTTP request body. | | no |
| headers | HTTP request headers. | | no |
| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
| tls_ca | Certification authority that the client uses when verifying the server's certificates. | | no |
| tls_cert | Client TLS certificate. | | no |
| tls_key | Client TLS key. | | no |

</details>
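When set at the top level of `go.d/hdfs.conf`, `update_every` and `autodetection_retry` act as defaults for every job. A minimal sketch (the values shown are illustrative, not recommendations):

```yaml
# Top-level values apply to all jobs unless a job overrides them.
update_every: 5          # collect once every 5 seconds
autodetection_retry: 30  # retry failed detection every 30 seconds

jobs:
  - name: local
    url: http://127.0.0.1:9870/jmx
```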
#### Examples

##### Basic

A basic example configuration.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9870/jmx

```

##### HTTP authentication

Basic HTTP authentication.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9870/jmx
    username: username
    password: password

```
</details>

##### HTTPS with self-signed certificate

Do not validate server certificate chain and hostname.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9870/jmx
    tls_skip_verify: yes

```
</details>

##### Multi-instance

> **Note**: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9870/jmx

  - name: remote
    url: http://192.0.2.1:9870/jmx

```
</details>
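One scenario the examples above don't cover is collecting through an HTTP proxy. A minimal sketch built from the documented `proxy_*` options (the proxy address and credentials are placeholders):

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9870/jmx
    proxy_url: http://127.0.0.1:3128  # placeholder proxy address
    proxy_username: proxyuser         # placeholder credentials
    proxy_password: proxypass
```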
## Troubleshooting

### Debug Mode

**Important**: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the `hdfs` collector, run the `go.d.plugin` with the debug option enabled. The output
should give you clues as to why the collector isn't working.

- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
  your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.

  ```bash
  cd /usr/libexec/netdata/plugins.d/
  ```

- Switch to the `netdata` user.

  ```bash
  sudo -u netdata -s
  ```

- Run the `go.d.plugin` to debug the collector:

  ```bash
  ./go.d.plugin -d -m hdfs
  ```

### Getting Logs

If you're encountering problems with the `hdfs` collector, follow these steps to retrieve logs and identify potential issues:

- **Run the command** specific to your system (systemd, non-systemd, or Docker container).
- **Examine the output** for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

#### System with systemd

Use the following command to view logs generated since the last Netdata service restart:

```bash
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep hdfs
```

#### System without systemd

Locate the collector log file, typically at `/var/log/netdata/collector.log`, and use `grep` to filter for the collector's name:

```bash
grep hdfs /var/log/netdata/collector.log
```

**Note**: This method shows logs from all restarts. Focus on the **latest entries** for troubleshooting current issues.

#### Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

```bash
docker logs netdata 2>&1 | grep hdfs
```