From f09848204fa5283d21ea43e262ee41aa578e1808 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Mon, 26 Aug 2024 10:15:24 +0200 Subject: Merging upstream version 1.47.0. Signed-off-by: Daniel Baumann --- .../change-metrics-storage.md | 12 +- docs/netdata-agent/securing-netdata-agents.md | 3 + .../disk-requirements-and-retention.md | 141 +++++---------------- 3 files changed, 41 insertions(+), 115 deletions(-) (limited to 'docs/netdata-agent') diff --git a/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md b/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md index 8d940a730..8a8659eff 100644 --- a/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md +++ b/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md @@ -5,11 +5,13 @@ space**. This provides greater control and helps you optimize storage usage for **Default Retention Limits**: -| Tier | Resolution | Time Limit | Size Limit | -|:----:|:-------------------:|:----------:|:----------:| -| 0 | high (per second) | 14 days | 1 GiB | -| 1 | middle (per minute) | 3 months | 1 GiB | -| 2 | low (per hour) | 2 years | 1 GiB | +| Tier | Resolution | Time Limit | Size Limit (min 256 MB) | +|:----:|:-------------------:|:----------:|:-----------------------:| +| 0 | high (per second) | 14 days | 1 GiB | +| 1 | middle (per minute) | 3 months | 1 GiB | +| 2 | low (per hour) | 2 years | 1 GiB | + +> **Note**: If a user sets a disk space size less than 256 MB for a tier, Netdata will automatically adjust it to 256 MB. With these defaults, Netdata requires approximately 4 GiB of storage space (including metadata). diff --git a/docs/netdata-agent/securing-netdata-agents.md b/docs/netdata-agent/securing-netdata-agents.md index 4f6ff4094..5232173fb 100644 --- a/docs/netdata-agent/securing-netdata-agents.md +++ b/docs/netdata-agent/securing-netdata-agents.md @@ -69,6 +69,9 @@ that node no longer serves its local dashboard. `netdata.conf` and use > `edit-config`. +If you are using Netdata with Docker, make sure to set the `NETDATA_HEALTHCHECK_TARGET` environment variable to `cli`. + + ## Expose Netdata only in a private LAN If your organisation has a private administration and management LAN, you can bind Netdata on this network interface on all your servers. diff --git a/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md b/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md index d9e879cb6..7cd9a527d 100644 --- a/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md +++ b/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md @@ -2,41 +2,17 @@ ## Database Modes and Tiers -Netdata comes with 3 database modes: +Netdata offers two database modes to suit your needs for performance and data persistence: -1. `dbengine`: the default high-performance multi-tier database of Netdata. Metric samples are cached in memory and are saved to disk in multiple tiers, with compression. -2. `ram`: metric samples are stored in ring buffers in memory, with increments of 1024 samples. Metric samples are not committed to disk. Kernel-Same-Page (KSM) can be used to deduplicate Netdata's memory. -3. `alloc`: metric samples are stored in ring buffers in memory, with flexible increments. Metric samples are not committed to disk. - -## `ram` and `alloc` - -Modes `ram` and `alloc` can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected. - -When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes is used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance. - -The memory required per sample in these modes, is 4 bytes: - -- `ram` mode uses `mmap()` behind the scene, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory dedupper (Kernel-Same-Page or KSM) to deduplicate Netdata ring buffers and save memory. -- `alloc` mode can be sized for any number of samples per metric. KSM cannot be used in this mode. - -To configure database mode `ram` or `alloc`, in `netdata.conf`, set the following: - -- `[db].mode` to either `ram` or `alloc`. -- `[db].retention` to the number of samples the ring buffers should maintain. For `ram` if the value set is not a multiple of 1024, the next multiple of 1024 will be used. +| Mode | Description | +|:------------------:|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| dbengine (default) | High-performance, multi-tier storage with compression. Metric samples are cached in memory and then written to disk in multiple tiers for efficient retrieval and long-term storage. | +| ram | In-memory storage. Metric samples are stored in memory only, and older data is overwritten as new data arrives. This mode prioritizes speed, making it ideal for Netdata Child instances that stream data to a central Netdata parent. | ## `dbengine` -`dbengine` supports up to 5 tiers. By default, 3 tiers are used, like this: - -| Tier | Resolution | Uncompressed Sample Size | Usually On Disk | -|:--------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:| -| `tier0` | native resolution (metrics collected per-second as stored per-second) | 4 bytes | 0.6 bytes | -| `tier1` | 60 iterations of `tier0`, so when metrics are collected per-second, this tier is per-minute. | 16 bytes | 6 bytes | -| `tier2` | 60 iterations of `tier1`, so when metrics are collected per second, this tier is per-hour. | 16 bytes | 18 bytes | - -Data are saved to disk compressed, so the actual size on disk varies depending on compression efficiency. - -`dbegnine` tiers are overlapping, so higher tiers include a down-sampled version of the samples in lower tiers: +Netdata's `dbengine` mode efficiently stores data on disk using compression. The actual disk space used depends on how well the data compresses. +This mode utilizes a tiered storage approach: data is saved in multiple tiers on disk. Each tier retains data at a different resolution (detail level). Higher tiers store a down-sampled (less detailed) version of the data found in lower tiers. ```mermaid gantt @@ -49,83 +25,28 @@ gantt tier2, 365d :a3, 2023-11-02, 59d ``` -## Disk Space and Metrics Retention - -You can find information about the current disk utilization of a Netdata Parent, at . The output of this endpoint is like this: - -```json -{ - // more information about the agent - // then, near the end: - "db_size": [ - { - "tier": 0, - "metrics": 43070, - "samples": 88078162001, - "disk_used": 41156409552, - "disk_max": 41943040000, - "disk_percent": 98.1245269, - "from": 1705033983, - "to": 1708856640, - "retention": 3822657, - "expected_retention": 3895720, - "currently_collected_metrics": 27424 - }, - { - "tier": 1, - "metrics": 72987, - "samples": 5155155269, - "disk_used": 20585157180, - "disk_max": 20971520000, - "disk_percent": 98.1576785, - "from": 1698287340, - "to": 1708856640, - "retention": 10569300, - "expected_retention": 10767675, - "currently_collected_metrics": 27424 - }, - { - "tier": 2, - "metrics": 148234, - "samples": 314919121, - "disk_used": 5957346684, - "disk_max": 10485760000, - "disk_percent": 56.8136853, - "from": 1667808000, - "to": 1708856640, - "retention": 41048640, - "expected_retention": 72251324, - "currently_collected_metrics": 27424 - } - ] -} -``` +`dbengine` supports up to 5 tiers. By default, 3 tiers are used: + +| Tier | Resolution | Uncompressed Sample Size | Usually On Disk | +|:-------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:| +| `tier0` | native resolution (metrics collected per-second as stored per-second) | 4 bytes | 0.6 bytes | +| `tier1` | 60 iterations of `tier0`, so when metrics are collected per-second, this tier is per-minute. | 16 bytes | 6 bytes | +| `tier2` | 60 iterations of `tier1`, so when metrics are collected per second, this tier is per-hour. | 16 bytes | 18 bytes | + +**Configuring dbengine mode and retention**: + +- Enable dbengine mode: The dbengine mode is already the default, so no configuration change is necessary. For reference, the dbengine mode can be configured by setting `[db].mode` to `dbengine` in `netdata.conf`. +- Adjust retention (optional): see [Change how long Netdata stores metrics](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md). + +## `ram` + +`ram` mode can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected. + +When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes is used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance. + +The memory required per sample in these modes, is 4 bytes: `ram` mode uses `mmap()` behind the scene, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory dedupper (Kernel-Same-Page or KSM) to deduplicate Netdata ring buffers and save memory. + +**Configuring ram mode and retention**: -In this example: - -- `tier` is the database tier. -- `metrics` is the number of unique time-series in the database. -- `samples` is the number of samples in the database. -- `disk_used` is the currently used disk space in bytes. -- `disk_max` is the configured max disk space in bytes. -- `disk_percent` is the current disk space utilization for this tier. -- `from` is the first (oldest) timestamp in the database for this tier. -- `to` is the latest (newest) timestamp in the database for this tier. -- `retention` is the current retention of the database for this tier, in seconds (divide by 3600 for hours, divide by 86400 for days). -- `expected_retention` is the expected retention in seconds when `disk_percent` will be 100 (divide by 3600 for hours, divide by 86400 for days). -- `currently_collected_metrics` is the number of unique time-series currently being collected for this tier. - -So, for our example above: - -| Tier | # Of Metrics | # Of Samples | Disk Used | Disk Free | Current Retention | Expected Retention | Sample Size | -|-----:|-------------:|--------------:|----------:|----------:|------------------:|-------------------:|------------:| -| 0 | 43.1K | 88.1 billion | 38.4Gi | 1.88% | 44.2 days | 45.0 days | 0.46 B | -| 1 | 73.0K | 5.2 billion | 19.2Gi | 1.84% | 122.3 days | 124.6 days | 3.99 B | -| 2 | 148.3K | 315.0 million | 5.6Gi | 43.19% | 475.1 days | 836.2 days | 18.91 B | - -To configure retention, in `netdata.conf`, set the following: - -- `[db].mode` to `dbengine`. -- `[db].dbengine multihost disk space MB`, this is the max disk size for `tier0`. The default is 256MiB. -- `[db].dbengine tier 1 multihost disk space MB`, this is the max disk space for `tier1`. The default is 50% of `tier0`. -- `[db].dbengine tier 2 multihost disk space MB`, this is the max disk space for `tier2`. The default is 50% of `tier1`. +- Enable ram mode: To use in-memory storage, set `[db].mode` to ram in your `netdata.conf` file. Remember, this mode won't retain historical data after restarts. +- Adjust retention (optional): While ram mode focuses on real-time data, you can optionally control the number of samples stored in memory. Set `[db].retention` in `netdata.conf` to the desired number in seconds. Note: If the value you choose isn't a multiple of 1024, Netdata will automatically round it up to the nearest multiple. -- cgit v1.2.3