From 7877a98bd9c00db5e81dd2f8c734cba2bab20be7 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Fri, 12 Aug 2022 09:26:17 +0200
Subject: Merging upstream version 1.36.0.

Signed-off-by: Daniel Baumann
---
 docs/store/change-metrics-storage.md        | 95 ++++++++++++++++++-----------
 docs/store/distributed-data-architecture.md | 67 ++++++++++++--------
 2 files changed, 101 insertions(+), 61 deletions(-)

(limited to 'docs/store')

diff --git a/docs/store/change-metrics-storage.md b/docs/store/change-metrics-storage.md
index 99760e8d3..437b45fc2 100644
--- a/docs/store/change-metrics-storage.md
+++ b/docs/store/change-metrics-storage.md
@@ -6,72 +6,95 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/store/chang
 
 # Change how long Netdata stores metrics
 
-import { Calculator } from '../../src/components/agent/dbCalc/'
+The Netdata Agent uses a custom made time-series database (TSDB), named the [`dbengine`](/database/engine/README.md), to store metrics.
 
-The Netdata Agent uses a time-series database (TSDB), named the [database engine
-(`dbengine`)](/database/engine/README.md), to store metrics data. The most recently-collected metrics are stored in RAM,
-and when metrics reach a certain age, and based on how much system RAM you allocate toward storing metrics in memory,
-they are compressed and "spilled" to disk for long-term storage.
+The default settings retain approximately two day's worth of metrics on a system collecting 2,000 metrics every second,
+but the Netdata Agent is highly configurable if you want your nodes to store days, weeks, or months worth of per-second
+data.
 
-The default settings retain about two day's worth of metrics on a system collecting 2,000 metrics every second, but the
-Netdata Agent is highly configurable if you want your nodes to store days, weeks, or months worth of per-second data.
-
-The Netdata Agent uses two settings in `netdata.conf` to change the behavior of the database engine:
+The Netdata Agent uses the following three fundamental settings in `netdata.conf` to change the behavior of the database engine:
 
 ```conf
 [global]
-    page cache size = 32
+    dbengine page cache size = 32
     dbengine multihost disk space = 256
+    storage tiers = 1
 ```
 
-`page cache size` sets the maximum amount of RAM (in MiB) the database engine uses to cache and index recent metrics.
+`dbengine page cache size` sets the maximum amount of RAM (in MiB) the database engine uses to cache and index recent
+metrics.
 
 `dbengine multihost disk space` sets the maximum disk space (again, in MiB) the database engine uses to store
-historical, compressed metrics. When the size of stored metrics exceeds the allocated disk space, the database engine
-removes the oldest metrics on a rolling basis.
+historical, compressed metrics and `storage tiers` specifies the number of storage tiers you want to have in
+your `dbengine`. When the size of stored metrics exceeds the allocated disk space, the database engine removes the
+oldest metrics on a rolling basis.
 
 ## Calculate the system resources (RAM, disk space) needed to store metrics
 
 You can store more or less metrics using the database engine by changing the allocated disk space. Use the calculator
-below to find an appropriate value for `dbengine multihost disk space` based on how many metrics your node(s) collect,
-whether you are streaming metrics to a parent node, and more.
+below to find the appropriate value for the `dbengine` based on how many metrics your node(s) collect, whether you are
+streaming metrics to a parent node, and more.
+
+You do not need to edit the `dbengine page cache size` setting to store more metrics using the database engine. However,
+if you want to store more metrics _specifically in memory_, you can increase the cache size.
+
+:::tip
+
+We advise you to visit the [tiering mechanism](/database/engine/README.md#tiering) reference. This will help you
+configure the Agent to retain metrics for longer periods.
 
-You do not need to edit the `page cache size` setting to store more metrics using the database engine. However, if you
-want to store more metrics _specifically in memory_, you can increase the cache size.
+:::
 
-> ⚠️ This calculator provides an estimate of disk and RAM usage for **metrics storage**, along with its best
-> recommendation for the `dbengine multihost disk space` setting. Real-life usage may vary based on the accuracy of the
-> values you enter below, changes in the compression ratio, and the types of metrics stored.
+:::caution
 
-
+This calculator provides an estimation of disk and RAM usage for **metrics usage**. Real-life usage may vary based on
+the accuracy of the values you enter below, changes in the compression ratio, and the types of metrics stored.
+
+:::
+
+Download
+the [calculator](https://docs.google.com/spreadsheets/d/e/2PACX-1vTYMhUU90aOnIQ7qF6iIk6tXps57wmY9lxS6qDXznNJrzCKMDzxU3zkgh8Uv0xj_XqwFl3U6aHDZ6ag/pub?output=xlsx)
+to optimize the data retention to your preferences. Utilize the "Front" spreadsheet. Experiment with the variables which
+are padded with yellow to come up with the best settings for your use case.
 
 ## Edit `netdata.conf` with recommended database engine settings
 
-Now that you have a recommended setting for `dbengine multihost disk space`, open `netdata.conf` with
-[`edit-config`](/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files) and look for the `dbengine
-multihost disk space` setting. Change it to the value recommended above. For example:
+Now that you have a recommended setting for your Agent's `dbengine`, open `netdata.conf` with
+[`edit-config`](/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files) and look for the `[db]`
+subsection. Change it to the recommended values you calculated from the calculator. For example:
 
 ```conf
-[global]
-    dbengine multihost disk space = 1024
+[db]
+    mode = dbengine
+    storage tiers = 3
+    update every = 1
+    dbengine multihost disk space MB = 1024
+    dbengine page cache size MB = 32
+    dbengine tier 1 update every iterations = 60
+    dbengine tier 1 multihost disk space MB = 384
+    dbengine tier 1 page cache size MB = 32
+    dbengine tier 2 update every iterations = 60
+    dbengine tier 2 multihost disk space MB = 16
+    dbengine tier 2 page cache size MB = 32
 ```
 
-Save the file and restart the Agent with `sudo systemctl restart netdata`, or the [appropriate
-method](/docs/configure/start-stop-restart.md) for your system, to change the database engine's size.
+Save the file and restart the Agent with `sudo systemctl restart netdata`, or
+the [appropriate method](/docs/configure/start-stop-restart.md) for your system, to change the database engine's size.
 
 ## What's next?
 
-If you have multiple nodes with the Netdata Agent installed, you can [stream
-metrics](/docs/metrics-storage-management/how-streaming-works.mdx) from any number of _child_ nodes to a _parent_ node
-and store metrics using a centralized time-series database. Streaming allows you to centralize your data, run Agents as
-headless collectors, replicate data, and more.
+If you have multiple nodes with the Netdata Agent installed, you
+can [stream metrics](/docs/metrics-storage-management/how-streaming-works.mdx) from any number of _child_ nodes to a _
+parent_ node and store metrics using a centralized time-series database. Streaming allows you to centralize your data,
+run Agents as headless collectors, replicate data, and more.
 
-Storing metrics with the database engine is completely interoperable with [exporting to other time-series
-databases](/docs/export/external-databases.md). With exporting, you can use the node's resources to surface metrics
-when [viewing dashboards](/docs/visualize/interact-dashboards-charts.md), while also archiving metrics elsewhere for
-further analysis, visualization, or correlation with other tools.
+Storing metrics with the database engine is completely interoperable
+with [exporting to other time-series databases](/docs/export/external-databases.md). With exporting, you can use the
+node's resources to surface metrics when [viewing dashboards](/docs/visualize/interact-dashboards-charts.md), while also
+archiving metrics elsewhere for further analysis, visualization, or correlation with other tools.
 
 ### Related reference documentation
 
 - [Netdata Agent · Database engine](/database/engine/README.md)
+- [Netdata Agent · Database engine configuration option](/daemon/config/README.md#[db]-section-options)
diff --git a/docs/store/distributed-data-architecture.md b/docs/store/distributed-data-architecture.md
index c834d710a..62933cfe5 100644
--- a/docs/store/distributed-data-architecture.md
+++ b/docs/store/distributed-data-architecture.md
@@ -10,34 +10,43 @@ Netdata uses a distributed data architecture to help you collect and store per-s
 Every node in your infrastructure, whether it's one or a thousand, stores the metrics it collects.
 
 Netdata Cloud bridges the gap between many distributed databases by _centralizing the interface_ you use to query and
-visualize your nodes' metrics. When you [look at charts in Netdata
-Cloud](/docs/visualize/interact-dashboards-charts.md), the metrics values are queried directly from that node's database
-and securely streamed to Netdata Cloud, which proxies them to your browser.
+visualize your nodes' metrics. When you [look at charts in Netdata Cloud](/docs/visualize/interact-dashboards-charts.md)
+, the metrics values are queried directly from that node's database and securely streamed to Netdata Cloud, which
+proxies them to your browser.
 
 Netdata's distributed data architecture has a number of benefits:
 
-- **Performance**: Every query to a node's database takes only a few milliseconds to complete for responsiveness when
-  viewing dashboards or using features like [Metric
-  Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
-- **Scalability**: As your infrastructure scales, install the Netdata Agent on every new node to immediately add it to
-  your monitoring solution without adding cost or complexity.
-- **1-second granularity**: Without an expensive centralized data lake, you can store all of your nodes' per-second
-  metrics, for any period of time, while keeping costs down.
-- **No filtering or selecting of metrics**: Because Netdata's distributed data architecture allows you to store all
-  metrics, you don't have to configure which metrics you retain. Keep everything for full visibility during
-  troubleshooting and root cause analysis.
-- **Easy maintenance**: There is no centralized data lake to purchase, allocate, monitor, and update, removing
-  complexity from your monitoring infrastructure.
+- **Performance**: Every query to a node's database takes only a few milliseconds to complete for responsiveness when
+  viewing dashboards or using features
+  like [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
+- **Scalability**: As your infrastructure scales, install the Netdata Agent on every new node to immediately add it to
+  your monitoring solution without adding cost or complexity.
+- **1-second granularity**: Without an expensive centralized data lake, you can store all of your nodes' per-second
+  metrics, for any period of time, while keeping costs down.
+- **No filtering or selecting of metrics**: Because Netdata's distributed data architecture allows you to store all
+  metrics, you don't have to configure which metrics you retain. Keep everything for full visibility during
+  troubleshooting and root cause analysis.
+- **Easy maintenance**: There is no centralized data lake to purchase, allocate, monitor, and update, removing
+  complexity from your monitoring infrastructure.
 
-## Does Netdata Cloud store my metrics?
+## Ephemerality of metrics
 
-Netdata Cloud does not store metric values.
+The ephemerality of metrics plays an important role in retention. In environments where metrics collection is dynamic and
+new metrics are constantly being generated, we are interested about 2 parameters:
 
-To enable certain features, such as [viewing active alarms](/docs/monitor/view-active-alarms.md) or [filtering by
-hostname/service](https://learn.netdata.cloud/docs/cloud/war-rooms#node-filter), Netdata Cloud does store configured
-alarms, their status, and a list of active collectors.
+1. The **expected concurrent number of metrics** as an average for the lifetime of the database. This affects mainly the
+   storage requirements.
 
-Netdata does not and never will sell your personal data or data about your deployment.
+2. The **expected total number of unique metrics** for the lifetime of the database. This affects mainly the memory
+   requirements for having all these metrics indexed and available to be queried.
+
+## Granularity of metrics
+
+The granularity of metrics (the frequency they are collected and stored, i.e. their resolution) is significantly
+affecting retention.
+
+Lowering the granularity from per second to every two seconds, will double their retention and half the CPU requirements
+of the Netdata Agent, without affecting disk space or memory requirements.
 
 ## Long-term metrics storage with Netdata
 
@@ -47,7 +56,8 @@ appropriate amount of RAM and disk space.
 Read our document on changing [how long Netdata stores metrics](/docs/store/change-metrics-storage.md) on your nodes for
 details.
 
-## Other options for your metrics data
+You can also stream between nodes using [streaming](/streaming/README.md), allowing to replicate databases and create
+your own centralized data lake of metrics, if you choose to do so.
 
 While a distributed data architecture is the default when monitoring infrastructure with Netdata, you can also configure
 its behavior based on your needs or the type of infrastructure you manage.
@@ -55,12 +65,19 @@ its behavior based on your needs or the type of infrastructure you manage.
 To archive metrics to an external time-series database, such as InfluxDB, Graphite, OpenTSDB, Elasticsearch,
 TimescaleDB, and many others, see details on [integrating Netdata via exporting](/docs/export/external-databases.md).
 
-You can also stream between nodes using [streaming](/streaming/README.md), allowing to replicate databases and create
-your own centralized data lake of metrics, if you choose to do so.
-
 When you use the database engine to store your metrics, you can always perform a quick backup of a node's
 `/var/cache/netdata/dbengine/` folder using the tool of your choice.
 
+## Does Netdata Cloud store my metrics?
+
+Netdata Cloud does not store metric values.
+
+To enable certain features, such as [viewing active alarms](/docs/monitor/view-active-alarms.md)
+or [filtering by hostname/service](https://learn.netdata.cloud/docs/cloud/war-rooms#node-filter), Netdata Cloud does
+store configured alarms, their status, and a list of active collectors.
+
+Netdata does not and never will sell your personal data or data about your deployment.
+
 ## What's next?
 
 You can configure the Netdata Agent to store days, weeks, or months worth of distributed, per-second data by
-- 
cgit v1.2.3