From 7877a98bd9c00db5e81dd2f8c734cba2bab20be7 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Fri, 12 Aug 2022 09:26:17 +0200
Subject: Merging upstream version 1.36.0.

Signed-off-by: Daniel Baumann
---
 docs/store/distributed-data-architecture.md | 67 ++++++++++++++++++-----------
 1 file changed, 42 insertions(+), 25 deletions(-)

diff --git a/docs/store/distributed-data-architecture.md b/docs/store/distributed-data-architecture.md
index c834d710a..62933cfe5 100644
--- a/docs/store/distributed-data-architecture.md
+++ b/docs/store/distributed-data-architecture.md
@@ -10,34 +10,43 @@ Netdata uses a distributed data architecture to help you collect and store per-s
 Every node in your infrastructure, whether it's one or a thousand, stores the metrics it collects.
 
 Netdata Cloud bridges the gap between many distributed databases by _centralizing the interface_ you use to query and
-visualize your nodes' metrics. When you [look at charts in Netdata
-Cloud](/docs/visualize/interact-dashboards-charts.md), the metrics values are queried directly from that node's database
-and securely streamed to Netdata Cloud, which proxies them to your browser.
+visualize your nodes' metrics. When you [look at charts in Netdata Cloud](/docs/visualize/interact-dashboards-charts.md)
+, the metrics values are queried directly from that node's database and securely streamed to Netdata Cloud, which
+proxies them to your browser.
 
 Netdata's distributed data architecture has a number of benefits:
 
-- **Performance**: Every query to a node's database takes only a few milliseconds to complete for responsiveness when
- viewing dashboards or using features like [Metric
- Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
-- **Scalability**: As your infrastructure scales, install the Netdata Agent on every new node to immediately add it to
- your monitoring solution without adding cost or complexity.
-- **1-second granularity**: Without an expensive centralized data lake, you can store all of your nodes' per-second
- metrics, for any period of time, while keeping costs down.
-- **No filtering or selecting of metrics**: Because Netdata's distributed data architecture allows you to store all
- metrics, you don't have to configure which metrics you retain. Keep everything for full visibility during
- troubleshooting and root cause analysis.
-- **Easy maintenance**: There is no centralized data lake to purchase, allocate, monitor, and update, removing
- complexity from your monitoring infrastructure.
+- **Performance**: Every query to a node's database takes only a few milliseconds to complete for responsiveness when
+ viewing dashboards or using features
+ like [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
+- **Scalability**: As your infrastructure scales, install the Netdata Agent on every new node to immediately add it to
+ your monitoring solution without adding cost or complexity.
+- **1-second granularity**: Without an expensive centralized data lake, you can store all of your nodes' per-second
+ metrics, for any period of time, while keeping costs down.
+- **No filtering or selecting of metrics**: Because Netdata's distributed data architecture allows you to store all
+ metrics, you don't have to configure which metrics you retain. Keep everything for full visibility during
+ troubleshooting and root cause analysis.
+- **Easy maintenance**: There is no centralized data lake to purchase, allocate, monitor, and update, removing
+ complexity from your monitoring infrastructure.
 
-## Does Netdata Cloud store my metrics?
+## Ephemerality of metrics
 
-Netdata Cloud does not store metric values.
+The ephemerality of metrics plays an important role in retention. In environments where metrics collection is dynamic and
+new metrics are constantly being generated, we are interested about 2 parameters:
 
-To enable certain features, such as [viewing active alarms](/docs/monitor/view-active-alarms.md) or [filtering by
-hostname/service](https://learn.netdata.cloud/docs/cloud/war-rooms#node-filter), Netdata Cloud does store configured
-alarms, their status, and a list of active collectors.
+1. The **expected concurrent number of metrics** as an average for the lifetime of the database. This affects mainly the
+ storage requirements.
 
-Netdata does not and never will sell your personal data or data about your deployment.
+2. The **expected total number of unique metrics** for the lifetime of the database. This affects mainly the memory
+ requirements for having all these metrics indexed and available to be queried.
+
+## Granularity of metrics
+
+The granularity of metrics (the frequency they are collected and stored, i.e. their resolution) is significantly
+affecting retention.
+
+Lowering the granularity from per second to every two seconds, will double their retention and half the CPU requirements
+of the Netdata Agent, without affecting disk space or memory requirements.
 
 ## Long-term metrics storage with Netdata
 
@@ -47,7 +56,8 @@ appropriate amount of RAM and disk space.
 Read our document on changing [how long Netdata stores metrics](/docs/store/change-metrics-storage.md) on your nodes for
 details.
 
-## Other options for your metrics data
+You can also stream between nodes using [streaming](/streaming/README.md), allowing to replicate databases and create
+your own centralized data lake of metrics, if you choose to do so.
 
 While a distributed data architecture is the default when monitoring infrastructure with Netdata, you can also configure
 its behavior based on your needs or the type of infrastructure you manage.
@@ -55,12 +65,19 @@ its behavior based on your needs or the type of infrastructure you manage.
 To archive metrics to an external time-series database, such as InfluxDB, Graphite, OpenTSDB, Elasticsearch,
 TimescaleDB, and many others, see details on [integrating Netdata via exporting](/docs/export/external-databases.md).
 
-You can also stream between nodes using [streaming](/streaming/README.md), allowing to replicate databases and create
-your own centralized data lake of metrics, if you choose to do so.
-
 When you use the database engine to store your metrics, you can always perform a quick backup of a node's
 `/var/cache/netdata/dbengine/` folder using the tool of your choice.
 
+## Does Netdata Cloud store my metrics?
+
+Netdata Cloud does not store metric values.
+
+To enable certain features, such as [viewing active alarms](/docs/monitor/view-active-alarms.md)
+or [filtering by hostname/service](https://learn.netdata.cloud/docs/cloud/war-rooms#node-filter), Netdata Cloud does
+store configured alarms, their status, and a list of active collectors.
+
+Netdata does not and never will sell your personal data or data about your deployment.
+
 ## What's next?
 
 You can configure the Netdata Agent to store days, weeks, or months worth of distributed, per-second data by
-- 
cgit v1.2.3