Diffstat (limited to 'docs/store')
-rw-r--r--docs/store/change-metrics-storage.md98
-rw-r--r--docs/store/distributed-data-architecture.md88
2 files changed, 186 insertions, 0 deletions
diff --git a/docs/store/change-metrics-storage.md b/docs/store/change-metrics-storage.md
new file mode 100644
index 0000000..c4b77d9
--- /dev/null
+++ b/docs/store/change-metrics-storage.md
@@ -0,0 +1,98 @@
+<!--
+title: "Change how long Netdata stores metrics"
+description: "With a single configuration change, the Netdata Agent can store days, weeks, or months of metrics at its famous per-second granularity."
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/store/change-metrics-storage.md
+-->
+
+# Change how long Netdata stores metrics
+
+The Netdata Agent uses a custom-made time-series database (TSDB), the [`dbengine`](/database/engine/README.md), to store metrics.
+
+The default settings retain approximately two days' worth of metrics on a system collecting 2,000 metrics every second,
+but the Netdata Agent is highly configurable if you want your nodes to store days, weeks, or months' worth of per-second
+data.
+
+The Netdata Agent uses the following three fundamental settings in the `[db]` section of `netdata.conf` to change the
+behavior of the database engine:
+
+```conf
+[db]
+    dbengine page cache size MB = 32
+    dbengine multihost disk space MB = 256
+    storage tiers = 1
+```
+
+`dbengine page cache size MB` sets the maximum amount of RAM (in MiB) the database engine uses to cache and index recent
+metrics. `dbengine multihost disk space MB` sets the maximum disk space (again, in MiB) the database engine uses to
+store historical, compressed metrics. `storage tiers` specifies the number of storage tiers in your `dbengine`. When the
+size of stored metrics exceeds the allocated disk space, the database engine removes the oldest metrics on a rolling
+basis.
+
+## Calculate the system resources (RAM, disk space) needed to store metrics
+
+You can store more or fewer metrics using the database engine by changing the allocated disk space. Use the calculator
+below to find the appropriate value for the `dbengine` based on how many metrics your node(s) collect, whether you are
+streaming metrics to a parent node, and more.
+
+You do not need to edit the `dbengine page cache size MB` setting to store more metrics using the database engine.
+However, if you want to store more metrics _specifically in memory_, you can increase the cache size.
+
+:::tip
+
+We recommend reading the [tiering mechanism](/database/engine/README.md#tiering) reference, which will help you
+configure the Agent to retain metrics for longer periods.
+
+:::
+
+:::caution
+
+This calculator provides an estimate of the disk and RAM needed for **metrics storage**. Real-life usage may vary
+based on the accuracy of the values you enter, changes in the compression ratio, and the types of metrics stored.
+
+:::
+
+Visit the [Netdata Storage Calculator](https://netdata-storage-calculator.herokuapp.com/) app to customize
+data retention according to your preferences.
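The arithmetic behind such an estimate can be sketched in a few lines. This is an illustration only, not the official calculator: `BYTES_PER_SAMPLE` is an assumed ballpark for compressed on-disk storage, and real compression ratios vary with the metrics collected.

```python
# Rough dbengine disk-retention estimate (illustrative sketch).
# BYTES_PER_SAMPLE is an assumption; actual compressed samples
# may be smaller or larger depending on the data.
BYTES_PER_SAMPLE = 1.0

def retention_days(disk_space_mb: int, metrics: int, update_every: int = 1) -> float:
    """Estimate how many days of metrics fit in the allocated disk space."""
    samples_per_metric_per_day = 86_400 / update_every
    bytes_per_day = metrics * samples_per_metric_per_day * BYTES_PER_SAMPLE
    return disk_space_mb * 1024 * 1024 / bytes_per_day

# 256 MiB for 2,000 per-second metrics: on the order of a couple of days
print(f"{retention_days(256, 2000):.1f} days")
```

Note how doubling `update every` doubles the estimated retention for the same disk budget, which is the trade-off the calculator lets you explore.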
+
+## Edit `netdata.conf` with recommended database engine settings
+
+Now that you have a recommended setting for your Agent's `dbengine`, open `netdata.conf` with
+[`edit-config`](/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files) and look for the `[db]`
+section. Set the values recommended by the calculator. For example:
+
+```conf
+[db]
+ mode = dbengine
+ storage tiers = 3
+ update every = 1
+ dbengine multihost disk space MB = 1024
+ dbengine page cache size MB = 32
+ dbengine tier 1 update every iterations = 60
+ dbengine tier 1 multihost disk space MB = 384
+ dbengine tier 1 page cache size MB = 32
+ dbengine tier 2 update every iterations = 60
+ dbengine tier 2 multihost disk space MB = 16
+ dbengine tier 2 page cache size MB = 32
+```
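The tier settings above compound: each higher tier stores one aggregated point per `update every iterations` points of the tier below it. With `update every = 1` as in the example, the effective resolutions work out as follows (a small sketch):

```python
# Effective per-point resolution of each storage tier, derived from the
# example configuration above (tier 0 collects one point per second;
# tiers 1 and 2 each aggregate 60 points of the tier below).
update_every = 1
tier_iterations = [1, 60, 60]   # tier 0, tier 1, tier 2

resolutions = []
resolution = update_every
for iterations in tier_iterations:
    resolution *= iterations
    resolutions.append(resolution)

print(resolutions)  # [1, 60, 3600]: per second, per minute, per hour
```

Because each tier stores far fewer points than the one below it, the small `multihost disk space MB` values for tiers 1 and 2 in the example still cover much longer time spans than tier 0.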
+
+Save the file and restart the Agent with `sudo systemctl restart netdata`, or the
+[appropriate method](/docs/configure/start-stop-restart.md) for your system, for the new settings to take effect.
+
+## What's next?
+
+If you have multiple nodes with the Netdata Agent installed, you can
+[stream metrics](/docs/metrics-storage-management/how-streaming-works.mdx) from any number of _child_ nodes to a
+_parent_ node and store metrics using a centralized time-series database. Streaming allows you to centralize your data,
+run Agents as headless collectors, replicate data, and more.
+
+Storing metrics with the database engine is completely interoperable
+with [exporting to other time-series databases](/docs/export/external-databases.md). With exporting, you can use the
+node's resources to surface metrics when [viewing dashboards](/docs/visualize/interact-dashboards-charts.md), while also
+archiving metrics elsewhere for further analysis, visualization, or correlation with other tools.
+
+### Related reference documentation
+
+- [Netdata Agent · Database engine](/database/engine/README.md)
+- [Netdata Agent · Database engine configuration options](/daemon/config/README.md#[db]-section-options)
+
+
diff --git a/docs/store/distributed-data-architecture.md b/docs/store/distributed-data-architecture.md
new file mode 100644
index 0000000..62933cf
--- /dev/null
+++ b/docs/store/distributed-data-architecture.md
@@ -0,0 +1,88 @@
+<!--
+title: "Distributed data architecture"
+description: "Netdata's distributed data architecture stores metrics on individual nodes for high performance and scalability using all your granular metrics."
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/store/distributed-data-architecture.md
+-->
+
+# Distributed data architecture
+
+Netdata uses a distributed data architecture to help you collect and store per-second metrics from any number of nodes.
+Every node in your infrastructure, whether it's one or a thousand, stores the metrics it collects.
+
+Netdata Cloud bridges the gap between many distributed databases by _centralizing the interface_ you use to query and
+visualize your nodes' metrics. When you [look at charts in Netdata Cloud](/docs/visualize/interact-dashboards-charts.md),
+the metric values are queried directly from that node's database, securely streamed to Netdata Cloud, and proxied to
+your browser.
+
+Netdata's distributed data architecture has a number of benefits:
+
+- **Performance**: Every query to a node's database takes only a few milliseconds to complete, keeping dashboards
+  responsive and powering features
+  like [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
+- **Scalability**: As your infrastructure scales, install the Netdata Agent on every new node to immediately add it to
+ your monitoring solution without adding cost or complexity.
+- **1-second granularity**: Without an expensive centralized data lake, you can store all of your nodes' per-second
+ metrics, for any period of time, while keeping costs down.
+- **No filtering or selecting of metrics**: Because Netdata's distributed data architecture allows you to store all
+ metrics, you don't have to configure which metrics you retain. Keep everything for full visibility during
+ troubleshooting and root cause analysis.
+- **Easy maintenance**: There is no centralized data lake to purchase, allocate, monitor, and update, removing
+ complexity from your monitoring infrastructure.
+
+## Ephemerality of metrics
+
+The ephemerality of metrics plays an important role in retention. In environments where metrics collection is dynamic
+and new metrics are constantly being generated, we are interested in two parameters:
+
+1. The **expected concurrent number of metrics** as an average for the lifetime of the database. This affects mainly the
+ storage requirements.
+
+2. The **expected total number of unique metrics** for the lifetime of the database. This affects mainly the memory
+ requirements for having all these metrics indexed and available to be queried.
+
+## Granularity of metrics
+
+The granularity of metrics (the frequency at which they are collected and stored, i.e. their resolution) significantly
+affects retention.
+
+Lowering the granularity from per-second to every two seconds will double retention and halve the CPU requirements of
+the Netdata Agent, without affecting disk space or memory requirements.
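A quick arithmetic check of that claim, with an illustrative disk budget (the sample count here is hypothetical):

```python
# The same disk budget holds the same number of samples per metric, so
# wall-clock retention scales linearly with the collection interval.
samples_per_metric = 172_800   # hypothetical disk budget, in samples

for interval_seconds in (1, 2):
    retention_hours = samples_per_metric * interval_seconds / 3600
    print(f"collect every {interval_seconds}s -> {retention_hours:.0f}h retention")
```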
+
+## Long-term metrics storage with Netdata
+
+Any node running the Netdata Agent can store long-term metrics for any retention period, provided you allocate the
+appropriate amount of RAM and disk space.
+
+Read our document on changing [how long Netdata stores metrics](/docs/store/change-metrics-storage.md) on your nodes for
+details.
+
+You can also use [streaming](/streaming/README.md) to replicate databases between nodes and create your own
+centralized data lake of metrics, if you choose to do so.
+
+While a distributed data architecture is the default when monitoring infrastructure with Netdata, you can also configure
+its behavior based on your needs or the type of infrastructure you manage.
+
+To archive metrics to an external time-series database, such as InfluxDB, Graphite, OpenTSDB, Elasticsearch,
+TimescaleDB, and many others, see details on [integrating Netdata via exporting](/docs/export/external-databases.md).
+
+When you use the database engine to store your metrics, you can always perform a quick backup of a node's
+`/var/cache/netdata/dbengine/` folder using the tool of your choice.
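Such a backup can also be scripted. The sketch below uses Python's standard library; the default cache path is an assumption (adjust it if you changed the cache directory in `netdata.conf`), and you should stop the Agent first so the files are consistent:

```python
import tarfile
import time
from pathlib import Path

def backup_dbengine(source: Path, dest_dir: Path) -> Path:
    """Create a timestamped .tar.gz archive of the dbengine directory."""
    archive = dest_dir / f"netdata-dbengine-{time.strftime('%Y%m%d-%H%M%S')}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store the directory under its own name inside the archive.
        tar.add(source, arcname=source.name)
    return archive

# Default dbengine location on most installs (assumption):
# backup_dbengine(Path("/var/cache/netdata/dbengine"), Path("/root/backups"))
```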
+
+## Does Netdata Cloud store my metrics?
+
+Netdata Cloud does not store metric values.
+
+To enable certain features, such as [viewing active alarms](/docs/monitor/view-active-alarms.md)
+or [filtering by hostname/service](https://learn.netdata.cloud/docs/cloud/war-rooms#node-filter), Netdata Cloud does
+store configured alarms, their status, and a list of active collectors.
+
+Netdata does not and never will sell your personal data or data about your deployment.
+
+## What's next?
+
+You can configure the Netdata Agent to store days, weeks, or months' worth of distributed, per-second data by
+[configuring the database engine](/docs/store/change-metrics-storage.md). Use our calculator to determine the system
+resources required to retain your desired amount of metrics, and expand or contract the database by editing a single
+setting.
+
+