author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-05-05 11:19:16 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-05-05 12:07:37 +0000
commit: b485aab7e71c1625cfc27e0f92c9509f42378458
tree: ae9abe108601079d1679194de237c9a435ae5b55 /docs/netdata-agent
parent: Adding upstream version 1.44.3.
Adding upstream version 1.45.3+dfsg.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/netdata-agent')
-rw-r--r-- docs/netdata-agent/README.md | 84
-rw-r--r-- docs/netdata-agent/configuration.md | 43
-rw-r--r-- docs/netdata-agent/sizing-netdata-agents/README.md | 89
-rw-r--r-- docs/netdata-agent/sizing-netdata-agents/bandwidth-requirements.md | 47
-rw-r--r-- docs/netdata-agent/sizing-netdata-agents/cpu-requirements.md | 65
-rw-r--r-- docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md | 131
-rw-r--r-- docs/netdata-agent/sizing-netdata-agents/ram-requirements.md | 60
-rw-r--r-- docs/netdata-agent/versions-and-platforms.md | 70
8 files changed, 589 insertions, 0 deletions
diff --git a/docs/netdata-agent/README.md b/docs/netdata-agent/README.md
new file mode 100644
index 00000000..faf262fd
--- /dev/null
+++ b/docs/netdata-agent/README.md
@@ -0,0 +1,84 @@
+# Netdata Agent
+
+The Netdata Agent is the main building block in a Netdata ecosystem. It is installed on all monitored systems to monitor system components, containers and applications.
+
+The Netdata Agent is an **observability pipeline in a box** that can either operate standalone, or blend into a bigger pipeline made of multiple Netdata Agents (Children and Parents).
+
+## Distributed Observability Pipeline
+
+The Netdata observability pipeline is shown in the following graph.
+
+The pipeline is extended by creating Metrics Observability Centralization Points that are linked together (`from a remote Netdata`, `to a remote Netdata`), so that all installed Netdata Agents become one vast integrated observability pipeline.
+
+```mermaid
+stateDiagram-v2
+ classDef userFeature fill:#f00,color:white,font-weight:bold,stroke-width:2px,stroke:yellow
+ classDef usedByNC fill:#090,color:white,font-weight:bold,stroke-width:2px,stroke:yellow
+ Local --> Discover
+ Local: Local Netdata
+ [*] --> Detect: from a remote Netdata
+ Others: 3rd party time-series DBs
+ Detect: Detect Anomalies
+ Dashboard:::userFeature
+ Dashboard: Netdata Dashboards
+ 3rdDashboard:::userFeature
+ 3rdDashboard: 3rd party Dashboards
+ Notifications:::userFeature
+ Notifications: Alert Notifications
+ Alerts: Alert Transitions
+ Discover --> Collect
+ Collect --> Detect
+ Store: Store
+ Store: Time-Series Database
+ Detect --> Store
+ Store --> Learn
+ Store --> Check
+ Store --> Query
+ Store --> Score
+ Store --> Stream
+ Store --> Export
+ Query --> Visualize
+ Score --> Visualize
+ Check --> Alerts
+ Learn --> Detect: trained ML models
+ Alerts --> Notifications
+ Stream --> [*]: to a remote Netdata
+ Export --> Others
+ Others --> 3rdDashboard
+ Visualize --> Dashboard
+ Score:::usedByNC
+ Query:::usedByNC
+ Alerts:::usedByNC
+```
+
+1. **Discover**: auto-detect metric sources on localhost, auto-discover metric sources on Kubernetes.
+2. **Collect**: query data sources to collect metric samples, using the optimal protocol for each data source. 800+ integrations supported, including dozens of native application protocols, OpenMetrics and StatsD.
+3. **Detect Anomalies**: use the trained machine learning models for each metric, to detect in real-time if each sample collected is an outlier (an anomaly), or not.
+4. **Store**: keep collected samples and their anomaly status, in the time-series database (database mode `dbengine`) or a ring buffer (database modes `ram` and `alloc`).
+5. **Learn**: train multiple machine learning models for each metric collected, learning behaviors and patterns for detecting anomalies.
+6. **Check**: a health engine, triggering alerts and sending notifications. Netdata comes with hundreds of alert configurations that are automatically attached to metrics when they get collected, detecting errors, common configuration errors and performance issues.
+7. **Query**: a query engine for querying time-series data.
+8. **Score**: a scoring engine for comparing and correlating metrics.
+9. **Stream**: a mechanism to connect Netdata agents and build Metrics Centralization Points (Netdata Parents).
+10. **Visualize**: Netdata's fully automated dashboards for all metrics.
+11. **Export**: export metric samples to 3rd party time-series databases, enabling the use of 3rd party tools for visualization, like Grafana.
+
+## Comparison to other observability solutions
+
+1. **One moving part**: Other monitoring solutions require maintaining metrics exporters, time-series databases and visualization engines. Netdata has everything integrated into one package, even when [Metrics Centralization Points](https://github.com/netdata/netdata/blob/master/docs/observability-centralization-points/metrics-centralization-points/README.md) are required, making deployment and maintenance a lot simpler.
+
+2. **Automation**: Netdata is designed to automate most of the process of setting up and running an observability solution. It is designed to instantly provide comprehensive dashboards and fully automated alerts, with zero configuration.
+
+3. **High Fidelity Monitoring**: Netdata was born from our need to kill the console for observability. So, it provides metrics and logs in the same granularity and fidelity console tools do, but also comes with tools that go beyond metrics and logs, to provide a holistic view of the monitored infrastructure (e.g. check [Top Monitoring](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md)).
+
+4. **Minimal impact on monitored systems and applications**: Netdata has been designed to have a minimal impact on the monitored systems and their applications. There are [independent studies](https://www.ivanomalavolta.com/files/papers/ICSOC_2023.pdf) reporting that Netdata excels in CPU usage, RAM utilization, Execution Time and the impact Netdata has on monitored applications and containers.
+
+5. **Energy efficiency**: [The University of Amsterdam conducted research on the energy efficiency of monitoring tools](https://twitter.com/IMalavolta/status/1734208439096676680). They tested Netdata, Prometheus and ELK, among other tools. The study concluded that **Netdata is the most energy efficient monitoring tool**.
+
+## Dashboard Versions
+
+The Netdata agents (Standalone, Children and Parents) **share the dashboard** of Netdata Cloud. However, when the user is logged-in and the Netdata agent is connected to Netdata Cloud, the following are enabled (which are otherwise disabled):
+
+1. **Access to Sensitive Data**: Some data, like systemd-journal logs and several [Top Monitoring](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md) features expose sensitive data, like IPs, ports, process command lines and more. To access all these when the dashboard is served directly from a Netdata agent, Netdata Cloud is required to verify that the user accessing the dashboard has the required permissions.
+
+2. **Dynamic Configuration**: Netdata agents are configured via configuration files, manually or through some provisioning system. The latest Netdata includes a feature that allows users to change some of the configuration (collectors, alerts) via the dashboard. This feature is only available to users of a paid Netdata Cloud plan.
diff --git a/docs/netdata-agent/configuration.md b/docs/netdata-agent/configuration.md
new file mode 100644
index 00000000..85319984
--- /dev/null
+++ b/docs/netdata-agent/configuration.md
@@ -0,0 +1,43 @@
+# Netdata Agent Configuration
+
+The main Netdata agent configuration is `netdata.conf`.
+
+## The Netdata config directory
+
+On most Linux systems, by using our [recommended one-line installation](https://github.com/netdata/netdata/blob/master/packaging/installer/README.md#install-on-linux-with-one-line-installer), the **Netdata config
+directory** will be `/etc/netdata/`. The config directory contains several configuration files with the `.conf` extension, a
+few directories, and a shell script named `edit-config`.
+
+> Some operating systems will use `/opt/netdata/etc/netdata/` as the config directory. If you're not sure where yours
+> is, navigate to `http://NODE:19999/netdata.conf` in your browser, replacing `NODE` with the IP address or hostname of
+> your node, and find the `# config directory = ` setting. The value listed is the config directory for your system.
+
+All of Netdata's documentation assumes that your config directory is at `/etc/netdata`, and that you're running any scripts from inside that directory.
+
+
+## edit `netdata.conf`
+
+To edit `netdata.conf`, run this in your terminal:
+
+```bash
+cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
+sudo ./edit-config netdata.conf
+```
+
+Your editor will open.
+
+## downloading `netdata.conf`
+
+The running version of `netdata.conf` can be downloaded from a running Netdata agent, at this URL:
+
+```
+http://agent-ip:19999/netdata.conf
+```
+
+You can save and use this version, using these commands:
+
+```bash
+cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
+curl -ksSLo /tmp/netdata.conf.new http://localhost:19999/netdata.conf && sudo mv -i /tmp/netdata.conf.new netdata.conf
+```
+
diff --git a/docs/netdata-agent/sizing-netdata-agents/README.md b/docs/netdata-agent/sizing-netdata-agents/README.md
new file mode 100644
index 00000000..b945dc56
--- /dev/null
+++ b/docs/netdata-agent/sizing-netdata-agents/README.md
@@ -0,0 +1,89 @@
+# Sizing Netdata Agents
+
+Netdata automatically adjusts its resource utilization based on the workload offered to it.
+
+This is a map of how Netdata **features impact resource utilization**:
+
+| Feature | CPU | RAM | Disk I/O | Disk Space | Retention | Bandwidth |
+|-----------------------------:|:---:|:---:|:--------:|:----------:|:---------:|:---------:|
+| Metrics collected | X | X | X | X | X | - |
+| Samples collection frequency | X | - | X | X | X | - |
+| Database mode and tiers | - | X | X | X | X | - |
+| Machine learning | X | X | - | - | - | - |
+| Streaming | X | X | - | - | - | X |
+
+1. **Metrics collected**: The number of metrics collected affects almost every aspect of resource utilization.
+
+ When you need to lower the resources used by Netdata, this is an obvious first step.
+
+2. **Samples collection frequency**: By default Netdata collects metrics with 1-second granularity, unless the metrics collected are not updated that frequently, in which case Netdata collects them at the frequency they are updated. This is controlled per data collection job.
+
+ Lowering the data collection frequency from every second to every 2 seconds halves Netdata's CPU utilization, because CPU utilization is proportional to the data collection frequency.
+
+3. **Database Mode and Tiers**: By default Netdata stores metrics in 3 database tiers: high-resolution, mid-resolution, low-resolution. All database tiers are updated in parallel during data collection, and depending on the query duration Netdata may consult one or more tiers to optimize the resources required to satisfy it.
+
+ The number of database tiers affects the memory requirements of Netdata. Going from 3 tiers to 1 tier makes Netdata use half the memory. Of course, metrics retention will also be limited to 1 tier.
+
+4. **Machine Learning**: By default Netdata trains multiple machine learning models for every metric collected, to learn its behavior and detect anomalies. Machine Learning is a CPU intensive process and affects the overall CPU utilization of Netdata.
+
+5. **Streaming Compression**: When using Netdata in Parent-Child configurations to create Metrics Centralization Points, the compression algorithm used greatly affects CPU utilization and bandwidth consumption.
+
+ Netdata supports multiple streaming compression algorithms, allowing the optimization of either CPU utilization or network bandwidth. The default algorithm, `zstd`, provides the best balance between them.
+
+## Minimizing the resources used by Netdata Agents
+
+To minimize the resources used by Netdata Agents, we suggest configuring Netdata Parents for centralizing metric samples, and disabling most of the features on Netdata Children. This provides minimal resource utilization at the edge, while all the features of Netdata remain available at the Netdata Parents.
+
+The following guides provide instructions on how to do this.
+
+## Maximizing the scale of Netdata Parents
+
+Netdata Parents automatically size resource utilization based on the workload they receive. The only available option for improving query performance is to dedicate more RAM to them, which increases the efficiency of their caches.
+
+Check [RAM Requirements](https://github.com/netdata/netdata/blob/master/docs/netdata-agent/sizing-netdata-agents/ram-requirements.md) for more information.
+
+## Innovations Netdata has for optimal performance and scalability
+
+The following are some of the innovations the open-source Netdata agent has, that contribute to its excellent performance, and scalability.
+
+1. **Minimal disk I/O**
+
+ When Netdata saves data to disk, it stores the data in its final place, eliminating the need to reorganize it later.
+
+ Netdata is organizing its data structures in such a way that samples are committed to disk as evenly as possible across time, without affecting its memory requirements.
+
+ Furthermore, Netdata Agents use direct-I/O for saving and loading metric samples. This prevents Netdata from polluting system caches with metric data. Netdata maintains its own caches for this data.
+
+ All these features make Netdata a nice partner and a polite citizen for production applications running on the same systems as Netdata.
+
+2. **4 bytes per sample uncompressed**
+
+ To achieve optimal memory and disk footprint, Netdata uses a custom 32-bit floating point number. This floating point number is used to store the samples collected, together with their anomaly bit. The database of Netdata is fixed-step, so it has predefined slots for every sample, allowing Netdata to store timestamps once every several hundred samples, minimizing both its memory requirements and the disk footprint.
+
+ The final disk footprint of Netdata varies due to compression efficiency. It is usually about 0.6 bytes per sample for the high-resolution tier (per-second), 6 bytes per sample for the mid-resolution tier (per-minute) and 18 bytes per sample for the low-resolution tier (per-hour).
+
+3. **Query priorities**
+
+ Alerting, Machine Learning, Streaming and Replication, rely on metric queries. When multiple queries are running in parallel, Netdata assigns priorities to all of them, favoring interactive queries over background tasks. This means that queries do not compete equally for resources. Machine learning or replication may slow down when interactive queries are running and the system starves for resources.
+
+4. **A pointer per label**
+
+ Apart from metric samples, metric labels and their cardinality are the biggest memory consumers, especially in highly ephemeral environments, like Kubernetes. Netdata uses a single pointer for any label key-value pair that is reused. Keys and values are also deduplicated, providing the best possible memory footprint for metric labels.
+
+5. **Streaming Protocol**
+
+ The streaming protocol of Netdata allows minimizing the resources consumed on production systems by delegating features to other Netdata agents (Parents), without compromising monitoring fidelity or responsiveness, enabling the creation of a highly distributed observability platform.
+
+## Netdata vs Prometheus
+
+Netdata outperforms Prometheus in every aspect. -35% CPU Utilization, -49% RAM usage, -12% network bandwidth, -98% disk I/O, -75% in disk footprint for high resolution data, while providing more than a year of retention.
+
+Read the [full comparison here](https://blog.netdata.cloud/netdata-vs-prometheus-performance-analysis/).
+
+## Energy Efficiency
+
+The University of Amsterdam conducted research on the impact monitoring systems have on Docker-based systems.
+
+The study found that Netdata excels in CPU utilization, RAM usage, Execution Time and concluded that **Netdata is the most energy efficient tool**.
+
+Read the [full study here](https://www.ivanomalavolta.com/files/papers/ICSOC_2023.pdf).
diff --git a/docs/netdata-agent/sizing-netdata-agents/bandwidth-requirements.md b/docs/netdata-agent/sizing-netdata-agents/bandwidth-requirements.md
new file mode 100644
index 00000000..092c8da1
--- /dev/null
+++ b/docs/netdata-agent/sizing-netdata-agents/bandwidth-requirements.md
@@ -0,0 +1,47 @@
+# Bandwidth Requirements
+
+## On Production Systems, Standalone Netdata
+
+Standalone Netdata may use network bandwidth under the following conditions:
+
+1. You configured data collection jobs that fetch data from remote systems. No such jobs are enabled by default.
+2. You use the Netdata dashboard.
+3. [Netdata Cloud communication](#netdata-cloud-communication) (see below).
+
+## On Metrics Centralization Points, between Netdata Children & Parents
+
+Netdata supports multiple compression algorithms for streaming communication. Netdata Children offer all their compression algorithms when connecting to a Netdata Parent, and the Netdata Parent decides which one to use based on algorithm availability and user configuration.
+
+| Algorithm | Best for |
+|:---------:|:-----------------------------------------------------------------------------------------------------------------------------------:|
+| `zstd` | The best balance between CPU utilization and compression efficiency. This is the default. |
+| `lz4` | The fastest of the algorithms. Use this when CPU utilization is more important than bandwidth. |
+| `gzip` | The best compression efficiency, at the expense of CPU utilization. Use this when bandwidth is more important than CPU utilization. |
+| `brotli` | The most CPU intensive algorithm, providing the best compression. |
+
+The expected bandwidth consumption using `zstd` for 1 million samples per second is 84 Mbps, or 10.5 MB/s.
+
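+As a rough worked example, assuming bandwidth scales linearly with the sample rate (the 50,000 samples per second workload is hypothetical, not a measurement):
+
+```
+84 Mbps / 1,000,000 samples per second = 84 bits (10.5 bytes) per sample on the wire
+
+50,000 samples per second x 84 bits = 4,200,000 bits per second = 4.2 Mbps
+```
+
+Actual bandwidth depends on how well the data compresses, so treat this as an order-of-magnitude estimate.
+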
+The order in which compression algorithms are selected is configured in `stream.conf`, per `[API KEY]`, like this:
+
+```
+ compression algorithms order = zstd lz4 brotli gzip
+```
+
+The first available algorithm on both the Netdata Child and the Netdata Parent, from left to right, is chosen.
+
+Compression can also be disabled in `stream.conf` at either Netdata Children or Netdata Parents.
+
+## Netdata Cloud Communication
+
+When Netdata Agents connect to Netdata Cloud, they communicate metadata of the metrics being collected, but they do not stream the samples collected for each metric.
+
+The information transferred to Netdata Cloud is:
+
+1. Information and **metadata about the system itself**, like its hostname, architecture, virtualization technologies used and generally labels associated with the system.
+2. Information about the **running data collection plugins, modules and jobs**.
+3. Information about the **metrics available and their retention**.
+4. Information about the **configured alerts and their transitions**.
+
+This is not a constant stream of information. Netdata Agents update Netdata Cloud only about status changes on all the above (e.g. an alert being triggered, or a metric no longer being collected). So, there is an initial handshake and exchange of information when Netdata starts, and after that there are only updates when required.
+
+Of course, when you view Netdata Cloud dashboards that need to query the database a Netdata agent maintains, this query is forwarded to an agent that can satisfy it. This means that Netdata Cloud receives metric samples only when a user is accessing a dashboard and the samples transferred are usually aggregations to allow rendering the dashboards.
diff --git a/docs/netdata-agent/sizing-netdata-agents/cpu-requirements.md b/docs/netdata-agent/sizing-netdata-agents/cpu-requirements.md
new file mode 100644
index 00000000..021a35fb
--- /dev/null
+++ b/docs/netdata-agent/sizing-netdata-agents/cpu-requirements.md
@@ -0,0 +1,65 @@
+# CPU Requirements
+
+Netdata's CPU consumption is affected by the following factors:
+
+1. The number of metrics collected
+2. The frequency at which metrics are collected
+3. Machine Learning
+4. Streaming compression (streaming of metrics to Netdata Parents)
+5. Database Mode
+
+## On Production Systems, Netdata Children
+
+On production systems, where Netdata is running with default settings, monitoring the system it is installed on and its containers and applications, CPU utilization should usually be about 1% to 5% of a single CPU core.
+
+This includes 3 database tiers, machine learning, per-second data collection, alerts, and streaming to a Netdata Parent.
+
+## On Metrics Centralization Points, Netdata Parents
+
+On Metrics Centralization Points, Netdata Parents running on modern server hardware, we **estimate CPU utilization per million of samples collected per second**:
+
+| Feature | Depends On | Expected Utilization | Key Reasons |
+|:-----------------:|:---------------------------------------------------:|:----------------------------------------------------------------:|:-------------------------------------------------------------------------:|
+| Metrics Ingestion | Number of samples received per second | 2 CPU cores per million of samples per second | Decompress and decode received messages, update database. |
+| Metrics re-streaming | Number of samples resent per second | 2 CPU cores per million of samples per second | Encode and compress messages towards Netdata Parent. |
+| Machine Learning | Number of unique time-series concurrently collected | 2 CPU cores per million of unique metrics concurrently collected | Train machine learning models, query existing models to detect anomalies. |
+
+We recommend keeping the total CPU utilization below 60% when a Netdata Parent is steadily ingesting metrics, training machine learning models and running health checks. This will leave enough CPU resources available for queries.
+
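+As a worked sketch of these estimates, consider a hypothetical Netdata Parent that ingests 500,000 samples per second from 500,000 unique per-second metrics, with machine learning enabled and no re-streaming:
+
+```
+ingestion        = 0.5 million samples per second x 2 cores per million = 1 core
+machine learning = 0.5 million unique metrics     x 2 cores per million = 1 core
+steady total                                                            = 2 cores
+
+cores needed to stay below 60% utilization = 2 / 0.60 = about 3.3, so provision at least 4 cores
+```
+
+This ignores health checks and queries, which the table above does not estimate, so real deployments may need more headroom.
+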
+## I want to minimize CPU utilization. What should I do?
+
+You can control Netdata's CPU utilization with these parameters:
+
+1. **Data collection frequency**: Going from per-second metrics to every-2-seconds metrics will halve the CPU utilization of Netdata.
+2. **Number of metrics collected**: Netdata by default collects every metric available on the systems it runs. Review the metrics collected and disable data collection plugins and modules not needed.
+3. **Machine Learning**: Disable machine learning to save CPU cycles.
+4. **Number of database tiers**: Netdata updates database tiers in parallel, during data collection. This affects both CPU utilization and memory requirements.
+5. **Database Mode**: The default database mode is `dbengine`, which compresses and commits data to disk. If you have a Netdata Parent where metrics are aggregated and saved to disk and there is a reliable connection between the Netdata you want to optimize and its Parent, switch to database mode `ram` or `alloc`. This disables saving to disk, so your Netdata will also not use any disk I/O.
+
+## I see increased CPU consumption when a busy Netdata Parent starts, why?
+
+When a Netdata Parent starts and Netdata children get connected to it, there are several operations that temporarily affect CPU utilization, network bandwidth and disk I/O.
+
+The general flow looks like this:
+
+1. **Back-filling of higher tiers**: Usually this means calculating the aggregates of the last hour of `tier2` and of the last minute of `tier1`, ensuring that higher tiers reflect all the information `tier0` has. If Netdata was stopped abnormally (e.g. due to a system failure or crash), higher tiers may have to be back-filled for longer durations.
+2. **Metadata synchronization**: The metadata of all metrics each Netdata Child maintains are negotiated between the Child and the Parent and are synchronized.
+3. **Replication**: If the Parent is missing samples the Child has, these samples are transferred to the Parent before transferring new samples.
+4. Once all these finish, the normal **streaming of new metric samples** starts.
+5. At the same time, **machine learning** initializes, loads saved trained models and prepares anomaly detection.
+6. After a few moments the **health engine starts checking metrics** for triggering alerts.
+
+The above process is per metric. So, while one metric back-fills, another replicates and a third one streams.
+
+At the same time:
+
+- the compression algorithm learns the patterns of the data exchanged and optimizes its dictionaries for optimal compression and CPU utilization,
+- the database engine adjusts the page size of each metric, so that samples are committed to disk as evenly as possible across time.
+
+So, when looking for the "steady CPU consumption during ingestion" of a busy Netdata Parent, we recommend letting it stabilize for a few hours before checking.
+
+Keep in mind that Netdata has been designed so that even if during the initialization phase and the connection of hundreds of Netdata Children the system lacks CPU resources, the Netdata Parent will complete all the operations and eventually enter a steady CPU consumption during ingestion, without affecting the quality of the metrics stored. So, it is ok if during initialization of a busy Netdata Parent, CPU consumption spikes to 100%.
+
+Important: the above initialization process is not as intense when new nodes get connected to a Netdata Parent for the first time (e.g. ephemeral nodes), since several of the steps involved are not required.
+
+Especially for the cases where children disconnect and reconnect to the Parent due to network related issues (i.e. both the Netdata Child and the Netdata Parent have not been restarted and less than 1 hour has passed since the last disconnection), the re-negotiation phase is minimal and metrics are instantly entering the normal streaming phase.
diff --git a/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md b/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md
new file mode 100644
index 00000000..d9e879cb
--- /dev/null
+++ b/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md
@@ -0,0 +1,131 @@
+# Disk Requirements & Retention
+
+## Database Modes and Tiers
+
+Netdata comes with 3 database modes:
+
+1. `dbengine`: the default high-performance multi-tier database of Netdata. Metric samples are cached in memory and are saved to disk in multiple tiers, with compression.
+2. `ram`: metric samples are stored in ring buffers in memory, with increments of 1024 samples. Metric samples are not committed to disk. Kernel Samepage Merging (KSM) can be used to deduplicate Netdata's memory.
+3. `alloc`: metric samples are stored in ring buffers in memory, with flexible increments. Metric samples are not committed to disk.
+
+## `ram` and `alloc`
+
+Modes `ram` and `alloc` can help when Netdata should not introduce any disk I/O at all. In both of these modes, metric samples exist only in memory, and only while they are collected.
+
+When Netdata is configured to stream its metrics to a Metrics Observability Centralization Point (a Netdata Parent), metric samples are forwarded in real-time to that Netdata Parent. The ring buffers available in these modes are used to cache the collected samples for some time, in case there are network issues, or the Netdata Parent is restarted for maintenance.
+
+The memory required per sample in these modes is 4 bytes:
+
+- `ram` mode uses `mmap()` behind the scenes, and can be incremented in steps of 1024 samples (4KiB). Mode `ram` allows the use of the Linux kernel memory deduplication feature (Kernel Samepage Merging, or KSM) to deduplicate Netdata ring buffers and save memory.
+- `alloc` mode can be sized for any number of samples per metric. KSM cannot be used in this mode.
+
+To configure database mode `ram` or `alloc`, in `netdata.conf`, set the following:
+
+- `[db].mode` to either `ram` or `alloc`.
+- `[db].retention` to the number of samples the ring buffers should maintain. For `ram` if the value set is not a multiple of 1024, the next multiple of 1024 will be used.
+
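+A minimal sketch of such a configuration (the value 3600 is only an illustration):
+
+```
+[db]
+    mode = ram
+    retention = 3600
+```
+
+With `ram`, 3600 is rounded up to the next multiple of 1024, which is 4096 samples, so each metric uses 4096 x 4 bytes = 16KiB. With `alloc`, the same setting keeps exactly 3600 samples, or 3600 x 4 bytes = 14400 bytes per metric.
+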
+## `dbengine`
+
+`dbengine` supports up to 5 tiers. By default, 3 tiers are used, like this:
+
+| Tier | Resolution | Uncompressed Sample Size | Usually On Disk |
+|:--------:|:--------------------------------------------------------------------------------------------:|:------------------------:|:---------------:|
+| `tier0` | native resolution (metrics collected per-second are stored per-second) | 4 bytes | 0.6 bytes |
+| `tier1` | 60 iterations of `tier0`, so when metrics are collected per-second, this tier is per-minute. | 16 bytes | 6 bytes |
+| `tier2` | 60 iterations of `tier1`, so when metrics are collected per second, this tier is per-hour. | 16 bytes | 18 bytes |
+
+Data are saved to disk compressed, so the actual size on disk varies depending on compression efficiency.
+
+`dbengine` tiers are overlapping, so higher tiers include a down-sampled version of the samples in lower tiers:
+
+```mermaid
+gantt
+ dateFormat YYYY-MM-DD
+ tickInterval 1week
+ axisFormat
+ todayMarker off
+ tier0, 14d :a1, 2023-12-24, 7d
+ tier1, 60d :a2, 2023-12-01, 30d
+ tier2, 365d :a3, 2023-11-02, 59d
+```
+
+## Disk Space and Metrics Retention
+
+You can find information about the current disk utilization of a Netdata Parent, at <http://agent-ip:19999/api/v2/info>. The output of this endpoint is like this:
+
+```json
+{
+ // more information about the agent
+ // then, near the end:
+ "db_size": [
+ {
+ "tier": 0,
+ "metrics": 43070,
+ "samples": 88078162001,
+ "disk_used": 41156409552,
+ "disk_max": 41943040000,
+ "disk_percent": 98.1245269,
+ "from": 1705033983,
+ "to": 1708856640,
+ "retention": 3822657,
+ "expected_retention": 3895720,
+ "currently_collected_metrics": 27424
+ },
+ {
+ "tier": 1,
+ "metrics": 72987,
+ "samples": 5155155269,
+ "disk_used": 20585157180,
+ "disk_max": 20971520000,
+ "disk_percent": 98.1576785,
+ "from": 1698287340,
+ "to": 1708856640,
+ "retention": 10569300,
+ "expected_retention": 10767675,
+ "currently_collected_metrics": 27424
+ },
+ {
+ "tier": 2,
+ "metrics": 148234,
+ "samples": 314919121,
+ "disk_used": 5957346684,
+ "disk_max": 10485760000,
+ "disk_percent": 56.8136853,
+ "from": 1667808000,
+ "to": 1708856640,
+ "retention": 41048640,
+ "expected_retention": 72251324,
+ "currently_collected_metrics": 27424
+ }
+ ]
+}
+```
+
+In this example:
+
+- `tier` is the database tier.
+- `metrics` is the number of unique time-series in the database.
+- `samples` is the number of samples in the database.
+- `disk_used` is the currently used disk space in bytes.
+- `disk_max` is the configured max disk space in bytes.
+- `disk_percent` is the current disk space utilization for this tier.
+- `from` is the first (oldest) timestamp in the database for this tier.
+- `to` is the latest (newest) timestamp in the database for this tier.
+- `retention` is the current retention of the database for this tier, in seconds (divide by 3600 for hours, divide by 86400 for days).
+- `expected_retention` is the expected retention in seconds once `disk_percent` reaches 100 (divide by 3600 for hours, divide by 86400 for days).
+- `currently_collected_metrics` is the number of unique time-series currently being collected for this tier.
+
+So, for our example above:
+
+| Tier | # Of Metrics | # Of Samples | Disk Used | Disk Free | Current Retention | Expected Retention | Sample Size |
+|-----:|-------------:|--------------:|----------:|----------:|------------------:|-------------------:|------------:|
+| 0 | 43.1K | 88.1 billion | 38.3Gi | 1.88% | 44.2 days | 45.1 days | 0.47 B |
+| 1 | 73.0K | 5.2 billion | 19.2Gi | 1.84% | 122.3 days | 124.6 days | 3.99 B |
+| 2 | 148.2K | 314.9 million | 5.5Gi | 43.19% | 475.1 days | 836.2 days | 18.92 B |
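+
+As a worked check of how the table is derived from the JSON, here is the arithmetic for `tier1`. Sizes are converted with 1Gi = 1024^3 bytes, and the results are rounded for display:
+
+```
+disk used          = 20585157180 / 1024^3     = 19.17 Gi
+disk free          = 100 - 98.1576785         = 1.84 %
+current retention  = 10569300 / 86400         = 122.33 days
+expected retention = 10767675 / 86400         = 124.63 days
+sample size        = 20585157180 / 5155155269 = 3.99 bytes per sample
+```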
+
+To configure retention, in `netdata.conf`, set the following:
+
+- `[db].mode` to `dbengine`.
+- `[db].dbengine multihost disk space MB`, this is the max disk size for `tier0`. The default is 256MiB.
+- `[db].dbengine tier 1 multihost disk space MB`, this is the max disk space for `tier1`. The default is 50% of `tier0`.
+- `[db].dbengine tier 2 multihost disk space MB`, this is the max disk space for `tier2`. The default is 50% of `tier1`.
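+
+As an illustrative sketch, the following `netdata.conf` fragment would reproduce the limits of the example above, where `disk_max` corresponds to 40000, 20000 and 10000 MiB for the three tiers. The sizes are taken from that example and are not recommendations; size them for your own retention needs.
+
+```
+[db]
+    mode = dbengine
+    # tier0: 41943040000 bytes / 1024^2 = 40000 MiB
+    dbengine multihost disk space MB = 40000
+    # tier1: 20971520000 bytes / 1024^2 = 20000 MiB
+    dbengine tier 1 multihost disk space MB = 20000
+    # tier2: 10485760000 bytes / 1024^2 = 10000 MiB
+    dbengine tier 2 multihost disk space MB = 10000
+```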
diff --git a/docs/netdata-agent/sizing-netdata-agents/ram-requirements.md b/docs/netdata-agent/sizing-netdata-agents/ram-requirements.md
new file mode 100644
index 00000000..159c979a
--- /dev/null
+++ b/docs/netdata-agent/sizing-netdata-agents/ram-requirements.md
@@ -0,0 +1,60 @@
+# RAM Requirements
+
+With the default configuration of database tiers, Netdata needs about 16KiB per unique metric collected, regardless of the data collection frequency.
+
+Netdata supports memory ballooning and automatically sizes and limits the memory used, based on the metrics concurrently being collected.
+
+## On Production Systems, Netdata Children
+
+With default settings, Netdata should run with 100MB to 200MB of RAM, depending on the number of metrics being collected.
+
+This number can be lowered by limiting the number of database tiers or by switching database modes. For more information, check [Disk Requirements and Retention](https://github.com/netdata/netdata/blob/master/docs/netdata-agent/sizing-netdata-agents/disk-requirements-and-retention.md).
+
+## On Metrics Centralization Points, Netdata Parents
+
+The general formula, with the default configuration of database tiers, is:
+
+```
+memory = UNIQUE_METRICS x 16KiB + CONFIGURED_CACHES
+```
+
+The default `CONFIGURED_CACHES` is 32MiB.
+
+For 1 million concurrently collected time-series (independently of their data collection frequency), the memory required is:
+
+```
+UNIQUE_METRICS = 1000000
+CONFIGURED_CACHES = 32MiB
+
+(UNIQUE_METRICS * 16KiB / 1024 in MiB) + CONFIGURED_CACHES =
+( 1000000 * 16KiB / 1024 in MiB) + 32 MiB =
+15657 MiB =
+about 16 GiB
+```
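+
+The same formula scales linearly with the number of metrics. As a second worked example, assume a hypothetical Parent that concurrently collects 250,000 time-series with the default caches:
+
+```
+UNIQUE_METRICS = 250000   (hypothetical)
+CONFIGURED_CACHES = 32MiB
+
+( 250000 * 16KiB / 1024 in MiB) + 32 MiB =
+3906.25 MiB + 32 MiB =
+3938.25 MiB =
+about 3.85 GiB
+```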
+
+There are 2 cache sizes that can be configured in `netdata.conf`:
+
+1. `[db].dbengine page cache size MB`: this is the main cache that keeps metrics data in memory. When data are not found in it, the extent cache is consulted, and if they are not found there either, they are loaded from disk.
+2. `[db].dbengine extent cache size MB`: this is the compressed extent cache. It keeps compressed data blocks in memory, as they appear on disk, to avoid reading them again. Data found in the extent cache but not in the main cache have to be decompressed before they can be queried.
+
+Both of them are dynamically adjusted to use some of the total memory computed above. The configuration in `netdata.conf` allows providing additional memory to them, increasing their caching efficiency.
+
+## I have a Netdata Parent that is also a systemd-journal logs centralization point, what should I know?
+
+Logs usually require significantly more disk space and I/O bandwidth than metrics. For optimal performance, we recommend storing metrics and logs on separate, independent disks.
+
+Netdata uses direct-I/O for its database, so that it does not pollute the system caches with its own data. We want Netdata to be a nice citizen when it runs side-by-side with production applications, so this was required to guarantee that Netdata does not affect the operation of databases or other sensitive applications running on the same servers.
+
+To optimize disk I/O, Netdata maintains its own private caches. The default settings of these caches are automatically adjusted to the minimum required size for acceptable metrics query performance.
+
+`systemd-journal` on the other hand, relies on operating system caches for improving the query performance of logs. When the system lacks free memory, querying logs leads to increased disk I/O.
+
+If you are experiencing slow responses and increased disk reads when metrics queries run, we suggest dedicating more RAM to Netdata.
+
+We frequently see that the following strategy gives the best results:
+
+1. Start the Netdata Parent, send all the load you expect it to have and let it stabilize for a few hours. Netdata will now use the minimum memory it believes is required for smooth operation.
+2. Check the available system memory.
+3. Set the page cache in `netdata.conf` to use 1/3 of the available memory.
+
+This gives Netdata queries more cache, while leaving plenty of available memory for logs and the operating system.
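+
+As a sketch of step 3, assume (hypothetically) that the Parent reports 24GiB of available memory once it has stabilized. One third of that is 8GiB, i.e. 8192MiB, so the page cache setting described earlier would become:
+
+```
+[db]
+    # 24 GiB available / 3 = 8 GiB = 8192 MiB (hypothetical figures)
+    dbengine page cache size MB = 8192
+```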
diff --git a/docs/netdata-agent/versions-and-platforms.md b/docs/netdata-agent/versions-and-platforms.md
new file mode 100644
index 00000000..787874d6
--- /dev/null
+++ b/docs/netdata-agent/versions-and-platforms.md
@@ -0,0 +1,70 @@
+# Netdata Agent Versions & Platforms
+
+Netdata is evolving rapidly and new features are added at a constant pace. Therefore, we have a frequent release cadence to deliver all these features to users as soon as possible.
+
+Netdata Agents are available in 2 versions:
+
+| Release Channel | Release Frequency | Support Policy & Features | Support Duration | Backwards Compatibility |
+|:---------------:|:---------------------------------------------:|:---------------------------------------------------------:|:----------------------------------------:|:---------------------------------------------------------------------------------:|
+| Stable | At most once per month, usually every 45 days | Receiving bug fixes and security updates between releases | Up to the 2nd stable release after them | Previous configuration semantics and data are supported by newer releases |
+| Nightly | Every night at 00:00 UTC | Latest pre-released features | Up to the 2nd nightly release after them | Configuration and data of unreleased features may change between nightly releases |
+
+> "Support Duration" defines the time we consider the release as actively used by users in production systems, so that all features of Netdata should be working like the day they were released. However, after the latest release, previous releases stop receiving bug fixes and security updates. All users are advised to update to the latest release to get the latest bug fixes.
+
+## Binary Distribution Packages
+
+Binary distribution packages are provided by Netdata, via CI integration, for the following platforms and architectures:
+
+| Platform | Platform Versions | Released Packages Architecture | Format |
+|:-----------------------:|:--------------------------------:|:------------------------------------------------:|:------------:|
+| Docker under Linux | 19.03 and later | `x86_64`, `i386`, `ARMv7`, `AArch64`, `POWER8+` | docker image |
+| Static Builds | - | `x86_64`, `ARMv6`, `ARMv7`, `AArch64`, `POWER8+` | .gz.run |
+| Alma Linux | 8.x, 9.x | `x86_64`, `AArch64` | RPM |
+| Amazon Linux | 2, 2023 | `x86_64`, `AArch64` | RPM |
+| CentOS | 7.x | `x86_64` | RPM |
+| Debian | 10.x, 11.x, 12.x | `x86_64`, `i386`, `ARMv7`, `AArch64` | DEB |
+| Fedora | 37, 38, 39 | `x86_64`, `AArch64` | RPM |
+| OpenSUSE | Leap 15.4, Leap 15.5, Tumbleweed | `x86_64`, `AArch64` | RPM |
+| Oracle Linux | 8.x, 9.x | `x86_64`, `AArch64` | RPM |
+| Redhat Enterprise Linux | 7.x | `x86_64` | RPM |
+| Redhat Enterprise Linux | 8.x, 9.x | `x86_64`, `AArch64` | RPM |
+| Ubuntu | 20.04, 22.04, 23.10 | `x86_64`, `i386`, `ARMv7` | DEB |
+
+> IMPORTANT: Linux distributions frequently provide binary packages of Netdata. However, the packages you will find in the distributions' repositories may be outdated, incomplete, missing significant features or completely broken. We recommend using the packages we provide.
+
+## Third-party Supported Binary Packages
+
+The following distributions always provide the latest stable version of Netdata:
+
+| Platform | Platform Versions | Released Packages Architecture |
+|:----------:|:-----------------:|:------------------------------------:|
+| Arch Linux | Latest | All the Arch supported architectures |
+| MacOS Brew | Latest | All the Brew supported architectures |
+
+
+## Builds from Source
+
+We guarantee that Netdata builds from source on the platforms for which we provide automated binary packages. These platforms are automatically checked via our CI, and fixes are always applied to allow merging new code into the nightly versions.
+
+The following builds from source should usually work, although we don't regularly monitor if there are issues:
+
+| Platform | Platform Versions |
+|:-----------------------------------:|:--------------------------:|
+| Linux Distributions | Latest unreleased versions |
+| FreeBSD and derivatives | 13-STABLE |
+| Gentoo and derivatives | Latest |
+| Arch Linux and derivatives | latest from AUR |
+| MacOS | 11, 12, 13 |
+| Linux under Microsoft Windows (WSL) | Latest |
+
+## Static Builds and Unsupported Linux Versions
+
+The static builds of Netdata can be used on any Linux platform of the supported architectures. The only requirement these static builds have is a working Linux kernel, any version. Everything else required for Netdata to run is inside the package itself.
+
+Static builds usually lack certain features that require operating-system support and cannot be provided in a generic way. These features include:
+
+- IPMI hardware sensors support
+- systemd-journal features
+- eBPF related features
+
+When platforms are removed from the [Binary Distribution Packages](https://github.com/netdata/netdata/blob/master/packaging/makeself/README.md) list, they default to installing or updating Netdata from a static build. This means that after a platform becomes EOL, Netdata on it may lose some of its features. We recommend upgrading the operating system before it becomes EOL, to continue using all the features of Netdata.