Diffstat (limited to 'docs/observability-centralization-points/metrics-centralization-points')
4 files changed, 16 insertions, 10 deletions
diff --git a/docs/observability-centralization-points/metrics-centralization-points/clustering-and-high-availability-of-netdata-parents.md b/docs/observability-centralization-points/metrics-centralization-points/clustering-and-high-availability-of-netdata-parents.md
index 17a10b02..412263be 100644
--- a/docs/observability-centralization-points/metrics-centralization-points/clustering-and-high-availability-of-netdata-parents.md
+++ b/docs/observability-centralization-points/metrics-centralization-points/clustering-and-high-availability-of-netdata-parents.md
@@ -45,6 +45,6 @@ The easiest way is to `rsync` the directory `/var/cache/netdata` from the existi
 
 To configure retention at the new Netdata Parent, set in `netdata.conf` the following to at least the values the old Netdata Parent has:
 
-- `[db].dbengine multihost disk space MB`, this is the max disk size for `tier0`. The default is 256MiB.
-- `[db].dbengine tier 1 multihost disk space MB`, this is the max disk space for `tier1`. The default is 50% of `tier0`.
-- `[db].dbengine tier 2 multihost disk space MB`, this is the max disk space for `tier2`. The default is 50% of `tier1`.
+- `[db].dbengine tier 0 retention size`, this is the max disk size for `tier0`. The default is 1GiB.
+- `[db].dbengine tier 1 retention size`, this is the max disk space for `tier1`. The default is 1GiB.
+- `[db].dbengine tier 2 retention size`, this is the max disk space for `tier2`. The default is 1GiB.
diff --git a/docs/observability-centralization-points/metrics-centralization-points/configuration.md b/docs/observability-centralization-points/metrics-centralization-points/configuration.md
index bf2aa98d..d1f13f05 100644
--- a/docs/observability-centralization-points/metrics-centralization-points/configuration.md
+++ b/docs/observability-centralization-points/metrics-centralization-points/configuration.md
@@ -58,7 +58,7 @@ Save the file and restart Netdata.
 
 While encrypting the connection between your parent and child nodes is recommended for security, it's not required to get started.
-This example uses self-signed certificates. 
+This example uses self-signed certificates.
 
 > **Note**
 > This section assumes you have read the documentation on [how to edit the Netdata configuration files](/docs/netdata-agent/configuration/README.md).
@@ -70,7 +70,7 @@ This example uses self-signed certificates.
 2. **Child node**
 
    Update `stream.conf` to enable SSL/TLS and allow self-signed certificates. Append ':SSL' to the destination and uncomment 'ssl skip certificate verification'.
-   ```conf
+   ```text
    [stream]
      enabled = yes
      destination = 203.0.113.0:SSL
@@ -80,8 +80,6 @@ This example uses self-signed certificates.
 
 3. Restart the Netdata Agent on both the parent and child nodes, to stream encrypted metrics using TLS/SSL.
 
-
-
 ## Troubleshooting Streaming Connections
 
 You can find any issues related to streaming at Netdata logs.
diff --git a/docs/observability-centralization-points/metrics-centralization-points/faq.md b/docs/observability-centralization-points/metrics-centralization-points/faq.md
index 027dfc74..1ce0d853 100644
--- a/docs/observability-centralization-points/metrics-centralization-points/faq.md
+++ b/docs/observability-centralization-points/metrics-centralization-points/faq.md
@@ -65,6 +65,14 @@ It depends on the ephemerality setting of each Netdata Child.
 
 2. **Ephemeral nodes**: These are nodes that are ephemeral by nature and they may shutdown at any point in time without any impact on the services you run.
 
-To set the ephemeral flag on a node, edit its netdata.conf and in the `[health]` section set `is ephemeral = yes`. This setting is propagated to parent nodes and Netdata Cloud.
+To set the ephemeral flag on a node, edit its netdata.conf and in the `[global]` section set `is ephemeral node = yes`. This setting is propagated to parent nodes and Netdata Cloud.
+
+A parent node tracks connections and disconnections. When a node is marked as ephemeral and stops connecting for more than 24 hours, the parent will delete it from its memory and local administration, and tell Cloud that it is no longer live nor stale. Data for the node can no longer be accessed, but if the node connects again later, the node will be "revived", and previous data becomes available again.
+
+A node can be forced into this "forgotten" state with the Netdata CLI tool on the parent the node is connected to (if still connected) or one of the parent agents it was previously connected to. The state will be propagated _upwards_ and _sideways_ in case of an HA setup.
+
+```
+netdatacli remove-stale-node <node_id | machine_guid | hostname | ALL_NODES>
+```
 
 When using Netdata Cloud (via a parent or directly) and a permanent node gets disconnected, Netdata Cloud sends node disconnection notifications.
diff --git a/docs/observability-centralization-points/metrics-centralization-points/replication-of-past-samples.md b/docs/observability-centralization-points/metrics-centralization-points/replication-of-past-samples.md
index 5c776b86..e0c60e89 100644
--- a/docs/observability-centralization-points/metrics-centralization-points/replication-of-past-samples.md
+++ b/docs/observability-centralization-points/metrics-centralization-points/replication-of-past-samples.md
@@ -45,13 +45,13 @@ The following `netdata.conf` configuration parameters affect replication.
 
 On the receiving side (Netdata Parent):
 
-- `[db].seconds to replicate` limits the maximum time to be replicated. The default is 1 day (86400 seconds). Keep in mind that replication is also limited by the `tier0` retention the sending side has.
+- `[db].replication period` limits the maximum time to be replicated. The default is 1 day. Keep in mind that replication is also limited by the `tier0` retention the sending side has.
 
 On the sending side (Netdata Children, or Netdata Parent when parents are clustered):
 
 - `[db].replication threads` controls how many concurrent threads will be replicating metrics. The default is 1. Usually the performance is about 2 million samples per second per thread, so increasing this number may allow replication to progress faster between Netdata Parents.
 
-- `[db].cleanup obsolete charts after secs` controls for how much time after metrics stop being collected will not be available for replication. The default is 1 hour (3600 seconds). If you plan to have scheduled maintenance on Netdata Parents of more than 1 hour, we recommend increasing this setting. Keep in mind however, that increasing this duration in highly ephemeral environments can have an impact on RAM utilization, since metrics will be considered as collected for longer durations.
+- `[db].cleanup obsolete charts after` controls for how much time after metrics stop being collected will not be available for replication. The default is 1 hour (3600 seconds). If you plan to have scheduled maintenance on Netdata Parents of more than 1 hour, we recommend increasing this setting. Keep in mind however, that increasing this duration in highly ephemeral environments can have an impact on RAM utilization, since metrics will be considered as collected for longer durations.
 
 ## Monitoring Replication Progress
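For reference, the post-rename `[db]` option names touched by this change fit together in a single `netdata.conf` fragment like this (a sketch only; the sizes and durations shown are illustrative values, not the defaults, and the duration/size syntax is an assumption):

```text
[db]
    # per-tier retention caps on a Netdata Parent (defaults: 1GiB each)
    dbengine tier 0 retention size = 10GiB
    dbengine tier 1 retention size = 10GiB
    dbengine tier 2 retention size = 10GiB

    # replication tuning (defaults: 1 day, 1 thread, 1 hour)
    replication period = 1d
    replication threads = 2
    cleanup obsolete charts after = 1h
```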