summaryrefslogtreecommitdiffstats
path: root/docs/guides/monitor
diff options
context:
space:
mode:
Diffstat (limited to 'docs/guides/monitor')
-rw-r--r--docs/guides/monitor/anomaly-detection-python.md36
-rw-r--r--docs/guides/monitor/anomaly-detection.md18
-rw-r--r--docs/guides/monitor/dimension-templates.md37
-rw-r--r--docs/guides/monitor/kubernetes-k8s-netdata.md28
-rw-r--r--docs/guides/monitor/lamp-stack.md42
-rw-r--r--docs/guides/monitor/pi-hole-raspberry-pi.md26
-rw-r--r--docs/guides/monitor/process.md231
-rw-r--r--docs/guides/monitor/raspberry-pi-anomaly-detection.md22
-rw-r--r--docs/guides/monitor/statsd.md14
-rw-r--r--docs/guides/monitor/stop-notifications-alarms.md12
-rw-r--r--docs/guides/monitor/visualize-monitor-anomalies.md28
11 files changed, 255 insertions, 239 deletions
diff --git a/docs/guides/monitor/anomaly-detection-python.md b/docs/guides/monitor/anomaly-detection-python.md
index ad8398cc6..d6d27f4e5 100644
--- a/docs/guides/monitor/anomaly-detection-python.md
+++ b/docs/guides/monitor/anomaly-detection-python.md
@@ -23,7 +23,7 @@ library](https://github.com/yzhao062/pyod/tree/master), which periodically runs
quantify how anomalous certain charts are.
All these metrics and alarms are available for centralized monitoring in [Netdata Cloud](https://app.netdata.cloud). If
-you choose to sign up for Netdata Cloud and [connect your nodes](/claim/README.md), you will have the ability to run
+you choose to sign up for Netdata Cloud and [connect your nodes](https://github.com/netdata/netdata/blob/master/claim/README.md), you will have the ability to run
tailored anomaly detection on every node in your infrastructure, regardless of its purpose or workload.
In this guide, you'll learn how to set up the anomalies collector to instantly detect anomalies in an Nginx web server
@@ -35,9 +35,9 @@ server](https://user-images.githubusercontent.com/1153921/103586700-da5b0a00-4ea
## Prerequisites
-- A node running the Netdata Agent. If you don't yet have that, [get Netdata](/docs/get-started.mdx).
+- A node running the Netdata Agent. If you don't yet have that, [get Netdata](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx).
- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
-- Familiarity with configuring the Netdata Agent with [`edit-config`](/docs/configure/nodes.md).
+- Familiarity with configuring the Netdata Agent with [`edit-config`](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md).
- _Optional_: An Nginx web server running on the same node to follow the example configuration steps.
## Install required Python packages
@@ -65,7 +65,7 @@ Use `exit` to become your normal user again.
## Enable the anomalies collector
-Navigate to your [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory) and use `edit-config`
+Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory) and use `edit-config`
to open the `python.d.conf` file.
```bash
@@ -79,8 +79,8 @@ yourself if it doesn't already exist. Either way, the final result should look l
anomalies: yes
```
-[Restart the Agent](/docs/configure/start-stop-restart.md) with `sudo systemctl restart netdata`, or the [appropriate
-method](/docs/configure/start-stop-restart.md) for your system, to start up the anomalies collector. By default, the
+[Restart the Agent](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) with `sudo systemctl restart netdata`, or the [appropriate
+method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start up the anomalies collector. By default, the
model training process runs every 30 minutes, and uses the previous 4 hours of metrics to establish a baseline for
health and performance across the default included charts.
@@ -105,7 +105,7 @@ involve tweaking the behavior of the ML training itself.
- `train_every_n`: How often to train the ML models.
- `train_n_secs`: The number of historical observations to train each model on. The default is 4 hours, but if your node
doesn't have historical metrics going back that far, consider [changing the metrics retention
- policy](/docs/store/change-metrics-storage.md) or reducing this window.
+ policy](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md) or reducing this window.
- `custom_models`: A way to define custom models that you want anomaly probabilities for, including multi-node or
streaming setups.
@@ -119,8 +119,8 @@ involve tweaking the behavior of the ML training itself.
As mentioned above, this guide uses an Nginx web server to demonstrate how the anomalies collector works. You must
configure the collector to monitor charts from the
-[Nginx](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) and [web
-log](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog) collectors.
+[Nginx](https://github.com/netdata/go.d.plugin/blob/master/modules/nginx/README.md) and [web
+log](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md) collectors.
`charts_regex` allows for some basic regex, such as wildcards (`*`) to match all contexts with a certain pattern. For
example, `system\..*` matches with any chart with a context that begins with `system.`, and ends in any number of other
@@ -163,27 +163,27 @@ volume of requests/responses, not, for example, which type of 4xx response a use
dimensions](https://user-images.githubusercontent.com/1153921/102820642-d69f9180-4392-11eb-91c5-d3d166d40105.png)
Apply the ideas behind the collector's regex and exclude settings to any other
-[system](/docs/collect/system-metrics.md), [container](/docs/collect/container-metrics.md), or
-[application](/docs/collect/application-metrics.md) metrics you want to detect anomalies for.
+[system](https://github.com/netdata/netdata/blob/master/docs/collect/system-metrics.md), [container](https://github.com/netdata/netdata/blob/master/docs/collect/container-metrics.md), or
+[application](https://github.com/netdata/netdata/blob/master/docs/collect/application-metrics.md) metrics you want to detect anomalies for.
## What's next?
Now that you know how to set up unsupervised anomaly detection in the Netdata Agent, using an Nginx web server as an
example, it's time to apply that knowledge to other mission-critical parts of your infrastructure. If you're not sure
-what to monitor next, check out our list of [collectors](/collectors/COLLECTORS.md) to see what kind of metrics Netdata
+what to monitor next, check out our list of [collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md) to see what kind of metrics Netdata
can collect from your systems, containers, and applications.
-Keep on moving to [part 2](/docs/guides/monitor/visualize-monitor-anomalies.md), which covers the charts and alarms
+Keep on moving to [part 2](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/visualize-monitor-anomalies.md), which covers the charts and alarms
Netdata creates for unsupervised anomaly detection.
For a different troubleshooting experience, try out the [Metric
-Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) feature in Netdata Cloud. Metric
+Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) feature in Netdata Cloud. Metric
Correlations helps you perform faster root cause analysis by narrowing a dashboard to only the charts most likely to be
related to an anomaly.
### Related reference documentation
-- [Netdata Agent · Anomalies collector](/collectors/python.d.plugin/anomalies/README.md)
-- [Netdata Agent · Nginx collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx)
-- [Netdata Agent · web log collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog)
-- [Netdata Cloud · Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations)
+- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)
+- [Netdata Agent · Nginx collector](https://github.com/netdata/go.d.plugin/blob/master/modules/nginx/README.md)
+- [Netdata Agent · web log collector](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md)
+- [Netdata Cloud · Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md)
diff --git a/docs/guides/monitor/anomaly-detection.md b/docs/guides/monitor/anomaly-detection.md
index e98c5c02e..ce819d937 100644
--- a/docs/guides/monitor/anomaly-detection.md
+++ b/docs/guides/monitor/anomaly-detection.md
@@ -14,27 +14,27 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/moni
As of [`v1.32.0`](https://github.com/netdata/netdata/releases/tag/v1.32.0), Netdata comes with some ML powered [anomaly detection](https://en.wikipedia.org/wiki/Anomaly_detection) capabilities built into it and available to use out of the box, with zero configuration required (ML was enabled by default in `v1.35.0-29-nightly` in [this PR](https://github.com/netdata/netdata/pull/13158), previously it required a one line config change).
-This means that in addition to collecting raw value metrics, the Netdata agent will also produce an [`anomaly-bit`](https://learn.netdata.cloud/docs/agent/ml#anomaly-bit---100--anomalous-0--normal) every second which will be `100` when recent raw metric values are considered anomalous by Netdata and `0` when they look normal. Once we aggregate beyond one second intervals this aggregated `anomaly-bit` becomes an ["anomaly rate"](https://learn.netdata.cloud/docs/agent/ml#anomaly-rate---averageanomaly-bit).
+This means that in addition to collecting raw value metrics, the Netdata agent will also produce an [`anomaly-bit`](https://github.com/netdata/netdata/blob/master/ml/README.md#anomaly-bit---100--anomalous-0--normal) every second which will be `100` when recent raw metric values are considered anomalous by Netdata and `0` when they look normal. Once we aggregate beyond one second intervals this aggregated `anomaly-bit` becomes an ["anomaly rate"](https://github.com/netdata/netdata/blob/master/ml/README.md#anomaly-rate---averageanomaly-bit).
-To be as concrete as possible, the below api call shows how to access the raw anomaly bit of the `system.cpu` chart from the [london.my-netdata.io](https://london.my-netdata.io) Netdata demo server. Passing `options=anomaly-bit` returns the anomay bit instead of the raw metric value.
+To be as concrete as possible, the below api call shows how to access the raw anomaly bit of the `system.cpu` chart from the [london.my-netdata.io](https://london.my-netdata.io) Netdata demo server. Passing `options=anomaly-bit` returns the anomaly bit instead of the raw metric value.
```
https://london.my-netdata.io/api/v1/data?chart=system.cpu&options=anomaly-bit
```
-If we aggregate the above to just 1 point by adding `points=1` we get an "[Anomaly Rate](https://learn.netdata.cloud/docs/agent/ml#anomaly-rate---averageanomaly-bit)":
+If we aggregate the above to just 1 point by adding `points=1` we get an "[Anomaly Rate](https://github.com/netdata/netdata/blob/master/ml/README.md#anomaly-rate---averageanomaly-bit)":
```
https://london.my-netdata.io/api/v1/data?chart=system.cpu&options=anomaly-bit&points=1
```
-The fundamentals of Netdata's anomaly detection approach and implmentation are covered in lots more detail in the [agent ML documentation](https://learn.netdata.cloud/docs/agent/ml).
+The fundamentals of Netdata's anomaly detection approach and implementation are covered in lots more detail in the [agent ML documentation](https://github.com/netdata/netdata/blob/master/ml/README.md).
This guide will explain how to get started using these ML based anomaly detection capabilities within Netdata.
## Anomaly Advisor
-The [Anomaly Advisor](https://learn.netdata.cloud/docs/cloud/insights/anomaly-advisor) is the flagship anomaly detection feature within Netdata. In the "Anomalies" tab of Netdata you will see an overall "Anomaly Rate" chart that aggregates node level anomaly rate for all nodes in a space. The aim of this chart is to make it easy to quickly spot periods of time where the overall "[node anomaly rate](https://learn.netdata.cloud/docs/agent/ml#node-anomaly-rate)" is evelated in some unusual way and for what node or nodes this relates to.
+The [Anomaly Advisor](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/anomaly-advisor.mdx) is the flagship anomaly detection feature within Netdata. In the "Anomalies" tab of Netdata you will see an overall "Anomaly Rate" chart that aggregates node level anomaly rate for all nodes in a space. The aim of this chart is to make it easy to quickly spot periods of time where the overall "[node anomaly rate](https://github.com/netdata/netdata/blob/master/ml/README.md#node-anomaly-rate)" is elevated in some unusual way and for what node or nodes this relates to.
![image](https://user-images.githubusercontent.com/2178292/175928290-490dd8b9-9c55-4724-927e-e145cb1cc837.png)
@@ -44,7 +44,7 @@ Once an area on the Anomaly Rate chart is highlighted netdata will append a "hea
## Embedded Anomaly Rate Charts
-Charts in both the [Overview](https://learn.netdata.cloud/docs/cloud/visualize/overview) and [single node dashboard](https://learn.netdata.cloud/docs/cloud/visualize/overview#jump-to-single-node-dashboards) tabs also expose the underlying anomaly rates for each dimension so users can easily see if the raw metrics are considered anomalous or not by Netdata.
+Charts in both the [Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md) and [single node dashboard](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md#jump-to-single-node-dashboards) tabs also expose the underlying anomaly rates for each dimension so users can easily see if the raw metrics are considered anomalous or not by Netdata.
Pressing the anomalies icon (next to the information icon in the chart header) will expand the anomaly rate chart to make it easy to see how the anomaly rate for any individual dimension corresponds to the raw underlying data. In the example below we can see that the spike in `system.pgpgio|in` corresponded in the anomaly rate for that dimension jumping to 100% for a small period of time until the spike passed.
@@ -65,9 +65,9 @@ You can see some example ML based alert configurations below:
Check out the resources below to learn more about how Netdata is approaching ML:
-- [Agent ML documentation](https://learn.netdata.cloud/docs/agent/ml).
-- [Anomaly Advisor documentation](https://learn.netdata.cloud/docs/cloud/insights/anomaly-advisor).
-- [Metric Correlations documentation](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations).
+- [Agent ML documentation](https://github.com/netdata/netdata/blob/master/ml/README.md).
+- [Anomaly Advisor documentation](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/anomaly-advisor.mdx).
+- [Metric Correlations documentation](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md).
- Anomaly Advisor [launch blog post](https://www.netdata.cloud/blog/introducing-anomaly-advisor-unsupervised-anomaly-detection-in-netdata/).
- Netdata Approach to ML [blog post](https://www.netdata.cloud/blog/our-approach-to-machine-learning/).
- `areal/ml` related [GitHub Discussions](https://github.com/netdata/netdata/discussions?discussions_q=label%3Aarea%2Fml).
diff --git a/docs/guides/monitor/dimension-templates.md b/docs/guides/monitor/dimension-templates.md
index 539127366..d2795a9c6 100644
--- a/docs/guides/monitor/dimension-templates.md
+++ b/docs/guides/monitor/dimension-templates.md
@@ -8,24 +8,27 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/moni
Your ability to monitor the health of your systems and applications relies on your ability to create and maintain
the best set of alarms for your particular needs.
-In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplifies the process of writing [alarm
-entities](/health/REFERENCE.md#health-entity-reference) for charts with many dimensions.
+In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplifies the process of
+writing [alarm entities](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#health-entity-reference) for
+charts with many dimensions.
Dimension templates can condense many individual entities into one—no more copy-pasting one entity and changing the
`alarm`/`template` and `lookup` lines for each dimension you'd like to monitor.
They are, however, an advanced health monitoring feature. For more basic instructions on creating your first alarm,
-check out our [health monitoring documentation](/health/README.md), which also includes
-[examples](/health/REFERENCE.md#example-alarms).
+check out our [health monitoring documentation](https://github.com/netdata/netdata/blob/master/health/README.md), which also includes
+[examples](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#example-alarms).
## The fundamentals of `foreach`
-Our dimension templates update creates a new `foreach` parameter to the existing [`lookup`
-line](/health/REFERENCE.md#alarm-line-lookup). This is where the magic happens.
+Our dimension templates update creates a new `foreach` parameter to the
+existing [`lookup` line](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-lookup). This
+is where the magic happens.
You use the `foreach` parameter to specify which dimensions you want to monitor with this single alarm. You can separate
-them with a comma (`,`) or a pipe (`|`). You can also use a [Netdata simple pattern](/libnetdata/simple_pattern/README.md)
-to create many alarms with a regex-like syntax.
+them with a comma (`,`) or a pipe (`|`). You can also use
+a [Netdata simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md) to create
+many alarms with a regex-like syntax.
The `foreach` parameter _has_ to be the last parameter in your `lookup` line, and if you have both `of` and `foreach` in
the same `lookup` line, Netdata will ignore the `of` parameter and use `foreach` instead.
@@ -95,7 +98,7 @@ Let's look at some other examples of how `foreach` works so you can best apply i
In the last example, we used `foreach system,user,nice` to create three distinct alarms using dimension templates. But
what if you want to quickly create alarms for _all_ the dimensions of a given chart?
-Use a [simple pattern](/libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
+Use a [simple pattern](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
(`*`).
Instead of monitoring system CPU usage, let's monitor per-application CPU usage using the `apps.cpu` chart. Passing a
@@ -113,14 +116,15 @@ lookup: average -10m percentage foreach *
This entity will now create alarms for every dimension in the `apps.cpu` chart. Given that most `apps.cpu` charts have
10 or more dimensions, using the wildcard ensures you catch every CPU-hogging process.
-To learn more about how to use simple patterns with dimension templates, see our [simple patterns
-documentation](/libnetdata/simple_pattern/README.md).
+To learn more about how to use simple patterns with dimension templates, see
+our [simple patterns documentation](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md).
## Using `foreach` with alarm templates
-Dimension templates also work with [alarm templates](/health/REFERENCE.md#alarm-line-alarm-or-template). Alarm
-templates help you create alarms for all the charts with a given context—for example, all the cores of your system's
-CPU.
+Dimension templates also work
+with [alarm templates](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-alarm-or-template).
+Alarm templates help you create alarms for all the charts with a given context—for example, all the cores of your
+system's CPU.
By combining the two, you can create dozens of individual alarms with a single template entity. Here's how you would
create alarms for the `system`, `user`, and `nice` dimensions for every chart in the `cpu.cpu` context—or, in other
@@ -170,7 +174,8 @@ alarms that will help you better monitor the health of your systems.
Or, at the very least, simplify your configuration files.
-For information about other advanced features in Netdata's health monitoring toolkit, check out our [health
-documentation](/health/README.md). And if you have some cool alarms you built using dimension templates,
+For information about other advanced features in Netdata's health monitoring toolkit, check out
+our [health documentation](https://github.com/netdata/netdata/blob/master/health/README.md). And if you have some cool
+alarms you built using dimension templates,
diff --git a/docs/guides/monitor/kubernetes-k8s-netdata.md b/docs/guides/monitor/kubernetes-k8s-netdata.md
index 5cfefe892..5732fc96c 100644
--- a/docs/guides/monitor/kubernetes-k8s-netdata.md
+++ b/docs/guides/monitor/kubernetes-k8s-netdata.md
@@ -46,7 +46,7 @@ To follow this tutorial, you need:
- A free Netdata Cloud account. [Sign up](https://app.netdata.cloud/sign-up?cloudRoute=/spaces) if you don't have one
already.
- A working cluster running Kubernetes v1.9 or newer, with a Netdata deployment and connected parent/child nodes. See
- our [Kubernetes deployment process](/packaging/installer/methods/kubernetes.md) for details on deployment and
+ our [Kubernetes deployment process](https://github.com/netdata/netdata/blob/master/packaging/installer/methods/kubernetes.md) for details on deployment and
conneting to Cloud.
- The [`kubectl`](https://kubernetes.io/docs/reference/kubectl/overview/) command line tool, within [one minor version
difference](https://kubernetes.io/docs/tasks/tools/install-kubectl/#before-you-begin) of your cluster, on an
@@ -104,7 +104,7 @@ To get started, [sign in](https://app.netdata.cloud/sign-in?cloudRoute=/spaces)
to the War Room you connected your cluster to, if not **General**.
Netdata Cloud is already visualizing your Kubernetes metrics, streamed in real-time from each node, in the
-[Overview](https://learn.netdata.cloud/docs/cloud/visualize/overview):
+[Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md):
![Netdata's Kubernetes monitoring
dashboard](https://user-images.githubusercontent.com/1153921/109037415-eafc5500-7687-11eb-8773-9b95941e3328.png)
@@ -126,8 +126,8 @@ cluster](https://user-images.githubusercontent.com/1153921/109042169-19c8fa00-76
For example, the chart above shows a spike in the CPU utilization from `rabbitmq` every minute or so, along with a
baseline CPU utilization of 10-15% across the cluster.
-Read about the [Overview](https://learn.netdata.cloud/docs/cloud/visualize/overview) and some best practices on [viewing
-an overview of your infrastructure](/docs/visualize/overview-infrastructure.md) for details on using composite charts to
+Read about the [Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md) and some best practices on [viewing
+an overview of your infrastructure](https://github.com/netdata/netdata/blob/master/docs/visualize/overview-infrastructure.md) for details on using composite charts to
drill down into per-node performance metrics.
## Pod and container metrics
@@ -154,7 +154,7 @@ Let's explore the most colorful box by hovering over it.
container](https://user-images.githubusercontent.com/1153921/109049544-a8417980-7695-11eb-80a7-109b4a645a27.png)
The **Context** tab shows `rabbitmq-5bb66bb6c9-6xr5b` as the container's image name, which means this container is
-running a [RabbitMQ](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/rabbitmq) workload.
+running a [RabbitMQ](https://github.com/netdata/go.d.plugin/blob/master/modules/rabbitmq/README.md) workload.
Click the **Metrics** tab to see real-time metrics from that container. Unsurprisingly, it shows a spike in CPU
utilization at regular intervals.
@@ -173,7 +173,7 @@ different namespaces.
![Time-series Kubernetes monitoring in Netdata
Cloud](https://user-images.githubusercontent.com/1153921/109075210-126a1680-76b6-11eb-918d-5acdcdac152d.png)
-Each composite chart has a [definition bar](https://learn.netdata.cloud/docs/cloud/visualize/overview#definition-bar)
+Each composite chart has a [definition bar](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md#definition-bar)
for complete customization. For example, grouping the top chart by `k8s_container_name` reveals new information.
![Changing time-series charts](https://user-images.githubusercontent.com/1153921/109075212-139b4380-76b6-11eb-836f-939482ae55fc.png)
@@ -183,20 +183,20 @@ for complete customization. For example, grouping the top chart by `k8s_containe
Netdata has a [service discovery plugin](https://github.com/netdata/agent-service-discovery), which discovers and
creates configuration files for [compatible
services](https://github.com/netdata/helmchart#service-discovery-and-supported-services) and any endpoints covered by
-our [generic Prometheus collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus).
+our [generic Prometheus collector](https://github.com/netdata/go.d.plugin/blob/master/modules/prometheus/README.md).
Netdata uses these files to collect metrics from any compatible application as they run _inside_ of a pod. Service
discovery happens without manual intervention as pods are created, destroyed, or moved between nodes.
Service metrics show up on the Overview as well, beneath the **Kubernetes** section, and are labeled according to the
service in question. For example, the **RabbitMQ** section has numerous charts from the [`rabbitmq`
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/rabbitmq):
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/rabbitmq/README.md):
![Finding service discovery
metrics](https://user-images.githubusercontent.com/1153921/109054511-2eac8a00-769b-11eb-97f1-da93acb4b5fe.png)
> The robot-shop cluster has more supported services, such as MySQL, which are not visible with zero configuration. This
> is usually because of services running on non-default ports, using non-default names, or required passwords. Read up
-> on [configuring service discovery](/packaging/installer/methods/kubernetes.md#configure-service-discovery) to collect
+> on [configuring service discovery](https://github.com/netdata/netdata/blob/master/packaging/installer/methods/kubernetes.md#configure-service-discovery) to collect
> more service metrics.
Service metrics are essential to infrastructure monitoring, as they're the best indicator of the end-user experience,
@@ -210,7 +210,7 @@ Netdata also automatically collects metrics from two essential Kubernetes proces
The **k8s kubelet** section visualizes metrics from the Kubernetes agent responsible for managing every pod on a given
node. This also happens without any configuration thanks to the [kubelet
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubelet).
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/k8s_kubelet/README.md).
Monitoring each node's kubelet can be invaluable when diagnosing issues with your Kubernetes cluster. For example, you
can see if the number of running containers/pods has dropped, which could signal a fault or crash in a particular
@@ -226,7 +226,7 @@ configuration-related errors, and the actual vs. desired numbers of volumes, plu
The **k8s kube-proxy** section displays metrics about the network proxy that runs on each node in your Kubernetes
cluster. kube-proxy lets pods communicate with each other and accept sessions from outside your cluster. Its metrics are
collected by the [kube-proxy
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubeproxy).
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/k8s_kubeproxy/README.md).
With Netdata, you can monitor how often your k8s proxies are syncing proxy rules between nodes. Dramatic changes in
these figures could indicate an anomaly in your cluster that's worthy of further investigation.
@@ -246,9 +246,9 @@ clusters of all sizes.
- [Netdata Helm chart](https://github.com/netdata/helmchart)
- [Netdata service discovery](https://github.com/netdata/agent-service-discovery)
- [Netdata Agent · `kubelet`
- collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubelet)
+ collector](https://github.com/netdata/go.d.plugin/blob/master/modules/k8s_kubelet/README.md)
- [Netdata Agent · `kube-proxy`
- collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubeproxy)
-- [Netdata Agent · `cgroups.plugin`](/collectors/cgroups.plugin/README.md)
+ collector](https://github.com/netdata/go.d.plugin/blob/master/modules/k8s_kubeproxy/README.md)
+- [Netdata Agent · `cgroups.plugin`](https://github.com/netdata/netdata/blob/master/collectors/cgroups.plugin/README.md)
diff --git a/docs/guides/monitor/lamp-stack.md b/docs/guides/monitor/lamp-stack.md
index 29b35e142..165888c4b 100644
--- a/docs/guides/monitor/lamp-stack.md
+++ b/docs/guides/monitor/lamp-stack.md
@@ -58,7 +58,7 @@ To follow this tutorial, you need:
## Install the Netdata Agent
If you don't have the free, open-source Netdata monitoring agent installed on your node yet, get started with a [single
-kickstart command](/docs/get-started.mdx):
+kickstart command](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx):
<OneLineInstallWget/>
@@ -68,15 +68,15 @@ replacing `NODE` with the hostname or IP address of your system.
## Enable hardware and Linux system monitoring
-There's nothing you need to do to enable [system monitoring](/docs/collect/system-metrics.md) and Linux monitoring with
+There's nothing you need to do to enable [system monitoring](https://github.com/netdata/netdata/blob/master/docs/collect/system-metrics.md) and Linux monitoring with
the Netdata Agent, which autodetects metrics from CPUs, memory, disks, networking devices, and Linux processes like
systemd without any configuration. If you're using containers, Netdata automatically collects resource utilization
-metrics from each using the [cgroups data collector](/collectors/cgroups.plugin/README.md).
+metrics from each using the [cgroups data collector](https://github.com/netdata/netdata/blob/master/collectors/cgroups.plugin/README.md).
## Enable Apache monitoring
Let's begin by configuring Apache to work with Netdata's [Apache data
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache).
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/apache/README.md).
Actually, there's nothing for you to do to enable Apache monitoring with Netdata.
@@ -87,7 +87,7 @@ metrics](https://httpd.apache.org/docs/2.4/mod/mod_status.html), which is just _
## Enable web log monitoring
The Netdata Agent also comes with a [web log
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog), which reads Apache's access
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md), which reads Apache's access
log file, processes each line, and converts them into per-second metrics. On Debian systems, it reads the file at
`/var/log/apache2/access.log`.
@@ -100,7 +100,7 @@ monitoring.
Because your MySQL database is password-protected, you do need to tell MySQL to allow the `netdata` user to connect to
without a password. Netdata's [MySQL data
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql) collects metrics in _read-only_
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/mysql/README.md) collects metrics in _read-only_
mode, without being able to alter or affect operations in any way.
First, log into the MySQL shell. Then, run the following three commands, one at a time:
@@ -112,15 +112,15 @@ FLUSH PRIVILEGES;
```
Run `sudo systemctl restart netdata`, or the [appropriate alternative for your
-system](/docs/configure/start-stop-restart.md), to collect dozens of metrics every second for robust MySQL monitoring.
+system](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md), to collect dozens of metrics every second for robust MySQL monitoring.
## Enable PHP monitoring
Unlike Apache or MySQL, PHP isn't a service that you can monitor directly, unless you instrument a PHP-based application
-with [StatsD](/collectors/statsd.plugin/README.md).
+with [StatsD](https://github.com/netdata/netdata/blob/master/collectors/statsd.plugin/README.md).
However, if you use [PHP-FPM](https://php-fpm.org/) in your LAMP stack, you can monitor that process with our [PHP-FPM
-data collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/phpfpm).
+data collector](https://github.com/netdata/go.d.plugin/blob/master/modules/phpfpm/README.md).
Open your PHP-FPM configuration for editing, replacing `7.4` with your version of PHP:
@@ -166,12 +166,12 @@ If the Netdata Agent isn't already open in your browser, open a new tab and navi
> If you [signed up](https://app.netdata.cloud/sign-up?cloudRoute=/spaces) for Netdata Cloud earlier, you can also view
> the exact same LAMP stack metrics there, plus additional features, like drag-and-drop custom dashboards. Be sure to
-> [connecting your node](/claim/README.md) to start streaming metrics to your browser through Netdata Cloud.
+> [connecting your node](https://github.com/netdata/netdata/blob/master/claim/README.md) to start streaming metrics to your browser through Netdata Cloud.
Netdata automatically organizes all metrics and charts onto a single page for easy navigation. Peek at gauges to see
overall system performance, then scroll down to see more. Click-and-drag with your mouse to pan _all_ charts back and
forth through different time intervals, or hold `SHIFT` and use the scrollwheel (or two-finger scroll) to zoom in and
-out. Check out our doc on [interacting with charts](/docs/visualize/interact-dashboards-charts.md) for all the details.
+out. Check out our doc on [interacting with charts](https://github.com/netdata/netdata/blob/master/docs/visualize/interact-dashboards-charts.md) for all the details.
![The Netdata
dashboard](https://user-images.githubusercontent.com/1153921/109520555-98e17800-7a69-11eb-86ec-16f689da4527.png)
@@ -205,15 +205,15 @@ Here's a quick reference for what charts you might want to focus on after settin
The Netdata Agent comes with hundreds of pre-configured alarms to help you keep tabs on your system, including 19 alarms
designed for smarter LAMP stack monitoring.
-Click the 🔔 icon in the top navigation to [see active alarms](/docs/monitor/view-active-alarms.md). The **Active** tabs
+Click the 🔔 icon in the top navigation to [see active alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/view-active-alarms.md). The **Active** tabs
shows any alarms currently triggered, while the **All** tab displays a list of _every_ pre-configured alarm. The
![An example of LAMP stack
alarms](https://user-images.githubusercontent.com/1153921/109524120-5883f900-7a6d-11eb-830e-0e7baaa28163.png)
-[Tweak alarms](/docs/monitor/configure-alarms.md) based on your infrastructure monitoring needs, and to see these alarms
+[Tweak alarms](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md) based on your infrastructure monitoring needs, and to see these alarms
in other places, like your inbox or a Slack channel, [enable a notification
-method](/docs/monitor/enable-notifications.md).
+method](https://github.com/netdata/netdata/blob/master/docs/monitor/enable-notifications.md).
## What's next?
@@ -223,7 +223,7 @@ services. The per-second metrics granularity means you have the most accurate in
any LAMP-related issues.
Another powerful way to monitor the availability of a LAMP stack is the [`httpcheck`
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/httpcheck), which pings a web server at
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/httpcheck/README.md), which pings a web server at
a regular interval and tells you whether if and how quickly it's responding. The `response_match` option also lets you
monitor when the web server's response isn't what you expect it to be, which might happen if PHP-FPM crashes, for
example.
@@ -233,14 +233,14 @@ we're not covering it here, but it _does_ work in a single-node setup. Just don'
node crashed.
If you're planning on managing more than one node, or want to take advantage of advanced features, like finding the
-source of issues faster with [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations),
+source of issues faster with [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md),
[sign up](https://app.netdata.cloud/sign-up?cloudRoute=/spaces) for a free Netdata Cloud account.
### Related reference documentation
-- [Netdata Agent · Get started](/docs/get-started.mdx)
-- [Netdata Agent · Apache data collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache)
-- [Netdata Agent · Web log collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog)
-- [Netdata Agent · MySQL data collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql)
-- [Netdata Agent · PHP-FPM data collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/phpfpm)
+- [Netdata Agent · Get started](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx)
+- [Netdata Agent · Apache data collector](https://github.com/netdata/go.d.plugin/blob/master/modules/apache/README.md)
+- [Netdata Agent · Web log collector](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md)
+- [Netdata Agent · MySQL data collector](https://github.com/netdata/go.d.plugin/blob/master/modules/mysql/README.md)
+- [Netdata Agent · PHP-FPM data collector](https://github.com/netdata/go.d.plugin/blob/master/modules/phpfpm/README.md)
diff --git a/docs/guides/monitor/pi-hole-raspberry-pi.md b/docs/guides/monitor/pi-hole-raspberry-pi.md
index 1246d8ba1..5099d12b9 100644
--- a/docs/guides/monitor/pi-hole-raspberry-pi.md
+++ b/docs/guides/monitor/pi-hole-raspberry-pi.md
@@ -79,7 +79,7 @@ service](https://discourse.pi-hole.net/t/how-do-i-configure-my-devices-to-use-pi
finished setting up Pi-hole at this point.
As far as configuring Netdata to monitor Pi-hole metrics, there's nothing you actually need to do. Netdata's [Pi-hole
-collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/pihole) will autodetect the new service
+collector](https://github.com/netdata/go.d.plugin/blob/master/modules/pihole/README.md) will autodetect the new service
running on your Raspberry Pi and immediately start collecting metrics every second.
Restart Netdata with `sudo systemctl restart netdata`, which will then recognize that Pi-hole is running and start a
@@ -98,15 +98,15 @@ part of your system might affect another.
![The Netdata dashboard in
action](https://user-images.githubusercontent.com/1153921/80827388-b9fee100-8b98-11ea-8f60-0d7824667cd3.gif)
-If you're completely new to Netdata, look at our [step-by-step guide](/docs/guides/step-by-step/step-00.md) for a
-walkthrough of all its features. For a more expedited tour, see the [get started guide](/docs/get-started.mdx).
+If you're completely new to Netdata, look at our [step-by-step guide](https://github.com/netdata/netdata/blob/master/docs/guides/step-by-step/step-00.md) for a
+walkthrough of all its features. For a more expedited tour, see the [get started guide](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx).
### Enable temperature sensor monitoring
You need to manually enable Netdata's built-in [temperature sensor
-collector](https://learn.netdata.cloud/docs/agent/collectors/charts.d.plugin/sensors) to start collecting metrics.
+collector](https://github.com/netdata/netdata/blob/master/collectors/charts.d.plugin/sensors/README.md) to start collecting metrics.
-> Netdata uses a few plugins to manage its [collectors](/collectors/REFERENCE.md), each using a different language: Go,
+> Netdata uses a few plugins to manage its [collectors](https://github.com/netdata/netdata/blob/master/collectors/REFERENCE.md), each using a different language: Go,
> Python, Node.js, and Bash. While our Go collectors are undergoing the most active development, we still support the
> other languages. In this case, you need to enable a temperature sensor collector that's written in Bash.
@@ -124,7 +124,7 @@ Raspberry Pi temperature sensor monitoring.
### Storing historical metrics on your Raspberry Pi
By default, Netdata allocates 256 MiB in disk space to store historical metrics inside the [database
-engine](/database/engine/README.md). On the Raspberry Pi used for this guide, Netdata collects 1,500 metrics every
+engine](https://github.com/netdata/netdata/blob/master/database/engine/README.md). On the Raspberry Pi used for this guide, Netdata collects 1,500 metrics every
second, which equates to storing 3.5 days worth of historical metrics.
You can increase this allocation by editing `netdata.conf` and increasing the `dbengine multihost disk space` setting to
@@ -136,8 +136,8 @@ more than 256.
```
Use our [database sizing
-calculator](/docs/store/change-metrics-storage.md#calculate-the-system-resources-ram-disk-space-needed-to-store-metrics)
-and [guide on storing historical metrics](/docs/guides/longer-metrics-storage.md) to help you determine the right
+calculator](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md#calculate-the-system-resources-ram-disk-space-needed-to-store-metrics)
+and [guide on storing historical metrics](https://github.com/netdata/netdata/blob/master/docs/guides/longer-metrics-storage.md) to help you determine the right
setting for your Raspberry Pi.
## What's next?
@@ -146,12 +146,12 @@ Now that you're monitoring Pi-hole and your Raspberry Pi with Netdata, you can e
configure Netdata to more specific goals.
Most importantly, you can always install additional services and instantly collect metrics from many of them with our
-[300+ integrations](/collectors/COLLECTORS.md).
+[300+ integrations](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md).
-- [Optimize performance](/docs/guides/configure/performance.md) using tweaks developed for IoT devices.
-- [Stream Raspberry Pi metrics](/streaming/README.md) to a parent host for easy access or longer-term storage.
-- [Tweak alarms](/health/QUICKSTART.md) for either Pi-hole or the health of your Raspberry Pi.
-- [Export metrics to external databases](/exporting/README.md) with the exporting engine.
+- [Optimize performance](https://github.com/netdata/netdata/blob/master/docs/guides/configure/performance.md) using tweaks developed for IoT devices.
+- [Stream Raspberry Pi metrics](https://github.com/netdata/netdata/blob/master/streaming/README.md) to a parent host for easy access or longer-term storage.
+- [Tweak alarms](https://github.com/netdata/netdata/blob/master/health/QUICKSTART.md) for either Pi-hole or the health of your Raspberry Pi.
+- [Export metrics to external databases](https://github.com/netdata/netdata/blob/master/exporting/README.md) with the exporting engine.
Or, head over to [our guides](https://learn.netdata.cloud/guides/) for even more experiments and insights into
troubleshooting the health of your systems and services.
diff --git a/docs/guides/monitor/process.md b/docs/guides/monitor/process.md
index 2f46d7abc..7cc327a01 100644
--- a/docs/guides/monitor/process.md
+++ b/docs/guides/monitor/process.md
@@ -23,38 +23,46 @@ SQL queries or know a bunch of arbitrary command-line flags.
With Netdata's process monitoring, you can:
-- Benchmark/optimize performance of standard applications, like web servers or databases
-- Benchmark/optimize performance of custom applications
-- Troubleshoot CPU/memory/disk utilization issues (why is my system's CPU spiking right now?)
-- Perform granular capacity planning based on the specific needs of your infrastructure
-- Search for leaking file descriptors
-- Investigate zombie processes
+- Benchmark/optimize performance of standard applications, like web servers or databases
+- Benchmark/optimize performance of custom applications
+- Troubleshoot CPU/memory/disk utilization issues (why is my system's CPU spiking right now?)
+- Perform granular capacity planning based on the specific needs of your infrastructure
+- Search for leaking file descriptors
+- Investigate zombie processes
... and much more. Let's get started.
## Prerequisites
-- One or more Linux nodes running [Netdata](/docs/get-started.mdx). If you need more time to understand Netdata before
- following this guide, see the [infrastructure](/docs/quickstart/infrastructure.md) or
- [single-node](/docs/quickstart/single-node.md) monitoring quickstarts.
-- A general understanding of how to [configure the Netdata Agent](/docs/configure/nodes.md) using `edit-config`.
-- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
+- One or more Linux nodes running [Netdata](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx). If you
+ need more time to understand Netdata before
+ following this guide, see
+ the [infrastructure](https://github.com/netdata/netdata/blob/master/docs/quickstart/infrastructure.md) or
+ [single-node](https://github.com/netdata/netdata/blob/master/docs/quickstart/single-node.md) monitoring quickstarts.
+- A general understanding of how
+ to [configure the Netdata Agent](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md)
+ using `edit-config`.
+- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
## How does Netdata do process monitoring?
-The Netdata Agent already knows to look for hundreds of [standard applications that we support via
-collectors](/collectors/COLLECTORS.md), and groups them based on their purpose. Let's say you want to monitor a MySQL
+The Netdata Agent already knows to look for hundreds
+of [standard applications that we support via collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md),
+and groups them based on their
+purpose. Let's say you want to monitor a MySQL
database using its process. The Netdata Agent already knows to look for processes with the string `mysqld` in their
name, along with a few others, and puts them into the `sql` group. This `sql` group then becomes a dimension in all
process-specific charts.
The process and groups settings are used by two unique and powerful collectors.
-[**`apps.plugin`**](/collectors/apps.plugin/README.md) looks at the Linux process tree every second, much like `top` or
+[**`apps.plugin`**](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md) looks at the Linux
+process tree every second, much like `top` or
`ps fax`, and collects resource utilization information on every running process. It then automatically adds a layer of
meaningful visualization on top of these metrics, and creates per-process/application charts.
-[**`ebpf.plugin`**](/collectors/ebpf.plugin/README.md): Netdata's extended Berkeley Packet Filter (eBPF) collector
+[**`ebpf.plugin`**](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md): Netdata's extended
+Berkeley Packet Filter (eBPF) collector
monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management, and then hands
process-specific metrics over to `apps.plugin` for visualization. The eBPF collector also collects and visualizes
metrics on an _event frequency_, which means it captures every kernel interaction, and not just the volume of
@@ -65,55 +73,55 @@ interaction at every second in time. That's even more precise than Netdata's sta
With these collectors working in parallel, Netdata visualizes the following per-second metrics for _any_ process on your
Linux systems:
-- CPU utilization (`apps.cpu`)
- - Total CPU usage
- - User/system CPU usage (`apps.cpu_user`/`apps.cpu_system`)
-- Disk I/O
- - Physical reads/writes (`apps.preads`/`apps.pwrites`)
- - Logical reads/writes (`apps.lreads`/`apps.lwrites`)
- - Open unique files (if a file is found open multiple times, it is counted just once, `apps.files`)
-- Memory
- - Real Memory Used (non-shared, `apps.mem`)
- - Virtual Memory Allocated (`apps.vmem`)
- - Minor page faults (i.e. memory activity, `apps.minor_faults`)
-- Processes
- - Threads running (`apps.threads`)
- - Processes running (`apps.processes`)
- - Carried over uptime (since the last Netdata Agent restart, `apps.uptime`)
- - Minimum uptime (`apps.uptime_min`)
- - Average uptime (`apps.uptime_average`)
- - Maximum uptime (`apps.uptime_max`)
- - Pipes open (`apps.pipes`)
-- Swap memory
- - Swap memory used (`apps.swap`)
- - Major page faults (i.e. swap activity, `apps.major_faults`)
-- Network
- - Sockets open (`apps.sockets`)
-- eBPF file
- - Number of calls to open files. (`apps.file_open`)
- - Number of files closed. (`apps.file_closed`)
- - Number of calls to open files that returned errors.
- - Number of calls to close files that returned errors.
-- eBPF syscall
- - Number of calls to delete files. (`apps.file_deleted`)
- - Number of calls to `vfs_write`. (`apps.vfs_write_call`)
- - Number of calls to `vfs_read`. (`apps.vfs_read_call`)
- - Number of bytes written with `vfs_write`. (`apps.vfs_write_bytes`)
- - Number of bytes read with `vfs_read`. (`apps.vfs_read_bytes`)
- - Number of calls to write a file that returned errors.
- - Number of calls to read a file that returned errors.
-- eBPF process
- - Number of process created with `do_fork`. (`apps.process_create`)
- - Number of threads created with `do_fork` or `__x86_64_sys_clone`, depending on your system's kernel version. (`apps.thread_create`)
- - Number of times that a process called `do_exit`. (`apps.task_close`)
-- eBPF net
- - Number of bytes sent. (`apps.bandwidth_sent`)
- - Number of bytes received. (`apps.bandwidth_recv`)
+- CPU utilization (`apps.cpu`)
+ - Total CPU usage
+ - User/system CPU usage (`apps.cpu_user`/`apps.cpu_system`)
+- Disk I/O
+ - Physical reads/writes (`apps.preads`/`apps.pwrites`)
+ - Logical reads/writes (`apps.lreads`/`apps.lwrites`)
+ - Open unique files (if a file is found open multiple times, it is counted just once, `apps.files`)
+- Memory
+ - Real Memory Used (non-shared, `apps.mem`)
+ - Virtual Memory Allocated (`apps.vmem`)
+ - Minor page faults (i.e. memory activity, `apps.minor_faults`)
+- Processes
+ - Threads running (`apps.threads`)
+ - Processes running (`apps.processes`)
+ - Carried over uptime (since the last Netdata Agent restart, `apps.uptime`)
+ - Minimum uptime (`apps.uptime_min`)
+ - Average uptime (`apps.uptime_average`)
+ - Maximum uptime (`apps.uptime_max`)
+ - Pipes open (`apps.pipes`)
+- Swap memory
+ - Swap memory used (`apps.swap`)
+ - Major page faults (i.e. swap activity, `apps.major_faults`)
+- Network
+ - Sockets open (`apps.sockets`)
+- eBPF file
+ - Number of calls to open files. (`apps.file_open`)
+ - Number of files closed. (`apps.file_closed`)
+ - Number of calls to open files that returned errors.
+ - Number of calls to close files that returned errors.
+- eBPF syscall
+ - Number of calls to delete files. (`apps.file_deleted`)
+ - Number of calls to `vfs_write`. (`apps.vfs_write_call`)
+ - Number of calls to `vfs_read`. (`apps.vfs_read_call`)
+ - Number of bytes written with `vfs_write`. (`apps.vfs_write_bytes`)
+ - Number of bytes read with `vfs_read`. (`apps.vfs_read_bytes`)
+ - Number of calls to write a file that returned errors.
+ - Number of calls to read a file that returned errors.
+- eBPF process
+ - Number of process created with `do_fork`. (`apps.process_create`)
+ - Number of threads created with `do_fork` or `__x86_64_sys_clone`, depending on your system's kernel
+ version. (`apps.thread_create`)
+ - Number of times that a process called `do_exit`. (`apps.task_close`)
+- eBPF net
+ - Number of bytes sent. (`apps.bandwidth_sent`)
+ - Number of bytes received. (`apps.bandwidth_recv`)
As an example, here's the per-process CPU utilization chart, including a `sql` group/dimension.
-![A per-process CPU utilization chart in Netdata
-Cloud](https://user-images.githubusercontent.com/1153921/101217226-3a5d5700-363e-11eb-8610-aa1640aefb5d.png)
+![A per-process CPU utilization chart in Netdata Cloud](https://user-images.githubusercontent.com/1153921/101217226-3a5d5700-363e-11eb-8610-aa1640aefb5d.png)
## Configure the Netdata Agent to recognize a specific process
@@ -123,7 +131,8 @@ aware of hundreds of processes, and collects metrics from them automatically.
But, if you want to change the grouping behavior, add an application that isn't yet supported in the Netdata Agent, or
monitor a custom application, you need to edit the `apps_groups.conf` configuration file.
-Navigate to your [Netdata config directory](/docs/configure/nodes.md) and use `edit-config` to edit the file.
+Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md) and
+use `edit-config` to edit the file.
```bash
cd /etc/netdata # Replace this with your Netdata config directory if not at /etc/netdata.
@@ -138,7 +147,8 @@ others, and groups them into `sql`. That makes sense, since all these processes
sql: mysqld* mariad* postgres* postmaster* oracle_* ora_* sqlservr
```
-These groups are then reflected as [dimensions](/web/README.md#dimensions) within Netdata's charts.
+These groups are then reflected as [dimensions](https://github.com/netdata/netdata/blob/master/web/README.md#dimensions)
+within Netdata's charts.
![An example per-process CPU utilization chart in Netdata
Cloud](https://user-images.githubusercontent.com/1153921/101369156-352e2100-3865-11eb-9f0d-b8fac162e034.png)
@@ -153,12 +163,13 @@ shouldn't need to configure it to discover them.
However, if you're using multiple applications that the Netdata Agent groups together you may want to separate them for
more precise monitoring. If you're not running any other types of SQL databases on that node, you don't need to change
-the grouping, since you know that any MySQL is the only process contributing to the `sql` group.
+the grouping, since you know that any MySQL is the only process contributing to the `sql` group.
Let's say you're using both MySQL and PostgreSQL databases on a single node, and want to monitor their processes
-independently. Open the `apps_groups.conf` file as explained in the [section
-above](#configure-the-netdata-agent-to-recognize-a-specific-process) and scroll down until you find the `database
-servers` section. Create new groups for MySQL and PostgreSQL, and move their process queries into the unique groups.
+independently. Open the `apps_groups.conf` file as explained in
+the [section above](#configure-the-netdata-agent-to-recognize-a-specific-process) and scroll down until you find
+the `database servers` section. Create new groups for MySQL and PostgreSQL, and move their process queries into the
+unique groups.
```conf
# -----------------------------------------------------------------------------
@@ -169,17 +180,18 @@ postgres: postgres*
sql: mariad* postmaster* oracle_* ora_* sqlservr
```
-Restart Netdata with `sudo systemctl restart netdata`, or the [appropriate
-method](/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics from your
-application. Time to [visualize your process metrics](#visualize-process-metrics).
+Restart Netdata with `sudo systemctl restart netdata`, or
+the [appropriate method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics
+from your application. Time to [visualize your process metrics](#visualize-process-metrics).
### Custom applications
Let's assume you have an application that runs on the process `custom-app`. To monitor eBPF metrics for that application
separate from any others, you need to create a new group in `apps_groups.conf` and associate that process name with it.
-Open the `apps_groups.conf` file as explained in the [section
-above](#configure-the-netdata-agent-to-recognize-a-specific-process). Scroll down to `# NETDATA processes accounting`.
+Open the `apps_groups.conf` file as explained in
+the [section above](#configure-the-netdata-agent-to-recognize-a-specific-process). Scroll down
+to `# NETDATA processes accounting`.
Above that, paste in the following text, which creates a new `custom-app` group with the `custom-app` process. Replace
`custom-app` with the name of your application's Linux process. `apps_groups.conf` should now look like this:
@@ -195,26 +207,25 @@ custom-app: custom-app
...
```
-Restart Netdata with `sudo systemctl restart netdata`, or the [appropriate
-method](/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics from your
-application.
+Restart Netdata with `sudo systemctl restart netdata`, or
+the [appropriate method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics
+from your application.
## Visualize process metrics
Now that you're collecting metrics for your process, you'll want to visualize them using Netdata's real-time,
-interactive charts. Find these visualizations in the same section regardless of whether you use [Netdata
-Cloud](https://app.netdata.cloud) for infrastructure monitoring, or single-node monitoring with the local Agent's
-dashboard at `http://localhost:19999`.
+interactive charts. Find these visualizations in the same section regardless of whether you
+use [Netdata Cloud](https://app.netdata.cloud) for infrastructure monitoring, or single-node monitoring with the local
+Agent's dashboard at `http://localhost:19999`.
-If you need a refresher on all the available per-process charts, see the [above
-list](#per-process-metrics-and-charts-in-netdata).
+If you need a refresher on all the available per-process charts, see
+the [above list](#per-process-metrics-and-charts-in-netdata).
### Using Netdata's application collector (`apps.plugin`)
`apps.plugin` puts all of its charts under the **Applications** section of any Netdata dashboard.
-![Screenshot of the Applications section on a Netdata
-dashboard](https://user-images.githubusercontent.com/1153921/101401172-2ceadb80-388f-11eb-9e9a-88443894c272.png)
+![Screenshot of the Applications section on a Netdata dashboard](https://user-images.githubusercontent.com/1153921/101401172-2ceadb80-388f-11eb-9e9a-88443894c272.png)
Let's continue with the MySQL example. We can create a [test
database](https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap) in
@@ -223,11 +234,9 @@ MySQL to generate load on the `mysql` process.
`apps.plugin` immediately collects and visualizes this activity `apps.cpu` chart, which shows an increase in CPU
utilization from the `sql` group. There is a parallel increase in `apps.pwrites`, which visualizes writes to disk.
-![Per-application CPU utilization
-metrics](https://user-images.githubusercontent.com/1153921/101409725-8527da80-389b-11eb-96e9-9f401535aafc.png)
+![Per-application CPU utilization metrics](https://user-images.githubusercontent.com/1153921/101409725-8527da80-389b-11eb-96e9-9f401535aafc.png)
-![Per-application disk writing
-metrics](https://user-images.githubusercontent.com/1153921/101409728-85c07100-389b-11eb-83fd-d79dd1545b5a.png)
+![Per-application disk writing metrics](https://user-images.githubusercontent.com/1153921/101409728-85c07100-389b-11eb-83fd-d79dd1545b5a.png)
Next, the `mysqlslap` utility queries the database to provide some benchmarking load on the MySQL database. It won't
look exactly like a production database executing lots of user queries, but it gives you an idea into the possibility of
@@ -240,8 +249,7 @@ sudo mysqlslap --user=sysadmin --password --host=localhost --concurrency=50 --i
The following per-process disk utilization charts show spikes under the `sql` group at the same time `mysqlslap` was run
numerous times, with slightly different concurrency and query options.
-![Per-application disk
-metrics](https://user-images.githubusercontent.com/1153921/101411810-d08fb800-389e-11eb-85b3-f3fa41f1f887.png)
+![Per-application disk metrics](https://user-images.githubusercontent.com/1153921/101411810-d08fb800-389e-11eb-85b3-f3fa41f1f887.png)
> 💡 Click on any dimension below a chart in Netdata Cloud (or to the right of a chart on a local Agent dashboard), to
> visualize only that dimension. This can be particularly useful in process monitoring to separate one process'
@@ -256,8 +264,7 @@ For example, running the above workload shows the entire "story" how MySQL inter
processes/threads to handle a large number of SQL queries, then subsequently close the tasks as each query returns the
relevant data.
-![Per-process eBPF
-charts](https://user-images.githubusercontent.com/1153921/101412395-c8844800-389f-11eb-86d2-20c8a0f7b3c0.png)
+![Per-process eBPF charts](https://user-images.githubusercontent.com/1153921/101412395-c8844800-389f-11eb-86d2-20c8a0f7b3c0.png)
`ebpf.plugin` visualizes additional eBPF metrics, which are system-wide and not per-process, under the **eBPF** section.
@@ -267,35 +274,39 @@ Now that you have `apps_groups.conf` configured correctly, and know where to fin
Netdata's ecosystem, you can precisely monitor the health and performance of any process on your node using per-second
metrics.
-For even more in-depth troubleshooting, see our guide on [monitoring and debugging applications with
-eBPF](/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md).
+For even more in-depth troubleshooting, see our guide
+on [monitoring and debugging applications with eBPF](https://github.com/netdata/netdata/blob/master/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md).
-If the process you're monitoring also has a [supported collector](/collectors/COLLECTORS.md), now is a great time to set
+If the process you're monitoring also has
+a [supported collector](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md), now is a great time to
+set
that up if it wasn't autodetected. With both process utilization and application-specific metrics, you should have every
-piece of data needed to discover the root cause of an incident. See our [collector
-setup](/docs/collect/enable-configure.md) doc for details.
+piece of data needed to discover the root cause of an incident. See
+our [collector setup](https://github.com/netdata/netdata/blob/master/docs/collect/enable-configure.md) doc for details.
-[Create new dashboards](/docs/visualize/create-dashboards.md) in Netdata Cloud using charts from `apps.plugin`,
+[Create new dashboards](https://github.com/netdata/netdata/blob/master/docs/visualize/create-dashboards.md) in Netdata
+Cloud using charts from `apps.plugin`,
`ebpf.plugin`, and application-specific collectors to build targeted dashboards for monitoring key processes across your
infrastructure.
-Try running [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) on a node that's
-running the process(es) you're monitoring. Even if nothing is going wrong at the moment, Netdata Cloud's embedded
-intelligence helps you better understand how a MySQL database, for example, might influence a system's volume of memory
-page faults. And when an incident is afoot, use Metric Correlations to reduce mean time to resolution (MTTR) and
-cognitive load.
-
-If you want more specific metrics from your custom application, check out Netdata's [statsd
-support](/collectors/statsd.plugin/README.md). With statd, you can send detailed metrics from your application to
-Netdata and visualize them with per-second granularity. Netdata's statsd collector works with dozens of [statsd server
-implementations](https://github.com/etsy/statsd/wiki#client-implementations), which work with most application
+Try
+running [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md)
+on a node that's running the process(es) you're monitoring. Even if nothing is going wrong at the moment, Netdata
+Cloud's embedded intelligence helps you better understand how a MySQL database, for example, might influence a system's
+volume of memory page faults. And when an incident is afoot, use Metric Correlations to reduce mean time to resolution (
+MTTR) and cognitive load.
+
+If you want more specific metrics from your custom application, check out
+Netdata's [statsd support](https://github.com/netdata/netdata/blob/master/collectors/statsd.plugin/README.md). With statd, you can send detailed metrics from your
+application to Netdata and visualize them with per-second granularity. Netdata's statsd collector works with dozens of
+[statsd server implementations](https://github.com/etsy/statsd/wiki#client-implementations), which work with most application
frameworks.
### Related reference documentation
-- [Netdata Agent · `apps.plugin`](/collectors/apps.plugin/README.md)
-- [Netdata Agent · `ebpf.plugin`](/collectors/ebpf.plugin/README.md)
-- [Netdata Agent · Dashboards](/web/README.md#dimensions)
-- [Netdata Agent · MySQL collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql)
+- [Netdata Agent · `apps.plugin`](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md)
+- [Netdata Agent · `ebpf.plugin`](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md)
+- [Netdata Agent · Dashboards](https://github.com/netdata/netdata/blob/master/web/README.md#dimensions)
+- [Netdata Agent · MySQL collector](https://github.com/netdata/go.d.plugin/blob/master/modules/mysql/README.md)
diff --git a/docs/guides/monitor/raspberry-pi-anomaly-detection.md b/docs/guides/monitor/raspberry-pi-anomaly-detection.md
index 73f57cd04..00b652bf2 100644
--- a/docs/guides/monitor/raspberry-pi-anomaly-detection.md
+++ b/docs/guides/monitor/raspberry-pi-anomaly-detection.md
@@ -12,7 +12,7 @@ We love IoT and edge at Netdata, we also love machine learning. Even better if w
of monitoring increasingly complex systems.
We recently explored what might be involved in enabling our Python-based [anomalies
-collector](/collectors/python.d.plugin/anomalies/README.md) on a Raspberry Pi. To our delight, it's actually quite
+collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md) on a Raspberry Pi. To our delight, it's actually quite
straightforward!
Read on to learn all the steps and enable unsupervised anomaly detection on your on Raspberry Pi(s).
@@ -23,14 +23,14 @@ Read on to learn all the steps and enable unsupervised anomaly detection on your
- A Raspberry Pi running Raspbian, which we'll call a _node_.
- The [open-source Netdata](https://github.com/netdata/netdata) monitoring agent. If you don't have it installed on your
- node yet, [get started now](/docs/get-started.mdx).
+ node yet, [get started now](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx).
## Install dependencies
First make sure Netdata is using Python 3 when it runs Python-based data collectors.
-Next, open `netdata.conf` using [`edit-config`](/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files)
-from within the [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory). Scroll down to the
+Next, open `netdata.conf` using [`edit-config`](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files)
+from within the [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory). Scroll down to the
`[plugin:python.d]` section to pass in the `-ppython3` command option.
```conf
@@ -59,7 +59,7 @@ LLVM_CONFIG=llvm-config-9 pip3 install --user llvmlite numpy==1.20.1 netdata-pan
## Enable the anomalies collector
-Now you're ready to enable the collector and [restart Netdata](/docs/configure/start-stop-restart.md).
+Now you're ready to enable the collector and [restart Netdata](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md).
```bash
sudo ./edit-config python.d.conf
@@ -82,7 +82,7 @@ centralized cloud somewhere) is the resource utilization impact of running a mon
With the default configuration, the anomalies collector uses about 6.5% of CPU at each run. During the retraining step,
CPU utilization jumps to between 20-30% for a few seconds, but you can [configure
-retraining](/collectors/python.d.plugin/anomalies/README.md#configuration) to happen less often if you wish.
+retraining](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md#configuration) to happen less often if you wish.
![CPU utilization of anomaly detection on the Raspberry
Pi](https://user-images.githubusercontent.com/1153921/110149718-9d749c00-7d9b-11eb-9af8-46e2032cd1d0.png)
@@ -108,18 +108,18 @@ looks like a potentially useful addition to enable unsupervised anomaly detectio
See our two-part guide series for a more complete picture of configuring the anomalies collector, plus some best
practices on using the charts it automatically generates:
-- [_Detect anomalies in systems and applications_](/docs/guides/monitor/anomaly-detection-python.md)
-- [_Monitor and visualize anomalies with Netdata_](/docs/guides/monitor/visualize-monitor-anomalies.md)
+- [_Detect anomalies in systems and applications_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md)
+- [_Monitor and visualize anomalies with Netdata_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/visualize-monitor-anomalies.md)
If you're using your Raspberry Pi for other purposes, like blocking ads/trackers with Pi-hole, check out our companions
-Pi guide: [_Monitor Pi-hole (and a Raspberry Pi) with Netdata_](/docs/guides/monitor/pi-hole-raspberry-pi.md).
+Pi guide: [_Monitor Pi-hole (and a Raspberry Pi) with Netdata_](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/pi-hole-raspberry-pi.md).
Once you've had a chance to give unsupervised anomaly detection a go, share your use cases and let us know of any
feedback on our [community forum](https://community.netdata.cloud/t/anomalies-collector-feedback-megathread/767).
### Related reference documentation
-- [Netdata Agent · Get Netdata](/docs/get-started.mdx)
-- [Netdata Agent · Anomalies collector](/collectors/python.d.plugin/anomalies/README.md)
+- [Netdata Agent · Get Netdata](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx)
+- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)
diff --git a/docs/guides/monitor/statsd.md b/docs/guides/monitor/statsd.md
index 3e2f0f85c..848e2649c 100644
--- a/docs/guides/monitor/statsd.md
+++ b/docs/guides/monitor/statsd.md
@@ -22,7 +22,7 @@ In general, the process for creating a StatsD collector can be summarized in 2 s
- Run an experiment by sending StatsD metrics to Netdata, without any prior configuration. This will create a chart per metric (called private charts) and will help you verify that everything works as expected from the application side of things.
- Make sure to reload the dashboard tab **after** you start sending data to Netdata.
-- Create a configuration file for your app using [edit-config](/docs/configure/nodes.md): `sudo ./edit-config
+- Create a configuration file for your app using [edit-config](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md): `sudo ./edit-config
statsd.d/myapp.conf`
- Each app will have it's own section in the right-hand menu.
@@ -30,7 +30,7 @@ Now, let's see the above process in detail.
## Prerequisites
-- A node with the [Netdata](/docs/get-started.mdx) installed.
+- A node with the [Netdata](https://github.com/netdata/netdata/blob/master/docs/get-started.mdx) installed.
- An application to instrument. For this guide, that will be [k6](https://k6.io/docs/getting-started/installation).
## Understanding the metrics
@@ -63,7 +63,7 @@ Here are some examples of default private charts. You can see that the histogram
## Create a new StatsD configuration file
-Start by creating a new configuration file under the `statsd.d/` folder in the [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory). Use [`edit-config`](/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files) to create a new file called `k6.conf`.
+Start by creating a new configuration file under the `statsd.d/` folder in the [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory). Use [`edit-config`](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#use-edit-config-to-edit-configuration-files) to create a new file called `k6.conf`.
```bash=
sudo ./edit-config statsd.d/k6.conf
@@ -104,7 +104,7 @@ Families and context are additional ways to group metrics. Families control the
Context is a second way to group metrics, when the metrics are of the same nature but different origin. In our case, if we ran several different load testing experiments side-by-side, we could define the same app, but different context (e.g `http_requests.experiment1`, `http_requests.experiment2`).
-Find more details about family and context in our [documentation](/web/README.md#families).
+Find more details about family and context in our [documentation](https://github.com/netdata/netdata/blob/master/web/README.md#families).
### Dimension
@@ -115,7 +115,7 @@ Now, having decided on how we are going to group the charts, we need to define h
The dimension option has this syntax: `dimension = [pattern] METRIC NAME TYPE MULTIPLIER DIVIDER OPTIONS`
-- **pattern**: A keyword that tells the StatsD server the `METRIC` string is actually a [simple pattern].(/libnetdata/simple_pattern/README.md). We don't simple patterns in the example, but if we wanted to visualize all the `http_req` metrics, we could have a single dimension: `dimension = pattern 'k6.http_req*' last 1 1`. Find detailed examples with patterns in our [documentation](/collectors/statsd.plugin/README.md#dimension-patterns).
+- **pattern**: A keyword that tells the StatsD server the `METRIC` string is actually a [simple pattern].(/libnetdata/simple_pattern/README.md). We don't simple patterns in the example, but if we wanted to visualize all the `http_req` metrics, we could have a single dimension: `dimension = pattern 'k6.http_req*' last 1 1`. Find detailed examples with patterns in our [documentation](https://github.com/netdata/netdata/blob/master/collectors/statsd.plugin/README.md#dimension-patterns).
- **METRIC** The id of the metric as it comes from the client. You can easily find this in the private charts above, for example: `k6.http_req_connecting`.
- **NAME**: The name of the dimension. You can use the dictionary to expand this to something more human-readable.
- **TYPE**:
@@ -212,7 +212,7 @@ Following the above steps, we append to the `k6.conf` that we defined above, the
> Take note that Netdata will report the rate for metrics and counters, even if k6 or another application sends an _absolute_ number. For example, k6 sends absolute HTTP requests with `http_reqs`, but Netdat visualizes that in `requests/second`.
-To enable this StatsD configuration, [restart Netdata](/docs/configure/start-stop-restart.md).
+To enable this StatsD configuration, [restart Netdata](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md).
## Final touches
@@ -293,6 +293,6 @@ Netdata allows you easily visualize any StatsD metric without any configuration,
### Related reference documentation
-- [Netdata Agent · StatsD](/collectors/statsd.plugin/README.md)
+- [Netdata Agent · StatsD](https://github.com/netdata/netdata/blob/master/collectors/statsd.plugin/README.md)
diff --git a/docs/guides/monitor/stop-notifications-alarms.md b/docs/guides/monitor/stop-notifications-alarms.md
index a8b73a86a..3c026a89b 100644
--- a/docs/guides/monitor/stop-notifications-alarms.md
+++ b/docs/guides/monitor/stop-notifications-alarms.md
@@ -13,7 +13,7 @@ relevant if you run Netdata on your laptop or a small virtual server. If they're
to real issues with health and performance.
Silencing individual alarms is an excellent solution for situations where you're not interested in seeing a specific
-alarm but don't want to disable a [notification system](/health/notifications/README.md) entirely.
+alarm but don't want to disable a [notification system](https://github.com/netdata/netdata/blob/master/health/notifications/README.md) entirely.
## Find the alarm configuration file
@@ -34,7 +34,7 @@ In the `source` row, you see that this chart is getting its configuration from
the file you need to edit if you want to silence this alarm.
For more information about editing or referencing health configuration files on your system, see the [health
-quickstart](/health/QUICKSTART.md#edit-health-configuration-files).
+quickstart](https://github.com/netdata/netdata/blob/master/health/QUICKSTART.md#edit-health-configuration-files).
## Edit the file to enable silencing
@@ -70,7 +70,7 @@ To silence this alarm, change `sysadmin` to `silent`.
to: silent
```
-Use one of the available [methods](/health/QUICKSTART.md#reload-health-configuration) to reload your health configuration
+Use one of the available [methods](https://github.com/netdata/netdata/blob/master/health/QUICKSTART.md#reload-health-configuration) to reload your health configuration
and ensure you get no more notifications about that alarm**.
You can add `to: silent` to any alarm you'd rather not bother you with notifications.
@@ -80,12 +80,12 @@ You can add `to: silent` to any alarm you'd rather not bother you with notificat
You should now know the fundamentals behind silencing any individual alarm in Netdata.
To learn about _all_ of Netdata's health configuration possibilities, visit the [health reference
-guide](/health/REFERENCE.md), or check out other [tutorials on health monitoring](/health/README.md#guides).
+guide](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md), or check out other [tutorials on health monitoring](https://github.com/netdata/netdata/blob/master/health/README.md#guides).
Or, take better control over how you get notified about alarms via the [notification
-system](/health/notifications/README.md).
+system](https://github.com/netdata/netdata/blob/master/health/notifications/README.md).
-You can also use Netdata's [Health Management API](/web/api/health/README.md#health-management-api) to control health
+You can also use Netdata's [Health Management API](https://github.com/netdata/netdata/blob/master/web/api/health/README.md#health-management-api) to control health
checks and notifications while Netdata runs. With this API, you can disable health checks during a maintenance window or
backup process, for example.
diff --git a/docs/guides/monitor/visualize-monitor-anomalies.md b/docs/guides/monitor/visualize-monitor-anomalies.md
index 1f8c2c8f8..90ce20a4b 100644
--- a/docs/guides/monitor/visualize-monitor-anomalies.md
+++ b/docs/guides/monitor/visualize-monitor-anomalies.md
@@ -10,7 +10,7 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/moni
Welcome to part 2 of our series of guides on using _unsupervised anomaly detection_ to detect issues with your systems,
containers, and applications using the open-source Netdata Agent. For an introduction to detecting anomalies and
-monitoring associated metrics, see [part 1](/docs/guides/monitor/anomaly-detection-python.md), which covers prerequisites and
+monitoring associated metrics, see [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md), which covers prerequisites and
configuration basics.
With anomaly detection in the Netdata Agent set up, you will now want to visualize and monitor which charts have
@@ -48,8 +48,8 @@ analysis (RCA).
The anomalies collector creates two "classes" of alarms for each chart captured by the `charts_regex` setting. All these
alarms are preconfigured based on your [configuration in
-`anomalies.conf`](/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector). With the `charts_regex`
-and `charts_to_exclude` settings from [part 1](/docs/guides/monitor/anomaly-detection-python.md) of this guide series, the
+`anomalies.conf`](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector). With the `charts_regex`
+and `charts_to_exclude` settings from [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md) of this guide series, the
Netdata Agent creates 32 alarms driven by unsupervised anomaly detection.
The first class triggers warning alarms when the average anomaly probability for a given chart has stayed above 50% for
@@ -69,17 +69,17 @@ there's a full-blown incident, depending on what application/service you're usin
further investigation.
As you use the anomalies collector, you may find that the default settings provide too many or too few genuine alarms.
-In this case, [configure the alarm](/docs/monitor/configure-alarms.md) with `sudo ./edit-config
+In this case, [configure the alarm](https://github.com/netdata/netdata/blob/master/docs/monitor/configure-alarms.md) with `sudo ./edit-config
health.d/anomalies.conf`. Take a look at the `lookup` line syntax in the [health
-reference](/health/REFERENCE.md#alarm-line-lookup) to understand how the anomalies collector automatically creates
+reference](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alarm-line-lookup) to understand how the anomalies collector automatically creates
alarms for any dimension on the `anomalies_local.probability` and `anomalies_local.anomaly` charts.
## Visualize anomalies in charts
In either [Netdata Cloud](https://app.netdata.cloud) or the local Agent dashboard at `http://NODE:19999`, click on the
-**Anomalies** [section](/web/gui/README.md#sections) to see the pair of anomaly detection charts, which are
+**Anomalies** [section](https://github.com/netdata/netdata/blob/master/web/gui/README.md#sections) to see the pair of anomaly detection charts, which are
preconfigured to visualize per-second anomaly metrics based on your [configuration in
-`anomalies.conf`](/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector).
+`anomalies.conf`](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md#configure-the-anomalies-collector).
These charts have the contexts `anomalies.probability` and `anomalies.anomaly`. Together, these charts
create meaningful visualizations for immediately recognizing not only that something is going wrong on your node, but
@@ -88,7 +88,7 @@ give context as to where to look next.
The `anomalies_local.probability` chart shows the probability that the latest observed data is anomalous, based on the
trained model. The `anomalies_local.anomaly` chart visualizes 0&rarr;1 predictions based on whether the latest observed
data is anomalous based on the trained model. Both charts share the same dimensions, which you configured via
-`charts_regex` and `charts_to_exclude` in [part 1](/docs/guides/monitor/anomaly-detection-python.md).
+`charts_regex` and `charts_to_exclude` in [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md).
In other words, the `probability` chart shows the amplitude of the anomaly, whereas the `anomaly` chart provides quick
yes/no context.
@@ -108,7 +108,7 @@ dimensions that immediately shot to 100% anomaly probability, and remained there
## Build an anomaly detection dashboard
[Netdata Cloud](https://app.netdata.cloud) features a drag-and-drop [dashboard
-editor](/docs/visualize/create-dashboards.md) that helps you create entirely new dashboards with charts targeted for
+editor](https://github.com/netdata/netdata/blob/master/docs/visualize/create-dashboards.md) that helps you create entirely new dashboards with charts targeted for
your specific applications.
For example, here's a dashboard designed for visualizing anomalies present in an Nginx web server, including
@@ -119,12 +119,12 @@ dashboard](https://user-images.githubusercontent.com/1153921/104226915-c6188f00-
Use the anomaly charts for instant visual identification of potential anomalies, and then Nginx-specific charts, in the
right column, to validate whether the probability and anomaly counters are showing a valid incident worth further
-investigation using [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) to narrow
+investigation using [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) to narrow
the dashboard into only the charts relevant to what you're seeing from the anomalies collector.
## What's next?
-Between this guide and [part 1](/docs/guides/monitor/anomaly-detection-python.md), which covered setup and configuration, you
+Between this guide and [part 1](https://github.com/netdata/netdata/blob/master/docs/guides/monitor/anomaly-detection-python.md), which covered setup and configuration, you
now have a fundamental understanding of how unsupervised anomaly detection in Netdata works, from root cause to alarms
to preconfigured or custom dashboards.
@@ -132,11 +132,11 @@ We'd love to hear your feedback on the anomalies collector. Hop over to the [com
forum](https://community.netdata.cloud/t/anomalies-collector-feedback-megathread/767), and let us know if you're already getting value from
unsupervised anomaly detection, or would like to see something added to it. You might even post a custom configuration
that works well for monitoring some other popular application, like MySQL, PostgreSQL, Redis, or anything else we
-[support through collectors](/collectors/COLLECTORS.md).
+[support through collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md).
### Related reference documentation
-- [Netdata Agent · Anomalies collector](/collectors/python.d.plugin/anomalies/README.md)
-- [Netdata Cloud · Build new dashboards](https://learn.netdata.cloud/docs/cloud/visualize/dashboards)
+- [Netdata Agent · Anomalies collector](https://github.com/netdata/netdata/blob/master/collectors/python.d.plugin/anomalies/README.md)
+- [Netdata Cloud · Build new dashboards](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/dashboards.md)