summaryrefslogtreecommitdiffstats
path: root/docs/guides/monitor
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-27 11:08:07 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-27 11:08:07 +0000
commitc69cb8cc094cc916adbc516b09e944cd3d137c01 (patch)
treef2878ec41fb6d0e3613906c6722fc02b934eeb80 /docs/guides/monitor
parentInitial commit. (diff)
downloadnetdata-upstream/1.29.3.tar.xz
netdata-upstream/1.29.3.zip
Adding upstream version 1.29.3.upstream/1.29.3upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r--docs/guides/monitor-cockroachdb.md136
-rw-r--r--docs/guides/monitor-hadoop-cluster.md204
-rw-r--r--docs/guides/monitor/anomaly-detection.md191
-rw-r--r--docs/guides/monitor/dimension-templates.md176
-rw-r--r--docs/guides/monitor/kubernetes-k8s-netdata.md278
-rw-r--r--docs/guides/monitor/pi-hole-raspberry-pi.md163
-rw-r--r--docs/guides/monitor/process.md299
-rw-r--r--docs/guides/monitor/stop-notifications-alarms.md92
-rw-r--r--docs/guides/monitor/visualize-monitor-anomalies.md147
9 files changed, 1686 insertions, 0 deletions
diff --git a/docs/guides/monitor-cockroachdb.md b/docs/guides/monitor-cockroachdb.md
new file mode 100644
index 0000000..fd0e7db
--- /dev/null
+++ b/docs/guides/monitor-cockroachdb.md
@@ -0,0 +1,136 @@
+<!--
+title: "Monitor CockroachDB metrics with Netdata"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-cockroachdb.md
+-->
+
+# Monitor CockroachDB metrics with Netdata
+
+[CockroachDB](https://github.com/cockroachdb/cockroach) is an open-source project that brings SQL databases into
+scalable, disaster-resilient cloud deployments. Thanks to a [new CockroachDB
+collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/cockroachdb/) released in
+[v1.20](https://blog.netdata.cloud/posts/release-1.20/), you can now monitor any number of CockroachDB databases with
+maximum granularity using Netdata. Collect more than 50 unique metrics and put them on interactive visualizations
+designed for better visual anomaly detection.
+
+Netdata itself uses CockroachDB as part of its Netdata Cloud infrastructure, so we're happy to introduce this new
+collector and help others get started with it straightaway.
+
+Let's dive in and walk through the process of monitoring CockroachDB metrics with Netdata.
+
+## What's in this guide
+
+- [Configure the CockroachDB collector](#configure-the-cockroachdb-collector)
+ - [Manual setup for a local CockroachDB database](#manual-setup-for-a-local-cockroachdb-database)
+- [Tweak CockroachDB alarms](#tweak-cockroachdb-alarms)
+
+## Configure the CockroachDB collector
+
+Because _all_ of Netdata's collectors can auto-detect the services they monitor, you _shouldn't_ need to worry about
+configuring CockroachDB. Netdata only needs to regularly query the database's `_status/vars` page to gather metrics and
+display them on the dashboard.
+
+If your CockroachDB instance is accessible through `http://localhost:8080/` or `http://127.0.0.1:8080`, your setup is
+complete. Restart Netdata with `service netdata restart`, or use the [appropriate
+method](../getting-started.md#start-stop-and-restart-netdata) for your system, and refresh your browser. You should see
+CockroachDB metrics in your Netdata dashboard!
+
+<figure>
+ <img src="https://user-images.githubusercontent.com/1153921/73564467-d7e36b00-441c-11ea-9ec9-b5d5ea7277d4.png" alt="CPU utilization charts from a CockroachDB database monitored by Netdata" />
+ <figcaption>CPU utilization charts from a CockroachDB database monitored by Netdata</figcaption>
+</figure>
+
+> Note: Netdata collects metrics from CockroachDB every 10 seconds, instead of our usual 1 second, because CockroachDB
+> only updates `_status/vars` every 10 seconds. You can't change this setting in CockroachDB.
+
+If you don't see CockroachDB charts, you may need to configure the collector manually.
+
+### Manual setup for a local CockroachDB database
+
+To configure Netdata's CockroachDB collector, navigate to your Netdata configuration directory (typically at
+`/etc/netdata/`) and use `edit-config` to initialize and edit your CockroachDB configuration file.
+
+```bash
+cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /etc/netdata/
+./edit-config go.d/cockroachdb.conf
+```
+
+Scroll down to the `[JOBS]` section at the bottom of the file. You will see the two default jobs there, which you can
+edit, or create a new job with any of the parameters listed above in the file. Both the `name` and `url` values are
+required, and everything else is optional.
+
+For a production cluster, you'll use either an IP address or the system's hostname. Be sure that your remote system
+allows TCP communication on port 8080, or whichever port you have configured CockroachDB's [Admin
+UI](https://www.cockroachlabs.com/docs/stable/monitoring-and-alerting.html#prometheus-endpoint) to listen on.
+
+```yaml
+# [ JOBS ]
+jobs:
+ - name: remote
+ url: http://203.0.113.0:8080/_status/vars
+
+ - name: remote_hostname
+ url: http://cockroachdb.example.com:8080/_status/vars
+```
+
+For a secure cluster, use `https` in the `url` field instead.
+
+```yaml
+# [ JOBS ]
+jobs:
+ - name: remote
+ url: https://203.0.113.0:8080/_status/vars
+ tls_skip_verify: yes # If your certificate is self-signed
+
+ - name: remote_hostname
+ url: https://cockroachdb.example.com:8080/_status/vars
+ tls_skip_verify: yes # If your certificate is self-signed
+```
+
+You can add as many jobs as you'd like based on how many CockroachDB databases you have鈥擭etdata will create separate
+charts for each job. Once you've edited `cockroachdb.conf` according to the needs of your infrastructure, restart
+Netdata to see your new charts.
+
+<figure>
+ <img src="https://user-images.githubusercontent.com/1153921/73564469-d7e36b00-441c-11ea-8333-02ba0e1c294c.png" alt="Charts showing a node failure during a simulated test" />
+ <figcaption>Charts showing a node failure during a simulated test</figcaption>
+</figure>
+
+## Tweak CockroachDB alarms
+
+This release also includes eight pre-configured alarms for live nodes, such as whether the node is live, storage
+capacity, issues with replication, and the number of SQL connections/statements. See [health.d/cockroachdb.conf on
+GitHub](https://raw.githubusercontent.com/netdata/netdata/master/health/health.d/cockroachdb.conf) for details.
+
+You can also edit these files directly with `edit-config`:
+
+```bash
+cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /etc/netdata/
+./edit-config health.d/cockroachdb.conf # You may need to use `sudo` for write privileges
+```
+
+For more information about editing the defaults or writing new alarm entities, see our health monitoring [quickstart
+guide](/health/QUICKSTART.md).
+
+## What's next?
+
+Now that you're collecting metrics from your CockroachDB databases, let us know how it's working for you! There's always
+room for improvement or refinement based on real-world use cases. Feel free to [file an
+issue](https://github.com/netdata/netdata/issues/new?labels=bug%2C+needs+triage&template=bug_report.md) with your
+thoughts.
+
+Also, be sure to check out these useful resources:
+
+- [Netdata's CockroachDB
+ documentation](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/cockroachdb/)
+- [Netdata's CockroachDB
+ configuration](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/cockroachdb.conf)
+- [Netdata's CockroachDB
+ alarms](https://github.com/netdata/netdata/blob/29d9b5e51603792ee27ef5a21f1de0ba8e130158/health/health.d/cockroachdb.conf)
+- [CockroachDB homepage](https://www.cockroachlabs.com/product/)
+- [CockroachDB documentation](https://www.cockroachlabs.com/docs/stable/)
+- [`_status/vars` endpoint
+ docs](https://www.cockroachlabs.com/docs/stable/monitoring-and-alerting.html#prometheus-endpoint)
+- [Monitor CockroachDB with
+ Prometheus](https://www.cockroachlabs.com/docs/stable/monitor-cockroachdb-with-prometheus.html)
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor-cockroachdb&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor-hadoop-cluster.md b/docs/guides/monitor-hadoop-cluster.md
new file mode 100644
index 0000000..1ca2c03
--- /dev/null
+++ b/docs/guides/monitor-hadoop-cluster.md
@@ -0,0 +1,204 @@
+<!--
+title: "Monitor a Hadoop cluster with Netdata"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor-hadoop-cluster.md
+-->
+
+# Monitor a Hadoop cluster with Netdata
+
+Hadoop is an [Apache project](https://hadoop.apache.org/) is a framework for processing large sets of data across a
+distributed cluster of systems.
+
+And while Hadoop is designed to be a highly-available and fault-tolerant service, those who operate a Hadoop cluster
+will want to monitor the health and performance of their [Hadoop Distributed File System
+(HDFS)](https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) and [Zookeeper](https://zookeeper.apache.org/)
+implementations.
+
+Netdata comes with built-in and pre-configured support for monitoring both HDFS and Zookeeper.
+
+This guide assumes you have a Hadoop cluster, with HDFS and Zookeeper, running already. If you don't, please follow
+the [official Hadoop
+instructions](http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html) or an
+alternative, like the guide available from
+[DigitalOcean](https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-in-stand-alone-mode-on-ubuntu-18-04).
+
+For more specifics on the collection modules used in this guide, read the respective pages in our documentation:
+
+- [HDFS](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/hdfs)
+- [Zookeeper](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/zookeeper)
+
+## Set up your HDFS and Zookeeper installations
+
+As with all data sources, Netdata can auto-detect HDFS and Zookeeper nodes if you installed them using the standard
+installation procedure.
+
+For Netdata to collect HDFS metrics, it needs to be able to access the node's `/jmx` endpoint. You can test whether an
+JMX endpoint is accessible by using `curl HDFS-IP:PORT/jmx`. For a NameNode, you should see output similar to the
+following:
+
+```json
+{
+ "beans" : [ {
+ "name" : "Hadoop:service=NameNode,name=JvmMetrics",
+ "modelerType" : "JvmMetrics",
+ "MemNonHeapUsedM" : 65.67851,
+ "MemNonHeapCommittedM" : 67.3125,
+ "MemNonHeapMaxM" : -1.0,
+ "MemHeapUsedM" : 154.46341,
+ "MemHeapCommittedM" : 215.0,
+ "MemHeapMaxM" : 843.0,
+ "MemMaxM" : 843.0,
+ "GcCount" : 15,
+ "GcTimeMillis" : 305,
+ "GcNumWarnThresholdExceeded" : 0,
+ "GcNumInfoThresholdExceeded" : 0,
+ "GcTotalExtraSleepTime" : 92,
+ "ThreadsNew" : 0,
+ "ThreadsRunnable" : 6,
+ "ThreadsBlocked" : 0,
+ "ThreadsWaiting" : 7,
+ "ThreadsTimedWaiting" : 34,
+ "ThreadsTerminated" : 0,
+ "LogFatal" : 0,
+ "LogError" : 0,
+ "LogWarn" : 2,
+ "LogInfo" : 348
+ },
+ { ... }
+ ]
+}
+```
+
+The JSON result for a DataNode's `/jmx` endpoint is slightly different:
+
+```json
+{
+ "beans" : [ {
+ "name" : "Hadoop:service=DataNode,name=DataNodeActivity-dev-slave-01.dev.loc
+al-9866",
+ "modelerType" : "DataNodeActivity-dev-slave-01.dev.local-9866",
+ "tag.SessionId" : null,
+ "tag.Context" : "dfs",
+ "tag.Hostname" : "dev-slave-01.dev.local",
+ "BytesWritten" : 500960407,
+ "TotalWriteTime" : 463,
+ "BytesRead" : 80689178,
+ "TotalReadTime" : 41203,
+ "BlocksWritten" : 16,
+ "BlocksRead" : 16,
+ "BlocksReplicated" : 4,
+ ...
+ },
+ { ... }
+ ]
+}
+```
+
+If Netdata can't access the `/jmx` endpoint for either a NameNode or DataNode, it will not be able to auto-detect and
+collect metrics from your HDFS implementation.
+
+Zookeeper auto-detection relies on an accessible client port and a allow-listed `mntr` command. For more details on
+`mntr`, see Zookeeper's documentation on [cluster
+options](https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_clusterOptions) and [Zookeeper
+commands](https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_zkCommands).
+
+## Configure the HDFS and Zookeeper modules
+
+To configure Netdata's HDFS module, navigate to your Netdata directory (typically at `/etc/netdata/`) and use
+`edit-config` to initialize and edit your HDFS configuration file.
+
+```bash
+cd /etc/netdata/
+sudo ./edit-config go.d/hdfs.conf
+```
+
+At the bottom of the file, you will see two example jobs, both of which are commented out:
+
+```yaml
+# [ JOBS ]
+#jobs:
+# - name: namenode
+# url: http://127.0.0.1:9870/jmx
+#
+# - name: datanode
+# url: http://127.0.0.1:9864/jmx
+```
+
+Uncomment these lines and edit the `url` value(s) according to your setup. Now's the time to add any other configuration
+details, which you can find inside of the `hdfs.conf` file itself. Most production implementations will require TLS
+certificates.
+
+The result for a simple HDFS setup, running entirely on `localhost` and without certificate authentication, might look
+like this:
+
+```yaml
+# [ JOBS ]
+jobs:
+ - name: namenode
+ url: http://127.0.0.1:9870/jmx
+
+ - name: datanode
+ url: http://127.0.0.1:9864/jmx
+```
+
+At this point, Netdata should be configured to collect metrics from your HDFS servers. Let's move on to Zookeeper.
+
+Next, use `edit-config` again to initialize/edit your `zookeeper.conf` file.
+
+```bash
+cd /etc/netdata/
+sudo ./edit-config go.d/zookeeper.conf
+```
+
+As with the `hdfs.conf` file, head to the bottom, uncomment the example jobs, and tweak the `address` values according
+to your setup. Again, you may need to add additional configuration options, like TLS certificates.
+
+```yaml
+jobs:
+ - name : local
+ address : 127.0.0.1:2181
+
+ - name : remote
+ address : 203.0.113.10:2182
+```
+
+Finally, restart Netdata.
+
+```sh
+sudo service restart netdata
+```
+
+Upon restart, Netdata should recognize your HDFS/Zookeeper servers, enable the HDFS and Zookeeper modules, and begin
+showing real-time metrics for both in your Netdata dashboard. 馃帀
+
+## Configuring HDFS and Zookeeper alarms
+
+The Netdata community helped us create sane defaults for alarms related to both HDFS and Zookeeper. You may want to
+investigate these to ensure they work well with your Hadoop implementation.
+
+- [HDFS alarms](https://raw.githubusercontent.com/netdata/netdata/master/health/health.d/hdfs.conf)
+- [Zookeeper alarms](https://raw.githubusercontent.com/netdata/netdata/master/health/health.d/zookeeper.conf)
+
+You can also access/edit these files directly with `edit-config`:
+
+```bash
+sudo /etc/netdata/edit-config health.d/hdfs.conf
+sudo /etc/netdata/edit-config health.d/zookeeper.conf
+```
+
+For more information about editing the defaults or writing new alarm entities, see our [health monitoring
+documentation](/health/README.md).
+
+## What's next?
+
+If you're having issues with Netdata auto-detecting your HDFS/Zookeeper servers, or want to help improve how Netdata
+collects or presents metrics from these services, feel free to [file an
+issue](https://github.com/netdata/netdata/issues/new?labels=bug%2C+needs+triage&template=bug_report.md).
+
+- Read up on the [HDFS configuration
+ file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/hdfs.conf) to understand how to configure
+ global options or per-job options, such as username/password, TLS certificates, timeouts, and more.
+- Read up on the [Zookeeper configuration
+ file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/zookeeper.conf) to understand how to configure
+ global options or per-job options, timeouts, TLS certificates, and more.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor-hadoop-cluster&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/anomaly-detection.md b/docs/guides/monitor/anomaly-detection.md
new file mode 100644
index 0000000..bb9dbc8
--- /dev/null
+++ b/docs/guides/monitor/anomaly-detection.md
@@ -0,0 +1,191 @@
+<!--
+title: "Detect anomalies in systems and applications"
+description: "Detect anomalies in any system, container, or application in your infrastructure with machine learning and the open-source Netdata Agent."
+image: /img/seo/guides/monitor/anomaly-detection.png
+author: "Joel Hans"
+author_title: "Editorial Director, Technical & Educational Resources"
+author_img: "/img/authors/joel-hans.jpg"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/anomaly-detection.md
+-->
+
+# Detect anomalies in systems and applications
+
+Beginning with v1.27, the [open-source Netdata Agent](https://github.com/netdata/netdata) is capable of unsupervised
+[anomaly detection](https://en.wikipedia.org/wiki/Anomaly_detection) with machine learning (ML). As with all things
+Netdata, the anomalies collector comes with preconfigured alarms and instant visualizations that require no query
+languages or organizing metrics. You configure the collector to look at specific charts, and it handles the rest.
+
+Netdata's implementation uses a handful of functions in the [Python Outlier Detection (PyOD)
+library](https://github.com/yzhao062/pyod/tree/master), which periodically runs a `train` function that learns what
+"normal" looks like on your node and creates an ML model for each chart, then utilizes the
+[`predict_proba()`](https://pyod.readthedocs.io/en/latest/api_cc.html#pyod.models.base.BaseDetector.predict_proba) and
+[`predict()`](https://pyod.readthedocs.io/en/latest/api_cc.html#pyod.models.base.BaseDetector.predict) PyOD functions to
+quantify how anomalous certain charts are.
+
+All these metrics and alarms are available for centralized monitoring in [Netdata Cloud](https://app.netdata.cloud). If
+you choose to sign up for Netdata Cloud and [claim your nodes](/claim/README.md), you will have the ability to run
+tailored anomaly detection on every node in your infrastructure, regardless of its purpose or workload.
+
+In this guide, you'll learn how to set up the anomalies collector to instantly detect anomalies in an Nginx web server
+and/or the node that hosts it, which will give you the tools to configure parallel unsupervised monitors for any
+application in your infrastructure. Let's get started.
+
+![Example anomaly detection with an Nginx web
+server](https://user-images.githubusercontent.com/1153921/103586700-da5b0a00-4ea2-11eb-944e-46edd3f83e3a.png)
+
+## Prerequisites
+
+- A node running the Netdata Agent. If you don't yet have that, [get Netdata](/docs/get/README.md).
+- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
+- Familiarity with configuring the Netdata Agent with [`edit-config`](/docs/configure/nodes.md).
+- _Optional_: An Nginx web server running on the same node to follow the example configuration steps.
+
+## Install required Python packages
+
+The anomalies collector uses a few Python packages, available with `pip3`, to run ML training. It requires
+[`numba`](http://numba.pydata.org/), [`scikit-learn`](https://scikit-learn.org/stable/),
+[`pyod`](https://pyod.readthedocs.io/en/latest/), in addition to
+[`netdata-pandas`](https://github.com/netdata/netdata-pandas), which is a package built by the Netdata team to pull data
+from a Netdata Agent's API into a [Pandas](https://pandas.pydata.org/). Read more about `netdata-pandas` on its [package
+repo](https://github.com/netdata/netdata-pandas) or in Netdata's [community
+repo](https://github.com/netdata/community/tree/main/netdata-agent-api/netdata-pandas).
+
+```bash
+# Become the netdata user
+sudo su -s /bin/bash netdata
+
+# Install required packages for the netdata user
+pip3 install --user netdata-pandas==0.0.32 numba==0.50.1 scikit-learn==0.23.2 pyod==0.8.3
+```
+
+> If the `pip3` command fails, you need to install it. For example, on an Ubuntu system, use `sudo apt install
+> python3-pip`.
+
+Use `exit` to become your normal user again.
+
+## Enable the anomalies collector
+
+Navigate to your [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory) and use `edit-config`
+to open the `python.d.conf` file.
+
+```bash
+sudo ./edit-config python.d.conf
+```
+
+In `python.d.conf` file, search for the `anomalies` line. If the line exists, set the value to `yes`. Add the line
+yourself if it doesn't already exist. Either way, the final result should look like:
+
+```conf
+anomalies: yes
+```
+
+[Restart the Agent](/docs/configure/start-stop-restart.md) with `sudo systemctl restart netdata` to start up the
+anomalies collector. By default, the model training process runs every 30 minutes, and uses the previous 4 hours of
+metrics to establish a baseline for health and performance across the default included charts.
+
+> 馃挕 The anomaly collector may need 30-60 seconds to finish its initial training and have enough data to start
+> generating anomaly scores. You may need to refresh your browser tab for the **Anomalies** section to appear in menus
+> on both the local Agent dashboard or Netdata Cloud.
+
+## Configure the anomalies collector
+
+Open `python.d/anomalies.conf` with `edit-conf`.
+
+```bash
+sudo ./edit-config python.d/anomalies.conf
+```
+
+The file contains many user-configurable settings with sane defaults. Here are some important settings that don't
+involve tweaking the behavior of the ML training itself.
+
+- `charts_regex`: Which charts to train models for and run anomaly detection on, with each chart getting a separate
+ model.
+- `charts_to_exclude`: Specific charts, selected by the regex in `charts_regex`, to exclude.
+- `train_every_n`: How often to train the ML models.
+- `train_n_secs`: The number of historical observations to train each model on. The default is 4 hours, but if your node
+ doesn't have historical metrics going back that far, consider [changing the metrics retention
+ policy](/docs/store/change-metrics-storage.md) or reducing this window.
+- `custom_models`: A way to define custom models that you want anomaly probabilities for, including multi-node or
+ streaming setups. More on custom models in part 3 of this guide series.
+
+> 鈿狅笍 Setting `charts_regex` with many charts or `train_n_secs` to a very large number will have an impact on the
+> resources and time required to train a model for every chart. The actual performance implications depend on the
+> resources available on your node. If you plan on changing these settings beyond the default, or what's mentioned in
+> this guide, make incremental changes to observe the performance impact. Considering `train_max_n` to cap the number of
+> observations actually used to train on.
+
+### Run anomaly detection on Nginx and log file metrics
+
+As mentioned above, this guide uses an Nginx web server to demonstrate how the anomalies collector works. You must
+configure the collector to monitor charts from the
+[Nginx](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) and [web
+log](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog) collectors.
+
+`charts_regex` allows for some basic regex, such as wildcards (`*`) to match all contexts with a certain pattern. For
+example, `system\..*` matches with any chart wit ha context that begins with `system.`, and ends in any number of other
+characters (`.*`). Note the escape character (`\`) around the first period to capture a period character exactly, and
+not any character.
+
+Change `charts_regex` in `anomalies.conf` to the following:
+
+```conf
+ charts_regex: 'system\..*|nginx_local\..*|web_log_nginx\..*|apps.cpu|apps.mem'
+```
+
+This value tells the anomaly collector to train against every `system.` chart, every `nginx_local` chart, every
+`web_log_nginx` chart, and specifically the `apps.cpu` and `apps.mem` charts.
+
+![The anomalies collector chart with many
+dimensions](https://user-images.githubusercontent.com/1153921/102813877-db5e4880-4386-11eb-8040-d7a1d7a476bb.png)
+
+### Remove some metrics from anomaly detection
+
+As you can see in the above screenshot, this node is now looking for anomalies in many places. The result is a single
+`anomalies_local.probability` chart with more than twenty dimensions, some of which the dashboard hides at the bottom of
+a scroll-able area. In addition, training and analyzing the anomaly collector on many charts might require more CPU
+utilization that you're willing to give.
+
+First, explicitly declare which `system.` charts to monitor rather than of all of them using regex (`system\..*`).
+
+```conf
+ charts_regex: 'system\.cpu|system\.load|system\.io|system\.net|system\.ram|nginx_local\..*|web_log_nginx\..*|apps.cpu|apps.mem'
+```
+
+Next, remove some charts with the `charts_to_exclude` setting. For this example, using an Nginx web server, focus on the
+volume of requests/responses, not, for example, which type of 4xx response a user might receive.
+
+```conf
+ charts_to_exclude: 'web_log_nginx.excluded_requests,web_log_nginx.responses_by_status_code_class,web_log_nginx.status_code_class_2xx_responses,web_log_nginx.status_code_class_4xx_responses,web_log_nginx.current_poll_uniq_clients,web_log_nginx.requests_by_http_method,web_log_nginx.requests_by_http_version,web_log_nginx.requests_by_ip_proto'
+```
+
+![The anomalies collector with less
+dimensions](https://user-images.githubusercontent.com/1153921/102820642-d69f9180-4392-11eb-91c5-d3d166d40105.png)
+
+Apply the ideas behind the collector's regex and exclude settings to any other
+[system](/docs/collect/system-metrics.md), [container](/docs/collect/container-metrics.md), or
+[application](/docs/collect/application-metrics.md) metrics you want to detect anomalies for.
+
+## What's next?
+
+Now that you know how to set up unsupervised anomaly detection in the Netdata Agent, using an Nginx web server as an
+example, it's time to apply that knowledge to other mission-critical parts of your infrastructure. If you're not sure
+what to monitor next, check out our list of [collectors](/collectors/COLLECTORS.md) to see what kind of metrics Netdata
+can collect from your systems, containers, and applications.
+
+For a more user-friendly anomaly detection experience, try out the [Metric
+Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) feature in Netdata Cloud. Metric
+Correlations runs only at your requests, removing unrelated charts from the dashboard to help you focus on root cause
+analysis.
+
+Stay tuned for the next two parts of this guide, which provide more real-world context for the anomalies collector.
+First, maximize the immediate value you get from anomaly detection by tracking preconfigured alarms, visualizing
+anomalies in charts, and building a new dashboard tailored to your applications. Then, learn about creating custom ML
+models, which help you holistically monitor an application or service by monitoring anomalies across a _cluster of
+charts_.
+
+### Related reference documentation
+
+- [Netdata Agent 路 Anomalies collector](/collectors/python.d.plugin/anomalies/README.md)
+- [Netdata Cloud 路 Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations)
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fanomaly-detectionl&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/dimension-templates.md b/docs/guides/monitor/dimension-templates.md
new file mode 100644
index 0000000..7cbe795
--- /dev/null
+++ b/docs/guides/monitor/dimension-templates.md
@@ -0,0 +1,176 @@
+<!--
+title: "Use dimension templates to create dynamic alarms"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/dimension-templates.md
+-->
+
+# Use dimension templates to create dynamic alarms
+
+Your ability to monitor the health of your systems and applications relies on your ability to create and maintain
+the best set of alarms for your particular needs.
+
+In v1.18 of Netdata, we introduced **dimension templates** for alarms, which simplifies the process of writing [alarm
+entities](/health/REFERENCE.md#health-entity-reference) for charts with many dimensions.
+
+Dimension templates can condense many individual entities into one鈥攏o more copy-pasting one entity and changing the
+`alarm`/`template` and `lookup` lines for each dimension you'd like to monitor.
+
+They are, however, an advanced health monitoring feature. For more basic instructions on creating your first alarm,
+check out our [health monitoring documentation](/health/README.md), which also includes
+[examples](/health/REFERENCE.md#example-alarms).
+
+## The fundamentals of `foreach`
+
+Our dimension templates update creates a new `foreach` parameter to the existing [`lookup`
+line](/health/REFERENCE.md#alarm-line-lookup). This is where the magic happens.
+
+You use the `foreach` parameter to specify which dimensions you want to monitor with this single alarm. You can separate
+them with a comma (`,`) or a pipe (`|`). You can also use a [Netdata simple pattern](/libnetdata/simple_pattern/README.md)
+to create many alarms with a regex-like syntax.
+
+The `foreach` parameter _has_ to be the last parameter in your `lookup` line, and if you have both `of` and `foreach` in
+the same `lookup` line, Netdata will ignore the `of` parameter and use `foreach` instead.
+
+Let's get into some examples so you can see how the new parameter works.
+
+> 鈿狅笍 The following entities are examples to showcase the functionality and syntax of dimension templates. They are not
+> meant to be run as-is on production systems.
+
+## Condensing entities with `foreach`
+
+Let's say you want to monitor the `system`, `user`, and `nice` dimensions in your system's overall CPU utilization.
+Before dimension templates, you would need the following three entities:
+
+```yaml
+ alarm: cpu_system
+ on: system.cpu
+lookup: average -10m percentage of system
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+
+ alarm: cpu_user
+ on: system.cpu
+lookup: average -10m percentage of user
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+
+ alarm: cpu_nice
+ on: system.cpu
+lookup: average -10m percentage of nice
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+```
+
+With dimension templates, you can condense these into a single alarm. Take note of the `alarm` and `lookup` lines.
+
+```yaml
+ alarm: cpu_template
+ on: system.cpu
+lookup: average -10m percentage foreach system,user,nice
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+```
+
+The `alarm` line specifies the naming scheme Netdata will use. You can use whatever naming scheme you'd like, with `.`
+and `_` being the only allowed symbols.
+
+The `lookup` line has changed from `of` to `foreach`, and we're now passing three dimensions.
+
+In this example, Netdata will create three alarms with the names `cpu_template_system`, `cpu_template_user`, and
+`cpu_template_nice`. Every minute, each alarm will use the same database query to calculate the average CPU usage for
+the `system`, `user`, and `nice` dimensions over the last 10 minutes and send out alarms if necessary.
+
+You can find these three alarms active by clicking on the **Alarms** button in the top navigation, and then clicking on
+the **All** tab and scrolling to the **system - cpu** collapsible section.
+
+![Three new alarms created from the dimension template](https://user-images.githubusercontent.com/1153921/66218994-29523800-e67f-11e9-9bcb-9bca23e2c554.png)
+
+Let's look at some other examples of how `foreach` works so you can best apply it in your configurations.
+
+### Using a Netdata simple pattern in `foreach`
+
+In the last example, we used `foreach system,user,nice` to create three distinct alarms using dimension templates. But
+what if you want to quickly create alarms for _all_ the dimensions of a given chart?
+
+Use a [simple pattern](/libnetdata/simple_pattern/README.md)! One example of a simple pattern is a single wildcard
+(`*`).
+
+Instead of monitoring system CPU usage, let's monitor per-application CPU usage using the `apps.cpu` chart. Passing a
+wildcard as the simple pattern tells Netdata to create a separate alarm for _every_ process on your system:
+
+```yaml
+ alarm: app_cpu
+ on: apps.cpu
+lookup: average -10m percentage foreach *
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+```
+
+This entity will now create alarms for every dimension in the `apps.cpu` chart. Given that most `apps.cpu` charts have
+10 or more dimensions, using the wildcard ensures you catch every CPU-hogging process.
+
+To learn more about how to use simple patterns with dimension templates, see our [simple patterns
+documentation](/libnetdata/simple_pattern/README.md).
+
+## Using `foreach` with alarm templates
+
+Dimension templates also work with [alarm templates](/health/REFERENCE.md#alarm-line-alarm-or-template). Alarm
+templates help you create alarms for all the charts with a given context鈥攆or example, all the cores of your system's
+CPU.
+
+By combining the two, you can create dozens of individual alarms with a single template entity. Here's how you would
+create alarms for the `system`, `user`, and `nice` dimensions for every chart in the `cpu.cpu` context鈥攐r, in other
+words, every CPU core.
+
+```yaml
+template: cpu_template
+ on: cpu.cpu
+ lookup: average -10m percentage foreach system,user,nice
+ every: 1m
+ warn: $this > 50
+ crit: $this > 80
+```
+
+On a system with a 6-core, 12-thread Ryzen 5 1600 CPU, this one entity creates alarms on the following charts and
+dimensions:
+
+- `cpu.cpu0`
+ - `cpu_template_user`
+ - `cpu_template_system`
+ - `cpu_template_nice`
+- `cpu.cpu1`
+ - `cpu_template_user`
+ - `cpu_template_system`
+ - `cpu_template_nice`
+- `cpu.cpu2`
+ - `cpu_template_user`
+ - `cpu_template_system`
+ - `cpu_template_nice`
+- ...
+- `cpu.cpu11`
+ - `cpu_template_user`
+ - `cpu_template_system`
+ - `cpu_template_nice`
+
+And how just a few of those dimension template-generated alarms look like in the Netdata dashboard.
+
+![A few of the created alarms in the Netdata dashboard](https://user-images.githubusercontent.com/1153921/66219669-708cf880-e680-11e9-8b3a-7bfe178fa28b.png)
+
+All in all, this single entity creates 36 individual alarms. Much easier than writing 36 separate entities in your
+health configuration files!
+
+## What's next?
+
+We hope you're excited about the possibilities of using dimension templates! Maybe they'll inspire you to build new
+alarms that will help you better monitor the health of your systems.
+
+Or, at the very least, simplify your configuration files.
+
+For information about other advanced features in Netdata's health monitoring toolkit, check out our [health
+documentation](/health/README.md). And if you have some cool alarms you built using dimension templates,
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2dimension-templates&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/kubernetes-k8s-netdata.md b/docs/guides/monitor/kubernetes-k8s-netdata.md
new file mode 100644
index 0000000..40af0e9
--- /dev/null
+++ b/docs/guides/monitor/kubernetes-k8s-netdata.md
@@ -0,0 +1,278 @@
+<!--
+title: "Monitor a Kubernetes (k8s) cluster with Netdata"
+description: "Use Netdata's helmchart, service discovery plugin, and Kubelet/kube-proxy collectors for real-time visibility into your Kubernetes cluster."
+image: /img/seo/guides/monitor/kubernetes-k8s-netdata.png
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/kubernetes-k8s-netdata.md
+-->
+
+# Monitor a Kubernetes cluster with Netdata
+
+While Kubernetes (k8s) might simplify the way you deploy, scale, and load-balance your applications, not all clusters
+come with "batteries included" when it comes to monitoring. Doubly so for a monitoring stack that helps you actively
+troubleshoot issues with your cluster.
+
+Some k8s providers, like GKE (Google Kubernetes Engine), do deploy clusters bundled with monitoring capabilities, such
+as Google Stackdriver Monitoring. However, these pre-configured solutions might not offer the depth of metrics,
+customization, or integration with your preferred alerting methods.
+
+Without this visibility, it's like you built an entire house and _then_ smashed your way through the finished walls to
+add windows.
+
+At Netdata, we're working to build Kubernetes monitoring tools that add visibility without complexity while also helping
+you actively troubleshoot anomalies or outages. Better yet, this toolkit includes a few complementary collectors that
+let you monitor the many layers of a Kubernetes cluster entirely for free.
+
+We already have a few complementary tools and collectors for monitoring the many layers of a Kubernetes cluster,
+_entirely for free_. These methods work together to help you troubleshoot performance or availability issues across
+your k8s infrastructure.
+
+- A [Helm chart](https://github.com/netdata/helmchart), which bootstraps a Netdata Agent pod on every node in your
+ cluster, plus an additional parent pod for storing metrics and managing alarm notifications.
+- A [service discovery plugin](https://github.com/netdata/agent-service-discovery), which discovers and creates
+ configuration files for [compatible
+ applications](https://github.com/netdata/helmchart#service-discovery-and-supported-services) and any endpoints
+ covered by our [generic Prometheus
+ collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus). With these
+ configuration files, Netdata collects metrics from any compatible applications as they run _inside_ of a pod.
+ Service discovery happens without manual intervention as pods are created, destroyed, or moved between nodes.
+- A [Kubelet collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubelet), which runs
+ on each node in a k8s cluster to monitor the number of pods/containers, the volume of operations on each container,
+ and more.
+- A [kube-proxy collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubeproxy), which
+ also runs on each node and monitors latency and the volume of HTTP requests to the proxy.
+- A [cgroups collector](/collectors/cgroups.plugin/README.md), which collects CPU, memory, and bandwidth metrics for
+ each container running on your k8s cluster.
+
+By following this guide, you'll learn how to discover, explore, and take away insights from each of these layers in your
+Kubernetes cluster. Let's get started.
+
+## Prerequisites
+
+To follow this guide, you need:
+
+- A working cluster running Kubernetes v1.9 or newer.
+- The [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command line tool, within [one minor version
+ difference](https://kubernetes.io/docs/tasks/tools/install-kubectl/#before-you-begin) of your cluster, on an
+ administrative system.
+- The [Helm package manager](https://helm.sh/) v3.0.0 or newer on the same administrative system.
+
+**You need to install the Netdata Helm chart on your cluster** before you proceed. See our [Kubernetes installation
+process](/packaging/installer/methods/kubernetes.md) for details.
+
+This guide uses a 3-node cluster, running on Digital Ocean, as an example. This cluster runs CockroachDB, Redis, and
+Apache, which we'll use as examples of how to monitor a Kubernetes cluster with Netdata.
+
+```bash
+kubectl get nodes
+NAME STATUS ROLES AGE VERSION
+pool-0z7557lfb-3fnbf Ready <none> 51m v1.17.5
+pool-0z7557lfb-3fnbx Ready <none> 51m v1.17.5
+pool-0z7557lfb-3fnby Ready <none> 51m v1.17.5
+
+kubectl get pods
+NAME READY STATUS RESTARTS AGE
+cockroachdb-0 1/1 Running 0 44h
+cockroachdb-1 1/1 Running 0 44h
+cockroachdb-2 1/1 Running 1 44h
+cockroachdb-init-q7mp6 0/1 Completed 0 44h
+httpd-6f6cb96d77-4zlc9 1/1 Running 0 2m47s
+httpd-6f6cb96d77-d9gs6 1/1 Running 0 2m47s
+httpd-6f6cb96d77-xtpwn 1/1 Running 0 11m
+netdata-child-5p2m9 2/2 Running 0 42h
+netdata-child-92qvf 2/2 Running 0 42h
+netdata-child-djc6w 2/2 Running 0 42h
+netdata-parent-0 1/1 Running 0 42h
+redis-6bb94d4689-6nn6v 1/1 Running 0 73s
+redis-6bb94d4689-c2fk2 1/1 Running 0 73s
+redis-6bb94d4689-tjcz5 1/1 Running 0 88s
+```
+
+## Explore Netdata's Kubernetes charts
+
+The Helm chart installs and enables everything you need for visibility into your k8s cluster, including the service
+discovery plugin, Kubelet collector, kube-proxy collector, and cgroups collector.
+
+To get started, open your browser and navigate to your cluster's Netdata dashboard. See our [Kubernetes installation
+instructions](/packaging/installer/methods/kubernetes.md) for how to access the dashboard based on your cluster's
+configuration.
+
+You'll see metrics from the parent pod as soon as you navigate to the dashboard:
+
+![The Netdata dashboard when monitoring a Kubernetes
+cluster](https://user-images.githubusercontent.com/1153921/85343043-c6206400-b4a0-11ea-8de6-cf2c6837c456.png)
+
+Remember that the parent pod is responsible for storing metrics from all the child pods and sending alarms.
+
+Take note of the **Replicated Nodes** menu, which shows not only the parent pod, but also the three child pods. This
+example cluster has three child pods, but the number of child pods depends entirely on the number of nodes in your
+cluster.
+
+You'll use the links in the **Replicated Nodes** menu to navigate between the various pods in your cluster. Let's do
+that now to explore the pod-level Kubernetes monitoring Netdata delivers.
+
+### Pods
+
+Click on any of the nodes under **netdata-parent-0**. Netdata redirects you to a separate instance of the Netdata
+dashboard, run by the Netdata child pod, which visualizes thousands of metrics from that node.
+
+![The Netdata dashboard monitoring a pod in a Kubernetes
+cluster](https://user-images.githubusercontent.com/1153921/85348461-85c8e200-b4b0-11ea-85fa-e88046e94719.png)
+
+From this dashboard, you can see all the familiar charts showing the health and performance of an individual node, just
+like you would if you installed Netdata on a single physical system. Explore CPU, memory, bandwidth, networking, and
+more.
+
+You can use the menus on the right-hand side of the dashboard to navigate between different sections of charts and
+metrics.
+
+For example, click on the **Applications** section to view per-application metrics, collected by
+[apps.plugin](/collectors/apps.plugin/README.md). The first chart you see is **Apps CPU Time (100% = 1 core)
+(apps.cpu)**, which shows the CPU utilization of various applications running on the node. You shouldn't be surprised to
+find Netdata processes (`netdata`, `sd-agent`, and more) alongside Kubernetes processes (`kubelet`, `kube-proxy`, and
+`containers`).
+
+![Per-application monitoring on a Kubernetes
+cluster](https://user-images.githubusercontent.com/1153921/85348852-ad6c7a00-b4b1-11ea-95b4-5952bd0e9d98.png)
+
+Beneath the **Applications** section, you'll begin to see sections for **k8s kubelet**, **k8s kubeproxy**, and long
+strings that start with **k8s**, which are sections for metrics collected by
+[`cgroups.plugin`](/collectors/cgroups.plugin/README.md). Let's skip over those for now and head further down to see
+Netdata's service discovery in action.
+
+### Service discovery (services running inside of pods)
+
+Thanks to Netdata's service discovery feature, you monitor containerized applications running in k8s pods with zero
+configuration or manual intervention. Service discovery is like a watchdog for created or deleted pods, recognizing the
+service they run based on the image name and port and immediately attempting to apply a logical default configuration.
+
+Service configuration supports [popular
+applications](https://github.com/netdata/helmchart#service-discovery-and-supported-services), plus any endpoints covered
+by our [generic Prometheus collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus),
+which are automatically added or removed from Netdata as soon as the pods are created or destroyed.
+
+You can find these service discovery sections near the bottom of the menu. The names for these sections follow a
+pattern: the name of the detected service, followed by a string of the module name, pod TUID, service type, port
+protocol, and port number. See the graphic below to help you identify service discovery sections.
+
+![Showing the difference between cgroups and service discovery
+sections](https://user-images.githubusercontent.com/1153921/85443711-73998300-b546-11ea-9b3b-2dddfe00bdf8.png)
+
+For example, the first service discovery section shows metrics for a pod running an Apache web server running on port 80
+in a pod named `httpd-6f6cb96d77-xtpwn`.
+
+> If you don't see any service discovery sections, it's either because your services are not compatible with service
+> discovery or you changed their default configuration, such as the listening port. See the [list of supported
+> services](https://github.com/netdata/helmchart#service-discovery-and-supported-services) for details about whether
+> your installed services are compatible with service discovery, or read the [configuration
+> instructions](/packaging/installer/methods/kubernetes.md#configure-service-discovery) to change how it discovers the
+> supported services.
+
+Click on any of these service discovery sections to see metrics from that particular service. For example, click on the
+**Apache apache-default httpd-6f6cb96d77-xtpwn httpd tcp 80** section brings you to a series of charts populated by the
+[Apache collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache) itself.
+
+With service discovery, you can now see valuable metrics like requests, bandwidth, workers, and more for this pod.
+
+![Apache metrics collected via service
+discovery](https://user-images.githubusercontent.com/1153921/85443905-a5aae500-b546-11ea-99f0-be20ba796feb.png)
+
+The same goes for metrics coming from the CockroachDB pod running on this same node.
+
+![CockroachDB metrics collected via service
+discovery](https://user-images.githubusercontent.com/1153921/85444316-0e925d00-b547-11ea-83ba-b834275cb419.png)
+
+Service discovery helps you monitor the health of specific applications running on your Kubernetes cluster, which in
+turn gives you a complete resource when troubleshooting your infrastructure's health and performance.
+
+### Kubelet
+
+Let's head back up the menu to the **k8s kubelet** section. Kubelet is an agent that runs on every node in a cluster. It
+receives a set of PodSpecs from the Kubernetes Control Plane and ensures the pods described there are both running and
+healthy. Think of it as a manager for the various pods on that node.
+
+Monitoring each node's Kubelet can be invaluable when diagnosing issues with your Kubernetes cluster. For example, you
+can see when the volume of running containers/pods has dropped.
+
+![Charts showing pod and container removal during a scale
+down](https://user-images.githubusercontent.com/1153921/85598613-9ab48b00-b600-11ea-827e-d9ec7779e2d4.png)
+
+This drop might signal a fault or crash in a particular Kubernetes service or deployment (see `kubectl get services` or
+`kubectl get deployments` for more details). If the number of pods increases, it may be because of something more
+benign, like another member of your team scaling up a service with `kubectl scale`.
+
+You can also view charts for the Kubelet API server, the volume of runtime/Docker operations by type,
+configuration-related errors, and the actual vs. desired numbers of volumes, plus a lot more.
+
+Kubelet metrics are collected and visualized thanks to the [kubelet
+collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubelet), which is enabled with
+zero configuration on most Kubernetes clusters with standard configurations.
+
+### kube-proxy
+
+Scroll down into the **k8s kubeproxy** section to see metrics about the network proxy that runs on each node in your
+Kubernetes cluster. kube-proxy allows for pods to communicate with each other and accept sessions from outside your
+cluster.
+
+With Netdata, you can monitor how often your k8s proxies are syncing proxy rules between nodes. Dramatic changes in
+these figures could indicate an anomaly in your cluster that's worthy of further investigation.
+
+kube-proxy metrics are collected and visualized thanks to the [kube-proxy
+collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubeproxy), which is enabled with
+zero configuration on most Kubernetes clusters with standard configurations.
+
+### Containers
+
+We can finally talk about the final piece of Kubernetes monitoring: containers. Each Kubernetes pod is a set of one or
+more cooperating containers, sharing the same namespace, all of which are resourced and tracked by the cgroups feature
+of the Linux kernel. Netdata automatically detects and monitors each running container by interfacing with the cgroups
+feature itself.
+
+You can find these sections beneath **Users**, **k8s kubelet**, and **k8s kubeproxy**. Below, a number of containers
+devoted to running services like CockroachDB, Apache, Redis, and more.
+
+![A number of sections devoted to
+containers](https://user-images.githubusercontent.com/1153921/85480217-74e1a480-b574-11ea-9da7-dd975e0fde0c.png)
+
+Let's look at the section devoted to the container that runs the Apache pod named `httpd-6f6cb96d77-xtpwn`, as described
+in the previous part on [service discovery](#service-discovery-services-running-inside-of-pods).
+
+![cgroups metrics for an Apache
+container/pod](https://user-images.githubusercontent.com/1153921/85480516-03562600-b575-11ea-92ae-dd605bf04106.png)
+
+At first glance, these sections might seem redundant. You might ask, "Why do I need both a service discovery section
+_and_ a container section? It's just one pod, after all!"
+
+The difference is that while the service discovery section shows _Apache_ metrics, the equivalent cgroups section shows
+that container's CPU, memory, and bandwidth usage. You can use the two sections in conjunction to monitor the health and
+performance of your pods and the services they run.
+
+For example, let's say you get an alarm notification from `netdata-parent-0` saying the
+`ea287694-0f22-4f39-80aa-2ca066caf45a` container (also known as the `httpd-6f6cb96d77-xtpwn` pod) is using 99% of its
+available RAM. You can then hop over to the **Apache apache-default httpd-6f6cb96d77-xtpwn httpd tcp 80** section to
+further investigate why Apache is using an unexpected amount of RAM.
+
+All container metrics, whether they're managed by Kubernetes or the Docker service directly, are collected by the
+[cgroups collector](/collectors/cgroups.plugin/README.md). Because this collector integrates with the cgroups Linux
+kernel feature itself, monitoring containers requires zero configuration on most Kubernetes clusters.
+
+## What's next?
+
+After following this guide, you should have a more comprehensive understanding of how to monitor your Kubernetes cluster
+with Netdata. With this setup, you can monitor the health and performance of all your nodes, pods, services, and k8s
+agents. Pre-configured alarms will tell you when something goes awry, and this setup gives you every per-second metric
+you need to make informed decisions about your cluster.
+
+The best part of monitoring a Kubernetes cluster with Netdata is that you don't have to worry about constantly running
+complex `kubectl` commands to see hundreds of highly granular metrics from your nodes. And forget about using `kubectl
+exec -it pod bash` to start up a shell on a pod to find and diagnose an issue with any given pod on your cluster.
+
+And with service discovery, all your compatible pods will automatically appear and disappear as they scale up, move, or
+scale down across your cluster.
+
+To monitor your Kubernetes cluster with Netdata, start by [installing the Helm
+chart](/packaging/installer/methods/kubernetes.md) if you haven't already. The Netdata Agent is open source and entirely
+free for every cluster and every organization, whether you have 10 or 10,000 pods. A few minutes and one `helm install`
+later and you'll have started on the path of building an effective platform for troubleshooting the next performance or
+availability issue on your Kubernetes cluster.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fkubernetes-k8s-netdata.md&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/pi-hole-raspberry-pi.md b/docs/guides/monitor/pi-hole-raspberry-pi.md
new file mode 100644
index 0000000..a180466
--- /dev/null
+++ b/docs/guides/monitor/pi-hole-raspberry-pi.md
@@ -0,0 +1,163 @@
+<!--
+title: "Monitor Pi-hole (and a Raspberry Pi) with Netdata"
+description: "Monitor Pi-hole metrics, plus Raspberry Pi system metrics, in minutes and completely for free with Netdata's open-source monitoring agent."
+image: /img/seo/guides/monitor/netdata-pi-hole-raspberry-pi.png
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/pi-hole-raspberry-pi.md
+-->
+
+# Monitor Pi-hole (and a Raspberry Pi) with Netdata
+
+Between intrusive ads, invasive trackers, and vicious malware, many techies and homelab enthusiasts are advancing their
+networks' security and speed with a tiny computer and a powerful piece of software: [Pi-hole](https://pi-hole.net/).
+
+Pi-hole is a DNS sinkhole that prevents unwanted content from even reaching devices on your home network. It blocks ads
+and malware at the network, instead of using extensions/add-ons for individual browsers, so you'll stop seeing ads in
+some of the most intrusive places, like your smart TV. Pi-hole can even [improve your network's speed and reduce
+bandwidth](https://discourse.pi-hole.net/t/will-pi-hole-slow-down-my-network/2048).
+
+Most Pi-hole users run it on a [Raspberry Pi](https://www.raspberrypi.org/products/raspberry-pi-4-model-b/) (hence the
+name), a credit card-sized, super-capable computer that costs about $35.
+
+And to keep tabs on how both Pi-hole and the Raspberry Pi are working to protect your network, you can use the
+open-source [Netdata monitoring agent](https://github.com/netdata/netdata).
+
+To get started, all you need is a [Raspberry Pi](https://www.raspberrypi.org/products/raspberry-pi-4-model-b/) with
+Raspbian installed. This guide uses a Raspberry Pi 4 Model B and Raspbian GNU/Linux 10 (buster). This guide assumes
+you're connecting to a Raspberry Pi remotely over SSH, but you could also complete all these steps on the system
+directly using a keyboard, mouse, and monitor.
+
+## Why monitor Pi-hole and a Raspberry Pi with Netdata?
+
+Netdata helps you monitor and troubleshoot all kinds of devices and the applications they run, including IoT devices
+like the Raspberry Pi and applications like Pi-hole.
+
+After a two-minute installation and with zero configuration, you'll be able to see all of Pi-hole's metrics, including
+the volume of queries, connected clients, DNS queries per type, top clients, top blocked domains, and more.
+
+With Netdata installed, you can also monitor system metrics and any other applications you might be running. By default,
+Netdata collects metrics on CPU usage, disk IO, bandwidth, per-application resource usage, and a ton more. With the
+Raspberry Pi used for this guide, Netdata automatically collects about 1,500 metrics every second!
+
+![Real-time Pi-hole monitoring with
+Netdata](https://user-images.githubusercontent.com/1153921/90447745-c8fe9600-e098-11ea-8a57-4f07339f002b.png)
+
+## Install Netdata
+
+Let's start by installing Netdata first so that it can start collecting system metrics as soon as possible for the most
+possible historic data.
+
+> 鈿狅笍 Don't install Netdata using `apt` and the default package available in Raspbian. The Netdata team does not maintain
+> this package, and can't guarantee it works properly.
+
+On Raspberry Pis running Raspbian, the best way to install Netdata is our one-line kickstart script. This script asks
+you to install dependencies, then compiles Netdata from source via [GitHub](https://github.com/netdata/netdata).
+
+```bash
+bash <(curl -Ss https://my-netdata.io/kickstart.sh)
+```
+
+Once installed on a Raspberry Pi 4 with no accessories, Netdata starts collecting roughly 1,500 metrics every second and
+populates its dashboard with more than 250 charts.
+
+Open your browser of choice and navigate to `http://NODE:19999/`, replacing `NODE` with the IP address of your Raspberry
+Pi. Not sure what that IP is? Try running `hostname -I | awk '{print $1}'` from the Pi itself.
+
+You'll see Netdata's dashboard and a few hundred real-time,
+[interactive](https://learn.netdata.cloud/guides/step-by-step/step-02#interact-with-charts) charts. Feel free to
+explore, but let's turn our attention to installing Pi-hole.
+
+## Install Pi-Hole
+
+Like Netdata, Pi-hole has a one-line script for simple installation. From your Raspberry Pi, run the following:
+
+```bash
+curl -sSL https://install.pi-hole.net | bash
+```
+
+The installer will help you set up Pi-hole based on the topology of your network. Once finished, you should set up your
+devices鈥攐r your router for system-wide sinkhole protection鈥攖o [use Pi-hole as their DNS
+service](https://discourse.pi-hole.net/t/how-do-i-configure-my-devices-to-use-pi-hole-as-their-dns-server/245). You've
+finished setting up Pi-hole at this point.
+
+As far as configuring Netdata to monitor Pi-hole metrics, there's nothing you actually need to do. Netdata's [Pi-hole
+collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/pihole) will autodetect the new service
+running on your Raspberry Pi and immediately start collecting metrics every second.
+
+Restart Netdata with `sudo service netdata restart` to start Netdata, which will then recognize that Pi-hole is running
+and start a per-second collection job. When you refresh your Netdata dashboard or load it up again in a new tab, you'll
+see a new entry in the menu for **Pi-hole** metrics.
+
+## Use Netdata to explore and monitor your Raspberry Pi and Pi-hole
+
+By the time you've reached this point in the guide, Netdata has already collected a ton of valuable data about your
+Raspberry Pi, Pi-hole, and any other apps/services you might be running. Even a few minutes of collecting 1,500 metrics
+per second adds up quickly.
+
+You can now use Netdata's synchronized charts to zoom, highlight, scrub through time, and discern how an anomaly in one
+part of your system might affect another.
+
+![The Netdata dashboard in
+action](https://user-images.githubusercontent.com/1153921/80827388-b9fee100-8b98-11ea-8f60-0d7824667cd3.gif)
+
+If you're completely new to Netdata, look at our [step-by-step guide](/docs/guides/step-by-step/step-00.md) for a
+walkthrough of all its features. For a more expedited tour, see the [get started guide](/docs/getting-started.md).
+
+### Enable temperature sensor monitoring
+
+You need to manually enable Netdata's built-in [temperature sensor
+collector](https://learn.netdata.cloud/docs/agent/collectors/charts.d.plugin/sensors) to start collecting metrics.
+
+> Netdata uses a few plugins to manage its [collectors](/collectors/REFERENCE.md), each using a different language: Go,
+> Python, Node.js, and Bash. While our Go collectors are undergoing the most active development, we still support the
+> other languages. In this case, you need to enable a temperature sensor collector that's written in Bash.
+
+First, open the `charts.d.conf` file for editing. You should always use the `edit-config` script to edit Netdata's
+configuration files, as it ensures your settings persist across updates to the Netdata Agent.
+
+```bash
+cd /etc/netdata
+sudo ./edit-config charts.d.conf
+```
+
+Uncomment the `sensors=force` line and save the file. Restart Netdata with `sudo service netdata restart` to enable
+Raspberry Pi temperature sensor monitoring.
+
+### Storing historical metrics on your Raspberry Pi
+
+By default, Netdata allocates 256 MiB in disk space to store historical metrics inside the [database
+engine](/database/engine/README.md). On the Raspberry Pi used for this guide, Netdata collects 1,500 metrics every
+second, which equates to storing 3.5 days worth of historical metrics.
+
+You can increase this allocation by editing `netdata.conf` and increasing the `dbengine multihost disk space` setting to
+more than 256.
+
+```yaml
+[global]
+ dbengine multihost disk space = 512
+```
+
+Use our [database sizing
+calculator](/docs/store/change-metrics-storage.md#calculate-the-system-resources-RAM-disk-space-needed-to-store-metrics)
+and [guide on storing historical metrics](/docs/guides/longer-metrics-storage.md) to help you determine the right
+setting for your Raspberry Pi.
+
+## What's next?
+
+Now that you're monitoring Pi-hole and your Raspberry Pi with Netdata, you can extend its capabilities even further, or
+configure Netdata to more specific goals.
+
+Most importantly, you can always install additional services and instantly collect metrics from many of them with our
+[300+ integrations](/collectors/COLLECTORS.md).
+
+- [Optimize performance](/docs/guides/configure/performance.md) using tweaks developed for IoT devices.
+- [Stream Raspberry Pi metrics](/streaming/README.md) to a parent host for easy access or longer-term storage.
+- [Tweak alarms](/health/QUICKSTART.md) for either Pi-hole or the health of your Raspberry Pi.
+- [Export metrics to external databases](/exporting/README.md) with the exporting engine.
+
+Or, head over to [our guides](https://learn.netdata.cloud/guides/) for even more experiments and insights into
+troubleshooting the health of your systems and services.
+
+If you have any questions about using Netdata to monitor your Raspberry Pi, Pi-hole, or any other applications, head on
+over to our [community forum](https://community.netdata.cloud/).
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fpi-hole-raspberry-pi.md&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/process.md b/docs/guides/monitor/process.md
new file mode 100644
index 0000000..893e6b7
--- /dev/null
+++ b/docs/guides/monitor/process.md
@@ -0,0 +1,299 @@
+<!--
+title: Monitor any process in real-time with Netdata
+description: "Tap into Netdata's powerful collectors, with per-second utilization metrics for every process, to troubleshoot faster and make data-informed decisions."
+image: /img/seo/guides/monitor/process.png
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/process.md
+-->
+
+# Monitor any process in real-time with Netdata
+
+Netdata is more than a multitude of generic system-level metrics and visualizations. Instead of providing only a bird's
+eye view of your system, leaving you to wonder exactly _what_ is taking up 99% CPU, Netdata also gives you visibility
+into _every layer_ of your node. These additional layers give you context, and meaningful insights, into the true health
+and performance of your infrastructure.
+
+One of these layers is the _process_. Every time a Linux system runs a program, it creates an independent process that
+executes the program's instructions in parallel with anything else happening on the system. Linux systems track the
+state and resource utilization of processes using the [`/proc` filesystem](https://en.wikipedia.org/wiki/Procfs), and
+Netdata is designed to hook into those metrics to create meaningful visualizations out of the box.
+
+While there are a lot of existing command-line tools for tracking processes on Linux systems, such as `ps` or `top`,
+only Netdata provides dozens of real-time charts, at both per-second and event frequency, without you having to write
+SQL queries or know a bunch of arbitrary command-line flags.
+
+With Netdata's process monitoring, you can:
+
+- Benchmark/optimize performance of standard applications, like web servers or databases
+- Benchmark/optimize performance of custom applications
+- Troubleshoot CPU/memory/disk utilization issues (why is my system's CPU spiking right now?)
+- Perform granular capacity planning based on the specific needs of your infrastructure
+- Search for leaking file descriptors
+- Investigate zombie processes
+
+... and much more. Let's get started.
+
+## Prerequisites
+
+- One or more Linux nodes running the [Netdata Agent](/docs/get/README.md). If you need more time to understand
+ Netdata before following this guide, see the [infrastructure](/docs/quickstart/infrastructure.md) or
+ [single-node](/docs/quickstart/single-node.md) monitoring quickstarts.
+- A general understanding of how to [configure the Netdata Agent](/docs/configure/nodes.md) using `edit-config`.
+- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
+
+## How does Netdata do process monitoring?
+
+The Netdata Agent already knows to look for hundreds of [standard applications that we support via
+collectors](/collectors/COLLECTORS.md), and groups them based on their purpose. Let's say you want to monitor a MySQL
+database using its process. The Netdata Agent already knows to look for processes with the string `mysqld` in their
+name, along with a few others, and puts them into the `sql` group. This `sql` group then becomes a dimension in all
+process-specific charts.
+
+The process and groups settings are used by two unique and powerful collectors.
+
+[**`apps.plugin`**](/collectors/apps.plugin/README.md) looks at the Linux process tree every second, much like `top` or
+`ps fax`, and collects resource utilization information on every running process. It then automatically adds a layer of
+meaningful visualization on top of these metrics, and creates per-process/application charts.
+
+[**`ebpf.plugin`**](/collectors/ebpf.plugin/README.md): Netdata's extended Berkeley Packet Filter (eBPF) collector
+monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management, and then hands
+process-specific metrics over to `apps.plugin` for visualization. The eBPF collector also collects and visualizes
+metrics on an _event frequency_, which means it captures every kernel interaction, and not just the volume of
+interaction at every second in time. That's even more precise than Netdata's standard per-second granularity.
+
+### Per-process metrics and charts in Netdata
+
+With these collectors working in parallel, Netdata visualizes the following per-second metrics for _any_ process on your
+Linux systems:
+
+- CPU utilization (`apps.cpu`)
+ - Total CPU usage
+ - User/system CPU usage (`apps.cpu_user`/`apps.cpu_system`)
+- Disk I/O
+ - Physical reads/writes (`apps.preads`/`apps.pwrites`)
+ - Logical reads/writes (`apps.lreads`/`apps.lwrites`)
+ - Open unique files (if a file is found open multiple times, it is counted just once, `apps.files`)
+- Memory
+ - Real Memory Used (non-shared, `apps.mem`)
+ - Virtual Memory Allocated (`apps.vmem`)
+ - Minor page faults (i.e. memory activity, `apps.minor_faults`)
+- Processes
+ - Threads running (`apps.threads`)
+ - Processes running (`apps.processes`)
+ - Carried over uptime (since the last Netdata Agent restart, `apps.uptime`)
+ - Minimum uptime (`apps.uptime_min`)
+ - Average uptime (`apps.uptime_average`)
+ - Maximum uptime (`apps.uptime_max`)
+ - Pipes open (`apps.pipes`)
+- Swap memory
+ - Swap memory used (`apps.swap`)
+ - Major page faults (i.e. swap activity, `apps.major_faults`)
+- Network
+ - Sockets open (`apps.sockets`)
+- eBPF file
+ - Number of calls to open files. (`apps.file_open`)
+ - Number of files closed. (`apps.file_closed`)
+ - Number of calls to open files that returned errors.
+ - Number of calls to close files that returned errors.
+- eBPF syscall
+ - Number of calls to delete files. (`apps.file_deleted`)
+ - Number of calls to `vfs_write`. (`apps.vfs_write_call`)
+ - Number of calls to `vfs_read`. (`apps.vfs_read_call`)
+ - Number of bytes written with `vfs_write`. (`apps.vfs_write_bytes`)
+ - Number of bytes read with `vfs_read`. (`apps.vfs_read_bytes`)
+ - Number of calls to write a file that returned errors.
+ - Number of calls to read a file that returned errors.
+- eBPF process
+ - Number of process created with `do_fork`. (`apps.process_create`)
+ - Number of threads created with `do_fork` or `__x86_64_sys_clone`, depending on your system's kernel version. (`apps.thread_create`)
+ - Number of times that a process called `do_exit`. (`apps.task_close`)
+- eBPF net
+ - Number of bytes sent. (`apps.bandwidth_sent`)
+ - Number of bytes received. (`apps.bandwidth_recv`)
+
+As an example, here's the per-process CPU utilization chart, including a `sql` group/dimension.
+
+![A per-process CPU utilization chart in Netdata
+Cloud](https://user-images.githubusercontent.com/1153921/101217226-3a5d5700-363e-11eb-8610-aa1640aefb5d.png)
+
+## Configure the Netdata Agent to recognize a specific process
+
+To monitor any process, you need to make sure the Netdata Agent is aware of it. As mentioned above, the Agent is already
+aware of hundreds of processes, and collects metrics from them automatically.
+
+But, if you want to change the grouping behavior, add an application that isn't yet supported in the Netdata Agent, or
+monitor a custom application, you need to edit the `apps_groups.conf` configuration file.
+
+Navigate to your [Netdata config directory](/docs/configure/nodes.md) and use `edit-config` to edit the file.
+
+```bash
+cd /etc/netdata # Replace this with your Netdata config directory if not at /etc/netdata.
+sudo ./edit-config apps_groups.conf
+```
+
+Inside the file are lists of process names, oftentimes using wildcards (`*`), that the Netdata Agent looks for and
+groups together. For example, the Netdata Agent looks for processes starting with `mysqld`, `mariad`, `postgres`, and
+others, and groups them into `sql`. That makes sense, since all these processes are for SQL databases.
+
+```conf
+sql: mysqld* mariad* postgres* postmaster* oracle_* ora_* sqlservr
+```
+
+These groups are then reflected as [dimensions](/web/README.md#dimensions) within Netdata's charts.
+
+![An example per-process CPU utilization chart in Netdata
+Cloud](https://user-images.githubusercontent.com/1153921/101369156-352e2100-3865-11eb-9f0d-b8fac162e034.png)
+
+See the following two sections for details based on your needs. If you don't need to configure `apps_groups.conf`, jump
+down to [visualizing process metrics](#visualize-process-metrics).
+
+### Standard applications (web servers, databases, containers, and more)
+
+As explained above, the Netdata Agent is already aware of most standard applications you run on Linux nodes, and you
+shouldn't need to configure it to discover them.
+
+However, if you're using multiple applications that the Netdata Agent groups together you may want to separate them for
+more precise monitoring. If you're not running any other types of SQL databases on that node, you don't need to change
+the grouping, since you know that any MySQL is the only process contributing to the `sql` group.
+
+Let's say you're using both MySQL and PostgreSQL databases on a single node, and want to monitor their processes
+independently. Open the `apps_groups.conf` file as explained in the [section
+above](#configure-the-netdata-agent-to-recognize-a-specific-process) and scroll down until you find the `database
+servers` section. Create new groups for MySQL and PostgreSQL, and move their process queries into the unique groups.
+
+```conf
+# -----------------------------------------------------------------------------
+# database servers
+
+mysql: mysqld*
+postgres: postgres*
+sql: mariad* postmaster* oracle_* ora_* sqlservr
+```
+
+Restart Netdata with `service netdata restart`, or the appropriate method for your system, to start collecting
+utilization metrics from your application. Time to [visualize your process metrics](#visualize-process-metrics).
+
+### Custom applications
+
+Let's assume you have an application that runs on the process `custom-app`. To monitor eBPF metrics for that application
+separate from any others, you need to create a new group in `apps_groups.conf` and associate that process name with it.
+
+Open the `apps_groups.conf` file as explained in the [section
+above](#configure-the-netdata-agent-to-recognize-a-specific-process). Scroll down to `# NETDATA processes accounting`.
+Above that, paste in the following text, which creates a new `custom-app` group with the `custom-app` process. Replace
+`custom-app` with the name of your application's Linux process. `apps_groups.conf` should now look like this:
+
+```conf
+...
+# -----------------------------------------------------------------------------
+# Custom applications to monitor with apps.plugin and ebpf.plugin
+
+custom-app: custom-app
+
+# -----------------------------------------------------------------------------
+# NETDATA processes accounting
+...
+```
+
+Restart Netdata with `service netdata restart`, or the appropriate method for your system, to start collecting
+utilization metrics from your application.
+
+## Visualize process metrics
+
+Now that you're collecting metrics for your process, you'll want to visualize them using Netdata's real-time,
+interactive charts. Find these visualizations in the same section regardless of whether you use [Netdata
+Cloud](https://app.netdata.cloud) for infrastructure monitoring, or single-node monitoring with the local Agent's
+dashboard at `http://localhost:19999`.
+
+If you need a refresher on all the available per-process charts, see the [above
+list](#per-process-metrics-and-charts-in-netdata).
+
+### Using Netdata's application collector (`apps.plugin`)
+
+`apps.plugin` puts all of its charts under the **Applications** section of any Netdata dashboard.
+
+![Screenshot of the Applications section on a Netdata
+dashboard](https://user-images.githubusercontent.com/1153921/101401172-2ceadb80-388f-11eb-9e9a-88443894c272.png)
+
+Let's continue with the MySQL example. We can create a [test
+database](https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap) in
+MySQL to generate load on the `mysql` process.
+
+`apps.plugin` immediately collects and visualizes this activity `apps.cpu` chart, which shows an increase in CPU
+utilization from the `sql` group. There is a parallel increase in `apps.pwrites`, which visualizes writes to disk.
+
+![Per-application CPU utilization
+metrics](https://user-images.githubusercontent.com/1153921/101409725-8527da80-389b-11eb-96e9-9f401535aafc.png)
+
+![Per-application disk writing
+metrics](https://user-images.githubusercontent.com/1153921/101409728-85c07100-389b-11eb-83fd-d79dd1545b5a.png)
+
+Next, the `mysqlslap` utility queries the database to provide some benchmarking load on the MySQL database. It won't
+look exactly like a production database executing lots of user queries, but it gives you an idea into the possibility of
+these visualizations.
+
+```bash
+sudo mysqlslap --user=sysadmin --password --host=localhost --concurrency=50 --iterations=10 --create-schema=employees --query="SELECT * FROM dept_emp;" --verbose
+```
+
+The following per-process disk utilization charts show spikes under the `sql` group at the same time `mysqlslap` was run
+numerous times, with slightly different concurrency and query options.
+
+![Per-application disk
+metrics](https://user-images.githubusercontent.com/1153921/101411810-d08fb800-389e-11eb-85b3-f3fa41f1f887.png)
+
+> 馃挕 Click on any dimension below a chart in Netdata Cloud (or to the right of a chart on a local Agent dashboard), to
+> visualize only that dimension. This can be particularly useful in process monitoring to separate one process'
+> utilization from the rest of the system.
+
+### Using Netdata's eBPF collector (`ebpf.plugin`)
+
+Netdata's eBPF collector puts its charts in two places. Of most importance to process monitoring are the **ebpf file**,
+**ebpf syscall**, **ebpf process**, and **ebpf net** sub-sections under **Applications**, shown in the above screenshot.
+
+For example, running the above workload shows the entire "story" how MySQL interacts with the Linux kernel to open
+processes/threads to handle a large number of SQL queries, then subsequently close the tasks as each query returns the
+relevant data.
+
+![Per-process eBPF
+charts](https://user-images.githubusercontent.com/1153921/101412395-c8844800-389f-11eb-86d2-20c8a0f7b3c0.png)
+
+`ebpf.plugin` visualizes additional eBPF metrics, which are system-wide and not per-process, under the **eBPF** section.
+
+## What's next?
+
+Now that you have `apps_groups.conf` configured correctly, and know where to find per-process visualizations throughout
+Netdata's ecosystem, you can precisely monitor the health and performance of any process on your node using per-second
+metrics.
+
+For even more in-depth troubleshooting, see our guide on [monitoring and debugging applications with
+eBPF](/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md).
+
+If the process you're monitoring also has a [supported collector](/collectors/COLLECTORS.md), now is a great time to set
+that up if it wasn't autodetected. With both process utilization and application-specific metrics, you should have every
+piece of data needed to discover the root cause of an incident. See our [collector
+setup](/docs/collect/enable-configure.md) doc for details.
+
+[Create new dashboards](/docs/visualize/create-dashboards.md) in Netdata Cloud using charts from `apps.plugin`,
+`ebpf.plugin`, and application-specific collectors to build targeted dashboards for monitoring key processes across your
+infrastructure.
+
+Try running [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) on a node that's
+running the process(es) you're monitoring. Even if nothing is going wrong at the moment, Netdata Cloud's embedded
+intelligence helps you better understand how a MySQL database, for example, might influence a system's volume of memory
+page faults. And when an incident is afoot, use Metric Correlations to reduce mean time to resolution (MTTR) and
+cognitive load.
+
+If you want more specific metrics from your custom application, check out Netdata's [statsd
+support](/collectors/statsd.plugin/README.md). With statd, you can send detailed metrics from your application to
+Netdata and visualize them with per-second granularity. Netdata's statsd collector works with dozens of [statsd server
+implementations](https://github.com/etsy/statsd/wiki#client-implementations), which work with most application
+frameworks.
+
+### Related reference documentation
+
+- [Netdata Agent 路 `apps.plugin`](/collectors/apps.plugin/README.md)
+- [Netdata Agent 路 `ebpf.plugin`](/collectors/ebpf.plugin/README.md)
+- [Netdata Agent 路 Dashboards](/web/README.md#dimensions)
+- [Netdata Agent 路 MySQL collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql)
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fprocess&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/stop-notifications-alarms.md b/docs/guides/monitor/stop-notifications-alarms.md
new file mode 100644
index 0000000..587880a
--- /dev/null
+++ b/docs/guides/monitor/stop-notifications-alarms.md
@@ -0,0 +1,92 @@
+<!--
+title: "Stop notifications for individual alarms"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/stop-notifications-alarms.md
+-->
+
+# Stop notifications for individual alarms
+
+In this short tutorial, you'll learn how to stop notifications for individual alarms in Netdata's health
+monitoring system. We also refer to this process as _silencing_ the alarm.
+
+Why silence alarms? We designed Netdata's pre-configured alarms for production systems, so they might not be
+relevant if you run Netdata on your laptop or a small virtual server. If they're not helpful, they can be a distraction
+to real issues with health and performance.
+
+Silencing individual alarms is an excellent solution for situations where you're not interested in seeing a specific
+alarm but don't want to disable a [notification system](/health/notifications/README.md) entirely.
+
+## Find the alarm configuration file
+
+To silence an alarm, you need to know where to find its configuration file.
+
+Let's use the `system.cpu` chart as an example. It's the first chart you'll see on most Netdata dashboards.
+
+To figure out which file you need to edit, open up Netdata's dashboard and, click the **Alarms** button at the top
+of the dashboard, followed by clicking on the **All** tab.
+
+In this example, we're looking for the `system - cpu` entity, which, when opened, looks like this:
+
+![The system - cpu alarm
+entity](https://user-images.githubusercontent.com/1153921/67034648-ebb4cc80-f0cc-11e9-9d49-1023629924f5.png)
+
+In the `source` row, you see that this chart is getting its configuration from
+`4@/usr/lib/netdata/conf.d/health.d/cpu.conf`. The relevant part of begins at `health.d`: `health.d/cpu.conf`. That's
+the file you need to edit if you want to silence this alarm.
+
+For more information about editing or referencing health configuration files on your system, see the [health
+quickstart](/health/QUICKSTART.md#edit-health-configuration-files).
+
+## Edit the file to enable silencing
+
+To edit `health.d/cpu.conf`, use `edit-config` from inside of your Netdata configuration directory.
+
+```bash
+cd /etc/netdata/ # Replace with your Netdata configuration directory, if not /etc/netdata/
+./edit-config health.d/cpu.conf
+```
+
+> You may need to use `sudo` or another method of elevating your privileges.
+
+The beginning of the file looks like this:
+
+```yaml
+template: 10min_cpu_usage
+ on: system.cpu
+ os: linux
+ hosts: *
+ lookup: average -10m unaligned of user,system,softirq,irq,guest
+ units: %
+ every: 1m
+ warn: $this > (($status >= $WARNING) ? (75) : (85))
+ crit: $this > (($status == $CRITICAL) ? (85) : (95))
+ delay: down 15m multiplier 1.5 max 1h
+ info: average cpu utilization for the last 10 minutes (excluding iowait, nice and steal)
+ to: sysadmin
+```
+
+To silence this alarm, change `sysadmin` to `silent`.
+
+```yaml
+ to: silent
+```
+
+Use one of the available [methods](/health/QUICKSTART.md#reload-health-configuration) to reload your health configuration
+ and ensure you get no more notifications about that alarm**.
+
+You can add `to: silent` to any alarm you'd rather not bother you with notifications.
+
+## What's next?
+
+You should now know the fundamentals behind silencing any individual alarm in Netdata.
+
+To learn about _all_ of Netdata's health configuration possibilities, visit the [health reference
+guide](/health/REFERENCE.md), or check out other [tutorials on health monitoring](/health/README.md#tutorials).
+
+Or, take better control over how you get notified about alarms via the [notification
+system](/health/notifications/README.md).
+
+You can also use Netdata's [Health Management API](/web/api/health/README.md#health-management-api) to control health
+checks and notifications while Netdata runs. With this API, you can disable health checks during a maintenance window or
+backup process, for example.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fstop-notifications-alarms%2F&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/docs/guides/monitor/visualize-monitor-anomalies.md b/docs/guides/monitor/visualize-monitor-anomalies.md
new file mode 100644
index 0000000..f37dadc
--- /dev/null
+++ b/docs/guides/monitor/visualize-monitor-anomalies.md
@@ -0,0 +1,147 @@
+<!--
+title: "Monitor and visualize anomalies with Netdata (part 2)"
+description: "Using unsupervised anomaly detection and machine learning, get notified "
+image: /img/seo/guides/monitor/visualize-monitor-anomalies.png
+author: "Joel Hans"
+author_title: "Editorial Director, Technical & Educational Resources"
+author_img: "/img/authors/joel-hans.jpg"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/visualize-monitor-anomalies.md
+-->
+
+# Monitor and visualize anomalies with Netdata (part 2)
+
+Welcome to part 2 of our series of guides on using _unsupervised anomaly detection_ to detect issues with your systems,
+containers, and applications using the open-source Netdata Agent. For an introduction to detecting anomalies and
+monitoring associated metrics, see [part 1](/docs/guides/monitor/anomaly-detection.md), which covers prerequisites and
+configuration basics.
+
+With anomaly detection in the Netdata Agent set up, you will now want to visualize and monitor which charts have
+anomalous data, when, and where to look next.
+
+> 馃挕 In certain cases, the anomalies collector doesn't start immediately after restarting the Netdata Agent. If this
+> happens, you won't see the dashboard section or the relevant [charts](#visualize-anomalies-in-charts) right away. Wait
+> a minute or two, refresh, and look again. If the anomalies charts and alarms are still not present, investigate the
+> error log with `less /var/log/netdata/error.log | grep anomalies`.
+
+## Test anomaly detection
+
+Time to see the Netdata Agent's unsupervised anomaly detection in action. To trigger anomalies on the Nginx web server,
+use `ab`, otherwise known as [Apache Bench](https://httpd.apache.org/docs/2.4/programs/ab.html). Despite its name, it
+works just as well with Nginx web servers. Install it on Ubuntu/Debian systems with `sudo apt install apache2-utils`.
+
+> 馃挕 If you haven't followed the guide's example of using Nginx, an easy way to test anomaly detection on your node is
+> to use the `stress-ng` command, which is available on most Linux distributions. Run `stress-ng --cpu 0` to create CPU
+> stress or `stress-ng --vm 0` for RAM stress. Each test will cause some "collateral damage," in that you may see CPU
+> utilization rise when running the RAM test, and vice versa.
+
+The following test creates a minimum of 10,000,000 requests for Nginx to handle, with a maximum of 10 at any given time,
+with a run time of 60 seconds. If your system can handle those 10,000,000 in less than 60 seconds, `ab` will keep
+sending requests until the timer runs out.
+
+```bash
+ab -k -c 10 -t 60 -n 10000000 http://127.0.0.1/
+```
+
+Let's see how Netdata detects this anomalous behavior and propagates information to you through preconfigured alarms and
+dashboards that automatically organize anomaly detection metrics into meaningful charts to help you begin root cause
+analysis (RCA).
+
+## Monitor anomalies with alarms
+
+The anomalies collector creates two "classes" of alarms for each chart captured by the `charts_regex` setting. All these
+alarms are preconfigured based on your [configuration in
+`anomalies.conf`](/docs/guides/monitor/anomaly-detection.md#configure-the-anomalies-collector). With the `charts_regex`
+and `charts_to_exclude` settings from [part 1](/docs/guides/monitor/anomaly-detection.md) of this guide series, the
+Netdata Agent creates 32 alarms driven by unsupervised anomaly detection.
+
+The first class triggers warning alarms when the average anomaly probability for a given chart has stayed above 50% for
+at least the last two minutes.
+
+![An example anomaly probability
+alarm](https://user-images.githubusercontent.com/1153921/104225767-0a0a9480-5404-11eb-9bfd-e29592397203.png)
+
+The second class triggers warning alarms when the number of anomalies in the last two minutes hits 10 or higher.
+
+![An example anomaly count
+alarm](https://user-images.githubusercontent.com/1153921/104225769-0aa32b00-5404-11eb-95f3-7309f9429fe1.png)
+
+If you see either of these alarms in Netdata Cloud, the local Agent dashboard, or on your preferred notification
+platform, it's a safe bet that the node's current metrics have deviated from normal. That doesn't necessarily mean
+there's a full-blown incident, depending on what application/service you're using anomaly detection on, but it's worth
+further investigation.
+
+As you use the anomalies collector, you may find that the default settings provide too many or too few genuine alarms.
+In this case, [configure the alarm](/docs/monitor/configure-alarms.md) with `sudo ./edit-config
+health.d/anomalies.conf`. Take a look at the `lookup` line syntax in the [health
+reference](/health/REFERENCE.md#alarm-line-lookup) to understand how the anomalies collector automatically creates
+alarms for any dimension on the `anomalies_local.probability` and `anomalies_local.anomaly` charts.
+
+## Visualize anomalies in charts
+
+In either [Netdata Cloud](https://app.netdata.cloud) or the local Agent dashboard at `http://NODE:19999`, click on the
+**Anomalies** [section](/web/gui/README.md#sections) to see the pair of anomaly detection charts, which are
+preconfigured to visualize per-second anomaly metrics based on your [configuration in
+`anomalies.conf`](/docs/guides/monitor/anomaly-detection.md#configure-the-anomalies-collector).
+
+These charts have the contexts `anomalies.probability` and `anomalies.anomaly`. Together, these charts
+create meaningful visualizations for immediately recognizing not only that something is going wrong on your node, but
+give context as to where to look next.
+
+The `anomalies_local.probability` chart shows the probability that the latest observed data is anomalous, based on the
+trained model. The `anomalies_local.anomaly` chart visualizes 0&rarr;1 predictions based on whether the latest observed
+data is anomalous based on the trained model. Both charts share the same dimensions, which you configured via
+`charts_regex` and `charts_to_exclude` in [part 1](/docs/guides/monitor/anomaly-detection.md).
+
+In other words, the `probability` chart shows the amplitude of the anomaly, whereas the `anomaly` chart provides quick
+yes/no context.
+
+![Two charts created by the anomalies
+collector](https://user-images.githubusercontent.com/1153921/104226380-ef84eb00-5404-11eb-9faf-9e64c43b95ff.png)
+
+Before `08:32:00`, both charts show little in the way of verified anomalies. Based on the metrics the anomalies
+collector has trained on, a certain percentage of anomaly probability score is normal, as seen in the
+`web_log_nginx_requests_prob` dimension and a few others. What you're looking for is large deviations from the "noise"
+in the `anomalies.probability` chart, or any increments to the `anomalies.anomaly` chart.
+
+Unsurprisingly, the stress test that began at `08:32:00` caused significant changes to these charts. The three
+dimensions that immediately shot to 100% anomaly probability, and remained there during the test, were
+`web_log_nginx.requests_prob`, `nginx_local.connections_accepted_handled_prob`, and `system.cpu_pressure_prob`.
+
+## Build an anomaly detection dashboard
+
+[Netdata Cloud](https://app.netdata.cloud) features a drag-and-drop [dashboard
+editor](/docs/visualize/create-dashboards.md) that helps you create entirely new dashboards with charts targeted for
+your specific applications.
+
+For example, here's a dashboard designed for visualizing anomalies present in an Nginx web server, including
+documentation about why the dashboard exists and where to look next based on what you're seeing:
+
+![An example anomaly detection
+dashboard](https://user-images.githubusercontent.com/1153921/104226915-c6188f00-5405-11eb-9bb4-559a18016fa7.png)
+
+Use the anomaly charts for instant visual identification of potential anomalies, and then Nginx-specific charts, in the
+right column, to validate whether the probability and anomaly counters are showing a valid incident worth further
+investigation using [Metric Correlations](https://learn.netdata.cloud/docs/cloud/insights/metric-correlations) to narrow
+the dashboard into only the charts relevant to what you're seeing from the anomalies collector.
+
+## What's next?
+
+Between this guide and [part 1](/docs/guides/monitor/anomaly-detection.md), which covered setup and configuration, you
+now have a fundamental understanding of how unsupervised anomaly detection in Netdata works, from root cause to alarms
+to preconfigured or custom dashboards.
+
+We'd love to hear your feedback on the anomalies collector. Hop over to the [community
+forum](https://community.netdata.cloud/t/anomalies-collector-feedback-megathread/767), and let us know if you're already getting value from
+unsupervised anomaly detection, or would like to see something added to it. You might even post a custom configuration
+that works well for monitoring some other popular application, like MySQL, PostgreSQL, Redis, or anything else we
+[support through collectors](/collectors/COLLECTORS.md).
+
+In part 3 of this series on unsupervised anomaly detection using Netdata, we'll create a custom model to apply
+unsupervised anomaly detection to an entire mission-critical application. Stay tuned!
+
+### Related reference documentation
+
+- [Netdata Agent 路 Anomalies collector](/collectors/python.d.plugin/anomalies/README.md)
+- [Netdata Cloud 路 Build new dashboards](https://learn.netdata.cloud/docs/cloud/visualize/dashboards)
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fdocs%2Fguides%2Fmonitor%2Fanomaly-detectionl&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)