Diffstat (limited to 'docs/guides/monitor/process.md')
-rw-r--r-- docs/guides/monitor/process.md | 270
1 files changed, 0 insertions, 270 deletions
diff --git a/docs/guides/monitor/process.md b/docs/guides/monitor/process.md
deleted file mode 100644
index 9aa6911f..00000000
--- a/docs/guides/monitor/process.md
+++ /dev/null
@@ -1,270 +0,0 @@
-<!--
-title: Monitor any process in real-time with Netdata
-sidebar_label: Monitor any process in real-time with Netdata
-description: "Tap into Netdata's powerful collectors, with per-second utilization metrics for every process, to troubleshoot faster and make data-informed decisions."
-image: /img/seo/guides/monitor/process.png
-custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/monitor/process.md
-learn_status: "Published"
-learn_rel_path: "Operations"
--->
-
-# Monitor any process in real-time with Netdata
-
-Netdata is more than a multitude of generic system-level metrics and visualizations. Instead of providing only a bird's
-eye view of your system, leaving you to wonder exactly _what_ is taking up 99% CPU, Netdata also gives you visibility
-into _every layer_ of your node. These additional layers give you context, and meaningful insights, into the true health
-and performance of your infrastructure.
-
-One of these layers is the _process_. Every time a Linux system runs a program, it creates an independent process that
-executes the program's instructions in parallel with anything else happening on the system. Linux systems track the
-state and resource utilization of processes using the [`/proc` filesystem](https://en.wikipedia.org/wiki/Procfs), and
-Netdata is designed to hook into those metrics to create meaningful visualizations out of the box.
-
-While there are a lot of existing command-line tools for tracking processes on Linux systems, such as `ps` or `top`,
-only Netdata provides dozens of real-time charts, at both per-second and event frequency, without you having to write
-SQL queries or know a bunch of arbitrary command-line flags.
-
-With Netdata's process monitoring, you can:
-
-- Benchmark/optimize performance of standard applications, like web servers or databases
-- Benchmark/optimize performance of custom applications
-- Troubleshoot CPU/memory/disk utilization issues (why is my system's CPU spiking right now?)
-- Perform granular capacity planning based on the specific needs of your infrastructure
-- Search for leaking file descriptors
-- Investigate zombie processes
-
-... and much more. Let's get started.
-
-## Prerequisites
-
-- One or more Linux nodes running [Netdata](https://github.com/netdata/netdata/blob/master/packaging/installer/README.md)
-- A general understanding of how
- to [configure the Netdata Agent](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md)
- using `edit-config`.
-- A Netdata Cloud account. [Sign up](https://app.netdata.cloud) if you don't have one already.
-
-## How does Netdata do process monitoring?
-
-The Netdata Agent already knows to look for hundreds
-of [standard applications that we support via collectors](https://github.com/netdata/netdata/blob/master/collectors/COLLECTORS.md),
-and groups them based on their
-purpose. Let's say you want to monitor a MySQL
-database using its process. The Netdata Agent already knows to look for processes with the string `mysqld` in their
-name, along with a few others, and puts them into the `sql` group. This `sql` group then becomes a dimension in all
-process-specific charts.
-
-The process and groups settings are used by two unique and powerful collectors.
-
-[**`apps.plugin`**](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md) looks at the Linux
-process tree every second, much like `top` or
-`ps fax`, and collects resource utilization information on every running process. It then automatically adds a layer of
-meaningful visualization on top of these metrics, and creates per-process/application charts.
-
-[**`ebpf.plugin`**](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md): Netdata's extended
-Berkeley Packet Filter (eBPF) collector
-monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management, and then hands
-process-specific metrics over to `apps.plugin` for visualization. The eBPF collector also collects and visualizes
-metrics on an _event frequency_, which means it captures every kernel interaction, and not just the volume of
-interaction at every second in time. That's even more precise than Netdata's standard per-second granularity.
-
-### Per-process metrics and charts in Netdata
-
-With these collectors working in parallel, Netdata visualizes the following per-second metrics for _any_ process on your
-Linux systems:
-
-- CPU utilization (`apps.cpu`)
- - Total CPU usage
- - User/system CPU usage (`apps.cpu_user`/`apps.cpu_system`)
-- Disk I/O
- - Physical reads/writes (`apps.preads`/`apps.pwrites`)
- - Logical reads/writes (`apps.lreads`/`apps.lwrites`)
- - Open unique files (if a file is found open multiple times, it is counted just once, `apps.files`)
-- Memory
- - Real Memory Used (non-shared, `apps.mem`)
- - Virtual Memory Allocated (`apps.vmem`)
- - Minor page faults (i.e. memory activity, `apps.minor_faults`)
-- Processes
- - Threads running (`apps.threads`)
- - Processes running (`apps.processes`)
- - Carried over uptime (since the last Netdata Agent restart, `apps.uptime`)
- - Minimum uptime (`apps.uptime_min`)
- - Average uptime (`apps.uptime_average`)
- - Maximum uptime (`apps.uptime_max`)
- - Pipes open (`apps.pipes`)
-- Swap memory
- - Swap memory used (`apps.swap`)
- - Major page faults (i.e. swap activity, `apps.major_faults`)
-- Network
- - Sockets open (`apps.sockets`)
-- eBPF file
- - Number of calls to open files. (`apps.file_open`)
- - Number of files closed. (`apps.file_closed`)
- - Number of calls to open files that returned errors.
- - Number of calls to close files that returned errors.
-- eBPF syscall
- - Number of calls to delete files. (`apps.file_deleted`)
- - Number of calls to `vfs_write`. (`apps.vfs_write_call`)
- - Number of calls to `vfs_read`. (`apps.vfs_read_call`)
- - Number of bytes written with `vfs_write`. (`apps.vfs_write_bytes`)
- - Number of bytes read with `vfs_read`. (`apps.vfs_read_bytes`)
- - Number of calls to write a file that returned errors.
- - Number of calls to read a file that returned errors.
-- eBPF process
- - Number of processes created with `do_fork`. (`apps.process_create`)
- - Number of threads created with `do_fork` or `__x86_64_sys_clone`, depending on your system's kernel
- version. (`apps.thread_create`)
- - Number of times that a process called `do_exit`. (`apps.task_close`)
-- eBPF net
- - Number of bytes sent. (`apps.bandwidth_sent`)
- - Number of bytes received. (`apps.bandwidth_recv`)
-
-As an example, here's the per-process CPU utilization chart, including a `sql` group/dimension.
-
-![A per-process CPU utilization chart in Netdata Cloud](https://user-images.githubusercontent.com/1153921/101217226-3a5d5700-363e-11eb-8610-aa1640aefb5d.png)
-
-## Configure the Netdata Agent to recognize a specific process
-
-To monitor any process, you need to make sure the Netdata Agent is aware of it. As mentioned above, the Agent is already
-aware of hundreds of processes, and collects metrics from them automatically.
-
-But, if you want to change the grouping behavior, add an application that isn't yet supported in the Netdata Agent, or
-monitor a custom application, you need to edit the `apps_groups.conf` configuration file.
-
-Navigate to your [Netdata config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md) and
-use `edit-config` to edit the file.
-
-```bash
-cd /etc/netdata # Replace this with your Netdata config directory if not at /etc/netdata.
-sudo ./edit-config apps_groups.conf
-```
-
-Inside the file are lists of process names, oftentimes using wildcards (`*`), that the Netdata Agent looks for and
-groups together. For example, the Netdata Agent looks for processes starting with `mysqld`, `mariad`, `postgres`, and
-others, and groups them into `sql`. That makes sense, since all these processes are for SQL databases.
-
-```conf
-sql: mysqld* mariad* postgres* postmaster* oracle_* ora_* sqlservr
-```
-
-These groups are then reflected as [dimensions](https://github.com/netdata/netdata/blob/master/web/README.md#dimensions)
-within Netdata's charts.
-
-![An example per-process CPU utilization chart in Netdata
-Cloud](https://user-images.githubusercontent.com/1153921/101369156-352e2100-3865-11eb-9f0d-b8fac162e034.png)
-
-See the following two sections for details based on your needs. If you don't need to configure `apps_groups.conf`, jump
-down to [visualizing process metrics](#visualize-process-metrics).
-
-### Standard applications (web servers, databases, containers, and more)
-
-As explained above, the Netdata Agent is already aware of most standard applications you run on Linux nodes, and you
-shouldn't need to configure it to discover them.
-
-However, if you're using multiple applications that the Netdata Agent groups together, you may want to separate them for
-more precise monitoring. If MySQL is the only SQL database running on that node, you don't need to change the grouping,
-since you know that MySQL is the only process contributing to the `sql` group.
-
-Let's say you're using both MySQL and PostgreSQL databases on a single node, and want to monitor their processes
-independently. Open the `apps_groups.conf` file as explained in
-the [section above](#configure-the-netdata-agent-to-recognize-a-specific-process) and scroll down until you find
-the `database servers` section. Create new groups for MySQL and PostgreSQL, and move their process name patterns into
-the new groups.
-
-```conf
-# -----------------------------------------------------------------------------
-# database servers
-
-mysql: mysqld*
-postgres: postgres*
-sql: mariad* postmaster* oracle_* ora_* sqlservr
-```
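-
-The same approach extends to other databases in the `sql` group. As a sketch only, assuming you also run MariaDB on the
-node and want it tracked separately, the section could look like this. The group name `mariadb` is an illustrative
-choice, not a Netdata default, and `postmaster*` is moved next to `postgres*` because it is PostgreSQL's legacy process
-name.
-
-```conf
-# Hypothetical layout: one group per database engine, with the remaining
-# patterns left in the generic sql group.
-mysql: mysqld*
-mariadb: mariad*
-postgres: postgres* postmaster*
-sql: oracle_* ora_* sqlservr
-```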
-
-Restart Netdata with `sudo systemctl restart netdata`, or
-the [appropriate method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics
-from your application. Time to [visualize your process metrics](#visualize-process-metrics).
-
-### Custom applications
-
-Let's assume you have an application that runs as the process `custom-app`. To monitor eBPF metrics for that application
-separately from any others, you need to create a new group in `apps_groups.conf` and associate that process name with it.
-
-Open the `apps_groups.conf` file as explained in
-the [section above](#configure-the-netdata-agent-to-recognize-a-specific-process). Scroll down
-to `# NETDATA processes accounting`.
-Above that, paste in the following text, which creates a new `custom-app` group with the `custom-app` process. Replace
-`custom-app` with the name of your application's Linux process. `apps_groups.conf` should now look like this:
-
-```conf
-...
-# -----------------------------------------------------------------------------
-# Custom applications to monitor with apps.plugin and ebpf.plugin
-
-custom-app: custom-app
-
-# -----------------------------------------------------------------------------
-# NETDATA processes accounting
-...
-```
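-
-If your application runs as several differently named processes, a single group can list more than one pattern, just
-as the `sql` group does. The following is a sketch under that assumption: `custom-worker*` and `custom-scheduler` are
-hypothetical process names used only for illustration.
-
-```conf
-# Hypothetical example: group the main process, any process whose name starts
-# with "custom-worker", and a scheduler process under one dimension.
-custom-app: custom-app custom-worker* custom-scheduler
-```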
-
-Restart Netdata with `sudo systemctl restart netdata`, or
-the [appropriate method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your system, to start collecting utilization metrics
-from your application.
-
-## Visualize process metrics
-
-Now that you're collecting metrics for your process, you'll want to visualize them using Netdata's real-time,
-interactive charts. Find these visualizations in the same section regardless of whether you
-use [Netdata Cloud](https://app.netdata.cloud) for infrastructure monitoring, or single-node monitoring with the local
-Agent's dashboard at `http://localhost:19999`.
-
-If you need a refresher on all the available per-process charts, see
-the [above list](#per-process-metrics-and-charts-in-netdata).
-
-### Using Netdata's application collector (`apps.plugin`)
-
-`apps.plugin` puts all of its charts under the **Applications** section of any Netdata dashboard.
-
-![Screenshot of the Applications section on a Netdata dashboard](https://user-images.githubusercontent.com/1153921/101401172-2ceadb80-388f-11eb-9e9a-88443894c272.png)
-
-Let's continue with the MySQL example. We can create a [test
-database](https://www.digitalocean.com/community/tutorials/how-to-measure-mysql-query-performance-with-mysqlslap) in
-MySQL to generate load on the `mysql` process.
-
-`apps.plugin` immediately collects and visualizes this activity in the `apps.cpu` chart, which shows an increase in CPU
-utilization from the `sql` group. There is a parallel increase in `apps.pwrites`, which visualizes writes to disk.
-
-![Per-application CPU utilization metrics](https://user-images.githubusercontent.com/1153921/101409725-8527da80-389b-11eb-96e9-9f401535aafc.png)
-
-![Per-application disk writing metrics](https://user-images.githubusercontent.com/1153921/101409728-85c07100-389b-11eb-83fd-d79dd1545b5a.png)
-
-Next, the `mysqlslap` utility queries the database to put some benchmarking load on the MySQL database. It won't
-look exactly like a production database executing lots of user queries, but it gives you an idea of what these
-visualizations can show.
-
-```bash
-sudo mysqlslap --user=sysadmin --password --host=localhost --concurrency=50 --iterations=10 --create-schema=employees --query="SELECT * FROM dept_emp;" --verbose
-```
-
-The following per-process disk utilization charts show spikes under the `sql` group at the same time `mysqlslap` was run
-numerous times, with slightly different concurrency and query options.
-
-![Per-application disk metrics](https://user-images.githubusercontent.com/1153921/101411810-d08fb800-389e-11eb-85b3-f3fa41f1f887.png)
-
-> 💡 Click on any dimension below a chart in Netdata Cloud (or to the right of a chart on a local Agent dashboard), to
-> visualize only that dimension. This can be particularly useful in process monitoring to separate one process's
-> utilization from the rest of the system.
-
-### Using Netdata's eBPF collector (`ebpf.plugin`)
-
-Netdata's eBPF collector puts its charts in two places. Of most importance to process monitoring are the **ebpf file**,
-**ebpf syscall**, **ebpf process**, and **ebpf net** sub-sections under **Applications**, shown in the above screenshot.
-
-For example, running the above workload shows the entire "story" of how MySQL interacts with the Linux kernel to open
-processes/threads to handle a large number of SQL queries, then subsequently close the tasks as each query returns the
-relevant data.
-
-![Per-process eBPF charts](https://user-images.githubusercontent.com/1153921/101412395-c8844800-389f-11eb-86d2-20c8a0f7b3c0.png)
-
-`ebpf.plugin` visualizes additional eBPF metrics, which are system-wide and not per-process, under the **eBPF** section.
-
-