Diffstat (limited to '')
-rw-r--r-- collectors/apps.plugin/README.md 80
1 files changed, 47 insertions, 33 deletions
diff --git a/collectors/apps.plugin/README.md b/collectors/apps.plugin/README.md
index 1b682bc65..d10af1cdd 100644
--- a/collectors/apps.plugin/README.md
+++ b/collectors/apps.plugin/README.md
@@ -1,3 +1,9 @@
+<!--
+title: "apps.plugin"
+sidebar_label: "Application monitoring (apps.plugin)"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/collectors/apps.plugin/README.md
+-->
+
# apps.plugin
`apps.plugin` breaks down system resource usage to **processes**, **users** and **user groups**.
@@ -7,7 +13,7 @@ for every process found running.
Since Netdata needs to present this information in charts and track them through time,
instead of presenting a `top` like list, `apps.plugin` uses a pre-defined list of **process groups**
-to which it assigns all running processes. This list is [customizable](apps_groups.conf) and Netdata
+to which it assigns all running processes. This list is customizable via `apps_groups.conf`, and Netdata
ships with a good default for most cases (to edit it on your system run `/etc/netdata/edit-config apps_groups.conf`).
So, `apps.plugin` builds a process tree (much like `ps fax` does in Linux), and groups
@@ -15,7 +21,7 @@ processes together (evaluating both child and parent processes) so that the resu
a predefined set of members (of course, only process groups found running are reported).
> If you find that `apps.plugin` categorizes standard applications as `other`, we would be
-> glad to accept pull requests improving the [defaults](apps_groups.conf) shipped with Netdata.
+> glad to accept pull requests improving the defaults shipped with Netdata in `apps_groups.conf`.
Unlike traditional process monitoring tools (like `top`), `apps.plugin` is able to account the resource
utilization of exit processes. Their utilization is accounted at their currently running parents.
@@ -32,35 +38,38 @@ that fork/spawn other short lived processes hundreds of times per second.
Each of these sections provides the same number of charts:
-- CPU Utilization
+- CPU utilization (`apps.cpu`)
- Total CPU usage
- - User / System CPU usage
+ - User/system CPU usage (`apps.cpu_user`/`apps.cpu_system`)
- Disk I/O
- - Physical Reads / Writes
- - Logical Reads / Writes
- - Open Unique Files (if a file is found open multiple times, it is counted just once)
+ - Physical reads/writes (`apps.preads`/`apps.pwrites`)
+ - Logical reads/writes (`apps.lreads`/`apps.lwrites`)
+ - Open unique files (if a file is found open multiple times, it is counted just once, `apps.files`)
- Memory
- - Real Memory Used (non shared)
- - Virtual Memory Allocated
- - Minor Page Faults (i.e. memory activity)
+ - Real Memory Used (non-shared, `apps.mem`)
+ - Virtual Memory Allocated (`apps.vmem`)
+ - Minor page faults (i.e. memory activity, `apps.minor_faults`)
- Processes
- - Threads Running
- - Processes Running
- - Pipes Open
- - Carried Over Uptime (since the Netdata restart)
- - Minimum Uptime
- - Average Uptime
- - Maximum Uptime
-
-- Swap Memory
- - Swap Memory Used
- - Major Page Faults (i.e. swap activity)
+ - Threads running (`apps.threads`)
+ - Processes running (`apps.processes`)
+ - Carried over uptime (since the last Netdata Agent restart, `apps.uptime`)
+ - Minimum uptime (`apps.uptime_min`)
+ - Average uptime (`apps.uptime_average`)
+ - Maximum uptime (`apps.uptime_max`)
+ - Pipes open (`apps.pipes`)
+- Swap memory
+ - Swap memory used (`apps.swap`)
+ - Major page faults (i.e. swap activity, `apps.major_faults`)
- Network
- - Sockets Open
+ - Sockets open (`apps.sockets`)
+
+In addition, if the [eBPF collector](/collectors/ebpf.plugin/README.md) is running, your dashboard will also show an
+additional [list of charts](/collectors/ebpf.plugin/README.md#integration-with-appsplugin) using low-level Linux
+metrics.
The above are reported:
-- For **Applications** per [target configured](apps_groups.conf).
+- For **Applications** per target configured.
- For **Users** per username or UID (when the username is not available).
- For **User Groups** per groupname or GID (when groupname is not available).
@@ -90,8 +99,7 @@ its CPU resources will be cut in half, and data collection will be once every 2
## Configuration
-The configuration file is `/etc/netdata/apps_groups.conf` (the default is [here](apps_groups.conf)).
-To edit it on your system run `/etc/netdata/edit-config apps_groups.conf`.
+The configuration file is `/etc/netdata/apps_groups.conf`. To edit it on your system, run `/etc/netdata/edit-config apps_groups.conf`.
The configuration file works accepts multiple lines, each having this format:
@@ -149,6 +157,15 @@ There are a few command line options you can pass to `apps.plugin`. The list of
command options = without-users without-groups
```
+### Integration with eBPF
+
+If you don't see charts under the **eBPF syscall** or **eBPF net** sections, you should edit your
+[`ebpf.conf`](/collectors/ebpf.plugin/README.md#ebpf-programs) file to ensure the eBPF program is enabled.
+
+Also see our [guide on troubleshooting apps with eBPF
+metrics](/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md) for ideas on how to interpret these charts in a
+few scenarios.
+
## Permissions
`apps.plugin` requires additional privileges to collect all the information it needs.
@@ -217,7 +234,7 @@ Examples below for process group `sql`:
- Open Pipes ![image](https://registry.my-netdata.io/api/v1/badge.svg?chart=apps.pipes&dimensions=sql&value_color=green=0%7Cred)
- Open Sockets ![image](https://registry.my-netdata.io/api/v1/badge.svg?chart=apps.sockets&dimensions=sql&value_color=green%3E=3%7Cred)
-For more information about badges check [Generating Badges](../../web/api/badges)
+For more information about badges check [Generating Badges](/web/api/badges/README.md)
## Comparison with console tools
@@ -351,9 +368,7 @@ So, the `ssh` session is using 95% CPU time.
Why `ssh`?
-`apps.plugin` groups all processes based on its configuration file
-[`/etc/netdata/apps_groups.conf`](apps_groups.conf)
-(to edit it on your system run `/etc/netdata/edit-config apps_groups.conf`).
+`apps.plugin` groups all processes based on its configuration file.
The default configuration has nothing for `bash`, but it has for `sshd`, so Netdata accumulates
all ssh sessions to a dimension on the charts, called `ssh`. This includes all the processes in
the process tree of `sshd`, **including the exited children**.
@@ -368,10 +383,9 @@ the process tree of `sshd`, **including the exited children**.
Netdata reads `/proc/<pid>/stat` for all processes, once per second and extracts `utime` and
`stime` (user and system cpu utilization), much like all the console tools do.
-But it [also extracts `cutime` and `cstime`](https://github.com/netdata/netdata/blob/62596cc6b906b1564657510ca9135c08f6d4cdda/src/apps_plugin.c#L636-L642)
-that account the user and system time of the exit children of each process. By keeping a map in
-memory of the whole process tree, it is capable of assigning the right time to every process,
-taking into account all its exited children.
+But it also extracts `cutime` and `cstime` that account the user and system time of the exit children of each process.
+By keeping a map in memory of the whole process tree, it is capable of assigning the right time to every process, taking
+into account all its exited children.
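
As an illustration only (this is not the `apps.plugin` C code), the sketch below shows one way the four counters could be pulled out of a `/proc/<pid>/stat` line. The names `CpuTicks`, `parse_stat_line` and `read_cpu_ticks` are hypothetical; the field positions (14 to 17) are taken from the Linux `proc(5)` manual page.

```python
from typing import NamedTuple


class CpuTicks(NamedTuple):
    """CPU counters from /proc/<pid>/stat, in clock ticks (hypothetical helper type)."""
    utime: int   # user-mode time of the process itself (field 14)
    stime: int   # kernel-mode time of the process itself (field 15)
    cutime: int  # user-mode time of its waited-for children (field 16)
    cstime: int  # kernel-mode time of its waited-for children (field 17)


def parse_stat_line(line: str) -> CpuTicks:
    """Extract utime, stime, cutime and cstime from one stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces or ')', so cut at the LAST ')' instead of splitting
    # the whole line on whitespace.
    rest = line[line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N is found at rest[N - 3].
    utime, stime, cutime, cstime = (int(rest[n - 3]) for n in (14, 15, 16, 17))
    return CpuTicks(utime, stime, cutime, cstime)


def read_cpu_ticks(pid: int) -> CpuTicks:
    """Read and parse the stat file of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_stat_line(stat_file.read())
```

The values are expressed in clock ticks; dividing by `os.sysconf("SC_CLK_TCK")` converts them to seconds. Sampling once per second and taking the difference between consecutive readings gives the per-interval usage.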
It is tricky, since a process may be running for 1 hour and once it exits, its parent should not
receive the whole 1 hour of cpu time in just 1 second - you have to subtract the cpu time that has