author    Daniel Baumann <daniel.baumann@progress-linux.org>  2024-05-05 12:08:03 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org>  2024-05-05 12:08:18 +0000
commit    5da14042f70711ea5cf66e034699730335462f66 (patch)
tree      0f6354ccac934ed87a2d555f45be4c831cf92f4a /collectors/cgroups.plugin/README.md
parent    Releasing debian version 1.44.3-2. (diff)
Merging upstream version 1.45.3+dfsg.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'collectors/cgroups.plugin/README.md')
-rw-r--r--  collectors/cgroups.plugin/README.md  |  302
1 file changed, 0 insertions, 302 deletions
diff --git a/collectors/cgroups.plugin/README.md b/collectors/cgroups.plugin/README.md
deleted file mode 100644
index ba6a20e5e..000000000
--- a/collectors/cgroups.plugin/README.md
+++ /dev/null
@@ -1,302 +0,0 @@
-<!--
-title: "Monitor Cgroups (cgroups.plugin)"
-custom_edit_url: "https://github.com/netdata/netdata/edit/master/collectors/cgroups.plugin/README.md"
-sidebar_label: "Monitor Cgroups"
-learn_status: "Published"
-learn_topic_type: "References"
-learn_rel_path: "Integrations/Monitor/Virtualized environments/Containers"
--->
-
-# Monitor Cgroups (cgroups.plugin)
-
-You can monitor containers and virtual machines using **cgroups**.
-
-cgroups (or control groups) are a Linux kernel feature that provides accounting and resource usage limiting for
-processes. When cgroups are bundled with namespaces (i.e. isolation), they form what we usually call **containers**.
-
-cgroups are hierarchical, meaning that cgroups can contain child cgroups, which can contain more cgroups, and so on.
-All accounting is reported (and resource usage limits are applied) hierarchically as well.
-
-To visualize cgroup metrics, Netdata provides configuration for cherry-picking the cgroups of interest. By default
-(without any configuration) Netdata should pick **systemd services**, all kinds of **containers** (lxc, docker, etc.)
-and **virtual machines** spawned by managers that register them with cgroups (qemu, libvirt, etc.).
-
-## Configuring Netdata for cgroups
-
-In general, no additional settings are required. Netdata discovers all available cgroups on the host system and
-collects their metrics.
-
-### How Netdata finds the available cgroups
-
-Linux exposes resource usage reporting and provides dynamic configuration for cgroups, using virtual files (usually)
-under `/sys/fs/cgroup`. Netdata reads `/proc/self/mountinfo` to detect the exact mount point of cgroups. Netdata also
-allows manual configuration of this mount point, using these settings:
-
-```text
-[plugin:cgroups]
- check for new cgroups every = 10
- path to /sys/fs/cgroup/cpuacct = /sys/fs/cgroup/cpuacct
- path to /sys/fs/cgroup/blkio = /sys/fs/cgroup/blkio
- path to /sys/fs/cgroup/memory = /sys/fs/cgroup/memory
- path to /sys/fs/cgroup/devices = /sys/fs/cgroup/devices
-```
-
-Netdata rescans these directories for added or removed cgroups every `check for new cgroups every` seconds.
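-
-For reference, you can inspect the same information Netdata parses to locate these mount points (a quick manual check,
-not something Netdata requires you to run):
-
-```sh
-# list the cgroup filesystems currently mounted on the host
-grep cgroup /proc/self/mountinfo
-```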
-
-### Hierarchical search for cgroups
-
-Since cgroups are hierarchical, for each of the directories shown above, Netdata walks through the subdirectories
-recursively searching for cgroups (each subdirectory is another cgroup).
-
-To provide a sane default for this setting, Netdata uses the following pattern list (patterns starting with `!` give a
-negative match, and their order is important: the first pattern that matches a path is the one used):
-
-```text
-[plugin:cgroups]
- search for cgroups in subpaths matching = !*/init.scope !*-qemu !/init.scope !/system !/systemd !/user !/user.slice *
-```
-
-So, we disable checking for **child cgroups** in systemd internal
-cgroups ([systemd services are monitored by Netdata](#monitoring-systemd-services)), user cgroups (normally used for
-desktop and remote user sessions), qemu virtual machines (child cgroups of virtual machines) and `init.scope`. All
-others are enabled.
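-
-Since each subdirectory is itself a cgroup, you can get a feel for what this search walks over by listing one of the
-hierarchies yourself (an illustrative v1 example; the `memory` controller path is an assumption about your layout):
-
-```sh
-# list the first few child cgroups of the memory controller
-find /sys/fs/cgroup/memory -maxdepth 2 -type d | head
-```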
-
-### Unified cgroups (cgroups v2) support
-
-Netdata automatically detects the cgroups version. If detection fails, Netdata assumes v1.
-To switch to v2 manually, add:
-
-```text
-[plugin:cgroups]
- use unified cgroups = yes
- path to unified cgroups = /sys/fs/cgroup
-```
-
-Unified cgroups use the same name pattern matching as v1 cgroups. `cgroup_enable_systemd_services_detailed_memory` is
-currently unsupported when using unified cgroups.
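-
-A quick way to check which cgroups version your host uses (a generic check, independent of Netdata):
-
-```sh
-# prints "cgroup2fs" on a unified (v2) host, "tmpfs" when the v1 hierarchies are mounted
-stat -fc %T /sys/fs/cgroup
-```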
-
-### Enabled cgroups
-
-To provide a sane default, Netdata uses the
-following [pattern list](https://github.com/netdata/netdata/blob/master/libnetdata/simple_pattern/README.md):
-
-- Checks the pattern against the path of the cgroup
-
- ```text
- [plugin:cgroups]
- enable by default cgroups matching = !*/init.scope *.scope !*/vcpu* !*/emulator !*.mount !*.partition !*.service !*.slice !*.swap !*.user !/ !/docker !/libvirt !/lxc !/lxc/*/ns !/lxc/*/ns/* !/machine !/qemu !/system !/systemd !/user *
- ```
-
-- Checks the pattern against the name of the cgroup (as you see it on the dashboard)
-
- ```text
- [plugin:cgroups]
- enable by default cgroups names matching = *
- ```
-
-Renaming is configured with the following options:
-
-```text
-[plugin:cgroups]
- run script to rename cgroups matching = *.scope *docker* *lxc* *qemu* !/ !*.mount !*.partition !*.service !*.slice !*.swap !*.user *
- script to get cgroup names = /usr/libexec/netdata/plugins.d/cgroup-name.sh
-```
-
-The whole point of the additional pattern list is to limit the number of times the script is called. Without this
-pattern list, the script might be called thousands of times, depending on the number of cgroups available on the system.
-
-The above pattern list is matched against the path of the cgroup. For matched cgroups, Netdata calls the
-script [cgroup-name.sh](https://raw.githubusercontent.com/netdata/netdata/master/collectors/cgroups.plugin/cgroup-name.sh)
-to get its name. This script queries `docker`, `kubectl`, `podman`, or applies heuristics to find a name for the
-cgroup.
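-
-As an illustration only, the kind of lookup the script performs for a Docker cgroup resembles the following
-(`<container-id>` is a placeholder for the id embedded in the cgroup path):
-
-```sh
-# resolve a container id to its name (Docker returns it with a leading slash)
-docker inspect --format '{{ .Name }}' <container-id>
-```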
-
-#### Note on Podman container names
-
-Podman's security model is a lot more restrictive than Docker's, so Netdata will not be able to detect container names
-out of the box unless they were started by the same user as Netdata itself.
-
-If Podman is used in "rootful" mode, it's also possible to use `podman system service` to grant Netdata access to
-container names. To do this, ensure `podman system service` is running and Netdata has access
-to `/run/podman/podman.sock` (the default permissions as specified by upstream are `0600`, with owner `root`, so you
-will have to adjust the configuration).
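-
-A minimal sketch of such an adjustment, assuming Netdata runs as the `netdata` user (the ACL approach is just one
-option, and it has to be reapplied if the socket is recreated):
-
-```sh
-# start the Podman API service through its systemd socket unit (rootful mode)
-sudo systemctl enable --now podman.socket
-
-# grant the netdata user read/write access to the socket
-sudo setfacl -m u:netdata:rw /run/podman/podman.sock
-```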
-
-[Docker Socket Proxy (HAProxy)](https://github.com/Tecnativa/docker-socket-proxy) or [CetusGuard](https://github.com/hectorm/cetusguard)
-can also be used to give Netdata restricted access to the socket. Note that `PODMAN_HOST` in Netdata's environment should
-be set to the proxy's URL in this case.
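-
-One possible way to pass `PODMAN_HOST` to the Netdata service on a systemd host (the URL below is a placeholder for
-your proxy's address):
-
-```sh
-# create a systemd drop-in for the netdata service and add:
-#   [Service]
-#   Environment="PODMAN_HOST=tcp://127.0.0.1:2375"
-sudo systemctl edit netdata
-```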
-
-### Charts with zero metrics
-
-By default, Netdata enables monitoring metrics only when they are not zero; metrics that are constantly zero are
-ignored. Metrics that start having values after Netdata is started are detected, and their charts are automatically
-added to the dashboard (a refresh of the dashboard is needed for them to appear, though). To enable a chart
-permanently, set it to `yes` instead of `auto`. For example:
-
-```text
-[plugin:cgroups]
- enable memory (used mem including cache) = yes
-```
-
-You can also set the `enable zero metrics` option to `yes` in the `[global]` section, which enables charts with zero
-metrics for all internal Netdata plugins.
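-
-For example, in `netdata.conf` (a sketch of the option mentioned above):
-
-```text
-[global]
- enable zero metrics = yes
-```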
-
-### Alerts
-
-CPU and memory limits are watched and used to raise alerts. Memory usage for every cgroup is checked against its `ram`
-and `ram+swap` limits. CPU usage for every cgroup is checked against the `cpuset.cpus` and `cpu.cfs_period_us` + `cpu.cfs_quota_us` pair assigned to the cgroup. Configuration for the alerts is available in the `health.d/cgroups.conf`
-file.
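-
-To inspect or tune these alerts, you can open that file with Netdata's `edit-config` helper (assuming the default
-`/etc/netdata` configuration directory):
-
-```sh
-cd /etc/netdata && sudo ./edit-config health.d/cgroups.conf
-```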
-
-## Monitoring systemd services
-
-Netdata monitors **systemd services**. Example:
-
-![image](https://cloud.githubusercontent.com/assets/2662304/21964372/20cd7b84-db53-11e6-98a2-b9c986b082c0.png)
-
-Support per distribution:
-
-| system | charts shown | `/sys/fs/cgroup` tree | comments |
-|:----------------:|:------------:|:------------------------------------:|:--------------------------|
-| Arch Linux | YES | | |
-| Gentoo | NO | | can be enabled, see below |
-| Ubuntu 16.04 LTS | YES | | |
-| Ubuntu 16.10 | YES | [here](http://pastebin.com/PiWbQEXy) | |
-| Fedora 25 | YES | [here](http://pastebin.com/ax0373wF) | |
-| Debian 8 | NO | | can be enabled, see below |
-| AMI | NO | [here](http://pastebin.com/FrxmptjL) | not a systemd system |
-| CentOS 7.3.1611 | NO | [here](http://pastebin.com/SpzgezAg) | can be enabled, see below |
-
-### Monitored systemd service metrics
-
-- CPU utilization
-- Used memory
-- RSS memory
-- Mapped memory
-- Cache memory
-- Writeback memory
-- Memory minor page faults
-- Memory major page faults
-- Memory charging activity
-- Memory uncharging activity
-- Memory limit failures
-- Swap memory used
-- Disk read bandwidth
-- Disk write bandwidth
-- Disk read operations
-- Disk write operations
-- Throttle disk read bandwidth
-- Throttle disk write bandwidth
-- Throttle disk read operations
-- Throttle disk write operations
-- Queued disk read operations
-- Queued disk write operations
-- Merged disk read operations
-- Merged disk write operations
-
-### How to enable cgroup accounting on systemd systems where it is disabled by default
-
-You can verify that accounting is not enabled by running `systemd-cgtop`: the program will show resources only for the
-root cgroup `/`, while all services will show nothing.
-
-To enable cgroup accounting, execute this:
-
-```sh
-sed -e 's|^#Default\(.*\)Accounting=.*$|Default\1Accounting=yes|g' /etc/systemd/system.conf >/tmp/system.conf
-```
-
-To see the changes it made, run this:
-
-```sh
-# diff /etc/systemd/system.conf /tmp/system.conf
-40,44c40,44
-< #DefaultCPUAccounting=no
-< #DefaultIOAccounting=no
-< #DefaultBlockIOAccounting=no
-< #DefaultMemoryAccounting=no
-< #DefaultTasksAccounting=yes
----
-> DefaultCPUAccounting=yes
-> DefaultIOAccounting=yes
-> DefaultBlockIOAccounting=yes
-> DefaultMemoryAccounting=yes
-> DefaultTasksAccounting=yes
-```
-
-If you are happy with the changes, run:
-
-```sh
-# copy the file to the right location
-sudo cp /tmp/system.conf /etc/systemd/system.conf
-
-# restart systemd to take it into account
-sudo systemctl daemon-reexec
-```
-
-(`systemctl daemon-reload` does not reload this configuration, so you have to execute `systemctl daemon-reexec`).
-
-Now, when you run `systemd-cgtop`, services will start reporting usage (if it does not, restart any service to wake it up). Refresh your Netdata dashboard, and you will have the charts too.
-
-If memory accounting is missing, you will need to enable it in your kernel by appending the following kernel boot
-options and rebooting:
-
-```sh
-cgroup_enable=memory swapaccount=1
-```
-
-You can add the above directly to the `linux` line in your `/boot/grub/grub.cfg`, or append them to
-`GRUB_CMDLINE_LINUX` in `/etc/default/grub` (in which case you will have to run `update-grub` before rebooting). On
-DigitalOcean Debian images you may have to set it in `/etc/default/grub.d/50-cloudimg-settings.cfg`.
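-
-A sketch of the `/etc/default/grub` route (merge the options with any value already present there):
-
-```sh
-# in /etc/default/grub
-GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"
-
-# then regenerate the GRUB configuration and reboot
-sudo update-grub
-```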
-
-Which systemd services are monitored by Netdata is determined by the following pattern list:
-
-```text
-[plugin:cgroups]
- cgroups to match as systemd services = !/system.slice/*/*.service /system.slice/*.service
-```
-
-- - -
-
-## Monitoring ephemeral containers
-
-Netdata monitors containers automatically when it is installed on the host, or when it is installed in a container that
-has access to the `/proc` and `/sys` filesystems of the host.
-
-Network interfaces and cgroups (containers) are self-cleaned. When a network interface or container stops, Netdata might log
-a few errors in error.log complaining about files it cannot find, but immediately:
-
-1. It will detect this is a removed container or network interface
-2. It will freeze/pause all alerts for them
-3. It will mark their charts as obsolete
-4. Obsolete charts are not offered on new dashboard sessions (so hit F5 and the charts are gone)
-5. Existing dashboard sessions will continue to see them, but of course they will not refresh
-6. Obsolete charts will be removed from memory, 1 hour after the last user viewed them (configurable
-   with `[global].cleanup obsolete charts after seconds = 3600` in `netdata.conf`)
-7. When obsolete charts are removed from memory they are also deleted from disk (configurable
-   with `[global].delete obsolete charts files = yes`, as shown in the snippet after this list)
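-
-The corresponding `netdata.conf` settings look like this:
-
-```text
-[global]
- cleanup obsolete charts after seconds = 3600
- delete obsolete charts files = yes
-```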
-
-### Monitored container metrics
-
-- CPU usage
-- CPU usage within the limits
-- CPU usage per core
-- Memory usage
-- Writeback memory
-- Memory activity
-- Memory page faults
-- Used memory
-- Used RAM within the limits
-- Memory utilization
-- Memory limit failures
-- I/O bandwidth (all disks)
-- Serviced I/O operations (all disks)
-- Throttle I/O bandwidth (all disks)
-- Throttle serviced I/O operations (all disks)
-- Queued I/O operations (all disks)
-- Merged I/O operations (all disks)
-- CPU pressure
-- Memory pressure
-- Memory full pressure
-- I/O pressure
-- I/O full pressure
-
-Network interfaces are monitored by means of
-the [proc plugin](https://github.com/netdata/netdata/blob/master/collectors/proc.plugin/README.md#monitored-network-interface-metrics).