diff options
Diffstat (limited to 'collectors/cgroups.plugin')
-rw-r--r-- | collectors/cgroups.plugin/Makefile.am | 21 | ||||
-rw-r--r-- | collectors/cgroups.plugin/README.md | 233 | ||||
-rwxr-xr-x | collectors/cgroups.plugin/cgroup-name.sh.in | 470 | ||||
-rwxr-xr-x | collectors/cgroups.plugin/cgroup-network-helper.sh | 297 | ||||
-rw-r--r-- | collectors/cgroups.plugin/cgroup-network.c | 714 | ||||
-rw-r--r-- | collectors/cgroups.plugin/sys_fs_cgroup.c | 4051 | ||||
-rw-r--r-- | collectors/cgroups.plugin/sys_fs_cgroup.h | 33 | ||||
-rw-r--r-- | collectors/cgroups.plugin/tests/test_cgroups_plugin.c | 110 | ||||
-rw-r--r-- | collectors/cgroups.plugin/tests/test_cgroups_plugin.h | 16 | ||||
-rw-r--r-- | collectors/cgroups.plugin/tests/test_doubles.c | 162 |
10 files changed, 6107 insertions, 0 deletions
diff --git a/collectors/cgroups.plugin/Makefile.am b/collectors/cgroups.plugin/Makefile.am new file mode 100644 index 0000000..9e924ab --- /dev/null +++ b/collectors/cgroups.plugin/Makefile.am @@ -0,0 +1,21 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +CLEANFILES = \ + cgroup-name.sh \ + $(NULL) + +include $(top_srcdir)/build/subst.inc +SUFFIXES = .in + +dist_plugins_SCRIPTS = \ + cgroup-name.sh \ + cgroup-network-helper.sh \ + $(NULL) + +dist_noinst_DATA = \ + cgroup-name.sh.in \ + README.md \ + $(NULL) diff --git a/collectors/cgroups.plugin/README.md b/collectors/cgroups.plugin/README.md new file mode 100644 index 0000000..21dbcae --- /dev/null +++ b/collectors/cgroups.plugin/README.md @@ -0,0 +1,233 @@ +<!-- +title: "cgroups.plugin" +custom_edit_url: https://github.com/netdata/netdata/edit/master/collectors/cgroups.plugin/README.md +--> + +# cgroups.plugin + +You can monitor containers and virtual machines using **cgroups**. + +cgroups (or control groups), are a Linux kernel feature that provides accounting and resource usage limiting for processes. When cgroups are bundled with namespaces (i.e. isolation), they form what we usually call **containers**. + +cgroups are hierarchical, meaning that cgroups can contain child cgroups, which can contain more cgroups, etc. All accounting is reported (and resource usage limits are applied) also in a hierarchical way. + +To visualize cgroup metrics Netdata provides configuration for cherry picking the cgroups of interest. By default (without any configuration) Netdata should pick **systemd services**, all kinds of **containers** (lxc, docker, etc) and **virtual machines** spawn by managers that register them with cgroups (qemu, libvirt, etc). + +## configuring Netdata for cgroups + +For each cgroup available in the system, Netdata provides this configuration: + +``` +[plugin:cgroups] + enable cgroup XXX = yes | no +``` + +But it also provides a few patterns to provide a sane default (`yes` or `no`). + +Below we see, how this works. + +### how Netdata finds the available cgroups + +Linux exposes resource usage reporting and provides dynamic configuration for cgroups, using virtual files (usually) under `/sys/fs/cgroup`. Netdata reads `/proc/self/mountinfo` to detect the exact mount point of cgroups. Netdata also allows manual configuration of this mount point, using these settings: + +``` +[plugin:cgroups] + check for new cgroups every = 10 + path to /sys/fs/cgroup/cpuacct = /sys/fs/cgroup/cpuacct + path to /sys/fs/cgroup/blkio = /sys/fs/cgroup/blkio + path to /sys/fs/cgroup/memory = /sys/fs/cgroup/memory + path to /sys/fs/cgroup/devices = /sys/fs/cgroup/devices +``` + +Netdata rescans these directories for added or removed cgroups every `check for new cgroups every` seconds. + +### hierarchical search for cgroups + +Since cgroups are hierarchical, for each of the directories shown above, Netdata walks through the subdirectories recursively searching for cgroups (each subdirectory is another cgroup). + +For each of the directories found, Netdata provides a configuration variable: + +``` +[plugin:cgroups] + search for cgroups under PATH = yes | no +``` + +To provide a sane default for this setting, Netdata uses the following pattern list (patterns starting with `!` give a negative match and their order is important: the first matching a path will be used): + +``` +[plugin:cgroups] + search for cgroups in subpaths matching = !*/init.scope !*-qemu !/init.scope !/system !/systemd !/user !/user.slice * +``` + +So, we disable checking for **child cgroups** in systemd internal cgroups ([systemd services are monitored by Netdata](#monitoring-systemd-services)), user cgroups (normally used for desktop and remote user sessions), qemu virtual machines (child cgroups of virtual machines) and `init.scope`. All others are enabled. + +### unified cgroups (cgroups v2) support + +Basic unified cgroups metrics are supported. To use them instead of v1 cgroups add: + +``` +[plugin:cgroups] + use unified cgroups = yes + path to unified cgroups = /sys/fs/cgroup +``` + +Unified cgroups use same name pattern matching as v1 cgroups. `cgroup_enable_systemd_services_detailed_memory` is currently unsupported when using unified cgroups. + +### enabled cgroups + +To check if the cgroup is enabled, Netdata uses this setting: + +``` +[plugin:cgroups] + enable cgroup NAME = yes | no +``` + +To provide a sane default, Netdata uses the following pattern list (it checks the pattern against the path of the cgroup): + +``` +[plugin:cgroups] + enable by default cgroups matching = !*/init.scope *.scope !*/vcpu* !*/emulator !*.mount !*.partition !*.service !*.slice !*.swap !*.user !/ !/docker !/libvirt !/lxc !/lxc/*/ns !/lxc/*/ns/* !/machine !/qemu !/system !/systemd !/user * +``` + +The above provides the default `yes` or `no` setting for the cgroup. However, there is an additional step. In many cases the cgroups found in the `/sys/fs/cgroup` hierarchy are just random numbers and in many cases these numbers are ephemeral: they change across reboots or sessions. + +So, we need to somehow map the paths of the cgroups to names, to provide consistent Netdata configuration (i.e. there is no point to say `enable cgroup 1234 = yes | no`, if `1234` is a random number that changes over time - we need a name for the cgroup first, so that `enable cgroup NAME = yes | no` will be consistent). + +For this mapping Netdata provides 2 configuration options: + +``` +[plugin:cgroups] + run script to rename cgroups matching = *.scope *docker* *lxc* *qemu* !/ !*.mount !*.partition !*.service !*.slice !*.swap !*.user * + script to get cgroup names = /usr/libexec/netdata/plugins.d/cgroup-name.sh +``` + +The whole point for the additional pattern list, is to limit the number of times the script will be called. Without this pattern list, the script might be called thousands of times, depending on the number of cgroups available in the system. + +The above pattern list is matched against the path of the cgroup. For matched cgroups, Netdata calls the script [cgroup-name.sh](https://raw.githubusercontent.com/netdata/netdata/master/collectors/cgroups.plugin/cgroup-name.sh.in) to get its name. This script queries `docker`, `kubectl`, `podman`, or applies heuristics to find give a name for the cgroup. + +#### Note on Podman container names + +Podman's security model is a lot more restrictive than Docker's, so Netdata will not be able to detect container names out of the box unless they were started by the same user as Netdata itself. + +If Podman is used in "rootful" mode, it's also possible to use `podman system service` to grant Netdata access to container names. To do this, ensure `podman system service` is running and Netdata has access to `/run/podman/podman.sock` (the default permissions as specified by upstream are `0600`, with owner `root`, so you will have to adjust the configuration). + +[docker-socket-proxy](https://github.com/Tecnativa/docker-socket-proxy) can also be used to give Netdata restricted access to the socket. Note that `PODMAN_HOST` in Netdata's environment should be set to the proxy's URL in this case. + +### charts with zero metrics + +By default, Netdata will enable monitoring metrics only when they are not zero. If they are constantly zero they are ignored. Metrics that will start having values, after Netdata is started, will be detected and charts will be automatically added to the dashboard (a refresh of the dashboard is needed for them to appear though). Set `yes` for a chart instead of `auto` to enable it permanently. For example: + +``` +[plugin:cgroups] + enable memory (used mem including cache) = yes +``` + +You can also set the `enable zero metrics` option to `yes` in the `[global]` section which enables charts with zero metrics for all internal Netdata plugins. + +### alarms + +CPU and memory limits are watched and used to rise alarms. Memory usage for every cgroup is checked against `ram` and `ram+swap` limits. CPU usage for every cgroup is checked against `cpuset.cpus` and `cpu.cfs_period_us` + `cpu.cfs_quota_us` pair assigned for the cgroup. Configuration for the alarms is available in `health.d/cgroups.conf` file. + +## Monitoring systemd services + +Netdata monitors **systemd services**. Example: + +![image](https://cloud.githubusercontent.com/assets/2662304/21964372/20cd7b84-db53-11e6-98a2-b9c986b082c0.png) + +Support per distribution: + +|system|systemd services<br/>charts shown|`tree`<br/>`/sys/fs/cgroup`|comments| +|:----:|:-------------------------------:|:-------------------------:|:-------| +|Arch Linux|YES||| +|Gentoo|NO||can be enabled, see below| +|Ubuntu 16.04 LTS|YES||| +|Ubuntu 16.10|YES|[here](http://pastebin.com/PiWbQEXy)|| +|Fedora 25|YES|[here](http://pastebin.com/ax0373wF)|| +|Debian 8|NO||can be enabled, see below| +|AMI|NO|[here](http://pastebin.com/FrxmptjL)|not a systemd system| +|CentOS 7.3.1611|NO|[here](http://pastebin.com/SpzgezAg)|can be enabled, see below| + +### how to enable cgroup accounting on systemd systems that is by default disabled + +You can verify there is no accounting enabled, by running `systemd-cgtop`. The program will show only resources for cgroup `/`, but all services will show nothing. + +To enable cgroup accounting, execute this: + +```sh +sed -e 's|^#Default\(.*\)Accounting=.*$|Default\1Accounting=yes|g' /etc/systemd/system.conf >/tmp/system.conf +``` + +To see the changes it made, run this: + +``` +# diff /etc/systemd/system.conf /tmp/system.conf +40,44c40,44 +< #DefaultCPUAccounting=no +< #DefaultIOAccounting=no +< #DefaultBlockIOAccounting=no +< #DefaultMemoryAccounting=no +< #DefaultTasksAccounting=yes +--- +> DefaultCPUAccounting=yes +> DefaultIOAccounting=yes +> DefaultBlockIOAccounting=yes +> DefaultMemoryAccounting=yes +> DefaultTasksAccounting=yes +``` + +If you are happy with the changes, run: + +```sh +# copy the file to the right location +sudo cp /tmp/system.conf /etc/systemd/system.conf + +# restart systemd to take it into account +sudo systemctl daemon-reexec +``` + +(`systemctl daemon-reload` does not reload the configuration of the server - so you have to execute `systemctl daemon-reexec`). + +Now, when you run `systemd-cgtop`, services will start reporting usage (if it does not, restart a service - any service - to wake it up). Refresh your Netdata dashboard, and you will have the charts too. + +In case memory accounting is missing, you will need to enable it at your kernel, by appending the following kernel boot options and rebooting: + +``` +cgroup_enable=memory swapaccount=1 +``` + +You can add the above, directly at the `linux` line in your `/boot/grub/grub.cfg` or appending them to the `GRUB_CMDLINE_LINUX` in `/etc/default/grub` (in which case you will have to run `update-grub` before rebooting). On DigitalOcean debian images you may have to set it at `/etc/default/grub.d/50-cloudimg-settings.cfg`. + +Which systemd services are monitored by Netdata is determined by the following pattern list: + +``` +[plugin:cgroups] + cgroups to match as systemd services = !/system.slice/*/*.service /system.slice/*.service +``` + +- - - + +## Monitoring ephemeral containers + +Netdata monitors containers automatically when it is installed at the host, or when it is installed in a container that has access to the `/proc` and `/sys` filesystems of the host. + +Netdata prior to v1.6 had 2 issues when such containers were monitored: + +1. network interface alarms where triggering when containers were stopped + +2. charts were never cleaned up, so after some time dozens of containers were showing up on the dashboard, and they were occupying memory. + +### the current Netdata + +network interfaces and cgroups (containers) are now self-cleaned. + +So, when a network interface or container stops, Netdata might log a few errors in error.log complaining about files it cannot find, but immediately: + +1. it will detect this is a removed container or network interface +2. it will freeze/pause all alarms for them +3. it will mark their charts as obsolete +4. obsolete charts are not be offered on new dashboard sessions (so hit F5 and the charts are gone) +5. existing dashboard sessions will continue to see them, but of course they will not refresh +6. obsolete charts will be removed from memory, 1 hour after the last user viewed them (configurable with `[global].cleanup obsolete charts after seconds = 3600` (at `netdata.conf`). +7. when obsolete charts are removed from memory they are also deleted from disk (configurable with `[global].delete obsolete charts files = yes`) + +[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fcollectors%2Fcgroups.plugin%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/collectors/cgroups.plugin/cgroup-name.sh.in b/collectors/cgroups.plugin/cgroup-name.sh.in new file mode 100755 index 0000000..19fbf39 --- /dev/null +++ b/collectors/cgroups.plugin/cgroup-name.sh.in @@ -0,0 +1,470 @@ +#!/usr/bin/env bash +#shellcheck disable=SC2001 + +# netdata +# real-time performance and health monitoring, done right! +# (C) 2016 Costa Tsaousis <costa@tsaousis.gr> +# SPDX-License-Identifier: GPL-3.0-or-later +# +# Script to find a better name for cgroups +# + +export PATH="${PATH}:/sbin:/usr/sbin:/usr/local/sbin" +export LC_ALL=C + +# ----------------------------------------------------------------------------- + +PROGRAM_NAME="$(basename "${0}")" + +logdate() { + date "+%Y-%m-%d %H:%M:%S" +} + +log() { + local status="${1}" + shift + + echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${*}" + +} + +warning() { + log WARNING "${@}" +} + +error() { + log ERROR "${@}" +} + +info() { + log INFO "${@}" +} + +fatal() { + log FATAL "${@}" + exit 1 +} + +function docker_like_get_name_command() { + local command="${1}" + local id="${2}" + info "Running command: ${command} ps --filter=id=\"${id}\" --format=\"{{.Names}}\"" + NAME="$(${command} ps --filter=id="${id}" --format="{{.Names}}")" + return 0 +} + +function docker_like_get_name_api() { + local host_var="${1}" + local host="${!host_var}" + local path="/containers/${2}/json" + if [ -z "${host}" ]; then + warning "No ${host_var} is set" + return 1 + fi + if ! command -v jq > /dev/null 2>&1; then + warning "Can't find jq command line tool. jq is required for netdata to retrieve container name using ${host} API, falling back to docker ps" + return 1 + fi + if [ -S "${host}" ]; then + info "Running API command: curl --unix-socket \"${host}\" http://localhost${path}" + JSON=$(curl -sS --unix-socket "${host}" "http://localhost${path}") + else + info "Running API command: curl \"${host}${path}\"" + JSON=$(curl -sS "${host}${path}") + fi + NAME=$(echo "${JSON}" | jq -r .Name,.Config.Hostname | grep -v null | head -n1 | sed 's|^/||') + return 0 +} + +# get_lbl_val returns the value for the label with the given name. +# Returns "null" string if the label doesn't exist. +# Expected labels format: 'name="value",...'. +function get_lbl_val() { + local labels want_name + labels="${1}" + want_name="${2}" + + IFS=, read -ra labels <<< "$labels" + + local lname lval + for l in "${labels[@]}"; do + IFS="=" read -r lname lval <<< "$l" + if [ "$want_name" = "$lname" ] && [ -n "$lval" ]; then + echo "${lval:1:-1}" # trim " + return 0 + fi + done + + echo "null" + return 1 +} + +function add_lbl_prefix() { + local orig_labels prefix + orig_labels="${1}" + prefix="${2}" + + IFS=, read -ra labels <<< "$orig_labels" + + local new_labels + for l in "${labels[@]}"; do + new_labels+="${prefix}${l}," + done + + echo "${new_labels:0:-1}" # trim last ',' +} + +# k8s_get_kubepod_name resolves */kubepods/* cgroup name. +# pod level cgroup name format: 'pod_<namespace>_<pod_name>' +# container level cgroup name format: 'cntr_<namespace>_<pod_name>_<container_name>' +function k8s_get_kubepod_name() { + # GKE /sys/fs/cgroup/*/ tree: + # |-- kubepods + # | |-- burstable + # | | |-- pod98cee708-023b-11eb-933d-42010a800193 + # | | | |-- 922161c98e6ea450bf665226cdc64ca2aa3e889934c2cff0aec4325f8f78ac03 + # | | `-- a5d223eec35e00f5a1c6fa3e3a5faac6148cdc1f03a2e762e873b7efede012d7 + # | `-- pode314bbac-d577-11ea-a171-42010a80013b + # | |-- 7d505356b04507de7b710016d540b2759483ed5f9136bb01a80872b08f771930 + # | `-- 88ab4683b99cfa7cc8c5f503adf7987dd93a3faa7c4ce0d17d419962b3220d50 + # + # Minikube (v1.8.2) /sys/fs/cgroup/*/ tree: + # |-- kubepods.slice + # | |-- kubepods-besteffort.slice + # | | |-- kubepods-besteffort-pod10fb5647_c724_400c_b9cc_0e6eae3110e7.slice + # | | | |-- docker-36e5eb5056dfdf6dbb75c0c44a1ecf23217fe2c50d606209d8130fcbb19fb5a7.scope + # | | | `-- docker-87e18c2323621cf0f635c53c798b926e33e9665c348c60d489eef31ee1bd38d7.scope + # + # NOTE: cgroups plugin uses '_' to join dir names, so it is <parent>_<child>_<child>_... + + local fn="${FUNCNAME[0]}" + local id="${1}" + + if [[ ! $id =~ ^kubepods ]]; then + warning "${fn}: '${id}' is not kubepod cgroup." + return 1 + fi + + local clean_id="$id" + clean_id=${clean_id//.slice/} + clean_id=${clean_id//.scope/} + + local name pod_uid cntr_id + if [[ $clean_id == "kubepods" ]]; then + name="$clean_id" + elif [[ $clean_id =~ .+(besteffort|burstable|guaranteed)$ ]]; then + # kubepods_<QOS_CLASS> + # kubepods_kubepods-<QOS_CLASS> + name=${clean_id//-/_} + name=${name/#kubepods_kubepods/kubepods} + elif [[ $clean_id =~ .+pod[a-f0-9_-]+_docker-([a-f0-9]+)$ ]]; then + # ...pod<POD_UID>_docker-<CONTAINER_ID> (POD_UID w/ "_") + cntr_id=${BASH_REMATCH[1]} + elif [[ $clean_id =~ .+pod[a-f0-9-]+_([a-f0-9]+)$ ]]; then + # ...pod<POD_UID>_<CONTAINER_ID> + cntr_id=${BASH_REMATCH[1]} + elif [[ $clean_id =~ .+pod([a-f0-9_-]+)$ ]]; then + # ...pod<POD_UID> (POD_UID w/ and w/o "_") + pod_uid=${BASH_REMATCH[1]} + pod_uid=${pod_uid//_/-} + fi + + if [ -n "$name" ]; then + echo "$name" + return 0 + fi + + if [ -z "$pod_uid" ] && [ -z "$cntr_id" ]; then + warning "${fn}: can't extract pod_uid or container_id from the cgroup '$id'." + return 1 + fi + + [ -n "$pod_uid" ] && info "${fn}: cgroup '$id' is a pod(uid:$pod_uid)" + [ -n "$cntr_id" ] && info "${fn}: cgroup '$id' is a container(id:$cntr_id)" + + if ! command -v jq > /dev/null 2>&1; then + warning "${fn}: 'jq' command not available." + return 1 + fi + + local kube_system_ns + local tmp_kube_system_ns_file="${TMPDIR:-"/tmp/"}netdata-cgroups-kube-system-ns" + [ -f "$tmp_kube_system_ns_file" ] && kube_system_ns=$(cat "$tmp_kube_system_ns_file" 2> /dev/null) + + local pods + if [ -n "${KUBERNETES_SERVICE_HOST}" ] && [ -n "${KUBERNETES_PORT_443_TCP_PORT}" ]; then + local token header host url + token="$(< /var/run/secrets/kubernetes.io/serviceaccount/token)" + header="Authorization: Bearer $token" + host="$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT" + + if [ -z "$kube_system_ns" ]; then + url="https://$host/api/v1/namespaces/kube-system" + # FIX: check HTTP response code + if ! kube_system_ns=$(curl -sSk -H "$header" "$url" 2>&1); then + warning "${fn}: error on curl '${url}': ${kube_system_ns}." + else + echo "$kube_system_ns" > "$tmp_kube_system_ns_file" 2> /dev/null + fi + fi + + url="https://$host/api/v1/pods" + [ -n "$MY_NODE_NAME" ] && url+="?fieldSelector=spec.nodeName==$MY_NODE_NAME" + # FIX: check HTTP response code + if ! pods=$(curl -sSk -H "$header" "$url" 2>&1); then + warning "${fn}: error on curl '${url}': ${pods}." + return 1 + fi + elif ps -C kubelet > /dev/null 2>&1 && command -v kubectl > /dev/null 2>&1; then + if [ -z "$kube_system_ns" ]; then + if ! kube_system_ns=$(kubectl get namespaces kube-system -o json 2>&1); then + warning "${fn}: error on 'kubectl': ${kube_system_ns}." + else + echo "$kube_system_ns" > "$tmp_kube_system_ns_file" 2> /dev/null + fi + fi + + [[ -z ${KUBE_CONFIG+x} ]] && KUBE_CONFIG="/etc/kubernetes/admin.conf" + if ! pods=$(kubectl --kubeconfig="$KUBE_CONFIG" get pods --all-namespaces -o json 2>&1); then + warning "${fn}: error on 'kubectl': ${pods}." + return 1 + fi + else + warning "${fn}: not inside the k8s cluster and 'kubectl' command not available." + return 1 + fi + + local kube_system_uid + if [ -n "$kube_system_ns" ] && ! kube_system_uid=$(jq -r '.metadata.uid' <<< "$kube_system_ns" 2>&1); then + warning "${fn}: error on 'jq' parse kube_system_ns: ${kube_system_uid}." + fi + + local jq_filter + jq_filter+='.items[] | "' + jq_filter+='namespace=\"\(.metadata.namespace)\",' + jq_filter+='pod_name=\"\(.metadata.name)\",' + jq_filter+='pod_uid=\"\(.metadata.uid)\",' + #jq_filter+='\(.metadata.labels | to_entries | map("pod_label_"+.key+"=\""+.value+"\"") | join(",") | if length > 0 then .+"," else . end)' + jq_filter+='\((.metadata.ownerReferences[]? | select(.controller==true) | "controller_kind=\""+.kind+"\",controller_name=\""+.name+"\",") // "")' + jq_filter+='node_name=\"\(.spec.nodeName)\",' + jq_filter+='" + ' + jq_filter+='(.status.containerStatuses[]? | "' + jq_filter+='container_name=\"\(.name)\",' + jq_filter+='container_id=\"\(.containerID)\"' + jq_filter+='") | ' + jq_filter+='sub("docker://";"")' # containerID: docker://a346da9bc0e3eaba6b295f64ac16e02f2190db2cef570835706a9e7a36e2c722 + + local containers + if ! containers=$(jq -r "${jq_filter}" <<< "$pods" 2>&1); then + warning "${fn}: error on 'jq' parse pods: ${containers}." + return 1 + fi + + # available labels: + # namespace, pod_name, pod_uid, container_name, container_id, node_name + local labels + if [ -n "$cntr_id" ]; then + if labels=$(grep "$cntr_id" <<< "$containers" 2> /dev/null); then + labels+=',kind="container"' + [ -n "$kube_system_uid" ] && [ "$kube_system_uid" != "null" ] && labels+=",cluster_id=\"$kube_system_uid\"" + name="cntr" + name+="_$(get_lbl_val "$labels" namespace)" + name+="_$(get_lbl_val "$labels" pod_name)" + name+="_$(get_lbl_val "$labels" container_name)" + labels=$(add_lbl_prefix "$labels" "k8s_") + name+=" $labels" + fi + elif [ -n "$pod_uid" ]; then + if labels=$(grep "$pod_uid" -m 1 <<< "$containers" 2> /dev/null); then + labels="${labels%%,container_*}" + labels+=',kind="pod"' + [ -n "$kube_system_uid" ] && [ "$kube_system_uid" != "null" ] && labels+=",cluster_id=\"$kube_system_uid\"" + name="pod" + name+="_$(get_lbl_val "$labels" namespace)" + name+="_$(get_lbl_val "$labels" pod_name)" + labels=$(add_lbl_prefix "$labels" "k8s_") + name+=" $labels" + fi + fi + + # jq filter nonexistent field and nonexistent label value is 'null' + if [[ $name =~ _null(_|$) ]]; then + warning "${fn}: invalid name: $name (cgroup '$id')" + name="" + fi + + echo "$name" + [ -n "$name" ] + return +} + +function k8s_get_name() { + local fn="${FUNCNAME[0]}" + local id="${1}" + + NAME=$(k8s_get_kubepod_name "$id") + + if [ -z "${NAME}" ]; then + warning "${fn}: cannot find the name of cgroup with id '${id}'. Setting name to ${id} and disabling it." + NAME="${id}" + NAME_NOT_FOUND=3 + else + NAME="k8s_${NAME}" + + local name labels + name=${NAME%% *} + labels=${NAME#* } + if [ "$name" != "$labels" ]; then + info "${fn}: cgroup '${id}' has chart name '${name}', labels '${labels}" + else + info "${fn}: cgroup '${id}' has chart name '${NAME}'" + fi + fi +} + +function docker_get_name() { + local id="${1}" + if hash docker 2> /dev/null; then + docker_like_get_name_command docker "${id}" + else + docker_like_get_name_api DOCKER_HOST "${id}" || docker_like_get_name_command podman "${id}" + fi + if [ -z "${NAME}" ]; then + warning "cannot find the name of docker container '${id}'" + NAME_NOT_FOUND=2 + NAME="${id:0:12}" + else + info "docker container '${id}' is named '${NAME}'" + fi +} + +function docker_validate_id() { + local id="${1}" + if [ -n "${id}" ] && { [ ${#id} -eq 64 ] || [ ${#id} -eq 12 ]; }; then + docker_get_name "${id}" + else + error "a docker id cannot be extracted from docker cgroup '${CGROUP}'." + fi +} + +function podman_get_name() { + local id="${1}" + + # for Podman, prefer using the API if we can, as netdata will not normally have access + # to other users' containers, so they will not be visible when running `podman ps` + docker_like_get_name_api PODMAN_HOST "${id}" || docker_like_get_name_command podman "${id}" + + if [ -z "${NAME}" ]; then + warning "cannot find the name of podman container '${id}'" + NAME_NOT_FOUND=2 + NAME="${id:0:12}" + else + info "podman container '${id}' is named '${NAME}'" + fi +} + +function podman_validate_id() { + local id="${1}" + if [ -n "${id}" ] && [ ${#id} -eq 64 ]; then + podman_get_name "${id}" + else + error "a podman id cannot be extracted from docker cgroup '${CGROUP}'." + fi +} + +# ----------------------------------------------------------------------------- + +[ -z "${NETDATA_USER_CONFIG_DIR}" ] && NETDATA_USER_CONFIG_DIR="@configdir_POST@" +[ -z "${NETDATA_STOCK_CONFIG_DIR}" ] && NETDATA_STOCK_CONFIG_DIR="@libconfigdir_POST@" + +DOCKER_HOST="${DOCKER_HOST:=/var/run/docker.sock}" +PODMAN_HOST="${PODMAN_HOST:=/run/podman/podman.sock}" +CGROUP="${1}" +NAME_NOT_FOUND=0 +NAME= + +# ----------------------------------------------------------------------------- + +if [ -z "${CGROUP}" ]; then + fatal "called without a cgroup name. Nothing to do." +fi + +for CONFIG in "${NETDATA_USER_CONFIG_DIR}/cgroups-names.conf" "${NETDATA_STOCK_CONFIG_DIR}/cgroups-names.conf"; do + if [ -f "${CONFIG}" ]; then + NAME="$(grep "^${CGROUP} " "${CONFIG}" | sed 's/[[:space:]]\+/ /g' | cut -d ' ' -f 2)" + if [ -z "${NAME}" ]; then + info "cannot find cgroup '${CGROUP}' in '${CONFIG}'." + else + break + fi + #else + # info "configuration file '${CONFIG}' is not available." + fi +done + +if [ -z "${NAME}" ]; then + if [[ ${CGROUP} =~ ^.*kubepods.* ]]; then + k8s_get_name "${CGROUP}" + fi +fi + +if [ -z "${NAME}" ]; then + if [[ ${CGROUP} =~ ^.*docker[-_/\.][a-fA-F0-9]+[-_\.]?.*$ ]]; then + # docker containers + #shellcheck disable=SC1117 + DOCKERID="$(echo "${CGROUP}" | sed "s|^.*docker[-_/]\([a-fA-F0-9]\+\)[-_\.]\?.*$|\1|")" + docker_validate_id "${DOCKERID}" + elif [[ ${CGROUP} =~ ^.*ecs[-_/\.][a-fA-F0-9]+[-_\.]?.*$ ]]; then + # ECS + #shellcheck disable=SC1117 + DOCKERID="$(echo "${CGROUP}" | sed "s|^.*ecs[-_/].*[-_/]\([a-fA-F0-9]\+\)[-_\.]\?.*$|\1|")" + docker_validate_id "${DOCKERID}" + elif [[ ${CGROUP} =~ ^.*libpod-[a-fA-F0-9]+.*$ ]]; then + # Podman + PODMANID="$(echo "${CGROUP}" | sed "s|^.*libpod-\([a-fA-F0-9]\+\).*$|\1|")" + podman_validate_id "${PODMANID}" + + elif [[ ${CGROUP} =~ machine.slice[_/].*\.service ]]; then + # systemd-nspawn + NAME="$(echo "${CGROUP}" | sed 's/.*machine.slice[_\/]\(.*\)\.service/\1/g')" + + elif [[ ${CGROUP} =~ machine.slice_machine.*-qemu ]]; then + # libvirtd / qemu virtual machines + # NAME="$(echo ${CGROUP} | sed 's/machine.slice_machine.*-qemu//; s/\/x2d//; s/\/x2d/\-/g; s/\.scope//g')" + NAME="qemu_$(echo "${CGROUP}" | sed 's/machine.slice_machine.*-qemu//; s/\/x2d[[:digit:]]*//; s/\/x2d//g; s/\.scope//g')" + + elif [[ ${CGROUP} =~ machine_.*\.libvirt-qemu ]]; then + # libvirtd / qemu virtual machines + NAME="qemu_$(echo "${CGROUP}" | sed 's/^machine_//; s/\.libvirt-qemu$//; s/-/_/;')" + + elif [[ ${CGROUP} =~ qemu.slice_([0-9]+).scope && -d /etc/pve ]]; then + # Proxmox VMs + + FILENAME="/etc/pve/qemu-server/${BASH_REMATCH[1]}.conf" + if [[ -f $FILENAME && -r $FILENAME ]]; then + NAME="qemu_$(grep -e '^name: ' "/etc/pve/qemu-server/${BASH_REMATCH[1]}.conf" | head -1 | sed -rn 's|\s*name\s*:\s*(.*)?$|\1|p')" + else + error "proxmox config file missing ${FILENAME} or netdata does not have read access. Please ensure netdata is a member of www-data group." + fi + elif [[ ${CGROUP} =~ lxc_([0-9]+) && -d /etc/pve ]]; then + # Proxmox Containers (LXC) + + FILENAME="/etc/pve/lxc/${BASH_REMATCH[1]}.conf" + if [[ -f ${FILENAME} && -r ${FILENAME} ]]; then + NAME=$(grep -e '^hostname: ' "/etc/pve/lxc/${BASH_REMATCH[1]}.conf" | head -1 | sed -rn 's|\s*hostname\s*:\s*(.*)?$|\1|p') + else + error "proxmox config file missing ${FILENAME} or netdata does not have read access. Please ensure netdata is a member of www-data group." + fi + elif [[ ${CGROUP} =~ lxc.payload.* ]]; then + # LXC 4.0 + NAME="$(echo "${CGROUP}" | sed 's/lxc\.payload\.\(.*\)/\1/g')" + fi + + [ -z "${NAME}" ] && NAME="${CGROUP}" + [ ${#NAME} -gt 100 ] && NAME="${NAME:0:100}" +fi + +info "cgroup '${CGROUP}' is called '${NAME}'" +echo "${NAME}" + +exit ${NAME_NOT_FOUND} diff --git a/collectors/cgroups.plugin/cgroup-network-helper.sh b/collectors/cgroups.plugin/cgroup-network-helper.sh new file mode 100755 index 0000000..eb839ef --- /dev/null +++ b/collectors/cgroups.plugin/cgroup-network-helper.sh @@ -0,0 +1,297 @@ +#!/usr/bin/env bash +# shellcheck disable=SC1117 + +# cgroup-network-helper.sh +# detect container and virtual machine interfaces +# +# (C) 2017 Costa Tsaousis +# SPDX-License-Identifier: GPL-3.0-or-later +# +# This script is called as root (by cgroup-network), with either a pid, or a cgroup path. +# It tries to find all the network interfaces that belong to the same cgroup. +# +# It supports several method for this detection: +# +# 1. cgroup-network (the binary father of this script) detects veth network interfaces, +# by examining iflink and ifindex IDs and switching namespaces +# (it also detects the interface name as it is used by the container). +# +# 2. this script, uses /proc/PID/fdinfo to find tun/tap network interfaces. +# +# 3. this script, calls virsh to find libvirt network interfaces. +# + +# ----------------------------------------------------------------------------- + +# the system path is cleared by cgroup-network +# shellcheck source=/dev/null +[ -f /etc/profile ] && source /etc/profile + +export LC_ALL=C + +PROGRAM_NAME="$(basename "${0}")" + +logdate() { + date "+%Y-%m-%d %H:%M:%S" +} + +log() { + local status="${1}" + shift + + echo >&2 "$(logdate): ${PROGRAM_NAME}: ${status}: ${*}" + +} + +warning() { + log WARNING "${@}" +} + +error() { + log ERROR "${@}" +} + +info() { + log INFO "${@}" +} + +fatal() { + log FATAL "${@}" + exit 1 +} + +debug=${NETDATA_CGROUP_NETWORK_HELPER_DEBUG=0} +debug() { + [ "${debug}" = "1" ] && log DEBUG "${@}" +} + +# ----------------------------------------------------------------------------- +# check for BASH v4+ (required for associative arrays) + +[ $(( BASH_VERSINFO[0] )) -lt 4 ] && \ + fatal "BASH version 4 or later is required (this is ${BASH_VERSION})." + +# ----------------------------------------------------------------------------- +# parse the arguments + +pid= +cgroup= +while [ ! -z "${1}" ] +do + case "${1}" in + --cgroup) cgroup="${2}"; shift 1;; + --pid|-p) pid="${2}"; shift 1;; + --debug|debug) debug=1;; + *) fatal "Cannot understand argument '${1}'";; + esac + + shift +done + +if [ -z "${pid}" ] && [ -z "${cgroup}" ] +then + fatal "Either --pid or --cgroup is required" +fi + +# ----------------------------------------------------------------------------- + +set_source() { + [ ${debug} -eq 1 ] && echo "SRC ${*}" +} + + +# ----------------------------------------------------------------------------- +# veth interfaces via cgroup + +# cgroup-network can detect veth interfaces by itself (written in C). +# If you seek for a shell version of what it does, check this: +# https://github.com/netdata/netdata/issues/474#issuecomment-317866709 + + +# ----------------------------------------------------------------------------- +# tun/tap interfaces via /proc/PID/fdinfo + +# find any tun/tap devices linked to a pid +proc_pid_fdinfo_iff() { + local p="${1}" # the pid + + debug "Searching for tun/tap interfaces for pid ${p}..." + set_source "fdinfo" + grep "^iff:.*" "${NETDATA_HOST_PREFIX}/proc/${p}/fdinfo"/* 2>/dev/null | cut -f 2 +} + +find_tun_tap_interfaces_for_cgroup() { + local c="${1}" # the cgroup path + [ -d "${c}/emulator" ] && c="${c}/emulator" # check for 'emulator' subdirectory + c="${c}/cgroup.procs" # make full path + + # for each pid of the cgroup + # find any tun/tap devices linked to the pid + if [ -f "${c}" ] + then + local p + for p in $(< "${c}" ) + do + proc_pid_fdinfo_iff "${p}" + done + else + debug "Cannot find file '${c}', not searching for tun/tap interfaces." + fi +} + + +# ----------------------------------------------------------------------------- +# virsh domain network interfaces + +virsh_cgroup_to_domain_name() { + local c="${1}" # the cgroup path + + debug "extracting a possible virsh domain from cgroup ${c}..." + + # extract for the cgroup path + sed -n -e "s|.*/machine-qemu\\\\x2d[0-9]\+\\\\x2d\(.*\)\.scope$|\1|p" \ + -e "s|.*/machine/\(.*\)\.libvirt-qemu$|\1|p" \ + <<EOF +${c} +EOF +} + +virsh_find_all_interfaces_for_cgroup() { + local c="${1}" # the cgroup path + + # the virsh command + local virsh + # shellcheck disable=SC2230 + virsh="$(which virsh 2>/dev/null || command -v virsh 2>/dev/null)" + + if [ ! -z "${virsh}" ] + then + local d + d="$(virsh_cgroup_to_domain_name "${c}")" + + if [ ! -z "${d}" ] + then + debug "running: virsh domiflist ${d}; to find the network interfaces" + + # match only 'network' interfaces from virsh output + + set_source "virsh" + "${virsh}" -r domiflist "${d}" |\ + sed -n \ + -e "s|^\([^[:space:]]\+\)[[:space:]]\+network[[:space:]]\+\([^[:space:]]\+\)[[:space:]]\+[^[:space:]]\+[[:space:]]\+[^[:space:]]\+$|\1 \1_\2|p" \ + -e "s|^\([^[:space:]]\+\)[[:space:]]\+bridge[[:space:]]\+\([^[:space:]]\+\)[[:space:]]\+[^[:space:]]\+[[:space:]]\+[^[:space:]]\+$|\1 \1_\2|p" + else + debug "no virsh domain extracted from cgroup ${c}" + fi + else + debug "virsh command is not available" + fi +} + +# ----------------------------------------------------------------------------- +# netnsid detected interfaces + +netnsid_find_all_interfaces_for_pid() { + local pid="${1}" + [ -z "${pid}" ] && return 1 + + local nsid=$(lsns -t net -p ${pid} -o NETNSID -nr) + [ -z "${nsid}" -o "${nsid}" = "unassigned" ] && return 1 + + set_source "netnsid" + ip link show |\ + grep -B 1 -E " link-netnsid ${nsid}($| )" |\ + sed -n -e "s|^[[:space:]]*[0-9]\+:[[:space:]]\+\([A-Za-z0-9_]\+\)\(@[A-Za-z0-9_]\+\)*:[[:space:]].*$|\1|p" +} + +netnsid_find_all_interfaces_for_cgroup() { + local c="${1}" # the cgroup path + + # for each pid of the cgroup + # find any tun/tap devices linked to the pid + if [ -f "${c}/cgroup.procs" ] + then + local p + for p in $(< "${c}/cgroup.procs" ) + do + netnsid_find_all_interfaces_for_pid "${p}" + done + else + debug "Cannot find file '${c}/cgroup.procs', not searching for netnsid interfaces." + fi +} + +# ----------------------------------------------------------------------------- + +find_all_interfaces_of_pid_or_cgroup() { + local p="${1}" c="${2}" # the pid and the cgroup path + + if [ ! -z "${pid}" ] + then + # we have been called with a pid + + proc_pid_fdinfo_iff "${p}" + netnsid_find_all_interfaces_for_pid "${p}" + + elif [ ! -z "${c}" ] + then + # we have been called with a cgroup + + info "searching for network interfaces of cgroup '${c}'" + + find_tun_tap_interfaces_for_cgroup "${c}" + virsh_find_all_interfaces_for_cgroup "${c}" + netnsid_find_all_interfaces_for_cgroup "${c}" + + else + + error "Either a pid or a cgroup path is needed" + return 1 + + fi + + return 0 +} + +# ----------------------------------------------------------------------------- + +# an associative array to store the interfaces +# the index is the interface name as seen by the host +# the value is the interface name as seen by the guest / container +declare -A devs=() + +# store all interfaces found in the associative array +# this will also give the unique devices, as seen by the host +last_src= +# shellcheck disable=SC2162 +while read host_device guest_device +do + [ -z "${host_device}" ] && continue + + [ "${host_device}" = "SRC" ] && last_src="${guest_device}" && continue + + # the default guest_device is the host_device + [ -z "${guest_device}" ] && guest_device="${host_device}" + + # when we run in debug, show the source + debug "Found host device '${host_device}', guest device '${guest_device}', detected via '${last_src}'" + + if [ -z "${devs[${host_device}]}" ] || [ "${devs[${host_device}]}" = "${host_device}" ]; then + devs[${host_device}]="${guest_device}" + fi + +done < <( find_all_interfaces_of_pid_or_cgroup "${pid}" "${cgroup}" ) + +# print the interfaces found, in the format netdata expects them +found=0 +for x in "${!devs[@]}" +do + found=$((found + 1)) + echo "${x} ${devs[${x}]}" +done + +debug "found ${found} network interfaces for pid '${pid}', cgroup '${cgroup}', run as ${USER}, ${UID}" + +# let netdata know if we found any +[ ${found} -eq 0 ] && exit 1 +exit 0 diff --git a/collectors/cgroups.plugin/cgroup-network.c b/collectors/cgroups.plugin/cgroup-network.c new file mode 100644 index 0000000..921b14d --- /dev/null +++ b/collectors/cgroups.plugin/cgroup-network.c @@ -0,0 +1,714 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "libnetdata/libnetdata.h" + +#ifdef HAVE_SETNS +#ifndef _GNU_SOURCE +#define _GNU_SOURCE /* See feature_test_macros(7) */ +#endif +#include <sched.h> +#endif + +char environment_variable2[FILENAME_MAX + 50] = ""; +char *environment[] = { + "PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin", + environment_variable2, + NULL +}; + + +// ---------------------------------------------------------------------------- + +// callback required by fatal() +void netdata_cleanup_and_exit(int ret) { + exit(ret); +} + +void send_statistics( const char *action, const char *action_result, const char *action_data) { + (void) action; + (void) action_result; + (void) action_data; + return; +} + +// callbacks required by popen() +void signals_block(void) {}; +void signals_unblock(void) {}; +void signals_reset(void) {}; + +// callback required by eval() +int health_variable_lookup(const char *variable, uint32_t hash, struct rrdcalc *rc, calculated_number *result) { + (void)variable; + (void)hash; + (void)rc; + (void)result; + return 0; +}; + +// required by get_system_cpus() +char *netdata_configured_host_prefix = ""; + +// ---------------------------------------------------------------------------- + +struct iface { + const char *device; + uint32_t hash; + + unsigned int ifindex; + unsigned int iflink; + + struct iface *next; +}; + +unsigned int read_iface_iflink(const char *prefix, const char *iface) { + if(!prefix) prefix = ""; + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/sys/class/net/%s/iflink", prefix, iface); + + unsigned long long iflink = 0; + int ret = read_single_number_file(filename, &iflink); + if(ret) error("Cannot read '%s'.", filename); + + return (unsigned int)iflink; +} + +unsigned int read_iface_ifindex(const char *prefix, const char *iface) { + if(!prefix) prefix = ""; + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/sys/class/net/%s/ifindex", prefix, iface); + + unsigned long long ifindex = 0; + int ret = read_single_number_file(filename, &ifindex); + if(ret) error("Cannot read '%s'.", filename); + + return (unsigned int)ifindex; +} + +struct iface *read_proc_net_dev(const char *scope __maybe_unused, const char *prefix) { + if(!prefix) prefix = ""; + + procfile *ff = NULL; + char filename[FILENAME_MAX + 1]; + + snprintfz(filename, FILENAME_MAX, "%s%s", prefix, (*prefix)?"/proc/1/net/dev":"/proc/net/dev"); + +#ifdef NETDATA_INTERNAL_CHECKS + info("parsing '%s'", filename); +#endif + + ff = procfile_open(filename, " \t,:|", PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + error("Cannot open file '%s'", filename); + return NULL; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + error("Cannot read file '%s'", filename); + return NULL; + } + + size_t lines = procfile_lines(ff), l; + struct iface *root = NULL; + for(l = 2; l < lines ;l++) { + if (unlikely(procfile_linewords(ff, l) < 1)) continue; + + struct iface *t = callocz(1, sizeof(struct iface)); + t->device = strdupz(procfile_lineword(ff, l, 0)); + t->hash = simple_hash(t->device); + t->ifindex = read_iface_ifindex(prefix, t->device); + t->iflink = read_iface_iflink(prefix, t->device); + t->next = root; + root = t; + +#ifdef NETDATA_INTERNAL_CHECKS + info("added %s interface '%s', ifindex %u, iflink %u", scope, t->device, t->ifindex, t->iflink); +#endif + } + + procfile_close(ff); + + return root; +} + +void free_iface(struct iface *iface) { + freez((void *)iface->device); + freez(iface); +} + +void free_host_ifaces(struct iface *iface) { + while(iface) { + struct iface *t = iface->next; + free_iface(iface); + iface = t; + } +} + +int iface_is_eligible(struct iface *iface) { + if(iface->iflink != iface->ifindex) + return 1; + + return 0; +} + +int eligible_ifaces(struct iface *root) { + int eligible = 0; + + struct iface *t; + for(t = root; t ; t = t->next) + if(iface_is_eligible(t)) + eligible++; + + return eligible; +} + +static void continue_as_child(void) { + pid_t child = fork(); + int status; + pid_t ret; + + if (child < 0) + error("fork() failed"); + + /* Only the child returns */ + if (child == 0) + return; + + for (;;) { + ret = waitpid(child, &status, WUNTRACED); + if ((ret == child) && (WIFSTOPPED(status))) { + /* The child suspended so suspend us as well */ + kill(getpid(), SIGSTOP); + kill(child, SIGCONT); + } else { + break; + } + } + + /* Return the child's exit code if possible */ + if (WIFEXITED(status)) { + exit(WEXITSTATUS(status)); + } else if (WIFSIGNALED(status)) { + kill(getpid(), WTERMSIG(status)); + } + + exit(EXIT_FAILURE); +} + +int proc_pid_fd(const char *prefix, const char *ns, pid_t pid) { + if(!prefix) prefix = ""; + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/proc/%d/%s", prefix, (int)pid, ns); + int fd = open(filename, O_RDONLY); + + if(fd == -1) + error("Cannot open proc_pid_fd() file '%s'", filename); + + return fd; +} + +static struct ns { + int nstype; + int fd; + int status; + const char *name; + const char *path; +} all_ns[] = { + // { .nstype = CLONE_NEWUSER, .fd = -1, .status = -1, .name = "user", .path = "ns/user" }, + // { .nstype = CLONE_NEWCGROUP, .fd = -1, .status = -1, .name = "cgroup", .path = "ns/cgroup" }, + // { .nstype = CLONE_NEWIPC, .fd = -1, .status = -1, .name = "ipc", .path = "ns/ipc" }, + // { .nstype = CLONE_NEWUTS, .fd = -1, .status = -1, .name = "uts", .path = "ns/uts" }, + { .nstype = CLONE_NEWNET, .fd = -1, .status = -1, .name = "network", .path = "ns/net" }, + { .nstype = CLONE_NEWPID, .fd = -1, .status = -1, .name = "pid", .path = "ns/pid" }, + { .nstype = CLONE_NEWNS, .fd = -1, .status = -1, .name = "mount", .path = "ns/mnt" }, + + // terminator + { .nstype = 0, .fd = -1, .status = -1, .name = NULL, .path = NULL } +}; + +int switch_namespace(const char *prefix, pid_t pid) { + +#ifdef HAVE_SETNS + + int i; + for(i = 0; all_ns[i].name ; i++) + all_ns[i].fd = proc_pid_fd(prefix, all_ns[i].path, pid); + + int root_fd = proc_pid_fd(prefix, "root", pid); + int cwd_fd = proc_pid_fd(prefix, "cwd", pid); + + setgroups(0, NULL); + + // 2 passes - found it at nsenter source code + // this is related CLONE_NEWUSER functionality + + // This code cannot switch user namespace (it can all the other namespaces) + // Fortunately, we don't need to switch user namespaces. + + int pass; + for(pass = 0; pass < 2 ;pass++) { + for(i = 0; all_ns[i].name ; i++) { + if (all_ns[i].fd != -1 && all_ns[i].status == -1) { + if(setns(all_ns[i].fd, all_ns[i].nstype) == -1) { + if(pass == 1) { + all_ns[i].status = 0; + error("Cannot switch to %s namespace of pid %d", all_ns[i].name, (int) pid); + } + } + else + all_ns[i].status = 1; + } + } + } + + setgroups(0, NULL); + + if(root_fd != -1) { + if(fchdir(root_fd) < 0) + error("Cannot fchdir() to pid %d root directory", (int)pid); + + if(chroot(".") < 0) + error("Cannot chroot() to pid %d root directory", (int)pid); + + close(root_fd); + } + + if(cwd_fd != -1) { + if(fchdir(cwd_fd) < 0) + error("Cannot fchdir() to pid %d current working directory", (int)pid); + + close(cwd_fd); + } + + int do_fork = 0; + for(i = 0; all_ns[i].name ; i++) + if(all_ns[i].fd != -1) { + + // CLONE_NEWPID requires a fork() to become effective + if(all_ns[i].nstype == CLONE_NEWPID && all_ns[i].status) + do_fork = 1; + + close(all_ns[i].fd); + } + + if(do_fork) + continue_as_child(); + + return 0; + +#else + + errno = ENOSYS; + error("setns() is missing on this system."); + return 1; + +#endif +} + +pid_t read_pid_from_cgroup_file(const char *filename) { + int fd = open(filename, procfile_open_flags); + if(fd == -1) { + error("Cannot open pid_from_cgroup() file '%s'.", filename); + return 0; + } + + FILE *fp = fdopen(fd, "r"); + if(!fp) { + error("Cannot upgrade fd to fp for file '%s'.", filename); + return 0; + } + + char buffer[100 + 1]; + pid_t pid = 0; + char *s; + while((s = fgets(buffer, 100, fp))) { + buffer[100] = '\0'; + pid = atoi(s); + if(pid > 0) break; + } + + fclose(fp); + +#ifdef NETDATA_INTERNAL_CHECKS + if(pid > 0) info("found pid %d on file '%s'", pid, filename); +#endif + + return pid; +} + +pid_t read_pid_from_cgroup_files(const char *path) { + char filename[FILENAME_MAX + 1]; + + snprintfz(filename, FILENAME_MAX, "%s/cgroup.procs", path); + pid_t pid = read_pid_from_cgroup_file(filename); + if(pid > 0) return pid; + + snprintfz(filename, FILENAME_MAX, "%s/tasks", path); + return read_pid_from_cgroup_file(filename); +} + +pid_t read_pid_from_cgroup(const char *path) { + pid_t pid = read_pid_from_cgroup_files(path); + if (pid > 0) return pid; + + DIR *dir = opendir(path); + if (!dir) { + error("cannot read directory '%s'", path); + return 0; + } + + struct dirent *de = NULL; + while ((de = readdir(dir))) { + if (de->d_type == DT_DIR + && ( + (de->d_name[0] == '.' && de->d_name[1] == '\0') + || (de->d_name[0] == '.' && de->d_name[1] == '.' && de->d_name[2] == '\0') + )) + continue; + + if (de->d_type == DT_DIR) { + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/%s", path, de->d_name); + pid = read_pid_from_cgroup(filename); + if(pid > 0) break; + } + } + closedir(dir); + return pid; +} + +// ---------------------------------------------------------------------------- +// send the result to netdata + +struct found_device { + const char *host_device; + const char *guest_device; + + uint32_t host_device_hash; + + struct found_device *next; +} *detected_devices = NULL; + +void add_device(const char *host, const char *guest) { +#ifdef NETDATA_INTERNAL_CHECKS + info("adding device with host '%s', guest '%s'", host, guest); +#endif + + uint32_t hash = simple_hash(host); + + if(guest && (!*guest || strcmp(host, guest) == 0)) + guest = NULL; + + struct found_device *f; + for(f = detected_devices; f ; f = f->next) { + if(f->host_device_hash == hash && !strcmp(host, f->host_device)) { + + if(guest && (!f->guest_device || !strcmp(f->host_device, f->guest_device))) { + if(f->guest_device) freez((void *)f->guest_device); + f->guest_device = strdupz(guest); + } + + return; + } + } + + f = mallocz(sizeof(struct found_device)); + f->host_device = strdupz(host); + f->host_device_hash = hash; + f->guest_device = (guest)?strdupz(guest):NULL; + f->next = detected_devices; + detected_devices = f; +} + +int send_devices(void) { + int found = 0; + + struct found_device *f; + for(f = detected_devices; f ; f = f->next) { + found++; + printf("%s %s\n", f->host_device, (f->guest_device)?f->guest_device:f->host_device); + } + + return found; +} + +// ---------------------------------------------------------------------------- +// this function should be called only **ONCE** +// also it has to be the **LAST** to be called +// since it switches namespaces, so after this call, everything is different! + +void detect_veth_interfaces(pid_t pid) { + struct iface *cgroup = NULL; + struct iface *host, *h, *c; + + host = read_proc_net_dev("host", netdata_configured_host_prefix); + if(!host) { + errno = 0; + error("cannot read host interface list."); + goto cleanup; + } + + if(!eligible_ifaces(host)) { + errno = 0; + error("there are no double-linked host interfaces available."); + goto cleanup; + } + + if(switch_namespace(netdata_configured_host_prefix, pid)) { + errno = 0; + error("cannot switch to the namespace of pid %u", (unsigned int) pid); + goto cleanup; + } + +#ifdef NETDATA_INTERNAL_CHECKS + info("switched to namespaces of pid %d", pid); +#endif + + cgroup = read_proc_net_dev("cgroup", NULL); + if(!cgroup) { + errno = 0; + error("cannot read cgroup interface list."); + goto cleanup; + } + + if(!eligible_ifaces(cgroup)) { + errno = 0; + error("there are not double-linked cgroup interfaces available."); + goto cleanup; + } + + for(h = host; h ; h = h->next) { + if(iface_is_eligible(h)) { + for (c = cgroup; c; c = c->next) { + if(iface_is_eligible(c) && h->ifindex == c->iflink && h->iflink == c->ifindex) { + add_device(h->device, c->device); + } + } + } + } + +cleanup: + free_host_ifaces(cgroup); + free_host_ifaces(host); +} + +// ---------------------------------------------------------------------------- +// call the external helper + +#define CGROUP_NETWORK_INTERFACE_MAX_LINE 2048 +void call_the_helper(pid_t pid, const char *cgroup) { + if(setresuid(0, 0, 0) == -1) + error("setresuid(0, 0, 0) failed."); + + char command[CGROUP_NETWORK_INTERFACE_MAX_LINE + 1]; + if(cgroup) + snprintfz(command, CGROUP_NETWORK_INTERFACE_MAX_LINE, "exec " PLUGINS_DIR "/cgroup-network-helper.sh --cgroup '%s'", cgroup); + else + snprintfz(command, CGROUP_NETWORK_INTERFACE_MAX_LINE, "exec " PLUGINS_DIR "/cgroup-network-helper.sh --pid %d", pid); + + info("running: %s", command); + + pid_t cgroup_pid; + FILE *fp = mypopene(command, &cgroup_pid, environment); + if(fp) { + char buffer[CGROUP_NETWORK_INTERFACE_MAX_LINE + 1]; + char *s; + while((s = fgets(buffer, CGROUP_NETWORK_INTERFACE_MAX_LINE, fp))) { + trim(s); + + if(*s && *s != '\n') { + char *t = s; + while(*t && *t != ' ') t++; + if(*t == ' ') { + *t = '\0'; + t++; + } + + if(!*s || !*t) continue; + add_device(s, t); + } + } + + mypclose(fp, cgroup_pid); + } + else + error("cannot execute cgroup-network helper script: %s", command); +} + +int is_valid_path_symbol(char c) { + switch(c) { + case '/': // path separators + case '\\': // needed for virsh domains \x2d1\x2dname + case ' ': // space + case '-': // hyphen + case '_': // underscore + case '.': // dot + case ',': // comma + return 1; + + default: + return 0; + } +} + +// we will pass this path a shell script running as root +// so, we need to make sure the path will be valid +// and will not include anything that could allow +// the caller use shell expansion for gaining escalated +// privileges. +int verify_path(const char *path) { + struct stat sb; + + char c; + const char *s = path; + while((c = *s++)) { + if(!( isalnum(c) || is_valid_path_symbol(c) )) { + error("invalid character in path '%s'", path); + return -1; + } + } + + if(strstr(path, "\\") && !strstr(path, "\\x")) { + error("invalid escape sequence in path '%s'", path); + return 1; + } + + if(strstr(path, "/../")) { + error("invalid parent path sequence detected in '%s'", path); + return 1; + } + + if(path[0] != '/') { + error("only absolute path names are supported - invalid path '%s'", path); + return -1; + } + + if (stat(path, &sb) == -1) { + error("cannot stat() path '%s'", path); + return -1; + } + + if((sb.st_mode & S_IFMT) != S_IFDIR) { + error("path '%s' is not a directory", path); + return -1; + } + + return 0; +} + +/* +char *fix_path_variable(void) { + const char *path = getenv("PATH"); + if(!path || !*path) return 0; + + char *p = strdupz(path); + char *safe_path = callocz(1, strlen(p) + strlen("PATH=") + 1); + strcpy(safe_path, "PATH="); + + int added = 0; + char *ptr = p; + while(ptr && *ptr) { + char *s = strsep(&ptr, ":"); + if(s && *s) { + if(verify_path(s) == -1) { + error("the PATH variable includes an invalid path '%s' - removed it.", s); + } + else { + info("the PATH variable includes a valid path '%s'.", s); + if(added) strcat(safe_path, ":"); + strcat(safe_path, s); + added++; + } + } + } + + info("unsafe PATH: '%s'.", path); + info(" safe PATH: '%s'.", safe_path); + + freez(p); + return safe_path; +} +*/ + +// ---------------------------------------------------------------------------- +// main + +void usage(void) { + fprintf(stderr, "%s [ -p PID | --pid PID | --cgroup /path/to/cgroup ]\n", program_name); + exit(1); +} + +int main(int argc, char **argv) { + pid_t pid = 0; + + program_name = argv[0]; + program_version = VERSION; + error_log_syslog = 0; + + // since cgroup-network runs as root, prevent it from opening symbolic links + procfile_open_flags = O_RDONLY|O_NOFOLLOW; + + // ------------------------------------------------------------------------ + // make sure NETDATA_HOST_PREFIX is safe + + netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX"); + if(verify_netdata_host_prefix() == -1) exit(1); + + if(netdata_configured_host_prefix[0] != '\0' && verify_path(netdata_configured_host_prefix) == -1) + fatal("invalid NETDATA_HOST_PREFIX '%s'", netdata_configured_host_prefix); + + // ------------------------------------------------------------------------ + // build a safe environment for our script + + // the first environment variable is a fixed PATH= + snprintfz(environment_variable2, sizeof(environment_variable2) - 1, "NETDATA_HOST_PREFIX=%s", netdata_configured_host_prefix); + + // ------------------------------------------------------------------------ + + if(argc == 2 && (!strcmp(argv[1], "version") || !strcmp(argv[1], "-version") || !strcmp(argv[1], "--version") || !strcmp(argv[1], "-v") || !strcmp(argv[1], "-V"))) { + fprintf(stderr, "cgroup-network %s\n", VERSION); + exit(0); + } + + if(argc != 3) + usage(); + + if(!strcmp(argv[1], "-p") || !strcmp(argv[1], "--pid")) { + pid = atoi(argv[2]); + + if(pid <= 0) { + errno = 0; + error("Invalid pid %d given", (int) pid); + return 2; + } + + call_the_helper(pid, NULL); + } + else if(!strcmp(argv[1], "--cgroup")) { + char *cgroup = argv[2]; + if(verify_path(cgroup) == -1) { + error("cgroup '%s' does not exist or is not valid.", cgroup); + return 1; + } + + pid = read_pid_from_cgroup(cgroup); + call_the_helper(pid, cgroup); + + if(pid <= 0 && !detected_devices) { + errno = 0; + error("Cannot find a cgroup PID from cgroup '%s'", cgroup); + } + } + else + usage(); + + if(pid > 0) + detect_veth_interfaces(pid); + + int found = send_devices(); + if(found <= 0) return 1; + return 0; +} diff --git a/collectors/cgroups.plugin/sys_fs_cgroup.c b/collectors/cgroups.plugin/sys_fs_cgroup.c new file mode 100644 index 0000000..df1f5f2 --- /dev/null +++ b/collectors/cgroups.plugin/sys_fs_cgroup.c @@ -0,0 +1,4051 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "sys_fs_cgroup.h" + +#define PLUGIN_CGROUPS_NAME "cgroups.plugin" +#define PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME "systemd" +#define PLUGIN_CGROUPS_MODULE_CGROUPS_NAME "/sys/fs/cgroup" + +// ---------------------------------------------------------------------------- +// cgroup globals + +static long system_page_size = 4096; // system will be queried via sysconf() in configuration() + +static int cgroup_enable_cpuacct_stat = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_cpuacct_usage = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_memory = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_detailed_memory = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_memory_failcnt = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_swap = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_io = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_ops = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_throttle_io = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_throttle_ops = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_merged_ops = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_blkio_queued_ops = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_pressure_cpu = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_pressure_io_some = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_pressure_io_full = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_pressure_memory_some = CONFIG_BOOLEAN_AUTO; +static int cgroup_enable_pressure_memory_full = CONFIG_BOOLEAN_AUTO; + +static int cgroup_enable_systemd_services = CONFIG_BOOLEAN_YES; +static int cgroup_enable_systemd_services_detailed_memory = CONFIG_BOOLEAN_NO; +static int cgroup_used_memory_without_cache = CONFIG_BOOLEAN_YES; + +static int cgroup_use_unified_cgroups = CONFIG_BOOLEAN_NO; +static int cgroup_unified_exist = CONFIG_BOOLEAN_AUTO; + +static int cgroup_search_in_devices = 1; + +static int cgroup_enable_new_cgroups_detected_at_runtime = 1; +static int cgroup_check_for_new_every = 10; +static int cgroup_update_every = 1; +static int cgroup_containers_chart_priority = NETDATA_CHART_PRIO_CGROUPS_CONTAINERS; + +static int cgroup_recheck_zero_blkio_every_iterations = 10; +static int cgroup_recheck_zero_mem_failcnt_every_iterations = 10; +static int cgroup_recheck_zero_mem_detailed_every_iterations = 10; + +static char *cgroup_cpuacct_base = NULL; +static char *cgroup_cpuset_base = NULL; +static char *cgroup_blkio_base = NULL; +static char *cgroup_memory_base = NULL; +static char *cgroup_devices_base = NULL; +static char *cgroup_unified_base = NULL; + +static int cgroup_root_count = 0; +static int cgroup_root_max = 1000; +static int cgroup_max_depth = 0; + +static SIMPLE_PATTERN *enabled_cgroup_patterns = NULL; +static SIMPLE_PATTERN *enabled_cgroup_paths = NULL; +static SIMPLE_PATTERN *enabled_cgroup_renames = NULL; +static SIMPLE_PATTERN *systemd_services_cgroups = NULL; + +static char *cgroups_rename_script = NULL; +static char *cgroups_network_interface_script = NULL; + +static int cgroups_check = 0; + +static uint32_t Read_hash = 0; +static uint32_t Write_hash = 0; +static uint32_t user_hash = 0; +static uint32_t system_hash = 0; + +enum cgroups_type { CGROUPS_AUTODETECT_FAIL, CGROUPS_V1, CGROUPS_V2 }; + +enum cgroups_systemd_setting { + SYSTEMD_CGROUP_ERR, + SYSTEMD_CGROUP_LEGACY, + SYSTEMD_CGROUP_HYBRID, + SYSTEMD_CGROUP_UNIFIED +}; + +struct cgroups_systemd_config_setting { + char *name; + enum cgroups_systemd_setting setting; +}; + +static struct cgroups_systemd_config_setting cgroups_systemd_options[] = { + { .name = "legacy", .setting = SYSTEMD_CGROUP_LEGACY }, + { .name = "hybrid", .setting = SYSTEMD_CGROUP_HYBRID }, + { .name = "unified", .setting = SYSTEMD_CGROUP_UNIFIED }, + { .name = NULL, .setting = SYSTEMD_CGROUP_ERR }, +}; + +/* on Fed systemd is not in PATH for some reason */ +#define SYSTEMD_CMD_RHEL "/usr/lib/systemd/systemd --version" +#define SYSTEMD_HIERARCHY_STRING "default-hierarchy=" + +#define MAXSIZE_PROC_CMDLINE 4096 +static enum cgroups_systemd_setting cgroups_detect_systemd(const char *exec) +{ + pid_t command_pid; + enum cgroups_systemd_setting retval = SYSTEMD_CGROUP_ERR; + char buf[MAXSIZE_PROC_CMDLINE]; + char *begin, *end; + + FILE *f = mypopen(exec, &command_pid); + + if (!f) + return retval; + + while (fgets(buf, MAXSIZE_PROC_CMDLINE, f) != NULL) { + if ((begin = strstr(buf, SYSTEMD_HIERARCHY_STRING))) { + end = begin = begin + strlen(SYSTEMD_HIERARCHY_STRING); + if (!*begin) + break; + while (isalpha(*end)) + end++; + *end = 0; + for (int i = 0; cgroups_systemd_options[i].name; i++) { + if (!strcmp(begin, cgroups_systemd_options[i].name)) { + retval = cgroups_systemd_options[i].setting; + break; + } + } + break; + } + } + + if (mypclose(f, command_pid)) + return SYSTEMD_CGROUP_ERR; + + return retval; +} + +static enum cgroups_type cgroups_try_detect_version() +{ + pid_t command_pid; + char buf[MAXSIZE_PROC_CMDLINE]; + enum cgroups_systemd_setting systemd_setting; + int cgroups2_available = 0; + + // 1. check if cgroups2 availible on system at all + FILE *f = mypopen("grep cgroup /proc/filesystems", &command_pid); + if (!f) { + error("popen failed"); + return CGROUPS_AUTODETECT_FAIL; + } + while (fgets(buf, MAXSIZE_PROC_CMDLINE, f) != NULL) { + if (strstr(buf, "cgroup2")) { + cgroups2_available = 1; + break; + } + } + if(mypclose(f, command_pid)) + return CGROUPS_AUTODETECT_FAIL; + + if(!cgroups2_available) + return CGROUPS_V1; + + // 2. check systemd compiletime setting + if ((systemd_setting = cgroups_detect_systemd("systemd --version")) == SYSTEMD_CGROUP_ERR) + systemd_setting = cgroups_detect_systemd(SYSTEMD_CMD_RHEL); + + if(systemd_setting == SYSTEMD_CGROUP_ERR) + return CGROUPS_AUTODETECT_FAIL; + + if(systemd_setting == SYSTEMD_CGROUP_LEGACY || systemd_setting == SYSTEMD_CGROUP_HYBRID) { + // curently we prefer V1 if HYBRID is set as it seems to be more feature complete + // in the future we might want to continue here if SYSTEMD_CGROUP_HYBRID + // and go ahead with V2 + return CGROUPS_V1; + } + + // 3. if we are unified as on Fedora (default cgroups2 only mode) + // check kernel command line flag that can override that setting + f = fopen("/proc/cmdline", "r"); + if (!f) { + error("Error reading kernel boot commandline parameters"); + return CGROUPS_AUTODETECT_FAIL; + } + + if (!fgets(buf, MAXSIZE_PROC_CMDLINE, f)) { + error("couldn't read all cmdline params into buffer"); + fclose(f); + return CGROUPS_AUTODETECT_FAIL; + } + + fclose(f); + + if (strstr(buf, "systemd.unified_cgroup_hierarchy=0")) { + info("cgroups v2 (unified cgroups) is available but are disabled on this system."); + return CGROUPS_V1; + } + return CGROUPS_V2; +} + +void read_cgroup_plugin_configuration() { + system_page_size = sysconf(_SC_PAGESIZE); + + Read_hash = simple_hash("Read"); + Write_hash = simple_hash("Write"); + user_hash = simple_hash("user"); + system_hash = simple_hash("system"); + + cgroup_update_every = (int)config_get_number("plugin:cgroups", "update every", localhost->rrd_update_every); + if(cgroup_update_every < localhost->rrd_update_every) + cgroup_update_every = localhost->rrd_update_every; + + cgroup_check_for_new_every = (int)config_get_number("plugin:cgroups", "check for new cgroups every", (long long)cgroup_check_for_new_every * (long long)cgroup_update_every); + if(cgroup_check_for_new_every < cgroup_update_every) + cgroup_check_for_new_every = cgroup_update_every; + + cgroup_use_unified_cgroups = config_get_boolean_ondemand("plugin:cgroups", "use unified cgroups", CONFIG_BOOLEAN_AUTO); + if(cgroup_use_unified_cgroups == CONFIG_BOOLEAN_AUTO) + cgroup_use_unified_cgroups = (cgroups_try_detect_version() == CGROUPS_V2); + + info("use unified cgroups %s", cgroup_use_unified_cgroups ? "true" : "false"); + + cgroup_containers_chart_priority = (int)config_get_number("plugin:cgroups", "containers priority", cgroup_containers_chart_priority); + if(cgroup_containers_chart_priority < 1) + cgroup_containers_chart_priority = NETDATA_CHART_PRIO_CGROUPS_CONTAINERS; + + cgroup_enable_cpuacct_stat = config_get_boolean_ondemand("plugin:cgroups", "enable cpuacct stat (total CPU)", cgroup_enable_cpuacct_stat); + cgroup_enable_cpuacct_usage = config_get_boolean_ondemand("plugin:cgroups", "enable cpuacct usage (per core CPU)", cgroup_enable_cpuacct_usage); + + cgroup_enable_memory = config_get_boolean_ondemand("plugin:cgroups", "enable memory (used mem including cache)", cgroup_enable_memory); + cgroup_enable_detailed_memory = config_get_boolean_ondemand("plugin:cgroups", "enable detailed memory", cgroup_enable_detailed_memory); + cgroup_enable_memory_failcnt = config_get_boolean_ondemand("plugin:cgroups", "enable memory limits fail count", cgroup_enable_memory_failcnt); + cgroup_enable_swap = config_get_boolean_ondemand("plugin:cgroups", "enable swap memory", cgroup_enable_swap); + + cgroup_enable_blkio_io = config_get_boolean_ondemand("plugin:cgroups", "enable blkio bandwidth", cgroup_enable_blkio_io); + cgroup_enable_blkio_ops = config_get_boolean_ondemand("plugin:cgroups", "enable blkio operations", cgroup_enable_blkio_ops); + cgroup_enable_blkio_throttle_io = config_get_boolean_ondemand("plugin:cgroups", "enable blkio throttle bandwidth", cgroup_enable_blkio_throttle_io); + cgroup_enable_blkio_throttle_ops = config_get_boolean_ondemand("plugin:cgroups", "enable blkio throttle operations", cgroup_enable_blkio_throttle_ops); + cgroup_enable_blkio_queued_ops = config_get_boolean_ondemand("plugin:cgroups", "enable blkio queued operations", cgroup_enable_blkio_queued_ops); + cgroup_enable_blkio_merged_ops = config_get_boolean_ondemand("plugin:cgroups", "enable blkio merged operations", cgroup_enable_blkio_merged_ops); + + cgroup_enable_pressure_cpu = config_get_boolean_ondemand("plugin:cgroups", "enable cpu pressure", cgroup_enable_pressure_cpu); + cgroup_enable_pressure_io_some = config_get_boolean_ondemand("plugin:cgroups", "enable io some pressure", cgroup_enable_pressure_io_some); + cgroup_enable_pressure_io_full = config_get_boolean_ondemand("plugin:cgroups", "enable io full pressure", cgroup_enable_pressure_io_full); + cgroup_enable_pressure_memory_some = config_get_boolean_ondemand("plugin:cgroups", "enable memory some pressure", cgroup_enable_pressure_memory_some); + cgroup_enable_pressure_memory_full = config_get_boolean_ondemand("plugin:cgroups", "enable memory full pressure", cgroup_enable_pressure_memory_full); + + cgroup_recheck_zero_blkio_every_iterations = (int)config_get_number("plugin:cgroups", "recheck zero blkio every iterations", cgroup_recheck_zero_blkio_every_iterations); + cgroup_recheck_zero_mem_failcnt_every_iterations = (int)config_get_number("plugin:cgroups", "recheck zero memory failcnt every iterations", cgroup_recheck_zero_mem_failcnt_every_iterations); + cgroup_recheck_zero_mem_detailed_every_iterations = (int)config_get_number("plugin:cgroups", "recheck zero detailed memory every iterations", cgroup_recheck_zero_mem_detailed_every_iterations); + + cgroup_enable_systemd_services = config_get_boolean("plugin:cgroups", "enable systemd services", cgroup_enable_systemd_services); + cgroup_enable_systemd_services_detailed_memory = config_get_boolean("plugin:cgroups", "enable systemd services detailed memory", cgroup_enable_systemd_services_detailed_memory); + cgroup_used_memory_without_cache = config_get_boolean("plugin:cgroups", "report used memory without cache", cgroup_used_memory_without_cache); + + char filename[FILENAME_MAX + 1], *s; + struct mountinfo *mi, *root = mountinfo_read(0); + if(!cgroup_use_unified_cgroups) { + // cgroup v1 does not have pressure metrics + cgroup_enable_pressure_cpu = + cgroup_enable_pressure_io_some = + cgroup_enable_pressure_io_full = + cgroup_enable_pressure_memory_some = + cgroup_enable_pressure_memory_full = CONFIG_BOOLEAN_NO; + + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup", "cpuacct"); + if(!mi) mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup", "cpuacct"); + if(!mi) { + error("CGROUP: cannot find cpuacct mountinfo. Assuming default: /sys/fs/cgroup/cpuacct"); + s = "/sys/fs/cgroup/cpuacct"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_cpuacct_base = config_get("plugin:cgroups", "path to /sys/fs/cgroup/cpuacct", filename); + + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup", "cpuset"); + if(!mi) mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup", "cpuset"); + if(!mi) { + error("CGROUP: cannot find cpuset mountinfo. Assuming default: /sys/fs/cgroup/cpuset"); + s = "/sys/fs/cgroup/cpuset"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_cpuset_base = config_get("plugin:cgroups", "path to /sys/fs/cgroup/cpuset", filename); + + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup", "blkio"); + if(!mi) mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup", "blkio"); + if(!mi) { + error("CGROUP: cannot find blkio mountinfo. Assuming default: /sys/fs/cgroup/blkio"); + s = "/sys/fs/cgroup/blkio"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_blkio_base = config_get("plugin:cgroups", "path to /sys/fs/cgroup/blkio", filename); + + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup", "memory"); + if(!mi) mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup", "memory"); + if(!mi) { + error("CGROUP: cannot find memory mountinfo. Assuming default: /sys/fs/cgroup/memory"); + s = "/sys/fs/cgroup/memory"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_memory_base = config_get("plugin:cgroups", "path to /sys/fs/cgroup/memory", filename); + + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup", "devices"); + if(!mi) mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup", "devices"); + if(!mi) { + error("CGROUP: cannot find devices mountinfo. Assuming default: /sys/fs/cgroup/devices"); + s = "/sys/fs/cgroup/devices"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_devices_base = config_get("plugin:cgroups", "path to /sys/fs/cgroup/devices", filename); + } + else { + //cgroup_enable_cpuacct_stat = + cgroup_enable_cpuacct_usage = + //cgroup_enable_memory = + //cgroup_enable_detailed_memory = + cgroup_enable_memory_failcnt = + //cgroup_enable_swap = + //cgroup_enable_blkio_io = + //cgroup_enable_blkio_ops = + cgroup_enable_blkio_throttle_io = + cgroup_enable_blkio_throttle_ops = + cgroup_enable_blkio_merged_ops = + cgroup_enable_blkio_queued_ops = CONFIG_BOOLEAN_NO; + cgroup_search_in_devices = 0; + cgroup_enable_systemd_services_detailed_memory = CONFIG_BOOLEAN_NO; + cgroup_used_memory_without_cache = CONFIG_BOOLEAN_NO; //unified cgroups use different values + + //TODO: can there be more than 1 cgroup2 mount point? + mi = mountinfo_find_by_filesystem_super_option(root, "cgroup2", "rw"); //there is no cgroup2 specific super option - for now use 'rw' option + if(mi) debug(D_CGROUP, "found unified cgroup root using super options, with path: '%s'", mi->mount_point); + if(!mi) { + mi = mountinfo_find_by_filesystem_mount_source(root, "cgroup2", "cgroup"); + if(mi) debug(D_CGROUP, "found unified cgroup root using mountsource info, with path: '%s'", mi->mount_point); + } + if(!mi) { + error("CGROUP: cannot find cgroup2 mountinfo. Assuming default: /sys/fs/cgroup"); + s = "/sys/fs/cgroup"; + } + else s = mi->mount_point; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, s); + cgroup_unified_base = config_get("plugin:cgroups", "path to unified cgroups", filename); + debug(D_CGROUP, "using cgroup root: '%s'", cgroup_unified_base); + } + + cgroup_root_max = (int)config_get_number("plugin:cgroups", "max cgroups to allow", cgroup_root_max); + cgroup_max_depth = (int)config_get_number("plugin:cgroups", "max cgroups depth to monitor", cgroup_max_depth); + + cgroup_enable_new_cgroups_detected_at_runtime = config_get_boolean("plugin:cgroups", "enable new cgroups detected at run time", cgroup_enable_new_cgroups_detected_at_runtime); + + enabled_cgroup_patterns = simple_pattern_create( + config_get("plugin:cgroups", "enable by default cgroups matching", + // ---------------------------------------------------------------- + + " !*/init.scope " // ignore init.scope + " !/system.slice/run-*.scope " // ignore system.slice/run-XXXX.scope + " *.scope " // we need all other *.scope for sure + + // ---------------------------------------------------------------- + + " /machine.slice/*.service " // #3367 systemd-nspawn + " /kubepods/pod*/* " // k8s containers + " /kubepods/*/pod*/* " // k8s containers + + // ---------------------------------------------------------------- + + " !/kubepods* " // all other k8s cgroups + " !*/vcpu* " // libvirtd adds these sub-cgroups + " !*/emulator " // libvirtd adds these sub-cgroups + " !*.mount " + " !*.partition " + " !*.service " + " !*.socket " + " !*.slice " + " !*.swap " + " !*.user " + " !/ " + " !/docker " + " !/libvirt " + " !/lxc " + " !/lxc/*/* " // #1397 #2649 + " !/lxc.monitor* " + " !/lxc.pivot " + " !/lxc.payload " + " !/machine " + " !/qemu " + " !/system " + " !/systemd " + " !/user " + " * " // enable anything else + ), NULL, SIMPLE_PATTERN_EXACT); + + enabled_cgroup_paths = simple_pattern_create( + config_get("plugin:cgroups", "search for cgroups in subpaths matching", + " !*/init.scope " // ignore init.scope + " !*-qemu " // #345 + " !*.libvirt-qemu " // #3010 + " !/init.scope " + " !/system " + " !/systemd " + " !/user " + " !/user.slice " + " !/lxc/*/* " // #2161 #2649 + " !/lxc.monitor " + " !/lxc.payload/*/* " + " !/lxc.payload.* " + " * " + ), NULL, SIMPLE_PATTERN_EXACT); + + snprintfz(filename, FILENAME_MAX, "%s/cgroup-name.sh", netdata_configured_primary_plugins_dir); + cgroups_rename_script = config_get("plugin:cgroups", "script to get cgroup names", filename); + + snprintfz(filename, FILENAME_MAX, "%s/cgroup-network", netdata_configured_primary_plugins_dir); + cgroups_network_interface_script = config_get("plugin:cgroups", "script to get cgroup network interfaces", filename); + + enabled_cgroup_renames = simple_pattern_create( + config_get("plugin:cgroups", "run script to rename cgroups matching", + " !/ " + " !*.mount " + " !*.socket " + " !*.partition " + " /machine.slice/*.service " // #3367 systemd-nspawn + " !*.service " + " !*.slice " + " !*.swap " + " !*.user " + " !init.scope " + " !*.scope/vcpu* " // libvirtd adds these sub-cgroups + " !*.scope/emulator " // libvirtd adds these sub-cgroups + " *.scope " + " *docker* " + " *lxc* " + " *qemu* " + " *kubepods* " // #3396 kubernetes + " *.libvirt-qemu " // #3010 + " * " + ), NULL, SIMPLE_PATTERN_EXACT); + + if(cgroup_enable_systemd_services) { + systemd_services_cgroups = simple_pattern_create( + config_get("plugin:cgroups", "cgroups to match as systemd services", + " !/system.slice/*/*.service " + " /system.slice/*.service " + ), NULL, SIMPLE_PATTERN_EXACT); + } + + mountinfo_free_all(root); +} + +// ---------------------------------------------------------------------------- +// cgroup objects + +struct blkio { + int updated; + int enabled; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + int delay_counter; + + char *filename; + + unsigned long long Read; + unsigned long long Write; +/* + unsigned long long Sync; + unsigned long long Async; + unsigned long long Total; +*/ +}; + +// https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt +struct memory { + ARL_BASE *arl_base; + ARL_ENTRY *arl_dirty; + ARL_ENTRY *arl_swap; + + int updated_detailed; + int updated_usage_in_bytes; + int updated_msw_usage_in_bytes; + int updated_failcnt; + + int enabled_detailed; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + int enabled_usage_in_bytes; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + int enabled_msw_usage_in_bytes; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + int enabled_failcnt; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + + int delay_counter_detailed; + int delay_counter_failcnt; + + char *filename_detailed; + char *filename_usage_in_bytes; + char *filename_msw_usage_in_bytes; + char *filename_failcnt; + + int detailed_has_dirty; + int detailed_has_swap; + + // detailed metrics +/* + unsigned long long cache; + unsigned long long rss; + unsigned long long rss_huge; + unsigned long long mapped_file; + unsigned long long writeback; + unsigned long long dirty; + unsigned long long swap; + unsigned long long pgpgin; + unsigned long long pgpgout; + unsigned long long pgfault; + unsigned long long pgmajfault; + unsigned long long inactive_anon; + unsigned long long active_anon; + unsigned long long inactive_file; + unsigned long long active_file; + unsigned long long unevictable; + unsigned long long hierarchical_memory_limit; +*/ + //unified cgroups metrics + unsigned long long anon; + unsigned long long kernel_stack; + unsigned long long slab; + unsigned long long sock; + unsigned long long shmem; + unsigned long long anon_thp; + //unsigned long long file_writeback; + //unsigned long long file_dirty; + //unsigned long long file; + + unsigned long long total_cache; + unsigned long long total_rss; + unsigned long long total_rss_huge; + unsigned long long total_mapped_file; + unsigned long long total_writeback; + unsigned long long total_dirty; + unsigned long long total_swap; + unsigned long long total_pgpgin; + unsigned long long total_pgpgout; + unsigned long long total_pgfault; + unsigned long long total_pgmajfault; +/* + unsigned long long total_inactive_anon; + unsigned long long total_active_anon; + unsigned long long total_inactive_file; + unsigned long long total_active_file; + unsigned long long total_unevictable; +*/ + + // single file metrics + unsigned long long usage_in_bytes; + unsigned long long msw_usage_in_bytes; + unsigned long long failcnt; +}; + +// https://www.kernel.org/doc/Documentation/cgroup-v1/cpuacct.txt +struct cpuacct_stat { + int updated; + int enabled; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + + char *filename; + + unsigned long long user; + unsigned long long system; +}; + +// https://www.kernel.org/doc/Documentation/cgroup-v1/cpuacct.txt +struct cpuacct_usage { + int updated; + int enabled; // CONFIG_BOOLEAN_YES or CONFIG_BOOLEAN_AUTO + + char *filename; + + unsigned int cpus; + unsigned long long *cpu_percpu; +}; + +struct cgroup_network_interface { + const char *host_device; + const char *container_device; + struct cgroup_network_interface *next; +}; + +#define CGROUP_OPTIONS_DISABLED_DUPLICATE 0x00000001 +#define CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE 0x00000002 +#define CGROUP_OPTIONS_IS_UNIFIED 0x00000004 + +// *** WARNING *** The fields are not thread safe. Take care of safe usage. +struct cgroup { + uint32_t options; + + char available; // found in the filesystem + char enabled; // enabled in the config + + char pending_renames; + + char *id; + uint32_t hash; + + char *chart_id; + uint32_t hash_chart; + + char *chart_title; + + struct label *chart_labels; + + struct cpuacct_stat cpuacct_stat; + struct cpuacct_usage cpuacct_usage; + + struct memory memory; + + struct blkio io_service_bytes; // bytes + struct blkio io_serviced; // operations + + struct blkio throttle_io_service_bytes; // bytes + struct blkio throttle_io_serviced; // operations + + struct blkio io_merged; // operations + struct blkio io_queued; // operations + + struct cgroup_network_interface *interfaces; + + struct pressure cpu_pressure; + struct pressure io_pressure; + struct pressure memory_pressure; + + // per cgroup charts + RRDSET *st_cpu; + RRDSET *st_cpu_limit; + RRDSET *st_cpu_per_core; + RRDSET *st_mem; + RRDSET *st_writeback; + RRDSET *st_mem_activity; + RRDSET *st_pgfaults; + RRDSET *st_mem_usage; + RRDSET *st_mem_usage_limit; + RRDSET *st_mem_failcnt; + RRDSET *st_io; + RRDSET *st_serviced_ops; + RRDSET *st_throttle_io; + RRDSET *st_throttle_serviced_ops; + RRDSET *st_queued_ops; + RRDSET *st_merged_ops; + + // per cgroup chart variables + char *filename_cpuset_cpus; + unsigned long long cpuset_cpus; + + char *filename_cpu_cfs_period; + unsigned long long cpu_cfs_period; + + char *filename_cpu_cfs_quota; + unsigned long long cpu_cfs_quota; + + RRDSETVAR *chart_var_cpu_limit; + calculated_number prev_cpu_usage; + + char *filename_memory_limit; + unsigned long long memory_limit; + RRDSETVAR *chart_var_memory_limit; + + char *filename_memoryswap_limit; + unsigned long long memoryswap_limit; + RRDSETVAR *chart_var_memoryswap_limit; + + // services + RRDDIM *rd_cpu; + RRDDIM *rd_mem_usage; + RRDDIM *rd_mem_failcnt; + RRDDIM *rd_swap_usage; + + RRDDIM *rd_mem_detailed_cache; + RRDDIM *rd_mem_detailed_rss; + RRDDIM *rd_mem_detailed_mapped; + RRDDIM *rd_mem_detailed_writeback; + RRDDIM *rd_mem_detailed_pgpgin; + RRDDIM *rd_mem_detailed_pgpgout; + RRDDIM *rd_mem_detailed_pgfault; + RRDDIM *rd_mem_detailed_pgmajfault; + + RRDDIM *rd_io_service_bytes_read; + RRDDIM *rd_io_serviced_read; + RRDDIM *rd_throttle_io_read; + RRDDIM *rd_throttle_io_serviced_read; + RRDDIM *rd_io_queued_read; + RRDDIM *rd_io_merged_read; + + RRDDIM *rd_io_service_bytes_write; + RRDDIM *rd_io_serviced_write; + RRDDIM *rd_throttle_io_write; + RRDDIM *rd_throttle_io_serviced_write; + RRDDIM *rd_io_queued_write; + RRDDIM *rd_io_merged_write; + + struct cgroup *next; + struct cgroup *discovered_next; + +} *cgroup_root = NULL; + +uv_mutex_t cgroup_root_mutex; + +struct cgroup *discovered_cgroup_root = NULL; + +struct discovery_thread { + uv_thread_t thread; + uv_mutex_t mutex; + uv_cond_t cond_var; + int start_discovery; + int exited; +} discovery_thread; + +// ---------------------------------------------------------------------------- +// read values from /sys + +static inline void cgroup_read_cpuacct_stat(struct cpuacct_stat *cp) { + static procfile *ff = NULL; + + if(likely(cp->filename)) { + ff = procfile_reopen(ff, cp->filename, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + cp->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + cp->updated = 0; + cgroups_check = 1; + return; + } + + unsigned long i, lines = procfile_lines(ff); + + if(unlikely(lines < 1)) { + error("CGROUP: file '%s' should have 1+ lines.", cp->filename); + cp->updated = 0; + return; + } + + for(i = 0; i < lines ; i++) { + char *s = procfile_lineword(ff, i, 0); + uint32_t hash = simple_hash(s); + + if(unlikely(hash == user_hash && !strcmp(s, "user"))) + cp->user = str2ull(procfile_lineword(ff, i, 1)); + + else if(unlikely(hash == system_hash && !strcmp(s, "system"))) + cp->system = str2ull(procfile_lineword(ff, i, 1)); + } + + cp->updated = 1; + + if(unlikely(cp->enabled == CONFIG_BOOLEAN_AUTO && + (cp->user || cp->system || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES))) + cp->enabled = CONFIG_BOOLEAN_YES; + } +} + +static inline void cgroup2_read_cpuacct_stat(struct cpuacct_stat *cp) { + static procfile *ff = NULL; + + if(likely(cp->filename)) { + ff = procfile_reopen(ff, cp->filename, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + cp->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + cp->updated = 0; + cgroups_check = 1; + return; + } + + unsigned long lines = procfile_lines(ff); + + if(unlikely(lines < 3)) { + error("CGROUP: file '%s' should have 3+ lines.", cp->filename); + cp->updated = 0; + return; + } + + cp->user = str2ull(procfile_lineword(ff, 1, 1)); + cp->system = str2ull(procfile_lineword(ff, 2, 1)); + + cp->updated = 1; + + if(unlikely(cp->enabled == CONFIG_BOOLEAN_AUTO && + (cp->user || cp->system || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES))) + cp->enabled = CONFIG_BOOLEAN_YES; + } +} + +static inline void cgroup_read_cpuacct_usage(struct cpuacct_usage *ca) { + static procfile *ff = NULL; + + if(likely(ca->filename)) { + ff = procfile_reopen(ff, ca->filename, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + ca->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + ca->updated = 0; + cgroups_check = 1; + return; + } + + if(unlikely(procfile_lines(ff) < 1)) { + error("CGROUP: file '%s' should have 1+ lines but has %zu.", ca->filename, procfile_lines(ff)); + ca->updated = 0; + return; + } + + unsigned long i = procfile_linewords(ff, 0); + if(unlikely(i == 0)) { + ca->updated = 0; + return; + } + + // we may have 1 more CPU reported + while(i > 0) { + char *s = procfile_lineword(ff, 0, i - 1); + if(!*s) i--; + else break; + } + + if(unlikely(i != ca->cpus)) { + freez(ca->cpu_percpu); + ca->cpu_percpu = mallocz(sizeof(unsigned long long) * i); + ca->cpus = (unsigned int)i; + } + + unsigned long long total = 0; + for(i = 0; i < ca->cpus ;i++) { + unsigned long long n = str2ull(procfile_lineword(ff, 0, i)); + ca->cpu_percpu[i] = n; + total += n; + } + + ca->updated = 1; + + if(unlikely(ca->enabled == CONFIG_BOOLEAN_AUTO && + (total || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES))) + ca->enabled = CONFIG_BOOLEAN_YES; + } +} + +static inline void cgroup_read_blkio(struct blkio *io) { + if(unlikely(io->enabled == CONFIG_BOOLEAN_AUTO && io->delay_counter > 0)) { + io->delay_counter--; + return; + } + + if(likely(io->filename)) { + static procfile *ff = NULL; + + ff = procfile_reopen(ff, io->filename, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + io->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + io->updated = 0; + cgroups_check = 1; + return; + } + + unsigned long i, lines = procfile_lines(ff); + + if(unlikely(lines < 1)) { + error("CGROUP: file '%s' should have 1+ lines.", io->filename); + io->updated = 0; + return; + } + + io->Read = 0; + io->Write = 0; +/* + io->Sync = 0; + io->Async = 0; + io->Total = 0; +*/ + + for(i = 0; i < lines ; i++) { + char *s = procfile_lineword(ff, i, 1); + uint32_t hash = simple_hash(s); + + if(unlikely(hash == Read_hash && !strcmp(s, "Read"))) + io->Read += str2ull(procfile_lineword(ff, i, 2)); + + else if(unlikely(hash == Write_hash && !strcmp(s, "Write"))) + io->Write += str2ull(procfile_lineword(ff, i, 2)); + +/* + else if(unlikely(hash == Sync_hash && !strcmp(s, "Sync"))) + io->Sync += str2ull(procfile_lineword(ff, i, 2)); + + else if(unlikely(hash == Async_hash && !strcmp(s, "Async"))) + io->Async += str2ull(procfile_lineword(ff, i, 2)); + + else if(unlikely(hash == Total_hash && !strcmp(s, "Total"))) + io->Total += str2ull(procfile_lineword(ff, i, 2)); +*/ + } + + io->updated = 1; + + if(unlikely(io->enabled == CONFIG_BOOLEAN_AUTO)) { + if(unlikely(io->Read || io->Write || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES)) + io->enabled = CONFIG_BOOLEAN_YES; + else + io->delay_counter = cgroup_recheck_zero_blkio_every_iterations; + } + } +} + +static inline void cgroup2_read_blkio(struct blkio *io, unsigned int word_offset) { + if(unlikely(io->enabled == CONFIG_BOOLEAN_AUTO && io->delay_counter > 0)) { + io->delay_counter--; + return; + } + + if(likely(io->filename)) { + static procfile *ff = NULL; + + ff = procfile_reopen(ff, io->filename, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + io->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + io->updated = 0; + cgroups_check = 1; + return; + } + + unsigned long i, lines = procfile_lines(ff); + + if (unlikely(lines < 1)) { + error("CGROUP: file '%s' should have 1+ lines.", io->filename); + io->updated = 0; + return; + } + + io->Read = 0; + io->Write = 0; + + for (i = 0; i < lines; i++) { + io->Read += str2ull(procfile_lineword(ff, i, 2 + word_offset)); + io->Write += str2ull(procfile_lineword(ff, i, 4 + word_offset)); + } + + io->updated = 1; + + if(unlikely(io->enabled == CONFIG_BOOLEAN_AUTO)) { + if(unlikely(io->Read || io->Write || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES)) + io->enabled = CONFIG_BOOLEAN_YES; + else + io->delay_counter = cgroup_recheck_zero_blkio_every_iterations; + } + } +} + +static inline void cgroup2_read_pressure(struct pressure *res) { + static procfile *ff = NULL; + + if (likely(res->filename)) { + ff = procfile_reopen(ff, res->filename, " =", PROCFILE_FLAG_DEFAULT); + if (unlikely(!ff)) { + res->updated = 0; + cgroups_check = 1; + return; + } + + ff = procfile_readall(ff); + if (unlikely(!ff)) { + res->updated = 0; + cgroups_check = 1; + return; + } + + size_t lines = procfile_lines(ff); + if (lines < 1) { + error("CGROUP: file '%s' should have 1+ lines.", res->filename); + res->updated = 0; + return; + } + + res->some.value10 = strtod(procfile_lineword(ff, 0, 2), NULL); + res->some.value60 = strtod(procfile_lineword(ff, 0, 4), NULL); + res->some.value300 = strtod(procfile_lineword(ff, 0, 6), NULL); + + if (lines > 2) { + res->full.value10 = strtod(procfile_lineword(ff, 1, 2), NULL); + res->full.value60 = strtod(procfile_lineword(ff, 1, 4), NULL); + res->full.value300 = strtod(procfile_lineword(ff, 1, 6), NULL); + } + + res->updated = 1; + + if (unlikely(res->some.enabled == CONFIG_BOOLEAN_AUTO)) { + res->some.enabled = CONFIG_BOOLEAN_YES; + if (lines > 2) { + res->full.enabled = CONFIG_BOOLEAN_YES; + } else { + res->full.enabled = CONFIG_BOOLEAN_NO; + } + } + } +} + +static inline void cgroup_read_memory(struct memory *mem, char parent_cg_is_unified) { + static procfile *ff = NULL; + + // read detailed ram usage + if(likely(mem->filename_detailed)) { + if(unlikely(mem->enabled_detailed == CONFIG_BOOLEAN_AUTO && mem->delay_counter_detailed > 0)) { + mem->delay_counter_detailed--; + goto memory_next; + } + + ff = procfile_reopen(ff, mem->filename_detailed, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + mem->updated_detailed = 0; + cgroups_check = 1; + goto memory_next; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + mem->updated_detailed = 0; + cgroups_check = 1; + goto memory_next; + } + + unsigned long i, lines = procfile_lines(ff); + + if(unlikely(lines < 1)) { + error("CGROUP: file '%s' should have 1+ lines.", mem->filename_detailed); + mem->updated_detailed = 0; + goto memory_next; + } + + + if(unlikely(!mem->arl_base)) { + if(parent_cg_is_unified == 0){ + mem->arl_base = arl_create("cgroup/memory", NULL, 60); + + arl_expect(mem->arl_base, "total_cache", &mem->total_cache); + arl_expect(mem->arl_base, "total_rss", &mem->total_rss); + arl_expect(mem->arl_base, "total_rss_huge", &mem->total_rss_huge); + arl_expect(mem->arl_base, "total_mapped_file", &mem->total_mapped_file); + arl_expect(mem->arl_base, "total_writeback", &mem->total_writeback); + mem->arl_dirty = arl_expect(mem->arl_base, "total_dirty", &mem->total_dirty); + mem->arl_swap = arl_expect(mem->arl_base, "total_swap", &mem->total_swap); + arl_expect(mem->arl_base, "total_pgpgin", &mem->total_pgpgin); + arl_expect(mem->arl_base, "total_pgpgout", &mem->total_pgpgout); + arl_expect(mem->arl_base, "total_pgfault", &mem->total_pgfault); + arl_expect(mem->arl_base, "total_pgmajfault", &mem->total_pgmajfault); + } else { + mem->arl_base = arl_create("cgroup/memory", NULL, 60); + + arl_expect(mem->arl_base, "anon", &mem->anon); + arl_expect(mem->arl_base, "kernel_stack", &mem->kernel_stack); + arl_expect(mem->arl_base, "slab", &mem->slab); + arl_expect(mem->arl_base, "sock", &mem->sock); + arl_expect(mem->arl_base, "anon_thp", &mem->anon_thp); + arl_expect(mem->arl_base, "file", &mem->total_mapped_file); + arl_expect(mem->arl_base, "file_writeback", &mem->total_writeback); + mem->arl_dirty = arl_expect(mem->arl_base, "file_dirty", &mem->total_dirty); + arl_expect(mem->arl_base, "pgfault", &mem->total_pgfault); + arl_expect(mem->arl_base, "pgmajfault", &mem->total_pgmajfault); + } + } + + arl_begin(mem->arl_base); + + for(i = 0; i < lines ; i++) { + if(arl_check(mem->arl_base, + procfile_lineword(ff, i, 0), + procfile_lineword(ff, i, 1))) break; + } + + if(unlikely(mem->arl_dirty->flags & ARL_ENTRY_FLAG_FOUND)) + mem->detailed_has_dirty = 1; + + if(unlikely(parent_cg_is_unified == 0 && mem->arl_swap->flags & ARL_ENTRY_FLAG_FOUND)) + mem->detailed_has_swap = 1; + + // fprintf(stderr, "READ: '%s', cache: %llu, rss: %llu, rss_huge: %llu, mapped_file: %llu, writeback: %llu, dirty: %llu, swap: %llu, pgpgin: %llu, pgpgout: %llu, pgfault: %llu, pgmajfault: %llu, inactive_anon: %llu, active_anon: %llu, inactive_file: %llu, active_file: %llu, unevictable: %llu, hierarchical_memory_limit: %llu, total_cache: %llu, total_rss: %llu, total_rss_huge: %llu, total_mapped_file: %llu, total_writeback: %llu, total_dirty: %llu, total_swap: %llu, total_pgpgin: %llu, total_pgpgout: %llu, total_pgfault: %llu, total_pgmajfault: %llu, total_inactive_anon: %llu, total_active_anon: %llu, total_inactive_file: %llu, total_active_file: %llu, total_unevictable: %llu\n", mem->filename, mem->cache, mem->rss, mem->rss_huge, mem->mapped_file, mem->writeback, mem->dirty, mem->swap, mem->pgpgin, mem->pgpgout, mem->pgfault, mem->pgmajfault, mem->inactive_anon, mem->active_anon, mem->inactive_file, mem->active_file, mem->unevictable, mem->hierarchical_memory_limit, mem->total_cache, mem->total_rss, mem->total_rss_huge, mem->total_mapped_file, mem->total_writeback, mem->total_dirty, mem->total_swap, mem->total_pgpgin, mem->total_pgpgout, mem->total_pgfault, mem->total_pgmajfault, mem->total_inactive_anon, mem->total_active_anon, mem->total_inactive_file, mem->total_active_file, mem->total_unevictable); + + mem->updated_detailed = 1; + + if(unlikely(mem->enabled_detailed == CONFIG_BOOLEAN_AUTO)) { + if(( (!parent_cg_is_unified) && ( mem->total_cache || mem->total_dirty || mem->total_rss || mem->total_rss_huge || mem->total_mapped_file || mem->total_writeback + || mem->total_swap || mem->total_pgpgin || mem->total_pgpgout || mem->total_pgfault || mem->total_pgmajfault)) + || (parent_cg_is_unified && ( mem->anon || mem->total_dirty || mem->kernel_stack || mem->slab || mem->sock || mem->total_writeback + || mem->anon_thp || mem->total_pgfault || mem->total_pgmajfault)) + || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES) + mem->enabled_detailed = CONFIG_BOOLEAN_YES; + else + mem->delay_counter_detailed = cgroup_recheck_zero_mem_detailed_every_iterations; + } + } + +memory_next: + + // read usage_in_bytes + if(likely(mem->filename_usage_in_bytes)) { + mem->updated_usage_in_bytes = !read_single_number_file(mem->filename_usage_in_bytes, &mem->usage_in_bytes); + if(unlikely(mem->updated_usage_in_bytes && mem->enabled_usage_in_bytes == CONFIG_BOOLEAN_AUTO && + (mem->usage_in_bytes || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES))) + mem->enabled_usage_in_bytes = CONFIG_BOOLEAN_YES; + } + + // read msw_usage_in_bytes + if(likely(mem->filename_msw_usage_in_bytes)) { + mem->updated_msw_usage_in_bytes = !read_single_number_file(mem->filename_msw_usage_in_bytes, &mem->msw_usage_in_bytes); + if(unlikely(mem->updated_msw_usage_in_bytes && mem->enabled_msw_usage_in_bytes == CONFIG_BOOLEAN_AUTO && + (mem->msw_usage_in_bytes || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES))) + mem->enabled_msw_usage_in_bytes = CONFIG_BOOLEAN_YES; + } + + // read failcnt + if(likely(mem->filename_failcnt)) { + if(unlikely(mem->enabled_failcnt == CONFIG_BOOLEAN_AUTO && mem->delay_counter_failcnt > 0)) { + mem->updated_failcnt = 0; + mem->delay_counter_failcnt--; + } + else { + mem->updated_failcnt = !read_single_number_file(mem->filename_failcnt, &mem->failcnt); + if(unlikely(mem->updated_failcnt && mem->enabled_failcnt == CONFIG_BOOLEAN_AUTO)) { + if(unlikely(mem->failcnt || netdata_zero_metrics_enabled == CONFIG_BOOLEAN_YES)) + mem->enabled_failcnt = CONFIG_BOOLEAN_YES; + else + mem->delay_counter_failcnt = cgroup_recheck_zero_mem_failcnt_every_iterations; + } + } + } +} + +static inline void cgroup_read(struct cgroup *cg) { + debug(D_CGROUP, "reading metrics for cgroups '%s'", cg->id); + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + cgroup_read_cpuacct_stat(&cg->cpuacct_stat); + cgroup_read_cpuacct_usage(&cg->cpuacct_usage); + cgroup_read_memory(&cg->memory, 0); + cgroup_read_blkio(&cg->io_service_bytes); + cgroup_read_blkio(&cg->io_serviced); + cgroup_read_blkio(&cg->throttle_io_service_bytes); + cgroup_read_blkio(&cg->throttle_io_serviced); + cgroup_read_blkio(&cg->io_merged); + cgroup_read_blkio(&cg->io_queued); + } + else { + //TODO: io_service_bytes and io_serviced use same file merge into 1 function + cgroup2_read_blkio(&cg->io_service_bytes, 0); + cgroup2_read_blkio(&cg->io_serviced, 4); + cgroup2_read_cpuacct_stat(&cg->cpuacct_stat); + cgroup2_read_pressure(&cg->cpu_pressure); + cgroup2_read_pressure(&cg->io_pressure); + cgroup2_read_pressure(&cg->memory_pressure); + cgroup_read_memory(&cg->memory, 1); + } +} + +static inline void read_all_cgroups(struct cgroup *root) { + debug(D_CGROUP, "reading metrics for all cgroups"); + + struct cgroup *cg; + + for(cg = root; cg ; cg = cg->next) + if(cg->enabled && !cg->pending_renames) + cgroup_read(cg); +} + +// ---------------------------------------------------------------------------- +// cgroup network interfaces + +#define CGROUP_NETWORK_INTERFACE_MAX_LINE 2048 +static inline void read_cgroup_network_interfaces(struct cgroup *cg) { + debug(D_CGROUP, "looking for the network interfaces of cgroup '%s' with chart id '%s' and title '%s'", cg->id, cg->chart_id, cg->chart_title); + + pid_t cgroup_pid; + char command[CGROUP_NETWORK_INTERFACE_MAX_LINE + 1]; + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + snprintfz(command, CGROUP_NETWORK_INTERFACE_MAX_LINE, "exec %s --cgroup '%s%s'", cgroups_network_interface_script, cgroup_cpuacct_base, cg->id); + } + else { + snprintfz(command, CGROUP_NETWORK_INTERFACE_MAX_LINE, "exec %s --cgroup '%s%s'", cgroups_network_interface_script, cgroup_unified_base, cg->id); + } + + debug(D_CGROUP, "executing command '%s' for cgroup '%s'", command, cg->id); + FILE *fp = mypopen(command, &cgroup_pid); + if(!fp) { + error("CGROUP: cannot popen(\"%s\", \"r\").", command); + return; + } + + char *s; + char buffer[CGROUP_NETWORK_INTERFACE_MAX_LINE + 1]; + while((s = fgets(buffer, CGROUP_NETWORK_INTERFACE_MAX_LINE, fp))) { + trim(s); + + if(*s && *s != '\n') { + char *t = s; + while(*t && *t != ' ') t++; + if(*t == ' ') { + *t = '\0'; + t++; + } + + if(!*s) { + error("CGROUP: empty host interface returned by script"); + continue; + } + + if(!*t) { + error("CGROUP: empty guest interface returned by script"); + continue; + } + + struct cgroup_network_interface *i = callocz(1, sizeof(struct cgroup_network_interface)); + i->host_device = strdupz(s); + i->container_device = strdupz(t); + i->next = cg->interfaces; + cg->interfaces = i; + + info("CGROUP: cgroup '%s' has network interface '%s' as '%s'", cg->id, i->host_device, i->container_device); + + // register a device rename to proc_net_dev.c + netdev_rename_device_add(i->host_device, i->container_device, cg->chart_id, cg->chart_labels); + } + } + + mypclose(fp, cgroup_pid); + // debug(D_CGROUP, "closed command for cgroup '%s'", cg->id); +} + +static inline void free_cgroup_network_interfaces(struct cgroup *cg) { + while(cg->interfaces) { + struct cgroup_network_interface *i = cg->interfaces; + cg->interfaces = i->next; + + // delete the registration of proc_net_dev rename + netdev_rename_device_del(i->host_device); + + freez((void *)i->host_device); + freez((void *)i->container_device); + freez((void *)i); + } +} + +// ---------------------------------------------------------------------------- +// add/remove/find cgroup objects + +#define CGROUP_CHARTID_LINE_MAX 1024 + +static inline char *cgroup_title_strdupz(const char *s) { + if(!s || !*s) s = "/"; + + if(*s == '/' && s[1] != '\0') s++; + + char *r = strdupz(s); + netdata_fix_chart_name(r); + + return r; +} + +static inline char *cgroup_chart_id_strdupz(const char *s) { + if(!s || !*s) s = "/"; + + if(*s == '/' && s[1] != '\0') s++; + + char *r = strdupz(s); + netdata_fix_chart_id(r); + + return r; +} + +char *parse_k8s_data(struct label **labels, char *data) +{ + char *name = mystrsep(&data, " "); + + if (!data) { + return name; + } + + while (data) { + char *key = mystrsep(&data, "="); + + char *value; + if (data && *data == ',') { + value = ""; + *data++ = '\0'; + } else { + value = mystrsep(&data, ","); + } + value = strip_double_quotes(value, 1); + + if (!key || *key == '\0' || !value || *value == '\0') + continue; + + *labels = add_label_to_list(*labels, key, value, LABEL_SOURCE_KUBERNETES); + } + + return name; +} + +static inline void cgroup_get_chart_name(struct cgroup *cg) { + debug(D_CGROUP, "looking for the name of cgroup '%s' with chart id '%s' and title '%s'", cg->id, cg->chart_id, cg->chart_title); + + pid_t cgroup_pid; + char command[CGROUP_CHARTID_LINE_MAX + 1]; + + snprintfz(command, CGROUP_CHARTID_LINE_MAX, "exec %s '%s'", cgroups_rename_script, cg->chart_id); + + debug(D_CGROUP, "executing command \"%s\" for cgroup '%s'", command, cg->chart_id); + FILE *fp = mypopen(command, &cgroup_pid); + if(fp) { + // debug(D_CGROUP, "reading from command '%s' for cgroup '%s'", command, cg->id); + char buffer[CGROUP_CHARTID_LINE_MAX + 1]; + char *s = fgets(buffer, CGROUP_CHARTID_LINE_MAX, fp); + // debug(D_CGROUP, "closing command for cgroup '%s'", cg->id); + int name_error = mypclose(fp, cgroup_pid); + // debug(D_CGROUP, "closed command for cgroup '%s'", cg->id); + + if(s && *s && *s != '\n') { + debug(D_CGROUP, "cgroup '%s' should be renamed to '%s'", cg->chart_id, s); + + s = trim(s); + if (s) { + if(likely(name_error==0)) + cg->pending_renames = 0; + else if (unlikely(name_error==3)) { + debug(D_CGROUP, "cgroup '%s' disabled based due to rename command output", cg->chart_id); + cg->enabled = 0; + } + + if (likely(cg->pending_renames < 2)) { + char *name = s; + + if (!strncmp(s, "k8s_", 4)) { + free_label_list(cg->chart_labels); + name = parse_k8s_data(&cg->chart_labels, s); + } + + freez(cg->chart_title); + cg->chart_title = cgroup_title_strdupz(name); + + freez(cg->chart_id); + cg->chart_id = cgroup_chart_id_strdupz(name); + cg->hash_chart = simple_hash(cg->chart_id); + } + } + } + } + else + error("CGROUP: cannot popen(\"%s\", \"r\").", command); +} + +static inline struct cgroup *cgroup_add(const char *id) { + if(!id || !*id) id = "/"; + debug(D_CGROUP, "adding to list, cgroup with id '%s'", id); + + if(cgroup_root_count >= cgroup_root_max) { + info("CGROUP: maximum number of cgroups reached (%d). Not adding cgroup '%s'", cgroup_root_count, id); + return NULL; + } + + int def = simple_pattern_matches(enabled_cgroup_patterns, id)?cgroup_enable_new_cgroups_detected_at_runtime:0; + struct cgroup *cg = callocz(1, sizeof(struct cgroup)); + + cg->id = strdupz(id); + cg->hash = simple_hash(cg->id); + + cg->chart_title = cgroup_title_strdupz(id); + + cg->chart_id = cgroup_chart_id_strdupz(id); + cg->hash_chart = simple_hash(cg->chart_id); + + if(cgroup_use_unified_cgroups) cg->options |= CGROUP_OPTIONS_IS_UNIFIED; + + if(!discovered_cgroup_root) + discovered_cgroup_root = cg; + else { + // append it + struct cgroup *e; + for(e = discovered_cgroup_root; e->discovered_next ;e = e->discovered_next) ; + e->discovered_next = cg; + } + + cgroup_root_count++; + + // fix the chart_id and title by calling the external script + if(simple_pattern_matches(enabled_cgroup_renames, cg->id)) { + + cg->pending_renames = 2; + cgroup_get_chart_name(cg); + + debug(D_CGROUP, "cgroup '%s' renamed to '%s' (title: '%s')", cg->id, cg->chart_id, cg->chart_title); + } + else + debug(D_CGROUP, "cgroup '%s' will not be renamed - it matches the list of disabled cgroup renames (will be shown as '%s')", cg->id, cg->chart_id); + + int user_configurable = 1; + + // check if this cgroup should be a systemd service + if(cgroup_enable_systemd_services) { + if(simple_pattern_matches(systemd_services_cgroups, cg->id) || + simple_pattern_matches(systemd_services_cgroups, cg->chart_id)) { + debug(D_CGROUP, "cgroup '%s' with chart id '%s' (title: '%s') matches systemd services cgroups", cg->id, cg->chart_id, cg->chart_title); + + char buffer[CGROUP_CHARTID_LINE_MAX + 1]; + cg->options |= CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE; + + strncpy(buffer, cg->id, CGROUP_CHARTID_LINE_MAX); + char *s = buffer; + + //freez(cg->chart_id); + //cg->chart_id = cgroup_chart_id_strdupz(s); + //cg->hash_chart = simple_hash(cg->chart_id); + + // skip to the last slash + size_t len = strlen(s); + while(len--) if(unlikely(s[len] == '/')) break; + if(len) s = &s[len + 1]; + + // remove extension + len = strlen(s); + while(len--) if(unlikely(s[len] == '.')) break; + if(len) s[len] = '\0'; + + freez(cg->chart_title); + cg->chart_title = cgroup_title_strdupz(s); + + cg->enabled = 1; + user_configurable = 0; + + debug(D_CGROUP, "cgroup '%s' renamed to '%s' (title: '%s')", cg->id, cg->chart_id, cg->chart_title); + } + else + debug(D_CGROUP, "cgroup '%s' with chart id '%s' (title: '%s') does not match systemd services groups", cg->id, cg->chart_id, cg->chart_title); + } + + if(user_configurable) { + // allow the user to enable/disable this individualy + char option[FILENAME_MAX + 1]; + snprintfz(option, FILENAME_MAX, "enable cgroup %s", cg->chart_title); + cg->enabled = (char) config_get_boolean("plugin:cgroups", option, def); + } + + // detect duplicate cgroups + if(cg->enabled) { + struct cgroup *t; + for (t = discovered_cgroup_root; t; t = t->discovered_next) { + if (t != cg && t->enabled && t->hash_chart == cg->hash_chart && !strcmp(t->chart_id, cg->chart_id)) { + // TODO: use it after refactoring if system.slice might be scanned before init.scope/system.slice + // + // if (!strncmp(t->id, "/system.slice/", 14) && !strncmp(cg->id, "/init.scope/system.slice/", 25)) { + // error("CGROUP: chart id '%s' already exists with id '%s' and is enabled. Swapping them by enabling cgroup with id '%s' and disabling cgroup with id '%s'.", + // cg->chart_id, t->id, cg->id, t->id); + // t->enabled = 0; + // t->options |= CGROUP_OPTIONS_DISABLED_DUPLICATE; + // } + // else {} + // + // https://github.com/netdata/netdata/issues/797#issuecomment-241248884 + error("CGROUP: chart id '%s' already exists with id '%s' and is enabled and available. Disabling cgroup with id '%s'.", + cg->chart_id, t->id, cg->id); + cg->enabled = 0; + cg->options |= CGROUP_OPTIONS_DISABLED_DUPLICATE; + + break; + } + } + } + + if(cg->enabled && !cg->pending_renames && !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE)) + read_cgroup_network_interfaces(cg); + + debug(D_CGROUP, "ADDED CGROUP: '%s' with chart id '%s' and title '%s' as %s (default was %s)", cg->id, cg->chart_id, cg->chart_title, (cg->enabled)?"enabled":"disabled", (def)?"enabled":"disabled"); + + return cg; +} + +static inline void free_pressure(struct pressure *res) { + if (res->some.st) rrdset_is_obsolete(res->some.st); + if (res->full.st) rrdset_is_obsolete(res->full.st); + freez(res->filename); +} + +static inline void cgroup_free(struct cgroup *cg) { + debug(D_CGROUP, "Removing cgroup '%s' with chart id '%s' (was %s and %s)", cg->id, cg->chart_id, (cg->enabled)?"enabled":"disabled", (cg->available)?"available":"not available"); + + if(cg->st_cpu) rrdset_is_obsolete(cg->st_cpu); + if(cg->st_cpu_limit) rrdset_is_obsolete(cg->st_cpu_limit); + if(cg->st_cpu_per_core) rrdset_is_obsolete(cg->st_cpu_per_core); + if(cg->st_mem) rrdset_is_obsolete(cg->st_mem); + if(cg->st_writeback) rrdset_is_obsolete(cg->st_writeback); + if(cg->st_mem_activity) rrdset_is_obsolete(cg->st_mem_activity); + if(cg->st_pgfaults) rrdset_is_obsolete(cg->st_pgfaults); + if(cg->st_mem_usage) rrdset_is_obsolete(cg->st_mem_usage); + if(cg->st_mem_usage_limit) rrdset_is_obsolete(cg->st_mem_usage_limit); + if(cg->st_mem_failcnt) rrdset_is_obsolete(cg->st_mem_failcnt); + if(cg->st_io) rrdset_is_obsolete(cg->st_io); + if(cg->st_serviced_ops) rrdset_is_obsolete(cg->st_serviced_ops); + if(cg->st_throttle_io) rrdset_is_obsolete(cg->st_throttle_io); + if(cg->st_throttle_serviced_ops) rrdset_is_obsolete(cg->st_throttle_serviced_ops); + if(cg->st_queued_ops) rrdset_is_obsolete(cg->st_queued_ops); + if(cg->st_merged_ops) rrdset_is_obsolete(cg->st_merged_ops); + + freez(cg->filename_cpuset_cpus); + freez(cg->filename_cpu_cfs_period); + freez(cg->filename_cpu_cfs_quota); + freez(cg->filename_memory_limit); + freez(cg->filename_memoryswap_limit); + + free_cgroup_network_interfaces(cg); + + freez(cg->cpuacct_usage.cpu_percpu); + + freez(cg->cpuacct_stat.filename); + freez(cg->cpuacct_usage.filename); + + arl_free(cg->memory.arl_base); + freez(cg->memory.filename_detailed); + freez(cg->memory.filename_failcnt); + freez(cg->memory.filename_usage_in_bytes); + freez(cg->memory.filename_msw_usage_in_bytes); + + freez(cg->io_service_bytes.filename); + freez(cg->io_serviced.filename); + + freez(cg->throttle_io_service_bytes.filename); + freez(cg->throttle_io_serviced.filename); + + freez(cg->io_merged.filename); + freez(cg->io_queued.filename); + + free_pressure(&cg->cpu_pressure); + free_pressure(&cg->io_pressure); + free_pressure(&cg->memory_pressure); + + freez(cg->id); + freez(cg->chart_id); + freez(cg->chart_title); + + free_label_list(cg->chart_labels); + + freez(cg); + + cgroup_root_count--; +} + +// find if a given cgroup exists +static inline struct cgroup *cgroup_find(const char *id) { + debug(D_CGROUP, "searching for cgroup '%s'", id); + + uint32_t hash = simple_hash(id); + + struct cgroup *cg; + for(cg = discovered_cgroup_root; cg ; cg = cg->discovered_next) { + if(hash == cg->hash && strcmp(id, cg->id) == 0) + break; + } + + debug(D_CGROUP, "cgroup '%s' %s in memory", id, (cg)?"found":"not found"); + return cg; +} + +// ---------------------------------------------------------------------------- +// detect running cgroups + +// callback for find_file_in_subdirs() +static inline void found_subdir_in_dir(const char *dir) { + debug(D_CGROUP, "examining cgroup dir '%s'", dir); + + struct cgroup *cg = cgroup_find(dir); + if(!cg) { + if(*dir && cgroup_max_depth > 0) { + int depth = 0; + const char *s; + + for(s = dir; *s ;s++) + if(unlikely(*s == '/')) + depth++; + + if(depth > cgroup_max_depth) { + info("CGROUP: '%s' is too deep (%d, while max is %d)", dir, depth, cgroup_max_depth); + return; + } + } + // debug(D_CGROUP, "will add dir '%s' as cgroup", dir); + cg = cgroup_add(dir); + } + + if(cg) { + // delay renaming of the cgroup and looking for network interfaces to deal with the docker lag when starting the container + if(unlikely(cg->pending_renames == 1)) { + // fix the chart_id and title by calling the external script + if(simple_pattern_matches(enabled_cgroup_renames, cg->id)) { + + cgroup_get_chart_name(cg); + cg->pending_renames = 0; + + if(cg->enabled && !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE)) + read_cgroup_network_interfaces(cg); + + debug(D_CGROUP, "cgroup '%s' renamed to '%s' (title: '%s')", cg->id, cg->chart_id, cg->chart_title); + } + else + debug(D_CGROUP, "cgroup '%s' will not be renamed - it matches the list of disabled cgroup renames (will be shown as '%s')", cg->id, cg->chart_id); + } + + cg->available = 1; + } +} + +static inline int find_dir_in_subdirs(const char *base, const char *this, void (*callback)(const char *)) { + if(!this) this = base; + debug(D_CGROUP, "searching for directories in '%s' (base '%s')", this?this:"", base); + + size_t dirlen = strlen(this), baselen = strlen(base); + + int ret = -1; + int enabled = -1; + + const char *relative_path = &this[baselen]; + if(!*relative_path) relative_path = "/"; + + DIR *dir = opendir(this); + if(!dir) { + error("CGROUP: cannot read directory '%s'", base); + return ret; + } + ret = 1; + + callback(relative_path); + + struct dirent *de = NULL; + while((de = readdir(dir))) { + if(de->d_type == DT_DIR + && ( + (de->d_name[0] == '.' && de->d_name[1] == '\0') + || (de->d_name[0] == '.' && de->d_name[1] == '.' && de->d_name[2] == '\0') + )) + continue; + + if(de->d_type == DT_DIR) { + if(enabled == -1) { + const char *r = relative_path; + if(*r == '\0') r = "/"; + + // do not decent in directories we are not interested + int def = simple_pattern_matches(enabled_cgroup_paths, r); + + // we check for this option here + // so that the config will not have settings + // for leaf directories + char option[FILENAME_MAX + 1]; + snprintfz(option, FILENAME_MAX, "search for cgroups under %s", r); + option[FILENAME_MAX] = '\0'; + enabled = config_get_boolean("plugin:cgroups", option, def); + } + + if(enabled) { + char *s = mallocz(dirlen + strlen(de->d_name) + 2); + strcpy(s, this); + strcat(s, "/"); + strcat(s, de->d_name); + int ret2 = find_dir_in_subdirs(base, s, callback); + if(ret2 > 0) ret += ret2; + freez(s); + } + } + } + + closedir(dir); + return ret; +} + +static inline void mark_all_cgroups_as_not_available() { + debug(D_CGROUP, "marking all cgroups as not available"); + + struct cgroup *cg; + + // mark all as not available + for(cg = discovered_cgroup_root; cg ; cg = cg->discovered_next) { + cg->available = 0; + } +} + +static inline void update_filenames() +{ + struct cgroup *cg; + struct stat buf; + for(cg = discovered_cgroup_root; cg ; cg = cg->discovered_next) { + // fprintf(stderr, " >>> CGROUP '%s' (%u - %s) with name '%s'\n", cg->id, cg->hash, cg->available?"available":"stopped", cg->name); + + if(unlikely(cg->pending_renames)) + cg->pending_renames--; + + if(unlikely(!cg->available || cg->pending_renames)) + continue; + + debug(D_CGROUP, "checking paths for cgroup '%s'", cg->id); + + // check for newly added cgroups + // and update the filenames they read + char filename[FILENAME_MAX + 1]; + if(!cgroup_use_unified_cgroups) { + if(unlikely(cgroup_enable_cpuacct_stat && !cg->cpuacct_stat.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/cpuacct.stat", cgroup_cpuacct_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->cpuacct_stat.filename = strdupz(filename); + cg->cpuacct_stat.enabled = cgroup_enable_cpuacct_stat; + snprintfz(filename, FILENAME_MAX, "%s%s/cpuset.cpus", cgroup_cpuset_base, cg->id); + cg->filename_cpuset_cpus = strdupz(filename); + snprintfz(filename, FILENAME_MAX, "%s%s/cpu.cfs_period_us", cgroup_cpuacct_base, cg->id); + cg->filename_cpu_cfs_period = strdupz(filename); + snprintfz(filename, FILENAME_MAX, "%s%s/cpu.cfs_quota_us", cgroup_cpuacct_base, cg->id); + cg->filename_cpu_cfs_quota = strdupz(filename); + debug(D_CGROUP, "cpuacct.stat filename for cgroup '%s': '%s'", cg->id, cg->cpuacct_stat.filename); + } + else + debug(D_CGROUP, "cpuacct.stat file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_cpuacct_usage && !cg->cpuacct_usage.filename && !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE))) { + snprintfz(filename, FILENAME_MAX, "%s%s/cpuacct.usage_percpu", cgroup_cpuacct_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->cpuacct_usage.filename = strdupz(filename); + cg->cpuacct_usage.enabled = cgroup_enable_cpuacct_usage; + debug(D_CGROUP, "cpuacct.usage_percpu filename for cgroup '%s': '%s'", cg->id, cg->cpuacct_usage.filename); + } + else + debug(D_CGROUP, "cpuacct.usage_percpu file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely((cgroup_enable_detailed_memory || cgroup_used_memory_without_cache) && !cg->memory.filename_detailed && (cgroup_used_memory_without_cache || cgroup_enable_systemd_services_detailed_memory || !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE)))) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.stat", cgroup_memory_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_detailed = strdupz(filename); + cg->memory.enabled_detailed = (cgroup_enable_detailed_memory == CONFIG_BOOLEAN_YES)?CONFIG_BOOLEAN_YES:CONFIG_BOOLEAN_AUTO; + debug(D_CGROUP, "memory.stat filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_detailed); + } + else + debug(D_CGROUP, "memory.stat file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_memory && !cg->memory.filename_usage_in_bytes)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.usage_in_bytes", cgroup_memory_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_usage_in_bytes = strdupz(filename); + cg->memory.enabled_usage_in_bytes = cgroup_enable_memory; + debug(D_CGROUP, "memory.usage_in_bytes filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_usage_in_bytes); + snprintfz(filename, FILENAME_MAX, "%s%s/memory.limit_in_bytes", cgroup_memory_base, cg->id); + cg->filename_memory_limit = strdupz(filename); + } + else + debug(D_CGROUP, "memory.usage_in_bytes file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_swap && !cg->memory.filename_msw_usage_in_bytes)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.memsw.usage_in_bytes", cgroup_memory_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_msw_usage_in_bytes = strdupz(filename); + cg->memory.enabled_msw_usage_in_bytes = cgroup_enable_swap; + snprintfz(filename, FILENAME_MAX, "%s%s/memory.memsw.limit_in_bytes", cgroup_memory_base, cg->id); + cg->filename_memoryswap_limit = strdupz(filename); + debug(D_CGROUP, "memory.msw_usage_in_bytes filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_msw_usage_in_bytes); + } + else + debug(D_CGROUP, "memory.msw_usage_in_bytes file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_memory_failcnt && !cg->memory.filename_failcnt)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.failcnt", cgroup_memory_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_failcnt = strdupz(filename); + cg->memory.enabled_failcnt = cgroup_enable_memory_failcnt; + debug(D_CGROUP, "memory.failcnt filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_failcnt); + } + else + debug(D_CGROUP, "memory.failcnt file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_io && !cg->io_service_bytes.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.io_service_bytes", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->io_service_bytes.filename = strdupz(filename); + cg->io_service_bytes.enabled = cgroup_enable_blkio_io; + debug(D_CGROUP, "io_service_bytes filename for cgroup '%s': '%s'", cg->id, cg->io_service_bytes.filename); + } + else + debug(D_CGROUP, "io_service_bytes file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_ops && !cg->io_serviced.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.io_serviced", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->io_serviced.filename = strdupz(filename); + cg->io_serviced.enabled = cgroup_enable_blkio_ops; + debug(D_CGROUP, "io_serviced filename for cgroup '%s': '%s'", cg->id, cg->io_serviced.filename); + } + else + debug(D_CGROUP, "io_serviced file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_throttle_io && !cg->throttle_io_service_bytes.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.throttle.io_service_bytes", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->throttle_io_service_bytes.filename = strdupz(filename); + cg->throttle_io_service_bytes.enabled = cgroup_enable_blkio_throttle_io; + debug(D_CGROUP, "throttle_io_service_bytes filename for cgroup '%s': '%s'", cg->id, cg->throttle_io_service_bytes.filename); + } + else + debug(D_CGROUP, "throttle_io_service_bytes file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_throttle_ops && !cg->throttle_io_serviced.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.throttle.io_serviced", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->throttle_io_serviced.filename = strdupz(filename); + cg->throttle_io_serviced.enabled = cgroup_enable_blkio_throttle_ops; + debug(D_CGROUP, "throttle_io_serviced filename for cgroup '%s': '%s'", cg->id, cg->throttle_io_serviced.filename); + } + else + debug(D_CGROUP, "throttle_io_serviced file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_merged_ops && !cg->io_merged.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.io_merged", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->io_merged.filename = strdupz(filename); + cg->io_merged.enabled = cgroup_enable_blkio_merged_ops; + debug(D_CGROUP, "io_merged filename for cgroup '%s': '%s'", cg->id, cg->io_merged.filename); + } + else + debug(D_CGROUP, "io_merged file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_blkio_queued_ops && !cg->io_queued.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/blkio.io_queued", cgroup_blkio_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->io_queued.filename = strdupz(filename); + cg->io_queued.enabled = cgroup_enable_blkio_queued_ops; + debug(D_CGROUP, "io_queued filename for cgroup '%s': '%s'", cg->id, cg->io_queued.filename); + } + else + debug(D_CGROUP, "io_queued file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + } + else if(likely(cgroup_unified_exist)) { + if(unlikely(cgroup_enable_blkio_io && !cg->io_service_bytes.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/io.stat", cgroup_unified_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->io_service_bytes.filename = strdupz(filename); + cg->io_service_bytes.enabled = cgroup_enable_blkio_io; + debug(D_CGROUP, "io.stat filename for unified cgroup '%s': '%s'", cg->id, cg->io_service_bytes.filename); + } else + debug(D_CGROUP, "io.stat file for unified cgroup '%s': '%s' does not exist.", cg->id, filename); + } + if (unlikely(cgroup_enable_blkio_ops && !cg->io_serviced.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/io.stat", cgroup_unified_base, cg->id); + if (likely(stat(filename, &buf) != -1)) { + cg->io_serviced.filename = strdupz(filename); + cg->io_serviced.enabled = cgroup_enable_blkio_ops; + debug(D_CGROUP, "io.stat filename for unified cgroup '%s': '%s'", cg->id, cg->io_service_bytes.filename); + } else + debug(D_CGROUP, "io.stat file for unified cgroup '%s': '%s' does not exist.", cg->id, filename); + } + if(unlikely(cgroup_enable_cpuacct_stat && !cg->cpuacct_stat.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/cpu.stat", cgroup_unified_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->cpuacct_stat.filename = strdupz(filename); + cg->cpuacct_stat.enabled = cgroup_enable_cpuacct_stat; + cg->filename_cpuset_cpus = NULL; + cg->filename_cpu_cfs_period = NULL; + snprintfz(filename, FILENAME_MAX, "%s%s/cpu.max", cgroup_unified_base, cg->id); + cg->filename_cpu_cfs_quota = strdupz(filename); + debug(D_CGROUP, "cpu.stat filename for unified cgroup '%s': '%s'", cg->id, cg->cpuacct_stat.filename); + } + else + debug(D_CGROUP, "cpu.stat file for unified cgroup '%s': '%s' does not exist.", cg->id, filename); + } + if(unlikely((cgroup_enable_detailed_memory || cgroup_used_memory_without_cache) && !cg->memory.filename_detailed && (cgroup_used_memory_without_cache || cgroup_enable_systemd_services_detailed_memory || !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE)))) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.stat", cgroup_unified_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_detailed = strdupz(filename); + cg->memory.enabled_detailed = (cgroup_enable_detailed_memory == CONFIG_BOOLEAN_YES)?CONFIG_BOOLEAN_YES:CONFIG_BOOLEAN_AUTO; + debug(D_CGROUP, "memory.stat filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_detailed); + } + else + debug(D_CGROUP, "memory.stat file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_memory && !cg->memory.filename_usage_in_bytes)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.current", cgroup_unified_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_usage_in_bytes = strdupz(filename); + cg->memory.enabled_usage_in_bytes = cgroup_enable_memory; + debug(D_CGROUP, "memory.current filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_usage_in_bytes); + snprintfz(filename, FILENAME_MAX, "%s%s/memory.max", cgroup_unified_base, cg->id); + cg->filename_memory_limit = strdupz(filename); + } + else + debug(D_CGROUP, "memory.current file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if(unlikely(cgroup_enable_swap && !cg->memory.filename_msw_usage_in_bytes)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.swap.current", cgroup_unified_base, cg->id); + if(likely(stat(filename, &buf) != -1)) { + cg->memory.filename_msw_usage_in_bytes = strdupz(filename); + cg->memory.enabled_msw_usage_in_bytes = cgroup_enable_swap; + snprintfz(filename, FILENAME_MAX, "%s%s/memory.swap.max", cgroup_unified_base, cg->id); + cg->filename_memoryswap_limit = strdupz(filename); + debug(D_CGROUP, "memory.swap.current filename for cgroup '%s': '%s'", cg->id, cg->memory.filename_msw_usage_in_bytes); + } + else + debug(D_CGROUP, "memory.swap file for cgroup '%s': '%s' does not exist.", cg->id, filename); + } + + if (unlikely(cgroup_enable_pressure_cpu && !cg->cpu_pressure.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/cpu.pressure", cgroup_unified_base, cg->id); + if (likely(stat(filename, &buf) != -1)) { + cg->cpu_pressure.filename = strdupz(filename); + cg->cpu_pressure.some.enabled = cgroup_enable_pressure_cpu; + cg->cpu_pressure.full.enabled = CONFIG_BOOLEAN_NO; + debug(D_CGROUP, "cpu.pressure filename for cgroup '%s': '%s'", cg->id, cg->cpu_pressure.filename); + } else { + debug(D_CGROUP, "cpu.pressure file for cgroup '%s': '%s' does not exist", cg->id, filename); + } + } + + if (unlikely((cgroup_enable_pressure_io_some || cgroup_enable_pressure_io_full) && !cg->io_pressure.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/io.pressure", cgroup_unified_base, cg->id); + if (likely(stat(filename, &buf) != -1)) { + cg->io_pressure.filename = strdupz(filename); + cg->io_pressure.some.enabled = cgroup_enable_pressure_io_some; + cg->io_pressure.full.enabled = cgroup_enable_pressure_io_full; + debug(D_CGROUP, "io.pressure filename for cgroup '%s': '%s'", cg->id, cg->io_pressure.filename); + } else { + debug(D_CGROUP, "io.pressure file for cgroup '%s': '%s' does not exist", cg->id, filename); + } + } + + if (unlikely((cgroup_enable_pressure_memory_some || cgroup_enable_pressure_memory_full) && !cg->memory_pressure.filename)) { + snprintfz(filename, FILENAME_MAX, "%s%s/memory.pressure", cgroup_unified_base, cg->id); + if (likely(stat(filename, &buf) != -1)) { + cg->memory_pressure.filename = strdupz(filename); + cg->memory_pressure.some.enabled = cgroup_enable_pressure_memory_some; + cg->memory_pressure.full.enabled = cgroup_enable_pressure_memory_full; + debug(D_CGROUP, "memory.pressure filename for cgroup '%s': '%s'", cg->id, cg->memory_pressure.filename); + } else { + debug(D_CGROUP, "memory.pressure file for cgroup '%s': '%s' does not exist", cg->id, filename); + } + } + } + } +} + +static inline void cleanup_all_cgroups() { + struct cgroup *cg = discovered_cgroup_root, *last = NULL; + + for(; cg ;) { + if(!cg->available) { + // enable the first duplicate cgroup + { + struct cgroup *t; + for(t = discovered_cgroup_root; t ; t = t->discovered_next) { + if(t != cg && t->available && !t->enabled && t->options & CGROUP_OPTIONS_DISABLED_DUPLICATE && t->hash_chart == cg->hash_chart && !strcmp(t->chart_id, cg->chart_id)) { + debug(D_CGROUP, "Enabling duplicate of cgroup '%s' with id '%s', because the original with id '%s' stopped.", t->chart_id, t->id, cg->id); + t->enabled = 1; + t->options &= ~CGROUP_OPTIONS_DISABLED_DUPLICATE; + break; + } + } + } + + if(!last) + discovered_cgroup_root = cg->discovered_next; + else + last->discovered_next = cg->discovered_next; + + cgroup_free(cg); + + if(!last) + cg = discovered_cgroup_root; + else + cg = last->discovered_next; + } + else { + last = cg; + cg = cg->discovered_next; + } + } +} + +static inline void copy_discovered_cgroups() +{ + debug(D_CGROUP, "copy discovered cgroups to the main group list"); + + struct cgroup *cg; + + for(cg = discovered_cgroup_root; cg ; cg = cg->discovered_next) { + cg->next = cg->discovered_next; + } + + cgroup_root = discovered_cgroup_root; +} + +static inline void find_all_cgroups() { + debug(D_CGROUP, "searching for cgroups"); + + mark_all_cgroups_as_not_available(); + if(!cgroup_use_unified_cgroups) { + if(cgroup_enable_cpuacct_stat || cgroup_enable_cpuacct_usage) { + if(find_dir_in_subdirs(cgroup_cpuacct_base, NULL, found_subdir_in_dir) == -1) { + cgroup_enable_cpuacct_stat = + cgroup_enable_cpuacct_usage = CONFIG_BOOLEAN_NO; + error("CGROUP: disabled cpu statistics."); + } + } + + if(cgroup_enable_blkio_io || cgroup_enable_blkio_ops || cgroup_enable_blkio_throttle_io || cgroup_enable_blkio_throttle_ops || cgroup_enable_blkio_merged_ops || cgroup_enable_blkio_queued_ops) { + if(find_dir_in_subdirs(cgroup_blkio_base, NULL, found_subdir_in_dir) == -1) { + cgroup_enable_blkio_io = + cgroup_enable_blkio_ops = + cgroup_enable_blkio_throttle_io = + cgroup_enable_blkio_throttle_ops = + cgroup_enable_blkio_merged_ops = + cgroup_enable_blkio_queued_ops = CONFIG_BOOLEAN_NO; + error("CGROUP: disabled blkio statistics."); + } + } + + if(cgroup_enable_memory || cgroup_enable_detailed_memory || cgroup_enable_swap || cgroup_enable_memory_failcnt) { + if(find_dir_in_subdirs(cgroup_memory_base, NULL, found_subdir_in_dir) == -1) { + cgroup_enable_memory = + cgroup_enable_detailed_memory = + cgroup_enable_swap = + cgroup_enable_memory_failcnt = CONFIG_BOOLEAN_NO; + error("CGROUP: disabled memory statistics."); + } + } + + if(cgroup_search_in_devices) { + if(find_dir_in_subdirs(cgroup_devices_base, NULL, found_subdir_in_dir) == -1) { + cgroup_search_in_devices = 0; + error("CGROUP: disabled devices statistics."); + } + } + } + else { + if (find_dir_in_subdirs(cgroup_unified_base, NULL, found_subdir_in_dir) == -1) { + cgroup_unified_exist = CONFIG_BOOLEAN_NO; + error("CGROUP: disabled unified cgroups statistics."); + } + } + + update_filenames(); + + uv_mutex_lock(&cgroup_root_mutex); + cleanup_all_cgroups(); + copy_discovered_cgroups(); + uv_mutex_unlock(&cgroup_root_mutex); + + debug(D_CGROUP, "done searching for cgroups"); +} + +void cgroup_discovery_worker(void *ptr) +{ + UNUSED(ptr); + + while (!netdata_exit) { + uv_mutex_lock(&discovery_thread.mutex); + while (!discovery_thread.start_discovery) + uv_cond_wait(&discovery_thread.cond_var, &discovery_thread.mutex); + discovery_thread.start_discovery = 0; + uv_mutex_unlock(&discovery_thread.mutex); + + if (unlikely(netdata_exit)) + break; + + find_all_cgroups(); + } + + discovery_thread.exited = 1; +} + +// ---------------------------------------------------------------------------- +// generate charts + +#define CHART_TITLE_MAX 300 + +void update_systemd_services_charts( + int update_every + , int do_cpu + , int do_mem_usage + , int do_mem_detailed + , int do_mem_failcnt + , int do_swap_usage + , int do_io + , int do_io_ops + , int do_throttle_io + , int do_throttle_ops + , int do_queued_ops + , int do_merged_ops +) { + static RRDSET + *st_cpu = NULL, + *st_mem_usage = NULL, + *st_mem_failcnt = NULL, + *st_swap_usage = NULL, + + *st_mem_detailed_cache = NULL, + *st_mem_detailed_rss = NULL, + *st_mem_detailed_mapped = NULL, + *st_mem_detailed_writeback = NULL, + *st_mem_detailed_pgfault = NULL, + *st_mem_detailed_pgmajfault = NULL, + *st_mem_detailed_pgpgin = NULL, + *st_mem_detailed_pgpgout = NULL, + + *st_io_read = NULL, + *st_io_serviced_read = NULL, + *st_throttle_io_read = NULL, + *st_throttle_ops_read = NULL, + *st_queued_ops_read = NULL, + *st_merged_ops_read = NULL, + + *st_io_write = NULL, + *st_io_serviced_write = NULL, + *st_throttle_io_write = NULL, + *st_throttle_ops_write = NULL, + *st_queued_ops_write = NULL, + *st_merged_ops_write = NULL; + + // create the charts + + if(likely(do_cpu)) { + if(unlikely(!st_cpu)) { + char title[CHART_TITLE_MAX + 1]; + snprintfz(title, CHART_TITLE_MAX, "Systemd Services CPU utilization (100%% = 1 core)"); + + st_cpu = rrdset_create_localhost( + "services" + , "cpu" + , NULL + , "cpu" + , "services.cpu" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_cpu); + } + + if(likely(do_mem_usage)) { + if(unlikely(!st_mem_usage)) { + + st_mem_usage = rrdset_create_localhost( + "services" + , "mem_usage" + , NULL + , "mem" + , "services.mem_usage" + , (cgroup_used_memory_without_cache) ? "Systemd Services Used Memory without Cache" + : "Systemd Services Used Memory" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 10 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_usage); + } + + if(likely(do_mem_detailed)) { + if(unlikely(!st_mem_detailed_rss)) { + + st_mem_detailed_rss = rrdset_create_localhost( + "services" + , "mem_rss" + , NULL + , "mem" + , "services.mem_rss" + , "Systemd Services RSS Memory" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 20 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_rss); + + if(unlikely(!st_mem_detailed_mapped)) { + + st_mem_detailed_mapped = rrdset_create_localhost( + "services" + , "mem_mapped" + , NULL + , "mem" + , "services.mem_mapped" + , "Systemd Services Mapped Memory" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 30 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_mapped); + + if(unlikely(!st_mem_detailed_cache)) { + + st_mem_detailed_cache = rrdset_create_localhost( + "services" + , "mem_cache" + , NULL + , "mem" + , "services.mem_cache" + , "Systemd Services Cache Memory" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 40 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_cache); + + if(unlikely(!st_mem_detailed_writeback)) { + + st_mem_detailed_writeback = rrdset_create_localhost( + "services" + , "mem_writeback" + , NULL + , "mem" + , "services.mem_writeback" + , "Systemd Services Writeback Memory" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 50 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_writeback); + + if(unlikely(!st_mem_detailed_pgfault)) { + + st_mem_detailed_pgfault = rrdset_create_localhost( + "services" + , "mem_pgfault" + , NULL + , "mem" + , "services.mem_pgfault" + , "Systemd Services Memory Minor Page Faults" + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 60 + , update_every + , RRDSET_TYPE_STACKED + ); + } + else + rrdset_next(st_mem_detailed_pgfault); + + if(unlikely(!st_mem_detailed_pgmajfault)) { + + st_mem_detailed_pgmajfault = rrdset_create_localhost( + "services" + , "mem_pgmajfault" + , NULL + , "mem" + , "services.mem_pgmajfault" + , "Systemd Services Memory Major Page Faults" + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 70 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_pgmajfault); + + if(unlikely(!st_mem_detailed_pgpgin)) { + + st_mem_detailed_pgpgin = rrdset_create_localhost( + "services" + , "mem_pgpgin" + , NULL + , "mem" + , "services.mem_pgpgin" + , "Systemd Services Memory Charging Activity" + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 80 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_pgpgin); + + if(unlikely(!st_mem_detailed_pgpgout)) { + + st_mem_detailed_pgpgout = rrdset_create_localhost( + "services" + , "mem_pgpgout" + , NULL + , "mem" + , "services.mem_pgpgout" + , "Systemd Services Memory Uncharging Activity" + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 90 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_detailed_pgpgout); + } + + if(likely(do_mem_failcnt)) { + if(unlikely(!st_mem_failcnt)) { + + st_mem_failcnt = rrdset_create_localhost( + "services" + , "mem_failcnt" + , NULL + , "mem" + , "services.mem_failcnt" + , "Systemd Services Memory Limit Failures" + , "failures" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 110 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_mem_failcnt); + } + + if(likely(do_swap_usage)) { + if(unlikely(!st_swap_usage)) { + + st_swap_usage = rrdset_create_localhost( + "services" + , "swap_usage" + , NULL + , "swap" + , "services.swap_usage" + , "Systemd Services Swap Memory Used" + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 100 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_swap_usage); + } + + if(likely(do_io)) { + if(unlikely(!st_io_read)) { + + st_io_read = rrdset_create_localhost( + "services" + , "io_read" + , NULL + , "disk" + , "services.io_read" + , "Systemd Services Disk Read Bandwidth" + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 120 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_io_read); + + if(unlikely(!st_io_write)) { + + st_io_write = rrdset_create_localhost( + "services" + , "io_write" + , NULL + , "disk" + , "services.io_write" + , "Systemd Services Disk Write Bandwidth" + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 130 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_io_write); + } + + if(likely(do_io_ops)) { + if(unlikely(!st_io_serviced_read)) { + + st_io_serviced_read = rrdset_create_localhost( + "services" + , "io_ops_read" + , NULL + , "disk" + , "services.io_ops_read" + , "Systemd Services Disk Read Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 140 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_io_serviced_read); + + if(unlikely(!st_io_serviced_write)) { + + st_io_serviced_write = rrdset_create_localhost( + "services" + , "io_ops_write" + , NULL + , "disk" + , "services.io_ops_write" + , "Systemd Services Disk Write Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 150 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_io_serviced_write); + } + + if(likely(do_throttle_io)) { + if(unlikely(!st_throttle_io_read)) { + + st_throttle_io_read = rrdset_create_localhost( + "services" + , "throttle_io_read" + , NULL + , "disk" + , "services.throttle_io_read" + , "Systemd Services Throttle Disk Read Bandwidth" + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 160 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_throttle_io_read); + + if(unlikely(!st_throttle_io_write)) { + + st_throttle_io_write = rrdset_create_localhost( + "services" + , "throttle_io_write" + , NULL + , "disk" + , "services.throttle_io_write" + , "Systemd Services Throttle Disk Write Bandwidth" + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 170 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_throttle_io_write); + } + + if(likely(do_throttle_ops)) { + if(unlikely(!st_throttle_ops_read)) { + + st_throttle_ops_read = rrdset_create_localhost( + "services" + , "throttle_io_ops_read" + , NULL + , "disk" + , "services.throttle_io_ops_read" + , "Systemd Services Throttle Disk Read Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 180 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_throttle_ops_read); + + if(unlikely(!st_throttle_ops_write)) { + + st_throttle_ops_write = rrdset_create_localhost( + "services" + , "throttle_io_ops_write" + , NULL + , "disk" + , "services.throttle_io_ops_write" + , "Systemd Services Throttle Disk Write Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 190 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_throttle_ops_write); + } + + if(likely(do_queued_ops)) { + if(unlikely(!st_queued_ops_read)) { + + st_queued_ops_read = rrdset_create_localhost( + "services" + , "queued_io_ops_read" + , NULL + , "disk" + , "services.queued_io_ops_read" + , "Systemd Services Queued Disk Read Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 200 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_queued_ops_read); + + if(unlikely(!st_queued_ops_write)) { + + st_queued_ops_write = rrdset_create_localhost( + "services" + , "queued_io_ops_write" + , NULL + , "disk" + , "services.queued_io_ops_write" + , "Systemd Services Queued Disk Write Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 210 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_queued_ops_write); + } + + if(likely(do_merged_ops)) { + if(unlikely(!st_merged_ops_read)) { + + st_merged_ops_read = rrdset_create_localhost( + "services" + , "merged_io_ops_read" + , NULL + , "disk" + , "services.merged_io_ops_read" + , "Systemd Services Merged Disk Read Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 220 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_merged_ops_read); + + if(unlikely(!st_merged_ops_write)) { + + st_merged_ops_write = rrdset_create_localhost( + "services" + , "merged_io_ops_write" + , NULL + , "disk" + , "services.merged_io_ops_write" + , "Systemd Services Merged Disk Write Operations" + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_SYSTEMD_NAME + , NETDATA_CHART_PRIO_CGROUPS_SYSTEMD + 230 + , update_every + , RRDSET_TYPE_STACKED + ); + + } + else + rrdset_next(st_merged_ops_write); + } + + // update the values + struct cgroup *cg; + for(cg = cgroup_root; cg ; cg = cg->next) { + if(unlikely(!cg->enabled || cg->pending_renames || !(cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE))) + continue; + + if(likely(do_cpu && cg->cpuacct_stat.updated)) { + if(unlikely(!cg->rd_cpu)){ + + + if (!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + cg->rd_cpu = rrddim_add(st_cpu, cg->chart_id, cg->chart_title, 100, system_hz, RRD_ALGORITHM_INCREMENTAL); + } else { + cg->rd_cpu = rrddim_add(st_cpu, cg->chart_id, cg->chart_title, 100, 1000000, RRD_ALGORITHM_INCREMENTAL); + } + } + + rrddim_set_by_pointer(st_cpu, cg->rd_cpu, cg->cpuacct_stat.user + cg->cpuacct_stat.system); + } + + if(likely(do_mem_usage && cg->memory.updated_usage_in_bytes)) { + if(unlikely(!cg->rd_mem_usage)) + cg->rd_mem_usage = rrddim_add(st_mem_usage, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_mem_usage, cg->rd_mem_usage, cg->memory.usage_in_bytes - ((cgroup_used_memory_without_cache)?cg->memory.total_cache:0)); + } + + if(likely(do_mem_detailed && cg->memory.updated_detailed)) { + if(unlikely(!cg->rd_mem_detailed_rss)) + cg->rd_mem_detailed_rss = rrddim_add(st_mem_detailed_rss, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_mem_detailed_rss, cg->rd_mem_detailed_rss, cg->memory.total_rss + cg->memory.total_rss_huge); + + if(unlikely(!cg->rd_mem_detailed_mapped)) + cg->rd_mem_detailed_mapped = rrddim_add(st_mem_detailed_mapped, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_mem_detailed_mapped, cg->rd_mem_detailed_mapped, cg->memory.total_mapped_file); + + if(unlikely(!cg->rd_mem_detailed_cache)) + cg->rd_mem_detailed_cache = rrddim_add(st_mem_detailed_cache, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_mem_detailed_cache, cg->rd_mem_detailed_cache, cg->memory.total_cache); + + if(unlikely(!cg->rd_mem_detailed_writeback)) + cg->rd_mem_detailed_writeback = rrddim_add(st_mem_detailed_writeback, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_mem_detailed_writeback, cg->rd_mem_detailed_writeback, cg->memory.total_writeback); + + if(unlikely(!cg->rd_mem_detailed_pgfault)) + cg->rd_mem_detailed_pgfault = rrddim_add(st_mem_detailed_pgfault, cg->chart_id, cg->chart_title, system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_mem_detailed_pgfault, cg->rd_mem_detailed_pgfault, cg->memory.total_pgfault); + + if(unlikely(!cg->rd_mem_detailed_pgmajfault)) + cg->rd_mem_detailed_pgmajfault = rrddim_add(st_mem_detailed_pgmajfault, cg->chart_id, cg->chart_title, system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_mem_detailed_pgmajfault, cg->rd_mem_detailed_pgmajfault, cg->memory.total_pgmajfault); + + if(unlikely(!cg->rd_mem_detailed_pgpgin)) + cg->rd_mem_detailed_pgpgin = rrddim_add(st_mem_detailed_pgpgin, cg->chart_id, cg->chart_title, system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_mem_detailed_pgpgin, cg->rd_mem_detailed_pgpgin, cg->memory.total_pgpgin); + + if(unlikely(!cg->rd_mem_detailed_pgpgout)) + cg->rd_mem_detailed_pgpgout = rrddim_add(st_mem_detailed_pgpgout, cg->chart_id, cg->chart_title, system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_mem_detailed_pgpgout, cg->rd_mem_detailed_pgpgout, cg->memory.total_pgpgout); + } + + if(likely(do_mem_failcnt && cg->memory.updated_failcnt)) { + if(unlikely(!cg->rd_mem_failcnt)) + cg->rd_mem_failcnt = rrddim_add(st_mem_failcnt, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_mem_failcnt, cg->rd_mem_failcnt, cg->memory.failcnt); + } + + if(likely(do_swap_usage && cg->memory.updated_msw_usage_in_bytes)) { + if(unlikely(!cg->rd_swap_usage)) + cg->rd_swap_usage = rrddim_add(st_swap_usage, cg->chart_id, cg->chart_title, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_set_by_pointer(st_swap_usage, cg->rd_swap_usage, cg->memory.msw_usage_in_bytes); + } + + if(likely(do_io && cg->io_service_bytes.updated)) { + if(unlikely(!cg->rd_io_service_bytes_read)) + cg->rd_io_service_bytes_read = rrddim_add(st_io_read, cg->chart_id, cg->chart_title, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_io_read, cg->rd_io_service_bytes_read, cg->io_service_bytes.Read); + + if(unlikely(!cg->rd_io_service_bytes_write)) + cg->rd_io_service_bytes_write = rrddim_add(st_io_write, cg->chart_id, cg->chart_title, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_io_write, cg->rd_io_service_bytes_write, cg->io_service_bytes.Write); + } + + if(likely(do_io_ops && cg->io_serviced.updated)) { + if(unlikely(!cg->rd_io_serviced_read)) + cg->rd_io_serviced_read = rrddim_add(st_io_serviced_read, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_io_serviced_read, cg->rd_io_serviced_read, cg->io_serviced.Read); + + if(unlikely(!cg->rd_io_serviced_write)) + cg->rd_io_serviced_write = rrddim_add(st_io_serviced_write, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_io_serviced_write, cg->rd_io_serviced_write, cg->io_serviced.Write); + } + + if(likely(do_throttle_io && cg->throttle_io_service_bytes.updated)) { + if(unlikely(!cg->rd_throttle_io_read)) + cg->rd_throttle_io_read = rrddim_add(st_throttle_io_read, cg->chart_id, cg->chart_title, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_throttle_io_read, cg->rd_throttle_io_read, cg->throttle_io_service_bytes.Read); + + if(unlikely(!cg->rd_throttle_io_write)) + cg->rd_throttle_io_write = rrddim_add(st_throttle_io_write, cg->chart_id, cg->chart_title, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_throttle_io_write, cg->rd_throttle_io_write, cg->throttle_io_service_bytes.Write); + } + + if(likely(do_throttle_ops && cg->throttle_io_serviced.updated)) { + if(unlikely(!cg->rd_throttle_io_serviced_read)) + cg->rd_throttle_io_serviced_read = rrddim_add(st_throttle_ops_read, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_throttle_ops_read, cg->rd_throttle_io_serviced_read, cg->throttle_io_serviced.Read); + + if(unlikely(!cg->rd_throttle_io_serviced_write)) + cg->rd_throttle_io_serviced_write = rrddim_add(st_throttle_ops_write, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_throttle_ops_write, cg->rd_throttle_io_serviced_write, cg->throttle_io_serviced.Write); + } + + if(likely(do_queued_ops && cg->io_queued.updated)) { + if(unlikely(!cg->rd_io_queued_read)) + cg->rd_io_queued_read = rrddim_add(st_queued_ops_read, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_queued_ops_read, cg->rd_io_queued_read, cg->io_queued.Read); + + if(unlikely(!cg->rd_io_queued_write)) + cg->rd_io_queued_write = rrddim_add(st_queued_ops_write, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_queued_ops_write, cg->rd_io_queued_write, cg->io_queued.Write); + } + + if(likely(do_merged_ops && cg->io_merged.updated)) { + if(unlikely(!cg->rd_io_merged_read)) + cg->rd_io_merged_read = rrddim_add(st_merged_ops_read, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_merged_ops_read, cg->rd_io_merged_read, cg->io_merged.Read); + + if(unlikely(!cg->rd_io_merged_write)) + cg->rd_io_merged_write = rrddim_add(st_merged_ops_write, cg->chart_id, cg->chart_title, 1, 1, RRD_ALGORITHM_INCREMENTAL); + + rrddim_set_by_pointer(st_merged_ops_write, cg->rd_io_merged_write, cg->io_merged.Write); + } + } + + // complete the iteration + if(likely(do_cpu)) + rrdset_done(st_cpu); + + if(likely(do_mem_usage)) + rrdset_done(st_mem_usage); + + if(unlikely(do_mem_detailed)) { + rrdset_done(st_mem_detailed_cache); + rrdset_done(st_mem_detailed_rss); + rrdset_done(st_mem_detailed_mapped); + rrdset_done(st_mem_detailed_writeback); + rrdset_done(st_mem_detailed_pgfault); + rrdset_done(st_mem_detailed_pgmajfault); + rrdset_done(st_mem_detailed_pgpgin); + rrdset_done(st_mem_detailed_pgpgout); + } + + if(likely(do_mem_failcnt)) + rrdset_done(st_mem_failcnt); + + if(likely(do_swap_usage)) + rrdset_done(st_swap_usage); + + if(likely(do_io)) { + rrdset_done(st_io_read); + rrdset_done(st_io_write); + } + + if(likely(do_io_ops)) { + rrdset_done(st_io_serviced_read); + rrdset_done(st_io_serviced_write); + } + + if(likely(do_throttle_io)) { + rrdset_done(st_throttle_io_read); + rrdset_done(st_throttle_io_write); + } + + if(likely(do_throttle_ops)) { + rrdset_done(st_throttle_ops_read); + rrdset_done(st_throttle_ops_write); + } + + if(likely(do_queued_ops)) { + rrdset_done(st_queued_ops_read); + rrdset_done(st_queued_ops_write); + } + + if(likely(do_merged_ops)) { + rrdset_done(st_merged_ops_read); + rrdset_done(st_merged_ops_write); + } +} + +static inline char *cgroup_chart_type(char *buffer, const char *id, size_t len) { + if(buffer[0]) return buffer; + + if(id[0] == '\0' || (id[0] == '/' && id[1] == '\0')) + strncpy(buffer, "cgroup_root", len); + else + snprintfz(buffer, len, "cgroup_%s", id); + + netdata_fix_chart_id(buffer); + return buffer; +} + +static inline unsigned long long cpuset_str2ull(char **s) { + unsigned long long n = 0; + char c; + for(c = **s; c >= '0' && c <= '9' ; c = *(++*s)) { + n *= 10; + n += c - '0'; + } + return n; +} + +static inline void update_cpu_limits(char **filename, unsigned long long *value, struct cgroup *cg) { + if(*filename) { + int ret = -1; + + if(value == &cg->cpuset_cpus) { + static char *buf = NULL; + static size_t buf_size = 0; + + if(!buf) { + buf_size = 100U + 6 * get_system_cpus(); // taken from kernel/cgroup/cpuset.c + buf = mallocz(buf_size + 1); + } + + ret = read_file(*filename, buf, buf_size); + + if(!ret) { + char *s = buf; + unsigned long long ncpus = 0; + + // parse the cpuset string and calculate the number of cpus the cgroup is allowed to use + while(*s) { + unsigned long long n = cpuset_str2ull(&s); + if(*s == ',') { + s++; + ncpus++; + continue; + } + if(*s == '-') { + s++; + unsigned long long m = cpuset_str2ull(&s); + ncpus += m - n + 1; // calculate the number of cpus in the region + } + s++; + } + + if(likely(ncpus)) *value = ncpus; + } + } + else if(value == &cg->cpu_cfs_period) { + ret = read_single_number_file(*filename, value); + } + else if(value == &cg->cpu_cfs_quota) { + ret = read_single_number_file(*filename, value); + } + else ret = -1; + + if(ret) { + error("Cannot refresh cgroup %s cpu limit by reading '%s'. Will not update its limit anymore.", cg->id, *filename); + freez(*filename); + *filename = NULL; + } + } +} + +static inline void update_cpu_limits2(struct cgroup *cg) { + if(cg->filename_cpu_cfs_quota){ + static procfile *ff = NULL; + + ff = procfile_reopen(ff, cg->filename_cpu_cfs_quota, NULL, PROCFILE_FLAG_DEFAULT); + if(unlikely(!ff)) { + goto cpu_limits2_err; + } + + ff = procfile_readall(ff); + if(unlikely(!ff)) { + goto cpu_limits2_err; + } + + unsigned long lines = procfile_lines(ff); + + if (unlikely(lines < 1)) { + error("CGROUP: file '%s' should have 1 lines.", cg->filename_cpu_cfs_quota); + return; + } + + cg->cpu_cfs_period = str2ull(procfile_lineword(ff, 0, 1)); + cg->cpuset_cpus = get_system_cpus(); + + char *s = "max\n\0"; + if(strsame(s, procfile_lineword(ff, 0, 0)) == 0){ + cg->cpu_cfs_quota = cg->cpu_cfs_period * cg->cpuset_cpus; + } else { + cg->cpu_cfs_quota = str2ull(procfile_lineword(ff, 0, 0)); + } + debug(D_CGROUP, "CPU limits values: %llu %llu %llu", cg->cpu_cfs_period, cg->cpuset_cpus, cg->cpu_cfs_quota); + return; + +cpu_limits2_err: + error("Cannot refresh cgroup %s cpu limit by reading '%s'. Will not update its limit anymore.", cg->id, cg->filename_cpu_cfs_quota); + freez(cg->filename_cpu_cfs_quota); + cg->filename_cpu_cfs_quota = NULL; + + } +} + +static inline int update_memory_limits(char **filename, RRDSETVAR **chart_var, unsigned long long *value, const char *chart_var_name, struct cgroup *cg) { + if(*filename) { + if(unlikely(!*chart_var)) { + *chart_var = rrdsetvar_custom_chart_variable_create(cg->st_mem_usage, chart_var_name); + if(!*chart_var) { + error("Cannot create cgroup %s chart variable '%s'. Will not update its limit anymore.", cg->id, chart_var_name); + freez(*filename); + *filename = NULL; + } + } + + if(*filename && *chart_var) { + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + if(read_single_number_file(*filename, value)) { + error("Cannot refresh cgroup %s memory limit by reading '%s'. Will not update its limit anymore.", cg->id, *filename); + freez(*filename); + *filename = NULL; + } + else { + rrdsetvar_custom_chart_variable_set(*chart_var, (calculated_number)(*value / (1024 * 1024))); + return 1; + } + } else { + char buffer[30 + 1]; + int ret = read_file(*filename, buffer, 30); + if(ret) { + error("Cannot refresh cgroup %s memory limit by reading '%s'. Will not update its limit anymore.", cg->id, *filename); + freez(*filename); + *filename = NULL; + return 0; + } + char *s = "max\n\0"; + if(strsame(s, buffer) == 0){ + *value = UINT64_MAX; + rrdsetvar_custom_chart_variable_set(*chart_var, (calculated_number)(*value / (1024 * 1024))); + return 1; + } + *value = str2ull(buffer); + rrdsetvar_custom_chart_variable_set(*chart_var, (calculated_number)(*value / (1024 * 1024))); + return 1; + } + } + } + return 0; +} + +void update_cgroup_charts(int update_every) { + debug(D_CGROUP, "updating cgroups charts"); + + char type[RRD_ID_LENGTH_MAX + 1]; + char title[CHART_TITLE_MAX + 1]; + + int services_do_cpu = 0, + services_do_mem_usage = 0, + services_do_mem_detailed = 0, + services_do_mem_failcnt = 0, + services_do_swap_usage = 0, + services_do_io = 0, + services_do_io_ops = 0, + services_do_throttle_io = 0, + services_do_throttle_ops = 0, + services_do_queued_ops = 0, + services_do_merged_ops = 0; + + struct cgroup *cg; + for(cg = cgroup_root; cg ; cg = cg->next) { + if(unlikely(!cg->enabled || cg->pending_renames)) + continue; + + if(likely(cgroup_enable_systemd_services && cg->options & CGROUP_OPTIONS_SYSTEM_SLICE_SERVICE)) { + if(cg->cpuacct_stat.updated && cg->cpuacct_stat.enabled == CONFIG_BOOLEAN_YES) services_do_cpu++; + + if(cgroup_enable_systemd_services_detailed_memory && cg->memory.updated_detailed && cg->memory.enabled_detailed) services_do_mem_detailed++; + if(cg->memory.updated_usage_in_bytes && cg->memory.enabled_usage_in_bytes == CONFIG_BOOLEAN_YES) services_do_mem_usage++; + if(cg->memory.updated_failcnt && cg->memory.enabled_failcnt == CONFIG_BOOLEAN_YES) services_do_mem_failcnt++; + if(cg->memory.updated_msw_usage_in_bytes && cg->memory.enabled_msw_usage_in_bytes == CONFIG_BOOLEAN_YES) services_do_swap_usage++; + + if(cg->io_service_bytes.updated && cg->io_service_bytes.enabled == CONFIG_BOOLEAN_YES) services_do_io++; + if(cg->io_serviced.updated && cg->io_serviced.enabled == CONFIG_BOOLEAN_YES) services_do_io_ops++; + if(cg->throttle_io_service_bytes.updated && cg->throttle_io_service_bytes.enabled == CONFIG_BOOLEAN_YES) services_do_throttle_io++; + if(cg->throttle_io_serviced.updated && cg->throttle_io_serviced.enabled == CONFIG_BOOLEAN_YES) services_do_throttle_ops++; + if(cg->io_queued.updated && cg->io_queued.enabled == CONFIG_BOOLEAN_YES) services_do_queued_ops++; + if(cg->io_merged.updated && cg->io_merged.enabled == CONFIG_BOOLEAN_YES) services_do_merged_ops++; + continue; + } + + type[0] = '\0'; + + if(likely(cg->cpuacct_stat.updated && cg->cpuacct_stat.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_cpu)) { + snprintfz(title, CHART_TITLE_MAX, "CPU Usage (100%% = 1 core)"); + + cg->st_cpu = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "cpu" + , NULL + , "cpu" + , "cgroup.cpu" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + , update_every + , RRDSET_TYPE_STACKED + ); + + rrdset_update_labels(cg->st_cpu, cg->chart_labels); + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + rrddim_add(cg->st_cpu, "user", NULL, 100, system_hz, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_cpu, "system", NULL, 100, system_hz, RRD_ALGORITHM_INCREMENTAL); + } + else { + rrddim_add(cg->st_cpu, "user", NULL, 100, 1000000, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_cpu, "system", NULL, 100, 1000000, RRD_ALGORITHM_INCREMENTAL); + } + } + else + rrdset_next(cg->st_cpu); + + rrddim_set(cg->st_cpu, "user", cg->cpuacct_stat.user); + rrddim_set(cg->st_cpu, "system", cg->cpuacct_stat.system); + rrdset_done(cg->st_cpu); + + if(likely(cg->filename_cpuset_cpus || cg->filename_cpu_cfs_period || cg->filename_cpu_cfs_quota)) { + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + update_cpu_limits(&cg->filename_cpuset_cpus, &cg->cpuset_cpus, cg); + update_cpu_limits(&cg->filename_cpu_cfs_period, &cg->cpu_cfs_period, cg); + update_cpu_limits(&cg->filename_cpu_cfs_quota, &cg->cpu_cfs_quota, cg); + } else { + update_cpu_limits2(cg); + } + + if(unlikely(!cg->chart_var_cpu_limit)) { + cg->chart_var_cpu_limit = rrdsetvar_custom_chart_variable_create(cg->st_cpu, "cpu_limit"); + if(!cg->chart_var_cpu_limit) { + error("Cannot create cgroup %s chart variable 'cpu_limit'. Will not update its limit anymore.", cg->id); + if(cg->filename_cpuset_cpus) freez(cg->filename_cpuset_cpus); + cg->filename_cpuset_cpus = NULL; + if(cg->filename_cpu_cfs_period) freez(cg->filename_cpu_cfs_period); + cg->filename_cpu_cfs_period = NULL; + if(cg->filename_cpu_cfs_quota) freez(cg->filename_cpu_cfs_quota); + cg->filename_cpu_cfs_quota = NULL; + } + } + else { + calculated_number value = 0, quota = 0; + + if(likely( ((!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) && (cg->filename_cpuset_cpus || (cg->filename_cpu_cfs_period && cg->filename_cpu_cfs_quota))) + || ((cg->options & CGROUP_OPTIONS_IS_UNIFIED) && cg->filename_cpu_cfs_quota))) { + if(unlikely(cg->cpu_cfs_quota > 0)) + quota = (calculated_number)cg->cpu_cfs_quota / (calculated_number)cg->cpu_cfs_period; + + if(unlikely(quota > 0 && quota < cg->cpuset_cpus)) + value = quota * 100; + else + value = (calculated_number)cg->cpuset_cpus * 100; + } + if(likely(value)) { + rrdsetvar_custom_chart_variable_set(cg->chart_var_cpu_limit, value); + + if(unlikely(!cg->st_cpu_limit)) { + snprintfz(title, CHART_TITLE_MAX, "CPU Usage within the limits"); + + cg->st_cpu_limit = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "cpu_limit" + , NULL + , "cpu" + , "cgroup.cpu_limit" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority - 1 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_cpu_limit, cg->chart_labels); + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) + rrddim_add(cg->st_cpu_limit, "used", NULL, 1, system_hz, RRD_ALGORITHM_ABSOLUTE); + else + rrddim_add(cg->st_cpu_limit, "used", NULL, 1, 1000000, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(cg->st_cpu_limit); + + calculated_number cpu_usage = 0; + cpu_usage = (calculated_number)(cg->cpuacct_stat.user + cg->cpuacct_stat.system) * 100; + calculated_number cpu_used = 100 * (cpu_usage - cg->prev_cpu_usage) / (value * update_every); + + rrdset_isnot_obsolete(cg->st_cpu_limit); + + rrddim_set(cg->st_cpu_limit, "used", (cpu_used > 0)?cpu_used:0); + + cg->prev_cpu_usage = cpu_usage; + + rrdset_done(cg->st_cpu_limit); + } + else { + rrdsetvar_custom_chart_variable_set(cg->chart_var_cpu_limit, NAN); + if(unlikely(cg->st_cpu_limit)) { + rrdset_is_obsolete(cg->st_cpu_limit); + cg->st_cpu_limit = NULL; + } + } + } + } + } + + if(likely(cg->cpuacct_usage.updated && cg->cpuacct_usage.enabled == CONFIG_BOOLEAN_YES)) { + char id[RRD_ID_LENGTH_MAX + 1]; + unsigned int i; + + if(unlikely(!cg->st_cpu_per_core)) { + snprintfz(title, CHART_TITLE_MAX, "CPU Usage (100%% = 1 core) Per Core"); + + cg->st_cpu_per_core = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "cpu_per_core" + , NULL + , "cpu" + , "cgroup.cpu_per_core" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 100 + , update_every + , RRDSET_TYPE_STACKED + ); + + rrdset_update_labels(cg->st_cpu_per_core, cg->chart_labels); + + for(i = 0; i < cg->cpuacct_usage.cpus; i++) { + snprintfz(id, RRD_ID_LENGTH_MAX, "cpu%u", i); + rrddim_add(cg->st_cpu_per_core, id, NULL, 100, 1000000000, RRD_ALGORITHM_INCREMENTAL); + } + } + else + rrdset_next(cg->st_cpu_per_core); + + for(i = 0; i < cg->cpuacct_usage.cpus ;i++) { + snprintfz(id, RRD_ID_LENGTH_MAX, "cpu%u", i); + rrddim_set(cg->st_cpu_per_core, id, cg->cpuacct_usage.cpu_percpu[i]); + } + rrdset_done(cg->st_cpu_per_core); + } + + if(likely(cg->memory.updated_detailed && cg->memory.enabled_detailed == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_mem)) { + snprintfz(title, CHART_TITLE_MAX, "Memory Usage"); + + cg->st_mem = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem" + , NULL + , "mem" + , "cgroup.mem" + , title + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 210 + , update_every + , RRDSET_TYPE_STACKED + ); + + rrdset_update_labels(cg->st_mem, cg->chart_labels); + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + rrddim_add(cg->st_mem, "cache", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "rss", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + if(cg->memory.detailed_has_swap) + rrddim_add(cg->st_mem, "swap", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_add(cg->st_mem, "rss_huge", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "mapped_file", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + } else { + rrddim_add(cg->st_mem, "anon", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "kernel_stack", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "slab", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "sock", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "anon_thp", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem, "file", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + } + } + else + rrdset_next(cg->st_mem); + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + rrddim_set(cg->st_mem, "cache", cg->memory.total_cache); + rrddim_set(cg->st_mem, "rss", (cg->memory.total_rss > cg->memory.total_rss_huge)?(cg->memory.total_rss - cg->memory.total_rss_huge):0); + + if(cg->memory.detailed_has_swap) + rrddim_set(cg->st_mem, "swap", cg->memory.total_swap); + + rrddim_set(cg->st_mem, "rss_huge", cg->memory.total_rss_huge); + rrddim_set(cg->st_mem, "mapped_file", cg->memory.total_mapped_file); + } else { + rrddim_set(cg->st_mem, "anon", cg->memory.anon); + rrddim_set(cg->st_mem, "kernel_stack", cg->memory.kernel_stack); + rrddim_set(cg->st_mem, "slab", cg->memory.slab); + rrddim_set(cg->st_mem, "sock", cg->memory.sock); + rrddim_set(cg->st_mem, "anon_thp", cg->memory.anon_thp); + rrddim_set(cg->st_mem, "file", cg->memory.total_mapped_file); + } + rrdset_done(cg->st_mem); + + if(unlikely(!cg->st_writeback)) { + snprintfz(title, CHART_TITLE_MAX, "Writeback Memory"); + + cg->st_writeback = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "writeback" + , NULL + , "mem" + , "cgroup.writeback" + , title + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 300 + , update_every + , RRDSET_TYPE_AREA + ); + + rrdset_update_labels(cg->st_writeback, cg->chart_labels); + + if(cg->memory.detailed_has_dirty) + rrddim_add(cg->st_writeback, "dirty", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + + rrddim_add(cg->st_writeback, "writeback", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(cg->st_writeback); + + if(cg->memory.detailed_has_dirty) + rrddim_set(cg->st_writeback, "dirty", cg->memory.total_dirty); + + rrddim_set(cg->st_writeback, "writeback", cg->memory.total_writeback); + rrdset_done(cg->st_writeback); + + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + if(unlikely(!cg->st_mem_activity)) { + snprintfz(title, CHART_TITLE_MAX, "Memory Activity"); + + cg->st_mem_activity = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_activity" + , NULL + , "mem" + , "cgroup.mem_activity" + , title + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 400 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_mem_activity, cg->chart_labels); + + rrddim_add(cg->st_mem_activity, "pgpgin", "in", system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_mem_activity, "pgpgout", "out", -system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_mem_activity); + + rrddim_set(cg->st_mem_activity, "pgpgin", cg->memory.total_pgpgin); + rrddim_set(cg->st_mem_activity, "pgpgout", cg->memory.total_pgpgout); + rrdset_done(cg->st_mem_activity); + } + + if(unlikely(!cg->st_pgfaults)) { + snprintfz(title, CHART_TITLE_MAX, "Memory Page Faults"); + + cg->st_pgfaults = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "pgfaults" + , NULL + , "mem" + , "cgroup.pgfaults" + , title + , "MiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 500 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_pgfaults, cg->chart_labels); + + rrddim_add(cg->st_pgfaults, "pgfault", NULL, system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_pgfaults, "pgmajfault", "swap", -system_page_size, 1024 * 1024, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_pgfaults); + + rrddim_set(cg->st_pgfaults, "pgfault", cg->memory.total_pgfault); + rrddim_set(cg->st_pgfaults, "pgmajfault", cg->memory.total_pgmajfault); + rrdset_done(cg->st_pgfaults); + } + + if(likely(cg->memory.updated_usage_in_bytes && cg->memory.enabled_usage_in_bytes == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_mem_usage)) { + snprintfz(title, CHART_TITLE_MAX, "Used Memory %s", (cgroup_used_memory_without_cache && cg->memory.updated_detailed)?"without Cache ":""); + + cg->st_mem_usage = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_usage" + , NULL + , "mem" + , "cgroup.mem_usage" + , title + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 200 + , update_every + , RRDSET_TYPE_STACKED + ); + + rrdset_update_labels(cg->st_mem_usage, cg->chart_labels); + + rrddim_add(cg->st_mem_usage, "ram", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem_usage, "swap", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(cg->st_mem_usage); + + rrddim_set(cg->st_mem_usage, "ram", cg->memory.usage_in_bytes - ((cgroup_used_memory_without_cache)?cg->memory.total_cache:0)); + if(!(cg->options & CGROUP_OPTIONS_IS_UNIFIED)) { + rrddim_set(cg->st_mem_usage, "swap", (cg->memory.msw_usage_in_bytes > cg->memory.usage_in_bytes)?cg->memory.msw_usage_in_bytes - cg->memory.usage_in_bytes:0); + } else { + rrddim_set(cg->st_mem_usage, "swap", cg->memory.msw_usage_in_bytes); + } + rrdset_done(cg->st_mem_usage); + + if (likely(update_memory_limits(&cg->filename_memory_limit, &cg->chart_var_memory_limit, &cg->memory_limit, "memory_limit", cg))) { + static unsigned long long ram_total = 0; + + if(unlikely(!ram_total)) { + procfile *ff = NULL; + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s%s", netdata_configured_host_prefix, "/proc/meminfo"); + ff = procfile_open(config_get("plugin:cgroups", "meminfo filename to monitor", filename), " \t:", PROCFILE_FLAG_DEFAULT); + + if(likely(ff)) + ff = procfile_readall(ff); + if(likely(ff && procfile_lines(ff) && !strncmp(procfile_word(ff, 0), "MemTotal", 8))) + ram_total = str2ull(procfile_word(ff, 1)) * 1024; + else { + error("Cannot read file %s. Will not update cgroup %s RAM limit anymore.", filename, cg->id); + freez(cg->filename_memory_limit); + cg->filename_memory_limit = NULL; + } + + procfile_close(ff); + } + + if(likely(ram_total)) { + unsigned long long memory_limit = ram_total; + + if(unlikely(cg->memory_limit < ram_total)) + memory_limit = cg->memory_limit; + + if(unlikely(!cg->st_mem_usage_limit)) { + snprintfz(title, CHART_TITLE_MAX, "Used RAM without Cache within the limits"); + + cg->st_mem_usage_limit = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_usage_limit" + , NULL + , "mem" + , "cgroup.mem_usage_limit" + , title + , "MiB" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 199 + , update_every + , RRDSET_TYPE_STACKED + ); + + rrdset_update_labels(cg->st_mem_usage_limit, cg->chart_labels); + + rrddim_add(cg->st_mem_usage_limit, "available", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_mem_usage_limit, "used", NULL, 1, 1024 * 1024, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(cg->st_mem_usage_limit); + + rrdset_isnot_obsolete(cg->st_mem_usage_limit); + + rrddim_set(cg->st_mem_usage_limit, "available", memory_limit - (cg->memory.usage_in_bytes - ((cgroup_used_memory_without_cache)?cg->memory.total_cache:0))); + rrddim_set(cg->st_mem_usage_limit, "used", cg->memory.usage_in_bytes - ((cgroup_used_memory_without_cache)?cg->memory.total_cache:0)); + rrdset_done(cg->st_mem_usage_limit); + } + } + else { + if(unlikely(cg->st_mem_usage_limit)) { + rrdset_is_obsolete(cg->st_mem_usage_limit); + cg->st_mem_usage_limit = NULL; + } + } + + update_memory_limits(&cg->filename_memoryswap_limit, &cg->chart_var_memoryswap_limit, &cg->memoryswap_limit, "memory_and_swap_limit", cg); + } + + if(likely(cg->memory.updated_failcnt && cg->memory.enabled_failcnt == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_mem_failcnt)) { + snprintfz(title, CHART_TITLE_MAX, "Memory Limit Failures"); + + cg->st_mem_failcnt = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_failcnt" + , NULL + , "mem" + , "cgroup.mem_failcnt" + , title + , "count" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 250 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_mem_failcnt, cg->chart_labels); + + rrddim_add(cg->st_mem_failcnt, "failures", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_mem_failcnt); + + rrddim_set(cg->st_mem_failcnt, "failures", cg->memory.failcnt); + rrdset_done(cg->st_mem_failcnt); + } + + if(likely(cg->io_service_bytes.updated && cg->io_service_bytes.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_io)) { + snprintfz(title, CHART_TITLE_MAX, "I/O Bandwidth (all disks)"); + + cg->st_io = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "io" + , NULL + , "disk" + , "cgroup.io" + , title + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 1200 + , update_every + , RRDSET_TYPE_AREA + ); + + rrdset_update_labels(cg->st_io, cg->chart_labels); + + rrddim_add(cg->st_io, "read", NULL, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_io, "write", NULL, -1, 1024, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_io); + + rrddim_set(cg->st_io, "read", cg->io_service_bytes.Read); + rrddim_set(cg->st_io, "write", cg->io_service_bytes.Write); + rrdset_done(cg->st_io); + } + + if(likely(cg->io_serviced.updated && cg->io_serviced.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_serviced_ops)) { + snprintfz(title, CHART_TITLE_MAX, "Serviced I/O Operations (all disks)"); + + cg->st_serviced_ops = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "serviced_ops" + , NULL + , "disk" + , "cgroup.serviced_ops" + , title + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 1200 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_serviced_ops, cg->chart_labels); + + rrddim_add(cg->st_serviced_ops, "read", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_serviced_ops, "write", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_serviced_ops); + + rrddim_set(cg->st_serviced_ops, "read", cg->io_serviced.Read); + rrddim_set(cg->st_serviced_ops, "write", cg->io_serviced.Write); + rrdset_done(cg->st_serviced_ops); + } + + if(likely(cg->throttle_io_service_bytes.updated && cg->throttle_io_service_bytes.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_throttle_io)) { + snprintfz(title, CHART_TITLE_MAX, "Throttle I/O Bandwidth (all disks)"); + + cg->st_throttle_io = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "throttle_io" + , NULL + , "disk" + , "cgroup.throttle_io" + , title + , "KiB/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 1200 + , update_every + , RRDSET_TYPE_AREA + ); + + rrdset_update_labels(cg->st_throttle_io, cg->chart_labels); + + rrddim_add(cg->st_throttle_io, "read", NULL, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_throttle_io, "write", NULL, -1, 1024, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_throttle_io); + + rrddim_set(cg->st_throttle_io, "read", cg->throttle_io_service_bytes.Read); + rrddim_set(cg->st_throttle_io, "write", cg->throttle_io_service_bytes.Write); + rrdset_done(cg->st_throttle_io); + } + + if(likely(cg->throttle_io_serviced.updated && cg->throttle_io_serviced.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_throttle_serviced_ops)) { + snprintfz(title, CHART_TITLE_MAX, "Throttle Serviced I/O Operations (all disks)"); + + cg->st_throttle_serviced_ops = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "throttle_serviced_ops" + , NULL + , "disk" + , "cgroup.throttle_serviced_ops" + , title + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 1200 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_throttle_serviced_ops, cg->chart_labels); + + rrddim_add(cg->st_throttle_serviced_ops, "read", NULL, 1, 1, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_throttle_serviced_ops, "write", NULL, -1, 1, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_throttle_serviced_ops); + + rrddim_set(cg->st_throttle_serviced_ops, "read", cg->throttle_io_serviced.Read); + rrddim_set(cg->st_throttle_serviced_ops, "write", cg->throttle_io_serviced.Write); + rrdset_done(cg->st_throttle_serviced_ops); + } + + if(likely(cg->io_queued.updated && cg->io_queued.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_queued_ops)) { + snprintfz(title, CHART_TITLE_MAX, "Queued I/O Operations (all disks)"); + + cg->st_queued_ops = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "queued_ops" + , NULL + , "disk" + , "cgroup.queued_ops" + , title + , "operations" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2000 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_queued_ops, cg->chart_labels); + + rrddim_add(cg->st_queued_ops, "read", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); + rrddim_add(cg->st_queued_ops, "write", NULL, -1, 1, RRD_ALGORITHM_ABSOLUTE); + } + else + rrdset_next(cg->st_queued_ops); + + rrddim_set(cg->st_queued_ops, "read", cg->io_queued.Read); + rrddim_set(cg->st_queued_ops, "write", cg->io_queued.Write); + rrdset_done(cg->st_queued_ops); + } + + if(likely(cg->io_merged.updated && cg->io_merged.enabled == CONFIG_BOOLEAN_YES)) { + if(unlikely(!cg->st_merged_ops)) { + snprintfz(title, CHART_TITLE_MAX, "Merged I/O Operations (all disks)"); + + cg->st_merged_ops = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "merged_ops" + , NULL + , "disk" + , "cgroup.merged_ops" + , title + , "operations/s" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2100 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(cg->st_merged_ops, cg->chart_labels); + + rrddim_add(cg->st_merged_ops, "read", NULL, 1, 1024, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(cg->st_merged_ops, "write", NULL, -1, 1024, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(cg->st_merged_ops); + + rrddim_set(cg->st_merged_ops, "read", cg->io_merged.Read); + rrddim_set(cg->st_merged_ops, "write", cg->io_merged.Write); + rrdset_done(cg->st_merged_ops); + } + + if (cg->options & CGROUP_OPTIONS_IS_UNIFIED) { + struct pressure *res = &cg->cpu_pressure; + if (likely(res->updated && res->some.enabled)) { + if (unlikely(!res->some.st)) { + RRDSET *chart; + snprintfz(title, CHART_TITLE_MAX, "CPU pressure"); + + chart = res->some.st = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "cpu_pressure" + , NULL + , "cpu" + , "cgroup.cpu_pressure" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2200 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(chart = res->some.st, cg->chart_labels); + + res->some.rd10 = rrddim_add(chart, "some 10", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd60 = rrddim_add(chart, "some 60", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd300 = rrddim_add(chart, "some 300", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + } else { + rrdset_next(res->some.st); + } + + update_pressure_chart(&res->some); + } + + res = &cg->memory_pressure; + if (likely(res->updated && res->some.enabled)) { + if (unlikely(!res->some.st)) { + RRDSET *chart; + snprintfz(title, CHART_TITLE_MAX, "Memory pressure"); + + chart = res->some.st = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_pressure" + , NULL + , "mem" + , "cgroup.memory_pressure" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2300 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(chart = res->some.st, cg->chart_labels); + + res->some.rd10 = rrddim_add(chart, "some 10", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd60 = rrddim_add(chart, "some 60", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd300 = rrddim_add(chart, "some 300", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + } else { + rrdset_next(res->some.st); + } + + update_pressure_chart(&res->some); + } + + if (likely(res->updated && res->full.enabled)) { + if (unlikely(!res->full.st)) { + RRDSET *chart; + snprintfz(title, CHART_TITLE_MAX, "Memory full pressure"); + + chart = res->full.st = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "mem_full_pressure" + , NULL + , "mem" + , "cgroup.memory_full_pressure" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2350 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(chart = res->full.st, cg->chart_labels); + + res->full.rd10 = rrddim_add(chart, "full 10", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->full.rd60 = rrddim_add(chart, "full 60", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->full.rd300 = rrddim_add(chart, "full 300", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + } else { + rrdset_next(res->full.st); + } + + update_pressure_chart(&res->full); + } + + res = &cg->io_pressure; + if (likely(res->updated && res->some.enabled)) { + if (unlikely(!res->some.st)) { + RRDSET *chart; + snprintfz(title, CHART_TITLE_MAX, "I/O pressure"); + + chart = res->some.st = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "io_pressure" + , NULL + , "disk" + , "cgroup.io_pressure" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2400 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(chart = res->some.st, cg->chart_labels); + + res->some.rd10 = rrddim_add(chart, "some 10", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd60 = rrddim_add(chart, "some 60", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->some.rd300 = rrddim_add(chart, "some 300", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + } else { + rrdset_next(res->some.st); + } + + update_pressure_chart(&res->some); + } + + if (likely(res->updated && res->full.enabled)) { + if (unlikely(!res->full.st)) { + RRDSET *chart; + snprintfz(title, CHART_TITLE_MAX, "I/O full pressure"); + + chart = res->full.st = rrdset_create_localhost( + cgroup_chart_type(type, cg->chart_id, RRD_ID_LENGTH_MAX) + , "io_full_pressure" + , NULL + , "disk" + , "cgroup.io_full_pressure" + , title + , "percentage" + , PLUGIN_CGROUPS_NAME + , PLUGIN_CGROUPS_MODULE_CGROUPS_NAME + , cgroup_containers_chart_priority + 2450 + , update_every + , RRDSET_TYPE_LINE + ); + + rrdset_update_labels(chart = res->full.st, cg->chart_labels); + + res->full.rd10 = rrddim_add(chart, "full 10", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->full.rd60 = rrddim_add(chart, "full 60", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + res->full.rd300 = rrddim_add(chart, "full 300", NULL, 1, 100, RRD_ALGORITHM_ABSOLUTE); + } else { + rrdset_next(res->full.st); + } + + update_pressure_chart(&res->full); + } + } + } + + if(likely(cgroup_enable_systemd_services)) + update_systemd_services_charts(update_every, services_do_cpu, services_do_mem_usage, services_do_mem_detailed + , services_do_mem_failcnt, services_do_swap_usage, services_do_io + , services_do_io_ops, services_do_throttle_io, services_do_throttle_ops + , services_do_queued_ops, services_do_merged_ops + ); + + debug(D_CGROUP, "done updating cgroups charts"); +} + +// ---------------------------------------------------------------------------- +// cgroups main + +static void cgroup_main_cleanup(void *ptr) { + struct netdata_static_thread *static_thread = (struct netdata_static_thread *)ptr; + static_thread->enabled = NETDATA_MAIN_THREAD_EXITING; + + info("cleaning up..."); + + usec_t max = 2 * USEC_PER_SEC, step = 50000; + + if (!discovery_thread.exited) { + info("stopping discovery thread worker"); + uv_mutex_lock(&discovery_thread.mutex); + discovery_thread.start_discovery = 1; + uv_cond_signal(&discovery_thread.cond_var); + uv_mutex_unlock(&discovery_thread.mutex); + } + + while (!discovery_thread.exited && max > 0) { + max -= step; + info("waiting for discovery thread to finish..."); + sleep_usec(step); + } + + static_thread->enabled = NETDATA_MAIN_THREAD_EXITED; +} + +void *cgroups_main(void *ptr) { + netdata_thread_cleanup_push(cgroup_main_cleanup, ptr); + + struct rusage thread; + + // when ZERO, attempt to do it + int vdo_cpu_netdata = config_get_boolean("plugin:cgroups", "cgroups plugin resource charts", 1); + + read_cgroup_plugin_configuration(); + + RRDSET *stcpu_thread = NULL; + + if (uv_mutex_init(&cgroup_root_mutex)) { + error("CGROUP: cannot initialize mutex for the main cgroup list"); + goto exit; + } + + // dispatch a discovery worker thread + discovery_thread.start_discovery = 0; + discovery_thread.exited = 0; + + if (uv_mutex_init(&discovery_thread.mutex)) { + error("CGROUP: cannot initialize mutex for discovery thread"); + goto exit; + } + if (uv_cond_init(&discovery_thread.cond_var)) { + error("CGROUP: cannot initialize conditional variable for discovery thread"); + goto exit; + } + + int error = uv_thread_create(&discovery_thread.thread, cgroup_discovery_worker, NULL); + if (error) { + error("CGROUP: cannot create tread worker. uv_thread_create(): %s", uv_strerror(error)); + goto exit; + } + uv_thread_set_name_np(discovery_thread.thread, "PLUGIN[cgroups]"); + + heartbeat_t hb; + heartbeat_init(&hb); + usec_t step = cgroup_update_every * USEC_PER_SEC; + usec_t find_every = cgroup_check_for_new_every * USEC_PER_SEC, find_dt = 0; + + while(!netdata_exit) { + usec_t hb_dt = heartbeat_next(&hb, step); + if(unlikely(netdata_exit)) break; + + find_dt += hb_dt; + if(unlikely(find_dt >= find_every || cgroups_check)) { + uv_cond_signal(&discovery_thread.cond_var); + discovery_thread.start_discovery = 1; + find_dt = 0; + cgroups_check = 0; + } + + uv_mutex_lock(&cgroup_root_mutex); + read_all_cgroups(cgroup_root); + update_cgroup_charts(cgroup_update_every); + uv_mutex_unlock(&cgroup_root_mutex); + + // -------------------------------------------------------------------- + + if(vdo_cpu_netdata) { + getrusage(RUSAGE_THREAD, &thread); + + if(unlikely(!stcpu_thread)) { + + stcpu_thread = rrdset_create_localhost( + "netdata" + , "plugin_cgroups_cpu" + , NULL + , "cgroups" + , NULL + , "NetData CGroups Plugin CPU usage" + , "milliseconds/s" + , PLUGIN_CGROUPS_NAME + , "stats" + , 132000 + , cgroup_update_every + , RRDSET_TYPE_STACKED + ); + + rrddim_add(stcpu_thread, "user", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL); + rrddim_add(stcpu_thread, "system", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL); + } + else + rrdset_next(stcpu_thread); + + rrddim_set(stcpu_thread, "user" , thread.ru_utime.tv_sec * 1000000ULL + thread.ru_utime.tv_usec); + rrddim_set(stcpu_thread, "system", thread.ru_stime.tv_sec * 1000000ULL + thread.ru_stime.tv_usec); + rrdset_done(stcpu_thread); + } + } + +exit: + netdata_thread_cleanup_pop(1); + return NULL; +} diff --git a/collectors/cgroups.plugin/sys_fs_cgroup.h b/collectors/cgroups.plugin/sys_fs_cgroup.h new file mode 100644 index 0000000..155330f --- /dev/null +++ b/collectors/cgroups.plugin/sys_fs_cgroup.h @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_SYS_FS_CGROUP_H +#define NETDATA_SYS_FS_CGROUP_H 1 + +#include "../../daemon/common.h" + +#if (TARGET_OS == OS_LINUX) + +#define NETDATA_PLUGIN_HOOK_LINUX_CGROUPS \ + { \ + .name = "PLUGIN[cgroups]", \ + .config_section = CONFIG_SECTION_PLUGINS, \ + .config_name = "cgroups", \ + .enabled = 1, \ + .thread = NULL, \ + .init_routine = NULL, \ + .start_routine = cgroups_main \ + }, + +extern void *cgroups_main(void *ptr); + +#include "../proc.plugin/plugin_proc.h" + +#else // (TARGET_OS == OS_LINUX) + +#define NETDATA_PLUGIN_HOOK_LINUX_CGROUPS + +#endif // (TARGET_OS == OS_LINUX) + +extern char *parse_k8s_data(struct label **labels, char *data); + +#endif //NETDATA_SYS_FS_CGROUP_H diff --git a/collectors/cgroups.plugin/tests/test_cgroups_plugin.c b/collectors/cgroups.plugin/tests/test_cgroups_plugin.c new file mode 100644 index 0000000..be8ea2c --- /dev/null +++ b/collectors/cgroups.plugin/tests/test_cgroups_plugin.c @@ -0,0 +1,110 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "test_cgroups_plugin.h" +#include "libnetdata/required_dummies.h" + +RRDHOST *localhost; +int netdata_zero_metrics_enabled = 1; +struct config netdata_config; +char *netdata_configured_primary_plugins_dir = NULL; + +static void test_parse_k8s_data(void **state) +{ + UNUSED(state); + + struct label *labels = (struct label *)0xff; + + struct k8s_test_data { + char *data; + char *name; + char *key[3]; + char *value[3]; + }; + + struct k8s_test_data test_data[] = { + // One label + { .data = "name label1=\"value1\"", + .name = "name", + .key[0] = "label1", .value[0] = "value1" }, + + // Three labels + { .data = "name label1=\"value1\",label2=\"value2\",label3=\"value3\"", + .name = "name", + .key[0] = "label1", .value[0] = "value1", + .key[1] = "label2", .value[1] = "value2", + .key[2] = "label3", .value[2] = "value3" }, + + // Comma at the end of the data string + { .data = "name label1=\"value1\",", + .name = "name", + .key[0] = "label1", .value[0] = "value1" }, + + // Equals sign in the value + { .data = "name label1=\"value=1\"", + .name = "name", + .key[0] = "label1", .value[0] = "value=1" }, + + // Double quotation mark in the value + { .data = "name label1=\"value\"1\"", + .name = "name", + .key[0] = "label1", .value[0] = "value" }, + + // Escaped double quotation mark in the value + { .data = "name label1=\"value\\\"1\"", + .name = "name", + .key[0] = "label1", .value[0] = "value\\\"1" }, + + // Equals sign in the key + { .data = "name label=1=\"value1\"", + .name = "name", + .key[0] = "label", .value[0] = "1=\"value1\"" }, + + // Skipped value + { .data = "name label1=,label2=\"value2\"", + .name = "name", + .key[0] = "label2", .value[0] = "value2" }, + + // A pair of equals signs + { .data = "name= =", + .name = "name=" }, + + // A pair of commas + { .data = "name, ,", + .name = "name," }, + + { .data = NULL } + }; + + for (int i = 0; test_data[i].data != NULL; i++) { + char *data = strdup(test_data[i].data); + + for (int l = 0; l < 3 && test_data[i].key[l] != NULL; l++) { + char *key = test_data[i].key[l]; + char *value = test_data[i].value[l]; + + expect_function_call(__wrap_add_label_to_list); + expect_value(__wrap_add_label_to_list, l, 0xff); + expect_string(__wrap_add_label_to_list, key, key); + expect_string(__wrap_add_label_to_list, value, value); + expect_value(__wrap_add_label_to_list, label_source, LABEL_SOURCE_KUBERNETES); + } + + char *name = parse_k8s_data(&labels, data); + + assert_string_equal(name, test_data[i].name); + assert_ptr_equal(labels, 0xff); + + free(data); + } +} + +int main(void) +{ + const struct CMUnitTest tests[] = { + cmocka_unit_test(test_parse_k8s_data), + }; + + int test_res = cmocka_run_group_tests_name("test_parse_k8s_data", tests, NULL, NULL); + + return test_res; +} diff --git a/collectors/cgroups.plugin/tests/test_cgroups_plugin.h b/collectors/cgroups.plugin/tests/test_cgroups_plugin.h new file mode 100644 index 0000000..3d68e92 --- /dev/null +++ b/collectors/cgroups.plugin/tests/test_cgroups_plugin.h @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef TEST_CGROUPS_PLUGIN_H +#define TEST_CGROUPS_PLUGIN_H 1 + +#include "libnetdata/libnetdata.h" + +#include "../sys_fs_cgroup.h" + +#include <stdarg.h> +#include <stddef.h> +#include <setjmp.h> +#include <stdint.h> +#include <cmocka.h> + +#endif /* TEST_CGROUPS_PLUGIN_H */ diff --git a/collectors/cgroups.plugin/tests/test_doubles.c b/collectors/cgroups.plugin/tests/test_doubles.c new file mode 100644 index 0000000..5572fb2 --- /dev/null +++ b/collectors/cgroups.plugin/tests/test_doubles.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "test_cgroups_plugin.h" + +void rrdset_is_obsolete(RRDSET *st) +{ + UNUSED(st); +} + +void rrdset_isnot_obsolete(RRDSET *st) +{ + UNUSED(st); +} + +struct mountinfo *mountinfo_read(int do_statvfs) +{ + UNUSED(do_statvfs); + + return NULL; +} + +struct mountinfo * +mountinfo_find_by_filesystem_mount_source(struct mountinfo *root, const char *filesystem, const char *mount_source) +{ + UNUSED(root); + UNUSED(filesystem); + UNUSED(mount_source); + + return NULL; +} + +struct mountinfo * +mountinfo_find_by_filesystem_super_option(struct mountinfo *root, const char *filesystem, const char *super_options) +{ + UNUSED(root); + UNUSED(filesystem); + UNUSED(super_options); + + return NULL; +} + +void mountinfo_free_all(struct mountinfo *mi) +{ + UNUSED(mi); +} + +struct label *__wrap_add_label_to_list(struct label *l, char *key, char *value, LABEL_SOURCE label_source) +{ + function_called(); + check_expected_ptr(l); + check_expected_ptr(key); + check_expected_ptr(value); + check_expected(label_source); + return l; +} + +void rrdset_update_labels(RRDSET *st, struct label *labels) +{ + UNUSED(st); + UNUSED(labels); +} + +RRDSET *rrdset_create_custom( + RRDHOST *host, const char *type, const char *id, const char *name, const char *family, const char *context, + const char *title, const char *units, const char *plugin, const char *module, long priority, int update_every, + RRDSET_TYPE chart_type, RRD_MEMORY_MODE memory_mode, long history_entries) +{ + UNUSED(host); + UNUSED(type); + UNUSED(id); + UNUSED(name); + UNUSED(family); + UNUSED(context); + UNUSED(title); + UNUSED(units); + UNUSED(plugin); + UNUSED(module); + UNUSED(priority); + UNUSED(update_every); + UNUSED(chart_type); + UNUSED(memory_mode); + UNUSED(history_entries); + + return NULL; +} + +RRDDIM *rrddim_add_custom( + RRDSET *st, const char *id, const char *name, collected_number multiplier, collected_number divisor, + RRD_ALGORITHM algorithm, RRD_MEMORY_MODE memory_mode) +{ + UNUSED(st); + UNUSED(id); + UNUSED(name); + UNUSED(multiplier); + UNUSED(divisor); + UNUSED(algorithm); + UNUSED(memory_mode); + + return NULL; +} + +collected_number rrddim_set(RRDSET *st, const char *id, collected_number value) +{ + UNUSED(st); + UNUSED(id); + UNUSED(value); + + return 0; +} + +collected_number rrddim_set_by_pointer(RRDSET *st, RRDDIM *rd, collected_number value) +{ + UNUSED(st); + UNUSED(rd); + UNUSED(value); + + return 0; +} + +RRDSETVAR *rrdsetvar_custom_chart_variable_create(RRDSET *st, const char *name) +{ + UNUSED(st); + UNUSED(name); + + return NULL; +} + +void rrdsetvar_custom_chart_variable_set(RRDSETVAR *rs, calculated_number value) +{ + UNUSED(rs); + UNUSED(value); +} + +void rrdset_next_usec(RRDSET *st, usec_t microseconds) +{ + UNUSED(st); + UNUSED(microseconds); +} + +void rrdset_done(RRDSET *st) +{ + UNUSED(st); +} + +void update_pressure_chart(struct pressure_chart *chart) +{ + UNUSED(chart); +} + +void netdev_rename_device_add( + const char *host_device, const char *container_device, const char *container_name, struct label *labels) +{ + UNUSED(host_device); + UNUSED(container_device); + UNUSED(container_name); + UNUSED(labels); +} + +void netdev_rename_device_del(const char *host_device) +{ + UNUSED(host_device); +} |