author     Daniel Baumann <daniel.baumann@progress-linux.org>   2024-03-09 13:19:48 +0000
committer  Daniel Baumann <daniel.baumann@progress-linux.org>   2024-03-09 13:20:02 +0000
commit     58daab21cd043e1dc37024a7f99b396788372918 (patch)
tree       96771e43bb69f7c1c2b0b4f7374cb74d7866d0cb /health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md
parent     Releasing debian version 1.43.2-1. (diff)
Merging upstream version 1.44.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md')
-rw-r--r--  health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md | 36
1 files changed, 36 insertions, 0 deletions
diff --git a/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md b/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md
new file mode 100644
index 000000000..39e031628
--- /dev/null
+++ b/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_099.md
@@ -0,0 +1,36 @@
+### Understand the alert
+
+This alert measures the average Pod Lifecycle Event Generator (PLEG) relisting latency over the last minute, at the 0.99 quantile, in microseconds. If you receive this alert, the Kubelet's PLEG latency is high, which can slow down pod lifecycle operations across your Kubernetes cluster.
+
+### What does PLEG latency mean?
+
+The Pod Lifecycle Event Generator (PLEG) is the Kubelet component that watches for container events on the node and generates events for a pod's lifecycle. High PLEG latency indicates a delay in processing these events, which can in turn delay pod startup, termination, and updates.
+
+### Troubleshoot the alert
+
+1. Check the overall Kubelet performance and system load:
+
+   a. Run `kubectl get nodes` to check the status of the nodes in your cluster.
+   b. Investigate the node with high PLEG latency using `kubectl describe node <NODE_NAME>` to view detailed information about its resource usage and events.
+   c. Use monitoring tools such as `top`, `htop`, or `vmstat` to check for high CPU, memory, or disk usage on the node.
+
+2. Look for problematic pods or containers:
+
+   a. Run `kubectl get pods --all-namespaces` to check the status of all pods across namespaces.
+   b. Use `kubectl logs <POD_NAME> -n <NAMESPACE>` to inspect the logs of any suspicious pod.
+   c. Investigate pods with high restart counts, crash loops, or other abnormal statuses.
+
+3. Verify the Kubelet configuration and logs:
+
+   a. Check the Kubelet configuration on the node for misconfigurations or settings that could cause high latency, such as an unusually high number of pods per node.
+   b. Check the Kubelet logs with `journalctl -u kubelet` for more information about PLEG events and errors.
+
+4. Consider evaluating your workloads and scaling your cluster:
+
+   a. If multiple nodes experience high PLEG latency, or if the overall load on your nodes is consistently high, you may need to scale your cluster.
+   b. Evaluate your workloads and adjust resource requests and limits to make the best use of the available resources.
+
+### Useful resources
+
+1. [Understanding the Kubernetes Kubelet](https://kubernetes.io/docs/concepts/overview/components/#kubelet)
+2. [Troubleshooting Kubernetes Clusters](https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/)
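
The checks in steps 1–3 of the guide above can be strung together into a single triage pass. The following is a minimal sketch, not part of the committed guide: it assumes `kubectl` access to the cluster, takes the node name from the alert as its first argument, and the script name and argument handling are illustrative. The final `journalctl` command must be run on the affected node itself, so it is shown commented out.

```bash
#!/usr/bin/env bash
# Quick triage for a high PLEG relist latency alert (illustrative sketch).
# Usage: ./pleg-triage.sh <node-name>   (node name taken from the alert)
set -euo pipefail

NODE="${1:?usage: $0 <node-name>}"

# Step 1a: overall node status across the cluster.
kubectl get nodes

# Step 1b: conditions, resource pressure, and recent events on the node.
kubectl describe node "$NODE" | sed -n '/Conditions:/,/Events:/p'

# Step 2a: pods that are not in the Running phase, cluster-wide.
kubectl get pods --all-namespaces --field-selector=status.phase!=Running

# Step 2c: the ten pods whose first container has the highest restart
# count; crash-looping pods often accompany PLEG stalls.
kubectl get pods --all-namespaces \
  --sort-by='.status.containerStatuses[0].restartCount' | tail -n 10

# Step 3b: on the affected node itself (e.g. over SSH), pull recent
# kubelet log lines that mention PLEG:
#   journalctl -u kubelet --since "30 minutes ago" | grep -i pleg
```

If the latency is high on several nodes at once, that points toward cluster-wide load rather than a single misbehaving pod, which is when step 4 (scaling the cluster and adjusting resource requests and limits) becomes relevant.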