summaryrefslogtreecommitdiffstats
path: root/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md
diff options
context:
space:
mode:
Diffstat (limited to 'health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md')
-rw-r--r--health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md59
1 files changed, 59 insertions, 0 deletions
diff --git a/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md b/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md
new file mode 100644
index 00000000..b448c4d9
--- /dev/null
+++ b/health/guides/kubelet/kubelet_1m_pleg_relist_latency_quantile_05.md
@@ -0,0 +1,59 @@
+### Understand the alert
+
+This alert is related to Kubernetes and is triggered when the average `Pod Lifecycle Event Generator (PLEG)` relisting latency over the last minute is higher than the expected threshold (quantile 0.5). If you receive this alert, it means that the kubelet is experiencing some latency issues, which may affect the scheduling and management of your Kubernetes Pods.
+
+### What is PLEG?
+
+The Pod Lifecycle Event Generator (PLEG) is a component within the kubelet responsible for keeping track of changes (events) to the Pod and updating the kubelet's internal status. This ensures that the kubelet can successfully manage and schedule Pods on the Kubernetes node.
+
+### What does relisting latency mean?
+
+Relisting latency refers to the time taken by the PLEG to detect, process, and update the kubelet about the events or changes in a Pod's lifecycle. High relisting latency can lead to delays in the kubelet reacting to these changes, which can affect the overall functioning of the Kubernetes cluster.
+
+### Troubleshoot the alert
+
+1. Check the kubelet logs for any errors or warnings related to PLEG:
+
+ ```
+ sudo journalctl -u kubelet
+ ```
+
+ Look for any logs related to PLEG delays, issues, or timeouts.
+
+2. Restart the kubelet if necessary:
+
+ ```
+ sudo systemctl restart kubelet
+ ```
+
+ Sometimes, restarting the kubelet can resolve sporadic latency issues.
+
+3. Monitor the Kubernetes node's resource usage (CPU, Memory, Disk) using `kubectl top nodes`:
+
+ ```
+ kubectl top nodes
+ ```
+
+ If the node's resource usage is too high, consider scaling your cluster or optimizing workloads.
+
+4. Check the overall health of your Kubernetes cluster:
+
+ ```
+ kubectl get nodes
+ kubectl get pods --all-namespaces
+ ```
+
+ These commands will help you identify any issues with other nodes or Pods in your cluster.
+
+5. Investigate the specific Pods experiencing latency in PLEG:
+
+ ```
+ kubectl describe pod <pod_name> -n <namespace>
+ ```
+
+ Look for any signs of the Pod being stuck in a pending state, startup issues, or container crashes.
+
+### Useful resources
+
+1. [Kubernetes Kubelet - PLEG](https://kubernetes.io/docs/concepts/overview/components/#kubelet)
+2. [Kubernetes Troubleshooting](https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/)