author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 02:57:58 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 02:57:58 +0000
commit: be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97
tree: 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/ram
parent: Initial commit.
Adding upstream version 1.44.3. (refs: upstream/1.44.3, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'health/guides/ram')
-rw-r--r--  health/guides/ram/oom_kill.md       89
-rw-r--r--  health/guides/ram/ram_available.md  30
-rw-r--r--  health/guides/ram/ram_in_use.md     28
3 files changed, 147 insertions, 0 deletions
diff --git a/health/guides/ram/oom_kill.md b/health/guides/ram/oom_kill.md
new file mode 100644
index 00000000..69afb814
--- /dev/null
+++ b/health/guides/ram/oom_kill.md
@@ -0,0 +1,89 @@
+### Understand the alert
+
+The OOM Killer (Out of Memory Killer) is a mechanism that the Linux kernel invokes when the system is critically low on memory or a process has reached its memory limit. As the name suggests, it reviews all running processes and kills one or more of them to free up memory and keep the system running.
+
+Linux kernel 4.19 made the OOM killer cgroup-aware, adding the ability to kill a cgroup as a single unit and so guarantee the integrity of the workload. In a nutshell, cgroups allow you to limit the memory, disk I/O, and network usage of a group of processes. Furthermore, cgroups can set usage quotas and prioritize a process group to receive more CPU time or memory than other groups. You can read more about cgroups in
+the [cgroup man pages](https://man7.org/linux/man-pages/man7/cgroups.7.html).
+
+The Netdata Agent monitors the number of Out Of Memory (OOM) kills in the last 30 minutes. Receiving this alert indicates that some processes were killed by the OOM Killer.
+
+### Troubleshoot the alert
+
+- Troubleshoot issues in the OOM killer
+
+The OOM Killer uses a heuristic to choose a process for termination. It is based on a score associated with each running process, which is calculated by the `oom_badness()` function inside the Linux kernel.
+
+1. To identify which processes were killed by the OOM killer, inspect the logs:
+
+```
+dmesg -T | grep -Ei 'killed process'
+```
+The system response looks similar to this:
+```
+Jan 7 07:12:33 mysql-server-01 kernel: Out of Memory: Killed process 3154 (mysqld).
+```
+
+2. To see the current `oom_score` (the priority with which the OOM killer will act upon your processes), run the following script.
+The script prints all running processes (by PID and name) along with their likelihood of being killed by the OOM killer (second column).
+The greater the `oom_score`, the more likely the process is to be killed by the OOM killer.
+
+```
+while read -r pid comm; do
+ printf '%d\t%d\t%s\n' "$pid" "$(cat /proc/$pid/oom_score)" "$comm";
+done < <(ps -e -o pid= -o comm=) | sort -k 2n
+```
+
+3. Adjust the OOM score of a process to protect it, using the `choom` utility from
+the `util-linux` [package v2.33-rc1+](https://github.com/util-linux/util-linux/commit/8fa223daba1963c34cc828075ce6773ff01fafe3). The adjustment value ranges from -1000 (never kill) to 1000 (kill first):
+
+```
+choom -p PID -n number
+```
+
+4. Once the settings work for your case, make the change permanent. In the systemd unit file of your service, under the `[Service]` section, add the following value: `OOMScoreAdjust=<PREFERRED_VALUE>`
+
+- Add a temporary swap file
+
+Keep in mind that this requires creating a swap file on one of your disks, and the performance of your system may be affected.
+
+1. Decide where your swap file will live. It is strongly advised to allocate the swap file under
+ the root directory. A swap file is like an extension of your RAM and it should be protected, far
+ from normal user-accessible directories. Run the following command:
+
+ ```
+ dd if=/dev/zero of=<path_in_root> bs=1024 count=<size_in_KiB>
+ ```
+
+2. Grant root-only access to the swap file:
+
+ ```
+ chmod 600 <path_to_the_swap_file_you_created>
+ ```
+
+3. Make it a Linux swap area:
+
+ ```
+ mkswap <path_to_the_swap_file_you_created>
+ ```
+
+4. Enable the swap with the following command:
+
+ ```
+ swapon <path_to_the_swap_file_you_created>
+ ```
+
+5. If you plan to use it on a regular basis, you should update the `/etc/fstab` config. The entry you
+ will add would look like:
+
+ ```
+ /swap_file none swap sw 0 0
+ ```
+
+ For more information see the fstab manpage: `man fstab`.
+
+
+### Useful resources
+
+1. [Linux Out of Memory Killer](https://neo4j.com/developer/kb/linux-out-of-memory-killer/)
+2. [Memory Resource Controller in the Linux kernel](https://docs.kernel.org/admin-guide/cgroup-v1/memory.html?highlight=oom)
+3. [OOM killer blog post](https://www.psce.com/en/blog/2012/05/31/mysql-oom-killer-and-everything-related/)
diff --git a/health/guides/ram/ram_available.md b/health/guides/ram/ram_available.md
new file mode 100644
index 00000000..f94bdf3b
--- /dev/null
+++ b/health/guides/ram/ram_available.md
@@ -0,0 +1,30 @@
+### Understand the alert
+
+This alarm shows the estimated percentage of RAM that is available for use by userspace processes without causing swapping. If this alarm is raised, your system is low on available RAM, which may affect the performance of running applications.
+
+- If there is no `swap` space available, the OOM Killer can start killing processes.
+
+- When a system runs out of RAM, it can move inactive memory contents to a partition or file on persistent storage (e.g. your
+main drive). The borrowed space is called `swap` or "swap space".
+
+- The OOM Killer (Out of Memory Killer) is a mechanism that the Linux kernel invokes when the system is critically low on
+RAM. As the name suggests, it reviews all running processes and kills one or more of them in order
+to free up RAM and keep the system running.<sup>[1](https://neo4j.com/developer/kb/linux-out-of-memory-killer/)</sup>
+
+### Troubleshoot the alert
+
+- Check per-process RAM usage to find the top consumers
+
+Linux:
+```
+top -b -o +%MEM | head -n 22
+```
+FreeBSD:
+```
+top -b -o res | head -n 22
+```
+
+Closing some of the top consumer processes can help, but Netdata strongly suggests that you know exactly which processes you are closing and are certain that they are not necessary.
+
+### Useful resources
+[Linux Out of Memory Killer](https://neo4j.com/developer/kb/linux-out-of-memory-killer/)
diff --git a/health/guides/ram/ram_in_use.md b/health/guides/ram/ram_in_use.md
new file mode 100644
index 00000000..9c686daa
--- /dev/null
+++ b/health/guides/ram/ram_in_use.md
@@ -0,0 +1,28 @@
+### Understand the alert
+
+This alert shows the percentage of used RAM. If you receive this alert, there is high RAM utilization on the node. Running low on RAM means that the performance of running applications might be affected.
+
+If there is no `swap` space available, the OOM Killer can start killing processes.
+
+When a system runs out of RAM, it can store its inactive content in persistent storage (e.g. your main drive). The borrowed space is called `swap` or "swap space".
+
+The OOM Killer (Out of Memory Killer) is a mechanism that the Linux kernel invokes when the system is critically low on RAM. As the name suggests, it reviews all running processes and kills one or more of them in order
+to free up RAM and keep the system running.
+
+### Troubleshoot the alert
+
+- Check per-process RAM usage to find the top consumers
+
+Linux:
+```
+top -b -o +%MEM | head -n 22
+```
+FreeBSD:
+```
+top -b -o res | head -n 22
+```
+
+Closing some of the top consumer processes can help, but Netdata strongly suggests that you know exactly which processes you are closing and are certain that they are not necessary.
+
+### Useful resources
+[Linux Out of Memory Killer](https://neo4j.com/developer/kb/linux-out-of-memory-killer/)