diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-03-09 13:19:22 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-03-09 13:19:22 +0000 |
commit | c21c3b0befeb46a51b6bf3758ffa30813bea0ff0 (patch) | |
tree | 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/dbengine | |
parent | Adding upstream version 1.43.2. (diff) | |
download | netdata-upstream/1.44.3.tar.xz netdata-upstream/1.44.3.zip |
Adding upstream version 1.44.3.upstream/1.44.3
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
4 files changed, 56 insertions, 0 deletions
diff --git a/health/guides/dbengine/10min_dbengine_global_flushing_errors.md b/health/guides/dbengine/10min_dbengine_global_flushing_errors.md new file mode 100644 index 00000000..4e388eb2 --- /dev/null +++ b/health/guides/dbengine/10min_dbengine_global_flushing_errors.md @@ -0,0 +1,13 @@ +### Understand the alert + +The Database Engine works like a traditional database. It dedicates a certain amount of RAM to data caching and indexing, while the rest of the data resides compressed on disk. Unlike other memory modes, the amount of historical metrics stored is based on the amount of disk space you allocate and the effective compression ratio, not a fixed number of metrics collected. + +By using both RAM and disk space, the database engine allows for long-term storage of per-second metrics inside of the Netdata Agent itself. + +Netdata monitors the number of pages deleted due to failure to flush data to disk in the last 10 minutes. In this situation some metric data was dropped to unblock data collection. To remedy this issue, reduce disk load or use +faster disks. This alert is triggered in critical state when the number deleted pages is greater than 0. + +### Useful resources + +[Read more about Netdata DB engine](https://learn.netdata.cloud/docs/agent/database/engine) + diff --git a/health/guides/dbengine/10min_dbengine_global_flushing_warnings.md b/health/guides/dbengine/10min_dbengine_global_flushing_warnings.md new file mode 100644 index 00000000..1029e7f6 --- /dev/null +++ b/health/guides/dbengine/10min_dbengine_global_flushing_warnings.md @@ -0,0 +1,15 @@ +### Understand the alert + +The Database Engine works like a traditional database. It dedicates a certain amount of RAM to data caching and indexing, while the rest of the data resides compressed on disk. Unlike other memory modes, the amount of historical metrics stored is based on the amount of disk space you allocate and the effective compression ratio, not a fixed number +of metrics collected. + +By using both RAM and disk space, the database engine allows for long-term storage of per-second metrics inside of the Netdata Agent itself. + +Netdata monitors the number of times when `dbengine` dirty pages were over 50% of the instance page cache in the last 10 minutes. In this situation, the metric data are at risk of not being stored in the database. To remedy this issue, reduce disk load or use faster disks. + +This alert is triggered in warn state when the number of `dbengine` dirty pages which were over 50% of the instance is greater than 0. + +### Useful resources + +[Read more about Netdata DB engine](https://learn.netdata.cloud/docs/agent/database/engine) + diff --git a/health/guides/dbengine/10min_dbengine_global_fs_errors.md b/health/guides/dbengine/10min_dbengine_global_fs_errors.md new file mode 100644 index 00000000..446289a9 --- /dev/null +++ b/health/guides/dbengine/10min_dbengine_global_fs_errors.md @@ -0,0 +1,14 @@ +### Understand the alert + +The Database Engine works like a traditional database. It dedicates a certain amount of RAM to data caching and indexing, while the rest of the data resides compressed on disk. Unlike other memory modes, the amount of historical metrics stored is based on the amount of disk space you allocate and the effective compression ratio, not a fixed number of metrics collected. + +By using both RAM and disk space, the database engine allows for long-term storage of per-second metrics inside of the Netdata agent itself. + +Netdata monitors the number of filesystem errors in the last 10 minutes. The Dbengine is experiencing filesystem errors (too many open files, wrong permissions, etc.) + +This alert is triggered in warning state when the number of filesystem errors is greater than 0. + +### Useful resources + +[Read more about Netdata DB engine](https://learn.netdata.cloud/docs/agent/database/engine) + diff --git a/health/guides/dbengine/10min_dbengine_global_io_errors.md b/health/guides/dbengine/10min_dbengine_global_io_errors.md new file mode 100644 index 00000000..c47004f4 --- /dev/null +++ b/health/guides/dbengine/10min_dbengine_global_io_errors.md @@ -0,0 +1,14 @@ +### Understand the alert + +The Database Engine works like a traditional database. It dedicates a certain amount of RAM to data caching and indexing, while the rest of the data resides compressed on disk. Unlike other memory modes, the amount of historical metrics stored is based on the amount of disk space you allocate and the effective compression ratio, not a fixed number of metrics collected. + +By using both RAM and disk space, the database engine allows for long-term storage of per-second metrics inside of the Netdata Agent itself. + +The Netdata Agent monitors the number of IO errors in the last 10 minutes. The dbengine is experiencing I/O errors (CRC errors, out of space, bad disk, etc.). + +This alert is triggered in critical state when the number of IO errors is greater that 0. + +### Useful resources + +[Read more about Netdata DB engine](https://learn.netdata.cloud/docs/agent/database/engine) + |