diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
commit | be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97 (patch) | |
tree | 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/btrfs/btrfs_device_read_errors.md | |
parent | Initial commit. (diff) | |
download | netdata-be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97.tar.xz netdata-be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97.zip |
Adding upstream version 1.44.3.upstream/1.44.3upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r-- | health/guides/btrfs/btrfs_device_read_errors.md | 50 |
1 files changed, 50 insertions, 0 deletions
diff --git a/health/guides/btrfs/btrfs_device_read_errors.md b/health/guides/btrfs/btrfs_device_read_errors.md new file mode 100644 index 00000000..684cd0be --- /dev/null +++ b/health/guides/btrfs/btrfs_device_read_errors.md @@ -0,0 +1,50 @@ +### Understand the alert + +This alert monitors the number of BTRFS read errors on a device. If you receive this alert, it means that your system has encountered at least one BTRFS read error in the last 10 minutes. + +### What are BTRFS read errors? + +BTRFS (B-Tree File System) is a modern file system designed for Linux. BTRFS read errors are instances where the file system fails to read data from a device. This can occur due to various reasons like hardware failure, file system corruption, or disk problems. + +### Troubleshoot the alert + +1. Check system logs for BTRFS errors + + Review the output from the following command to identify any BTRFS errors: + ``` + sudo journalctl -k | grep -i BTRFS + ``` + +2. Identify the affected BTRFS device and partition + + List all BTRFS devices with their respective information by running the following command: + ``` + sudo btrfs filesystem show + ``` + +3. Perform a BTRFS filesystem check + + To check the integrity of the BTRFS file system, run the following command, replacing `<device>` with the affected device path: + ``` + sudo btrfs check --readonly <device> + ``` + Note: Be careful when using the `--repair` option, as it may cause data loss. It is recommended to take a backup before attempting a repair. + +4. Verify the disk health + + Check the disk health using SMART tools to determine if there are any hardware issues. This can be done by first installing `smartmontools` if not already installed: + ``` + sudo apt install smartmontools + ``` + Then running a disk health check on the affected device: + ``` + sudo smartctl -a <device> + ``` + +5. Analyze the read error patterns + + If the read errors are happening consistently or increasing, consider replacing the affected device with a new one or adding redundancy to the system by using RAID or BTRFS built-in features. + +### Useful resources + +1. [smartmontools documentation](https://www.smartmontools.org/) |