summaryrefslogtreecommitdiffstats
path: root/health/guides/btrfs/btrfs_device_read_errors.md
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--health/guides/btrfs/btrfs_device_read_errors.md50
1 files changed, 50 insertions, 0 deletions
diff --git a/health/guides/btrfs/btrfs_device_read_errors.md b/health/guides/btrfs/btrfs_device_read_errors.md
new file mode 100644
index 00000000..684cd0be
--- /dev/null
+++ b/health/guides/btrfs/btrfs_device_read_errors.md
@@ -0,0 +1,50 @@
+### Understand the alert
+
+This alert monitors the number of BTRFS read errors on a device. If you receive this alert, it means that your system has encountered at least one BTRFS read error in the last 10 minutes.
+
+### What are BTRFS read errors?
+
+BTRFS (B-Tree File System) is a modern file system designed for Linux. BTRFS read errors are instances where the file system fails to read data from a device. This can occur due to various reasons like hardware failure, file system corruption, or disk problems.
+
+### Troubleshoot the alert
+
+1. Check system logs for BTRFS errors
+
+ Review the output from the following command to identify any BTRFS errors:
+ ```
+ sudo journalctl -k | grep -i BTRFS
+ ```
+
+2. Identify the affected BTRFS device and partition
+
+ List all BTRFS devices with their respective information by running the following command:
+ ```
+ sudo btrfs filesystem show
+ ```
+
+3. Perform a BTRFS filesystem check
+
+ To check the integrity of the BTRFS file system, run the following command, replacing `<device>` with the affected device path:
+ ```
+ sudo btrfs check --readonly <device>
+ ```
+ Note: Be careful when using the `--repair` option, as it may cause data loss. It is recommended to take a backup before attempting a repair.
+
+4. Verify the disk health
+
+ Check the disk health using SMART tools to determine if there are any hardware issues. This can be done by first installing `smartmontools` if not already installed:
+ ```
+ sudo apt install smartmontools
+ ```
+ Then running a disk health check on the affected device:
+ ```
+ sudo smartctl -a <device>
+ ```
+
+5. Analyze the read error patterns
+
+ If the read errors are happening consistently or increasing, consider replacing the affected device with a new one or adding redundancy to the system by using RAID or BTRFS built-in features.
+
+### Useful resources
+
+1. [smartmontools documentation](https://www.smartmontools.org/)