author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 02:57:58 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 02:57:58 +0000
commit be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97
tree 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/megacli
parent Initial commit.
Adding upstream version 1.44.3. (upstream/1.44.3, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'health/guides/megacli')
-rw-r--r-- health/guides/megacli/megacli_adapter_state.md 29
-rw-r--r-- health/guides/megacli/megacli_bbu_cycle_count.md 28
-rw-r--r-- health/guides/megacli/megacli_bbu_relative_charge.md 36
-rw-r--r-- health/guides/megacli/megacli_pd_media_errors.md 30
-rw-r--r-- health/guides/megacli/megacli_pd_predictive_failures.md 29
5 files changed, 152 insertions, 0 deletions
diff --git a/health/guides/megacli/megacli_adapter_state.md b/health/guides/megacli/megacli_adapter_state.md
new file mode 100644
index 00000000..1202184e
--- /dev/null
+++ b/health/guides/megacli/megacli_adapter_state.md
@@ -0,0 +1,29 @@
+### Understand the alert
+
+This alert indicates that a virtual drive on your MegaRAID controller is in a degraded state. A degraded state means that the virtual drive's operating condition is not optimal because one of the configured drives has failed or is offline.
+
+### Troubleshoot the alert
+
+#### General approach
+
+1. Gather more information about your virtual drives in all adapters:
+
+```
+root@netdata # megacli -LDInfo -Lall -aALL
+```
+
+2. Check which virtual drive is in a degraded state and in which adapter.
+
+3. Consult the MegaRAID SAS Software User Guide [1]:
+
+ 1. Section `2.1.16` to check what is going wrong with your drives.
+ 2. Section `7.18` to perform any action on drives. Focus on sections `7.18.2`, `7.18.6`, `7.18.7`, `7.18.8`, `7.18.11`, and `7.18.14`.
+
+### Warning
+
+Data is priceless. Before performing any action, make sure that you have taken any necessary backup steps. Netdata is not liable for any loss or corruption of any data, database, or software.
+
+### Useful resources
+
+1. [MegaRAID SAS Software User Guide [PDF download]](https://docs.broadcom.com/docs/12353236)
+2. [MegaCLI commands cheatsheet](https://www.broadcom.com/support/knowledgebase/1211161496959/megacli-commands)
\ No newline at end of file
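Note on step 2 of `megacli_adapter_state.md`: on hosts with several adapters, finding the degraded virtual drive by eye can be tedious. The filter below is a minimal sketch and not part of the guide or of MegaCLI. It assumes that the `-LDInfo` output contains lines of the form `Adapter N -- ...`, `Virtual Drive: N (...)` (or `Virtual Disk:` on older builds), and `State : ...`; the function name `degraded_virtual_drives` is hypothetical.

```shell
# Hypothetical helper: reads `megacli -LDInfo -Lall -aALL` output on stdin
# and prints one line per virtual drive whose State is not "Optimal".
# The line layout matched here is an assumption about MegaCLI's output.
degraded_virtual_drives() {
  awk '
    # Remember the adapter number from lines such as "Adapter 0 -- ...".
    /^Adapter [0-9]+/ { adapter = $2 }
    # Remember the virtual drive number from "Virtual Drive: 0 (Target Id: 0)".
    /^Virtual (Drive|Disk):/ { vd = $3 }
    # Inspect the state line, for example "State               : Degraded".
    /^State[ \t]*:/ {
      state = $0
      sub(/^State[ \t]*:[ \t]*/, "", state)   # drop the label
      sub(/[ \t\r]+$/, "", state)             # drop trailing whitespace
      if (state != "Optimal")
        print "adapter=" adapter " vd=" vd " state=" state
    }
  '
}
```

Possible usage: `megacli -LDInfo -Lall -aALL | degraded_virtual_drives`. If your MegaCLI version prints a different layout, the patterns need adjusting.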
diff --git a/health/guides/megacli/megacli_bbu_cycle_count.md b/health/guides/megacli/megacli_bbu_cycle_count.md
new file mode 100644
index 00000000..14f1d22d
--- /dev/null
+++ b/health/guides/megacli/megacli_bbu_cycle_count.md
@@ -0,0 +1,28 @@
+### Understand the alert
+
+The `megacli_bbu_cycle_count` alert is related to the battery backup unit (BBU) of your MegaRAID controller. This alert is triggered when the average number of full recharge cycles during the BBU's lifetime exceeds a predefined threshold. A high number of charge cycles can reduce the battery's relative capacity.
+
+A warning state is triggered when the number of charge cycles is greater than 100, and a critical state is triggered when the number of charge cycles is greater than 500.
+
+### Troubleshoot the alert
+
+**Caution:** Before performing any troubleshooting steps, ensure that you have taken the necessary backup measures to protect your data. Netdata is not liable for any data loss or corruption.
+
+1. Gather information about the battery units for all of your adapters:
+
+ ```
+ megacli -AdpBbuCmd -GetBbuStatus -aALL
+ ```
+
+2. Run a battery check (learn cycle) on the affected BBU. Before taking any action, consult the manual's [section 7.14](https://docs.broadcom.com/docs/12353236):
+
+ ```
+ megacli -AdpBbuCmd -BbuLearn -aX # X is the adapter's number
+ ```
+
+3. If necessary, replace the battery in question.
+
+### Useful resources
+
+1. [MegaRAID SAS Software User Guide (PDF download)](https://docs.broadcom.com/docs/12353236)
+2. [MegaCLI commands cheatsheet](https://www.broadcom.com/support/knowledgebase/1211161496959/megacli-commands)
\ No newline at end of file
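Note on `megacli_bbu_cycle_count.md`: the thresholds stated in the guide (warning above 100 cycles, critical above 500) can be written as a small check. This is only an illustrative sketch of those two numbers, not Netdata's actual alert definition; the function name `classify_cycle_count` is hypothetical.

```shell
# Hypothetical helper: maps a BBU cycle count to the alert state described
# in the guide (warning when > 100, critical when > 500).
classify_cycle_count() {
  count="$1"
  if [ "$count" -gt 500 ]; then
    echo "critical"
  elif [ "$count" -gt 100 ]; then
    echo "warning"
  else
    echo "clear"
  fi
}
```

Possible usage: read the cycle count from the `-GetBbuStatus` output and pass it in, for example `classify_cycle_count 137`.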
diff --git a/health/guides/megacli/megacli_bbu_relative_charge.md b/health/guides/megacli/megacli_bbu_relative_charge.md
new file mode 100644
index 00000000..74a03a3b
--- /dev/null
+++ b/health/guides/megacli/megacli_bbu_relative_charge.md
@@ -0,0 +1,36 @@
+### Understand the alert
+
+This alert is related to the disk array controller's battery backup unit (BBU) relative state of charge. If you receive this alert, it means that the battery backup unit's charge is low, which may affect your RAID controller's performance or lead to data loss in case of a power failure.
+
+### What does low BBU relative charge mean?
+
+A low BBU relative charge indicates that the battery's remaining charge is low compared to its full charge capacity. The relative state of charge is the remaining capacity expressed as a percentage of the full charge capacity, whereas the absolute state of charge is measured against the design capacity. If the relative charge is constantly low, it may suggest that the battery is worn out and needs replacement.
+
+### Troubleshoot the alert
+
+1. Gather information about your battery units for all controllers:
+
+ ```
+ sudo megacli -AdpBbuCmd -GetBbuStatus -aALL
+ ```
+
+ This command will provide you with detailed information about the BBU status for each controller.
+
+2. Perform a manual battery calibration (learning cycle) on the battery with a low relative charge:
+
+ ```
+ sudo megacli -AdpBbuCmd -BbuLearn -aX
+ ```
+
+ Replace `X` with the controller's number. Please consult the [MegaRAID SAS Software User Guide](https://docs.broadcom.com/docs/12353236), section 7.14, before performing this action.
+
+ A learning cycle discharges and recharges the battery, which can help recalibrate the battery and improve its relative state of charge. However, it may temporarily disable the write cache during this process.
+
+3. Monitor the BBU relative charge after the learning cycle. If the relative charge remains low, consider replacing the battery in question. Consult your hardware vendor's documentation for guidance on replacing the BBU.
+
+### Useful resources
+
+1. [MegaRAID SAS Software User Guide [PDF download]](https://docs.broadcom.com/docs/12353236)
+2. [MegaCLI commands cheatsheet](https://www.broadcom.com/support/knowledgebase/1211161496959/megacli-commands)
+
+**Note**: Data is priceless. Before you perform any action, make sure that you have taken any necessary backup steps. Netdata is not liable for any loss or corruption of any data, database, or software.
\ No newline at end of file
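Note on steps 1 and 3 of `megacli_bbu_relative_charge.md`: to monitor the charge before and after a learning cycle, the relevant value can be pulled out of the status output. The filter below is a minimal sketch under two assumptions: that `-GetBbuStatus` prints `BBU status for Adapter: N` and `Relative State of Charge: NN %` lines, and that you supply your own threshold (the guide does not state the alert's). The function name `low_bbu_charge` is hypothetical.

```shell
# Hypothetical helper: reads `megacli -AdpBbuCmd -GetBbuStatus -aALL` output
# on stdin and prints adapters whose relative state of charge is below the
# threshold passed as the first argument (a percentage chosen by the caller).
low_bbu_charge() {
  threshold="$1"
  awk -v threshold="$threshold" '
    # Assumed header line: "BBU status for Adapter: 0"
    /^BBU status for Adapter:/ { adapter = $NF }
    # Assumed value line: "Relative State of Charge: 45 %"
    /Relative State of Charge:/ {
      split($0, parts, ":")
      charge = parts[2] + 0        # numeric prefix of " 45 %" is 45
      if (charge < threshold)
        print "adapter=" adapter " charge=" charge
    }
  '
}
```

Possible usage: `sudo megacli -AdpBbuCmd -GetBbuStatus -aALL | low_bbu_charge 80`, where `80` is only an example value.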
diff --git a/health/guides/megacli/megacli_pd_media_errors.md b/health/guides/megacli/megacli_pd_media_errors.md
new file mode 100644
index 00000000..8988d09e
--- /dev/null
+++ b/health/guides/megacli/megacli_pd_media_errors.md
@@ -0,0 +1,30 @@
+### Understand the alert
+
+The `megacli_pd_media_errors` alert is triggered when physical disks attached to the MegaRAID controller report media errors. A media error is an event where a storage disk was unable to perform the requested I/O operation because of problems accessing the stored data. It means that the RAID adapter found a bad sector on a specific disk during a patrol read or a rebuild operation. This does not mean imminent disk failure, but it is a warning, and you should monitor the affected disk.
+
+### Troubleshoot the alert
+
+**Data is priceless. Before you perform any action, make sure that you have taken any necessary backup steps. Netdata is not liable for any loss or corruption of any data, database, or software.**
+
+1. Gather more information about your virtual drives on all adapters:
+
+ ```
+ megacli -LDInfo -Lall -aALL
+ ```
+
+2. Check which virtual drive is reporting media errors and in which adapter.
+
+3. Check the bad block table for the virtual drive in question:
+
+ ```
+ megacli -GetBbtEntries -LX -aY # X: virtual drive, Y: the adapter
+ ```
+
+4. Consult the MegaRAID SAS Software User Guide's section 7.17.11[^1] to recheck these block entries. **This operation removes any data stored on the physical drives. Back up the good data on the drives before making any changes to the configuration.**
+
+### Useful resources
+
+1. [MegaRAID SAS Software User Guide [PDF download]](https://docs.broadcom.com/docs/12353236)
+2. [MegaCLI command cheatsheet](https://www.broadcom.com/support/knowledgebase/1211161496959/megacli-commands)
+
+[^1]: https://docs.broadcom.com/docs/12353236
\ No newline at end of file
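Note on `megacli_pd_media_errors.md`: the alert concerns physical disks, so it may help to list per-disk counters in addition to the virtual drive view. The sketch below assumes the `megacli -PDList -aALL` command, which the guide itself does not use, and assumes its output contains `Adapter #N`, `Slot Number: N`, and `Media Error Count: N` lines. The function name `drives_with_media_errors` is hypothetical.

```shell
# Hypothetical helper: reads `megacli -PDList -aALL` output on stdin and
# prints every physical disk whose media error counter is above zero.
# The matched line layout is an assumption about MegaCLI's output.
drives_with_media_errors() {
  awk '
    # Assumed adapter header: "Adapter #0"
    /^Adapter #[0-9]+/ { adapter = substr($2, 2) }
    # Assumed slot line: "Slot Number: 3"
    /^Slot Number:/ { slot = $3 }
    # Assumed counter line: "Media Error Count: 12"
    /^Media Error Count:/ {
      if ($4 + 0 > 0)
        print "adapter=" adapter " slot=" slot " media_errors=" $4
    }
  '
}
```

Possible usage: `megacli -PDList -aALL | drives_with_media_errors`.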
diff --git a/health/guides/megacli/megacli_pd_predictive_failures.md b/health/guides/megacli/megacli_pd_predictive_failures.md
new file mode 100644
index 00000000..1aa7b0d2
--- /dev/null
+++ b/health/guides/megacli/megacli_pd_predictive_failures.md
@@ -0,0 +1,29 @@
+### Understand the alert
+
+This alert indicates that one or more physical disks attached to the MegaRAID controller are reporting predictive failures. A predictive failure is a warning that a hard disk may fail in the near future, even if it is still working normally. The failure prediction relies on the Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T.) built into the disk drive.
+
+### Troubleshoot the alert
+
+**Make sure you have taken necessary backup steps before performing any action. Netdata is not liable for any loss or corruption of data, databases, or software.**
+
+1. Identify the problematic drives:
+
+ Use the following command to gather information about your virtual drives in all adapters:
+
+ ```
+ megacli -LDInfo -Lall -aALL
+ ```
+
+2. Determine which virtual drive and adapter are reporting predictive failures.
+
+3. Consult the MegaRAID SAS Software User Guide [1]:
+
+ 1. Refer to Section 2.1.16 to check for issues with your drives.
+ 2. Refer to Section 7.18 to perform any appropriate actions on drives. Focus on Sections 7.18.2, 7.18.6, 7.18.7, 7.18.8, 7.18.11, and 7.18.14.
+
+4. Consider replacing the problematic disk(s) to prevent imminent failures and potential data loss.
+
+### Useful resources
+
+1. [MegaRAID SAS Software User Guide (PDF download)](https://docs.broadcom.com/docs/12353236)
+2. [MegaCLI commands cheatsheet](https://www.broadcom.com/support/knowledgebase/1211161496959/megacli-commands)
\ No newline at end of file
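Note on step 1 of `megacli_pd_predictive_failures.md`: a scripted check can flag the affected disks and signal the result through its exit status, which suits cron jobs or wrapper scripts. This is a sketch under assumptions: it reads `megacli -PDList -aALL` output (a command the guide does not use) and expects `Slot Number: N` and `Predictive Failure Count: N` lines. The function name `check_predictive_failures` is hypothetical.

```shell
# Hypothetical helper: reads `megacli -PDList -aALL` output on stdin, prints
# disks with a non-zero predictive failure counter, and returns 1 if any
# were found (0 otherwise). The line layout is an assumed MegaCLI format.
check_predictive_failures() {
  awk '
    # Assumed slot line: "Slot Number: 3"
    /^Slot Number:/ { slot = $3 }
    # Assumed counter line: "Predictive Failure Count: 2"
    /^Predictive Failure Count:/ {
      if ($4 + 0 > 0) {
        print "slot=" slot " predictive_failures=" $4
        found = 1
      }
    }
    # Non-zero exit status signals that at least one disk needs attention.
    END { exit (found ? 1 : 0) }
  '
}
```

Possible usage: `megacli -PDList -aALL | check_predictive_failures || echo "replace flagged disks"`.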