summaryrefslogtreecommitdiffstats
path: root/health/guides/ipmi/ipmi_sensors_states.md
diff options
context:
space:
mode:
Diffstat (limited to 'health/guides/ipmi/ipmi_sensors_states.md')
-rw-r--r--health/guides/ipmi/ipmi_sensors_states.md41
1 files changed, 41 insertions, 0 deletions
diff --git a/health/guides/ipmi/ipmi_sensors_states.md b/health/guides/ipmi/ipmi_sensors_states.md
new file mode 100644
index 00000000..e7521a30
--- /dev/null
+++ b/health/guides/ipmi/ipmi_sensors_states.md
@@ -0,0 +1,41 @@
+### Understand the alert
+
+This alert is related to the IPMI (Intelligent Platform Management Interface) sensors in your system. IPMI is a hardware management interface used for monitoring server health and collecting information on various hardware components. The alert is triggered when any of the IPMI sensors detect conditions that are outside the normal operating range, and are in a warning or critical state.
+
+### Troubleshoot the alert
+
+1. Check IPMI sensor status:
+
+ To check the status of IPMI sensors, you can use the `ipmi-sensors` command with appropriate flags. For instance:
+
+ ```
+ sudo ipmi-sensors --output-sensor-state
+ ```
+
+ This command will provide you with detailed information on the current state of each sensor, allowing you to determine which ones are in a warning or critical state.
+
+2. Analyze sensor data:
+
+ Based on the output obtained in the previous step, identify the sensors that are causing the alert. Take note of their current values and thresholds.
+
+ To obtain more detailed information, you can also use the `-v` (verbose) flag with the command:
+
+ ```
+ sudo ipmi-sensors -v --output-sensor-state
+ ```
+
+3. Investigate the cause of the issue:
+
+ Once you have identified the sensors in a non-nominal state, start investigating the root cause of the issue. This may involve checking the hardware components, system logs, or contacting your hardware vendor for additional support.
+
+4. Resolve the issue:
+
+ Based on your investigation, take the necessary steps to resolve the issue. This may include replacing faulty hardware, addressing configuration errors, or applying firmware updates.
+
+5. Verify resolution:
+
+ After addressing the issue, use the `ipmi-sensors` command to check the status of the affected sensors. Ensure that they have returned to the nominal state, and no additional warning or critical conditions are being reported.
+
+### Useful resources
+
+1. ["ipmi-sensors" manual page](https://www.gnu.org/software/freeipmi/manpages/man8/ipmi-sensors.8.html)