diff options
Diffstat (limited to 'Documentation/watchdog/hpwdt.rst')
-rw-r--r-- | Documentation/watchdog/hpwdt.rst | 77 |
1 files changed, 77 insertions, 0 deletions
diff --git a/Documentation/watchdog/hpwdt.rst b/Documentation/watchdog/hpwdt.rst new file mode 100644 index 000000000..c824cd7f6 --- /dev/null +++ b/Documentation/watchdog/hpwdt.rst @@ -0,0 +1,77 @@ +=========================== +HPE iLO NMI Watchdog Driver +=========================== + +for iLO based ProLiant Servers +============================== + +Last reviewed: 08/20/2018 + + + The HPE iLO NMI Watchdog driver is a kernel module that provides basic + watchdog functionality and handler for the iLO "Generate NMI to System" + virtual button. + + All references to iLO in this document imply it also works on iLO2 and all + subsequent generations. + + Watchdog functionality is enabled like any other common watchdog driver. That + is, an application needs to be started that kicks off the watchdog timer. A + basic application exists in tools/testing/selftests/watchdog/ named + watchdog-test.c. Simply compile the C file and kick it off. If the system + gets into a bad state and hangs, the HPE ProLiant iLO timer register will + not be updated in a timely fashion and a hardware system reset (also known as + an Automatic Server Recovery (ASR)) event will occur. + + The hpwdt driver also has the following module parameters: + + ============ ================================================================ + soft_margin allows the user to set the watchdog timer value. + Default value is 30 seconds. + timeout an alias of soft_margin. + pretimeout allows the user to set the watchdog pretimeout value. + This is the number of seconds before timeout when an + NMI is delivered to the system. Setting the value to + zero disables the pretimeout NMI. + Default value is 9 seconds. + nowayout basic watchdog parameter that does not allow the timer to + be restarted or an impending ASR to be escaped. + Default value is set when compiling the kernel. If it is set + to "Y", then there is no way of disabling the watchdog once + it has been started. + kdumptimeout Minimum timeout in seconds to apply upon receipt of an NMI + before calling panic. (-1) disables the watchdog. When value + is > 0, the timer is reprogrammed with the greater of + value or current timeout value. + ============ ================================================================ + + NOTE: + More information about watchdog drivers in general, including the ioctl + interface to /dev/watchdog can be found in + Documentation/watchdog/watchdog-api.rst and Documentation/IPMI.txt. + + Due to limitations in the iLO hardware, the NMI pretimeout if enabled, + can only be set to 9 seconds. Attempts to set pretimeout to other + non-zero values will be rounded, possibly to zero. Users should verify + the pretimeout value after attempting to set pretimeout or timeout. + + Upon receipt of an NMI from the iLO, the hpwdt driver will initiate a + panic. This is to allow for a crash dump to be collected. It is incumbent + upon the user to have properly configured the system for kdump. + + The default Linux kernel behavior upon panic is to print a kernel tombstone + and loop forever. This is generally not what a watchdog user wants. + + For those wishing to learn more please see: + Documentation/admin-guide/kdump/kdump.rst + Documentation/admin-guide/kernel-parameters.txt (panic=) + Your Linux Distribution specific documentation. + + If the hpwdt does not receive the NMI associated with an expiring timer, + the iLO will proceed to reset the system at timeout if the timer hasn't + been updated. + +-- + + The HPE iLO NMI Watchdog Driver and documentation were originally developed + by Tom Mingarelli. |