summaryrefslogtreecommitdiffstats
path: root/Documentation/ABI/testing/sysfs-mce
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--Documentation/ABI/testing/sysfs-mce97
1 files changed, 97 insertions, 0 deletions
diff --git a/Documentation/ABI/testing/sysfs-mce b/Documentation/ABI/testing/sysfs-mce
new file mode 100644
index 000000000..83172f50e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-mce
@@ -0,0 +1,97 @@
+What: /sys/devices/system/machinecheck/machinecheckX/
+Contact: Andi Kleen <ak@linux.intel.com>
+Date: Feb, 2007
+Description:
+ (X = CPU number)
+
+ Machine checks report internal hardware error conditions
+ detected by the CPU. Uncorrected errors typically cause a
+ machine check (often with panic), corrected ones cause a
+ machine check log entry.
+
+ For more details about the x86 machine check architecture
+ see the Intel and AMD architecture manuals from their
+ developer websites.
+
+ For more details about the architecture
+ see http://one.firstfloor.org/~andi/mce.pdf
+
+ Each CPU has its own directory.
+
+What: /sys/devices/system/machinecheck/machinecheckX/bank<Y>
+Contact: Andi Kleen <ak@linux.intel.com>
+Date: Feb, 2007
+Description:
+ (Y bank number)
+
+ 64bit Hex bitmask enabling/disabling specific subevents for
+ bank Y.
+
+ When a bit in the bitmask is zero then the respective
+ subevent will not be reported.
+
+ By default all events are enabled.
+
+ Note that BIOS maintain another mask to disable specific events
+ per bank. This is not visible here
+
+What: /sys/devices/system/machinecheck/machinecheckX/check_interval
+Contact: Andi Kleen <ak@linux.intel.com>
+Date: Feb, 2007
+Description:
+ The entries appear for each CPU, but they are truly shared
+ between all CPUs.
+
+ How often to poll for corrected machine check errors, in
+ seconds (Note output is hexadecimal). Default 5 minutes.
+ When the poller finds MCEs it triggers an exponential speedup
+ (poll more often) on the polling interval. When the poller
+ stops finding MCEs, it triggers an exponential backoff
+ (poll less often) on the polling interval. The check_interval
+ variable is both the initial and maximum polling interval.
+ 0 means no polling for corrected machine check errors
+ (but some corrected errors might be still reported
+ in other ways)
+
+What: /sys/devices/system/machinecheck/machinecheckX/trigger
+Contact: Andi Kleen <ak@linux.intel.com>
+Date: Feb, 2007
+Description:
+ The entries appear for each CPU, but they are truly shared
+ between all CPUs.
+
+ Program to run when a machine check event is detected.
+ This is an alternative to running mcelog regularly from cron
+ and allows to detect events faster.
+
+What: /sys/devices/system/machinecheck/machinecheckX/monarch_timeout
+Contact: Andi Kleen <ak@linux.intel.com>
+Date: Feb, 2007
+Description:
+ How long to wait for the other CPUs to machine check too on a
+ exception. 0 to disable waiting for other CPUs.
+
+ Unit: us
+
+What: /sys/devices/system/machinecheck/machinecheckX/ignore_ce
+Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
+Date: Jun 2009
+Description:
+ Disables polling and CMCI for corrected errors.
+ All corrected events are not cleared and kept in bank MSRs.
+
+What: /sys/devices/system/machinecheck/machinecheckX/dont_log_ce
+Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
+Date: Jun 2009
+Description:
+ Disables logging for corrected errors.
+ All reported corrected errors will be cleared silently.
+
+ This option will be useful if you never care about corrected
+ errors.
+
+What: /sys/devices/system/machinecheck/machinecheckX/cmci_disabled
+Contact: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
+Date: Jun 2009
+Description:
+ Disables the CMCI feature.