summaryrefslogtreecommitdiffstats
path: root/doc/mgr/insights.rst
diff options
context:
space:
mode:
Diffstat (limited to 'doc/mgr/insights.rst')
-rw-r--r--doc/mgr/insights.rst52
1 files changed, 52 insertions, 0 deletions
diff --git a/doc/mgr/insights.rst b/doc/mgr/insights.rst
new file mode 100644
index 00000000..37b8903f
--- /dev/null
+++ b/doc/mgr/insights.rst
@@ -0,0 +1,52 @@
+Insights Module
+===============
+
+The insights module collects and exposes system information to the Insights Core
+data analysis framework. It is intended to replace explicit interrogation of
+Ceph CLIs and daemon admin sockets, reducing the API surface that Insights
+depends on. The insights reports contains the following:
+
+* **Health reports**. In addition to reporting the current health of the
+ cluster, the insights module reports a summary of the last 24 hours of health
+ checks. This feature is important for catching cluster health issues that are
+ transient and may not be present at the moment the report is generated. Health
+ checks are deduplicated to avoid unbounded data growth.
+
+* **Crash reports**. A summary of any daemon crashes in the past 24 hours is
+ included in the insights report. Crashes are reported as the number of crashes
+ per daemon type (e.g. `ceph-osd`) within the time window. Full details of a
+ crash may be obtained using the `crash module`_.
+
+* Software version, storage utilization, cluster maps, placement group summary,
+ monitor status, cluster configuration, and OSD metadata.
+
+Enabling
+--------
+
+The *insights* module is enabled with::
+
+ ceph mgr module enable insights
+
+Commands
+--------
+::
+
+ ceph insights
+
+Generate the full report.
+
+::
+
+ ceph insights prune-health <hours>
+
+Remove historical health data older than <hours>. Passing `0` for <hours> will
+clear all health data.
+
+This command is useful for cleaning the health history before automated nightly
+reports are generated, which may contain spurious health checks accumulated
+while performing system maintenance, or other health checks that have been
+resolved. There is no need to prune health data to reclaim storage space;
+garbage collection is performed regularly to remove old health data from
+persistent storage.
+
+.. _crash module: ../crash