=============
Influx Module 
=============

The influx module continuously collects time series data and sends it to an
InfluxDB database.

The influx module was introduced in the 13.x *Mimic* release.

--------
Enabling 
--------

To enable the module, use the following command:

::

    ceph mgr module enable influx

To disable the module later, use the corresponding *disable* command:

::

    ceph mgr module disable influx

-------------
Configuration 
-------------

For the influx module to send statistics to an InfluxDB server, you must
configure the server's address and authentication credentials.

Set configuration values using the following command:

::

    ceph config set mgr mgr/influx/<key> <value>


The most important settings are ``hostname``, ``username`` and ``password``.  
For example, a typical configuration might look like this:

::

    ceph config set mgr mgr/influx/hostname influx.mydomain.com
    ceph config set mgr mgr/influx/username admin123
    ceph config set mgr mgr/influx/password p4ssw0rd
    
Additional optional configuration settings are:

:interval: Time between reports to InfluxDB, in seconds. Default is 30.
:database: InfluxDB database name. Default is "ceph". Either create this database and grant write privileges to the configured username, or give the configured username admin privileges so that the database can be created.
:port: InfluxDB server port. Default is 8086.
:ssl: Use an HTTPS connection to the InfluxDB server. Use "true" or "false". Default is false.
:verify_ssl: Verify the HTTPS certificate of the InfluxDB server. Use "true" or "false". Default is true.
:threads: Number of worker threads spawned for sending data to InfluxDB. Default is 5.
:batch_size: Number of data points per batch sent to InfluxDB. Default is 5000.
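
As an illustration, the optional settings are set in the same way as the
required ones. The values below are arbitrary examples, not recommendations:

::

    ceph config set mgr mgr/influx/interval 60
    ceph config set mgr mgr/influx/port 8086
    ceph config set mgr mgr/influx/ssl true
    ceph config set mgr mgr/influx/verify_ssl true

If the database does not exist yet, it could be created on the InfluxDB side
along the following lines. This is only a sketch: it assumes an InfluxDB 1.x
server and its ``influx`` command-line client, run by a user with admin
privileges, and it reuses the example username ``admin123`` from above.
Connection and authentication options for your server may also be needed:

::

    influx -execute 'CREATE DATABASE "ceph"'
    influx -execute 'GRANT WRITE ON "ceph" TO "admin123"'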

---------
Debugging 
---------

By default, the module writes a few debugging statements and its error
statements to the log files. Users can add more if necessary.
To make use of the debugging option in the module:

- Add the following to the ``ceph.conf`` file::

    [mgr]
        debug_mgr = 20

- Run the command ``ceph influx self-test``.
- Check the log files. Filtering them for *mgr[influx]* can make the relevant entries easier to find.

--------------------
Interesting counters
--------------------

The following tables describe a subset of the values output by
this module.

^^^^^
Pools
^^^^^

+---------------+-----------------------------------------------------+
|Counter        | Description                                         |
+===============+=====================================================+
|stored         | Bytes stored in the pool not including copies       |
+---------------+-----------------------------------------------------+
|max_avail      | Max available number of bytes in the pool           |
+---------------+-----------------------------------------------------+
|objects        | Number of objects in the pool                       |
+---------------+-----------------------------------------------------+
|wr_bytes       | Number of bytes written in the pool                 |
+---------------+-----------------------------------------------------+
|dirty          | Number of bytes dirty in the pool                   |
+---------------+-----------------------------------------------------+
|rd_bytes       | Number of bytes read in the pool                    |
+---------------+-----------------------------------------------------+
|stored_raw     | Bytes used in pool including copies made            |
+---------------+-----------------------------------------------------+

^^^^
OSDs
^^^^

+------------+------------------------------------+
|Counter     | Description                        |
+============+====================================+
|op_w        | Client write operations            |
+------------+------------------------------------+
|op_in_bytes | Client operations total write size |
+------------+------------------------------------+
|op_r        | Client read operations             |
+------------+------------------------------------+
|op_out_bytes| Client operations total read size  |
+------------+------------------------------------+


+------------------------+--------------------------------------------------------------------------+
|Counter                 | Description                                                              |
+========================+==========================================================================+
|op_wip                  | Replication operations currently being processed (primary)               |
+------------------------+--------------------------------------------------------------------------+
|op_latency              | Latency of client operations (including queue time)                      |
+------------------------+--------------------------------------------------------------------------+
|op_process_latency      | Latency of client operations (excluding queue time)                      |
+------------------------+--------------------------------------------------------------------------+
|op_prepare_latency      | Latency of client operations (excluding queue time and wait for finished)|
+------------------------+--------------------------------------------------------------------------+
|op_r_latency            | Latency of read operation (including queue time)                         |
+------------------------+--------------------------------------------------------------------------+
|op_r_process_latency    | Latency of read operation (excluding queue time)                         |
+------------------------+--------------------------------------------------------------------------+
|op_w_in_bytes           | Client data written                                                      |
+------------------------+--------------------------------------------------------------------------+
|op_w_latency            | Latency of write operation (including queue time)                        |
+------------------------+--------------------------------------------------------------------------+
|op_w_process_latency    | Latency of write operation (excluding queue time)                        |
+------------------------+--------------------------------------------------------------------------+
|op_w_prepare_latency    | Latency of write operations (excluding queue time and wait for finished) |
+------------------------+--------------------------------------------------------------------------+
|op_rw                   | Client read-modify-write operations                                      |
+------------------------+--------------------------------------------------------------------------+
|op_rw_in_bytes          | Client read-modify-write operations write in                             |
+------------------------+--------------------------------------------------------------------------+
|op_rw_out_bytes         | Client read-modify-write operations read out                             |
+------------------------+--------------------------------------------------------------------------+
|op_rw_latency           | Latency of read-modify-write operation (including queue time)            |
+------------------------+--------------------------------------------------------------------------+
|op_rw_process_latency   | Latency of read-modify-write operation (excluding queue time)            |
+------------------------+--------------------------------------------------------------------------+
|op_rw_prepare_latency   | Latency of read-modify-write operations (excluding queue time            |
|                        | and wait for finished)                                                   |
+------------------------+--------------------------------------------------------------------------+
|op_before_queue_op_lat  | Latency of IO before calling queue (before really queue into ShardedOpWq)|
+------------------------+--------------------------------------------------------------------------+
|op_before_dequeue_op_lat| Latency of IO before calling dequeue_op(already dequeued and get PG lock)|
+------------------------+--------------------------------------------------------------------------+

Latency counters are measured in microseconds unless otherwise specified in the description.