.. _stats:

**********
Statistics
**********

Statistics Overview
===================

Both Kea DHCP servers support statistics gathering. A working DHCP
server encounters various events that can cause certain statistics to be
collected. For example, a DHCPv4 server may receive a packet
(the ``pkt4-received`` statistic increases by one) that after parsing is
identified as a DHCPDISCOVER (``pkt4-discover-received``). The server
processes it and decides to send a DHCPOFFER representing its answer
(the ``pkt4-offer-sent`` and ``pkt4-sent`` statistics increase by one). Such
events happen frequently, so it is not uncommon for the statistics to have
values in the high thousands. They can serve as an easy and powerful
tool for observing a server's and a network's health. For example, if
the ``pkt4-received`` statistic stops growing, it means that the clients'
packets are not reaching the server.

There are four types of statistics:

-  *integer* - this is the most common type. It is implemented as a
   64-bit integer (``int64_t`` in C++), so it can hold any value between
   -2^63 and 2^63-1.

-  *floating point* - this type is intended to store floating-point
   values. It is implemented as a C++ ``double`` type.

-  *duration* - this type is intended for recording time periods. It
   uses the ``boost::posix_time::time_duration`` type, which stores hours,
   minutes, seconds, and microseconds.

-  *string* - this type is intended for recording statistics in text
   form. It uses the C++ std::string type.

During normal operation, the DHCPv4 and DHCPv6 servers gather
statistics. For a list of DHCPv4 and DHCPv6 statistics, see
:ref:`dhcp4-stats` and :ref:`dhcp6-stats`, respectively.

To extract data from the statistics module, the control channel can be
used. See :ref:`ctrl-channel` for details. It is possible to
retrieve a single statistic or all statistics, reset the statistics (i.e.
set them to a neutral value, typically zero), or even completely remove a
single statistic or all statistics. See the section :ref:`command-stats`
for a list of statistics-oriented commands.

Statistics can be used by external tools to monitor Kea. One example of such a tool is Stork.
See :ref:`stork` for details on how to use it and other data sources to retrieve statistics periodically
to get better insight into Kea's health and operational status.

.. _stats-lifecycle:

Statistics Lifecycle
====================

All of the statistics supported by Kea's servers are initialized when the
servers start and are returned in response to commands such as
``statistic-get-all``. The runtime statistics concerning DHCP packets
processed are initially set to 0 and are reset when the server
restarts.

Per-subnet statistics are recalculated when reconfiguration takes place.

In general, once a statistic is initialized it is held in the manager until
explicitly removed, via ``statistic-remove`` or ``statistic-remove-all``,
or when the server is shut down.

Removing a statistic that is updated frequently makes little sense, as
it will be re-added when the server code next records that statistic.
The ``statistic-remove`` and ``statistic-remove-all`` commands are
intended to remove statistics that are not expected to be observed in
the near future. For example, a misconfigured device in a network may
cause clients to report duplicate addresses, so the server will report
increasing values of ``pkt4-decline-received``. Once the problem is found
and the device is removed, the system administrator may want to remove
the ``pkt4-decline-received`` statistic so that it is no longer reported, until
and unless a duplicate address is again detected.

.. _command-stats:

Commands for Manipulating Statistics
====================================

There are several commands defined that can be used for accessing
(``-get``), resetting to zero or a neutral value (``-reset``), or removing a
statistic completely (``-remove``). The statistics time-based
limit (``-sample-age-set``) and size-based limit (``-sample-count-set``), which
control how long or how many samples of a given statistic are retained, can also
be changed.

The difference between ``-reset`` and ``-remove`` is somewhat subtle.
The ``-reset`` command sets the value of the statistic to zero or a neutral value,
so that after this operation, the statistic has a value of 0 (integer),
0.0 (float), 0h0m0s0us (duration), or "" (string).
A reset statistic is still returned when requested, with that neutral value.
``-remove`` removes a statistic completely, so the statistic is no longer
reported. However, the server code may add it back if there is a reason
to record it.
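
The distinction can be illustrated with a small toy model. This is only a
sketch of the behavior described above, not Kea's implementation; the
``ToyStats`` class and its method names are hypothetical.

.. code-block:: python

   from datetime import timedelta

   # Neutral values per statistic type, as listed in the text above.
   NEUTRAL = {
       "integer": 0,
       "float": 0.0,
       "duration": timedelta(0),  # stands in for 0h0m0s0us
       "string": "",
   }


   class ToyStats:
       """Toy in-memory model of reset versus remove (hypothetical)."""

       def __init__(self):
           self._stats = {}  # name -> (type, value)

       def record(self, name, kind, value):
           # Recording a value (re-)creates the statistic if needed.
           self._stats[name] = (kind, value)

       def get(self, name):
           # An unknown statistic yields an empty map, like statistic-get.
           if name not in self._stats:
               return {}
           return {name: self._stats[name][1]}

       def reset(self, name):
           # The statistic stays, but its value becomes the neutral one.
           kind, _ = self._stats[name]
           self._stats[name] = (kind, NEUTRAL[kind])

       def remove(self, name):
           # The statistic disappears until it is recorded again.
           del self._stats[name]

After ``reset``, ``get`` still returns the statistic with a value of 0;
after ``remove``, ``get`` returns an empty map until ``record`` is called
again for that name.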

.. note::

   The following sections describe commands that can be sent to the
   server; the examples are not fragments of a configuration file. For
   more information on sending commands to Kea, see
   :ref:`ctrl-channel`.

.. _command-statistic-get:

The ``statistic-get`` Command
-----------------------------

The ``statistic-get`` command retrieves a single statistic. It takes a
single-string parameter called ``name``, which specifies the statistic
name. An example command may look like this:

::

   {
       "command": "statistic-get",
       "arguments": {
           "name": "pkt4-received"
       }
   }

The server returns details of the requested statistic, with a result of
0 indicating success and the specified statistic as the value of the
``arguments`` parameter. If the requested statistic is not found, the
response contains an empty map, i.e. only ``{ }`` as the value of
``arguments``, but the status code still indicates success (0).

Here is an example response:

::

   {
       "command": "statistic-get",
       "arguments": {
           "pkt4-received": [ [ 125, "2019-07-30 10:11:19.498739" ], [ 100, "2019-07-30 10:11:19.498662" ] ]
       },
       "result": 0
   }
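
A client might build the command and pick the newest sample out of the
response roughly as follows. This is a minimal sketch: the function names
are hypothetical, transport over the control channel is left out, and the
timestamp format is inferred from the example response above.

.. code-block:: python

   import json
   from datetime import datetime

   # Timestamp layout inferred from the example response (an assumption).
   TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


   def build_statistic_get(name):
       """Return the statistic-get command as a JSON string."""
       return json.dumps({"command": "statistic-get",
                          "arguments": {"name": name}})


   def latest_sample(response, name):
       """Return (value, timestamp) of the newest sample, or None.

       None is returned when the statistic is absent, which the server
       reports as an empty ``arguments`` map with a result of 0.
       """
       if response.get("result") != 0:
           raise RuntimeError(response.get("text", "statistic-get failed"))
       samples = (response.get("arguments") or {}).get(name)
       if not samples:
           return None
       parsed = [(value, datetime.strptime(stamp, TIMESTAMP_FORMAT))
                 for value, stamp in samples]
       # Choose by timestamp rather than relying on list order.
       return max(parsed, key=lambda sample: sample[1])

Applied to the example response, ``latest_sample`` returns the sample
with the value 125, since it carries the later timestamp.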

.. _command-statistic-reset:

The ``statistic-reset`` Command
-------------------------------

The ``statistic-reset`` command sets the specified statistic to its
neutral value: 0 for integer, 0.0 for float, 0h0m0s0us for time
duration, and "" for string type. It takes a single-string parameter
called ``name``, which specifies the statistic name. An example command
may look like this:

::

   {
       "command": "statistic-reset",
       "arguments": {
           "name": "pkt4-received"
       }
   }

If the specific statistic is found and the reset is successful, the
server responds with a status of 0, indicating success, and an empty
parameters field. If an error is encountered (e.g. the requested
statistic was not found), the server returns a status code of 1 (error)
and the text field contains the error description.
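
Client code can turn the status code into an exception. The following is
a hedged sketch; ``KeaCommandError`` and ``check_response`` are
hypothetical names, and the response is assumed to be already decoded
from JSON into a dictionary with ``result`` and ``text`` keys.

.. code-block:: python

   class KeaCommandError(Exception):
       """Raised when a response carries a non-zero result (hypothetical)."""


   def check_response(response):
       """Return the arguments map on success, raise on any other result."""
       result = response.get("result")
       if result != 0:
           text = response.get("text", "no description given")
           raise KeaCommandError(f"result {result}: {text}")
       # A missing or null arguments entry is treated as an empty map.
       return response.get("arguments") or {}

The same check applies to the other commands in this section, since they
report errors in the same way.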

.. _command-statistic-remove:

The ``statistic-remove`` Command
--------------------------------

The ``statistic-remove`` command deletes a single statistic. It
takes a single-string parameter called ``name``, which specifies the
statistic name. An example command may look like this:

::

   {
       "command": "statistic-remove",
       "arguments": {
           "name": "pkt4-received"
       }
   }

If the specific statistic is found and its removal is successful, the
server responds with a status of 0, indicating success, and an empty
parameters field. If an error is encountered (e.g. the requested
statistic was not found), the server returns a status code of 1 (error)
and the text field contains the error description.

.. _command-statistic-get-all:

The ``statistic-get-all`` Command
---------------------------------

The ``statistic-get-all`` command retrieves all statistics recorded. An
example command may look like this:

::

   {
       "command": "statistic-get-all",
       "arguments": { }
   }

The server responds with details of all recorded statistics, with a
result set to 0 to indicate that it iterated over all statistics (even
when the total number of statistics is zero).

Here is an example response returning all collected statistics:

::

   {
       "command": "statistic-get-all",
       "arguments": {
           "cumulative-assigned-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487569"
               ]
           ],
           "declined-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487555"
               ]
           ],
           "pkt4-ack-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455233"
               ]
           ],
           "pkt4-ack-sent": [
               [
                   0,
                   "2022-02-11 17:54:17.455256"
               ]
           ],
           "pkt4-decline-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455259"
               ]
           ],
           "pkt4-discover-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455263"
               ]
           ],
           "pkt4-inform-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455265"
               ]
           ],
           "pkt4-nak-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455269"
               ]
           ],
           "pkt4-nak-sent": [
               [
                   0,
                   "2022-02-11 17:54:17.455271"
               ]
           ],
           "pkt4-offer-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455274"
               ]
           ],
           "pkt4-offer-sent": [
               [
                   0,
                   "2022-02-11 17:54:17.455277"
               ]
           ],
           "pkt4-parse-failed": [
               [
                   0,
                   "2022-02-11 17:54:17.455280"
               ]
           ],
           "pkt4-receive-drop": [
               [
                   0,
                   "2022-02-11 17:54:17.455284"
               ]
           ],
           "pkt4-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455287"
               ]
           ],
           "pkt4-release-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455290"
               ]
           ],
           "pkt4-request-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455293"
               ]
           ],
           "pkt4-sent": [
               [
                   0,
                   "2022-02-11 17:54:17.455296"
               ]
           ],
           "pkt4-unknown-received": [
               [
                   0,
                   "2022-02-11 17:54:17.455299"
               ]
           ],
           "reclaimed-declined-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487559"
               ]
           ],
           "reclaimed-leases": [
               [
                   0,
                   "2022-02-11 17:54:17.487564"
               ]
           ],
           "subnet[1].assigned-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487579"
               ]
           ],
           "subnet[1].cumulative-assigned-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487528"
               ]
           ],
           "subnet[1].declined-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487585"
               ]
           ],
           "subnet[1].reclaimed-declined-addresses": [
               [
                   0,
                   "2022-02-11 17:54:17.487595"
               ]
           ],
           "subnet[1].reclaimed-leases": [
               [
                   0,
                   "2022-02-11 17:54:17.487604"
               ]
           ],
           "subnet[1].total-addresses": [
               [
                   200,
                   "2022-02-11 17:54:17.487512"
               ]
           ],
           "v4-allocation-fail": [
               [
                   0,
                   "2022-02-11 17:54:17.455302"
               ]
           ],
           "v4-allocation-fail-classes": [
               [
                   0,
                   "2022-02-11 17:54:17.455306"
               ]
           ],
           "v4-allocation-fail-no-pools": [
               [
                   0,
                   "2022-02-11 17:54:17.455310"
               ]
           ],
           "v4-allocation-fail-shared-network": [
               [
                   0,
                   "2022-02-11 17:54:17.455319"
               ]
           ],
           "v4-allocation-fail-subnet": [
               [
                   0,
                   "2022-02-11 17:54:17.455323"
               ]
           ]
       },
       "result": 0
   }
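
Per-subnet statistics follow the ``subnet[id].name`` pattern seen in the
response above, so they can be grouped by subnet ID. The sketch below
derives address utilization per subnet; the helper names are
hypothetical, and it takes the ``arguments`` map of a decoded response as
input.

.. code-block:: python

   import re
   from collections import defaultdict

   # Matches names such as "subnet[1].assigned-addresses".
   SUBNET_STAT = re.compile(r"^subnet\[(\d+)\]\.(.+)$")


   def per_subnet(arguments):
       """Group the newest value of each per-subnet statistic by subnet ID."""
       subnets = defaultdict(dict)
       for name, samples in arguments.items():
           match = SUBNET_STAT.match(name)
           if match and samples:
               # Timestamps are fixed-width, so string order is time order.
               newest = max(samples, key=lambda sample: sample[1])
               subnets[int(match.group(1))][match.group(2)] = newest[0]
       return dict(subnets)


   def address_utilization(arguments):
       """Return {subnet_id: assigned / total}, skipping empty subnets."""
       utilization = {}
       for subnet_id, stats in per_subnet(arguments).items():
           total = stats.get("total-addresses", 0)
           if total > 0:
               utilization[subnet_id] = stats.get("assigned-addresses", 0) / total
       return utilization

Global statistics such as ``pkt4-received`` do not match the pattern and
are ignored by these helpers.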

.. _command-statistic-reset-all:

The ``statistic-reset-all`` Command
-----------------------------------

The ``statistic-reset-all`` command sets all statistics to their neutral
values: 0 for integer, 0.0 for float, 0h0m0s0us for time duration, and
"" for string type. An example command may look like this:

::

   {
       "command": "statistic-reset-all",
       "arguments": { }
   }

If the operation is successful, the server responds with a status of 0,
indicating success, and an empty parameters field. If an error is
encountered, the server returns a status code of 1 (error) and the text
field contains the error description.

.. _command-statistic-remove-all:

The ``statistic-remove-all`` Command
------------------------------------

The ``statistic-remove-all`` command attempts to delete all statistics. An
example command may look like this:

::

   {
       "command": "statistic-remove-all",
       "arguments": { }
   }

If the removal of all statistics is successful, the server responds with
a status of 0, indicating success, and an empty parameters field. If an
error is encountered, the server returns a status code of 1 (error) and
the text field contains the error description.

.. _command-statistic-sample-age-set:

The ``statistic-sample-age-set`` Command
----------------------------------------

The ``statistic-sample-age-set`` command sets a time-based limit
on samples for a given statistic. It takes two parameters: a string
called ``name``, which specifies the statistic name, and an integer value called
``duration``, which specifies the time limit for the given statistic in seconds.
An example command may look like this:

::

   {
       "command": "statistic-sample-age-set",
       "arguments": {
           "name": "pkt4-received",
           "duration": 1245
       }
   }

If the command is successful, the server responds with a status of
0, indicating success,
and an empty parameters field. If an error is encountered (e.g. the
requested statistic was not found), the server returns a status code
of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-age-set-all:

The ``statistic-sample-age-set-all`` Command
--------------------------------------------

The ``statistic-sample-age-set-all`` command sets time-based limits
on samples for all statistics. It takes a single-integer parameter
called ``duration``, which specifies the time limit for the statistic
in seconds. An example command may look like this:

::

   {
       "command": "statistic-sample-age-set-all",
       "arguments": {
           "duration": 1245
       }
   }

If the command is successful, the server responds with a status of
0, indicating success,
and an empty parameters field. If an error is encountered, the server returns
a status code of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-count-set:

The ``statistic-sample-count-set`` Command
------------------------------------------

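The ``statistic-sample-count-set`` command sets a size-based limit
on samples for a given statistic. It takes two parameters: a string
called ``name``, which specifies the statistic name, and an integer value called
``max-samples``, which specifies the maximum number of samples to keep for
the given statistic. An example command may look like this: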

::

   {
       "command": "statistic-sample-count-set",
       "arguments": {
           "name": "pkt4-received",
           "max-samples": 100
       }
   }

If the command is successful, the server responds with a status of
0, indicating success,
and an empty parameters field. If an error is encountered (e.g. the
requested statistic was not found), the server returns a status code
of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-count-set-all:

The ``statistic-sample-count-set-all`` Command
----------------------------------------------

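The ``statistic-sample-count-set-all`` command sets size-based limits
on samples for all statistics. It takes a single-integer parameter
called ``max-samples``, which specifies the maximum number of samples to
keep for each statistic. An example command may look like this: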

::

   {
       "command": "statistic-sample-count-set-all",
       "arguments": {
           "max-samples": 100
       }
   }

If the command is successful, the server responds with a status of
0, indicating success,
and an empty parameters field. If an error is encountered, the server returns
a status code of 1 (error) and the text field contains the error description.
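
The four sample-limit commands differ only in the command name and the
parameter name, so a client could build them with one helper. This is a
sketch under that observation; ``build_sample_limit`` is a hypothetical
name, and validation of the value's range is left to the server.

.. code-block:: python

   import json

   # Parameter name used by each kind of limit, as in the examples above.
   LIMIT_PARAMETER = {"age": "duration", "count": "max-samples"}


   def build_sample_limit(kind, value, name=None):
       """Build a statistic-sample-{age,count}-set[-all] command.

       kind  -- "age" (seconds) or "count" (number of samples)
       value -- integer limit
       name  -- statistic name; None selects the -all variant
       """
       if kind not in LIMIT_PARAMETER:
           raise ValueError("kind must be 'age' or 'count'")
       if isinstance(value, bool) or not isinstance(value, int):
           raise TypeError("value must be an integer")
       arguments = {LIMIT_PARAMETER[kind]: value}
       command = f"statistic-sample-{kind}-set"
       if name is None:
           command += "-all"
       else:
           arguments["name"] = name
       return json.dumps({"command": command, "arguments": arguments})

For example, ``build_sample_limit("age", 1245, "pkt4-received")`` yields
the ``statistic-sample-age-set`` command shown earlier, and
``build_sample_limit("count", 100)`` yields the
``statistic-sample-count-set-all`` command.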

.. _time-series:

Time Series
===========

With certain statistics, a single isolated data point may be useful. However,
some statistics, such as received
packet size, packet processing time, or number of database queries needed to
process a packet, are not cumulative and it is useful to keep many data
points, perhaps to do some statistical analysis afterwards.


By default, each Kea statistic holds 20 data points; setting such
a limit prevents unlimited memory growth.
There are two ways to define the limits: time-based (e.g. keep samples from
the last 5 minutes) and size-based. The size-based
limit can be changed using one of two commands: ``statistic-sample-count-set``,
to set a size limit for a single statistic, and ``statistic-sample-count-set-all``,
to set size-based limits for all statistics. To set time-based
limits for a single statistic, use ``statistic-sample-age-set``; use
``statistic-sample-age-set-all`` to set time-based limits for all statistics.
For a given statistic only one type of limit can be active; storage
is limited by either time or size, not both.
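
The mutually exclusive limits can be pictured with a toy sample store.
This is a sketch, not Kea's implementation: the ``BoundedSamples`` class
is hypothetical, and measuring age relative to the newest sample is an
assumption made for the example.

.. code-block:: python

   from collections import deque
   from datetime import datetime, timedelta


   class BoundedSamples:
       """Toy store limited by sample count or by age, never both."""

       def __init__(self, max_count=20):
           self._samples = deque()  # newest sample first
           self._max_count = max_count
           self._max_age = None

       def set_max_count(self, count):
           # Activating the size limit switches the time limit off.
           self._max_count, self._max_age = count, None
           self._trim()

       def set_max_age(self, age):
           # Activating the time limit switches the size limit off.
           self._max_age, self._max_count = age, None
           self._trim()

       def add(self, value, when):
           self._samples.appendleft((value, when))
           self._trim()

       def values(self):
           return [value for value, _ in self._samples]

       def _trim(self):
           if self._max_count is not None:
               while len(self._samples) > self._max_count:
                   self._samples.pop()
               return
           # Age is measured against the newest sample (assumption);
           # the newest sample itself is always kept.
           while (len(self._samples) > 1
                  and self._samples[0][1] - self._samples[-1][1] > self._max_age):
               self._samples.pop()

With a count limit of 3, adding five samples keeps only the last three.
Switching the same store to a one-second age limit then drops every
sample more than one second older than the newest one.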