daemon/bindings/cache.rst


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338

.. SPDX-License-Identifier: GPL-3.0-or-later

Cache
=====

Cache in Knot Resolver is stored on disk and also shared between
:ref:`systemd-multiple-instances` so resolver doesn't lose the cached data on
restart or crash.

To improve performance even further the resolver implements so-called aggressive caching
for DNSSEC-validated data (:rfc:`8198`), which improves performance and also protects
against some types of Random Subdomain Attacks.


.. _`cache_sizing`:

Sizing
------

For personal and small office use-cases cache size around 100 MB is more than enough.

For large deployments we recommend to run Knot Resolver on a dedicated machine,
and to allocate 90% of machine's free memory for resolver's cache.

.. note:: Choosing a cache size that can fit into RAM is important even if the
   cache is stored on disk (default). Otherwise, the extra I/O caused by disk
   access for missing pages can cause performance issues.

For example, imagine you have a machine with 16 GB of memory.
After machine restart you use command ``free -m`` to determine
amount of free memory (without swap):

.. code-block:: bash

  $ free -m
                total        used        free
  Mem:          15907         979       14928

Now you can configure cache size to be 90% of the free memory 14 928 MB, i.e. 13 453 MB:

.. code-block:: lua

   -- 90 % of free memory after machine restart
   cache.size = 13453 * MB

It is also possible to set the cache size based on the file system size. This is useful
if you use a dedicated partition for cache (e.g. non-persistent tmpfs). It is recommended
to leave some free space for special files, such as locks.:

.. code-block:: lua

   cache.size = cache.fssize() - 10*MB

.. note:: The :ref:`garbage-collector` can be used to periodically trim the
   cache. It is enabled and configured by default when running kresd with
   systemd integration.

.. _`cache_persistence`:

Persistence
-----------
.. tip:: Using tmpfs for cache improves performance and reduces disk I/O.

By default the cache is saved on a persistent storage device
so the content of the cache is persisted during system reboot.
This usually leads to smaller latency after restart etc.,
however in certain situations a non-persistent cache storage might be preferred, e.g.:

  - Resolver handles high volume of queries and I/O performance to disk is too low.
  - Threat model includes attacker getting access to disk content in power-off state.
  - Disk has limited number of writes (e.g. flash memory in routers).

If non-persistent cache is desired configure cache directory to be on
tmpfs_ filesystem, a temporary in-memory file storage.
The cache content will be saved in memory, and thus have faster access
and will be lost on power-off or reboot.


.. note:: In most of the Unix-like systems ``/tmp`` and ``/var/run`` are
   commonly mounted as tmpfs.  While it is technically possible to move the
   cache to an existing tmpfs filesystem, it is *not recommended*, since the
   path to cache is configured in multiple places.

Mounting the cache directory as tmpfs_ is the recommended approach.  Make sure
to use appropriate ``size=`` option and don't forget to adjust the size in the
config file as well.

.. code-block:: none

   # /etc/fstab
   tmpfs	/var/cache/knot-resolver	tmpfs	rw,size=2G,uid=knot-resolver,gid=knot-resolver,nosuid,nodev,noexec,mode=0700 0 0

.. code-block:: lua

   -- /etc/knot-resolver/kresd.conf
   cache.size = cache.fssize() - 10*MB

.. _tmpfs: https://en.wikipedia.org/wiki/Tmpfs

Configuration reference
-----------------------

.. function:: cache.open(max_size[, config_uri])

   :param number max_size: Maximum cache size in bytes.
   :return: ``true`` if cache was opened

   Open cache with a size limit. The cache will be reopened if already open.
   Note that the max_size cannot be lowered, only increased due to how cache is implemented.

   .. tip:: Use ``kB, MB, GB`` constants as a multiplier, e.g. ``100*MB``.

   The URI ``lmdb://path`` allows you to change the cache directory.

   Example:

   .. code-block:: lua

      cache.open(100 * MB, 'lmdb:///var/cache/knot-resolver')

.. envvar:: cache.size

   Set the cache maximum size in bytes. Note that this is only a hint to the backend,
   which may or may not respect it. See :func:`cache.open()`.

   .. code-block:: lua

	cache.size = 100 * MB -- equivalent to `cache.open(100 * MB)`

.. envvar:: cache.current_size

   Get the maximum size in bytes.

   .. code-block:: lua

	print(cache.current_size)

.. envvar:: cache.storage

   Set the cache storage backend configuration, see :func:`cache.backends()` for
   more information. If the new storage configuration is invalid, it is not set.

   .. code-block:: lua

	cache.storage = 'lmdb://.'

.. envvar:: cache.current_storage

   Get the storage backend configuration.

   .. code-block:: lua

	print(cache.current_storage)

.. function:: cache.backends()

   :return: map of backends

   .. note:: For now there is only one backend implementation, even though the APIs are ready for different (synchronous) backends.

   The cache supports runtime-changeable backends, using the optional :rfc:`3986` URI, where the scheme
   represents backend protocol and the rest of the URI backend-specific configuration. By default, it
   is a ``lmdb`` backend in working directory, i.e. ``lmdb://``.

   Example output:

   .. code-block:: lua

   	[lmdb://] => true

.. function:: cache.count()

   :return: Number of entries in the cache. Meaning of the number is an implementation detail and is subject of change.

.. function:: cache.close()

   :return: ``true`` if cache was closed

   Close the cache.

   .. note:: This may or may not clear the cache, depending on the cache backend.

.. function:: cache.fssize()

   :return: Partition size of cache storage.

.. function:: cache.stats()

   Return table with low-level statistics for internal cache operation and storage.
   This counts each access to cache and does not directly map to individual
   DNS queries or resource records.
   For query-level statistics see :ref:`stats module <mod-stats>`.

   Example:

   .. code-block:: lua

       > cache.stats()
       [clear] => 0
       [close] => 0
       [commit] => 117
       [count] => 2
       [count_entries] => 6187
       [match] => 21
       [match_miss] => 2
       [open] => 0
       [read] => 4313
       [read_leq] => 9
       [read_leq_miss] => 4
       [read_miss] => 1143
       [remove] => 17
       [remove_miss] => 0
       [usage_percent] => 15.625
       [write] => 189


   Cache operation `read_leq` (*read less or equal*, i.e. range search) was requested 9 times,
   and 4 out of 9 operations were finished with *cache miss*.
   Cache contains 6187 internal entries which occupy 15.625 % cache size.


.. function:: cache.max_ttl([ttl])

  :param number ttl: maximum TTL in seconds (default: 1 day)

  .. KR_CACHE_DEFAULT_TTL_MAX ^^

  :return: current maximum TTL

  Get or set upper TTL bound applied to all received records.

  .. note:: The `ttl` value must be in range `(min_ttl, 2147483647)`.

  .. code-block:: lua

     -- Get maximum TTL
     cache.max_ttl()
     518400
     -- Set maximum TTL
     cache.max_ttl(172800)
     172800

.. function:: cache.min_ttl([ttl])

  :param number ttl: minimum TTL in seconds (default: 5 seconds)

  .. KR_CACHE_DEFAULT_TTL_MIN ^^

  :return: current minimum TTL

  Get or set lower TTL bound applied to all received records.
  Forcing TTL higher than specified violates DNS standards, so use higher values with care.
  TTL still won't be extended beyond expiration of the corresponding DNSSEC signature.

  .. note:: The `ttl` value must be in range `<0, max_ttl)`.

  .. code-block:: lua

     -- Get minimum TTL
     cache.min_ttl()
     0
     -- Set minimum TTL
     cache.min_ttl(5)
     5

.. function:: cache.ns_tout([timeout])

  :param number timeout: NS retry interval in milliseconds (default: :c:macro:`KR_NS_TIMEOUT_RETRY_INTERVAL`)
  :return: current timeout

  Get or set time interval for which a nameserver address will be ignored after determining that it doesn't return (useful) answers.
  The intention is to avoid waiting if there's little hope; instead, kresd can immediately SERVFAIL or immediately use stale records (with :ref:`serve_stale <mod-serve_stale>` module).

  .. warning:: This settings applies only to the current kresd process.

.. function:: cache.get([domain])

  This function is not implemented at this moment.
  We plan to re-introduce it soon, probably with a slightly different API.

.. function:: cache.clear([name], [exact_name], [rr_type], [chunk_size], [callback], [prev_state])

     Purge cache records matching specified criteria. There are two specifics:

     * To reliably remove **negative** cache entries you need to clear subtree with the whole zone. E.g. to clear negative cache entries for (formerly non-existing) record `www.example.com. A` you need to flush whole subtree starting at zone apex, e.g. `example.com.` [#]_.
     * This operation is asynchronous and might not be yet finished when call to ``cache.clear()`` function returns. Return value indicates if clearing continues asynchronously or not.

  :param string name: subtree to purge; if the name isn't provided, whole cache is purged
        (and any other parameters are disregarded).
  :param bool exact_name: if set to ``true``, only records with *the same* name are removed;
                          default: false.
  :param kres.type rr_type: you may additionally specify the type to remove,
        but that is only supported with ``exact_name == true``; default: nil.
  :param integer chunk_size: the number of records to remove in one round; default: 100.
        The purpose is not to block the resolver for long.
        The default ``callback`` repeats the command after one millisecond
        until all matching data are cleared.
  :param function callback: a custom code to handle result of the underlying C call.
        Its parameters are copies of those passed to `cache.clear()` with one additional
        parameter ``rettable`` containing table with return value from current call.
        ``count`` field contains a return code from :func:`kr_cache_remove_subtree()`.
  :param table prev_state: return value from previous run (can be used by callback)

  :rtype: table
  :return: ``count`` key is always present. Other keys are optional and their presence indicate special conditions.

   * **count** *(integer)* - number of items removed from cache by this call (can be 0 if no entry matched criteria)
   * **not_apex** - cleared subtree is not cached as zone apex; proofs of non-existence were probably not removed
   * **subtree** *(string)* - hint where zone apex lies (this is estimation from cache content and might not be accurate)
   * **chunk_limit** - more than ``chunk_size`` items needs to be cleared, clearing will continue asynchronously


  Examples:

  .. code-block:: lua

     -- Clear whole cache
     > cache.clear()
     [count] => 76

     -- Clear records at and below 'com.'
     > cache.clear('com.')
     [chunk_limit] => chunk size limit reached; the default callback will continue asynchronously
     [not_apex] => to clear proofs of non-existence call cache.clear('com.')
     [count] => 100
     [round] => 1
     [subtree] => com.
     > worker.sleep(0.1)
     [cache] asynchronous cache.clear('com', false) finished

     -- Clear only 'www.example.com.'
     > cache.clear('www.example.com.', true)
     [round] => 1
     [count] => 1
     [not_apex] => to clear proofs of non-existence call cache.clear('example.com.')
     [subtree] => example.com.

.. [#] This is a consequence of DNSSEC negative cache which relies on proofs of non-existence on various owner nodes. It is impossible to efficiently flush part of DNS zones signed with NSEC3.