.. Copyright (C) Internet Systems Consortium, Inc. ("ISC")
..
.. SPDX-License-Identifier: MPL-2.0
..
.. This Source Code Form is subject to the terms of the Mozilla Public
.. License, v. 2.0. If a copy of the MPL was not distributed with this
.. file, you can obtain one at https://mozilla.org/MPL/2.0/.
..
.. See the COPYRIGHT file distributed with this work for additional
.. information regarding copyright ownership.
.. _dnssec_troubleshooting:
Basic DNSSEC Troubleshooting
----------------------------
In this chapter, we cover some basic troubleshooting
techniques, some common DNSSEC symptoms, and their causes and solutions. This
is not a comprehensive "how to troubleshoot any DNS or DNSSEC problem"
guide, because that could easily be an entire book by itself.
.. _troubleshooting_query_path:
Query Path
~~~~~~~~~~
The first step in troubleshooting DNS or DNSSEC should be to
determine the exact query path. Knowing which servers a query passes
through makes it much easier to identify the origin of the problem.
End clients, such as laptop computers or mobile phones, are configured
to talk to a recursive name server, and the recursive name server may in
turn forward requests on to other recursive name servers before arriving at the
authoritative name server. The giveaway is the presence of the
Authoritative Answer (``aa``) flag in a query response: when present, we know we are talking
to the authoritative server; when missing, we are talking to a recursive
server. The example below shows an answer to a query for
``www.example.com`` without the Authoritative Answer flag:
::
$ dig @10.53.0.3 www.example.com A
; <<>> DiG 9.16.0 <<>> @10.53.0.3 www.example.com a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62714
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: c823fe302625db5b010000005e722b504d81bb01c2227259 (good)
;; QUESTION SECTION:
;www.example.com. IN A
;; ANSWER SECTION:
www.example.com. 60 IN A 10.1.0.1
;; Query time: 3 msec
;; SERVER: 10.53.0.3#53(10.53.0.3)
;; WHEN: Wed Mar 18 14:08:16 GMT 2020
;; MSG SIZE rcvd: 88
Not only is the ``aa`` flag absent, but we also see an ``ra``
flag, which stands for Recursion Available. This tells us that the
server we are talking to (10.53.0.3 in this example) is a recursive name
server: although we were able to get an answer for
``www.example.com``, we know that the answer came from somewhere else.
If we query the authoritative server directly, we get:
::
$ dig @10.53.0.2 www.example.com A
; <<>> DiG 9.16.0 <<>> @10.53.0.2 www.example.com a
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39542
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
...
The ``aa`` flag tells us that we are now talking to the
authoritative name server for ``www.example.com``, and that this is not a
cached answer it obtained from some other name server; it served this
answer to us right from its own database. In fact,
the Recursion Available (``ra``) flag is not present, which means this
name server is not configured to perform recursion (at least not for
this client), so it could not have queried another name server to get
cached results.
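
The flag check described above can be scripted. The following is a minimal
Python sketch, not part of BIND; the function names ``parse_dig_flags`` and
``server_role`` are hypothetical, and it assumes the ``;; flags:`` header
line has the same layout as in the ``dig`` output shown in this section:

::

    import re


    def parse_dig_flags(dig_output: str) -> set[str]:
        """Return the header flags found on the ';; flags:' line of dig output."""
        match = re.search(r"^;; flags:([^;]*);", dig_output, re.MULTILINE)
        if match is None:
            raise ValueError("no ';; flags:' line found")
        return set(match.group(1).split())


    def server_role(dig_output: str) -> str:
        """Classify the responding server from the aa and ra flags."""
        flags = parse_dig_flags(dig_output)
        if "aa" in flags:
            # Authoritative Answer: served from the server's own zone data.
            return "authoritative"
        if "ra" in flags:
            # Recursion Available without aa: the answer came from elsewhere.
            return "recursive"
        return "unknown"

This is only a heuristic: a server that is authoritative for some zones and
recursive for others reports different flags depending on the name queried.
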
.. _troubleshooting_visible_symptoms:
Visible DNSSEC Validation Symptoms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
After determining the query path, it is necessary to
determine whether the problem is actually related to DNSSEC
validation. You can use the ``+cd`` flag in ``dig`` to disable
validation, as described in
:ref:`how_do_i_know_validation_problem`.
When there is indeed a DNSSEC validation problem, the visible symptoms,
unfortunately, are very limited. With DNSSEC validation enabled, a
DNS response that cannot be fully validated results in a generic
SERVFAIL message, as shown below when querying a recursive name
server at 10.53.0.3:
::
$ dig @10.53.0.3 www.example.org. A
; <<>> DiG 9.16.0 <<>> @10.53.0.3 www.example.org A
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 28947
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: d1301968aca086ad010000005e723a7113603c01916d136b (good)
;; QUESTION SECTION:
;www.example.org. IN A
;; Query time: 3 msec
;; SERVER: 10.53.0.3#53(10.53.0.3)
;; WHEN: Wed Mar 18 15:12:49 GMT 2020
;; MSG SIZE rcvd: 72
With ``delv``, a "resolution failed" message is output instead:
::
$ delv @10.53.0.3 www.example.org. A +rtrace
;; fetch: www.example.org/A
;; resolution failed: SERVFAIL
BIND 9 logging features may be useful when trying to identify
DNSSEC errors.
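
The comparison between a normal query and a ``+cd`` query can be automated.
The sketch below is a Python illustration under stated assumptions: the
helper names ``dig_status`` and ``looks_like_validation_failure`` are
hypothetical, and the inputs are the text of two ``dig`` runs for the same
name, one without and one with ``+cd``:

::

    import re


    def dig_status(dig_output: str) -> str:
        """Extract the response code (for example SERVFAIL) from dig output."""
        match = re.search(r"status: ([A-Z]+)", dig_output)
        if match is None:
            raise ValueError("no status field found in dig output")
        return match.group(1)


    def looks_like_validation_failure(normal_output: str, cd_output: str) -> bool:
        """SERVFAIL normally but not with +cd suggests a validation problem."""
        return (
            dig_status(normal_output) == "SERVFAIL"
            and dig_status(cd_output) != "SERVFAIL"
        )

A ``True`` result is a hint rather than proof; if both queries return
SERVFAIL, the cause is probably something other than DNSSEC validation.
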
.. _troubleshooting_logging:
Basic Logging
~~~~~~~~~~~~~
DNSSEC validation error messages show up in ``syslog`` as a
query error by default. Here is an example of what it may look like:
::
validating www.example.org/A: no valid signature found
RRSIG failed to verify resolving 'www.example.org/A/IN': 10.53.0.2#53
Usually, this level of error logging is sufficient.
Debug logging, described in
:ref:`troubleshooting_logging_debug`, gives information on how
to get more details about why DNSSEC validation may have
failed.
.. _troubleshooting_logging_debug:
BIND DNSSEC Debug Logging
~~~~~~~~~~~~~~~~~~~~~~~~~
A word of caution: before you enable debug logging, be aware that this
may dramatically increase the load on your name servers. Enabling debug
logging is thus not recommended for production servers.
With that said, sometimes it may become necessary to temporarily enable
BIND debug logging to see more details of how and whether DNSSEC is
validating. DNSSEC-related messages are not recorded in ``syslog`` by default,
even if query logging is enabled; only DNSSEC errors show up in ``syslog``.
The example below shows how to enable debug level 3 (to see full DNSSEC
validation messages) in BIND 9 and send the output to ``syslog``:
::

    logging {
        channel dnssec_log {
            syslog daemon;
            severity debug 3;
            print-category yes;
        };
        category dnssec { dnssec_log; };
    };

The example below shows how to log DNSSEC messages to their own file
(here, ``/var/log/dnssec.log``):
::

    logging {
        channel dnssec_log {
            file "/var/log/dnssec.log";
            severity debug 3;
        };
        category dnssec { dnssec_log; };
    };

After turning on debug logging and restarting BIND, a large
number of log messages appear in
``syslog``. The example below shows the log messages as a result of
successfully looking up and validating the domain name ``ftp.isc.org``.
::
validating ./NS: starting
validating ./NS: attempting positive response validation
validating ./DNSKEY: starting
validating ./DNSKEY: attempting positive response validation
validating ./DNSKEY: verify rdataset (keyid=20326): success
validating ./DNSKEY: marking as secure (DS)
validating ./NS: in validator_callback_dnskey
validating ./NS: keyset with trust secure
validating ./NS: resuming validate
validating ./NS: verify rdataset (keyid=33853): success
validating ./NS: marking as secure, noqname proof not needed
validating ftp.isc.org/A: starting
validating ftp.isc.org/A: attempting positive response validation
validating isc.org/DNSKEY: starting
validating isc.org/DNSKEY: attempting positive response validation
validating isc.org/DS: starting
validating isc.org/DS: attempting positive response validation
validating org/DNSKEY: starting
validating org/DNSKEY: attempting positive response validation
validating org/DS: starting
validating org/DS: attempting positive response validation
validating org/DS: keyset with trust secure
validating org/DS: verify rdataset (keyid=33853): success
validating org/DS: marking as secure, noqname proof not needed
validating org/DNSKEY: in validator_callback_ds
validating org/DNSKEY: dsset with trust secure
validating org/DNSKEY: verify rdataset (keyid=9795): success
validating org/DNSKEY: marking as secure (DS)
validating isc.org/DS: in fetch_callback_dnskey
validating isc.org/DS: keyset with trust secure
validating isc.org/DS: resuming validate
validating isc.org/DS: verify rdataset (keyid=33209): success
validating isc.org/DS: marking as secure, noqname proof not needed
validating isc.org/DNSKEY: in validator_callback_ds
validating isc.org/DNSKEY: dsset with trust secure
validating isc.org/DNSKEY: verify rdataset (keyid=7250): success
validating isc.org/DNSKEY: marking as secure (DS)
validating ftp.isc.org/A: in fetch_callback_dnskey
validating ftp.isc.org/A: keyset with trust secure
validating ftp.isc.org/A: resuming validate
validating ftp.isc.org/A: verify rdataset (keyid=27566): success
validating ftp.isc.org/A: marking as secure, noqname proof not needed
Note that these log messages indicate that the chain of trust has been
established and ``ftp.isc.org`` has been successfully validated.
If validation had failed, you would see log messages indicating errors.
We cover some of the most common validation problems in the next section.
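
Because the debug log is verbose, it can help to condense it to one status
per name and type. The following Python sketch is an illustration only: the
function name ``summarize_validation`` is hypothetical, and the failure
markers are limited to message fragments that appear in this chapter, so
other validator messages are simply reported as ``pending``:

::

    import re

    # Matches "validating <name>/<TYPE>: <message>", with or without a prefix
    # such as "named[126063]: " in front of it.
    VALIDATOR_LINE = re.compile(r"validating (\S+)/([A-Za-z0-9-]+): (.*)$")

    # Message fragments taken from the log examples in this chapter.
    FAILURE_MARKERS = (
        "no valid signature found",
        "verify failed",
        "no DNSKEY matching DS",
    )


    def summarize_validation(log_lines):
        """Map each 'name/TYPE' target to 'secure', 'failed', or 'pending'."""
        summary = {}
        for line in log_lines:
            match = VALIDATOR_LINE.search(line.strip())
            if match is None:
                continue
            target = f"{match.group(1)}/{match.group(2)}"
            message = match.group(3)
            if message.startswith("marking as secure"):
                summary[target] = "secure"
            elif any(marker in message for marker in FAILURE_MARKERS):
                # A later success with another key still overrides a failure.
                if summary.get(target) != "secure":
                    summary[target] = "failed"
            else:
                summary.setdefault(target, "pending")
        return summary

Feeding it the lines of the dedicated log file configured above would return a
dictionary such as ``{"./NS": "secure", ...}``, which makes it easier to
spot the first link in the chain that did not validate.
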
.. _troubleshooting_common_problems:
Common Problems
~~~~~~~~~~~~~~~
.. _troubleshooting_security_lameness:
Security Lameness
^^^^^^^^^^^^^^^^^
Similar to lame delegation in traditional DNS, security lameness refers to the
condition when the parent zone holds a set of DS records that point to
something that does not exist in the child zone. As a result,
the entire child zone may "disappear," having been marked as bogus by
validating resolvers.
Below is an example attempting to resolve the A record for a test domain
name ``www.example.net``. From the user's perspective, as described in
:ref:`how_do_i_know_validation_problem`, only a SERVFAIL
message is returned. On the validating resolver, we see the
following messages in ``syslog``:
::
named[126063]: validating example.net/DNSKEY: no valid signature found (DS)
named[126063]: no valid RRSIG resolving 'example.net/DNSKEY/IN': 10.53.0.2#53
named[126063]: broken trust chain resolving 'www.example.net/A/IN': 10.53.0.2#53
This gives us a hint that it is a broken trust chain issue. Let's take a
look at the DS records that are published for the zone (with the keys
shortened for ease of display):
::
$ dig @10.53.0.3 example.net. DS
; <<>> DiG 9.16.0 <<>> @10.53.0.3 example.net DS
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59602
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 7026d8f7c6e77e2a010000005e735d7c9d038d061b2d24da (good)
;; QUESTION SECTION:
;example.net. IN DS
;; ANSWER SECTION:
example.net. 256 IN DS 14956 8 2 9F3CACD...D3E3A396
;; Query time: 0 msec
;; SERVER: 10.53.0.3#53(10.53.0.3)
;; WHEN: Thu Mar 19 11:54:36 GMT 2020
;; MSG SIZE rcvd: 116
Next, we query for the DNSKEY and RRSIG of ``example.net`` to see if
there's anything wrong. Since we are having trouble validating, we
can use the ``+cd`` option to temporarily disable checking and return
results, even though they do not pass the validation tests. The
``+multiline`` option tells ``dig`` to print the type, algorithm type,
and key id for DNSKEY records. Again,
some long strings are shortened for ease of display:
::
$ dig @10.53.0.3 example.net. DNSKEY +dnssec +cd +multiline
; <<>> DiG 9.16.0 <<>> @10.53.0.3 example.net DNSKEY +cd +multiline +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42980
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 4b5e7c88b3680c35010000005e73722057551f9f8be1990e (good)
;; QUESTION SECTION:
;example.net. IN DNSKEY
;; ANSWER SECTION:
example.net. 287 IN DNSKEY 256 3 8 (
AwEAAbu3NX...ADU/D7xjFFDu+8WRIn
) ; ZSK; alg = RSASHA256 ; key id = 35328
example.net. 287 IN DNSKEY 257 3 8 (
AwEAAbKtU1...PPP4aQZTybk75ZW+uL
6OJMAF63NO0s1nAZM2EWAVasbnn/X+J4N2rLuhk=
) ; KSK; alg = RSASHA256 ; key id = 27247
example.net. 287 IN RRSIG DNSKEY 8 2 300 (
20811123173143 20180101000000 27247 example.net.
Fz1sjClIoF...YEjzpAWuAj9peQ== )
example.net. 287 IN RRSIG DNSKEY 8 2 300 (
20811123173143 20180101000000 35328 example.net.
seKtUeJ4/l...YtDc1rcXTVlWIOw= )
;; Query time: 0 msec
;; SERVER: 10.53.0.3#53(10.53.0.3)
;; WHEN: Thu Mar 19 13:22:40 GMT 2020
;; MSG SIZE rcvd: 962
Here is the problem: the parent zone is telling the world that
``example.net`` is using the key 14956, but the authoritative server
indicates that it is using keys 27247 and 35328. There are several
potential causes for this mismatch: one possibility is that a malicious
attacker has compromised one side and changed the data. A more likely
scenario is that the DNS administrator for the child zone did not upload
the correct key information to the parent zone.
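
The manual comparison above, DS key tags from the parent against DNSKEY key
ids from the child, can be sketched in Python. The function names below are
hypothetical, and the parsing assumes ``dig`` output in the layout shown in
this section (DS answer lines, and ``key id =`` comments produced by
``+multiline``):

::

    import re


    def ds_key_tags(dig_ds_output: str) -> set[int]:
        """Key tags from DS answer lines such as 'example.net. 256 IN DS 14956 8 2 ...'."""
        pattern = r"^\S+\s+\d+\s+IN\s+DS\s+(\d+)\s"
        return {int(tag) for tag in re.findall(pattern, dig_ds_output, re.MULTILINE)}


    def dnskey_ids(dig_dnskey_output: str) -> set[int]:
        """Key ids from the '; key id = N' comments printed by dig +multiline."""
        return {int(key_id) for key_id in re.findall(r"key id = (\d+)", dig_dnskey_output)}


    def orphaned_ds_tags(dig_ds_output: str, dig_dnskey_output: str) -> set[int]:
        """DS key tags in the parent that match no DNSKEY published by the child."""
        return ds_key_tags(dig_ds_output) - dnskey_ids(dig_dnskey_output)

For the outputs above, the result would be ``{14956}``. A matching key tag is
necessary but not sufficient: the DS digest must also match the key, which
this sketch does not check.
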
.. _troubleshooting_incorrect_time:
Incorrect Time
^^^^^^^^^^^^^^
In DNSSEC, every record comes with at least one RRSIG, and each RRSIG
contains two timestamps: one indicating when it becomes valid, and
one when it expires. If the validating resolver's current system time does
not fall within the two RRSIG timestamps, error messages
appear in the BIND debug log.
The example below shows a log message when the RRSIG appears to have
expired. This could mean the validating resolver system time is
incorrectly set too far in the future, or the zone administrator has not
kept up with RRSIG maintenance.
::
validating example.com/DNSKEY: verify failed due to bad signature (keyid=19036): RRSIG has expired
The log below shows that the RRSIG validity period has not yet begun. This could mean
the validating resolver's system time is incorrectly set too far in the past, or
the zone administrator has incorrectly generated signatures for this
domain name.
::
validating example.com/DNSKEY: verify failed due to bad signature (keyid=4521): RRSIG validity period has not begun
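
The time-window check can be reproduced by hand from the two timestamps in an
RRSIG record. In the presentation format printed by ``dig``, the expiration
time comes first and the inception time second, both as ``YYYYMMDDHHMMSS`` in
UTC. The Python sketch below is illustrative; the function names are
hypothetical:

::

    from datetime import datetime, timezone


    def parse_rrsig_time(stamp: str) -> datetime:
        """Parse an RRSIG timestamp such as '20180101000000' as UTC."""
        return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


    def rrsig_time_status(inception: str, expiration: str, now: datetime) -> str:
        """Return 'valid', 'expired', or 'not yet valid' for the given clock time."""
        if now < parse_rrsig_time(inception):
            return "not yet valid"
        if now > parse_rrsig_time(expiration):
            return "expired"
        return "valid"

Passing ``datetime.now(timezone.utc)`` as ``now`` on the resolver host shows
how that host's clock relates to the signature; if the result is unexpected,
check the system time before suspecting the zone.
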
.. _troubleshooting_unable_to_load_keys:
Unable to Load Keys
^^^^^^^^^^^^^^^^^^^
This is a simple yet common issue. If the key files are present but
unreadable by ``named`` for some reason, ``named`` logs clear error
messages to ``syslog``, as shown below:
::
named[32447]: zone example.com/IN (signed): reconfiguring zone keys
named[32447]: dns_dnssec_findmatchingkeys: error reading key file Kexample.com.+008+06817.private: permission denied
named[32447]: dns_dnssec_findmatchingkeys: error reading key file Kexample.com.+008+17694.private: permission denied
named[32447]: zone example.com/IN (signed): next key event: 27-Nov-2014 20:04:36.521
However, if no keys are found, the error is not as obvious. The example
below shows the ``syslog`` messages after executing ``rndc
reload`` with the key files missing from the key directory:
::
named[32516]: received control channel command 'reload'
named[32516]: loading configuration from '/etc/bind/named.conf'
named[32516]: reading built-in trusted keys from file '/etc/bind/bind.keys'
named[32516]: using default UDP/IPv4 port range: [1024, 65535]
named[32516]: using default UDP/IPv6 port range: [1024, 65535]
named[32516]: sizing zone task pool based on 6 zones
named[32516]: the working directory is not writable
named[32516]: reloading configuration succeeded
named[32516]: reloading zones succeeded
named[32516]: all zones loaded
named[32516]: running
named[32516]: zone example.com/IN (signed): reconfiguring zone keys
named[32516]: zone example.com/IN (signed): next key event: 27-Nov-2014 20:07:09.292
This happens to look exactly the same as if the keys were present and
readable, and appears to indicate that ``named`` loaded the keys and signed the zone. It
even generates the internal (raw) files:
::
# cd /etc/bind/db
# ls
example.com.db example.com.db.jbk example.com.db.signed
If ``named`` really loaded the keys and signed the zone, you should see
the following files:
::
# cd /etc/bind/db
# ls
example.com.db example.com.db.jbk example.com.db.signed example.com.db.signed.jnl
So, unless you see the ``*.signed.jnl`` file, your zone has not been
signed.
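
This file check is easy to script as part of a monitoring job. The Python
sketch below is a hypothetical helper that relies only on the naming rule
described above (a ``.signed.jnl`` file next to the zone file); the paths are
examples:

::

    from pathlib import Path


    def zone_appears_signed(zone_file: Path) -> bool:
        """True if '<zone_file>.signed.jnl' exists next to the zone file."""
        journal = zone_file.with_name(zone_file.name + ".signed.jnl")
        return journal.exists()

For example, ``zone_appears_signed(Path("/etc/bind/db/example.com.db"))``
returns ``False`` for the first directory listing above and ``True`` for the
second.
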
.. _troubleshooting_invalid_trust_anchors:
Invalid Trust Anchors
^^^^^^^^^^^^^^^^^^^^^
In most cases, you never need to explicitly configure trust
anchors. ``named`` supplies the current root trust anchor and,
with the default setting of ``dnssec-validation``, updates it on the
infrequent occasions when it is changed.
However, in some circumstances you may need to explicitly configure
your own trust anchor. As we saw in the :ref:`trust_anchors_description`
section, whenever a DNSKEY is received by the validating resolver, it is
compared to the list of keys the resolver explicitly trusts to see if
further action is needed. If the two keys match, the validating resolver
stops performing further verification and returns the answer(s) as
validated.
But what if the key file on the validating resolver is misconfigured or
missing? Below we show some examples of log messages when things are not
working properly.
First of all, if the key you copied is malformed, BIND does not even
start, and you will likely find this error message in ``syslog``:
::
named[18235]: /etc/bind/named.conf.options:29: bad base64 encoding
named[18235]: loading configuration: failure
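
A copied key can be checked for encoding damage before it goes into
``named.conf``. The Python sketch below uses a hypothetical helper name and
only tests whether the text is well-formed base64; it assumes that whitespace
inside the key string can be ignored, and it cannot tell whether the key is
the right one:

::

    import base64
    import binascii


    def is_valid_base64_key(key_text: str) -> bool:
        """Return True if the key text, ignoring whitespace, is well-formed base64."""
        compact = "".join(key_text.split())
        if not compact:
            return False
        try:
            base64.b64decode(compact, validate=True)
        except binascii.Error:
            # Raised for characters outside the alphabet or for bad padding.
            return False
        return True

A truncated copy and paste typically fails this check because the padding no
longer lines up.
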
If the key is a valid base64 string but the key algorithm is incorrect,
or if the wrong key is installed, the first thing you will notice is
that virtually all of your DNS lookups result in SERVFAIL, even when
you are looking up domain names that have not been DNSSEC-enabled. The
example below shows a query to the recursive server at 10.53.0.3:
::
$ dig @10.53.0.3 www.example.org. A +dnssec
; <<>> DiG 9.16.0 <<>> @10.53.0.3 www.example.org A +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29586
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: ee078fc321fa1367010000005e73a58bf5f205ca47e04bed (good)
;; QUESTION SECTION:
;www.example.org. IN A
``delv`` shows a similar result:
::
$ delv @10.53.0.3 www.example.com. +rtrace
;; fetch: www.example.com/A
;; resolution failed: SERVFAIL
The next symptom you see is in the DNSSEC log messages:
::
managed-keys-zone: DNSKEY set for zone '.' could not be verified with current keys
validating ./DNSKEY: starting
validating ./DNSKEY: attempting positive response validation
validating ./DNSKEY: no DNSKEY matching DS
validating ./DNSKEY: no DNSKEY matching DS
validating ./DNSKEY: no valid signature found (DS)
These errors are indications that there are problems with the trust
anchor.
.. _troubleshooting_nta:
Negative Trust Anchors
~~~~~~~~~~~~~~~~~~~~~~
BIND 9.11 introduced Negative Trust Anchors (NTAs) as a means to
*temporarily* disable DNSSEC validation for a zone when you know that
the zone's DNSSEC is misconfigured.
NTAs are added using the ``rndc`` command, e.g.:
::
$ rndc nta example.com
Negative trust anchor added: example.com/_default, expires 19-Mar-2020 19:57:42.000
The list of currently configured NTAs can also be examined using
``rndc``, e.g.:
::
$ rndc nta -dump
example.com/_default: expiry 19-Mar-2020 19:57:42.000
The default lifetime of an NTA is one hour. By default, BIND also
polls the zone every five minutes to see whether it now validates
correctly; as soon as it does, the NTA expires automatically. Both the
default lifetime and the polling interval may be configured via
``named.conf``, and the lifetime can be overridden on a per-zone basis
using the ``-lifetime duration`` parameter to ``rndc nta``. Both timer
values have a permitted maximum value of one week.
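
When NTAs are tracked by a script, the ``rndc nta -dump`` output can be
parsed. The Python sketch below uses a hypothetical function name and assumes
each line has exactly the layout shown above, with the expiry printed in the
server's local time and English month abbreviations:

::

    from datetime import datetime


    def parse_nta_dump_line(line: str):
        """Split 'zone/view: expiry DD-Mon-YYYY HH:MM:SS.mmm' into its parts."""
        target, _, expiry_text = line.strip().partition(": expiry ")
        # Split on the last '/', so that the view is separated from the zone.
        zone, _, view = target.rpartition("/")
        expiry = datetime.strptime(expiry_text, "%d-%b-%Y %H:%M:%S.%f")
        return zone, view, expiry

Subtracting the current local time from the returned expiry gives the
remaining lifetime of the NTA.
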
.. _troubleshooting_nsec3:
NSEC3 Troubleshooting
~~~~~~~~~~~~~~~~~~~~~
BIND includes a tool called ``nsec3hash`` that runs through the same
steps as a validating resolver, to generate the correct hashed name
based on NSEC3PARAM parameters. The command takes the following
parameters in order: salt, algorithm, iterations, and domain. For
example, if the salt is 1234567890ABCEDF, the hash algorithm is 1, and
the iteration count is 10, to get the NSEC3-hashed name for ``www.example.com`` we
would execute a command like this:
::
$ nsec3hash 1234567890ABCEDF 1 10 www.example.com
RN7I9ME6E1I6BDKIP91B9TCE4FHJ7LKF (salt=1234567890ABCEDF, hash=1, iterations=10)
A zero-length salt can be specified as ``-``.
While it is unlikely you would construct a rainbow table of your own
zone data, this tool may be useful when troubleshooting NSEC3 problems.
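
To see what the hashing involves, the steps can be written out in Python.
This is a sketch of the iterated SHA-1 scheme defined in RFC 5155 for hash
algorithm 1, not the ``nsec3hash`` source code; the function names are
hypothetical, and only plain ASCII names without escaped characters are
handled:

::

    import base64
    import hashlib

    # Translation from the standard base32 alphabet to base32hex, which is the
    # encoding used for NSEC3 owner names.
    _TO_BASE32HEX = str.maketrans(
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567",
        "0123456789ABCDEFGHIJKLMNOPQRSTUV",
    )


    def name_to_wire(name: str) -> bytes:
        """Encode a domain name as lowercase, length-prefixed DNS labels."""
        wire = b""
        for label in name.rstrip(".").lower().split("."):
            if label:
                raw = label.encode("ascii")
                wire += bytes([len(raw)]) + raw
        return wire + b"\x00"


    def nsec3_hash(name: str, salt_hex: str, iterations: int) -> str:
        """Return the base32hex NSEC3 hash of name (SHA-1, hash algorithm 1)."""
        salt = b"" if salt_hex == "-" else bytes.fromhex(salt_hex)
        digest = hashlib.sha1(name_to_wire(name) + salt).digest()
        # The iterations value counts the additional rounds after the first hash.
        for _ in range(iterations):
            digest = hashlib.sha1(digest + salt).digest()
        return base64.b32encode(digest).decode("ascii").translate(_TO_BASE32HEX)

A call such as ``nsec3_hash("www.example.com", "1234567890ABCEDF", 10)``
mirrors the command above; comparing its result with the output of
``nsec3hash``, or with the test vectors in RFC 5155 Appendix A, is a
reasonable way to validate the sketch before relying on it.
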