<?xml version="1.0" encoding="UTF-8"?>
<!--*-nxml-*-->
<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<!--
SPDX-License-Identifier: Apache-2.0
Copyright (c) 2021, Dell Inc. or its subsidiaries. All rights reserved.
-->
<refentry id="stacd.conf" xmlns:xi="http://www.w3.org/2001/XInclude">
<refentryinfo>
<title>stacd.conf</title>
<productname>nvme-stas</productname>
<author>
<personname>
<honorific>Mr</honorific>
<firstname>Martin</firstname>
<surname>Belanger</surname>
</personname>
<affiliation>
<orgname>Dell, Inc.</orgname>
</affiliation>
</author>
</refentryinfo>
<refmeta>
<refentrytitle>stacd.conf</refentrytitle>
<manvolnum>5</manvolnum>
</refmeta>
<refnamediv>
<refname>stacd.conf</refname>
<refpurpose>
<citerefentry project="man-pages">
<refentrytitle>stacd</refentrytitle>
<manvolnum>8</manvolnum>
</citerefentry>
configuration file
</refpurpose>
</refnamediv>
<refsynopsisdiv>
<para>
<filename>/etc/stas/stacd.conf</filename>
</para>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
When <citerefentry project="man-pages"><refentrytitle>stacd</refentrytitle>
<manvolnum>8</manvolnum></citerefentry> starts up, it reads its
configuration from <filename>stacd.conf</filename>.
</para>
</refsect1>
<refsect1>
<title>Configuration File Format</title>
<para>
<filename>stacd.conf</filename> is a plain text file divided into
sections, with configuration entries in the style
<replaceable>key</replaceable>=<replaceable>value</replaceable>.
Spaces immediately before or after the <literal>=</literal> are
ignored. Empty lines are ignored as well as lines starting with
<literal>#</literal>, which may be used for commenting.
</para>
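<para>
As an illustration of these rules, a minimal sketch of a file is
shown below. The values are arbitrary examples chosen to show the
syntax (a comment, a section header, and entries with and without
spaces around <literal>=</literal>); they are not recommendations.
</para>
<programlisting>
# This line is a comment and is ignored.

[Global]
nr-io-queues = 4
ignore-iface=false
</programlisting>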
</refsect1>
<refsect1>
<title>Options</title>
<refsect2>
<title>[Global] section</title>
<para>
The following options are available in the
<literal>[Global]</literal> section:
</para>
<variablelist>
<xi:include href="standard-conf.xml" xpointer="tron"/>
<xi:include href="standard-conf.xml" xpointer="hdr-digest"/>
<xi:include href="standard-conf.xml" xpointer="data-digest"/>
<xi:include href="standard-conf.xml" xpointer="kato"/>
<xi:include href="standard-conf.xml" xpointer="ip-family"/>
<varlistentry>
<term><varname>nr-io-queues=</varname></term>
<listitem>
<para>
Takes a value in the range 1...N. Overrides the
default number of I/O queues created by the driver.
</para>
<para>Note: This parameter is identical to that provided by nvme-cli.</para>
<para>
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>nr-write-queues=</varname></term>
<listitem>
<para>
Takes a value in the range 1...N. Adds additional
queues that will be used for write I/O.
</para>
<para>Note: This parameter is identical to that provided by nvme-cli.</para>
<para>
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><varname>nr-poll-queues=</varname></term>
<listitem>
<para>
Takes a value in the range 1...N. Adds additional
queues that will be used for polling
latency-sensitive I/O.
</para>
<para>Note: This parameter is identical to that provided by nvme-cli.</para>
<para>
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
</para>
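<para>
As a sketch, the three queue options could be set together in
the <literal>[Global]</literal> section as shown below. The
values are arbitrary examples chosen for illustration;
suitable values depend on the host and its workload.
</para>
<programlisting>
[Global]
nr-io-queues=4
nr-write-queues=2
nr-poll-queues=1
</programlisting>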
</listitem>
</varlistentry>
<xi:include href="standard-conf.xml" xpointer="queue-size"/>
<xi:include href="standard-conf.xml" xpointer="reconnect-delay"/>
<xi:include href="standard-conf.xml" xpointer="ctrl-loss-tmo"/>
<xi:include href="standard-conf.xml" xpointer="disable-sqflow"/>
<varlistentry>
<term><varname>ignore-iface=</varname></term>
<listitem>
<para>
Takes a boolean argument. This option controls how
connections with I/O Controllers (IOC) are made.
</para>
<para>
There is no guarantee that there will be a route to
reach a given IOC. However, we can use the socket
option SO_BINDTODEVICE to force the connection to be
made on a specific interface instead of letting the
routing tables decide where to make the connection.
</para>
<para>
This option determines whether <code>stacd</code> will use
SO_BINDTODEVICE to force connections on an interface
or just rely on the routing tables. The default is
to use SO_BINDTODEVICE, in other words, <code>stacd</code> does
not ignore the interface.
</para>
<para>
BACKGROUND:
By default, <code>stacd</code> will connect to IOCs on the same
interface that was used to retrieve the discovery
log pages. If <code>stafd</code> discovers a DC on an interface
using mDNS, and <code>stafd</code> connects to that DC and
retrieves the log pages, it is expected that the
storage subsystems listed in the log pages are
reachable on the same interface where the DC was
discovered.
</para>
<para>
For example, let's say a DC is discovered on
interface ens102. Then all the subsystems listed in
the log pages retrieved from that DC must be
reachable on interface ens102. If this doesn't work,
for example if "<code>ping -I ens102 [storage-ip]</code>" fails,
then the most likely explanation is that proxy ARP
is not enabled on the switch that the host is
connected to on interface ens102. Whatever you do,
resist the temptation to manually set up the routing
tables or to add alternate routes going over a
different interface than the one where the DC is
located. That simply won't work. Make sure proxy ARP
is enabled on the switch first.
</para>
<para>
Setting routes won't work because, by default, <code>stacd</code>
uses the SO_BINDTODEVICE socket option when it
connects to IOCs. This option is used to force a
socket connection to be made on a specific interface
instead of letting the routing tables decide where
to connect the socket. Even if you were to manually
configure an alternate route on a different interface,
the connections (i.e. host to IOC) will still be
made on the interface where the DC was discovered by
<code>stafd</code>.
</para>
<para>
Defaults to <parameter>false</parameter>.
</para>
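<para>
For illustration, a host whose administrator prefers to let
the routing tables select the outgoing interface, rather than
having <code>stacd</code> bind connections with
SO_BINDTODEVICE, might use a fragment like the following
sketch:
</para>
<programlisting>
[Global]
# Do not bind connections to the discovery interface;
# rely on the routing tables instead.
ignore-iface=true
</programlisting>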
</listitem>
</varlistentry>
</variablelist>
</refsect2>
<refsect2>
<title>[I/O controller connection management] section</title>
<para>
Connectivity between hosts and subsystems in a fabric is
controlled by Fabric Zoning. Entities that share a common
zone (i.e., are zoned together) are allowed to discover each
other and establish connections between them. Fabric Zoning is
configured on Discovery Controllers (DC). Users can add/remove
controllers and/or hosts to/from zones.
</para>
<para>
Hosts have no direct knowledge of the Fabric Zoning configuration
that is active on a given DC. As a result, if a host is impacted
by a Fabric Zoning configuration change, it will be notified of
the connectivity configuration change by the DC via Asynchronous
Event Notifications (AEN).
</para>
<table frame='all'>
<title>List of terms used in this section:</title>
<tgroup cols="2" align='left' colsep='1' rowsep='1'>
<thead>
<row>
<entry>Term</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>AEN</entry>
<entry>Asynchronous Event Notification. A CQE (Completion Queue Entry) for an Asynchronous Event Request that was previously transmitted by the host to a Discovery Controller. AENs are used by DCs to notify hosts that a change (e.g., a connectivity configuration change) has occurred.</entry>
</row>
<row>
<entry>DC</entry>
<entry>Discovery Controller.</entry>
</row>
<row>
<entry>DLP</entry>
<entry>Discovery Log Page. A host will issue a Get Log Page command to retrieve the list of controllers it may connect to.</entry>
</row>
<row>
<entry>DLPE</entry>
<entry><simpara>
Discovery Log Page Entry. The response
to a Get Log Page command contains a list of DLPEs identifying
each controller that the host is allowed to connect with.
</simpara><simpara>
Note that DLPEs may contain both I/O Controllers (IOCs)
and Discovery Controllers (DCs). DCs listed in DLPEs
are called referrals. <code>stacd</code> only deals with IOCs.
Referrals (DCs) are handled by <code>stafd</code>.
</simpara>
</entry>
</row>
<row>
<entry>IOC</entry>
<entry>I/O Controller.</entry>
</row>
<row>
<entry>Manual Config</entry>
<entry>Refers to manually adding entries to <filename>stacd.conf</filename> with the <varname>controller=</varname> parameter.</entry>
</row>
<row>
<entry>Automatic Config</entry>
<entry>Refers to receiving configuration from a DC as DLPEs.</entry>
</row>
<row>
<entry>External Config</entry>
<entry>Refers to configuration done outside of the <code>nvme-stas</code> framework, for example using <code>nvme-cli</code> commands.</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
DCs notify hosts of connectivity configuration changes by sending
AENs indicating a "Discovery Log" change. The host uses these AENs as
a trigger to issue a Get Log Page command. The response to this command
is used to update the list of DLPEs containing the controllers
the host is allowed to access.
Upon reception of the current DLPEs, the host will determine
whether DLPEs were added and/or removed, which will trigger the
addition and/or removal of controller connections. This happens in real time
and may affect active connections to controllers including controllers
that support I/O operations (IOCs). A host that was previously
connected to an IOC may suddenly be told that it is no longer
allowed to connect to that IOC and should disconnect from it.
</para>
<formalpara><title>IOC connection creation</title>
<para>
There are 3 ways to configure IOC connections on a host:
</para>
<orderedlist>
<listitem>
<para>
Manual Config by adding <varname>controller=</varname> entries
to the <literal>[Controllers]</literal> section (see below).
</para>
</listitem>
<listitem>
<para>
Automatic Config received in the form of
DLPEs from a remote DC.
</para>
</listitem>
<listitem>
<para>
External Config using <code>nvme-cli</code> (e.g. "<code>nvme connect</code>")
</para>
</listitem>
</orderedlist>
</formalpara>
<formalpara><title>IOC connection removal/prevention</title>
<para>
There are 3 ways to remove (or prevent) connections to an IOC:
</para>
<orderedlist>
<listitem>
<para>
Manual Config.
<orderedlist numeration='lowerroman'>
<listitem>
<para>
by adding <varname>exclude=</varname> entries to
the <literal>[Controllers]</literal> section (see below).
</para>
</listitem>
<listitem>
<para>
by removing <varname>controller=</varname> entries
from the <literal>[Controllers]</literal> section.
</para>
</listitem>
</orderedlist>
</para>
</listitem>
<listitem>
<para>
Automatic Config. As explained above, a host gets a
new list of DLPEs upon connectivity configuration
changes. On DLPE removal, the host should remove the
connection to the IOC matching that DLPE. This
behavior is configurable using the
<varname>disconnect-scope=</varname> parameter
described below.
</para>
</listitem>
<listitem>
<para>
External Config using <code>nvme-cli</code> (e.g. "<code>nvme
disconnect</code>" or "<code>nvme disconnect-all</code>")
</para>
</listitem>
</orderedlist>
</formalpara>
<para>
The decision by the host to automatically disconnect from an
IOC following connectivity configuration changes is controlled
by two parameters: <code>disconnect-scope</code>
and <code>disconnect-trtypes</code>.
</para>
<variablelist>
<varlistentry>
<term><varname>disconnect-scope=</varname></term>
<listitem>
<para>
Takes one of: <parameter>only-stas-connections</parameter>,
<parameter>all-connections-matching-disconnect-trtypes</parameter>, or <parameter>no-disconnect</parameter>.
</para>
<para>
In theory, hosts should only connect to IOCs that have
been zoned for them. Connections to IOCs that a host
is not zoned to have access to should simply not exist.
In practice, however, users may not want hosts to
disconnect from IOCs in reaction to connectivity
configuration changes, at least not for some of the
IOC connections.
</para>
<para>
Some users may prefer for IOC connections to be "sticky"
and only be removed manually (<code>nvme-cli</code> or
<varname>exclude=</varname>) or removed by a system
reboot. Specifically, they don't want IOC connections
to be removed unexpectedly on DLPE removal. These users
may want to set <varname>disconnect-scope</varname>
to <parameter>no-disconnect</parameter>.
</para>
<para>
It is important to note that when IOC connections
are removed, ongoing I/O transactions will be
terminated immediately. There is no way to tell what
happens to the data being exchanged when such an abrupt
termination happens. If a host was in the middle of writing
to a storage subsystem, there is a chance that outstanding
I/O operations may not successfully complete.
</para>
<refsect3>
<title>Values:</title>
<variablelist>
<varlistentry>
<term><parameter>only-stas-connections</parameter></term>
<listitem>
<para>
Only remove connections previously made by <code>stacd</code>.
</para>
<para>
In this mode, when a DLPE is removed as a result of
connectivity configuration changes, the corresponding
IOC connection will be removed by <code>stacd</code>.
</para>
<para>
Connections to IOCs made externally, e.g. using <code>nvme-cli</code>,
will not be affected, unless they happen to be duplicates
of connections made by <code>stacd</code>. It's simply not
possible for <code>stacd</code> to tell that a connection
was previously made with <code>nvme-cli</code> (or any other external tool).
So, it's good practice to avoid duplicating
configuration between <code>stacd</code> and external tools.
</para>
<para>
Users wanting to persist some of their IOC connections
regardless of connectivity configuration changes should not use
<code>nvme-cli</code> to make those connections. Instead,
they should hard-code them in <filename>stacd.conf</filename>
with the <varname>controller=</varname> parameter. Using the
<varname>controller=</varname> parameter is the only way for a user
to tell <code>stacd</code> that a connection must be made and
not be deleted "<emphasis>no-matter-what</emphasis>".
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>all-connections-matching-disconnect-trtypes</parameter></term>
<listitem>
<para>
All connections that match the transport type specified by
<varname>disconnect-trtypes=</varname>, whether they were
made automatically by <code>stacd</code> or externally
(e.g., <code>nvme-cli</code>), will be audited and are
subject to removal on DLPE removal.
</para>
<para>
In this mode, as DLPEs are removed as a result of
connectivity configuration changes, the corresponding
IOC connections will be removed by the host immediately
whether they were made by <code>stacd</code>, <code>nvme-cli</code>,
or any other way. Basically, <code>stacd</code> audits
<emphasis>all</emphasis> IOC connections matching the
transport type specified by <varname>disconnect-trtypes=</varname>.
</para>
<formalpara><title><emphasis>NOTE</emphasis></title>
<para>
This mode implies that <code>stacd</code> will
only allow Manually Configured or Automatically
Configured IOC connections to exist. Externally
Configured connections using <code>nvme-cli</code>
(or other external mechanism)
that do not match any Manual Config
(<filename>stacd.conf</filename>)
or Automatic Config (DLPEs) will get deleted
immediately by <code>stacd</code>.
</para>
</formalpara>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>no-disconnect</parameter></term>
<listitem>
<para>
<code>stacd</code> does not disconnect from IOCs
when a DLPE is removed or a <varname>controller=</varname>
entry is removed from <filename>stacd.conf</filename>.
All IOC connections are "sticky".
</para>
<para>
Instead, users can remove connections
by issuing the <code>nvme-cli</code>
command "<code>nvme disconnect</code>", by adding an
<varname>exclude=</varname> entry to
<filename>stacd.conf</filename>, or by waiting
until the next system reboot, at which time all
connections will be removed.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect3>
<para>
Defaults to <parameter>only-stas-connections</parameter>.
</para>
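<para>
As a sketch, a user who prefers "sticky" IOC connections that
are never removed on DLPE removal could use a fragment like
the one below. The section name is taken from the heading of
this section.
</para>
<programlisting>
[I/O controller connection management]
disconnect-scope=no-disconnect
</programlisting>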
</listitem>
</varlistentry>
<varlistentry>
<term><varname>disconnect-trtypes=</varname></term>
<listitem>
<para>
This parameter only applies when <varname>disconnect-scope</varname>
is set to <parameter>all-connections-matching-disconnect-trtypes</parameter>.
It limits the scope of the audit to specific transport types.
</para>
<para>
Can take the values <parameter>tcp</parameter>,
<parameter>rdma</parameter>, <parameter>fc</parameter>, or
a combination thereof by separating them with a plus (+) sign.
For example: <parameter>tcp+fc</parameter>. No spaces
are allowed between values and the plus (+) sign.
</para>
<refsect3>
<title>Values:</title>
<variablelist>
<varlistentry>
<term><parameter>tcp</parameter></term>
<listitem>
<para>
Audit TCP connections.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>rdma</parameter></term>
<listitem>
<para>
Audit RDMA connections.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>fc</parameter></term>
<listitem>
<para>
Audit Fibre Channel connections.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect3>
<para>
Defaults to <parameter>tcp</parameter>.
</para>
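<para>
For illustration, the following sketch asks <code>stacd</code>
to audit all TCP and Fibre Channel connections, regardless of
how they were created. Note that there are no spaces around
the plus (+) sign.
</para>
<programlisting>
[I/O controller connection management]
disconnect-scope=all-connections-matching-disconnect-trtypes
disconnect-trtypes=tcp+fc
</programlisting>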
</listitem>
</varlistentry>
<varlistentry>
<term><varname>connect-attempts-on-ncc=</varname></term>
<listitem>
<para>
The NCC bit (Not Connected to CDC) is a bit returned
by the CDC (Centralized Discovery Controller) in the
EFLAGS field of the DLPE. Only CDCs will set the NCC
bit. DDCs (Direct Discovery Controllers) will always
clear NCC to 0. The NCC bit is a way for the CDC to
let hosts know that the subsystem is currently not
reachable by the CDC. This may indicate that the
subsystem is currently down or that there is an
outage on the section of the network connecting the
CDC to the subsystem.
</para>
<para>
If a host is currently failing to connect to an I/O
controller and if the NCC bit associated with that
I/O controller is asserted, the host can decide to
stop trying to connect to that subsystem until
connectivity is restored. This will be indicated by
the CDC when it clears the NCC bit.
</para>
<para>
The parameter <varname>connect-attempts-on-ncc=</varname>
controls whether <code>stacd</code> will take the
NCC bit into account when attempting to connect to
an I/O Controller. Setting <varname>connect-attempts-on-ncc=</varname>
to 0 means that <code>stacd</code> will ignore
the NCC bit and will keep trying to connect. Setting
<varname>connect-attempts-on-ncc=</varname> to a
non-zero value indicates the number of connection
attempts that will be made before <code>stacd</code>
gives up trying. Note that this value should be
greater than 1. In fact, when set to 1,
<code>stacd</code> will automatically use 2 instead,
because a first connect attempt may fail.
</para>
<para>
Defaults to <parameter>0</parameter>.
</para>
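<para>
As a sketch, the fragment below tells <code>stacd</code> to
take the NCC bit into account and to stop trying to connect
after a limited number of attempts. The value 3 is an
arbitrary example, not a recommendation.
</para>
<programlisting>
[I/O controller connection management]
connect-attempts-on-ncc=3
</programlisting>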
</listitem>
</varlistentry>
</variablelist>
</refsect2>
<xi:include href="standard-conf.xml" xpointer="controller"/>
</refsect1>
<refsect1>
<title>See Also</title>
<para>
<citerefentry>
<refentrytitle>stacd</refentrytitle>
<manvolnum>8</manvolnum>
</citerefentry>
</para>
</refsect1>
</refentry>