stacd.conf(5)
nvme-stas
Author: Martin Belanger, Dell, Inc.
Name
stacd.conf - stacd(8) configuration file
Synopsis
/etc/stas/stacd.conf
Description
When stacd(8) starts up, it reads its
configuration from stacd.conf.
Configuration File Format
stacd.conf is a plain text file divided into
sections, with configuration entries in the style
key=value.
Spaces immediately before or after the = are
ignored. Empty lines and lines starting with # are also
ignored; the latter may be used for comments.
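As an illustration of the format, the following Python sketch parses a small fragment with the standard configparser module. The sample values are made up for the example, and restricting the parser to = and # is an assumption chosen to mirror the rules above; it is not necessarily how stacd reads the file.

```python
import configparser

# Hypothetical fragment; the values are for illustration only.
SAMPLE = """
# Lines starting with '#' are comments.

[Global]
nr-io-queues = 4
ignore-iface=false
"""

# Accept only '=' as the delimiter and '#' as the comment prefix,
# mirroring the format described above.
parser = configparser.ConfigParser(delimiters=("=",), comment_prefixes=("#",))
parser.read_string(SAMPLE)

glob = parser["Global"]
nr_io_queues = glob.getint("nr-io-queues")      # spaces around '=' are dropped
ignore_iface = glob.getboolean("ignore-iface")  # parsed as a boolean
print(nr_io_queues, ignore_iface)
```

Both spellings, with and without spaces around the =, yield the same key and value.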
Options
[Global] section
The following options are available in the
[Global] section:
nr-io-queues=
Takes a value in the range 1...N. Overrides the
default number of I/O queues created by the driver.
Note: This parameter is identical to that provided by nvme-cli.
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
nr-write-queues=
Takes a value in the range 1...N. Adds
queues that will be used for write I/O.
Note: This parameter is identical to that provided by nvme-cli.
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
nr-poll-queues=
Takes a value in the range 1...N. Adds
queues that will be used for polling
latency-sensitive I/O.
Note: This parameter is identical to that provided by nvme-cli.
Default: Depends on the kernel and other run-time
factors (e.g. number of CPUs).
ignore-iface=
Takes a boolean argument. This option controls how
connections with I/O Controllers (IOC) are made.
There is no guarantee that the routing tables provide
a route to a given IOC. However, the socket
option SO_BINDTODEVICE can be used to force the
connection to be made on a specific interface instead
of letting the routing tables decide where to make the
connection. This option determines whether stacd
uses SO_BINDTODEVICE to force connections on an
interface or just relies on the routing tables. The
default is to use SO_BINDTODEVICE; in other words,
stacd does not ignore the interface.
BACKGROUND:
By default, stacd will connect to IOCs on the same
interface that was used to retrieve the discovery
log pages. If stafd discovers a DC on an interface
using mDNS, and stafd connects to that DC and
retrieves the log pages, it is expected that the
storage subsystems listed in the log pages are
reachable on the same interface where the DC was
discovered.
For example, let's say a DC is discovered on
interface ens102. Then all the subsystems listed in
the log pages retrieved from that DC must be
reachable on interface ens102. If this doesn't work
(for example, "ping -I ens102 [storage-ip]" fails),
then the most likely explanation is that proxy ARP
is not enabled on the switch that the host is
connected to on interface ens102. Whatever you do,
resist the temptation to manually set up the routing
tables or to add alternate routes going over a
different interface than the one where the DC is
located. That simply won't work. Make sure proxy ARP
is enabled on the switch first.
Setting routes won't work because, by default, stacd
uses the SO_BINDTODEVICE socket option when it
connects to IOCs. This option is used to force a
socket connection to be made on a specific interface
instead of letting the routing tables decide where
to connect the socket. Even if you were to manually
configure an alternate route on a different interface,
the connections (i.e. host to IOC) will still be
made on the interface where the DC was discovered by
stafd.
Defaults to false.
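The effect of this option can be sketched in Python as follows. This only illustrates the SO_BINDTODEVICE socket option itself; the function names are hypothetical and the code is not taken from stacd. Binding a socket to a device is Linux-specific and may require elevated privileges.

```python
import socket

def iface_for_ioc(ignore_iface, discovery_iface):
    """Return the interface to bind to, or None to rely on the routing tables."""
    return None if ignore_iface else discovery_iface

def open_socket(iface=None):
    """Create a TCP socket, optionally pinned to one network interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if iface is not None:
        # Linux-only; may fail without sufficient privileges.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, iface.encode())
    return sock

# ignore-iface=false (the default): stay on the interface where the DC was found.
print(iface_for_ioc(False, "ens102"))
# ignore-iface=true: no binding, the routing tables decide.
print(iface_for_ioc(True, "ens102"))
```

With a bound socket, an alternate route configured on another interface has no effect, which is why manually adding routes does not help in the default mode.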
[I/O controller connection management] section
Connectivity between hosts and subsystems in a fabric is
controlled by Fabric Zoning. Entities that share a common
zone (i.e., are zoned together) are allowed to discover each
other and establish connections between them. Fabric Zoning is
configured on Discovery Controllers (DC). Users can add/remove
controllers and/or hosts to/from zones.
Hosts have no direct knowledge of the Fabric Zoning configuration
that is active on a given DC. As a result, if a host is impacted
by a Fabric Zoning configuration change, it will be notified of
the connectivity configuration change by the DC via Asynchronous
Event Notifications (AEN).
List of terms used in this section:
Term
Description
AEN
Asynchronous Event Notification. A CQE (Completion Queue Entry) for an Asynchronous Event Request that was previously transmitted by the host to a Discovery Controller. AENs are used by DCs to notify hosts that a change (e.g., a connectivity configuration change) has occurred.
DC
Discovery Controller.
DLP
Discovery Log Page. A host will issue a Get Log Page command to retrieve the list of controllers it may connect to.
DLPE
Discovery Log Page Entry. The response
to a Get Log Page command contains a list of DLPEs identifying
each controller that the host is allowed to connect with.
Note that DLPEs may contain both I/O Controllers (IOCs)
and Discovery Controllers (DCs). DCs listed in DLPEs
are called referrals. stacd only deals with IOCs.
Referrals (DCs) are handled by stafd.
IOC
I/O Controller.
Manual Config
Refers to manually adding entries to stacd.conf with the controller= parameter.
Automatic Config
Refers to receiving configuration from a DC as DLPEs.
External Config
Refers to configuration done outside of the nvme-stas
framework, for example using nvme-cli commands.
DCs notify hosts of connectivity configuration changes by sending
AENs indicating a "Discovery Log" change. The host uses these AENs as
a trigger to issue a Get Log Page command. The response to this command
is used to update the list of DLPEs containing the controllers
the host is allowed to access.
Upon reception of the current DLPEs, the host will determine
whether DLPEs were added and/or removed, which will trigger the
addition and/or removal of controller connections. This happens in real time
and may affect active connections to controllers including controllers
that support I/O operations (IOCs). A host that was previously
connected to an IOC may suddenly be told that it is no longer
allowed to connect to that IOC and should disconnect from it.
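The comparison between the previous and current DLPEs amounts to a set difference. The sketch below is a minimal illustration, assuming each DLPE can be reduced to a hashable identifier; the tuple layout, addresses, and NQNs are hypothetical and do not reflect the actual data structures used by stacd.

```python
def diff_dlpes(previous, current):
    """Return (added, removed) between two sets of DLPE identifiers."""
    return current - previous, previous - current

# Hypothetical identifiers: (transport, address, subsystem NQN).
old = {
    ("tcp", "192.0.2.10", "nqn.2014-08.org.example:subsys1"),
    ("tcp", "192.0.2.11", "nqn.2014-08.org.example:subsys2"),
}
new = {
    ("tcp", "192.0.2.11", "nqn.2014-08.org.example:subsys2"),
    ("tcp", "192.0.2.12", "nqn.2014-08.org.example:subsys3"),
}

added, removed = diff_dlpes(old, new)
print("connect to:", sorted(added))
print("candidates for disconnect:", sorted(removed))
```

Entries present in both sets are left alone; only the differences trigger connection changes.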
IOC connection creation
There are 3 ways to configure IOC connections on a host:
1. Manual Config, by adding controller= entries
to the [Controllers] section (see below).
2. Automatic Config, received in the form of
DLPEs from a remote DC.
3. External Config, using nvme-cli
(e.g. "nvme connect").
IOC connection removal/prevention
There are 3 ways to remove (or prevent) connections to an IOC:
1. Manual Config, either
by adding exclude= entries to
the [Controllers] section (see below), or
by removing controller= entries
from the [Controllers] section.
2. Automatic Config. As explained above, a host gets a
new list of DLPEs upon connectivity configuration
changes. On DLPE removal, the host should remove the
connection to the IOC matching that DLPE. This
behavior is configurable using the
disconnect-scope= parameter
described below.
3. External Config, using nvme-cli
(e.g. "nvme disconnect" or "nvme disconnect-all").
The decision by the host to automatically disconnect from an
IOC following connectivity configuration changes is controlled
by two parameters: disconnect-scope and disconnect-trtypes.
disconnect-scope=
Takes one of: only-stas-connections,
all-connections-matching-disconnect-trtypes, or no-disconnect.
In theory, hosts should only connect to IOCs that have
been zoned for them. Connections to IOCs that a host
is not zoned to access should simply not exist.
In practice, however, users may not want hosts to
disconnect from IOCs (or at least from some of them)
in reaction to connectivity configuration changes.
Some users may prefer for IOC connections to be "sticky"
and only be removed manually (nvme-cli or
exclude=) or by a system
reboot. Specifically, they don't want IOC connections
to be removed unexpectedly on DLPE removal. These users
may want to set disconnect-scope to no-disconnect.
It is important to note that when IOC connections
are removed, ongoing I/O transactions will be
terminated immediately. There is no way to tell what
happens to the data being exchanged when such an abrupt
termination happens. If a host was in the middle of writing
to a storage subsystem, there is a chance that outstanding
I/O operations may not successfully complete.
Values:
only-stas-connections
Only remove connections previously made by stacd.
In this mode, when a DLPE is removed as a result of
connectivity configuration changes, the corresponding
IOC connection will be removed by stacd.
Connections to IOCs made externally, e.g. using nvme-cli,
will not be affected, unless they happen to be duplicates
of connections made by stacd. It's simply not
possible for stacd to tell that a connection
was previously made with nvme-cli (or any other external tool).
So, it's good practice to avoid duplicating
configuration between stacd and external tools.
Users wanting to persist some of their IOC connections
regardless of connectivity configuration changes should not use
nvme-cli to make those connections. Instead,
they should hard-code them in stacd.conf
with the controller= parameter. Using the
controller= parameter is the only way for a user
to tell stacd that a connection must be made and
not be deleted "no-matter-what".
all-connections-matching-disconnect-trtypes
All connections that match the transport type specified by
disconnect-trtypes=, whether they were
made automatically by stacd or externally
(e.g., nvme-cli), will be audited and are
subject to removal on DLPE removal.
In this mode, as DLPEs are removed as a result of
connectivity configuration changes, the corresponding
IOC connections will be removed by the host immediately
whether they were made by stacd, nvme-cli,
or any other way. Basically, stacd audits
all IOC connections matching the
transport type specified by disconnect-trtypes=.
NOTE
This mode implies that stacd will
only allow Manually Configured or Automatically
Configured IOC connections to exist. Externally
Configured connections using nvme-cli (or other external mechanism)
that do not match any Manual Config (stacd.conf)
or Automatic Config (DLPEs) will get deleted
immediately by stacd.
no-disconnect
stacd does not disconnect from IOCs
when a DLPE is removed or a controller=
entry is removed from stacd.conf.
All IOC connections are "sticky".
Instead, users can remove connections
by issuing the nvme-cli command "nvme disconnect",
by adding an exclude= entry to stacd.conf, or by waiting
until the next system reboot, at which time all
connections will be removed.
Defaults to only-stas-connections.
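The three modes can be summarized as a small decision function. This is a sketch of the rules described above, not the stacd implementation: the function name, the dictionary keys, and the made_by_stacd flag are hypothetical, and the flag glosses over the duplicate-connection caveat mentioned earlier.

```python
def should_disconnect(scope, conn, wanted, audited_trtypes):
    """Decide whether an existing IOC connection is subject to removal.

    conn:            dict with hypothetical keys 'id', 'trtype', 'made_by_stacd'
    wanted:          ids covered by Manual Config or by the current DLPEs
    audited_trtypes: transport types listed in disconnect-trtypes=
    """
    if scope == "no-disconnect" or conn["id"] in wanted:
        return False
    if scope == "only-stas-connections":
        return conn["made_by_stacd"]
    if scope == "all-connections-matching-disconnect-trtypes":
        return conn["trtype"] in audited_trtypes
    raise ValueError(f"unknown disconnect-scope: {scope}")

# A connection made externally (e.g. with nvme-cli) whose DLPE no longer exists.
external = {"id": "ioc-a", "trtype": "tcp", "made_by_stacd": False}
for scope in ("only-stas-connections",
              "all-connections-matching-disconnect-trtypes",
              "no-disconnect"):
    print(scope, should_disconnect(scope, external, set(), {"tcp"}))
```

Only the audit mode removes the externally made connection; the other two modes leave it in place.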
disconnect-trtypes=
This parameter only applies when disconnect-scope
is set to all-connections-matching-disconnect-trtypes.
It limits the scope of the audit to specific transport types.
Can take the values tcp,
rdma, fc, or
a combination thereof by separating them with a plus (+) sign.
For example: tcp+fc. No spaces
are allowed between values and the plus (+) sign.
Values:
tcp
Audit TCP connections.
rdma
Audit RDMA connections.
fc
Audit Fibre Channel connections.
Defaults to tcp.
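The value syntax can be checked with a few lines of Python. The helper below is a hypothetical sketch of such a check (the name parse_trtypes is not part of nvme-stas); it rejects unknown transport types and any whitespace around the plus sign.

```python
VALID_TRTYPES = {"tcp", "rdma", "fc"}

def parse_trtypes(value):
    """Split a disconnect-trtypes= value such as 'tcp+fc' into a set."""
    parts = value.split("+")
    unknown = [p for p in parts if p not in VALID_TRTYPES]
    if unknown:
        # Also triggers for 'tcp + fc', since 'tcp ' and ' fc' are not valid.
        raise ValueError(f"invalid transport type(s): {unknown}")
    return set(parts)

print(sorted(parse_trtypes("tcp+fc")))
```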
connect-attempts-on-ncc=
The NCC bit (Not Connected to CDC) is a bit returned
by the Centralized Discovery Controller (CDC) in the EFLAGS
field of the DLPE. Only CDCs will set the NCC bit. Direct
Discovery Controllers (DDCs) will always clear NCC to
0. The NCC bit is a way for the CDC to let hosts
know that the subsystem is currently not reachable
by the CDC. This may indicate that the subsystem is
currently down or that there is an outage on the
section of the network connecting the CDC to the
subsystem.
If a host is currently failing to connect to an I/O
controller and if the NCC bit associated with that
I/O controller is asserted, the host can decide to
stop trying to connect to that subsystem until
connectivity is restored. This will be indicated by
the CDC when it clears the NCC bit.
The parameter connect-attempts-on-ncc=
controls whether stacd will take the
NCC bit into account when attempting to connect to
an I/O Controller. Setting connect-attempts-on-ncc=
to 0 means that stacd will ignore
the NCC bit and will keep trying to connect. Setting
connect-attempts-on-ncc= to a
non-zero value indicates the number of connection
attempts that will be made before stacd
gives up trying. Note that this value should be set
to a value greater than 1. In fact, when set to 1,
stacd will automatically use 2 instead.
The reason for this is simple: a first connect attempt
may fail, so at least one retry is always made.
Defaults to 0.
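The rules above can be expressed as a short Python sketch. The function names are hypothetical and the code only restates the documented behavior (0 ignores NCC, 1 is treated as 2); it is not the retry logic used by stacd, and negative values are not handled.

```python
def attempt_limit(configured):
    """Translate connect-attempts-on-ncc= into an attempt limit.

    Returns None when the NCC bit is ignored (configured value 0).
    """
    if configured == 0:
        return None
    return max(configured, 2)  # a value of 1 is bumped to 2

def keep_trying(configured, ncc_set, failed_attempts):
    """Return True while another connection attempt should be made."""
    limit = attempt_limit(configured)
    if limit is None or not ncc_set:
        return True
    return failed_attempts < limit

print(keep_trying(0, True, 100))  # NCC ignored, retries continue
print(keep_trying(1, True, 1))    # limit is 2, one attempt left
print(keep_trying(3, True, 3))    # limit reached
```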
See Also
stacd(8)