summaryrefslogtreecommitdiffstats
path: root/Documentation/networking
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst11
-rw-r--r--Documentation/networking/devlink/devlink-info.rst5
-rw-r--r--Documentation/networking/devlink/devlink-port.rst33
-rw-r--r--Documentation/networking/devlink/devlink-region.rst2
-rw-r--r--Documentation/networking/devlink/hns3.rst5
-rw-r--r--Documentation/networking/devlink/ice.rst47
-rw-r--r--Documentation/networking/devlink/nfp.rst5
-rw-r--r--Documentation/networking/dns_resolver.rst4
-rw-r--r--Documentation/networking/ethtool-netlink.rst29
-rw-r--r--Documentation/networking/filter.rst4
-rw-r--r--Documentation/networking/index.rst1
-rw-r--r--Documentation/networking/nf_conntrack-sysctl.rst4
-rw-r--r--Documentation/networking/pse-pd/index.rst10
-rw-r--r--Documentation/networking/pse-pd/introduction.rst73
-rw-r--r--Documentation/networking/pse-pd/pse-pi.rst301
-rw-r--r--Documentation/networking/xfrm_proc.rst6
-rw-r--r--Documentation/networking/xsk-tx-metadata.rst16
17 files changed, 542 insertions, 14 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index f69ee1ebee..fed821ef9b 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -300,6 +300,11 @@ the software port.
in the beginning of the queue. This is a normal condition.
- Informative
+ * - `tx[i]_timestamps`
+ - Transmitted packets that were hardware timestamped at the device's DMA
+ layer.
+ - Informative
+
* - `tx[i]_added_vlan_packets`
- The number of packets sent where vlan tag insertion was offloaded to the
hardware.
@@ -702,6 +707,12 @@ the software port.
the device typically ensures not posting the CQE.
- Error
+ * - `ptp_cq[i]_lost_cqe`
+ - Number of times a CQE is expected to not be delivered on the PTP
+ timestamping CQE by the device due to a time delta elapsing. If such a
+ CQE is somehow delivered, `ptp_cq[i]_late_cqe` is incremented.
+ - Error
+
.. [#ring_global] The corresponding ring and global counters do not share the
same name (i.e. do not follow the common naming scheme).
diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst
index 1242b0e682..23073bc219 100644
--- a/Documentation/networking/devlink/devlink-info.rst
+++ b/Documentation/networking/devlink/devlink-info.rst
@@ -146,6 +146,11 @@ board.manufacture
An identifier of the company or the facility which produced the part.
+board.part_number
+-----------------
+
+Part number of the board and its components.
+
fw
--
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
index 562f46b412..9d22d41a7c 100644
--- a/Documentation/networking/devlink/devlink-port.rst
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -134,6 +134,9 @@ Users may also set the IPsec crypto capability of the function using
Users may also set the IPsec packet capability of the function using
`devlink port function set ipsec_packet` command.
+Users may also set the maximum IO event queues of the function
+using `devlink port function set max_io_eqs` command.
+
Function attributes
===================
@@ -295,6 +298,36 @@ policy is processed in software by the kernel.
function:
hw_addr 00:00:00:00:00:00 ipsec_packet enabled
+Maximum IO events queues setup
+------------------------------
+When user sets maximum number of IO event queues for a SF or
+a VF, such function driver is limited to consume only enforced
+number of IO event queues.
+
+IO event queues deliver events related to IO queues, including network
+device transmit and receive queues (txq and rxq) and RDMA Queue Pairs (QPs).
+For example, the number of netdevice channels and RDMA device completion
+vectors are derived from the function's IO event queues. Usually, the number
+of interrupt vectors consumed by the driver is limited by the number of IO
+event queues per device, as each of the IO event queues is connected to an
+interrupt vector.
+
+- Get maximum IO event queues of the VF device::
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10
+
+- Set maximum IO event queues of the VF device::
+
+ $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32
+
Subfunction
============
diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
index 9232cd7da3..5d0b68f752 100644
--- a/Documentation/networking/devlink/devlink-region.rst
+++ b/Documentation/networking/devlink/devlink-region.rst
@@ -49,7 +49,7 @@ example usage
$ devlink region show [ DEV/REGION ]
$ devlink region del DEV/REGION snapshot SNAPSHOT_ID
$ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
- $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length length
+ $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length LENGTH
# Show all of the exposed regions with region sizes:
$ devlink region show
diff --git a/Documentation/networking/devlink/hns3.rst b/Documentation/networking/devlink/hns3.rst
index 4562a6e478..72bc1b9f37 100644
--- a/Documentation/networking/devlink/hns3.rst
+++ b/Documentation/networking/devlink/hns3.rst
@@ -23,3 +23,8 @@ The ``hns3`` driver reports the following versions
* - ``fw``
- running
- Used to represent the firmware version.
+ * - ``fw.scc``
+ - running
+ - Used to represent the Soft Congestion Control (SSC) firmware version.
+ SCC is a firmware component which provides multiple RDMA congestion
+ control algorithms, including DCQCN.
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index 7f30ebd5de..830c043542 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -21,6 +21,53 @@ Parameters
* - ``enable_iwarp``
- runtime
- mutually exclusive with ``enable_roce``
+ * - ``tx_scheduling_layers``
+ - permanent
+ - The ice hardware uses hierarchical scheduling for Tx with a fixed
+ number of layers in the scheduling tree. Each of them are decision
+ points. Root node represents a port, while all the leaves represent
+ the queues. This way of configuring the Tx scheduler allows features
+ like DCB or devlink-rate (documented below) to configure how much
+ bandwidth is given to any given queue or group of queues, enabling
+ fine-grained control because scheduling parameters can be configured
+ at any given layer of the tree.
+
+ The default 9-layer tree topology was deemed best for most workloads,
+ as it gives an optimal ratio of performance to configurability. However,
+ for some specific cases, this 9-layer topology might not be desired.
+ One example would be sending traffic to queues that are not a multiple
+ of 8. Because the maximum radix is limited to 8 in 9-layer topology,
+ the 9th queue has a different parent than the rest, and it's given
+ more bandwidth credits. This causes a problem when the system is
+ sending traffic to 9 queues:
+
+ | tx_queue_0_packets: 24163396
+ | tx_queue_1_packets: 24164623
+ | tx_queue_2_packets: 24163188
+ | tx_queue_3_packets: 24163701
+ | tx_queue_4_packets: 24163683
+ | tx_queue_5_packets: 24164668
+ | tx_queue_6_packets: 23327200
+ | tx_queue_7_packets: 24163853
+ | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th
+
+ To address this need, you can switch to a 5-layer topology, which
+ changes the maximum topology radix to 512. With this enhancement,
+ the performance characteristic is equal as all queues can be assigned
+ to the same parent in the tree. The obvious drawback of this solution
+ is a lower configuration depth of the tree.
+
+ Use the ``tx_scheduling_layer`` parameter with the devlink command
+ to change the transmit scheduler topology. To use 5-layer topology,
+ use a value of 5. For example:
+ $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers
+ value 5 cmode permanent
+ Use a value of 9 to set it back to the default value.
+
+ You must do PCI slot powercycle for the selected topology to take effect.
+
+ To verify that value has been set:
+ $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
Info versions
=============
diff --git a/Documentation/networking/devlink/nfp.rst b/Documentation/networking/devlink/nfp.rst
index a1717db0df..3093642bda 100644
--- a/Documentation/networking/devlink/nfp.rst
+++ b/Documentation/networking/devlink/nfp.rst
@@ -32,7 +32,7 @@ The ``nfp`` driver reports the following versions
- Description
* - ``board.id``
- fixed
- - Part number identifying the board design
+ - Identifier of the board design
* - ``board.rev``
- fixed
- Revision of the board design
@@ -42,6 +42,9 @@ The ``nfp`` driver reports the following versions
* - ``board.model``
- fixed
- Model name of the board design
+ * - ``board.part_number``
+ - fixed
+ - Part number of the board and its components
* - ``fw.bundle_id``
- stored, running
- Firmware bundle id
diff --git a/Documentation/networking/dns_resolver.rst b/Documentation/networking/dns_resolver.rst
index add4d59a99..c0364f7070 100644
--- a/Documentation/networking/dns_resolver.rst
+++ b/Documentation/networking/dns_resolver.rst
@@ -118,7 +118,7 @@ Keys of dns_resolver type can be read from userspace using keyctl_read() or
Mechanism
=========
-The dnsresolver module registers a key type called "dns_resolver". Keys of
+The dns_resolver module registers a key type called "dns_resolver". Keys of
this type are used to transport and cache DNS lookup results from userspace.
When dns_query() is invoked, it calls request_key() to search the local
@@ -152,4 +152,4 @@ Debugging
Debugging messages can be turned on dynamically by writing a 1 into the
following file::
- /sys/module/dnsresolver/parameters/debug
+ /sys/module/dns_resolver/parameters/debug
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index d583d9abf2..160bfb0ae8 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -1237,12 +1237,21 @@ Kernel response contents:
``ETHTOOL_A_TSINFO_TX_TYPES`` bitset supported Tx types
``ETHTOOL_A_TSINFO_RX_FILTERS`` bitset supported Rx filters
``ETHTOOL_A_TSINFO_PHC_INDEX`` u32 PTP hw clock index
+ ``ETHTOOL_A_TSINFO_STATS`` nested HW timestamping statistics
===================================== ====== ==========================
``ETHTOOL_A_TSINFO_PHC_INDEX`` is absent if there is no associated PHC (there
is no special value for this case). The bitset attributes are omitted if they
would be empty (no bit set).
+Additional hardware timestamping statistics response contents:
+
+ ===================================== ====== ===================================
+ ``ETHTOOL_A_TS_STAT_TX_PKTS`` uint Packets with Tx HW timestamps
+ ``ETHTOOL_A_TS_STAT_TX_LOST`` uint Tx HW timestamp not arrived count
+ ``ETHTOOL_A_TS_STAT_TX_ERR`` uint HW error request Tx timestamp count
+ ===================================== ====== ===================================
+
CABLE_TEST
==========
@@ -1717,6 +1726,10 @@ Kernel response contents:
PSE functions
``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` u32 power detection status of the
PoDL PSE.
+ ``ETHTOOL_A_C33_PSE_ADMIN_STATE`` u32 Operational state of the PoE
+ PSE functions.
+ ``ETHTOOL_A_C33_PSE_PW_D_STATUS`` u32 power detection status of the
+ PoE PSE.
====================================== ====== =============================
When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` attribute identifies
@@ -1728,6 +1741,12 @@ aPoDLPSEAdminState. Possible values are:
.. kernel-doc:: include/uapi/linux/ethtool.h
:identifiers: ethtool_podl_pse_admin_state
+The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_STATE`` implementing
+``IEEE 802.3-2022`` 30.9.1.1.2 aPSEAdminState.
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+ :identifiers: ethtool_c33_pse_admin_state
+
When set, the optional ``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` attribute identifies
the power detection status of the PoDL PSE. The status depend on internal PSE
state machine and automatic PD classification support. This option is
@@ -1737,6 +1756,12 @@ Possible values are:
.. kernel-doc:: include/uapi/linux/ethtool.h
:identifiers: ethtool_podl_pse_pw_d_status
+The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_PW_D_STATUS`` implementing
+``IEEE 802.3-2022`` 30.9.1.1.5 aPSEPowerDetectionStatus.
+
+.. kernel-doc:: include/uapi/linux/ethtool.h
+ :identifiers: ethtool_c33_pse_pw_d_status
+
PSE_SET
=======
@@ -1747,6 +1772,7 @@ Request contents:
====================================== ====== =============================
``ETHTOOL_A_PSE_HEADER`` nested request header
``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` u32 Control PoDL PSE Admin state
+ ``ETHTOOL_A_C33_PSE_ADMIN_CONTROL`` u32 Control PSE Admin state
====================================== ====== =============================
When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` attribute is used
@@ -1754,6 +1780,9 @@ to control PoDL PSE Admin functions. This option is implementing
``IEEE 802.3-2018`` 30.15.1.2.1 acPoDLPSEAdminControl. See
``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` for supported values.
+The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_CONTROL`` implementing
+``IEEE 802.3-2022`` 30.9.1.2.1 acPSEAdminControl.
+
RSS_GET
=======
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index 7d8c538049..8eb9a5d40f 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -513,7 +513,7 @@ JIT compiler
------------
The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC,
-PowerPC, ARM, ARM64, MIPS, RISC-V and s390 and can be enabled through
+PowerPC, ARM, ARM64, MIPS, RISC-V, s390, and ARC and can be enabled through
CONFIG_BPF_JIT. The JIT compiler is transparently invoked for each
attached filter from user space or for internal kernel users if it has
been previously enabled by root::
@@ -650,7 +650,7 @@ before a conversion to the new layout is being done behind the scenes!
Currently, the classic BPF format is being used for JITing on most
32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64,
-sparc64, arm32, riscv64, riscv32, loongarch64 perform JIT compilation
+sparc64, arm32, riscv64, riscv32, loongarch64, arc perform JIT compilation
from eBPF instruction set.
Testing
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 473d72c36d..7664c0bfe4 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -93,6 +93,7 @@ Contents:
plip
ppp_generic
proc_net_tcp
+ pse-pd/index
radiotap-headers
rds
regulatory
diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst
index c383a394c6..238b66d0e0 100644
--- a/Documentation/networking/nf_conntrack-sysctl.rst
+++ b/Documentation/networking/nf_conntrack-sysctl.rst
@@ -222,11 +222,11 @@ nf_flowtable_tcp_timeout - INTEGER (seconds)
Control offload timeout for tcp connections.
TCP connections may be offloaded from nf conntrack to nf flow table.
- Once aged, the connection is returned to nf conntrack with tcp pickup timeout.
+ Once aged, the connection is returned to nf conntrack.
nf_flowtable_udp_timeout - INTEGER (seconds)
default 30
Control offload timeout for udp connections.
UDP connections may be offloaded from nf conntrack to nf flow table.
- Once aged, the connection is returned to nf conntrack with udp pickup timeout.
+ Once aged, the connection is returned to nf conntrack.
diff --git a/Documentation/networking/pse-pd/index.rst b/Documentation/networking/pse-pd/index.rst
new file mode 100644
index 0000000000..de28a5aee3
--- /dev/null
+++ b/Documentation/networking/pse-pd/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Power Sourcing Equipment (PSE) Documentation
+============================================
+
+.. toctree::
+ :maxdepth: 2
+
+ introduction
+ pse-pi
diff --git a/Documentation/networking/pse-pd/introduction.rst b/Documentation/networking/pse-pd/introduction.rst
new file mode 100644
index 0000000000..e3d3faaef7
--- /dev/null
+++ b/Documentation/networking/pse-pd/introduction.rst
@@ -0,0 +1,73 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Power Sourcing Equipment (PSE) in IEEE 802.3 Standard
+=====================================================
+
+Overview
+--------
+
+Power Sourcing Equipment (PSE) is essential in networks for delivering power
+along with data over Ethernet cables. It usually refers to devices like
+switches and hubs that supply power to Powered Devices (PDs) such as IP
+cameras, VoIP phones, and wireless access points.
+
+PSE vs. PoDL PSE
+----------------
+
+PSE in the IEEE 802.3 standard generally refers to equipment that provides
+power alongside data over Ethernet cables, typically associated with Power over
+Ethernet (PoE).
+
+PoDL PSE, or Power over Data Lines PSE, specifically denotes PSEs operating
+with single balanced twisted-pair PHYs, as per Clause 104 of IEEE 802.3. PoDL
+is significant in contexts like automotive and industrial controls where power
+and data delivery over a single pair is advantageous.
+
+IEEE 802.3-2018 Addendums and Related Clauses
+---------------------------------------------
+
+Key addenda to the IEEE 802.3-2018 standard relevant to power delivery over
+Ethernet are as follows:
+
+- **802.3af (Approved in 2003-06-12)**: Known as PoE in the market, detailed in
+ Clause 33, delivering up to 15.4W of power.
+- **802.3at (Approved in 2009-09-11)**: Marketed as PoE+, enhancing PoE as
+ covered in Clause 33, increasing power delivery to up to 30W.
+- **802.3bt (Approved in 2018-09-27)**: Known as 4PPoE in the market, outlined
+ in Clause 33. Type 3 delivers up to 60W, and Type 4 up to 100W.
+- **802.3bu (Approved in 2016-12-07)**: Formerly referred to as PoDL, detailed
+ in Clause 104. Introduces Classes 0 - 9. Class 9 PoDL PSE delivers up to ~65W
+
+Kernel Naming Convention Recommendations
+----------------------------------------
+
+For clarity and consistency within the Linux kernel's networking subsystem, the
+following naming conventions are recommended:
+
+- For general PSE (PoE) code, use "c33_pse" key words. For example:
+ ``enum ethtool_c33_pse_admin_state c33_admin_control;``.
+ This aligns with Clause 33, encompassing various PoE forms.
+
+- For PoDL PSE - specific code, use "podl_pse". For example:
+ ``enum ethtool_podl_pse_admin_state podl_admin_control;`` to differentiate
+ PoDL PSE settings according to Clause 104.
+
+Summary of Clause 33: Data Terminal Equipment (DTE) Power via Media Dependent Interface (MDI)
+---------------------------------------------------------------------------------------------
+
+Clause 33 of the IEEE 802.3 standard defines the functional and electrical
+characteristics of Powered Device (PD) and Power Sourcing Equipment (PSE).
+These entities enable power delivery using the same generic cabling as for data
+transmission, integrating power with data communication for devices such as
+10BASE-T, 100BASE-TX, or 1000BASE-T.
+
+Summary of Clause 104: Power over Data Lines (PoDL) of Single Balanced Twisted-Pair Ethernet
+--------------------------------------------------------------------------------------------
+
+Clause 104 of the IEEE 802.3 standard delineates the functional and electrical
+characteristics of PoDL Powered Devices (PDs) and PoDL Power Sourcing Equipment
+(PSEs). These are designed for use with single balanced twisted-pair Ethernet
+Physical Layers. In this clause, 'PSE' refers specifically to PoDL PSE, and
+'PD' to PoDL PD. The key intent is to provide devices with a unified interface
+for both data and the power required to process this data over a single
+balanced twisted-pair Ethernet connection.
diff --git a/Documentation/networking/pse-pd/pse-pi.rst b/Documentation/networking/pse-pd/pse-pi.rst
new file mode 100644
index 0000000000..5cad14fedc
--- /dev/null
+++ b/Documentation/networking/pse-pd/pse-pi.rst
@@ -0,0 +1,301 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+PSE Power Interface (PSE PI) Documentation
+==========================================
+
+The Power Sourcing Equipment Power Interface (PSE PI) plays a pivotal role in
+the architecture of Power over Ethernet (PoE) systems. It is essentially a
+blueprint that outlines how one or multiple power sources are connected to the
+eight-pin modular jack, commonly known as the Ethernet RJ45 port. This
+connection scheme is crucial for enabling the delivery of power alongside data
+over Ethernet cables.
+
+Documentation and Standards
+---------------------------
+
+The IEEE 802.3 standard provides detailed documentation on the PSE PI.
+Specifically:
+
+- Section "33.2.3 PI pin assignments" covers the pin assignments for PoE
+ systems that utilize two pairs for power delivery.
+- Section "145.2.4 PSE PI" addresses the configuration for PoE systems that
+ deliver power over all four pairs of an Ethernet cable.
+
+PSE PI and Single Pair Ethernet
+-------------------------------
+
+Single Pair Ethernet (SPE) represents a different approach to Ethernet
+connectivity, utilizing just one pair of conductors for both data and power
+transmission. Unlike the configurations detailed in the PSE PI for standard
+Ethernet, which can involve multiple power sourcing arrangements across four or
+two pairs of wires, SPE operates on a simpler model due to its single-pair
+design. As a result, the complexities of choosing between alternative pin
+assignments for power delivery, as described in the PSE PI for multi-pair
+Ethernet, are not applicable to SPE.
+
+Understanding PSE PI
+--------------------
+
+The Power Sourcing Equipment Power Interface (PSE PI) is a framework defining
+how Power Sourcing Equipment (PSE) delivers power to Powered Devices (PDs) over
+Ethernet cables. It details two main configurations for power delivery, known
+as Alternative A and Alternative B, which are distinguished not only by their
+method of power transmission but also by the implications for polarity and data
+transmission direction.
+
+Alternative A and B Overview
+----------------------------
+
+- **Alternative A:** Utilizes RJ45 conductors 1, 2, 3 and 6. In either case of
+ networks 10/100BaseT or 1G/2G/5G/10GBaseT, the pairs used are carrying data.
+ The power delivery's polarity in this alternative can vary based on the MDI
+ (Medium Dependent Interface) or MDI-X (Medium Dependent Interface Crossover)
+ configuration.
+
+- **Alternative B:** Utilizes RJ45 conductors 4, 5, 7 and 8. In case of
+ 10/100BaseT network the pairs used are spare pairs without data and are less
+ influenced by data transmission direction. This is not the case for
+ 1G/2G/5G/10GBaseT network. Alternative B includes two configurations with
+ different polarities, known as variant X and variant S, to accommodate
+ different network requirements and device specifications.
+
+Table 145-3 PSE Pinout Alternatives
+-----------------------------------
+
+The following table outlines the pin configurations for both Alternative A and
+Alternative B.
+
++------------+-------------------+-----------------+-----------------+-----------------+
+| Conductor | Alternative A | Alternative A | Alternative B | Alternative B |
+| | (MDI-X) | (MDI) | (X) | (S) |
++============+===================+=================+=================+=================+
+| 1 | Negative V | Positive V | - | - |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 2 | Negative V | Positive V | - | - |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 3 | Positive V | Negative V | - | - |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 4 | - | - | Negative V | Positive V |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 5 | - | - | Negative V | Positive V |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 6 | Positive V | Negative V | - | - |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 7 | - | - | Positive V | Negative V |
++------------+-------------------+-----------------+-----------------+-----------------+
+| 8 | - | - | Positive V | Negative V |
++------------+-------------------+-----------------+-----------------+-----------------+
+
+.. note::
+ - "Positive V" and "Negative V" indicate the voltage polarity for each pin.
+ - "-" indicates that the pin is not used for power delivery in that
+ specific configuration.
+
+PSE PI compatibilities
+----------------------
+
+The following table outlines the compatibility between the pinout alternative
+and the 1000/2.5G/5G/10GBaseT in the PSE 2 pairs connection.
+
++---------+---------------+---------------------+-----------------------+
+| Variant | Alternative | Power Feeding Type | Compatibility with |
+| | (A/B) | (Direct/Phantom) | 1000/2.5G/5G/10GBaseT |
++=========+===============+=====================+=======================+
+| 1 | A | Phantom | Yes |
++---------+---------------+---------------------+-----------------------+
+| 2 | B | Phantom | Yes |
++---------+---------------+---------------------+-----------------------+
+| 3 | B | Direct | No |
++---------+---------------+---------------------+-----------------------+
+
+.. note::
+ - "Direct" indicate a variant where the power is injected directly to pairs
+ without using magnetics in case of spare pairs.
+ - "Phantom" indicate power path over coils/magnetics as it is done for
+ Alternative A variant.
+
+In case of PSE 4 pairs, a PSE supporting only 10/100BaseT (which mean Direct
+Power on pinout Alternative B) is not compatible with a 4 pairs
+1000/2.5G/5G/10GBaseT.
+
+PSE Power Interface (PSE PI) Connection Diagram
+-----------------------------------------------
+
+The diagram below illustrates the connection architecture between the RJ45
+port, the Ethernet PHY (Physical Layer), and the PSE PI (Power Sourcing
+Equipment Power Interface), demonstrating how power and data are delivered
+simultaneously through an Ethernet cable. The RJ45 port serves as the physical
+interface for these connections, with each of its eight pins connected to both
+the Ethernet PHY for data transmission and the PSE PI for power delivery.
+
+.. code-block::
+
+ +--------------------------+
+ | |
+ | RJ45 Port |
+ | |
+ +--+--+--+--+--+--+--+--+--+ +-------------+
+ 1| 2| 3| 4| 5| 6| 7| 8| | |
+ | | | | | | | o-------------------+ |
+ | | | | | | o--|-------------------+ +<--- PSE 1
+ | | | | | o--|--|-------------------+ |
+ | | | | o--|--|--|-------------------+ |
+ | | | o--|--|--|--|-------------------+ PSE PI |
+ | | o--|--|--|--|--|-------------------+ |
+ | o--|--|--|--|--|--|-------------------+ +<--- PSE 2 (optional)
+ o--|--|--|--|--|--|--|-------------------+ |
+ | | | | | | | | | |
+ +--+--+--+--+--+--+--+--+--+ +-------------+
+ | |
+ | Ethernet PHY |
+ | |
+ +--------------------------+
+
+Simple PSE PI Configuration for Alternative A
+---------------------------------------------
+
+The diagram below illustrates a straightforward PSE PI (Power Sourcing
+Equipment Power Interface) configuration designed to support the Alternative A
+setup for Power over Ethernet (PoE). This implementation is tailored to provide
+power delivery through the data-carrying pairs of an Ethernet cable, suitable
+for either MDI or MDI-X configurations, albeit supporting one variation at a
+time.
+
+.. code-block::
+
+ +-------------+
+ | PSE PI |
+ 8 -----+ +-------------+
+ 7 -----+ Rail 1 |
+ 6 -----+------+----------------------+
+ 5 -----+ | |
+ 4 -----+ | Rail 2 | PSE 1
+ 3 -----+------/ +------------+
+ 2 -----+--+-------------/ |
+ 1 -----+--/ +-------------+
+ |
+ +-------------+
+
+In this configuration:
+
+- Pins 1 and 2, as well as pins 3 and 6, are utilized for power delivery in
+ addition to data transmission. This aligns with the standard wiring for
+ 10/100BaseT Ethernet networks where these pairs are used for data.
+- Rail 1 and Rail 2 represent the positive and negative voltage rails, with
+ Rail 1 connected to pins 1 and 2, and Rail 2 connected to pins 3 and 6.
+ More advanced PSE PI configurations may include integrated or external
+ switches to change the polarity of the voltage rails, allowing for
+ compatibility with both MDI and MDI-X configurations.
+
+More complex PSE PI configurations may include additional components, to support
+Alternative B, or to provide additional features such as power management, or
+additional power delivery capabilities such as 2-pair or 4-pair power delivery.
+
+.. code-block::
+
+ +-------------+
+ | PSE PI |
+ | +---+
+ 8 -----+--------+ | +-------------+
+ 7 -----+--------+ | Rail 1 |
+ 6 -----+--------+ +-----------------+
+ 5 -----+--------+ | |
+ 4 -----+--------+ | Rail 2 | PSE 1
+ 3 -----+--------+ +----------------+
+ 2 -----+--------+ | |
+ 1 -----+--------+ | +-------------+
+ | +---+
+ +-------------+
+
+Device Tree Configuration: Describing PSE PI Configurations
+-----------------------------------------------------------
+
+The necessity for a separate PSE PI node in the device tree is influenced by
+the intricacy of the Power over Ethernet (PoE) system's setup. Here are
+descriptions of both simple and complex PSE PI configurations to illustrate
+this decision-making process:
+
+**Simple PSE PI Configuration:**
+In a straightforward scenario, the PSE PI setup involves a direct, one-to-one
+connection between a single PSE controller and an Ethernet port. This setup
+typically supports basic PoE functionality without the need for dynamic
+configuration or management of multiple power delivery modes. For such simple
+configurations, detailing the PSE PI within the existing PSE controller's node
+may suffice, as the system does not encompass additional complexity that
+warrants a separate node. The primary focus here is on the clear and direct
+association of power delivery to a specific Ethernet port.
+
+**Complex PSE PI Configuration:**
+Contrastingly, a complex PSE PI setup may encompass multiple PSE controllers or
+auxiliary circuits that collectively manage power delivery to one Ethernet
+port. Such configurations might support a range of PoE standards and require
+the capability to dynamically configure power delivery based on the operational
+mode (e.g., PoE2 versus PoE4) or specific requirements of connected devices. In
+these instances, a dedicated PSE PI node becomes essential for accurately
+documenting the system architecture. This node would serve to detail the
+interactions between different PSE controllers, the support for various PoE
+modes, and any additional logic required to coordinate power delivery across
+the network infrastructure.
+
+**Guidance:**
+
+For simple PSE setups, including PSE PI information in the PSE controller node
+might suffice due to the straightforward nature of these systems. However,
+complex configurations, involving multiple components or advanced PoE features,
+benefit from a dedicated PSE PI node. This method adheres to IEEE 802.3
+specifications, improving documentation clarity and ensuring accurate
+representation of the PoE system's complexity.
+
+PSE PI Node: Essential Information
+----------------------------------
+
+The PSE PI (Power Sourcing Equipment Power Interface) node in a device tree can
+include several key pieces of information critical for defining the power
+delivery capabilities and configurations of a PoE (Power over Ethernet) system.
+Below is a list of such information, along with explanations for their
+necessity and reasons why they might not be found within a PSE controller node:
+
+1. **Powered Pairs Configuration**
+
+ - *Description:* Identifies the pairs used for power delivery in the
+ Ethernet cable.
+ - *Necessity:* Essential to ensure the correct pairs are powered according
+ to the board's design.
+ - *PSE Controller Node:* Typically lacks details on physical pair usage,
+ focusing on power regulation.
+
+2. **Polarity of Powered Pairs**
+
+ - *Description:* Specifies the polarity (positive or negative) for each
+ powered pair.
+ - *Necessity:* Critical for safe and effective power transmission to PDs.
+ - *PSE Controller Node:* Polarity management may exceed the standard
+ functionalities of PSE controllers.
+
+3. **PSE Cells Association**
+
+ - *Description:* Details the association of PSE cells with Ethernet ports or
+ pairs in multi-cell configurations.
+ - *Necessity:* Allows for optimized power resource allocation in complex
+ systems.
+ - *PSE Controller Node:* Controllers may not manage cell associations
+ directly, focusing instead on power flow regulation.
+
+4. **Support for PoE Standards**
+
+ - *Description:* Lists the PoE standards and configurations supported by the
+ system.
+ - *Necessity:* Ensures system compatibility with various PDs and adherence
+ to industry standards.
+ - *PSE Controller Node:* Specific capabilities may depend on the overall PSE
+ PI design rather than the controller alone. Multiple PSE cells per PI
+ do not necessarily imply support for multiple PoE standards.
+
+5. **Protection Mechanisms**
+
+ - *Description:* Outlines additional protection mechanisms, such as
+ overcurrent protection and thermal management.
+ - *Necessity:* Provides extra safety and stability, complementing PSE
+ controller protections.
+ - *PSE Controller Node:* Some protections may be implemented via
+ board-specific hardware or algorithms external to the controller.
diff --git a/Documentation/networking/xfrm_proc.rst b/Documentation/networking/xfrm_proc.rst
index 0a771c5a73..973d1571ac 100644
--- a/Documentation/networking/xfrm_proc.rst
+++ b/Documentation/networking/xfrm_proc.rst
@@ -73,6 +73,9 @@ XfrmAcquireError:
XfrmFwdHdrError:
Forward routing of a packet is not allowed
+XfrmInStateDirError:
+ State direction mismatch (lookup found an output state on the input path, expected input or no direction)
+
Outbound errors
~~~~~~~~~~~~~~~
XfrmOutError:
@@ -111,3 +114,6 @@ XfrmOutPolError:
XfrmOutStateInvalid:
State is invalid, perhaps expired
+
+XfrmOutStateDirError:
+ State direction mismatch (lookup found an input state on the output path, expected output or no direction)
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index bd033fe95c..e76b0cfc32 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -11,12 +11,16 @@ metadata on the receive side.
General Design
==============
-The headroom for the metadata is reserved via ``tx_metadata_len`` in
-``struct xdp_umem_reg``. The metadata length is therefore the same for
-every socket that shares the same umem. The metadata layout is a fixed UAPI,
-refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``.
-Thus, generally, the ``tx_metadata_len`` field above should contain
-``sizeof(union xsk_tx_metadata)``.
+The headroom for the metadata is reserved via ``tx_metadata_len`` and
+``XDP_UMEM_TX_METADATA_LEN`` flag in ``struct xdp_umem_reg``. The metadata
+length is therefore the same for every socket that shares the same umem.
+The metadata layout is a fixed UAPI, refer to ``union xsk_tx_metadata`` in
+``include/uapi/linux/if_xdp.h``. Thus, generally, the ``tx_metadata_len``
+field above should contain ``sizeof(union xsk_tx_metadata)``.
+
+Note that in the original implementation the ``XDP_UMEM_TX_METADATA_LEN``
+flag was not required. Applications might attempt to create a umem
+with a flag first and if it fails, do another attempt without a flag.
The headroom and the metadata itself should be located right before
``xdp_desc->addr`` in the umem frame. Within a frame, the metadata