Diffstat (limited to 'Documentation/networking')
-rw-r--r-- | Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst | 11
-rw-r--r-- | Documentation/networking/devlink/devlink-info.rst | 5
-rw-r--r-- | Documentation/networking/devlink/devlink-port.rst | 33
-rw-r--r-- | Documentation/networking/devlink/devlink-region.rst | 2
-rw-r--r-- | Documentation/networking/devlink/hns3.rst | 5
-rw-r--r-- | Documentation/networking/devlink/ice.rst | 47
-rw-r--r-- | Documentation/networking/devlink/nfp.rst | 5
-rw-r--r-- | Documentation/networking/dns_resolver.rst | 4
-rw-r--r-- | Documentation/networking/ethtool-netlink.rst | 29
-rw-r--r-- | Documentation/networking/filter.rst | 4
-rw-r--r-- | Documentation/networking/index.rst | 1
-rw-r--r-- | Documentation/networking/nf_conntrack-sysctl.rst | 4
-rw-r--r-- | Documentation/networking/pse-pd/index.rst | 10
-rw-r--r-- | Documentation/networking/pse-pd/introduction.rst | 73
-rw-r--r-- | Documentation/networking/pse-pd/pse-pi.rst | 301
-rw-r--r-- | Documentation/networking/xfrm_proc.rst | 6
-rw-r--r-- | Documentation/networking/xsk-tx-metadata.rst | 16
17 files changed, 542 insertions, 14 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst index f69ee1ebee..fed821ef9b 100644 --- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst +++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst @@ -300,6 +300,11 @@ the software port. in the beginning of the queue. This is a normal condition. - Informative + * - `tx[i]_timestamps` + - Transmitted packets that were hardware timestamped at the device's DMA + layer. + - Informative + * - `tx[i]_added_vlan_packets` - The number of packets sent where vlan tag insertion was offloaded to the hardware. @@ -702,6 +707,12 @@ the software port. the device typically ensures not posting the CQE. - Error + * - `ptp_cq[i]_lost_cqe` + - Number of times a CQE is expected to not be delivered on the PTP + timestamping CQE by the device due to a time delta elapsing. If such a + CQE is somehow delivered, `ptp_cq[i]_late_cqe` is incremented. + - Error + .. [#ring_global] The corresponding ring and global counters do not share the same name (i.e. do not follow the common naming scheme). diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst index 1242b0e682..23073bc219 100644 --- a/Documentation/networking/devlink/devlink-info.rst +++ b/Documentation/networking/devlink/devlink-info.rst @@ -146,6 +146,11 @@ board.manufacture An identifier of the company or the facility which produced the part. +board.part_number +----------------- + +Part number of the board and its components. 
+ fw -- diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst index 562f46b412..9d22d41a7c 100644 --- a/Documentation/networking/devlink/devlink-port.rst +++ b/Documentation/networking/devlink/devlink-port.rst @@ -134,6 +134,9 @@ Users may also set the IPsec crypto capability of the function using Users may also set the IPsec packet capability of the function using `devlink port function set ipsec_packet` command. +Users may also set the maximum IO event queues of the function +using `devlink port function set max_io_eqs` command. + Function attributes =================== @@ -295,6 +298,36 @@ policy is processed in software by the kernel. function: hw_addr 00:00:00:00:00:00 ipsec_packet enabled +Maximum IO events queues setup +------------------------------ +When user sets maximum number of IO event queues for a SF or +a VF, such function driver is limited to consume only enforced +number of IO event queues. + +IO event queues deliver events related to IO queues, including network +device transmit and receive queues (txq and rxq) and RDMA Queue Pairs (QPs). +For example, the number of netdevice channels and RDMA device completion +vectors are derived from the function's IO event queues. Usually, the number +of interrupt vectors consumed by the driver is limited by the number of IO +event queues per device, as each of the IO event queues is connected to an +interrupt vector. 
+ +- Get maximum IO event queues of the VF device:: + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 10 + +- Set maximum IO event queues of the VF device:: + + $ devlink port function set pci/0000:06:00.0/2 max_io_eqs 32 + + $ devlink port show pci/0000:06:00.0/2 + pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1 + function: + hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32 + Subfunction ============ diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst index 9232cd7da3..5d0b68f752 100644 --- a/Documentation/networking/devlink/devlink-region.rst +++ b/Documentation/networking/devlink/devlink-region.rst @@ -49,7 +49,7 @@ example usage $ devlink region show [ DEV/REGION ] $ devlink region del DEV/REGION snapshot SNAPSHOT_ID $ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ] - $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length length + $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ] address ADDRESS length LENGTH # Show all of the exposed regions with region sizes: $ devlink region show diff --git a/Documentation/networking/devlink/hns3.rst b/Documentation/networking/devlink/hns3.rst index 4562a6e478..72bc1b9f37 100644 --- a/Documentation/networking/devlink/hns3.rst +++ b/Documentation/networking/devlink/hns3.rst @@ -23,3 +23,8 @@ The ``hns3`` driver reports the following versions * - ``fw`` - running - Used to represent the firmware version. + * - ``fw.scc`` + - running + - Used to represent the Soft Congestion Control (SSC) firmware version. + SCC is a firmware component which provides multiple RDMA congestion + control algorithms, including DCQCN. 
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst index 7f30ebd5de..830c043542 100644 --- a/Documentation/networking/devlink/ice.rst +++ b/Documentation/networking/devlink/ice.rst @@ -21,6 +21,53 @@ Parameters * - ``enable_iwarp`` - runtime - mutually exclusive with ``enable_roce`` + * - ``tx_scheduling_layers`` + - permanent + - The ice hardware uses hierarchical scheduling for Tx with a fixed + number of layers in the scheduling tree. Each of them are decision + points. Root node represents a port, while all the leaves represent + the queues. This way of configuring the Tx scheduler allows features + like DCB or devlink-rate (documented below) to configure how much + bandwidth is given to any given queue or group of queues, enabling + fine-grained control because scheduling parameters can be configured + at any given layer of the tree. + + The default 9-layer tree topology was deemed best for most workloads, + as it gives an optimal ratio of performance to configurability. However, + for some specific cases, this 9-layer topology might not be desired. + One example would be sending traffic to queues that are not a multiple + of 8. Because the maximum radix is limited to 8 in 9-layer topology, + the 9th queue has a different parent than the rest, and it's given + more bandwidth credits. This causes a problem when the system is + sending traffic to 9 queues: + + | tx_queue_0_packets: 24163396 + | tx_queue_1_packets: 24164623 + | tx_queue_2_packets: 24163188 + | tx_queue_3_packets: 24163701 + | tx_queue_4_packets: 24163683 + | tx_queue_5_packets: 24164668 + | tx_queue_6_packets: 23327200 + | tx_queue_7_packets: 24163853 + | tx_queue_8_packets: 91101417 < Too much traffic is sent from 9th + + To address this need, you can switch to a 5-layer topology, which + changes the maximum topology radix to 512. 
With this enhancement, + the performance characteristic is equal as all queues can be assigned + to the same parent in the tree. The obvious drawback of this solution + is a lower configuration depth of the tree. + + Use the ``tx_scheduling_layer`` parameter with the devlink command + to change the transmit scheduler topology. To use 5-layer topology, + use a value of 5. For example: + $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers + value 5 cmode permanent + Use a value of 9 to set it back to the default value. + + You must do PCI slot powercycle for the selected topology to take effect. + + To verify that value has been set: + $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers Info versions ============= diff --git a/Documentation/networking/devlink/nfp.rst b/Documentation/networking/devlink/nfp.rst index a1717db0df..3093642bda 100644 --- a/Documentation/networking/devlink/nfp.rst +++ b/Documentation/networking/devlink/nfp.rst @@ -32,7 +32,7 @@ The ``nfp`` driver reports the following versions - Description * - ``board.id`` - fixed - - Part number identifying the board design + - Identifier of the board design * - ``board.rev`` - fixed - Revision of the board design @@ -42,6 +42,9 @@ The ``nfp`` driver reports the following versions * - ``board.model`` - fixed - Model name of the board design + * - ``board.part_number`` + - fixed + - Part number of the board and its components * - ``fw.bundle_id`` - stored, running - Firmware bundle id diff --git a/Documentation/networking/dns_resolver.rst b/Documentation/networking/dns_resolver.rst index add4d59a99..c0364f7070 100644 --- a/Documentation/networking/dns_resolver.rst +++ b/Documentation/networking/dns_resolver.rst @@ -118,7 +118,7 @@ Keys of dns_resolver type can be read from userspace using keyctl_read() or Mechanism ========= -The dnsresolver module registers a key type called "dns_resolver". Keys of +The dns_resolver module registers a key type called "dns_resolver". 
Keys of this type are used to transport and cache DNS lookup results from userspace. When dns_query() is invoked, it calls request_key() to search the local @@ -152,4 +152,4 @@ Debugging Debugging messages can be turned on dynamically by writing a 1 into the following file:: - /sys/module/dnsresolver/parameters/debug + /sys/module/dns_resolver/parameters/debug diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index d583d9abf2..160bfb0ae8 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -1237,12 +1237,21 @@ Kernel response contents: ``ETHTOOL_A_TSINFO_TX_TYPES`` bitset supported Tx types ``ETHTOOL_A_TSINFO_RX_FILTERS`` bitset supported Rx filters ``ETHTOOL_A_TSINFO_PHC_INDEX`` u32 PTP hw clock index + ``ETHTOOL_A_TSINFO_STATS`` nested HW timestamping statistics ===================================== ====== ========================== ``ETHTOOL_A_TSINFO_PHC_INDEX`` is absent if there is no associated PHC (there is no special value for this case). The bitset attributes are omitted if they would be empty (no bit set). +Additional hardware timestamping statistics response contents: + + ===================================== ====== =================================== + ``ETHTOOL_A_TS_STAT_TX_PKTS`` uint Packets with Tx HW timestamps + ``ETHTOOL_A_TS_STAT_TX_LOST`` uint Tx HW timestamp not arrived count + ``ETHTOOL_A_TS_STAT_TX_ERR`` uint HW error request Tx timestamp count + ===================================== ====== =================================== + CABLE_TEST ========== @@ -1717,6 +1726,10 @@ Kernel response contents: PSE functions ``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` u32 power detection status of the PoDL PSE. + ``ETHTOOL_A_C33_PSE_ADMIN_STATE`` u32 Operational state of the PoE + PSE functions. + ``ETHTOOL_A_C33_PSE_PW_D_STATUS`` u32 power detection status of the + PoE PSE. 
====================================== ====== ============================= When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` attribute identifies @@ -1728,6 +1741,12 @@ aPoDLPSEAdminState. Possible values are: .. kernel-doc:: include/uapi/linux/ethtool.h :identifiers: ethtool_podl_pse_admin_state +The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_STATE`` implementing +``IEEE 802.3-2022`` 30.9.1.1.2 aPSEAdminState. + +.. kernel-doc:: include/uapi/linux/ethtool.h + :identifiers: ethtool_c33_pse_admin_state + When set, the optional ``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` attribute identifies the power detection status of the PoDL PSE. The status depend on internal PSE state machine and automatic PD classification support. This option is @@ -1737,6 +1756,12 @@ Possible values are: .. kernel-doc:: include/uapi/linux/ethtool.h :identifiers: ethtool_podl_pse_pw_d_status +The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_PW_D_STATUS`` implementing +``IEEE 802.3-2022`` 30.9.1.1.5 aPSEPowerDetectionStatus. + +.. kernel-doc:: include/uapi/linux/ethtool.h + :identifiers: ethtool_c33_pse_pw_d_status + PSE_SET ======= @@ -1747,6 +1772,7 @@ Request contents: ====================================== ====== ============================= ``ETHTOOL_A_PSE_HEADER`` nested request header ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` u32 Control PoDL PSE Admin state + ``ETHTOOL_A_C33_PSE_ADMIN_CONTROL`` u32 Control PSE Admin state ====================================== ====== ============================= When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` attribute is used @@ -1754,6 +1780,9 @@ to control PoDL PSE Admin functions. This option is implementing ``IEEE 802.3-2018`` 30.15.1.2.1 acPoDLPSEAdminControl. See ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` for supported values. +The same goes for ``ETHTOOL_A_C33_PSE_ADMIN_CONTROL`` implementing +``IEEE 802.3-2022`` 30.9.1.2.1 acPSEAdminControl. 
+ RSS_GET ======= diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst index 7d8c538049..8eb9a5d40f 100644 --- a/Documentation/networking/filter.rst +++ b/Documentation/networking/filter.rst @@ -513,7 +513,7 @@ JIT compiler ------------ The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC, -PowerPC, ARM, ARM64, MIPS, RISC-V and s390 and can be enabled through +PowerPC, ARM, ARM64, MIPS, RISC-V, s390, and ARC and can be enabled through CONFIG_BPF_JIT. The JIT compiler is transparently invoked for each attached filter from user space or for internal kernel users if it has been previously enabled by root:: @@ -650,7 +650,7 @@ before a conversion to the new layout is being done behind the scenes! Currently, the classic BPF format is being used for JITing on most 32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64, -sparc64, arm32, riscv64, riscv32, loongarch64 perform JIT compilation +sparc64, arm32, riscv64, riscv32, loongarch64, arc perform JIT compilation from eBPF instruction set. Testing diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index 473d72c36d..7664c0bfe4 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -93,6 +93,7 @@ Contents: plip ppp_generic proc_net_tcp + pse-pd/index radiotap-headers rds regulatory diff --git a/Documentation/networking/nf_conntrack-sysctl.rst b/Documentation/networking/nf_conntrack-sysctl.rst index c383a394c6..238b66d0e0 100644 --- a/Documentation/networking/nf_conntrack-sysctl.rst +++ b/Documentation/networking/nf_conntrack-sysctl.rst @@ -222,11 +222,11 @@ nf_flowtable_tcp_timeout - INTEGER (seconds) Control offload timeout for tcp connections. TCP connections may be offloaded from nf conntrack to nf flow table. - Once aged, the connection is returned to nf conntrack with tcp pickup timeout. + Once aged, the connection is returned to nf conntrack. 
nf_flowtable_udp_timeout - INTEGER (seconds) default 30 Control offload timeout for udp connections. UDP connections may be offloaded from nf conntrack to nf flow table. - Once aged, the connection is returned to nf conntrack with udp pickup timeout. + Once aged, the connection is returned to nf conntrack. diff --git a/Documentation/networking/pse-pd/index.rst b/Documentation/networking/pse-pd/index.rst new file mode 100644 index 0000000000..de28a5aee3 --- /dev/null +++ b/Documentation/networking/pse-pd/index.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Power Sourcing Equipment (PSE) Documentation +============================================ + +.. toctree:: + :maxdepth: 2 + + introduction + pse-pi diff --git a/Documentation/networking/pse-pd/introduction.rst b/Documentation/networking/pse-pd/introduction.rst new file mode 100644 index 0000000000..e3d3faaef7 --- /dev/null +++ b/Documentation/networking/pse-pd/introduction.rst @@ -0,0 +1,73 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Power Sourcing Equipment (PSE) in IEEE 802.3 Standard +===================================================== + +Overview +-------- + +Power Sourcing Equipment (PSE) is essential in networks for delivering power +along with data over Ethernet cables. It usually refers to devices like +switches and hubs that supply power to Powered Devices (PDs) such as IP +cameras, VoIP phones, and wireless access points. + +PSE vs. PoDL PSE +---------------- + +PSE in the IEEE 802.3 standard generally refers to equipment that provides +power alongside data over Ethernet cables, typically associated with Power over +Ethernet (PoE). + +PoDL PSE, or Power over Data Lines PSE, specifically denotes PSEs operating +with single balanced twisted-pair PHYs, as per Clause 104 of IEEE 802.3. PoDL +is significant in contexts like automotive and industrial controls where power +and data delivery over a single pair is advantageous. 
+ +IEEE 802.3-2018 Addendums and Related Clauses +--------------------------------------------- + +Key addenda to the IEEE 802.3-2018 standard relevant to power delivery over +Ethernet are as follows: + +- **802.3af (Approved in 2003-06-12)**: Known as PoE in the market, detailed in + Clause 33, delivering up to 15.4W of power. +- **802.3at (Approved in 2009-09-11)**: Marketed as PoE+, enhancing PoE as + covered in Clause 33, increasing power delivery to up to 30W. +- **802.3bt (Approved in 2018-09-27)**: Known as 4PPoE in the market, outlined + in Clause 33. Type 3 delivers up to 60W, and Type 4 up to 100W. +- **802.3bu (Approved in 2016-12-07)**: Formerly referred to as PoDL, detailed + in Clause 104. Introduces Classes 0 - 9. Class 9 PoDL PSE delivers up to ~65W + +Kernel Naming Convention Recommendations +---------------------------------------- + +For clarity and consistency within the Linux kernel's networking subsystem, the +following naming conventions are recommended: + +- For general PSE (PoE) code, use "c33_pse" key words. For example: + ``enum ethtool_c33_pse_admin_state c33_admin_control;``. + This aligns with Clause 33, encompassing various PoE forms. + +- For PoDL PSE - specific code, use "podl_pse". For example: + ``enum ethtool_podl_pse_admin_state podl_admin_control;`` to differentiate + PoDL PSE settings according to Clause 104. + +Summary of Clause 33: Data Terminal Equipment (DTE) Power via Media Dependent Interface (MDI) +--------------------------------------------------------------------------------------------- + +Clause 33 of the IEEE 802.3 standard defines the functional and electrical +characteristics of Powered Device (PD) and Power Sourcing Equipment (PSE). +These entities enable power delivery using the same generic cabling as for data +transmission, integrating power with data communication for devices such as +10BASE-T, 100BASE-TX, or 1000BASE-T. 
+ +Summary of Clause 104: Power over Data Lines (PoDL) of Single Balanced Twisted-Pair Ethernet +-------------------------------------------------------------------------------------------- + +Clause 104 of the IEEE 802.3 standard delineates the functional and electrical +characteristics of PoDL Powered Devices (PDs) and PoDL Power Sourcing Equipment +(PSEs). These are designed for use with single balanced twisted-pair Ethernet +Physical Layers. In this clause, 'PSE' refers specifically to PoDL PSE, and +'PD' to PoDL PD. The key intent is to provide devices with a unified interface +for both data and the power required to process this data over a single +balanced twisted-pair Ethernet connection. diff --git a/Documentation/networking/pse-pd/pse-pi.rst b/Documentation/networking/pse-pd/pse-pi.rst new file mode 100644 index 0000000000..5cad14fedc --- /dev/null +++ b/Documentation/networking/pse-pd/pse-pi.rst @@ -0,0 +1,301 @@ +.. SPDX-License-Identifier: GPL-2.0 + +PSE Power Interface (PSE PI) Documentation +========================================== + +The Power Sourcing Equipment Power Interface (PSE PI) plays a pivotal role in +the architecture of Power over Ethernet (PoE) systems. It is essentially a +blueprint that outlines how one or multiple power sources are connected to the +eight-pin modular jack, commonly known as the Ethernet RJ45 port. This +connection scheme is crucial for enabling the delivery of power alongside data +over Ethernet cables. + +Documentation and Standards +--------------------------- + +The IEEE 802.3 standard provides detailed documentation on the PSE PI. +Specifically: + +- Section "33.2.3 PI pin assignments" covers the pin assignments for PoE + systems that utilize two pairs for power delivery. +- Section "145.2.4 PSE PI" addresses the configuration for PoE systems that + deliver power over all four pairs of an Ethernet cable. 
+ +PSE PI and Single Pair Ethernet +------------------------------- + +Single Pair Ethernet (SPE) represents a different approach to Ethernet +connectivity, utilizing just one pair of conductors for both data and power +transmission. Unlike the configurations detailed in the PSE PI for standard +Ethernet, which can involve multiple power sourcing arrangements across four or +two pairs of wires, SPE operates on a simpler model due to its single-pair +design. As a result, the complexities of choosing between alternative pin +assignments for power delivery, as described in the PSE PI for multi-pair +Ethernet, are not applicable to SPE. + +Understanding PSE PI +-------------------- + +The Power Sourcing Equipment Power Interface (PSE PI) is a framework defining +how Power Sourcing Equipment (PSE) delivers power to Powered Devices (PDs) over +Ethernet cables. It details two main configurations for power delivery, known +as Alternative A and Alternative B, which are distinguished not only by their +method of power transmission but also by the implications for polarity and data +transmission direction. + +Alternative A and B Overview +---------------------------- + +- **Alternative A:** Utilizes RJ45 conductors 1, 2, 3 and 6. In either case of + networks 10/100BaseT or 1G/2G/5G/10GBaseT, the pairs used are carrying data. + The power delivery's polarity in this alternative can vary based on the MDI + (Medium Dependent Interface) or MDI-X (Medium Dependent Interface Crossover) + configuration. + +- **Alternative B:** Utilizes RJ45 conductors 4, 5, 7 and 8. In case of + 10/100BaseT network the pairs used are spare pairs without data and are less + influenced by data transmission direction. This is not the case for + 1G/2G/5G/10GBaseT network. Alternative B includes two configurations with + different polarities, known as variant X and variant S, to accommodate + different network requirements and device specifications. 
+ +Table 145-3 PSE Pinout Alternatives +----------------------------------- + +The following table outlines the pin configurations for both Alternative A and +Alternative B. + ++------------+-------------------+-----------------+-----------------+-----------------+ +| Conductor | Alternative A | Alternative A | Alternative B | Alternative B | +| | (MDI-X) | (MDI) | (X) | (S) | ++============+===================+=================+=================+=================+ +| 1 | Negative V | Positive V | - | - | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 2 | Negative V | Positive V | - | - | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 3 | Positive V | Negative V | - | - | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 4 | - | - | Negative V | Positive V | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 5 | - | - | Negative V | Positive V | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 6 | Positive V | Negative V | - | - | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 7 | - | - | Positive V | Negative V | ++------------+-------------------+-----------------+-----------------+-----------------+ +| 8 | - | - | Positive V | Negative V | ++------------+-------------------+-----------------+-----------------+-----------------+ + +.. note:: + - "Positive V" and "Negative V" indicate the voltage polarity for each pin. + - "-" indicates that the pin is not used for power delivery in that + specific configuration. + +PSE PI compatibilities +---------------------- + +The following table outlines the compatibility between the pinout alternative +and the 1000/2.5G/5G/10GBaseT in the PSE 2 pairs connection. 
+ ++---------+---------------+---------------------+-----------------------+ +| Variant | Alternative | Power Feeding Type | Compatibility with | +| | (A/B) | (Direct/Phantom) | 1000/2.5G/5G/10GBaseT | ++=========+===============+=====================+=======================+ +| 1 | A | Phantom | Yes | ++---------+---------------+---------------------+-----------------------+ +| 2 | B | Phantom | Yes | ++---------+---------------+---------------------+-----------------------+ +| 3 | B | Direct | No | ++---------+---------------+---------------------+-----------------------+ + +.. note:: + - "Direct" indicate a variant where the power is injected directly to pairs + without using magnetics in case of spare pairs. + - "Phantom" indicate power path over coils/magnetics as it is done for + Alternative A variant. + +In case of PSE 4 pairs, a PSE supporting only 10/100BaseT (which mean Direct +Power on pinout Alternative B) is not compatible with a 4 pairs +1000/2.5G/5G/10GBaseT. + +PSE Power Interface (PSE PI) Connection Diagram +----------------------------------------------- + +The diagram below illustrates the connection architecture between the RJ45 +port, the Ethernet PHY (Physical Layer), and the PSE PI (Power Sourcing +Equipment Power Interface), demonstrating how power and data are delivered +simultaneously through an Ethernet cable. The RJ45 port serves as the physical +interface for these connections, with each of its eight pins connected to both +the Ethernet PHY for data transmission and the PSE PI for power delivery. + +.. 
code-block:: + + +--------------------------+ + | | + | RJ45 Port | + | | + +--+--+--+--+--+--+--+--+--+ +-------------+ + 1| 2| 3| 4| 5| 6| 7| 8| | | + | | | | | | | o-------------------+ | + | | | | | | o--|-------------------+ +<--- PSE 1 + | | | | | o--|--|-------------------+ | + | | | | o--|--|--|-------------------+ | + | | | o--|--|--|--|-------------------+ PSE PI | + | | o--|--|--|--|--|-------------------+ | + | o--|--|--|--|--|--|-------------------+ +<--- PSE 2 (optional) + o--|--|--|--|--|--|--|-------------------+ | + | | | | | | | | | | + +--+--+--+--+--+--+--+--+--+ +-------------+ + | | + | Ethernet PHY | + | | + +--------------------------+ + +Simple PSE PI Configuration for Alternative A +--------------------------------------------- + +The diagram below illustrates a straightforward PSE PI (Power Sourcing +Equipment Power Interface) configuration designed to support the Alternative A +setup for Power over Ethernet (PoE). This implementation is tailored to provide +power delivery through the data-carrying pairs of an Ethernet cable, suitable +for either MDI or MDI-X configurations, albeit supporting one variation at a +time. + +.. code-block:: + + +-------------+ + | PSE PI | + 8 -----+ +-------------+ + 7 -----+ Rail 1 | + 6 -----+------+----------------------+ + 5 -----+ | | + 4 -----+ | Rail 2 | PSE 1 + 3 -----+------/ +------------+ + 2 -----+--+-------------/ | + 1 -----+--/ +-------------+ + | + +-------------+ + +In this configuration: + +- Pins 1 and 2, as well as pins 3 and 6, are utilized for power delivery in + addition to data transmission. This aligns with the standard wiring for + 10/100BaseT Ethernet networks where these pairs are used for data. +- Rail 1 and Rail 2 represent the positive and negative voltage rails, with + Rail 1 connected to pins 1 and 2, and Rail 2 connected to pins 3 and 6. 
+ More advanced PSE PI configurations may include integrated or external + switches to change the polarity of the voltage rails, allowing for + compatibility with both MDI and MDI-X configurations. + +More complex PSE PI configurations may include additional components, to support +Alternative B, or to provide additional features such as power management, or +additional power delivery capabilities such as 2-pair or 4-pair power delivery. + +.. code-block:: + + +-------------+ + | PSE PI | + | +---+ + 8 -----+--------+ | +-------------+ + 7 -----+--------+ | Rail 1 | + 6 -----+--------+ +-----------------+ + 5 -----+--------+ | | + 4 -----+--------+ | Rail 2 | PSE 1 + 3 -----+--------+ +----------------+ + 2 -----+--------+ | | + 1 -----+--------+ | +-------------+ + | +---+ + +-------------+ + +Device Tree Configuration: Describing PSE PI Configurations +----------------------------------------------------------- + +The necessity for a separate PSE PI node in the device tree is influenced by +the intricacy of the Power over Ethernet (PoE) system's setup. Here are +descriptions of both simple and complex PSE PI configurations to illustrate +this decision-making process: + +**Simple PSE PI Configuration:** +In a straightforward scenario, the PSE PI setup involves a direct, one-to-one +connection between a single PSE controller and an Ethernet port. This setup +typically supports basic PoE functionality without the need for dynamic +configuration or management of multiple power delivery modes. For such simple +configurations, detailing the PSE PI within the existing PSE controller's node +may suffice, as the system does not encompass additional complexity that +warrants a separate node. The primary focus here is on the clear and direct +association of power delivery to a specific Ethernet port. 
+ +**Complex PSE PI Configuration:** +Contrastingly, a complex PSE PI setup may encompass multiple PSE controllers or +auxiliary circuits that collectively manage power delivery to one Ethernet +port. Such configurations might support a range of PoE standards and require +the capability to dynamically configure power delivery based on the operational +mode (e.g., PoE2 versus PoE4) or specific requirements of connected devices. In +these instances, a dedicated PSE PI node becomes essential for accurately +documenting the system architecture. This node would serve to detail the +interactions between different PSE controllers, the support for various PoE +modes, and any additional logic required to coordinate power delivery across +the network infrastructure. + +**Guidance:** + +For simple PSE setups, including PSE PI information in the PSE controller node +might suffice due to the straightforward nature of these systems. However, +complex configurations, involving multiple components or advanced PoE features, +benefit from a dedicated PSE PI node. This method adheres to IEEE 802.3 +specifications, improving documentation clarity and ensuring accurate +representation of the PoE system's complexity. + +PSE PI Node: Essential Information +---------------------------------- + +The PSE PI (Power Sourcing Equipment Power Interface) node in a device tree can +include several key pieces of information critical for defining the power +delivery capabilities and configurations of a PoE (Power over Ethernet) system. +Below is a list of such information, along with explanations for their +necessity and reasons why they might not be found within a PSE controller node: + +1. **Powered Pairs Configuration** + + - *Description:* Identifies the pairs used for power delivery in the + Ethernet cable. + - *Necessity:* Essential to ensure the correct pairs are powered according + to the board's design. 
+   - *PSE Controller Node:* Typically lacks details on physical pair usage,
+     focusing on power regulation.
+
+2. **Polarity of Powered Pairs**
+
+   - *Description:* Specifies the polarity (positive or negative) for each
+     powered pair.
+   - *Necessity:* Critical for safe and effective power transmission to PDs.
+   - *PSE Controller Node:* Polarity management may exceed the standard
+     functionalities of PSE controllers.
+
+3. **PSE Cells Association**
+
+   - *Description:* Details the association of PSE cells with Ethernet ports or
+     pairs in multi-cell configurations.
+   - *Necessity:* Allows for optimized power resource allocation in complex
+     systems.
+   - *PSE Controller Node:* Controllers may not manage cell associations
+     directly, focusing instead on power flow regulation.
+
+4. **Support for PoE Standards**
+
+   - *Description:* Lists the PoE standards and configurations supported by the
+     system.
+   - *Necessity:* Ensures system compatibility with various PDs and adherence
+     to industry standards.
+   - *PSE Controller Node:* Specific capabilities may depend on the overall PSE
+     PI design rather than the controller alone. Multiple PSE cells per PI
+     do not necessarily imply support for multiple PoE standards.
+
+5. **Protection Mechanisms**
+
+   - *Description:* Outlines additional protection mechanisms, such as
+     overcurrent protection and thermal management.
+   - *Necessity:* Provides extra safety and stability, complementing PSE
+     controller protections.
+   - *PSE Controller Node:* Some protections may be implemented via
+     board-specific hardware or algorithms external to the controller.
diff --git a/Documentation/networking/xfrm_proc.rst b/Documentation/networking/xfrm_proc.rst
index 0a771c5a73..973d1571ac 100644
--- a/Documentation/networking/xfrm_proc.rst
+++ b/Documentation/networking/xfrm_proc.rst
@@ -73,6 +73,9 @@ XfrmAcquireError:
 XfrmFwdHdrError:
 	Forward routing of a packet is not allowed
 
+XfrmInStateDirError:
+	State direction mismatch (lookup found an output state on the input path, expected input or no direction)
+
 Outbound errors
 ~~~~~~~~~~~~~~~
 XfrmOutError:
@@ -111,3 +114,6 @@ XfrmOutPolError:
 
 XfrmOutStateInvalid:
 	State is invalid, perhaps expired
+
+XfrmOutStateDirError:
+	State direction mismatch (lookup found an input state on the output path, expected output or no direction)
diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
index bd033fe95c..e76b0cfc32 100644
--- a/Documentation/networking/xsk-tx-metadata.rst
+++ b/Documentation/networking/xsk-tx-metadata.rst
@@ -11,12 +11,16 @@ metadata on the receive side.
 General Design
 ==============
 
-The headroom for the metadata is reserved via ``tx_metadata_len`` in
-``struct xdp_umem_reg``. The metadata length is therefore the same for
-every socket that shares the same umem. The metadata layout is a fixed UAPI,
-refer to ``union xsk_tx_metadata`` in ``include/uapi/linux/if_xdp.h``.
-Thus, generally, the ``tx_metadata_len`` field above should contain
-``sizeof(union xsk_tx_metadata)``.
+The headroom for the metadata is reserved via ``tx_metadata_len`` and
+``XDP_UMEM_TX_METADATA_LEN`` flag in ``struct xdp_umem_reg``. The metadata
+length is therefore the same for every socket that shares the same umem.
+The metadata layout is a fixed UAPI, refer to ``union xsk_tx_metadata`` in
+``include/uapi/linux/if_xdp.h``. Thus, generally, the ``tx_metadata_len``
+field above should contain ``sizeof(union xsk_tx_metadata)``.
+
+Note that in the original implementation the ``XDP_UMEM_TX_METADATA_LEN``
+flag was not required. Applications might attempt to create a umem
+with a flag first and if it fails, do another attempt without a flag.
 
 The headroom and the metadata itself should be located right before
 ``xdp_desc->addr`` in the umem frame. Within a frame, the metadata