Diffstat (limited to 'Documentation/networking/device_drivers/ethernet/intel')
12 files changed, 4688 insertions, 0 deletions
diff --git a/Documentation/networking/device_drivers/ethernet/intel/e100.rst b/Documentation/networking/device_drivers/ethernet/intel/e100.rst new file mode 100644 index 000000000..3d4a9ba21 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/e100.rst @@ -0,0 +1,188 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +============================================================= +Linux Base Driver for the Intel(R) PRO/100 Family of Adapters +============================================================= + +June 1, 2018 + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Building and Installation +- Driver Configuration Parameters +- Additional Configurations +- Known Issues +- Support + + +In This Release +=============== + +This file describes the Linux Base Driver for the Intel(R) PRO/100 Family of +Adapters. This driver includes support for Itanium(R)2-based systems. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel PRO/100 adapter. + +The following features are now available in supported kernels: + - Native VLANs + - Channel Bonding (teaming) + - SNMP + +Channel Bonding documentation can be found in the Linux kernel source: +/Documentation/networking/bonding.rst + + +Identifying Your Adapter +======================== + +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + +Driver Configuration Parameters +=============================== + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. + +Rx Descriptors: + Number of receive descriptors. A receive descriptor is a data + structure that describes a receive buffer and its attributes to the network + controller. The data in the descriptor is used by the controller to write + data from the controller to host memory. In the 3.x.x driver the valid range + for this parameter is 64-256. The default value is 256. This parameter can be + changed using the command:: + + ethtool -G eth? rx n + + Where n is the number of desired Rx descriptors. + +Tx Descriptors: + Number of transmit descriptors. A transmit descriptor is a data + structure that describes a transmit buffer and its attributes to the network + controller. The data in the descriptor is used by the controller to read + data from the host memory to the controller. In the 3.x.x driver the valid + range for this parameter is 64-256. The default value is 128. This parameter + can be changed using the command:: + + ethtool -G eth? tx n + + Where n is the number of desired Tx descriptors. + +Speed/Duplex: + The driver auto-negotiates the link speed and duplex settings by + default. The ethtool utility can be used as follows to force speed/duplex.:: + + ethtool -s eth? autoneg off speed {10|100} duplex {full|half} + + NOTE: setting the speed/duplex to incorrect values will cause the link to + fail. + +Event Log Message Level: + The driver uses the message level flag to log events + to syslog. The message level can be set at driver load time. It can also be + set using the command:: + + ethtool -s eth? msglvl n + + +Additional Configurations +========================= + +Configuring the Driver on Different Distributions +------------------------------------------------- + +Configuring a network driver to load properly when the system is started +is distribution dependent. 
Typically, the configuration process involves +adding an alias line to `/etc/modprobe.d/*.conf` as well as editing other +system startup scripts and/or configuration files. Many popular Linux +distributions ship with tools to make these changes for you. To learn +the proper way to configure a network device for your system, refer to +your distribution documentation. If during this process you are asked +for the driver or module name, the name for the Linux Base Driver for +the Intel PRO/100 Family of Adapters is e100. + +As an example, if you install the e100 driver for two PRO/100 adapters +(eth0 and eth1), add the following to a configuration file in +/etc/modprobe.d/:: + + alias eth0 e100 + alias eth1 e100 + +Viewing Link Messages +--------------------- + +In order to see link messages and other Intel driver information on your +console, you must set the dmesg level up to six. This can be done by +entering the following on the command line before loading the e100 +driver:: + + dmesg -n 6 + +If you wish to see all messages issued by the driver, including debug +messages, set the dmesg level to eight. + +NOTE: This setting is not saved across reboots. + +ethtool +------- + +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The ethtool +version 1.6 or later is required for this functionality. + +The latest release of ethtool can be found from +https://www.kernel.org/pub/software/network/ethtool/ + +Enabling Wake on LAN (WoL) +-------------------------- +WoL is provided through the ethtool utility. For instructions on +enabling WoL with ethtool, refer to the ethtool man page. WoL will be +enabled on the system during the next shut down or reboot. For this +driver version, in order to enable WoL, the e100 driver must be loaded +when shutting down or rebooting the system. + +NAPI +---- + +NAPI (Rx polling mode) is supported in the e100 driver. + +See https://wiki.linuxfoundation.org/networking/napi for more +information on NAPI. + +Multiple Interfaces on Same Ethernet Broadcast Network +------------------------------------------------------ + +Due to the default ARP behavior on Linux, it is not possible to have one +system on two IP networks in the same Ethernet broadcast domain +(non-partitioned switch) behave as expected. All Ethernet interfaces +will respond to IP traffic for any IP address assigned to the system. +This results in unbalanced receive traffic. + +If you have multiple interfaces in a server, either turn on ARP +filtering by + +(1) entering:: + + echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter + + (this only works if your kernel's version is higher than 2.4.5), or + +(2) installing the interfaces in separate broadcast domains (either + in different switches or in a switch partitioned to VLANs). + + +Support +======= +For general information, go to the Intel support website at: +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: +http://sourceforge.net/projects/e1000 +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net. diff --git a/Documentation/networking/device_drivers/ethernet/intel/e1000.rst b/Documentation/networking/device_drivers/ethernet/intel/e1000.rst new file mode 100644 index 000000000..4aaae0f7d --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/e1000.rst @@ -0,0 +1,463 @@ +.. 
SPDX-License-Identifier: GPL-2.0+ + +========================================================== +Linux Base Driver for Intel(R) Ethernet Network Connection +========================================================== + +Intel Gigabit Linux driver. +Copyright(c) 1999 - 2013 Intel Corporation. + +Contents +======== + +- Identifying Your Adapter +- Command Line Parameters +- Speed and Duplex Configuration +- Additional Configurations +- Support + +Identifying Your Adapter +======================== + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + http://support.intel.com/support/go/network/adapter/idguide.htm + +For the latest Intel network drivers for Linux, refer to the following +website. In the search field, enter your adapter name or type, or use the +networking link on the left to search for your adapter: + + http://support.intel.com/support/go/network/adapter/home.htm + +Command Line Parameters +======================= + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. + +NOTES: + For more information about the AutoNeg, Duplex, and Speed + parameters, see the "Speed and Duplex Configuration" section in + this document. + + For more information about the InterruptThrottleRate, + RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay + parameters, see the application note at: + http://www.intel.com/design/network/applnots/ap450.htm + +AutoNeg +------- + +(Supported only on adapters with copper connections) + +:Valid Range: 0x01-0x0F, 0x20-0x2F +:Default Value: 0x2F + +This parameter is a bit-mask that specifies the speed and duplex settings +advertised by the adapter. When this parameter is used, the Speed and +Duplex parameters must not be specified. + +NOTE: + Refer to the Speed and Duplex section of this readme for more + information on the AutoNeg parameter. + +Duplex +------ + +(Supported only on adapters with copper connections) + +:Valid Range: 0-2 (0=auto-negotiate, 1=half, 2=full) +:Default Value: 0 + +This defines the direction in which data is allowed to flow. Can be +either one or two-directional. If both Duplex and the link partner are +set to auto-negotiate, the board auto-detects the correct duplex. If the +link partner is forced (either full or half), Duplex defaults to half- +duplex. + +FlowControl +----------- + +:Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx) +:Default Value: Reads flow control settings from the EEPROM + +This parameter controls the automatic generation(Tx) and response(Rx) +to Ethernet PAUSE frames. + +InterruptThrottleRate +--------------------- + +(not supported on Intel(R) 82542, 82543 or 82544-based adapters) + +:Valid Range: + 0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative, + 4=simplified balancing) +:Default Value: 3 + +The driver can limit the amount of interrupts per second that the adapter +will generate for incoming packets. It does this by writing a value to the +adapter that is based on the maximum amount of interrupts that the adapter +will generate per second. + +Setting InterruptThrottleRate to a value greater or equal to 100 +will program the adapter to send out a maximum of that many interrupts +per second, even if more packets have come in. This reduces interrupt +load on the system and can lower CPU utilization under heavy load, +but will increase latency as packets are not processed as quickly. 
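For example, to pin the throttle rate at load time rather than letting the driver adapt (a sketch only; the value 8000 and the assumption of two e1000 ports in the system are illustrative)::

    rmmod e1000
    modprobe e1000 InterruptThrottleRate=8000,8000

One comma-separated value is consumed per e1000 port, in function order; ports without an explicit value keep the default.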
+ +The default behaviour of the driver previously assumed a static +InterruptThrottleRate value of 8000, providing a good fallback value for +all traffic types,but lacking in small packet performance and latency. +The hardware can handle many more small packets per second however, and +for this reason an adaptive interrupt moderation algorithm was implemented. + +Since 7.3.x, the driver has two adaptive modes (setting 1 or 3) in which +it dynamically adjusts the InterruptThrottleRate value based on the traffic +that it receives. After determining the type of incoming traffic in the last +timeframe, it will adjust the InterruptThrottleRate to an appropriate value +for that traffic. + +The algorithm classifies the incoming traffic every interval into +classes. Once the class is determined, the InterruptThrottleRate value is +adjusted to suit that traffic type the best. There are three classes defined: +"Bulk traffic", for large amounts of packets of normal size; "Low latency", +for small amounts of traffic and/or a significant percentage of small +packets; and "Lowest latency", for almost completely small packets or +minimal traffic. + +In dynamic conservative mode, the InterruptThrottleRate value is set to 4000 +for traffic that falls in class "Bulk traffic". If traffic falls in the "Low +latency" or "Lowest latency" class, the InterruptThrottleRate is increased +stepwise to 20000. This default mode is suitable for most applications. + +For situations where low latency is vital such as cluster or +grid computing, the algorithm can reduce latency even more when +InterruptThrottleRate is set to mode 1. In this mode, which operates +the same as mode 3, the InterruptThrottleRate will be increased stepwise to +70000 for traffic in class "Lowest latency". + +In simplified mode the interrupt rate is based on the ratio of TX and +RX traffic. If the bytes per second rate is approximately equal, the +interrupt rate will drop as low as 2000 interrupts per second. If the +traffic is mostly transmit or mostly receive, the interrupt rate could +be as high as 8000. + +Setting InterruptThrottleRate to 0 turns off any interrupt moderation +and may improve small packet latency, but is generally not suitable +for bulk throughput traffic. + +NOTE: + InterruptThrottleRate takes precedence over the TxAbsIntDelay and + RxAbsIntDelay parameters. In other words, minimizing the receive + and/or transmit absolute delays does not force the controller to + generate more interrupts than what the Interrupt Throttle Rate + allows. + +CAUTION: + If you are using the Intel(R) PRO/1000 CT Network Connection + (controller 82547), setting InterruptThrottleRate to a value + greater than 75,000, may hang (stop transmitting) adapters + under certain network conditions. If this occurs a NETDEV + WATCHDOG message is logged in the system event log. In + addition, the controller is automatically reset, restoring + the network connection. To eliminate the potential for the + hang, ensure that InterruptThrottleRate is set no greater + than 75,000 and is not set to 0. + +NOTE: + When e1000 is loaded with default settings and multiple adapters + are in use simultaneously, the CPU utilization may increase non- + linearly. In order to limit the CPU utilization without impacting + the overall throughput, we recommend that you load the driver as + follows:: + + modprobe e1000 InterruptThrottleRate=3000,3000,3000 + + This sets the InterruptThrottleRate to 3000 interrupts/sec for + the first, second, and third instances of the driver. 
The range + of 2000 to 3000 interrupts per second works on a majority of + systems and is a good starting point, but the optimal value will + be platform-specific. If CPU utilization is not a concern, use + RX_POLLING (NAPI) and default driver settings. + +RxDescriptors +------------- + +:Valid Range: + - 48-256 for 82542 and 82543-based adapters + - 48-4096 for all other supported adapters +:Default Value: 256 + +This value specifies the number of receive buffer descriptors allocated +by the driver. Increasing this value allows the driver to buffer more +incoming packets, at the expense of increased system memory utilization. + +Each descriptor is 16 bytes. A receive buffer is also allocated for each +descriptor and can be either 2048, 4096, 8192, or 16384 bytes, depending +on the MTU setting. The maximum MTU size is 16110. + +NOTE: + MTU designates the frame size. It only needs to be set for Jumbo + Frames. Depending on the available system resources, the request + for a higher number of receive descriptors may be denied. In this + case, use a lower number. + +RxIntDelay +---------- + +:Valid Range: 0-65535 (0=off) +:Default Value: 0 + +This value delays the generation of receive interrupts in units of 1.024 +microseconds. Receive interrupt reduction can improve CPU efficiency if +properly tuned for specific network traffic. Increasing this value adds +extra latency to frame reception and can end up decreasing the throughput +of TCP traffic. If the system is reporting dropped receives, this value +may be set too high, causing the driver to run out of available receive +descriptors. + +CAUTION: + When setting RxIntDelay to a value other than 0, adapters may + hang (stop transmitting) under certain network conditions. If + this occurs a NETDEV WATCHDOG message is logged in the system + event log. In addition, the controller is automatically reset, + restoring the network connection. To eliminate the potential + for the hang ensure that RxIntDelay is set to 0. + +RxAbsIntDelay +------------- + +(This parameter is supported only on 82540, 82545 and later adapters.) + +:Valid Range: 0-65535 (0=off) +:Default Value: 128 + +This value, in units of 1.024 microseconds, limits the delay in which a +receive interrupt is generated. Useful only if RxIntDelay is non-zero, +this value ensures that an interrupt is generated after the initial +packet is received within the set amount of time. Proper tuning, +along with RxIntDelay, may improve traffic throughput in specific network +conditions. + +Speed +----- + +(This parameter is supported only on adapters with copper connections.) + +:Valid Settings: 0, 10, 100, 1000 +:Default Value: 0 (auto-negotiate at all supported speeds) + +Speed forces the line speed to the specified value in megabits per second +(Mbps). If this parameter is not specified or is set to 0 and the link +partner is set to auto-negotiate, the board will auto-detect the correct +speed. Duplex should also be set when Speed is set to either 10 or 100. + +TxDescriptors +------------- + +:Valid Range: + - 48-256 for 82542 and 82543-based adapters + - 48-4096 for all other supported adapters +:Default Value: 256 + +This value is the number of transmit descriptors allocated by the driver. +Increasing this value allows the driver to queue more transmits. Each +descriptor is 16 bytes. + +NOTE: + Depending on the available system resources, the request for a + higher number of transmit descriptors may be denied. In this case, + use a lower number. 
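The descriptor counts can also be inspected and, where the kernel allows it, resized at runtime through ethtool instead of the module parameters. A minimal sketch, assuming the interface is named eth0::

    # show current and maximum supported ring sizes
    ethtool -g eth0

    # grow both rings, staying within the maximums reported above
    ethtool -G eth0 rx 1024 tx 1024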
+ +TxIntDelay +---------- + +:Valid Range: 0-65535 (0=off) +:Default Value: 8 + +This value delays the generation of transmit interrupts in units of +1.024 microseconds. Transmit interrupt reduction can improve CPU +efficiency if properly tuned for specific network traffic. If the +system is reporting dropped transmits, this value may be set too high +causing the driver to run out of available transmit descriptors. + +TxAbsIntDelay +------------- + +(This parameter is supported only on 82540, 82545 and later adapters.) + +:Valid Range: 0-65535 (0=off) +:Default Value: 32 + +This value, in units of 1.024 microseconds, limits the delay in which a +transmit interrupt is generated. Useful only if TxIntDelay is non-zero, +this value ensures that an interrupt is generated after the initial +packet is sent on the wire within the set amount of time. Proper tuning, +along with TxIntDelay, may improve traffic throughput in specific +network conditions. + +XsumRX +------ + +(This parameter is NOT supported on the 82542-based adapter.) + +:Valid Range: 0-1 +:Default Value: 1 + +A value of '1' indicates that the driver should enable IP checksum +offload for received packets (both UDP and TCP) to the adapter hardware. + +Copybreak +--------- + +:Valid Range: 0-xxxxxxx (0=off) +:Default Value: 256 +:Usage: modprobe e1000.ko copybreak=128 + +Driver copies all packets below or equaling this size to a fresh RX +buffer before handing it up the stack. + +This parameter is different than other parameters, in that it is a +single (not 1,1,1 etc.) parameter applied to all driver instances and +it is also available during runtime at +/sys/module/e1000/parameters/copybreak + +SmartPowerDownEnable +-------------------- + +:Valid Range: 0-1 +:Default Value: 0 (disabled) + +Allows PHY to turn off in lower power states. The user can turn off +this parameter in supported chipsets. + +Speed and Duplex Configuration +============================== + +Three keywords are used to control the speed and duplex configuration. +These keywords are Speed, Duplex, and AutoNeg. + +If the board uses a fiber interface, these keywords are ignored, and the +fiber interface board only links at 1000 Mbps full-duplex. + +For copper-based boards, the keywords interact as follows: + +- The default operation is auto-negotiate. The board advertises all + supported speed and duplex combinations, and it links at the highest + common speed and duplex mode IF the link partner is set to auto-negotiate. + +- If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps + is advertised (The 1000BaseT spec requires auto-negotiation.) + +- If Speed = 10 or 100, then both Speed and Duplex should be set. Auto- + negotiation is disabled, and the AutoNeg parameter is ignored. Partner + SHOULD also be forced. + +The AutoNeg parameter is used when more control is required over the +auto-negotiation process. It should be used when you wish to control which +speed and duplex combinations are advertised during the auto-negotiation +process. + +The parameter may be specified as either a decimal or hexadecimal value as +determined by the bitmap below. 
+ +============== ====== ====== ======= ======= ====== ====== ======= ====== +Bit position 7 6 5 4 3 2 1 0 +Decimal Value 128 64 32 16 8 4 2 1 +Hex value 80 40 20 10 8 4 2 1 +Speed (Mbps) N/A N/A 1000 N/A 100 100 10 10 +Duplex Full Full Half Full Half +============== ====== ====== ======= ======= ====== ====== ======= ====== + +Some examples of using AutoNeg:: + + modprobe e1000 AutoNeg=0x01 (Restricts autonegotiation to 10 Half) + modprobe e1000 AutoNeg=1 (Same as above) + modprobe e1000 AutoNeg=0x02 (Restricts autonegotiation to 10 Full) + modprobe e1000 AutoNeg=0x03 (Restricts autonegotiation to 10 Half or 10 Full) + modprobe e1000 AutoNeg=0x04 (Restricts autonegotiation to 100 Half) + modprobe e1000 AutoNeg=0x05 (Restricts autonegotiation to 10 Half or 100 + Half) + modprobe e1000 AutoNeg=0x020 (Restricts autonegotiation to 1000 Full) + modprobe e1000 AutoNeg=32 (Same as above) + +Note that when this parameter is used, Speed and Duplex must not be specified. + +If the link partner is forced to a specific speed and duplex, then this +parameter should not be used. Instead, use the Speed and Duplex parameters +previously mentioned to force the adapter to the same speed and duplex. + +Additional Configurations +========================= + +Jumbo Frames +------------ + + Jumbo Frames support is enabled by changing the MTU to a value larger than + the default of 1500. Use the ifconfig command to increase the MTU size. + For example:: + + ifconfig eth<x> mtu 9000 up + + This setting is not saved across reboots. It can be made permanent if + you add:: + + MTU=9000 + + to the file /etc/sysconfig/network-scripts/ifcfg-eth<x>. This example + applies to the Red Hat distributions; other distributions may store this + setting in a different location. + +Notes: + Degradation in throughput performance may be observed in some Jumbo frames + environments. If this is observed, increasing the application's socket buffer + size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help. + See the specific application manual and /usr/src/linux*/Documentation/ + networking/ip-sysctl.txt for more details. + + - The maximum MTU setting for Jumbo Frames is 16110. This value coincides + with the maximum Jumbo Frames size of 16128. + + - Using Jumbo frames at 10 or 100 Mbps is not supported and may result in + poor performance or loss of link. + + - Adapters based on the Intel(R) 82542 and 82573V/E controller do not + support Jumbo Frames. These correspond to the following product names:: + + Intel(R) PRO/1000 Gigabit Server Adapter + Intel(R) PRO/1000 PM Network Connection + +ethtool +------- + + The driver utilizes the ethtool interface for driver configuration and + diagnostics, as well as displaying statistical information. The ethtool + version 1.6 or later is required for this functionality. + + The latest release of ethtool can be found from + https://www.kernel.org/pub/software/network/ethtool/ + +Enabling Wake on LAN (WoL) +-------------------------- + + WoL is configured through the ethtool utility. + + WoL will be enabled on the system during the next shut down or reboot. + For this driver version, in order to enable WoL, the e1000 driver must be + loaded when shutting down or rebooting the system. 
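As a concrete illustration of the ethtool-based WoL configuration described above (eth0 is a placeholder; the adapter must actually report WoL support)::

    # list the supported and currently enabled wake-up modes
    ethtool eth0 | grep -i wake

    # enable wake on magic packet
    ethtool -s eth0 wol g

The "Supports Wake-on" and "Wake-on" lines in the ethtool output show which modes the hardware offers and which one is currently armed.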
+ +Support +======= + +For general information, go to the Intel support website at: + + http://support.intel.com + +or the Intel Wired Networking project hosted by Sourceforge at: + + http://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on the supported +kernel with a supported adapter, email the specific information related +to the issue to e1000-devel@lists.sf.net diff --git a/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst b/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst new file mode 100644 index 000000000..f49cd370e --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/e1000e.rst @@ -0,0 +1,383 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +===================================================== +Linux Driver for Intel(R) Ethernet Network Connection +===================================================== + +Intel Gigabit Linux driver. +Copyright(c) 2008-2018 Intel Corporation. + +Contents +======== + +- Identifying Your Adapter +- Command Line Parameters +- Additional Configurations +- Support + + +Identifying Your Adapter +======================== +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + + +Command Line Parameters +======================= +If the driver is built as a module, the following optional parameters are used +by entering them on the command line with the modprobe command using this +syntax:: + + modprobe e1000e [<option>=<VAL1>,<VAL2>,...] + +There needs to be a <VAL#> for each network port in the system supported by +this driver. The values will be applied to each instance, in function order. +For example:: + + modprobe e1000e InterruptThrottleRate=16000,16000 + +In this case, there are two network ports supported by e1000e in the system. +The default value for each parameter is generally the recommended setting, +unless otherwise noted. + +NOTE: A descriptor describes a data buffer and attributes related to the data +buffer. This information is accessed by the hardware. + +InterruptThrottleRate +--------------------- +:Valid Range: 0,1,3,4,100-100000 +:Default Value: 3 + +Interrupt Throttle Rate controls the number of interrupts each interrupt +vector can generate per second. Increasing ITR lowers latency at the cost of +increased CPU utilization, though it may help throughput in some circumstances. + +Setting InterruptThrottleRate to a value greater or equal to 100 +will program the adapter to send out a maximum of that many interrupts +per second, even if more packets have come in. This reduces interrupt +load on the system and can lower CPU utilization under heavy load, +but will increase latency as packets are not processed as quickly. + +The default behaviour of the driver previously assumed a static +InterruptThrottleRate value of 8000, providing a good fallback value for +all traffic types, but lacking in small packet performance and latency. +The hardware can handle many more small packets per second however, and +for this reason an adaptive interrupt moderation algorithm was implemented. + +The driver has two adaptive modes (setting 1 or 3) in which +it dynamically adjusts the InterruptThrottleRate value based on the traffic +that it receives. After determining the type of incoming traffic in the last +timeframe, it will adjust the InterruptThrottleRate to an appropriate value +for that traffic. 
+ +The algorithm classifies the incoming traffic every interval into +classes. Once the class is determined, the InterruptThrottleRate value is +adjusted to suit that traffic type the best. There are three classes defined: +"Bulk traffic", for large amounts of packets of normal size; "Low latency", +for small amounts of traffic and/or a significant percentage of small +packets; and "Lowest latency", for almost completely small packets or +minimal traffic. + + - 0: Off + Turns off any interrupt moderation and may improve small packet latency. + However, this is generally not suitable for bulk throughput traffic due + to the increased CPU utilization of the higher interrupt rate. + - 1: Dynamic mode + This mode attempts to moderate interrupts per vector while maintaining + very low latency. This can sometimes cause extra CPU utilization. If + planning on deploying e1000e in a latency sensitive environment, this + parameter should be considered. + - 3: Dynamic Conservative mode (default) + In dynamic conservative mode, the InterruptThrottleRate value is set to + 4000 for traffic that falls in class "Bulk traffic". If traffic falls in + the "Low latency" or "Lowest latency" class, the InterruptThrottleRate is + increased stepwise to 20000. This default mode is suitable for most + applications. + - 4: Simplified Balancing mode + In simplified mode the interrupt rate is based on the ratio of TX and + RX traffic. If the bytes per second rate is approximately equal, the + interrupt rate will drop as low as 2000 interrupts per second. If the + traffic is mostly transmit or mostly receive, the interrupt rate could + be as high as 8000. + - 100-100000: + Setting InterruptThrottleRate to a value greater or equal to 100 + will program the adapter to send at most that many interrupts per second, + even if more packets have come in. This reduces interrupt load on the + system and can lower CPU utilization under heavy load, but will increase + latency as packets are not processed as quickly. + +NOTE: InterruptThrottleRate takes precedence over the TxAbsIntDelay and +RxAbsIntDelay parameters. In other words, minimizing the receive and/or +transmit absolute delays does not force the controller to generate more +interrupts than what the Interrupt Throttle Rate allows. + +RxIntDelay +---------- +:Valid Range: 0-65535 (0=off) +:Default Value: 0 + +This value delays the generation of receive interrupts in units of 1.024 +microseconds. Receive interrupt reduction can improve CPU efficiency if +properly tuned for specific network traffic. Increasing this value adds extra +latency to frame reception and can end up decreasing the throughput of TCP +traffic. If the system is reporting dropped receives, this value may be set +too high, causing the driver to run out of available receive descriptors. + +CAUTION: When setting RxIntDelay to a value other than 0, adapters may hang +(stop transmitting) under certain network conditions. If this occurs a NETDEV +WATCHDOG message is logged in the system event log. In addition, the +controller is automatically reset, restoring the network connection. To +eliminate the potential for the hang ensure that RxIntDelay is set to 0. + +RxAbsIntDelay +------------- +:Valid Range: 0-65535 (0=off) +:Default Value: 8 + +This value, in units of 1.024 microseconds, limits the delay in which a +receive interrupt is generated. This value ensures that an interrupt is +generated after the initial packet is received within the set amount of time, +which is useful only if RxIntDelay is non-zero. 
Proper tuning, along with +RxIntDelay, may improve traffic throughput in specific network conditions. + +TxIntDelay +---------- +:Valid Range: 0-65535 (0=off) +:Default Value: 8 + +This value delays the generation of transmit interrupts in units of 1.024 +microseconds. Transmit interrupt reduction can improve CPU efficiency if +properly tuned for specific network traffic. If the system is reporting +dropped transmits, this value may be set too high causing the driver to run +out of available transmit descriptors. + +TxAbsIntDelay +------------- +:Valid Range: 0-65535 (0=off) +:Default Value: 32 + +This value, in units of 1.024 microseconds, limits the delay in which a +transmit interrupt is generated. It is useful only if TxIntDelay is non-zero. +It ensures that an interrupt is generated after the initial Packet is sent on +the wire within the set amount of time. Proper tuning, along with TxIntDelay, +may improve traffic throughput in specific network conditions. + +copybreak +--------- +:Valid Range: 0-xxxxxxx (0=off) +:Default Value: 256 + +The driver copies all packets below or equaling this size to a fresh receive +buffer before handing it up the stack. +This parameter differs from other parameters because it is a single (not 1,1,1 +etc.) parameter applied to all driver instances and it is also available +during runtime at /sys/module/e1000e/parameters/copybreak. + +To use copybreak, type:: + + modprobe e1000e.ko copybreak=128 + +SmartPowerDownEnable +-------------------- +:Valid Range: 0,1 +:Default Value: 0 (disabled) + +Allows the PHY to turn off in lower power states. The user can turn off this +parameter in supported chipsets. + +KumeranLockLoss +--------------- +:Valid Range: 0,1 +:Default Value: 1 (enabled) + +This workaround skips resetting the PHY at shutdown for the initial silicon +releases of ICH8 systems. + +IntMode +------- +:Valid Range: 0-2 +:Default Value: 0 + + +-------+----------------+ + | Value | Interrupt Mode | + +=======+================+ + | 0 | Legacy | + +-------+----------------+ + | 1 | MSI | + +-------+----------------+ + | 2 | MSI-X | + +-------+----------------+ + +IntMode allows load time control over the type of interrupt registered for by +the driver. MSI-X is required for multiple queue support, and some kernels and +combinations of kernel .config options will force a lower level of interrupt +support. + +This command will show different values for each type of interrupt:: + + cat /proc/interrupts + +CrcStripping +------------ +:Valid Range: 0,1 +:Default Value: 1 (enabled) + +Strip the CRC from received packets before sending up the network stack. If +you have a machine with a BMC enabled but cannot receive IPMI traffic after +loading or enabling the driver, try disabling this feature. + +WriteProtectNVM +--------------- +:Valid Range: 0,1 +:Default Value: 1 (enabled) + +If set to 1, configure the hardware to ignore all write/erase cycles to the +GbE region in the ICHx NVM (in order to prevent accidental corruption of the +NVM). This feature can be disabled by setting the parameter to 0 during initial +driver load. + +NOTE: The machine must be power cycled (full off/on) when enabling NVM writes +via setting the parameter to zero. Once the NVM has been locked (via the +parameter at 1 when the driver loads) it cannot be unlocked except via power +cycle. + +Debug +----- +:Valid Range: 0-16 (0=none,...,16=all) +:Default Value: 0 + +This parameter adjusts the level of debug messages displayed in the system logs. 
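As with the other parameters, Debug accepts one comma-separated value per e1000e port. A hedged example, assuming two ports and that verbose output is wanted only on the first while reproducing a problem::

    rmmod e1000e
    modprobe e1000e Debug=16,0

    # inspect the additional driver messages
    dmesg | grep e1000e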
+ + +Additional Features and Configurations +====================================== + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ifconfig command to increase the MTU size. For example, enter the +following where <x> is the interface number:: + + ifconfig eth<x> mtu 9000 up + +Alternatively, you can use the ip command as follows:: + + ip link set mtu 9000 dev eth<x> + ip link set up dev eth<x> + +This setting is not saved across reboots. The setting change can be made +permanent by adding 'MTU=9000' to the file: + +- For RHEL: /etc/sysconfig/network-scripts/ifcfg-eth<x> +- For SLES: /etc/sysconfig/network/<config_file> + +NOTE: The maximum MTU setting for Jumbo Frames is 8996. This value coincides +with the maximum Jumbo Frames size of 9018 bytes. + +NOTE: Using Jumbo frames at 10 or 100 Mbps is not supported and may result in +poor performance or loss of link. + +NOTE: The following adapters limit Jumbo Frames sized packets to a maximum of +4088 bytes: + + - Intel(R) 82578DM Gigabit Network Connection + - Intel(R) 82577LM Gigabit Network Connection + +The following adapters do not support Jumbo Frames: + + - Intel(R) PRO/1000 Gigabit Server Adapter + - Intel(R) PRO/1000 PM Network Connection + - Intel(R) 82562G 10/100 Network Connection + - Intel(R) 82562G-2 10/100 Network Connection + - Intel(R) 82562GT 10/100 Network Connection + - Intel(R) 82562GT-2 10/100 Network Connection + - Intel(R) 82562V 10/100 Network Connection + - Intel(R) 82562V-2 10/100 Network Connection + - Intel(R) 82566DC Gigabit Network Connection + - Intel(R) 82566DC-2 Gigabit Network Connection + - Intel(R) 82566DM Gigabit Network Connection + - Intel(R) 82566MC Gigabit Network Connection + - Intel(R) 82566MM Gigabit Network Connection + - Intel(R) 82567V-3 Gigabit Network Connection + - Intel(R) 82577LC Gigabit Network Connection + - Intel(R) 82578DC Gigabit Network Connection + +NOTE: Jumbo Frames cannot be configured on an 82579-based Network device if +MACSec is enabled on the system. + + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. Download it at: + +https://www.kernel.org/pub/software/network/ethtool/ + +NOTE: When validating enable/disable tests on some parts (for example, 82578), +it is necessary to add a few seconds between tests when working with ethtool. + + +Speed and Duplex Configuration +------------------------------ +In addressing speed and duplex configuration issues, you need to distinguish +between copper-based adapters and fiber-based adapters. + +In the default mode, an Intel(R) Ethernet Network Adapter using copper +connections will attempt to auto-negotiate with its link partner to determine +the best setting. If the adapter cannot establish link with the link partner +using auto-negotiation, you may need to manually configure the adapter and link +partner to identical settings to establish link and pass packets. This should +only be needed when attempting to link with an older switch that does not +support auto-negotiation or one that has been forced to a specific speed or +duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds +and higher cannot be forced. Use the autonegotiation advertising setting to +manually set devices for 1 Gbps and higher. 
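For instance, a hedged sketch of restricting the advertisement to 1000BASE-T full duplex with ethtool (described in the next paragraph; eth0 is a placeholder interface name, and 0x020 is the standard ethtool advertise mask for 1000baseT/Full)::

    ethtool -s eth0 advertise 0x020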
+ +Speed, duplex, and autonegotiation advertising are configured through the +ethtool utility. + +Caution: Only experienced network administrators should force speed and duplex +or change autonegotiation advertising manually. The settings at the switch must +always match the adapter settings. Adapter performance may suffer or your +adapter may not operate if you configure the adapter differently from your +switch. + +An Intel(R) Ethernet Network Adapter using fiber-based connections, however, +will not attempt to auto-negotiate with its link partner since those adapters +operate only in full duplex and only at their native speed. + + +Enabling Wake on LAN (WoL) +-------------------------- +WoL is configured through the ethtool utility. + +WoL will be enabled on the system during the next shut down or reboot. For +this driver version, in order to enable WoL, the e1000e driver must be loaded +prior to shutting down or suspending the system. + +NOTE: Wake on LAN is only supported on port A for the following devices: +- Intel(R) PRO/1000 PT Dual Port Network Connection +- Intel(R) PRO/1000 PT Dual Port Server Connection +- Intel(R) PRO/1000 PT Dual Port Server Adapter +- Intel(R) PRO/1000 PF Dual Port Server Adapter +- Intel(R) PRO/1000 PT Quad Port Server Adapter +- Intel(R) Gigabit PT Quad Port Server ExpressModule + + +Support +======= +For general information, go to the Intel support website at: + +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: + +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net. diff --git a/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst b/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst new file mode 100644 index 000000000..9258ef6f5 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/fm10k.rst @@ -0,0 +1,142 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +============================================================= +Linux Base Driver for Intel(R) Ethernet Multi-host Controller +============================================================= + +August 20, 2018 +Copyright(c) 2015-2018 Intel Corporation. + +Contents +======== +- Identifying Your Adapter +- Additional Configurations +- Performance Tuning +- Known Issues +- Support + +Identifying Your Adapter +======================== +The driver in this release is compatible with devices based on the Intel(R) +Ethernet Multi-host Controller. + +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + + +Flow Control +------------ +The Intel(R) Ethernet Switch Host Interface Driver does not support Flow +Control. It will not send pause frames. This may result in dropped frames. + + +Virtual Functions (VFs) +----------------------- +Use sysfs to enable VFs. +Valid Range: 0-64 + +For example:: + + echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs //enable VFs + echo 0 > /sys/class/net/$dev/device/sriov_numvfs //disable VFs + +NOTE: Neither the device nor the driver control how VFs are mapped into config +space. Bus layout will vary by operating system. On operating systems that +support it, you can check sysfs to find the mapping. 
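A sketch of one way to read that mapping from sysfs, assuming the PF interface is named eth0::

    # each enabled VF appears as a virtfn<N> symlink under the PF's PCI device
    ls -l /sys/class/net/eth0/device/virtfn*

    # the symlink target is the PCI address of that VF
    readlink /sys/class/net/eth0/device/virtfn0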
+ +NOTE: When SR-IOV mode is enabled, hardware VLAN filtering and VLAN tag +stripping/insertion will remain enabled. Please remove the old VLAN filter +before the new VLAN filter is added. For example:: + + ip link set eth0 vf 0 vlan 100 // set vlan 100 for VF 0 + ip link set eth0 vf 0 vlan 0 // Delete vlan 100 + ip link set eth0 vf 0 vlan 200 // set a new vlan 200 for VF 0 + + +Additional Features and Configurations +====================================== + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ifconfig command to increase the MTU size. For example, enter the +following where <x> is the interface number:: + + ifconfig eth<x> mtu 9000 up + +Alternatively, you can use the ip command as follows:: + + ip link set mtu 9000 dev eth<x> + ip link set up dev eth<x> + +This setting is not saved across reboots. The setting change can be made +permanent by adding 'MTU=9000' to the file: + +- For RHEL: /etc/sysconfig/network-scripts/ifcfg-eth<x> +- For SLES: /etc/sysconfig/network/<config_file> + +NOTE: The maximum MTU setting for Jumbo Frames is 15342. This value coincides +with the maximum Jumbo Frames size of 15364 bytes. + +NOTE: This driver will attempt to use multiple page sized buffers to receive +each jumbo packet. This should help to avoid buffer starvation issues when +allocating receive packets. + + +Generic Receive Offload, aka GRO +-------------------------------- +The driver supports the in-kernel software implementation of GRO. GRO has +shown that by coalescing Rx traffic into larger chunks of data, CPU +utilization can be significantly reduced when under large Rx load. GRO is an +evolution of the previously-used LRO interface. GRO is able to coalesce +other protocols besides TCP. It's also safe to use with configurations that +are problematic for LRO, namely bridging and iSCSI. + + + +Supported ethtool Commands and Options for Filtering +---------------------------------------------------- +-n --show-nfc + Retrieves the receive network flow classification configurations. + +rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|tcp6|udp6|ah6|esp6|sctp6 + Retrieves the hash options for the specified network traffic type. + +-N --config-nfc + Configures the receive network flow classification. + +rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|tcp6|udp6|ah6|esp6|sctp6 m|v|t|s|d|f|n|r + Configures the hash options for the specified network traffic type. + +- udp4: UDP over IPv4 +- udp6: UDP over IPv6 +- f Hash on bytes 0 and 1 of the Layer 4 header of the rx packet. +- n Hash on bytes 2 and 3 of the Layer 4 header of the rx packet. + + +Known Issues/Troubleshooting +============================ + +Enabling SR-IOV in a 64-bit Microsoft Windows Server 2012/R2 guest OS under Linux KVM +------------------------------------------------------------------------------------- +KVM Hypervisor/VMM supports direct assignment of a PCIe device to a VM. This +includes traditional PCIe devices, as well as SR-IOV-capable devices based on +the Intel Ethernet Controller XL710. + + +Support +======= +For general information, go to the Intel support website at: + +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: + +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net. 
diff --git a/Documentation/networking/device_drivers/ethernet/intel/i40e.rst b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst new file mode 100644 index 000000000..ac35bd472 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/i40e.rst @@ -0,0 +1,771 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +================================================================= +Linux Base Driver for the Intel(R) Ethernet Controller 700 Series +================================================================= + +Intel 40 Gigabit Linux driver. +Copyright(c) 1999-2018 Intel Corporation. + +Contents +======== + +- Overview +- Identifying Your Adapter +- Intel(R) Ethernet Flow Director +- Additional Configurations +- Known Issues +- Support + + +Driver information can be obtained using ethtool, lspci, and ifconfig. +Instructions on updating ethtool can be found in the section Additional +Configurations later in this document. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel adapter. All hardware requirements listed apply to use +with Linux. + + +Identifying Your Adapter +======================== +The driver is compatible with devices based on the following: + + * Intel(R) Ethernet Controller X710 + * Intel(R) Ethernet Controller XL710 + * Intel(R) Ethernet Network Connection X722 + * Intel(R) Ethernet Controller XXV710 + +For the best performance, make sure the latest NVM/FW is installed on your +device. + +For information on how to identify your adapter, and for the latest NVM/FW +images and Intel network drivers, refer to the Intel Support website: +https://www.intel.com/support + +SFP+ and QSFP+ Devices +---------------------- +For information about supported media, refer to this document: +https://www.intel.com/content/dam/www/public/us/en/documents/release-notes/xl710-ethernet-controller-feature-matrix.pdf + +NOTE: Some adapters based on the Intel(R) Ethernet Controller 700 Series only +support Intel Ethernet Optics modules. On these adapters, other modules are not +supported and will not function. In all cases Intel recommends using Intel +Ethernet Optics; other modules may function but are not validated by Intel. +Contact Intel for supported media types. + +NOTE: For connections based on Intel(R) Ethernet Controller 700 Series, support +is dependent on your system board. Please see your vendor for details. + +NOTE: In systems that do not have adequate airflow to cool the adapter and +optical modules, you must use high temperature optical modules. + +Virtual Functions (VFs) +----------------------- +Use sysfs to enable VFs. For example:: + + #echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs #enable VFs + #echo 0 > /sys/class/net/$dev/device/sriov_numvfs #disable VFs + +For example, the following instructions will configure PF eth0 and the first VF +on VLAN 10:: + + $ ip link set dev eth0 vf 0 vlan 10 + +VLAN Tag Packet Steering +------------------------ +Allows you to send all packets with a specific VLAN tag to a particular SR-IOV +virtual function (VF). Further, this feature allows you to designate a +particular VF as trusted, and allows that trusted VF to request selective +promiscuous mode on the Physical Function (PF). + +To set a VF as trusted or untrusted, enter the following command in the +Hypervisor:: + + # ip link set dev eth0 vf 1 trust [on|off] + +Once the VF is designated as trusted, use the following commands in the VM to +set the VF to promiscuous mode. 
+ +:: + + For promiscuous all: + #ip link set eth2 promisc on + Where eth2 is a VF interface in the VM + + For promiscuous Multicast: + #ip link set eth2 allmulticast on + Where eth2 is a VF interface in the VM + +NOTE: By default, the ethtool priv-flag vf-true-promisc-support is set to +"off",meaning that promiscuous mode for the VF will be limited. To set the +promiscuous mode for the VF to true promiscuous and allow the VF to see all +ingress traffic, use the following command:: + + #ethtool -set-priv-flags p261p1 vf-true-promisc-support on + +The vf-true-promisc-support priv-flag does not enable promiscuous mode; rather, +it designates which type of promiscuous mode (limited or true) you will get +when you enable promiscuous mode using the ip link commands above. Note that +this is a global setting that affects the entire device. However,the +vf-true-promisc-support priv-flag is only exposed to the first PF of the +device. The PF remains in limited promiscuous mode (unless it is in MFP mode) +regardless of the vf-true-promisc-support setting. + +Now add a VLAN interface on the VF interface:: + + #ip link add link eth2 name eth2.100 type vlan id 100 + +Note that the order in which you set the VF to promiscuous mode and add the +VLAN interface does not matter (you can do either first). The end result in +this example is that the VF will get all traffic that is tagged with VLAN 100. + +Intel(R) Ethernet Flow Director +------------------------------- +The Intel Ethernet Flow Director performs the following tasks: + +- Directs receive packets according to their flows to different queues. +- Enables tight control on routing a flow in the platform. +- Matches flows and CPU cores for flow affinity. +- Supports multiple parameters for flexible flow classification and load + balancing (in SFP mode only). + +NOTE: The Linux i40e driver supports the following flow types: IPv4, TCPv4, and +UDPv4. For a given flow type, it supports valid combinations of IP addresses +(source or destination) and UDP/TCP ports (source and destination). For +example, you can supply only a source IP address, a source IP address and a +destination port, or any combination of one or more of these four parameters. + +NOTE: The Linux i40e driver allows you to filter traffic based on a +user-defined flexible two-byte pattern and offset by using the ethtool user-def +and mask fields. Only L3 and L4 flow types are supported for user-defined +flexible filters. For a given flow type, you must clear all Intel Ethernet Flow +Director filters before changing the input set (for that flow type). + +To enable or disable the Intel Ethernet Flow Director:: + + # ethtool -K ethX ntuple <on|off> + +When disabling ntuple filters, all the user programmed filters are flushed from +the driver cache and hardware. All needed filters must be re-added when ntuple +is re-enabled. + +To add a filter that directs packet to queue 2, use -U or -N switch:: + + # ethtool -N ethX flow-type tcp4 src-ip 192.168.10.1 dst-ip \ + 192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1] + +To set a filter using only the source and destination IP address:: + + # ethtool -N ethX flow-type tcp4 src-ip 192.168.10.1 dst-ip \ + 192.168.10.2 action 2 [loc 1] + +To see the list of filters currently present:: + + # ethtool <-u|-n> ethX + +Application Targeted Routing (ATR) Perfect Filters +-------------------------------------------------- +ATR is enabled by default when the kernel is in multiple transmit queue mode. 
+An ATR Intel Ethernet Flow Director filter rule is added when a TCP-IP flow +starts and is deleted when the flow ends. When a TCP-IP Intel Ethernet Flow +Director rule is added from ethtool (Sideband filter), ATR is turned off by the +driver. To re-enable ATR, the sideband can be disabled with the ethtool -K +option. For example:: + + ethtool -K [adapter] ntuple [off|on] + +If sideband is re-enabled after ATR is re-enabled, ATR remains enabled until a +TCP-IP flow is added. When all TCP-IP sideband rules are deleted, ATR is +automatically re-enabled. + +Packets that match the ATR rules are counted in fdir_atr_match stats in +ethtool, which also can be used to verify whether ATR rules still exist. + +Sideband Perfect Filters +------------------------ +Sideband Perfect Filters are used to direct traffic that matches specified +characteristics. They are enabled through ethtool's ntuple interface. To add a +new filter use the following command:: + + ethtool -U <device> flow-type <type> src-ip <ip> dst-ip <ip> src-port <port> \ + dst-port <port> action <queue> + +Where: + <device> - the ethernet device to program + <type> - can be ip4, tcp4, udp4, or sctp4 + <ip> - the ip address to match on + <port> - the port number to match on + <queue> - the queue to direct traffic towards (-1 discards matching traffic) + +Use the following command to display all of the active filters:: + + ethtool -u <device> + +Use the following command to delete a filter:: + + ethtool -U <device> delete <N> + +Where <N> is the filter id displayed when printing all the active filters, and +may also have been specified using "loc <N>" when adding the filter. + +The following example matches TCP traffic sent from 192.168.0.1, port 5300, +directed to 192.168.0.5, port 80, and sends it to queue 7:: + + ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5 \ + src-port 5300 dst-port 80 action 7 + +For each flow-type, the programmed filters must all have the same matching +input set. For example, issuing the following two commands is acceptable:: + + ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7 + ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10 + +Issuing the next two commands, however, is not acceptable, since the first +specifies src-ip and the second specifies dst-ip:: + + ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7 + ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10 + +The second command will fail with an error. You may program multiple filters +with the same fields, using different values, but, on one device, you may not +program two tcp4 filters with different matching fields. + +Matching on a sub-portion of a field is not supported by the i40e driver, thus +partial mask fields are not supported. + +The driver also supports matching user-defined data within the packet payload. +This flexible data is specified using the "user-def" field of the ethtool +command in the following way: + ++----------------------------+--------------------------+ +| 31 28 24 20 16 | 15 12 8 4 0 | ++----------------------------+--------------------------+ +| offset into packet payload | 2 bytes of flexible data | ++----------------------------+--------------------------+ + +For example, + +:: + + ... user-def 0x4FFFF ... + +tells the filter to look 4 bytes into the payload and match that value against +0xFFFF. The offset is based on the beginning of the payload, and not the +beginning of the packet. 
Thus + +:: + + flow-type tcp4 ... user-def 0x8BEAF ... + +would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the +TCP/IPv4 payload. + +Note that ICMP headers are parsed as 4 bytes of header and 4 bytes of payload. +Thus to match the first byte of the payload, you must actually add 4 bytes to +the offset. Also note that ip4 filters match both ICMP frames as well as raw +(unknown) ip4 frames, where the payload will be the L3 payload of the IP4 frame. + +The maximum offset is 64. The hardware will only read up to 64 bytes of data +from the payload. The offset must be even because the flexible data is 2 bytes +long and must be aligned to byte 0 of the packet payload. + +The user-defined flexible offset is also considered part of the input set and +cannot be programmed separately for multiple filters of the same type. However, +the flexible data is not part of the input set and multiple filters may use the +same offset but match against different data. + +To create filters that direct traffic to a specific Virtual Function, use the +"action" parameter. Specify the action as a 64 bit value, where the lower 32 +bits represents the queue number, while the next 8 bits represent which VF. +Note that 0 is the PF, so the VF identifier is offset by 1. For example:: + + ... action 0x800000002 ... + +specifies to direct traffic to Virtual Function 7 (8 minus 1) into queue 2 of +that VF. + +Note that these filters will not break internal routing rules, and will not +route traffic that otherwise would not have been sent to the specified Virtual +Function. + +Setting the link-down-on-close Private Flag +------------------------------------------- +When the link-down-on-close private flag is set to "on", the port's link will +go down when the interface is brought down using the ifconfig ethX down command. + +Use ethtool to view and set link-down-on-close, as follows:: + + ethtool --show-priv-flags ethX + ethtool --set-priv-flags ethX link-down-on-close [on|off] + +Viewing Link Messages +--------------------- +Link messages will not be displayed to the console if the distribution is +restricting system messages. In order to see network driver link messages on +your console, set dmesg to eight by entering the following:: + + dmesg -n 8 + +NOTE: This setting is not saved across reboots. + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ifconfig command to increase the MTU size. For example, enter the +following where <x> is the interface number:: + + ifconfig eth<x> mtu 9000 up + +Alternatively, you can use the ip command as follows:: + + ip link set mtu 9000 dev eth<x> + ip link set up dev eth<x> + +This setting is not saved across reboots. The setting change can be made +permanent by adding 'MTU=9000' to the file:: + + /etc/sysconfig/network-scripts/ifcfg-eth<x> // for RHEL + /etc/sysconfig/network/<config_file> // for SLES + +NOTE: The maximum MTU setting for Jumbo Frames is 9702. This value coincides +with the maximum Jumbo Frames size of 9728 bytes. + +NOTE: This driver will attempt to use multiple page sized buffers to receive +each jumbo packet. This should help to avoid buffer starvation issues when +allocating receive packets. + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. 
Download it at: +https://www.kernel.org/pub/software/network/ethtool/ + +Supported ethtool Commands and Options for Filtering +---------------------------------------------------- +-n --show-nfc + Retrieves the receive network flow classification configurations. + +rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|tcp6|udp6|ah6|esp6|sctp6 + Retrieves the hash options for the specified network traffic type. + +-N --config-nfc + Configures the receive network flow classification. + +rx-flow-hash tcp4|udp4|ah4|esp4|sctp4|tcp6|udp6|ah6|esp6|sctp6 m|v|t|s|d|f|n|r... + Configures the hash options for the specified network traffic type. + +udp4 UDP over IPv4 +udp6 UDP over IPv6 + +f Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet. +n Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet. + +Speed and Duplex Configuration +------------------------------ +In addressing speed and duplex configuration issues, you need to distinguish +between copper-based adapters and fiber-based adapters. + +In the default mode, an Intel(R) Ethernet Network Adapter using copper +connections will attempt to auto-negotiate with its link partner to determine +the best setting. If the adapter cannot establish link with the link partner +using auto-negotiation, you may need to manually configure the adapter and link +partner to identical settings to establish link and pass packets. This should +only be needed when attempting to link with an older switch that does not +support auto-negotiation or one that has been forced to a specific speed or +duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds +and higher cannot be forced. Use the autonegotiation advertising setting to +manually set devices for 1 Gbps and higher. + +NOTE: You cannot set the speed for devices based on the Intel(R) Ethernet +Network Adapter XXV710 based devices. + +Speed, duplex, and autonegotiation advertising are configured through the +ethtool utility. + +Caution: Only experienced network administrators should force speed and duplex +or change autonegotiation advertising manually. The settings at the switch must +always match the adapter settings. Adapter performance may suffer or your +adapter may not operate if you configure the adapter differently from your +switch. + +An Intel(R) Ethernet Network Adapter using fiber-based connections, however, +will not attempt to auto-negotiate with its link partner since those adapters +operate only in full duplex and only at their native speed. + +NAPI +---- +NAPI (Rx polling mode) is supported in the i40e driver. +For more information on NAPI, see +https://wiki.linuxfoundation.org/networking/napi + +Flow Control +------------ +Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable +receiving and transmitting pause frames for i40e. When transmit is enabled, +pause frames are generated when the receive packet buffer crosses a predefined +threshold. When receive is enabled, the transmit unit will halt for the time +delay specified when a pause frame is received. + +NOTE: You must have a flow control capable link partner. + +Flow Control is on by default. + +Use ethtool to change the flow control settings. + +To enable or disable Rx or Tx Flow Control:: + + ethtool -A eth? rx <on|off> tx <on|off> + +Note: This command only enables or disables Flow Control if auto-negotiation is +disabled. If auto-negotiation is enabled, this command changes the parameters +used for auto-negotiation with the link partner. 
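+For example, to confirm which pause settings are actually in effect after
+auto-negotiation with the link partner, query the current pause parameters
+(eth? is a placeholder for your interface, as above)::
+
+  ethtool -a eth?
+
+Comparing this output before and after the -A command above shows whether the
+requested Rx/Tx pause settings took effect or were renegotiated.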
+ +To enable or disable auto-negotiation:: + + ethtool -s eth? autoneg <on|off> + +Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending +on your device, you may not be able to change the auto-negotiation setting. + +RSS Hash Flow +------------- +Allows you to set the hash bytes per flow type and any combination of one or +more options for Receive Side Scaling (RSS) hash byte configuration. + +:: + + # ethtool -N <dev> rx-flow-hash <type> <option> + +Where <type> is: + tcp4 signifying TCP over IPv4 + udp4 signifying UDP over IPv4 + tcp6 signifying TCP over IPv6 + udp6 signifying UDP over IPv6 +And <option> is one or more of: + s Hash on the IP source address of the Rx packet. + d Hash on the IP destination address of the Rx packet. + f Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet. + n Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet. + +MAC and VLAN anti-spoofing feature +---------------------------------- +When a malicious driver attempts to send a spoofed packet, it is dropped by the +hardware and not transmitted. +NOTE: This feature can be disabled for a specific Virtual Function (VF):: + + ip link set <pf dev> vf <vf id> spoofchk {off|on} + +IEEE 1588 Precision Time Protocol (PTP) Hardware Clock (PHC) +------------------------------------------------------------ +Precision Time Protocol (PTP) is used to synchronize clocks in a computer +network. PTP support varies among Intel devices that support this driver. Use +"ethtool -T <netdev name>" to get a definitive list of PTP capabilities +supported by the device. + +IEEE 802.1ad (QinQ) Support +--------------------------- +The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN +IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as +"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks +allow L2 tunneling and the ability to segregate traffic within a particular +VLAN ID, among other uses. + +The following are examples of how to configure 802.1ad (QinQ):: + + ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24 + ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371 + +Where "24" and "371" are example VLAN IDs. + +NOTES: + Receive checksum offloads, cloud filters, and VLAN acceleration are not + supported for 802.1ad (QinQ) packets. + +VXLAN and GENEVE Overlay HW Offloading +-------------------------------------- +Virtual Extensible LAN (VXLAN) allows you to extend an L2 network over an L3 +network, which may be useful in a virtualized or cloud environment. Some +Intel(R) Ethernet Network devices perform VXLAN processing, offloading it from +the operating system. This reduces CPU utilization. + +VXLAN offloading is controlled by the Tx and Rx checksum offload options +provided by ethtool. That is, if Tx checksum offload is enabled, and the +adapter has the capability, VXLAN offloading is also enabled. + +Support for VXLAN and GENEVE HW offloading is dependent on kernel support of +the HW offloading features. + +Multiple Functions per Port +--------------------------- +Some adapters based on the Intel Ethernet Controller X710/XL710 support +multiple functions on a single physical port. Configure these functions through +the System Setup/BIOS. + +Minimum TX Bandwidth is the guaranteed minimum data transmission bandwidth, as +a percentage of the full physical port link speed, that the partition will +receive. The bandwidth the partition is awarded will never fall below the level +you specify. 
+
+The range for the minimum bandwidth values is:
+1 to ((100 minus # of partitions on the physical port) plus 1)
+For example, if a physical port has 4 partitions, the range would be:
+1 to ((100 - 4) + 1 = 97)
+
+The Maximum Bandwidth percentage represents the maximum transmit bandwidth
+allocated to the partition as a percentage of the full physical port link
+speed. The accepted range of values is 1-100. The value is used as a limiter,
+should you choose to prevent any one particular function from consuming 100%
+of a port's bandwidth (should it be available). The sum of all the values for
+Maximum Bandwidth is not restricted, because no more than 100% of a port's
+bandwidth can ever be used.
+
+NOTE: X710/XXV710 devices fail to enable Max VFs (64) when Multiple Functions
+per Port (MFP) and SR-IOV are enabled. An error from i40e is logged that says
+"add vsi failed for VF N, aq_err 16". To work around the issue, enable fewer
+than 64 virtual functions (VFs).
+
+Data Center Bridging (DCB)
+--------------------------
+DCB is a configuration Quality of Service implementation in hardware. It uses
+the VLAN priority tag (802.1p) to filter traffic. That means that there are 8
+different priorities that traffic can be filtered into. It also enables
+priority flow control (802.1Qbb) which can limit or eliminate the number of
+dropped packets during network stress. Bandwidth can be allocated to each of
+these priorities, which is enforced at the hardware level (802.1Qaz).
+
+Adapter firmware implements LLDP and DCBX protocol agents as per 802.1AB and
+802.1Qaz respectively. The firmware-based DCBX agent runs in willing mode only
+and can accept settings from a DCBX capable peer. Software configuration of
+DCBX parameters via dcbtool/lldptool is not supported.
+
+NOTE: Firmware LLDP can be disabled by setting the private flag disable-fw-lldp.
+
+The i40e driver implements the DCB netlink interface layer to allow user-space
+to communicate with the driver and query DCB configuration for the port.
+
+NOTE:
+The kernel assumes that TC0 is available, and will disable Priority Flow
+Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
+enabled when setting up DCB on your switch.
+
+Interrupt Rate Limiting
+-----------------------
+:Valid Range: 0-235 (0=no limit)
+
+The Intel(R) Ethernet Controller XL710 family supports an interrupt rate
+limiting mechanism. The user can control, via ethtool, the number of
+microseconds between interrupts.
+
+Syntax::
+
+  # ethtool -C ethX rx-usecs-high N
+
+The range of 0-235 microseconds provides an effective range of 4,310 to 250,000
+interrupts per second. The value of rx-usecs-high can be set independently of
+rx-usecs and tx-usecs in the same ethtool command, and is also independent of
+the adaptive interrupt moderation algorithm. The underlying hardware supports
+granularity in 4-microsecond intervals, so adjacent values may result in the
+same interrupt rate.
+
+One possible use case is the following::
+
+  # ethtool -C ethX adaptive-rx off adaptive-tx off rx-usecs-high 20 rx-usecs \
+  5 tx-usecs 5
+
+The above command would disable adaptive interrupt moderation, and allow a
+maximum of 5 microseconds before indicating a receive or transmit was complete.
+However, instead of resulting in as many as 200,000 interrupts per second, it
+limits total interrupts per second to 50,000 via the rx-usecs-high parameter.
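+As a rough rule of thumb, the interrupt ceiling is about 1,000,000 divided by
+rx-usecs-high (subject to the 4-microsecond hardware granularity), which is how
+the value of 20 above yields the 50,000 interrupts per second limit. To confirm
+what the driver actually programmed, read the coalescing settings back::
+
+  # ethtool -c ethX
+
+The rx-usecs-high, rx-usecs, and tx-usecs values reported should match what was
+set with ethtool -C.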
+ +Performance Optimization +======================== +Driver defaults are meant to fit a wide variety of workloads, but if further +optimization is required we recommend experimenting with the following settings. + +NOTE: For better performance when processing small (64B) frame sizes, try +enabling Hyper threading in the BIOS in order to increase the number of logical +cores in the system and subsequently increase the number of queues available to +the adapter. + +Virtualized Environments +------------------------ +1. Disable XPS on both ends by using the included virt_perf_default script +or by running the following command as root:: + + for file in `ls /sys/class/net/<ethX>/queues/tx-*/xps_cpus`; + do echo 0 > $file; done + +2. Using the appropriate mechanism (vcpupin) in the vm, pin the cpu's to +individual lcpu's, making sure to use a set of cpu's included in the +device's local_cpulist: /sys/class/net/<ethX>/device/local_cpulist. + +3. Configure as many Rx/Tx queues in the VM as available. Do not rely on +the default setting of 1. + + +Non-virtualized Environments +---------------------------- +Pin the adapter's IRQs to specific cores by disabling the irqbalance service +and using the included set_irq_affinity script. Please see the script's help +text for further options. + +- The following settings will distribute the IRQs across all the cores evenly:: + + # scripts/set_irq_affinity -x all <interface1> , [ <interface2>, ... ] + +- The following settings will distribute the IRQs across all the cores that are + local to the adapter (same NUMA node):: + + # scripts/set_irq_affinity -x local <interface1> ,[ <interface2>, ... ] + +For very CPU intensive workloads, we recommend pinning the IRQs to all cores. + +For IP Forwarding: Disable Adaptive ITR and lower Rx and Tx interrupts per +queue using ethtool. + +- Setting rx-usecs and tx-usecs to 125 will limit interrupts to about 8000 + interrupts per second per queue. + +:: + + # ethtool -C <interface> adaptive-rx off adaptive-tx off rx-usecs 125 \ + tx-usecs 125 + +For lower CPU utilization: Disable Adaptive ITR and lower Rx and Tx interrupts +per queue using ethtool. + +- Setting rx-usecs and tx-usecs to 250 will limit interrupts to about 4000 + interrupts per second per queue. + +:: + + # ethtool -C <interface> adaptive-rx off adaptive-tx off rx-usecs 250 \ + tx-usecs 250 + +For lower latency: Disable Adaptive ITR and ITR by setting Rx and Tx to 0 using +ethtool. + +:: + + # ethtool -C <interface> adaptive-rx off adaptive-tx off rx-usecs 0 \ + tx-usecs 0 + +Application Device Queues (ADq) +------------------------------- +Application Device Queues (ADq) allows you to dedicate one or more queues to a +specific application. This can reduce latency for the specified application, +and allow Tx traffic to be rate limited per application. Follow the steps below +to set ADq. + +1. Create traffic classes (TCs). Maximum of 8 TCs can be created per interface. +The shaper bw_rlimit parameter is optional. + +Example: Sets up two tcs, tc0 and tc1, with 16 queues each and max tx rate set +to 1Gbit for tc0 and 3Gbit for tc1. + +:: + + # tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1 + queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit + max_rate 1Gbit 3Gbit + +map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1 +sets priorities 0-3 to use tc0 and 4-7 to use tc1) + +queues: for each tc, <num queues>@<offset> (e.g. 
queues 16@0 16@16 assigns
+16 queues to tc0 at offset 0 and 16 queues to tc1 at offset 16. Max total
+number of queues for all tcs is 64 or number of cores, whichever is lower.)
+
+hw 1 mode channel: 'channel' with 'hw' set to 1 is a new hardware offload
+mode in mqprio that makes full use of the mqprio options, the TCs, the queue
+configurations, and the QoS parameters.
+
+shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
+Totals must be equal to or less than port speed.
+
+For example, with min_rate 1Gbit 3Gbit, verify the bandwidth limit using
+network monitoring tools such as ``ifstat`` or
+``sar -n DEV [interval] [number of samples]``
+
+2. Enable HW TC offload on interface::
+
+    # ethtool -K <interface> hw-tc-offload on
+
+3. Apply TCs to ingress (RX) flow of interface::
+
+    # tc qdisc add dev <interface> ingress
+
+NOTES:
+ - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory.
+ - ADq is not compatible with cloud filters.
+ - Setting up channels via ethtool (ethtool -L) is not supported when the
+   TCs are configured using mqprio.
+ - You must have the latest version of iproute2.
+ - NVM version 6.01 or later is required.
+ - ADq cannot be enabled when any of the following features are enabled: Data
+   Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband
+   Filters.
+ - If another driver (for example, DPDK) has set cloud filters, you cannot
+   enable ADq.
+ - Tunnel filters are not supported in ADq. If encapsulated packets do
+   arrive in non-tunnel mode, filtering will be done on the inner headers.
+   For example, for VXLAN traffic in non-tunnel mode, PCTYPE is identified
+   as a VXLAN encapsulated packet, outer headers are ignored. Therefore,
+   inner headers are matched.
+ - If a TC filter on a PF matches traffic over a VF (on the PF), that
+   traffic will be routed to the appropriate queue of the PF, and will
+   not be passed on to the VF. Such traffic will end up getting dropped
+   higher up in the TCP/IP stack as it does not match PF address data.
+ - If traffic matches multiple TC filters that point to different TCs,
+   that traffic will be duplicated and sent to all matching TC queues.
+   The hardware switch mirrors the packet to a VSI list when multiple
+   filters are matched.
+
+
+Known Issues/Troubleshooting
+============================
+
+NOTE: 1 Gb devices based on the Intel(R) Ethernet Network Connection X722 do
+not support the following features:
+
+  * Data Center Bridging (DCB)
+  * QOS
+  * VMQ
+  * SR-IOV
+  * Task Encapsulation offload (VXLAN, NVGRE)
+  * Energy Efficient Ethernet (EEE)
+  * Auto-media detect
+
+Unexpected Issues when the device driver and DPDK share a device
+----------------------------------------------------------------
+Unexpected issues may result when an i40e device is in multi driver mode and
+the kernel driver and DPDK driver are sharing the device. This is because
+access to the global NIC resources is not synchronized between multiple
+drivers. Any change to the global NIC configuration (writing to a global
+register, setting global configuration by AQ, or changing switch modes) will
+affect all ports and drivers on the device. Loading DPDK with the
+"multi-driver" module parameter may mitigate some of the issues.
+
+TC0 must be enabled when setting up DCB on a switch
+---------------------------------------------------
+The kernel assumes that TC0 is available, and will disable Priority Flow
+Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
+enabled when setting up DCB on your switch.
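+When investigating the issues above, the per-port statistics exposed through
+ethtool can help confirm whether Intel Ethernet Flow Director rules (ATR or
+sideband) are actually matching traffic. For example, to watch the
+fdir_atr_match counter mentioned earlier (other fdir counter names may vary by
+driver version)::
+
+  ethtool -S ethX | grep fdir
+
+A counter that increments while a flow is active indicates the corresponding
+filter is being hit.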
+ + +Support +======= +For general information, go to the Intel support website at: + +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: + +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net. diff --git a/Documentation/networking/device_drivers/ethernet/intel/iavf.rst b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst new file mode 100644 index 000000000..151af0a8d --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/iavf.rst @@ -0,0 +1,331 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +================================================================= +Linux Base Driver for Intel(R) Ethernet Adaptive Virtual Function +================================================================= + +Intel Ethernet Adaptive Virtual Function Linux driver. +Copyright(c) 2013-2018 Intel Corporation. + +Contents +======== + +- Overview +- Identifying Your Adapter +- Additional Configurations +- Known Issues/Troubleshooting +- Support + +Overview +======== + +This file describes the iavf Linux Base Driver. This driver was formerly +called i40evf. + +The iavf driver supports the below mentioned virtual function devices and +can only be activated on kernels running the i40e or newer Physical Function +(PF) driver compiled with CONFIG_PCI_IOV. The iavf driver requires +CONFIG_PCI_MSI to be enabled. + +The guest OS loading the iavf driver must support MSI-X interrupts. + +Identifying Your Adapter +======================== + +The driver in this kernel is compatible with devices based on the following: + * Intel(R) XL710 X710 Virtual Function + * Intel(R) X722 Virtual Function + * Intel(R) XXV710 Virtual Function + * Intel(R) Ethernet Adaptive Virtual Function + +For the best performance, make sure the latest NVM/FW is installed on your +device. + +For information on how to identify your adapter, and for the latest NVM/FW +images and Intel network drivers, refer to the Intel Support website: +https://www.intel.com/support + + +Additional Features and Configurations +====================================== + +Viewing Link Messages +--------------------- +Link messages will not be displayed to the console if the distribution is +restricting system messages. In order to see network driver link messages on +your console, set dmesg to eight by entering the following:: + + # dmesg -n 8 + +NOTE: + This setting is not saved across reboots. + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. Download it at: +https://www.kernel.org/pub/software/network/ethtool/ + +Setting VLAN Tag Stripping +-------------------------- +If you have applications that require Virtual Functions (VFs) to receive +packets with VLAN tags, you can disable VLAN tag stripping for the VF. The +Physical Function (PF) processes requests issued from the VF to enable or +disable VLAN tag stripping. Note that if the PF has assigned a VLAN to a VF, +then requests from that VF to set VLAN tag stripping will be ignored. 
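+To check from the host whether the PF has assigned a port VLAN to a VF (and
+therefore whether stripping requests from that VF will be ignored), list the
+VF entries on the PF, where <pf_if> is a placeholder for the PF interface::
+
+  # ip link show dev <pf_if>
+
+Each "vf N" line in the output includes the administratively assigned MAC
+address and, when one is set, the VLAN ID for that VF.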
+ +To enable/disable VLAN tag stripping for a VF, issue the following command +from inside the VM in which you are running the VF:: + + # ethtool -K <if_name> rxvlan on/off + +or alternatively:: + + # ethtool --offload <if_name> rxvlan on/off + +Adaptive Virtual Function +------------------------- +Adaptive Virtual Function (AVF) allows the virtual function driver, or VF, to +adapt to changing feature sets of the physical function driver (PF) with which +it is associated. This allows system administrators to update a PF without +having to update all the VFs associated with it. All AVFs have a single common +device ID and branding string. + +AVFs have a minimum set of features known as "base mode," but may provide +additional features depending on what features are available in the PF with +which the AVF is associated. The following are base mode features: + +- 4 Queue Pairs (QP) and associated Configuration Status Registers (CSRs) + for Tx/Rx +- i40e descriptors and ring format +- Descriptor write-back completion +- 1 control queue, with i40e descriptors, CSRs and ring format +- 5 MSI-X interrupt vectors and corresponding i40e CSRs +- 1 Interrupt Throttle Rate (ITR) index +- 1 Virtual Station Interface (VSI) per VF +- 1 Traffic Class (TC), TC0 +- Receive Side Scaling (RSS) with 64 entry indirection table and key, + configured through the PF +- 1 unicast MAC address reserved per VF +- 16 MAC address filters for each VF +- Stateless offloads - non-tunneled checksums +- AVF device ID +- HW mailbox is used for VF to PF communications (including on Windows) + +IEEE 802.1ad (QinQ) Support +--------------------------- +The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN +IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as +"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks +allow L2 tunneling and the ability to segregate traffic within a particular +VLAN ID, among other uses. + +The following are examples of how to configure 802.1ad (QinQ):: + + # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24 + # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371 + +Where "24" and "371" are example VLAN IDs. + +NOTES: + Receive checksum offloads, cloud filters, and VLAN acceleration are not + supported for 802.1ad (QinQ) packets. + +Application Device Queues (ADq) +------------------------------- +Application Device Queues (ADq) allows you to dedicate one or more queues to a +specific application. This can reduce latency for the specified application, +and allow Tx traffic to be rate limited per application. Follow the steps below +to set ADq. + +Requirements: + +- The sch_mqprio, act_mirred and cls_flower modules must be loaded +- The latest version of iproute2 +- If another driver (for example, DPDK) has set cloud filters, you cannot + enable ADQ +- Depending on the underlying PF device, ADQ cannot be enabled when the + following features are enabled: + + + Data Center Bridging (DCB) + + Multiple Functions per Port (MFP) + + Sideband Filters + +1. Create traffic classes (TCs). Maximum of 8 TCs can be created per interface. +The shaper bw_rlimit parameter is optional. + +Example: Sets up two tcs, tc0 and tc1, with 16 queues each and max tx rate set +to 1Gbit for tc0 and 3Gbit for tc1. 
+
+::
+
+    tc qdisc add dev <interface> root mqprio num_tc 2 map 0 0 0 0 1 1 1 1
+    queues 16@0 16@16 hw 1 mode channel shaper bw_rlimit min_rate 1Gbit 2Gbit
+    max_rate 1Gbit 3Gbit
+
+map: priority mapping for up to 16 priorities to tcs (e.g. map 0 0 0 0 1 1 1 1
+sets priorities 0-3 to use tc0 and 4-7 to use tc1)
+
+queues: for each tc, <num queues>@<offset> (e.g. queues 16@0 16@16 assigns
+16 queues to tc0 at offset 0 and 16 queues to tc1 at offset 16. Max total
+number of queues for all tcs is 64 or number of cores, whichever is lower.)
+
+hw 1 mode channel: 'channel' with 'hw' set to 1 is a new hardware offload
+mode in mqprio that makes full use of the mqprio options, the TCs, the queue
+configurations, and the QoS parameters.
+
+shaper bw_rlimit: for each tc, sets minimum and maximum bandwidth rates.
+Totals must be equal to or less than port speed.
+
+For example, with min_rate 1Gbit 3Gbit, verify the bandwidth limit using
+network monitoring tools such as ``ifstat`` or
+``sar -n DEV [interval] [number of samples]``
+
+NOTE:
+    Setting up channels via ethtool (ethtool -L) is not supported when the
+    TCs are configured using mqprio.
+
+2. Enable HW TC offload on interface::
+
+    # ethtool -K <interface> hw-tc-offload on
+
+3. Apply TCs to ingress (RX) flow of interface::
+
+    # tc qdisc add dev <interface> ingress
+
+NOTES:
+ - Run all tc commands from the iproute2 <pathtoiproute2>/tc/ directory
+ - ADq is not compatible with cloud filters
+ - Setting up channels via ethtool (ethtool -L) is not supported when the TCs
+   are configured using mqprio
+ - You must have the latest version of iproute2
+ - NVM version 6.01 or later is required
+ - ADq cannot be enabled when any of the following features are enabled: Data
+   Center Bridging (DCB), Multiple Functions per Port (MFP), or Sideband Filters
+ - If another driver (for example, DPDK) has set cloud filters, you cannot
+   enable ADq
+ - Tunnel filters are not supported in ADq. If encapsulated packets do arrive
+   in non-tunnel mode, filtering will be done on the inner headers. For example,
+   for VXLAN traffic in non-tunnel mode, PCTYPE is identified as a VXLAN
+   encapsulated packet, outer headers are ignored. Therefore, inner headers are
+   matched.
+ - If a TC filter on a PF matches traffic over a VF (on the PF), that traffic
+   will be routed to the appropriate queue of the PF, and will not be passed on
+   to the VF. Such traffic will end up getting dropped higher up in the TCP/IP
+   stack as it does not match PF address data.
+ - If traffic matches multiple TC filters that point to different TCs, that
+   traffic will be duplicated and sent to all matching TC queues. The hardware
+   switch mirrors the packet to a VSI list when multiple filters are matched.
+
+
+Known Issues/Troubleshooting
+============================
+
+Bonding fails with VFs bound to an Intel(R) Ethernet Controller 700 series device
+---------------------------------------------------------------------------------
+If you bind Virtual Functions (VFs) to an Intel(R) Ethernet Controller 700
+series based device, the VF slaves may fail when they become the active slave.
+If the MAC address of the VF is set by the PF (Physical Function) of the
+device, when you add a slave, or change the active-backup slave, Linux bonding
+tries to sync the backup slave's MAC address to the same MAC address as the
+active slave. Linux bonding will fail at this point. This issue will not occur
+if the VF's MAC address is not set by the PF.
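+A minimal sketch of a configuration that can expose this issue, assuming two
+VF netdevs named ens1f0v0 and ens1f1v0 (hypothetical names) enslaved to an
+active-backup bond::
+
+  # ip link add bond0 type bond mode active-backup miimon 100
+  # ip link set ens1f0v0 down && ip link set ens1f0v0 master bond0
+  # ip link set ens1f1v0 down && ip link set ens1f1v0 master bond0
+  # ip link set bond0 up
+
+If the MAC addresses of these VFs were assigned from the PF, a failover that
+promotes the backup slave may fail as described above.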
+ +Traffic Is Not Being Passed Between VM and Client +------------------------------------------------- +You may not be able to pass traffic between a client system and a +Virtual Machine (VM) running on a separate host if the Virtual Function +(VF, or Virtual NIC) is not in trusted mode and spoof checking is enabled +on the VF. Note that this situation can occur in any combination of client, +host, and guest operating system. For information on how to set the VF to +trusted mode, refer to the section "VLAN Tag Packet Steering" in this +readme document. For information on setting spoof checking, refer to the +section "MAC and VLAN anti-spoofing feature" in this readme document. + +Do not unload port driver if VF with active VM is bound to it +------------------------------------------------------------- +Do not unload a port's driver if a Virtual Function (VF) with an active Virtual +Machine (VM) is bound to it. Doing so will cause the port to appear to hang. +Once the VM shuts down, or otherwise releases the VF, the command will complete. + +Using four traffic classes fails +-------------------------------- +Do not try to reserve more than three traffic classes in the iavf driver. Doing +so will fail to set any traffic classes and will cause the driver to write +errors to stdout. Use a maximum of three queues to avoid this issue. + +Multiple log error messages on iavf driver removal +-------------------------------------------------- +If you have several VFs and you remove the iavf driver, several instances of +the following log errors are written to the log:: + + Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err ok + Unable to send the message to VF 2 aq_err 12 + ARQ Overflow Error detected + +Virtual machine does not get link +--------------------------------- +If the virtual machine has more than one virtual port assigned to it, and those +virtual ports are bound to different physical ports, you may not get link on +all of the virtual ports. The following command may work around the issue:: + + # ethtool -r <PF> + +Where <PF> is the PF interface in the host, for example: p5p1. You may need to +run the command more than once to get link on all virtual ports. + +MAC address of Virtual Function changes unexpectedly +---------------------------------------------------- +If a Virtual Function's MAC address is not assigned in the host, then the VF +(virtual function) driver will use a random MAC address. This random MAC +address may change each time the VF driver is reloaded. You can assign a static +MAC address in the host machine. This static MAC address will survive +a VF driver reload. + +Driver Buffer Overflow Fix +-------------------------- +The fix to resolve CVE-2016-8105, referenced in Intel SA-00069 +https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00069.html +is included in this and future versions of the driver. + +Multiple Interfaces on Same Ethernet Broadcast Network +------------------------------------------------------ +Due to the default ARP behavior on Linux, it is not possible to have one system +on two IP networks in the same Ethernet broadcast domain (non-partitioned +switch) behave as expected. All Ethernet interfaces will respond to IP traffic +for any IP address assigned to the system. This results in unbalanced receive +traffic. + +If you have multiple interfaces in a server, either turn on ARP filtering by +entering:: + + # echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter + +NOTE: + This setting is not saved across reboots. 
The configuration change can be + made permanent by adding the following line to the file /etc/sysctl.conf:: + + net.ipv4.conf.all.arp_filter = 1 + +Another alternative is to install the interfaces in separate broadcast domains +(either in different switches or in a switch partitioned to VLANs). + +Rx Page Allocation Errors +------------------------- +'Page allocation failure. order:0' errors may occur under stress. +This is caused by the way the Linux kernel reports this stressed condition. + + +Support +======= +For general information, go to the Intel support website at: + +https://support.intel.com + +or the Intel Wired Networking project hosted by Sourceforge at: + +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on the supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst new file mode 100644 index 000000000..dc2e60ced --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst @@ -0,0 +1,1040 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +================================================================= +Linux Base Driver for the Intel(R) Ethernet Controller 800 Series +================================================================= + +Intel ice Linux driver. +Copyright(c) 2018-2021 Intel Corporation. + +Contents +======== + +- Overview +- Identifying Your Adapter +- Important Notes +- Additional Features & Configurations +- Performance Optimization + + +The associated Virtual Function (VF) driver for this driver is iavf. + +Driver information can be obtained using ethtool and lspci. + +For questions related to hardware requirements, refer to the documentation +supplied with your Intel adapter. All hardware requirements listed apply to use +with Linux. + +This driver supports XDP (Express Data Path) and AF_XDP zero-copy. Note that +XDP is blocked for frame sizes larger than 3KB. + + +Identifying Your Adapter +======================== +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + + +Important Notes +=============== + +Packet drops may occur under receive stress +------------------------------------------- +Devices based on the Intel(R) Ethernet Controller 800 Series are designed to +tolerate a limited amount of system latency during PCIe and DMA transactions. +If these transactions take longer than the tolerated latency, it can impact the +length of time the packets are buffered in the device and associated memory, +which may result in dropped packets. These packets drops typically do not have +a noticeable impact on throughput and performance under standard workloads. + +If these packet drops appear to affect your workload, the following may improve +the situation: + +1) Make sure that your system's physical memory is in a high-performance + configuration, as recommended by the platform vendor. A common + recommendation is for all channels to be populated with a single DIMM + module. +2) In your system's BIOS/UEFI settings, select the "Performance" profile. +3) Your distribution may provide tools like "tuned," which can help tweak + kernel settings to achieve better standard settings for different workloads. 
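+For example, on distributions that ship the tuned daemon mentioned above, you
+can list the available profiles and select a throughput-oriented one (profile
+names vary by distribution; network-throughput is one common choice)::
+
+  # tuned-adm list
+  # tuned-adm profile network-throughput
+
+Re-test the workload after each change, since the best profile depends on the
+traffic pattern.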
+ + +Configuring SR-IOV for improved network security +------------------------------------------------ +In a virtualized environment, on Intel(R) Ethernet Network Adapters that +support SR-IOV, the virtual function (VF) may be subject to malicious behavior. +Software-generated layer two frames, like IEEE 802.3x (link flow control), IEEE +802.1Qbb (priority based flow-control), and others of this type, are not +expected and can throttle traffic between the host and the virtual switch, +reducing performance. To resolve this issue, and to ensure isolation from +unintended traffic streams, configure all SR-IOV enabled ports for VLAN tagging +from the administrative interface on the PF. This configuration allows +unexpected, and potentially malicious, frames to be dropped. + +See "Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports" later in this +README for configuration instructions. + + +Do not unload port driver if VF with active VM is bound to it +------------------------------------------------------------- +Do not unload a port's driver if a Virtual Function (VF) with an active Virtual +Machine (VM) is bound to it. Doing so will cause the port to appear to hang. +Once the VM shuts down, or otherwise releases the VF, the command will +complete. + + +Important notes for SR-IOV and Link Aggregation +----------------------------------------------- +Link Aggregation is mutually exclusive with SR-IOV. + +- If Link Aggregation is active, SR-IOV VFs cannot be created on the PF. +- If SR-IOV is active, you cannot set up Link Aggregation on the interface. + +Bridging and MACVLAN are also affected by this. If you wish to use bridging or +MACVLAN with SR-IOV, you must set up bridging or MACVLAN before enabling +SR-IOV. If you are using bridging or MACVLAN in conjunction with SR-IOV, and +you want to remove the interface from the bridge or MACVLAN, you must follow +these steps: + +1. Destroy SR-IOV VFs if they exist +2. Remove the interface from the bridge or MACVLAN +3. Recreate SRIOV VFs as needed + + +Additional Features and Configurations +====================================== + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. Download it at: +https://kernel.org/pub/software/network/ethtool/ + +NOTE: The rx_bytes value of ethtool does not match the rx_bytes value of +Netdev, due to the 4-byte CRC being stripped by the device. The difference +between the two rx_bytes values will be 4 x the number of Rx packets. For +example, if Rx packets are 10 and Netdev (software statistics) displays +rx_bytes as "X", then ethtool (hardware statistics) will display rx_bytes as +"X+40" (4 bytes CRC x 10 packets). + + +Viewing Link Messages +--------------------- +Link messages will not be displayed to the console if the distribution is +restricting system messages. In order to see network driver link messages on +your console, set dmesg to eight by entering the following:: + + # dmesg -n 8 + +NOTE: This setting is not saved across reboots. + + +Dynamic Device Personalization +------------------------------ +Dynamic Device Personalization (DDP) allows you to change the packet processing +pipeline of a device by applying a profile package to the device at runtime. +Profiles can be used to, for example, add support for new protocols, change +existing protocols, or change default settings. 
DDP profiles can also be rolled +back without rebooting the system. + +The DDP package loads during device initialization. The driver looks for +``intel/ice/ddp/ice.pkg`` in your firmware root (typically ``/lib/firmware/`` +or ``/lib/firmware/updates/``) and checks that it contains a valid DDP package +file. + +NOTE: Your distribution should likely have provided the latest DDP file, but if +ice.pkg is missing, you can find it in the linux-firmware repository or from +intel.com. + +If the driver is unable to load the DDP package, the device will enter Safe +Mode. Safe Mode disables advanced and performance features and supports only +basic traffic and minimal functionality, such as updating the NVM or +downloading a new driver or DDP package. Safe Mode only applies to the affected +physical function and does not impact any other PFs. See the "Intel(R) Ethernet +Adapters and Devices User Guide" for more details on DDP and Safe Mode. + +NOTES: + +- If you encounter issues with the DDP package file, you may need to download + an updated driver or DDP package file. See the log messages for more + information. + +- The ice.pkg file is a symbolic link to the default DDP package file. + +- You cannot update the DDP package if any PF drivers are already loaded. To + overwrite a package, unload all PFs and then reload the driver with the new + package. + +- Only the first loaded PF per device can download a package for that device. + +You can install specific DDP package files for different physical devices in +the same system. To install a specific DDP package file: + +1. Download the DDP package file you want for your device. + +2. Rename the file ice-xxxxxxxxxxxxxxxx.pkg, where 'xxxxxxxxxxxxxxxx' is the + unique 64-bit PCI Express device serial number (in hex) of the device you + want the package downloaded on. The filename must include the complete + serial number (including leading zeros) and be all lowercase. For example, + if the 64-bit serial number is b887a3ffffca0568, then the file name would be + ice-b887a3ffffca0568.pkg. + + To find the serial number from the PCI bus address, you can use the + following command:: + + # lspci -vv -s af:00.0 | grep -i Serial + Capabilities: [150 v1] Device Serial Number b8-87-a3-ff-ff-ca-05-68 + + You can use the following command to format the serial number without the + dashes:: + + # lspci -vv -s af:00.0 | grep -i Serial | awk '{print $7}' | sed s/-//g + b887a3ffffca0568 + +3. Copy the renamed DDP package file to + ``/lib/firmware/updates/intel/ice/ddp/``. If the directory does not yet + exist, create it before copying the file. + +4. Unload all of the PFs on the device. + +5. Reload the driver with the new package. + +NOTE: The presence of a device-specific DDP package file overrides the loading +of the default DDP package file (ice.pkg). + + +Intel(R) Ethernet Flow Director +------------------------------- +The Intel Ethernet Flow Director performs the following tasks: + +- Directs receive packets according to their flows to different queues +- Enables tight control on routing a flow in the platform +- Matches flows and CPU cores for flow affinity + +NOTE: This driver supports the following flow types: + +- IPv4 +- TCPv4 +- UDPv4 +- SCTPv4 +- IPv6 +- TCPv6 +- UDPv6 +- SCTPv6 + +Each flow type supports valid combinations of IP addresses (source or +destination) and UDP/TCP/SCTP ports (source and destination). 
You can supply +only a source IP address, a source IP address and a destination port, or any +combination of one or more of these four parameters. + +NOTE: This driver allows you to filter traffic based on a user-defined flexible +two-byte pattern and offset by using the ethtool user-def and mask fields. Only +L3 and L4 flow types are supported for user-defined flexible filters. For a +given flow type, you must clear all Intel Ethernet Flow Director filters before +changing the input set (for that flow type). + + +Flow Director Filters +--------------------- +Flow Director filters are used to direct traffic that matches specified +characteristics. They are enabled through ethtool's ntuple interface. To enable +or disable the Intel Ethernet Flow Director and these filters:: + + # ethtool -K <ethX> ntuple <off|on> + +NOTE: When you disable ntuple filters, all the user programmed filters are +flushed from the driver cache and hardware. All needed filters must be re-added +when ntuple is re-enabled. + +To display all of the active filters:: + + # ethtool -u <ethX> + +To add a new filter:: + + # ethtool -U <ethX> flow-type <type> src-ip <ip> [m <ip_mask>] dst-ip <ip> + [m <ip_mask>] src-port <port> [m <port_mask>] dst-port <port> [m <port_mask>] + action <queue> + + Where: + <ethX> - the Ethernet device to program + <type> - can be ip4, tcp4, udp4, sctp4, ip6, tcp6, udp6, sctp6 + <ip> - the IP address to match on + <ip_mask> - the IPv4 address to mask on + NOTE: These filters use inverted masks. + <port> - the port number to match on + <port_mask> - the 16-bit integer for masking + NOTE: These filters use inverted masks. + <queue> - the queue to direct traffic toward (-1 discards the + matched traffic) + +To delete a filter:: + + # ethtool -U <ethX> delete <N> + + Where <N> is the filter ID displayed when printing all the active filters, + and may also have been specified using "loc <N>" when adding the filter. + +EXAMPLES: + +To add a filter that directs packet to queue 2:: + + # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \ + 192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1] + +To set a filter using only the source and destination IP address:: + + # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \ + 192.168.10.2 action 2 [loc 1] + +To set a filter based on a user-defined pattern and offset:: + + # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \ + 192.168.10.2 user-def 0x4FFFF action 2 [loc 1] + + where the value of the user-def field contains the offset (4 bytes) and + the pattern (0xffff). + +To match TCP traffic sent from 192.168.0.1, port 5300, directed to 192.168.0.5, +port 80, and then send it to queue 7:: + + # ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5 + src-port 5300 dst-port 80 action 7 + +To add a TCPv4 filter with a partial mask for a source IP subnet:: + + # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.0.0 m 0.255.255.255 dst-ip + 192.168.5.12 src-port 12600 dst-port 31 action 12 + +NOTES: + +For each flow-type, the programmed filters must all have the same matching +input set. 
For example, issuing the following two commands is acceptable:: + + # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7 + # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10 + +Issuing the next two commands, however, is not acceptable, since the first +specifies src-ip and the second specifies dst-ip:: + + # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7 + # ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10 + +The second command will fail with an error. You may program multiple filters +with the same fields, using different values, but, on one device, you may not +program two tcp4 filters with different matching fields. + +The ice driver does not support matching on a subportion of a field, thus +partial mask fields are not supported. + + +Flex Byte Flow Director Filters +------------------------------- +The driver also supports matching user-defined data within the packet payload. +This flexible data is specified using the "user-def" field of the ethtool +command in the following way: + +.. table:: + + ============================== ============================ + ``31 28 24 20 16`` ``15 12 8 4 0`` + ``offset into packet payload`` ``2 bytes of flexible data`` + ============================== ============================ + +For example, + +:: + + ... user-def 0x4FFFF ... + +tells the filter to look 4 bytes into the payload and match that value against +0xFFFF. The offset is based on the beginning of the payload, and not the +beginning of the packet. Thus + +:: + + flow-type tcp4 ... user-def 0x8BEAF ... + +would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the +TCP/IPv4 payload. + +Note that ICMP headers are parsed as 4 bytes of header and 4 bytes of payload. +Thus to match the first byte of the payload, you must actually add 4 bytes to +the offset. Also note that ip4 filters match both ICMP frames as well as raw +(unknown) ip4 frames, where the payload will be the L3 payload of the IP4 +frame. + +The maximum offset is 64. The hardware will only read up to 64 bytes of data +from the payload. The offset must be even because the flexible data is 2 bytes +long and must be aligned to byte 0 of the packet payload. + +The user-defined flexible offset is also considered part of the input set and +cannot be programmed separately for multiple filters of the same type. However, +the flexible data is not part of the input set and multiple filters may use the +same offset but match against different data. + + +RSS Hash Flow +------------- +Allows you to set the hash bytes per flow type and any combination of one or +more options for Receive Side Scaling (RSS) hash byte configuration. + +:: + + # ethtool -N <ethX> rx-flow-hash <type> <option> + + Where <type> is: + tcp4 signifying TCP over IPv4 + udp4 signifying UDP over IPv4 + tcp6 signifying TCP over IPv6 + udp6 signifying UDP over IPv6 + And <option> is one or more of: + s Hash on the IP source address of the Rx packet. + d Hash on the IP destination address of the Rx packet. + f Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet. + n Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet. + + +Accelerated Receive Flow Steering (aRFS) +---------------------------------------- +Devices based on the Intel(R) Ethernet Controller 800 Series support +Accelerated Receive Flow Steering (aRFS) on the PF. 
aRFS is a load-balancing +mechanism that allows you to direct packets to the same CPU where an +application is running or consuming the packets in that flow. + +NOTES: + +- aRFS requires that ntuple filtering is enabled via ethtool. +- aRFS support is limited to the following packet types: + + - TCP over IPv4 and IPv6 + - UDP over IPv4 and IPv6 + - Nonfragmented packets + +- aRFS only supports Flow Director filters, which consist of the + source/destination IP addresses and source/destination ports. +- aRFS and ethtool's ntuple interface both use the device's Flow Director. aRFS + and ntuple features can coexist, but you may encounter unexpected results if + there's a conflict between aRFS and ntuple requests. See "Intel(R) Ethernet + Flow Director" for additional information. + +To set up aRFS: + +1. Enable the Intel Ethernet Flow Director and ntuple filters using ethtool. + +:: + + # ethtool -K <ethX> ntuple on + +2. Set up the number of entries in the global flow table. For example: + +:: + + # NUM_RPS_ENTRIES=16384 + # echo $NUM_RPS_ENTRIES > /proc/sys/net/core/rps_sock_flow_entries + +3. Set up the number of entries in the per-queue flow table. For example: + +:: + + # NUM_RX_QUEUES=64 + # for file in /sys/class/net/$IFACE/queues/rx-*/rps_flow_cnt; do + # echo $(($NUM_RPS_ENTRIES/$NUM_RX_QUEUES)) > $file; + # done + +4. Disable the IRQ balance daemon (this is only a temporary stop of the service + until the next reboot). + +:: + + # systemctl stop irqbalance + +5. Configure the interrupt affinity. + + See ``/Documentation/core-api/irq/irq-affinity.rst`` + + +To disable aRFS using ethtool:: + + # ethtool -K <ethX> ntuple off + +NOTE: This command will disable ntuple filters and clear any aRFS filters in +software and hardware. + +Example Use Case: + +1. Set the server application on the desired CPU (e.g., CPU 4). + +:: + + # taskset -c 4 netserver + +2. Use netperf to route traffic from the client to CPU 4 on the server with + aRFS configured. This example uses TCP over IPv4. + +:: + + # netperf -H <Host IPv4 Address> -t TCP_STREAM + + +Enabling Virtual Functions (VFs) +-------------------------------- +Use sysfs to enable virtual functions (VF). + +For example, you can create 4 VFs as follows:: + + # echo 4 > /sys/class/net/<ethX>/device/sriov_numvfs + +To disable VFs, write 0 to the same file:: + + # echo 0 > /sys/class/net/<ethX>/device/sriov_numvfs + +The maximum number of VFs for the ice driver is 256 total (all ports). To check +how many VFs each PF supports, use the following command:: + + # cat /sys/class/net/<ethX>/device/sriov_totalvfs + +Note: You cannot use SR-IOV when link aggregation (LAG)/bonding is active, and +vice versa. To enforce this, the driver checks for this mutual exclusion. + + +Displaying VF Statistics on the PF +---------------------------------- +Use the following command to display the statistics for the PF and its VFs:: + + # ip -s link show dev <ethX> + +NOTE: The output of this command can be very large due to the maximum number of +possible VFs. + +The PF driver will display a subset of the statistics for the PF and for all +VFs that are configured. The PF will always print a statistics block for each +of the possible VFs, and it will show zero for all unconfigured VFs. + + +Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports +-------------------------------------------------------- +To configure VLAN tagging for the ports on an SR-IOV enabled adapter, use the +following command. 
The VLAN configuration should be done before the VF driver +is loaded or the VM is booted. The VF is not aware of the VLAN tag being +inserted on transmit and removed on received frames (sometimes called "port +VLAN" mode). + +:: + + # ip link set dev <ethX> vf <id> vlan <vlan id> + +For example, the following will configure PF eth0 and the first VF on VLAN 10:: + + # ip link set dev eth0 vf 0 vlan 10 + + +Enabling a VF link if the port is disconnected +---------------------------------------------- +If the physical function (PF) link is down, you can force link up (from the +host PF) on any virtual functions (VF) bound to the PF. + +For example, to force link up on VF 0 bound to PF eth0:: + + # ip link set eth0 vf 0 state enable + +Note: If the command does not work, it may not be supported by your system. + + +Setting the MAC Address for a VF +-------------------------------- +To change the MAC address for the specified VF:: + + # ip link set <ethX> vf 0 mac <address> + +For example:: + + # ip link set <ethX> vf 0 mac 00:01:02:03:04:05 + +This setting lasts until the PF is reloaded. + +NOTE: Assigning a MAC address for a VF from the host will disable any +subsequent requests to change the MAC address from within the VM. This is a +security feature. The VM is not aware of this restriction, so if this is +attempted in the VM, it will trigger MDD events. + + +Trusted VFs and VF Promiscuous Mode +----------------------------------- +This feature allows you to designate a particular VF as trusted and allows that +trusted VF to request selective promiscuous mode on the Physical Function (PF). + +To set a VF as trusted or untrusted, enter the following command in the +Hypervisor:: + + # ip link set dev <ethX> vf 1 trust [on|off] + +NOTE: It's important to set the VF to trusted before setting promiscuous mode. +If the VM is not trusted, the PF will ignore promiscuous mode requests from the +VF. If the VM becomes trusted after the VF driver is loaded, you must make a +new request to set the VF to promiscuous. + +Once the VF is designated as trusted, use the following commands in the VM to +set the VF to promiscuous mode. + +For promiscuous all:: + + # ip link set <ethX> promisc on + Where <ethX> is a VF interface in the VM + +For promiscuous Multicast:: + + # ip link set <ethX> allmulticast on + Where <ethX> is a VF interface in the VM + +NOTE: By default, the ethtool private flag vf-true-promisc-support is set to +"off," meaning that promiscuous mode for the VF will be limited. To set the +promiscuous mode for the VF to true promiscuous and allow the VF to see all +ingress traffic, use the following command:: + + # ethtool --set-priv-flags <ethX> vf-true-promisc-support on + +The vf-true-promisc-support private flag does not enable promiscuous mode; +rather, it designates which type of promiscuous mode (limited or true) you will +get when you enable promiscuous mode using the ip link commands above. Note +that this is a global setting that affects the entire device. However, the +vf-true-promisc-support private flag is only exposed to the first PF of the +device. The PF remains in limited promiscuous mode regardless of the +vf-true-promisc-support setting. + +Next, add a VLAN interface on the VF interface. For example:: + + # ip link add link eth2 name eth2.100 type vlan id 100 + +Note that the order in which you set the VF to promiscuous mode and add the +VLAN interface does not matter (you can do either first). 
The result in this +example is that the VF will get all traffic that is tagged with VLAN 100. + + +Malicious Driver Detection (MDD) for VFs +---------------------------------------- +Some Intel Ethernet devices use Malicious Driver Detection (MDD) to detect +malicious traffic from the VF and disable Tx/Rx queues or drop the offending +packet until a VF driver reset occurs. You can view MDD messages in the PF's +system log using the dmesg command. + +- If the PF driver logs MDD events from the VF, confirm that the correct VF + driver is installed. +- To restore functionality, you can manually reload the VF or VM or enable + automatic VF resets. +- When automatic VF resets are enabled, the PF driver will immediately reset + the VF and reenable queues when it detects MDD events on the receive path. +- If automatic VF resets are disabled, the PF will not automatically reset the + VF when it detects MDD events. + +To enable or disable automatic VF resets, use the following command:: + + # ethtool --set-priv-flags <ethX> mdd-auto-reset-vf on|off + + +MAC and VLAN Anti-Spoofing Feature for VFs +------------------------------------------ +When a malicious driver on a Virtual Function (VF) interface attempts to send a +spoofed packet, it is dropped by the hardware and not transmitted. + +NOTE: This feature can be disabled for a specific VF:: + + # ip link set <ethX> vf <vf id> spoofchk {off|on} + + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ifconfig command to increase the MTU size. For example, enter the +following where <ethX> is the interface number:: + + # ifconfig <ethX> mtu 9000 up + +Alternatively, you can use the ip command as follows:: + + # ip link set mtu 9000 dev <ethX> + # ip link set up dev <ethX> + +This setting is not saved across reboots. + + +NOTE: The maximum MTU setting for jumbo frames is 9702. This corresponds to the +maximum jumbo frame size of 9728 bytes. + +NOTE: This driver will attempt to use multiple page sized buffers to receive +each jumbo packet. This should help to avoid buffer starvation issues when +allocating receive packets. + +NOTE: Packet loss may have a greater impact on throughput when you use jumbo +frames. If you observe a drop in performance after enabling jumbo frames, +enabling flow control may mitigate the issue. + + +Speed and Duplex Configuration +------------------------------ +In addressing speed and duplex configuration issues, you need to distinguish +between copper-based adapters and fiber-based adapters. + +In the default mode, an Intel(R) Ethernet Network Adapter using copper +connections will attempt to auto-negotiate with its link partner to determine +the best setting. If the adapter cannot establish link with the link partner +using auto-negotiation, you may need to manually configure the adapter and link +partner to identical settings to establish link and pass packets. This should +only be needed when attempting to link with an older switch that does not +support auto-negotiation or one that has been forced to a specific speed or +duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds +and higher cannot be forced. Use the autonegotiation advertising setting to +manually set devices for 1 Gbps and higher. + +Speed, duplex, and autonegotiation advertising are configured through the +ethtool utility. 
For the latest version, download and install ethtool from the +following website: + + https://kernel.org/pub/software/network/ethtool/ + +To see the speed configurations your device supports, run the following:: + + # ethtool <ethX> + +Caution: Only experienced network administrators should force speed and duplex +or change autonegotiation advertising manually. The settings at the switch must +always match the adapter settings. Adapter performance may suffer or your +adapter may not operate if you configure the adapter differently from your +switch. + + +Data Center Bridging (DCB) +-------------------------- +NOTE: The kernel assumes that TC0 is available, and will disable Priority Flow +Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is +enabled when setting up DCB on your switch. + +DCB is a configuration Quality of Service implementation in hardware. It uses +the VLAN priority tag (802.1p) to filter traffic. That means that there are 8 +different priorities that traffic can be filtered into. It also enables +priority flow control (802.1Qbb) which can limit or eliminate the number of +dropped packets during network stress. Bandwidth can be allocated to each of +these priorities, which is enforced at the hardware level (802.1Qaz). + +DCB is normally configured on the network using the DCBX protocol (802.1Qaz), a +specialization of LLDP (802.1AB). The ice driver supports the following +mutually exclusive variants of DCBX support: + +1) Firmware-based LLDP Agent +2) Software-based LLDP Agent + +In firmware-based mode, firmware intercepts all LLDP traffic and handles DCBX +negotiation transparently for the user. In this mode, the adapter operates in +"willing" DCBX mode, receiving DCB settings from the link partner (typically a +switch). The local user can only query the negotiated DCB configuration. For +information on configuring DCBX parameters on a switch, please consult the +switch manufacturer's documentation. + +In software-based mode, LLDP traffic is forwarded to the network stack and user +space, where a software agent can handle it. In this mode, the adapter can +operate in either "willing" or "nonwilling" DCBX mode and DCB configuration can +be both queried and set locally. This mode requires the FW-based LLDP Agent to +be disabled. + +NOTE: + +- You can enable and disable the firmware-based LLDP Agent using an ethtool + private flag. Refer to the "FW-LLDP (Firmware Link Layer Discovery Protocol)" + section in this README for more information. +- In software-based DCBX mode, you can configure DCB parameters using software + LLDP/DCBX agents that interface with the Linux kernel's DCB Netlink API. We + recommend using OpenLLDP as the DCBX agent when running in software mode. For + more information, see the OpenLLDP man pages and + https://github.com/intel/openlldp. +- The driver implements the DCB netlink interface layer to allow the user space + to communicate with the driver and query DCB configuration for the port. +- iSCSI with DCB is not supported. + + +FW-LLDP (Firmware Link Layer Discovery Protocol) +------------------------------------------------ +Use ethtool to change FW-LLDP settings. The FW-LLDP setting is per port and +persists across boots. 
+ +To enable LLDP:: + + # ethtool --set-priv-flags <ethX> fw-lldp-agent on + +To disable LLDP:: + + # ethtool --set-priv-flags <ethX> fw-lldp-agent off + +To check the current LLDP setting:: + + # ethtool --show-priv-flags <ethX> + +NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to +take effect. If "LLDP AGENT" is set to disabled, you cannot enable it from the +OS. + + +Flow Control +------------ +Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable +receiving and transmitting pause frames for ice. When transmit is enabled, +pause frames are generated when the receive packet buffer crosses a predefined +threshold. When receive is enabled, the transmit unit will halt for the time +delay specified when a pause frame is received. + +NOTE: You must have a flow control capable link partner. + +Flow Control is disabled by default. + +Use ethtool to change the flow control settings. + +To enable or disable Rx or Tx Flow Control:: + + # ethtool -A <ethX> rx <on|off> tx <on|off> + +Note: This command only enables or disables Flow Control if auto-negotiation is +disabled. If auto-negotiation is enabled, this command changes the parameters +used for auto-negotiation with the link partner. + +Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending +on your device, you may not be able to change the auto-negotiation setting. + +NOTE: + +- The ice driver requires flow control on both the port and link partner. If + flow control is disabled on one of the sides, the port may appear to hang on + heavy traffic. +- You may encounter issues with link-level flow control (LFC) after disabling + DCB. The LFC status may show as enabled but traffic is not paused. To resolve + this issue, disable and reenable LFC using ethtool:: + + # ethtool -A <ethX> rx off tx off + # ethtool -A <ethX> rx on tx on + + +NAPI +---- +This driver supports NAPI (Rx polling mode). +For more information on NAPI, see +https://www.linuxfoundation.org/collaborate/workgroups/networking/napi + + +MACVLAN +------- +This driver supports MACVLAN. Kernel support for MACVLAN can be tested by +checking if the MACVLAN driver is loaded. You can run 'lsmod | grep macvlan' to +see if the MACVLAN driver is loaded or run 'modprobe macvlan' to try to load +the MACVLAN driver. + +NOTE: + +- In passthru mode, you can only set up one MACVLAN device. It will inherit the + MAC address of the underlying PF (Physical Function) device. + + +IEEE 802.1ad (QinQ) Support +--------------------------- +The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN +IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as +"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks +allow L2 tunneling and the ability to segregate traffic within a particular +VLAN ID, among other uses. + +NOTES: + +- Receive checksum offloads and VLAN acceleration are not supported for 802.1ad + (QinQ) packets. + +- 0x88A8 traffic will not be received unless VLAN stripping is disabled with + the following command:: + + # ethtool -K <ethX> rxvlan off + +- 0x88A8/0x8100 double VLANs cannot be used with 0x8100 or 0x8100/0x8100 VLANS + configured on the same port. 0x88a8/0x8100 traffic will not be received if + 0x8100 VLANs are configured. + +- The VF can only transmit 0x88A8/0x8100 (i.e., 802.1ad/802.1Q) traffic if: + + 1) The VF is not assigned a port VLAN. + 2) spoofchk is disabled from the PF. 
If you enable spoofchk, the VF will + not transmit 0x88A8/0x8100 traffic. + +- The VF may not receive all network traffic based on the Inner VLAN header + when VF true promiscuous mode (vf-true-promisc-support) and double VLANs are + enabled in SR-IOV mode. + +The following are examples of how to configure 802.1ad (QinQ):: + + # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24 + # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371 + + Where "24" and "371" are example VLAN IDs. + + +Tunnel/Overlay Stateless Offloads +--------------------------------- +Supported tunnels and overlays include VXLAN, GENEVE, and others depending on +hardware and software configuration. Stateless offloads are enabled by default. + +To view the current state of all offloads:: + + # ethtool -k <ethX> + + +UDP Segmentation Offload +------------------------ +Allows the adapter to offload transmit segmentation of UDP packets with +payloads up to 64K into valid Ethernet frames. Because the adapter hardware is +able to complete data segmentation much faster than operating system software, +this feature may improve transmission performance. +In addition, the adapter may use fewer CPU resources. + +NOTE: + +- The application sending UDP packets must support UDP segmentation offload. + +To enable/disable UDP Segmentation Offload, issue the following command:: + + # ethtool -K <ethX> tx-udp-segmentation [off|on] + +GNSS module +----------- +Allows user to read messages from the GNSS module and write supported commands. +If the module is physically present, driver creates 2 TTYs for each supported +device in /dev, ttyGNSS_<device>:<function>_0 and _1. First one (_0) is RW and +the second one is RO. +The protocol of write commands is dependent on the GNSS module as the driver +writes raw bytes from the TTY to the GNSS i2c. Please refer to the module +documentation for details. + +Performance Optimization +======================== +Driver defaults are meant to fit a wide variety of workloads, but if further +optimization is required, we recommend experimenting with the following +settings. + + +Rx Descriptor Ring Size +----------------------- +To reduce the number of Rx packet discards, increase the number of Rx +descriptors for each Rx ring using ethtool. + + Check if the interface is dropping Rx packets due to buffers being full + (rx_dropped.nic can mean that there is no PCIe bandwidth):: + + # ethtool -S <ethX> | grep "rx_dropped" + + If the previous command shows drops on queues, it may help to increase + the number of descriptors using 'ethtool -G':: + + # ethtool -G <ethX> rx <N> + Where <N> is the desired number of ring entries/descriptors + + This can provide temporary buffering for issues that create latency while + the CPUs process descriptors. + + +Interrupt Rate Limiting +----------------------- +This driver supports an adaptive interrupt throttle rate (ITR) mechanism that +is tuned for general workloads. The user can customize the interrupt rate +control for specific workloads, via ethtool, adjusting the number of +microseconds between interrupts. + +To set the interrupt rate manually, you must disable adaptive mode:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off + +For lower CPU utilization: + + Disable adaptive ITR and lower Rx and Tx interrupts. The examples below + affect every queue of the specified interface. 
+ + Setting rx-usecs and tx-usecs to 80 will limit interrupts to about + 12,500 interrupts per second per queue:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80 tx-usecs 80 + +For reduced latency: + + Disable adaptive ITR and ITR by setting rx-usecs and tx-usecs to 0 + using ethtool:: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0 + +Per-queue interrupt rate settings: + + The following examples are for queues 1 and 3, but you can adjust other + queues. + + To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or + about 100,000 interrupts/second, for queues 1 and 3:: + + # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off + rx-usecs 10 + + To show the current coalesce settings for queues 1 and 3:: + + # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce + +Bounding interrupt rates using rx-usecs-high: + + :Valid Range: 0-236 (0=no limit) + + The range of 0-236 microseconds provides an effective range of 4,237 to + 250,000 interrupts per second. The value of rx-usecs-high can be set + independently of rx-usecs and tx-usecs in the same ethtool command, and is + also independent of the adaptive interrupt moderation algorithm. The + underlying hardware supports granularity in 4-microsecond intervals, so + adjacent values may result in the same interrupt rate. + + The following command would disable adaptive interrupt moderation, and allow + a maximum of 5 microseconds before indicating a receive or transmit was + complete. However, instead of resulting in as many as 200,000 interrupts per + second, it limits total interrupts per second to 50,000 via the rx-usecs-high + parameter. + + :: + + # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs-high 20 + rx-usecs 5 tx-usecs 5 + + +Virtualized Environments +------------------------ +In addition to the other suggestions in this section, the following may be +helpful to optimize performance in VMs. + + Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to + individual LCPUs, making sure to use a set of CPUs included in the + device's local_cpulist: ``/sys/class/net/<ethX>/device/local_cpulist``. + + Configure as many Rx/Tx queues in the VM as available. (See the iavf driver + documentation for the number of queues supported.) For example:: + + # ethtool -L <virt_interface> rx <max> tx <max> + + +Support +======= +For general information, go to the Intel support website at: +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net. + + +Trademarks +========== +Intel is a trademark or registered trademark of Intel Corporation or its +subsidiaries in the United States and/or other countries. + +* Other names and brands may be claimed as the property of others. diff --git a/Documentation/networking/device_drivers/ethernet/intel/igb.rst b/Documentation/networking/device_drivers/ethernet/intel/igb.rst new file mode 100644 index 000000000..d46289e18 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/igb.rst @@ -0,0 +1,213 @@ +.. 
SPDX-License-Identifier: GPL-2.0+ + +========================================================== +Linux Base Driver for Intel(R) Ethernet Network Connection +========================================================== + +Intel Gigabit Linux driver. +Copyright(c) 1999-2018 Intel Corporation. + +Contents +======== + +- Identifying Your Adapter +- Command Line Parameters +- Additional Configurations +- Support + + +Identifying Your Adapter +======================== +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + + +Command Line Parameters +======================== +If the driver is built as a module, the following optional parameters are used +by entering them on the command line with the modprobe command using this +syntax:: + + modprobe igb [<option>=<VAL1>,<VAL2>,...] + +There needs to be a <VAL#> for each network port in the system supported by +this driver. The values will be applied to each instance, in function order. +For example:: + + modprobe igb max_vfs=2,4 + +In this case, there are two network ports supported by igb in the system. + +NOTE: A descriptor describes a data buffer and attributes related to the data +buffer. This information is accessed by the hardware. + +max_vfs +------- +:Valid Range: 0-7 + +This parameter adds support for SR-IOV. It causes the driver to spawn up to +max_vfs worth of virtual functions. If the value is greater than 0 it will +also force the VMDq parameter to be 1 or more. + +The parameters for the driver are referenced by position. Thus, if you have a +dual port adapter, or more than one adapter in your system, and want N virtual +functions per port, you must specify a number for each port with each parameter +separated by a comma. For example:: + + modprobe igb max_vfs=4 + +This will spawn 4 VFs on the first port. + +:: + + modprobe igb max_vfs=2,4 + +This will spawn 2 VFs on the first port and 4 VFs on the second port. + +NOTE: Caution must be used in loading the driver with these parameters. +Depending on your system configuration, number of slots, etc., it is impossible +to predict in all cases where the positions would be on the command line. + +NOTE: Neither the device nor the driver control how VFs are mapped into config +space. Bus layout will vary by operating system. On operating systems that +support it, you can check sysfs to find the mapping. + +NOTE: When either SR-IOV mode or VMDq mode is enabled, hardware VLAN filtering +and VLAN tag stripping/insertion will remain enabled. Please remove the old +VLAN filter before the new VLAN filter is added. For example:: + + ip link set eth0 vf 0 vlan 100 // set vlan 100 for VF 0 + ip link set eth0 vf 0 vlan 0 // Delete vlan 100 + ip link set eth0 vf 0 vlan 200 // set a new vlan 200 for VF 0 + +Debug +----- +:Valid Range: 0-16 (0=none,...,16=all) +:Default Value: 0 + +This parameter adjusts the level debug messages displayed in the system logs. + + +Additional Features and Configurations +====================================== + +Jumbo Frames +------------ +Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU) +to a value larger than the default value of 1500. + +Use the ifconfig command to increase the MTU size. 
For example, enter the +following where <x> is the interface number:: + + ifconfig eth<x> mtu 9000 up + +Alternatively, you can use the ip command as follows:: + + ip link set mtu 9000 dev eth<x> + ip link set up dev eth<x> + +This setting is not saved across reboots. The setting change can be made +permanent by adding 'MTU=9000' to the file: + +- For RHEL: /etc/sysconfig/network-scripts/ifcfg-eth<x> +- For SLES: /etc/sysconfig/network/<config_file> + +NOTE: The maximum MTU setting for Jumbo Frames is 9216. This value coincides +with the maximum Jumbo Frames size of 9234 bytes. + +NOTE: Using Jumbo frames at 10 or 100 Mbps is not supported and may result in +poor performance or loss of link. + + +ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The latest ethtool +version is required for this functionality. Download it at: + +https://www.kernel.org/pub/software/network/ethtool/ + + +Enabling Wake on LAN (WoL) +-------------------------- +WoL is configured through the ethtool utility. + +WoL will be enabled on the system during the next shut down or reboot. For +this driver version, in order to enable WoL, the igb driver must be loaded +prior to shutting down or suspending the system. + +NOTE: Wake on LAN is only supported on port A of multi-port devices. Also +Wake On LAN is not supported for the following device: +- Intel(R) Gigabit VT Quad Port Server Adapter + + +Multiqueue +---------- +In this mode, a separate MSI-X vector is allocated for each queue and one for +"other" interrupts such as link status change and errors. All interrupts are +throttled via interrupt moderation. Interrupt moderation must be used to avoid +interrupt storms while the driver is processing one interrupt. The moderation +value should be at least as large as the expected time for the driver to +process an interrupt. Multiqueue is off by default. + +REQUIREMENTS: MSI-X support is required for Multiqueue. If MSI-X is not found, +the system will fallback to MSI or to Legacy interrupts. This driver supports +receive multiqueue on all kernels that support MSI-X. + +NOTE: On some kernels a reboot is required to switch between single queue mode +and multiqueue mode or vice-versa. + + +MAC and VLAN anti-spoofing feature +---------------------------------- +When a malicious driver attempts to send a spoofed packet, it is dropped by the +hardware and not transmitted. + +An interrupt is sent to the PF driver notifying it of the spoof attempt. When a +spoofed packet is detected, the PF driver will send the following message to +the system log (displayed by the "dmesg" command): +Spoof event(s) detected on VF(n), where n = the VF that attempted to do the +spoofing + + +Setting MAC Address, VLAN and Rate Limit Using IProute2 Tool +------------------------------------------------------------ +You can set a MAC address of a Virtual Function (VF), a default VLAN and the +rate limit using the IProute2 tool. Download the latest version of the +IProute2 tool from Sourceforge if your version does not have all the features +you require. + +Credit Based Shaper (Qav Mode) +------------------------------ +When enabling the CBS qdisc in the hardware offload mode, traffic shaping using +the CBS (described in the IEEE 802.1Q-2018 Section 8.6.8.2 and discussed in the +Annex L) algorithm will run in the i210 controller, so it's more accurate and +uses less CPU. 
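+
+The following is only an illustrative sketch of how an offloaded CBS
+configuration is typically created with the tc utility (the interface name
+eth0 and the shaping values are assumptions, not part of this document; see
+the tc-cbs and tc-mqprio man pages). An mqprio qdisc first exposes the
+hardware queues, then CBS is attached to one of them with offload enabled::
+
+    tc qdisc replace dev eth0 parent root handle 100 mqprio \
+        num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
+        queues 1@0 1@1 2@2 hw 0
+    tc qdisc replace dev eth0 parent 100:1 cbs \
+        idleslope 20000 sendslope -980000 hicredit 30 locredit -1470 \
+        offload 1
+
+In this sketch, idleslope reserves roughly 20 Mbps for the queue, and
+sendslope is idleslope minus the (assumed 1 Gbps) link rate.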
+
+When using offloaded CBS, and the traffic rate obeys the configured rate
+(doesn't go above it), CBS should have little to no effect on latency.
+
+The offloaded version of the algorithm has some limits, caused by how the idle
+slope is expressed in the adapter's registers. It can only represent idle slopes
+in 16.38431 kbps units, which means that if an idle slope of 2576 kbps is
+requested, the controller will be configured to use an idle slope of ~2589 kbps,
+because the driver rounds the value up. For more details, see the comments on
+:c:func:`igb_config_tx_modes()`.
+
+NOTE: This feature is exclusive to i210 models.
+
+
+Support
+=======
+For general information, go to the Intel support website at:
+
+https://www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+https://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the issue
+to e1000-devel@lists.sf.net.
diff --git a/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst b/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst
new file mode 100644
index 000000000..40fa210c5
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/intel/igbvf.rst
@@ -0,0 +1,65 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+===========================================================
+Linux Base Virtual Function Driver for Intel(R) 1G Ethernet
+===========================================================
+
+Intel Gigabit Virtual Function Linux driver.
+Copyright(c) 1999-2018 Intel Corporation.
+
+Contents
+========
+- Identifying Your Adapter
+- Additional Configurations
+- Support
+
+This driver supports Intel 82576-based virtual function devices that can only
+be activated on kernels that support SR-IOV.
+
+SR-IOV requires the correct platform and OS support.
+
+The guest OS loading this driver must support MSI-X interrupts.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your Intel adapter. All hardware requirements listed apply to use
+with Linux.
+
+Driver information can be obtained using ethtool, lspci, and ifconfig.
+Instructions on updating ethtool can be found in the section Additional
+Configurations later in this document.
+
+NOTE: There is a limit of 32 total VLANs shared across 1 or more VFs.
+
+
+Identifying Your Adapter
+========================
+For information on how to identify your adapter, and for the latest Intel
+network drivers, refer to the Intel Support website:
+https://www.intel.com/support
+
+
+Additional Features and Configurations
+======================================
+
+ethtool
+-------
+The driver utilizes the ethtool interface for driver configuration and
+diagnostics, as well as displaying statistical information. The latest ethtool
+version is required for this functionality. Download it at:
+
+https://www.kernel.org/pub/software/network/ethtool/
+
+
+Support
+=======
+For general information, go to the Intel support website at:
+
+https://www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+https://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the issue
+to e1000-devel@lists.sf.net.
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst new file mode 100644 index 000000000..c6a233e68 --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/ixgb.rst @@ -0,0 +1,468 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +===================================================================== +Linux Base Driver for 10 Gigabit Intel(R) Ethernet Network Connection +===================================================================== + +October 1, 2018 + + +Contents +======== + +- In This Release +- Identifying Your Adapter +- Command Line Parameters +- Improving Performance +- Additional Configurations +- Known Issues/Troubleshooting +- Support + + + +In This Release +=============== + +This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R) +Network Connection. This driver includes support for Itanium(R)2-based +systems. + +For questions related to hardware requirements, refer to the documentation +supplied with your 10 Gigabit adapter. All hardware requirements listed apply +to use with Linux. + +The following features are available in this kernel: + - Native VLANs + - Channel Bonding (teaming) + - SNMP + +Channel Bonding documentation can be found in the Linux kernel source: +/Documentation/networking/bonding.rst + +The driver information previously displayed in the /proc filesystem is not +supported in this release. Alternatively, you can use ethtool (version 1.6 +or later), lspci, and iproute2 to obtain the same information. + +Instructions on updating ethtool can be found in the section "Additional +Configurations" later in this document. + + +Identifying Your Adapter +======================== + +The following Intel network adapters are compatible with the drivers in this +release: + ++------------+------------------------------+----------------------------------+ +| Controller | Adapter Name | Physical Layer | ++============+==============================+==================================+ +| 82597EX | Intel(R) PRO/10GbE LR/SR/CX4 | - 10G Base-LR (fiber) | +| | Server Adapters | - 10G Base-SR (fiber) | +| | | - 10G Base-CX4 (copper) | ++------------+------------------------------+----------------------------------+ + +For more information on how to identify your adapter, go to the Adapter & +Driver ID Guide at: + + https://support.intel.com + + +Command Line Parameters +======================= + +If the driver is built as a module, the following optional parameters are +used by entering them on the command line with the modprobe command using +this syntax:: + + modprobe ixgb [<option>=<VAL1>,<VAL2>,...] + +For example, with two 10GbE PCI adapters, entering:: + + modprobe ixgb TxDescriptors=80,128 + +loads the ixgb driver with 80 TX resources for the first adapter and 128 TX +resources for the second adapter. + +The default value for each parameter is generally the recommended setting, +unless otherwise noted. + +Copybreak +--------- +:Valid Range: 0-XXXX +:Default Value: 256 + + This is the maximum size of packet that is copied to a new buffer on + receive. + +Debug +----- +:Valid Range: 0-16 (0=none,...,16=all) +:Default Value: 0 + + This parameter adjusts the level of debug messages displayed in the + system logs. 
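+
+If you want to confirm the exact parameter names and descriptions exported by
+the ixgb module on your system (a generic sketch, not tied to any single
+parameter above), you can query the module directly::
+
+    modinfo -p ixgb
+    ls /sys/module/ixgb/parameters/
+
+The sysfs listing is only available while the module is loaded.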
+ +FlowControl +----------- +:Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx) +:Default Value: 1 if no EEPROM, otherwise read from EEPROM + + This parameter controls the automatic generation(Tx) and response(Rx) to + Ethernet PAUSE frames. There are hardware bugs associated with enabling + Tx flow control so beware. + +RxDescriptors +------------- +:Valid Range: 64-4096 +:Default Value: 1024 + + This value is the number of receive descriptors allocated by the driver. + Increasing this value allows the driver to buffer more incoming packets. + Each descriptor is 16 bytes. A receive buffer is also allocated for + each descriptor and can be either 2048, 4056, 8192, or 16384 bytes, + depending on the MTU setting. When the MTU size is 1500 or less, the + receive buffer size is 2048 bytes. When the MTU is greater than 1500 the + receive buffer size will be either 4056, 8192, or 16384 bytes. The + maximum MTU size is 16114. + +TxDescriptors +------------- +:Valid Range: 64-4096 +:Default Value: 256 + + This value is the number of transmit descriptors allocated by the driver. + Increasing this value allows the driver to queue more transmits. Each + descriptor is 16 bytes. + +RxIntDelay +---------- +:Valid Range: 0-65535 (0=off) +:Default Value: 72 + + This value delays the generation of receive interrupts in units of + 0.8192 microseconds. Receive interrupt reduction can improve CPU + efficiency if properly tuned for specific network traffic. Increasing + this value adds extra latency to frame reception and can end up + decreasing the throughput of TCP traffic. If the system is reporting + dropped receives, this value may be set too high, causing the driver to + run out of available receive descriptors. + +TxIntDelay +---------- +:Valid Range: 0-65535 (0=off) +:Default Value: 32 + + This value delays the generation of transmit interrupts in units of + 0.8192 microseconds. Transmit interrupt reduction can improve CPU + efficiency if properly tuned for specific network traffic. Increasing + this value adds extra latency to frame transmission and can end up + decreasing the throughput of TCP traffic. If this value is set too high, + it will cause the driver to run out of available transmit descriptors. + +XsumRX +------ +:Valid Range: 0-1 +:Default Value: 1 + + A value of '1' indicates that the driver should enable IP checksum + offload for received packets (both UDP and TCP) to the adapter hardware. + +RxFCHighThresh +-------------- +:Valid Range: 1,536-262,136 (0x600 - 0x3FFF8, 8 byte granularity) +:Default Value: 196,608 (0x30000) + + Receive Flow control high threshold (when we send a pause frame) + +RxFCLowThresh +------------- +:Valid Range: 64-262,136 (0x40 - 0x3FFF8, 8 byte granularity) +:Default Value: 163,840 (0x28000) + + Receive Flow control low threshold (when we send a resume frame) + +FCReqTimeout +------------ +:Valid Range: 1-65535 +:Default Value: 65535 + + Flow control request timeout (how long to pause the link partner's tx) + +IntDelayEnable +-------------- +:Value Range: 0,1 +:Default Value: 1 + + Interrupt Delay, 0 disables transmit interrupt delay and 1 enables it. + + +Improving Performance +===================== + +With the 10 Gigabit server adapters, the default Linux configuration will +very likely limit the total available throughput artificially. There is a set +of configuration changes that, when applied together, will increase the ability +of Linux to transmit and receive data. 
The following enhancements were +originally acquired from settings published at https://www.spec.org/web99/ for +various submitted results using Linux. + +NOTE: + These changes are only suggestions, and serve as a starting point for + tuning your network performance. + +The changes are made in three major ways, listed in order of greatest effect: + +- Use ip link to modify the mtu (maximum transmission unit) and the txqueuelen + parameter. +- Use sysctl to modify /proc parameters (essentially kernel tuning) +- Use setpci to modify the MMRBC field in PCI-X configuration space to increase + transmit burst lengths on the bus. + +NOTE: + setpci modifies the adapter's configuration registers to allow it to read + up to 4k bytes at a time (for transmits). However, for some systems the + behavior after modifying this register may be undefined (possibly errors of + some kind). A power-cycle, hard reset or explicitly setting the e6 register + back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a + stable configuration. + +- COPY these lines and paste them into ixgb_perf.sh: + +:: + + #!/bin/bash + echo "configuring network performance , edit this file to change the interface + or device ID of 10GbE card" + # set mmrbc to 4k reads, modify only Intel 10GbE device IDs + # replace 1a48 with appropriate 10GbE device's ID installed on the system, + # if needed. + setpci -d 8086:1a48 e6.b=2e + # set the MTU (max transmission unit) - it requires your switch and clients + # to change as well. + # set the txqueuelen + # your ixgb adapter should be loaded as eth1 for this to work, change if needed + ip li set dev eth1 mtu 9000 txqueuelen 1000 up + # call the sysctl utility to modify /proc/sys entries + sysctl -p ./sysctl_ixgb.conf + +- COPY these lines and paste them into sysctl_ixgb.conf: + +:: + + # some of the defaults may be different for your kernel + # call this file with sysctl -p <this file> + # these are just suggested values that worked well to increase throughput in + # several network benchmark tests, your mileage may vary + + ### IPV4 specific settings + # turn TCP timestamp support off, default 1, reduces CPU use + net.ipv4.tcp_timestamps = 0 + # turn SACK support off, default on + # on systems with a VERY fast bus -> memory interface this is the big gainer + net.ipv4.tcp_sack = 0 + # set min/default/max TCP read buffer, default 4096 87380 174760 + net.ipv4.tcp_rmem = 10000000 10000000 10000000 + # set min/pressure/max TCP write buffer, default 4096 16384 131072 + net.ipv4.tcp_wmem = 10000000 10000000 10000000 + # set min/pressure/max TCP buffer space, default 31744 32256 32768 + net.ipv4.tcp_mem = 10000000 10000000 10000000 + + ### CORE settings (mostly for socket and UDP effect) + # set maximum receive socket buffer size, default 131071 + net.core.rmem_max = 524287 + # set maximum send socket buffer size, default 131071 + net.core.wmem_max = 524287 + # set default receive socket buffer size, default 65535 + net.core.rmem_default = 524287 + # set default send socket buffer size, default 65535 + net.core.wmem_default = 524287 + # set maximum amount of option memory buffers, default 10240 + net.core.optmem_max = 524287 + # set number of unprocessed input packets before kernel starts dropping them; default 300 + net.core.netdev_max_backlog = 300000 + +Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface +your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's +ID installed on the system. 
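+
+As a sketch of how to confirm the PCI device ID before editing the script (the
+8086:1a48 ID is just the example used above), list Intel Ethernet devices with
+lspci and then run the script::
+
+    lspci -d 8086: -nn | grep -i ethernet
+    chmod +x ixgb_perf.sh
+    ./ixgb_perf.sh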
+ +NOTE: + Unless these scripts are added to the boot process, these changes will + only last only until the next system reboot. + + +Resolving Slow UDP Traffic +-------------------------- +If your server does not seem to be able to receive UDP traffic as fast as it +can receive TCP traffic, it could be because Linux, by default, does not set +the network stack buffers as large as they need to be to support high UDP +transfer rates. One way to alleviate this problem is to allow more memory to +be used by the IP stack to store incoming data. + +For instance, use the commands:: + + sysctl -w net.core.rmem_max=262143 + +and:: + + sysctl -w net.core.rmem_default=262143 + +to increase the read buffer memory max and default to 262143 (256k - 1) from +defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables +will increase the amount of memory used by the network stack for receives, and +can be increased significantly more if necessary for your application. + + +Additional Configurations +========================= + +Configuring the Driver on Different Distributions +------------------------------------------------- +Configuring a network driver to load properly when the system is started is +distribution dependent. Typically, the configuration process involves adding +an alias line to /etc/modprobe.conf as well as editing other system startup +scripts and/or configuration files. Many popular Linux distributions ship +with tools to make these changes for you. To learn the proper way to +configure a network device for your system, refer to your distribution +documentation. If during this process you are asked for the driver or module +name, the name for the Linux Base Driver for the Intel 10GbE Family of +Adapters is ixgb. + +Viewing Link Messages +--------------------- +Link messages will not be displayed to the console if the distribution is +restricting system messages. In order to see network driver link messages on +your console, set dmesg to eight by entering the following:: + + dmesg -n 8 + +NOTE: This setting is not saved across reboots. + +Jumbo Frames +------------ +The driver supports Jumbo Frames for all adapters. Jumbo Frames support is +enabled by changing the MTU to a value larger than the default of 1500. +The maximum value for the MTU is 16114. Use the ip command to +increase the MTU size. For example:: + + ip li set dev ethx mtu 9000 + +The maximum MTU setting for Jumbo Frames is 16114. This value coincides +with the maximum Jumbo Frames size of 16128. + +Ethtool +------- +The driver utilizes the ethtool interface for driver configuration and +diagnostics, as well as displaying statistical information. The ethtool +version 1.6 or later is required for this functionality. + +The latest release of ethtool can be found from +https://www.kernel.org/pub/software/network/ethtool/ + +NOTE: + The ethtool version 1.6 only supports a limited set of ethtool options. + Support for a more complete ethtool feature set can be enabled by + upgrading to the latest version. + +NAPI +---- +NAPI (Rx polling mode) is supported in the ixgb driver. + +See https://wiki.linuxfoundation.org/networking/napi for more information on +NAPI. + + +Known Issues/Troubleshooting +============================ + +NOTE: + After installing the driver, if your Intel Network Connection is not + working, verify in the "In This Release" section of the readme that you have + installed the correct driver. 
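+
+One quick way to verify which driver and version an interface is actually
+using is ethtool's driver information query (a sketch; replace ethx with your
+interface name)::
+
+    ethtool -i ethx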
+
+Cable Interoperability Issue with Fujitsu XENPAK Module in SmartBits Chassis
+------------------------------------------------------------------------------
+Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
+Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
+chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
+The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
+Server adapter or the SmartBits. If this situation occurs, using a different
+cable assembly may resolve the issue.
+
+Cable Interoperability Issues with HP Procurve 3400cl Switch Port
+-------------------------------------------------------------------
+Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
+adapter is connected to an HP Procurve 3400cl switch port using short cables
+(1 m or shorter). If this situation occurs, using a longer cable may resolve
+the issue.
+
+Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that
+are 10 m or longer, or when using a Leoni 15 m/24AWG cable assembly. The CRC
+errors may be received either by the CX4 Server adapter or at the switch. If
+this situation occurs, using a different cable assembly may resolve the issue.
+
+Jumbo Frames System Requirement
+-------------------------------
+Memory allocation failures have been observed on Linux systems with 64 MB
+of RAM or less that are running Jumbo Frames. If you are using Jumbo
+Frames, your system may require more than the advertised minimum
+requirement of 64 MB of system memory.
+
+Performance Degradation with Jumbo Frames
+-----------------------------------------
+Degradation in throughput performance may be observed in some Jumbo frames
+environments. If this is observed, increasing the application's socket buffer
+size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
+See the specific application manual and /usr/src/linux*/Documentation/
+networking/ip-sysctl.txt for more details.
+
+Allocating Rx Buffers when Using Jumbo Frames
+---------------------------------------------
+Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
+the available memory is heavily fragmented. This issue may be seen with PCI-X
+adapters or with packet split disabled. This can be reduced or eliminated
+by changing the amount of available memory for receive buffer allocation, by
+increasing /proc/sys/vm/min_free_kbytes.
+
+Multiple Interfaces on Same Ethernet Broadcast Network
+------------------------------------------------------
+Due to the default ARP behavior on Linux, it is not possible to have
+one system on two IP networks in the same Ethernet broadcast domain
+(non-partitioned switch) behave as expected. All Ethernet interfaces
+will respond to IP traffic for any IP address assigned to the system.
+This results in unbalanced receive traffic.
+
+If you have multiple interfaces in a server, do either of the following:
+
+  - Turn on ARP filtering by entering::
+
+      echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+
+  - Install the interfaces in separate broadcast domains - either in
+    different switches or in a switch partitioned to VLANs.
+
+UDP Stress Test Dropped Packet Issue
+--------------------------------------
+Under a small-packet UDP stress test with the 10GbE driver, the Linux system
+may drop UDP packets because the socket buffers are full. You may want
+to change the driver's Flow Control variables to the minimum value for
+controlling packet reception.
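+
+As a hedged sketch of one way to do that, the driver could be reloaded with
+the receive flow control thresholds described earlier in this document set to
+their minimum documented values (adjust or repeat the per-port values as
+needed for your system)::
+
+    rmmod ixgb
+    modprobe ixgb RxFCHighThresh=1536 RxFCLowThresh=64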
+ +Tx Hangs Possible Under Stress +------------------------------ +Under stress conditions, if TX hangs occur, turning off TSO +"ethtool -K eth0 tso off" may resolve the problem. + + +Support +======= +For general information, go to the Intel support website at: + +https://www.intel.com/support/ + +or the Intel Wired Networking project hosted by Sourceforge at: + +https://sourceforge.net/projects/e1000 + +If an issue is identified with the released source code on a supported kernel +with a supported adapter, email the specific information related to the issue +to e1000-devel@lists.sf.net diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst new file mode 100644 index 000000000..0a233b17c --- /dev/null +++ b/Documentation/networking/device_drivers/ethernet/intel/ixgbe.rst @@ -0,0 +1,557 @@ +.. SPDX-License-Identifier: GPL-2.0+ + +=========================================================================== +Linux Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express Adapters +=========================================================================== + +Intel 10 Gigabit Linux driver. +Copyright(c) 1999-2018 Intel Corporation. + +Contents +======== + +- Identifying Your Adapter +- Command Line Parameters +- Additional Configurations +- Known Issues +- Support + +Identifying Your Adapter +======================== +The driver is compatible with devices based on the following: + + * Intel(R) Ethernet Controller 82598 + * Intel(R) Ethernet Controller 82599 + * Intel(R) Ethernet Controller X520 + * Intel(R) Ethernet Controller X540 + * Intel(R) Ethernet Controller x550 + * Intel(R) Ethernet Controller X552 + * Intel(R) Ethernet Controller X553 + +For information on how to identify your adapter, and for the latest Intel +network drivers, refer to the Intel Support website: +https://www.intel.com/support + +SFP+ Devices with Pluggable Optics +---------------------------------- + +82599-BASED ADAPTERS +~~~~~~~~~~~~~~~~~~~~ +NOTES: +- If your 82599-based Intel(R) Network Adapter came with Intel optics or is an +Intel(R) Ethernet Server Adapter X520-2, then it only supports Intel optics +and/or the direct attach cables listed below. +- When 82599-based SFP+ devices are connected back to back, they should be set +to the same Speed setting via ethtool. Results may vary if you mix speed +settings. 
+ ++---------------+---------------------------------------+------------------+ +| Supplier | Type | Part Numbers | ++===============+=======================================+==================+ +| SR Modules | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ SR (bailed) | FTLX8571D3BCV-IT | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ SR (bailed) | AFBR-703SDZ-IN2 | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ SR (bailed) | AFBR-703SDDZ-IN1 | ++---------------+---------------------------------------+------------------+ +| LR Modules | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ LR (bailed) | FTLX1471D3BCV-IT | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ LR (bailed) | AFCT-701SDZ-IN2 | ++---------------+---------------------------------------+------------------+ +| Intel | DUAL RATE 1G/10G SFP+ LR (bailed) | AFCT-701SDDZ-IN1 | ++---------------+---------------------------------------+------------------+ + +The following is a list of 3rd party SFP+ modules that have received some +testing. Not all modules are applicable to all devices. + ++---------------+---------------------------------------+------------------+ +| Supplier | Type | Part Numbers | ++===============+=======================================+==================+ +| Finisar | SFP+ SR bailed, 10g single rate | FTLX8571D3BCL | ++---------------+---------------------------------------+------------------+ +| Avago | SFP+ SR bailed, 10g single rate | AFBR-700SDZ | ++---------------+---------------------------------------+------------------+ +| Finisar | SFP+ LR bailed, 10g single rate | FTLX1471D3BCL | ++---------------+---------------------------------------+------------------+ +| Finisar | DUAL RATE 1G/10G SFP+ SR (No Bail) | FTLX8571D3QCV-IT | ++---------------+---------------------------------------+------------------+ +| Avago | DUAL RATE 1G/10G SFP+ SR (No Bail) | AFBR-703SDZ-IN1 | ++---------------+---------------------------------------+------------------+ +| Finisar | DUAL RATE 1G/10G SFP+ LR (No Bail) | FTLX1471D3QCV-IT | ++---------------+---------------------------------------+------------------+ +| Avago | DUAL RATE 1G/10G SFP+ LR (No Bail) | AFCT-701SDZ-IN1 | ++---------------+---------------------------------------+------------------+ +| Finisar | 1000BASE-T SFP | FCLF8522P2BTL | ++---------------+---------------------------------------+------------------+ +| Avago | 1000BASE-T | ABCU-5710RZ | ++---------------+---------------------------------------+------------------+ +| HP | 1000BASE-SX SFP | 453153-001 | ++---------------+---------------------------------------+------------------+ + +82599-based adapters support all passive and active limiting direct attach +cables that comply with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. + +Laser turns off for SFP+ when ifconfig ethX down +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +"ifconfig ethX down" turns off the laser for 82599-based SFP+ fiber adapters. +"ifconfig ethX up" turns on the laser. +Alternatively, you can use "ip link set [down/up] dev ethX" to turn the +laser off and on. 
+ + +82599-based QSFP+ Adapters +~~~~~~~~~~~~~~~~~~~~~~~~~~ +NOTES: +- If your 82599-based Intel(R) Network Adapter came with Intel optics, it only +supports Intel optics. +- 82599-based QSFP+ adapters only support 4x10 Gbps connections. 1x40 Gbps +connections are not supported. QSFP+ link partners must be configured for +4x10 Gbps. +- 82599-based QSFP+ adapters do not support automatic link speed detection. +The link speed must be configured to either 10 Gbps or 1 Gbps to match the link +partners speed capabilities. Incorrect speed configurations will result in +failure to link. +- Intel(R) Ethernet Converged Network Adapter X520-Q1 only supports the optics +and direct attach cables listed below. + ++---------------+---------------------------------------+------------------+ +| Supplier | Type | Part Numbers | ++===============+=======================================+==================+ +| Intel | DUAL RATE 1G/10G QSFP+ SRL (bailed) | E10GQSFPSR | ++---------------+---------------------------------------+------------------+ + +82599-based QSFP+ adapters support all passive and active limiting QSFP+ +direct attach cables that comply with SFF-8436 v4.1 specifications. + +82598-BASED ADAPTERS +~~~~~~~~~~~~~~~~~~~~ +NOTES: +- Intel(r) Ethernet Network Adapters that support removable optical modules +only support their original module type (for example, the Intel(R) 10 Gigabit +SR Dual Port Express Module only supports SR optical modules). If you plug in +a different type of module, the driver will not load. +- Hot Swapping/hot plugging optical modules is not supported. +- Only single speed, 10 gigabit modules are supported. +- LAN on Motherboard (LOMs) may support DA, SR, or LR modules. Other module +types are not supported. Please see your system documentation for details. + +The following is a list of SFP+ modules and direct attach cables that have +received some testing. Not all modules are applicable to all devices. + ++---------------+---------------------------------------+------------------+ +| Supplier | Type | Part Numbers | ++===============+=======================================+==================+ +| Finisar | SFP+ SR bailed, 10g single rate | FTLX8571D3BCL | ++---------------+---------------------------------------+------------------+ +| Avago | SFP+ SR bailed, 10g single rate | AFBR-700SDZ | ++---------------+---------------------------------------+------------------+ +| Finisar | SFP+ LR bailed, 10g single rate | FTLX1471D3BCL | ++---------------+---------------------------------------+------------------+ + +82598-based adapters support all passive direct attach cables that comply with +SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach cables +are not supported. + +Third party optic modules and cables referred to above are listed only for the +purpose of highlighting third party specifications and potential +compatibility, and are not recommendations or endorsements or sponsorship of +any third party's product by Intel. Intel is not endorsing or promoting +products made by any third party and the third party reference is provided +only to share information regarding certain optic modules and cables with the +above specifications. There may be other manufacturers or suppliers, producing +or supplying optic modules and cables with similar or matching descriptions. +Customers must use their own discretion and diligence to purchase optic +modules and cables from any third party of their choice. 
Customers are solely +responsible for assessing the suitability of the product and/or devices and +for the selection of the vendor for purchasing any product. THE OPTIC MODULES +AND CABLES REFERRED TO ABOVE ARE NOT WARRANTED OR SUPPORTED BY INTEL. INTEL +ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED +WARRANTY, RELATING TO SALE AND/OR USE OF SUCH THIRD PARTY PRODUCTS OR +SELECTION OF VENDOR BY CUSTOMERS. + +Command Line Parameters +======================= + +max_vfs +------- +:Valid Range: 1-63 + +This parameter adds support for SR-IOV. It causes the driver to spawn up to +max_vfs worth of virtual functions. +If the value is greater than 0 it will also force the VMDq parameter to be 1 or +more. + +NOTE: This parameter is only used on kernel 3.7.x and below. On kernel 3.8.x +and above, use sysfs to enable VFs. Also, for Red Hat distributions, this +parameter is only used on version 6.6 and older. For version 6.7 and newer, use +sysfs. For example:: + + #echo $num_vf_enabled > /sys/class/net/$dev/device/sriov_numvfs // enable VFs + #echo 0 > /sys/class/net/$dev/device/sriov_numvfs //disable VFs + +The parameters for the driver are referenced by position. Thus, if you have a +dual port adapter, or more than one adapter in your system, and want N virtual +functions per port, you must specify a number for each port with each parameter +separated by a comma. For example:: + + modprobe ixgbe max_vfs=4 + +This will spawn 4 VFs on the first port. + +:: + + modprobe ixgbe max_vfs=2,4 + +This will spawn 2 VFs on the first port and 4 VFs on the second port. + +NOTE: Caution must be used in loading the driver with these parameters. +Depending on your system configuration, number of slots, etc., it is impossible +to predict in all cases where the positions would be on the command line. + +NOTE: Neither the device nor the driver control how VFs are mapped into config +space. Bus layout will vary by operating system. On operating systems that +support it, you can check sysfs to find the mapping. + +NOTE: When either SR-IOV mode or VMDq mode is enabled, hardware VLAN filtering +and VLAN tag stripping/insertion will remain enabled. Please remove the old +VLAN filter before the new VLAN filter is added. For example, + +:: + + ip link set eth0 vf 0 vlan 100 // set VLAN 100 for VF 0 + ip link set eth0 vf 0 vlan 0 // Delete VLAN 100 + ip link set eth0 vf 0 vlan 200 // set a new VLAN 200 for VF 0 + +With kernel 3.6, the driver supports the simultaneous usage of max_vfs and DCB +features, subject to the constraints described below. Prior to kernel 3.6, the +driver did not support the simultaneous operation of max_vfs greater than 0 and +the DCB features (multiple traffic classes utilizing Priority Flow Control and +Extended Transmission Selection). + +When DCB is enabled, network traffic is transmitted and received through +multiple traffic classes (packet buffers in the NIC). The traffic is associated +with a specific class based on priority, which has a value of 0 through 7 used +in the VLAN tag. When SR-IOV is not enabled, each traffic class is associated +with a set of receive/transmit descriptor queue pairs. The number of queue +pairs for a given traffic class depends on the hardware configuration. When +SR-IOV is enabled, the descriptor queue pairs are grouped into pools. The +Physical Function (PF) and each Virtual Function (VF) is allocated a pool of +receive/transmit descriptor queue pairs. 
When multiple traffic classes are +configured (for example, DCB is enabled), each pool contains a queue pair from +each traffic class. When a single traffic class is configured in the hardware, +the pools contain multiple queue pairs from the single traffic class. + +The number of VFs that can be allocated depends on the number of traffic +classes that can be enabled. The configurable number of traffic classes for +each enabled VF is as follows: +0 - 15 VFs = Up to 8 traffic classes, depending on device support +16 - 31 VFs = Up to 4 traffic classes +32 - 63 VFs = 1 traffic class + +When VFs are configured, the PF is allocated one pool as well. The PF supports +the DCB features with the constraint that each traffic class will only use a +single queue pair. When zero VFs are configured, the PF can support multiple +queue pairs per traffic class. + +allow_unsupported_sfp +--------------------- +:Valid Range: 0,1 +:Default Value: 0 (disabled) + +This parameter allows unsupported and untested SFP+ modules on 82599-based +adapters, as long as the type of module is known to the driver. + +debug +----- +:Valid Range: 0-16 (0=none,...,16=all) +:Default Value: 0 + +This parameter adjusts the level of debug messages displayed in the system +logs. + + +Additional Features and Configurations +====================================== + +Flow Control +------------ +Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable +receiving and transmitting pause frames for ixgbe. When transmit is enabled, +pause frames are generated when the receive packet buffer crosses a predefined +threshold. When receive is enabled, the transmit unit will halt for the time +delay specified when a pause frame is received. + +NOTE: You must have a flow control capable link partner. + +Flow Control is enabled by default. + +Use ethtool to change the flow control settings. To enable or disable Rx or +Tx Flow Control:: + + ethtool -A eth? rx <on|off> tx <on|off> + +Note: This command only enables or disables Flow Control if auto-negotiation is +disabled. If auto-negotiation is enabled, this command changes the parameters +used for auto-negotiation with the link partner. + +To enable or disable auto-negotiation:: + + ethtool -s eth? autoneg <on|off> + +Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending +on your device, you may not be able to change the auto-negotiation setting. + +NOTE: For 82598 backplane cards entering 1 gigabit mode, flow control default +behavior is changed to off. Flow control in 1 gigabit mode on these devices can +lead to transmit hangs. + +Intel(R) Ethernet Flow Director +------------------------------- +The Intel Ethernet Flow Director performs the following tasks: + +- Directs receive packets according to their flows to different queues. +- Enables tight control on routing a flow in the platform. +- Matches flows and CPU cores for flow affinity. +- Supports multiple parameters for flexible flow classification and load + balancing (in SFP mode only). + +NOTE: Intel Ethernet Flow Director masking works in the opposite manner from +subnet masking. In the following command:: + + #ethtool -N eth11 flow-type ip4 src-ip 172.4.1.2 m 255.0.0.0 dst-ip \ + 172.21.1.1 m 255.128.0.0 action 31 + +The src-ip value that is written to the filter will be 0.4.1.2, not 172.0.0.0 +as might be expected. Similarly, the dst-ip value written to the filter will be +0.21.1.1, not 172.0.0.0. 
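+
+In other words, the mask marks the bits to be ignored rather than the bits to
+be matched. As an illustrative sketch (reusing the addresses from the example
+above), a filter intended to match the complete source and destination
+addresses would simply omit the masks::
+
+  #ethtool -N eth11 flow-type ip4 src-ip 172.4.1.2 dst-ip 172.21.1.1 action 31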
To enable or disable the Intel Ethernet Flow Director::

  # ethtool -K ethX ntuple <on|off>

When disabling ntuple filters, all the user programmed filters are flushed from
the driver cache and hardware. All needed filters must be re-added when ntuple
is re-enabled.

To add a filter that directs packets to queue 2, use the -U or -N switch::

  # ethtool -N ethX flow-type tcp4 src-ip 192.168.10.1 dst-ip \
  192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1]

To see the list of filters currently present::

  # ethtool <-u|-n> ethX

Sideband Perfect Filters
------------------------
Sideband Perfect Filters are used to direct traffic that matches specified
characteristics. They are enabled through ethtool's ntuple interface. To add a
new filter, use the following command::

  ethtool -U <device> flow-type <type> src-ip <ip> dst-ip <ip> src-port <port> \
  dst-port <port> action <queue>

Where:

  <device> - the ethernet device to program
  <type> - can be ip4, tcp4, udp4, or sctp4
  <ip> - the IP address to match on
  <port> - the port number to match on
  <queue> - the queue to direct traffic towards (-1 discards the matched traffic)

Use the following command to delete a filter::

  ethtool -U <device> delete <N>

Where <N> is the filter ID displayed when printing all the active filters, and
may also have been specified using "loc <N>" when adding the filter.

The following example matches TCP traffic sent from 192.168.0.1, port 5300,
directed to 192.168.0.5, port 80, and sends it to queue 7::

  ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5 \
  src-port 5300 dst-port 80 action 7

For each flow-type, the programmed filters must all have the same matching
input set. For example, issuing the following two commands is acceptable::

  ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
  ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10

Issuing the next two commands, however, is not acceptable, since the first
specifies src-ip and the second specifies dst-ip::

  ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
  ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10

The second command will fail with an error. You may program multiple filters
with the same fields, using different values, but, on one device, you may not
program two TCP4 filters with different matching fields.

Matching on a sub-portion of a field is not supported by the ixgbe driver;
thus, partial mask fields are not supported.
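Sideband filters can also be used to drop traffic outright. As noted in the
parameter list above, an action of -1 discards matching packets; a minimal
sketch, with a hypothetical interface name and source address, that drops all
UDP traffic arriving from a given host::

  ethtool -U enp130s0 flow-type udp4 src-ip 192.168.0.99 action -1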
To create filters that direct traffic to a specific Virtual Function, use the
"user-def" parameter. Specify the user-def as a 64-bit value, where the lower
32 bits represent the queue number, while the next 8 bits represent which VF.
Note that 0 is the PF, so the VF identifier is offset by 1. For example::

  ... user-def 0x800000002 ...

specifies to direct traffic to Virtual Function 7 (8 minus 1) into queue 2 of
that VF.

Note that these filters will not break internal routing rules, and will not
route traffic that otherwise would not have been sent to the specified Virtual
Function.

Jumbo Frames
------------
Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
to a value larger than the default value of 1500.

Use the ifconfig command to increase the MTU size. For example, enter the
following where <x> is the interface number::

  ifconfig eth<x> mtu 9000 up

Alternatively, you can use the ip command as follows::

  ip link set mtu 9000 dev eth<x>
  ip link set up dev eth<x>

This setting is not saved across reboots. The setting change can be made
permanent by adding 'MTU=9000' to the file::

  /etc/sysconfig/network-scripts/ifcfg-eth<x>  // for RHEL
  /etc/sysconfig/network/<config_file>         // for SLES

NOTE: The maximum MTU setting for Jumbo Frames is 9710. This value coincides
with the maximum Jumbo Frames size of 9728 bytes.

NOTE: This driver will attempt to use multiple page sized buffers to receive
each jumbo packet. This should help to avoid buffer starvation issues when
allocating receive packets.

NOTE: For 82599-based network connections, if you are enabling jumbo frames in
a virtual function (VF), jumbo frames must first be enabled in the physical
function (PF). The VF MTU setting cannot be larger than the PF MTU.

NBASE-T Support
---------------
The ixgbe driver supports NBASE-T on some devices. However, the advertisement
of NBASE-T speeds is suppressed by default, to accommodate broken network
switches which cannot cope with advertised NBASE-T speeds. Use the ethtool
command to enable advertising NBASE-T speeds on devices which support it::

  ethtool -s eth? advertise 0x1800000001028

On Linux systems with INTERFACES(5), this can be specified as a pre-up command
in /etc/network/interfaces so that the interface is always brought up with
NBASE-T support, e.g.::

  iface eth? inet dhcp
       pre-up ethtool -s eth? advertise 0x1800000001028 || true

Generic Receive Offload, aka GRO
--------------------------------
The driver supports the in-kernel software implementation of GRO. GRO has been
shown to significantly reduce CPU utilization under heavy Rx load by coalescing
Rx traffic into larger chunks of data. GRO is an evolution of the
previously-used LRO interface. GRO is able to coalesce other protocols besides
TCP. It's also safe to use with configurations that are problematic for LRO,
namely bridging and iSCSI.
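GRO is controlled through ethtool's standard offload interface. A brief sketch
for checking the current state and toggling it (the interface name is a
placeholder)::

  ethtool -k eth<x> | grep generic-receive-offload  // query current state
  ethtool -K eth<x> gro off                         // disable GRO
  ethtool -K eth<x> gro on                          // re-enable GRO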
Data Center Bridging (DCB)
--------------------------
NOTE:
The kernel assumes that TC0 is available, and will disable Priority Flow
Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
enabled when setting up DCB on your switch.

DCB is a Quality of Service implementation in hardware. It uses the VLAN
priority tag (802.1p) to filter traffic. That means that there are 8 different
priorities that traffic can be filtered into. It also enables priority flow
control (802.1Qbb) which can limit or eliminate the number of dropped packets
during network stress. Bandwidth can be allocated to each of these priorities,
which is enforced at the hardware level (802.1Qaz).

Adapter firmware implements LLDP and DCBX protocol agents as per 802.1AB and
802.1Qaz respectively. The firmware based DCBX agent runs in willing mode only
and can accept settings from a DCBX capable peer. Software configuration of
DCBX parameters via dcbtool/lldptool is not supported.

The ixgbe driver implements the DCB netlink interface layer to allow user-space
to communicate with the driver and query DCB configuration for the port.

ethtool
-------
The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information. The latest ethtool
version is required for this functionality. Download it at:
https://www.kernel.org/pub/software/network/ethtool/

FCoE
----
The ixgbe driver supports Fibre Channel over Ethernet (FCoE) and Data Center
Bridging (DCB). This code has no default effect on the regular driver
operation. Configuring DCB and FCoE is outside the scope of this README. Refer
to http://www.open-fcoe.org/ for FCoE project information and contact
ixgbe-eedc@lists.sourceforge.net for DCB information.

MAC and VLAN anti-spoofing feature
----------------------------------
When a malicious driver attempts to send a spoofed packet, it is dropped by the
hardware and not transmitted.

An interrupt is sent to the PF driver notifying it of the spoof attempt. When a
spoofed packet is detected, the PF driver will send the following message to
the system log (displayed by the "dmesg" command)::

  ixgbe ethX: ixgbe_spoof_check: n spoofed packets detected

where "X" is the PF interface number and "n" is the number of spoofed packets.

NOTE: This feature can be disabled for a specific Virtual Function (VF)::

  ip link set <pf dev> vf <vf id> spoofchk {off|on}

IPsec Offload
-------------
The ixgbe driver supports IPsec Hardware Offload. When creating Security
Associations with "ip xfrm ..." the 'offload' tag option can be used to
register the IPsec SA with the driver in order to achieve higher throughput for
the secure communication.

The offload is also supported for ixgbe's VFs, but the VF must be set as
'trusted' and the support must be enabled with::

  ethtool --set-priv-flags eth<x> vf-ipsec on
  ip link set eth<x> vf <y> trust on
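As an illustration of the 'offload' tag, the following is a minimal sketch of
adding an outbound ESP Security Association registered for hardware offload;
the addresses, SPI, key, and interface name are placeholders, and the algorithm
set accepted by the hardware may vary::

  ip xfrm state add proto esp src 192.168.10.1 dst 192.168.10.2 spi 0x07 \
        mode transport reqid 0x07 replay-window 32 \
        aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
        sel src 192.168.10.1 dst 192.168.10.2 proto tcp \
        offload dev eth<x> dir out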
Known Issues/Troubleshooting
============================

Enabling SR-IOV in a 64-bit Microsoft Windows Server 2012/R2 guest OS
---------------------------------------------------------------------
Linux KVM Hypervisor/VMM supports direct assignment of a PCIe device to a VM.
This includes traditional PCIe devices, as well as SR-IOV-capable devices based
on the Intel Ethernet Controller XL710.


Support
=======
For general information, go to the Intel support website at:

https://www.intel.com/support/

or the Intel Wired Networking project hosted by Sourceforge at:

https://sourceforge.net/projects/e1000

If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net.

diff --git a/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst b/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst
new file mode 100644
index 000000000..76bbde736
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/intel/ixgbevf.rst
@@ -0,0 +1,67 @@

.. SPDX-License-Identifier: GPL-2.0+

============================================================
Linux Base Virtual Function Driver for Intel(R) 10G Ethernet
============================================================

Intel 10 Gigabit Virtual Function Linux driver.
Copyright(c) 1999-2018 Intel Corporation.

Contents
========

- Identifying Your Adapter
- Known Issues
- Support

This driver supports 82599, X540, X550, and X552-based virtual function devices
that can only be activated on kernels that support SR-IOV.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel adapter. All hardware requirements listed apply to use
with Linux.


Identifying Your Adapter
========================
The driver is compatible with devices based on the following:

 * Intel(R) Ethernet Controller 82598
 * Intel(R) Ethernet Controller 82599
 * Intel(R) Ethernet Controller X520
 * Intel(R) Ethernet Controller X540
 * Intel(R) Ethernet Controller X550
 * Intel(R) Ethernet Controller X552
 * Intel(R) Ethernet Controller X553

For information on how to identify your adapter, and for the latest Intel
network drivers, refer to the Intel Support website:
https://www.intel.com/support

Known Issues/Troubleshooting
============================

SR-IOV requires the correct platform and OS support.

The guest OS loading this driver must support MSI-X interrupts.

This driver is only supported as a loadable module at this time. Intel is not
supplying patches against the kernel source to allow for static linking of the
drivers.

VLANs: There is a limit of a total of 64 shared VLANs to 1 or more VFs.


Support
=======
For general information, go to the Intel support website at:

https://www.intel.com/support/

or the Intel Wired Networking project hosted by Sourceforge at:

https://sourceforge.net/projects/e1000

If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net.