160 files changed, 41568 insertions, 0 deletions
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
new file mode 100644
index 000000000..02a323c43
--- /dev/null
+++ b/Documentation/networking/00-INDEX
@@ -0,0 +1,234 @@
+00-INDEX
+	- this file
+3c509.txt
+	- information on the 3Com Etherlink III Series Ethernet cards.
+6pack.txt
+	- info on the 6pack protocol, an alternative to KISS for AX.25
+LICENSE.qla3xxx
+	- GPLv2 for QLogic Linux Networking HBA Driver
+LICENSE.qlge
+	- GPLv2 for QLogic Linux qlge NIC Driver
+LICENSE.qlcnic
+	- GPLv2 for QLogic Linux qlcnic NIC Driver
+PLIP.txt
+	- PLIP: The Parallel Line Internet Protocol device driver
+README.ipw2100
+	- README for the Intel PRO/Wireless 2100 driver.
+README.ipw2200
+	- README for the Intel PRO/Wireless 2915ABG and 2200BG driver.
+README.sb1000
+	- info on General Instrument/NextLevel SURFboard1000 cable modem.
+altera_tse.txt
+	- Altera Triple-Speed Ethernet controller.
+arcnet-hardware.txt
+	- tons of info on ARCnet, hubs, jumper settings for ARCnet cards, etc.
+arcnet.txt
+	- info on the using the ARCnet driver itself.
+atm.txt
+	- info on where to get ATM programs and support for Linux.
+ax25.txt
+	- info on using AX.25 and NET/ROM code for Linux
+baycom.txt
+	- info on the driver for Baycom style amateur radio modems
+bonding.txt
+	- Linux Ethernet Bonding Driver HOWTO: link aggregation in Linux.
+bridge.txt
+	- where to get user space programs for ethernet bridging with Linux.
+cdc_mbim.txt
+	- 3G/LTE USB modem (Mobile Broadband Interface Model)
+checksum-offloads.txt
+	- Explanation of checksum offloads; LCO, RCO
+cops.txt
+	- info on the COPS LocalTalk Linux driver
+cs89x0.txt
+	- the Crystal LAN (CS8900/20-based) Ethernet ISA adapter driver
+cxacru.txt
+	- Conexant AccessRunner USB ADSL Modem
+cxacru-cf.py
+	- Conexant AccessRunner USB ADSL Modem configuration file parser
+cxgb.txt
+	- Release Notes for the Chelsio N210 Linux device driver.
+dccp.txt
+	- the Datagram Congestion Control Protocol (DCCP) (RFC 4340..42).
+dctcp.txt
+	- DataCenter TCP congestion control
+de4x5.txt
+	- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
+decnet.txt
+	- info on using the DECnet networking layer in Linux.
+dl2k.txt
+	- README for D-Link DL2000-based Gigabit Ethernet Adapters (dl2k.ko).
+dm9000.txt
+	- README for the Simtec DM9000 Network driver.
+dmfe.txt
+	- info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver.
+dns_resolver.txt
+	- The DNS resolver module allows kernel servies to make DNS queries.
+driver.txt
+	- Softnet driver issues.
+ena.txt
+	- info on Amazon's Elastic Network Adapter (ENA)
+e100.txt
+	- info on Intel's EtherExpress PRO/100 line of 10/100 boards
+e1000.txt
+	- info on Intel's E1000 line of gigabit ethernet boards
+e1000e.txt
+	- README for the Intel Gigabit Ethernet Driver (e1000e).
+eql.txt
+	- serial IP load balancing
+fib_trie.txt
+	- Level Compressed Trie (LC-trie) notes: a structure for routing.
+filter.txt
+	- Linux Socket Filtering
+fore200e.txt
+	- FORE Systems PCA-200E/SBA-200E ATM NIC driver info.
+framerelay.txt
+	- info on using Frame Relay/Data Link Connection Identifier (DLCI).
+gen_stats.txt
+	- Generic networking statistics for netlink users.
+generic-hdlc.txt
+	- The generic High Level Data Link Control (HDLC) layer.
+generic_netlink.txt
+	- info on Generic Netlink
+gianfar.txt
+	- Gianfar Ethernet Driver.
+i40e.txt
+	- README for the Intel Ethernet Controller XL710 Driver (i40e).
+i40evf.txt
+	- Short note on the Driver for the Intel(R) XL710 X710 Virtual Function
+ieee802154.txt
+	- Linux IEEE 802.15.4 implementation, API and drivers
+igb.txt
+	- README for the Intel Gigabit Ethernet Driver (igb).
+igbvf.txt
+	- README for the Intel Gigabit Ethernet Driver (igbvf).
+ip-sysctl.txt
+	- /proc/sys/net/ipv4/* variables
+ip_dynaddr.txt
+	- IP dynamic address hack e.g. for auto-dialup links
+ipddp.txt
+	- AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
+iphase.txt
+	- Interphase PCI ATM (i)Chip IA Linux driver info.
+ipsec.txt
+	- Note on not compressing IPSec payload and resulting failed policy check.
+ipv6.txt
+	- Options to the ipv6 kernel module.
+ipvs-sysctl.txt
+	- Per-inode explanation of the /proc/sys/net/ipv4/vs interface.
+irda.txt
+	- where to get IrDA (infrared) utilities and info for Linux.
+ixgb.txt
+	- README for the Intel 10 Gigabit Ethernet Driver (ixgb).
+ixgbe.txt
+	- README for the Intel 10 Gigabit Ethernet Driver (ixgbe).
+ixgbevf.txt
+	- README for the Intel Virtual Function (VF) Driver (ixgbevf).
+l2tp.txt
+	- User guide to the L2TP tunnel protocol.
+lapb-module.txt
+	- programming information of the LAPB module.
+ltpc.txt
+	- the Apple or Farallon LocalTalk PC card driver
+mac80211-auth-assoc-deauth.txt
+	- authentication and association / deauth-disassoc with max80211
+mac80211-injection.txt
+	- HOWTO use packet injection with mac80211
+multiqueue.txt
+	- HOWTO for multiqueue network device support.
+netconsole.txt
+	- The network console module netconsole.ko: configuration and notes.
+netdev-features.txt
+	- Network interface features API description.
+netdevices.txt
+	- info on network device driver functions exported to the kernel.
+netif-msg.txt
+	- Design of the network interface message level setting (NETIF_MSG_*).
+netlink_mmap.txt
+	- memory mapped I/O with netlink
+nf_conntrack-sysctl.txt
+	- list of netfilter-sysctl knobs.
+nfc.txt
+	- The Linux Near Field Communication (NFS) subsystem.
+openvswitch.txt
+	- Open vSwitch developer documentation.
+operstates.txt
+	- Overview of network interface operational states.
+packet_mmap.txt
+	- User guide to memory mapped packet socket rings (PACKET_[RT]X_RING).
+phonet.txt
+	- The Phonet packet protocol used in Nokia cellular modems.
+phy.txt
+	- The PHY abstraction layer.
+pktgen.txt
+	- User guide to the kernel packet generator (pktgen.ko).
+policy-routing.txt
+	- IP policy-based routing
+ppp_generic.txt
+	- Information about the generic PPP driver.
+proc_net_tcp.txt
+	- Per inode overview of the /proc/net/tcp and /proc/net/tcp6 interfaces.
+radiotap-headers.txt
+	- Background on radiotap headers.
+ray_cs.txt
+	- Raylink Wireless LAN card driver info.
+rds.txt
+	- Background on the reliable, ordered datagram delivery method RDS.
+regulatory.txt
+	- Overview of the Linux wireless regulatory infrastructure.
+rxrpc.txt
+	- Guide to the RxRPC protocol.
+s2io.txt
+	- Release notes for Neterion Xframe I/II 10GbE driver.
+scaling.txt
+	- Explanation of network scaling techniques: RSS, RPS, RFS, aRFS, XPS.
+sctp.txt
+	- Notes on the Linux kernel implementation of the SCTP protocol.
+secid.txt
+	- Explanation of the secid member in flow structures.
+skfp.txt
+	- SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info.
+smc9.txt
+	- the driver for SMC's 9000 series of Ethernet cards
+spider_net.txt
+	- README for the Spidernet Driver (as found in PS3 / Cell BE).
+stmmac.txt
+	- README for the STMicro Synopsys Ethernet driver.
+tc-actions-env-rules.txt
+	- rules for traffic control (tc) actions.
+timestamping.txt
+	- overview of network packet timestamping variants.
+tcp.txt
+	- short blurb on how TCP output takes place.
+tcp-thin.txt
+	- kernel tuning options for low rate 'thin' TCP streams.
+team.txt
+	- pointer to information for ethernet teaming devices.
+tlan.txt
+	- ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info.
+tproxy.txt
+	- Transparent proxy support user guide.
+tuntap.txt
+	- TUN/TAP device driver, allowing user space Rx/Tx of packets.
+udplite.txt
+	- UDP-Lite protocol (RFC 3828) introduction.
+vortex.txt
+	- info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards.
+vxge.txt
+	- README for the Neterion X3100 PCIe Server Adapter.
+vxlan.txt
+	- Virtual extensible LAN overview
+x25.txt
+	- general info on X.25 development.
+x25-iface.txt
+	- description of the X.25 Packet Layer to LAPB device interface.
+xfrm_device.txt
+	- description of XFRM offload API
+xfrm_proc.txt
+	- description of the statistics package for XFRM.
+xfrm_sync.txt
+	- sync patches for XFRM enable migration of an SA between hosts.
+xfrm_sysctl.txt
+	- description of the XFRM configuration options.
+z8530drv.txt
+	- info about Linux driver for Z8530 based HDLC cards for AX.25
diff --git a/Documentation/networking/3c509.txt b/Documentation/networking/3c509.txt
new file mode 100644
index 000000000..fbf722e15
--- /dev/null
+++ b/Documentation/networking/3c509.txt
@@ -0,0 +1,213 @@
+Linux and the 3Com EtherLink III Series Ethercards (driver v1.18c and higher)
+----------------------------------------------------------------------------
+
+This file contains the instructions and caveats for v1.18c and higher versions
+of the 3c509 driver. You should not use the driver without reading this file.
+
+release 1.0
+28 February 2002
+Current maintainer (corrections to):
+  David Ruggiero <jdr@farfalle.com>
+
+----------------------------------------------------------------------------
+
+(0) Introduction
+
+The following are notes and information on using the 3Com EtherLink III series
+ethercards in Linux. These cards are commonly known by the most widely-used
+card's 3Com model number, 3c509. They are all 10mb/s ISA-bus cards and shouldn't
+be (but sometimes are) confused with the similarly-numbered PCI-bus "3c905"
+(aka "Vortex" or "Boomerang") series.  Kernel support for the 3c509 family is
+provided by the module 3c509.c, which has code to support all of the following
+models:
+
+  3c509 (original ISA card)
+  3c509B (later revision of the ISA card; supports full-duplex)
+  3c589 (PCMCIA)
+  3c589B (later revision of the 3c589; supports full-duplex)
+  3c579 (EISA)
+
+Large portions of this documentation were heavily borrowed from the guide
+written the original author of the 3c509 driver, Donald Becker. The master
+copy of that document, which contains notes on older versions of the driver,
+currently resides on Scyld web server: http://www.scyld.com/.
+
+
+(1) Special Driver Features
+
+Overriding card settings
+
+The driver allows boot- or load-time overriding of the card's detected IOADDR,
+IRQ, and transceiver settings, although this capability shouldn't generally be
+needed except to enable full-duplex mode (see below). An example of the syntax
+for LILO parameters for doing this:
+
+    ether=10,0x310,3,0x3c509,eth0 
+
+This configures the first found 3c509 card for IRQ 10, base I/O 0x310, and
+transceiver type 3 (10base2). The flag "0x3c509" must be set to avoid conflicts
+with other card types when overriding the I/O address. When the driver is
+loaded as a module, only the IRQ may be overridden. For example,
+setting two cards to IRQ10 and IRQ11 is done by using the irq module
+option:
+
+   options 3c509 irq=10,11
+
+
+(2) Full-duplex mode
+
+The v1.18c driver added support for the 3c509B's full-duplex capabilities.
+In order to enable and successfully use full-duplex mode, three conditions
+must be met: 
+
+(a) You must have a Etherlink III card model whose hardware supports full-
+duplex operations. Currently, the only members of the 3c509 family that are
+positively known to support full-duplex are the 3c509B (ISA bus) and 3c589B
+(PCMCIA) cards. Cards without the "B" model designation do *not* support
+full-duplex mode; these include the original 3c509 (no "B"), the original
+3c589, the 3c529 (MCA bus), and the 3c579 (EISA bus).
+
+(b) You must be using your card's 10baseT transceiver (i.e., the RJ-45
+connector), not its AUI (thick-net) or 10base2 (thin-net/coax) interfaces.
+AUI and 10base2 network cabling is physically incapable of full-duplex
+operation.
+
+(c) Most importantly, your 3c509B must be connected to a link partner that is
+itself full-duplex capable. This is almost certainly one of two things: a full-
+duplex-capable  Ethernet switch (*not* a hub), or a full-duplex-capable NIC on
+another system that's connected directly to the 3c509B via a crossover cable.
+
+Full-duplex mode can be enabled using 'ethtool'.
+ 
+/////Extremely important caution concerning full-duplex mode/////
+Understand that the 3c509B's hardware's full-duplex support is much more
+limited than that provide by more modern network interface cards. Although
+at the physical layer of the network it fully supports full-duplex operation,
+the card was designed before the current Ethernet auto-negotiation (N-way)
+spec was written. This means that the 3c509B family ***cannot and will not
+auto-negotiate a full-duplex connection with its link partner under any
+circumstances, no matter how it is initialized***. If the full-duplex mode
+of the 3c509B is enabled, its link partner will very likely need to be
+independently _forced_ into full-duplex mode as well; otherwise various nasty
+failures will occur - at the very least, you'll see massive numbers of packet
+collisions. This is one of very rare circumstances where disabling auto-
+negotiation and forcing the duplex mode of a network interface card or switch
+would ever be necessary or desirable.
+
+
+(3) Available Transceiver Types
+
+For versions of the driver v1.18c and above, the available transceiver types are:
+ 
+0  transceiver type from EEPROM config (normally 10baseT); force half-duplex
+1  AUI (thick-net / DB15 connector)
+2  (undefined)
+3  10base2 (thin-net == coax / BNC connector)
+4  10baseT (RJ-45 connector); force half-duplex mode
+8  transceiver type and duplex mode taken from card's EEPROM config settings
+12 10baseT (RJ-45 connector); force full-duplex mode
+
+Prior to driver version 1.18c, only transceiver codes 0-4 were supported. Note
+that the new transceiver codes 8 and 12 are the *only* ones that will enable
+full-duplex mode, no matter what the card's detected EEPROM settings might be.
+This insured that merely upgrading the driver from an earlier version would
+never automatically enable full-duplex mode in an existing installation;
+it must always be explicitly enabled via one of these code in order to be
+activated.
+
+The transceiver type can be changed using 'ethtool'.
+  
+
+(4a) Interpretation of error messages and common problems
+
+Error Messages
+
+eth0: Infinite loop in interrupt, status 2011. 
+These are "mostly harmless" message indicating that the driver had too much
+work during that interrupt cycle. With a status of 0x2011 you are receiving
+packets faster than they can be removed from the card. This should be rare
+or impossible in normal operation. Possible causes of this error report are:
+ 
+   - a "green" mode enabled that slows the processor down when there is no
+     keyboard activity. 
+
+   - some other device or device driver hogging the bus or disabling interrupts.
+     Check /proc/interrupts for excessive interrupt counts. The timer tick
+     interrupt should always be incrementing faster than the others. 
+
+No received packets 
+If a 3c509, 3c562 or 3c589 can successfully transmit packets, but never
+receives packets (as reported by /proc/net/dev or 'ifconfig') you likely
+have an interrupt line problem. Check /proc/interrupts to verify that the
+card is actually generating interrupts. If the interrupt count is not
+increasing you likely have a physical conflict with two devices trying to
+use the same ISA IRQ line. The common conflict is with a sound card on IRQ10
+or IRQ5, and the easiest solution is to move the 3c509 to a different
+interrupt line. If the device is receiving packets but 'ping' doesn't work,
+you have a routing problem.
+
+Tx Carrier Errors Reported in /proc/net/dev 
+If an EtherLink III appears to transmit packets, but the "Tx carrier errors"
+field in /proc/net/dev increments as quickly as the Tx packet count, you
+likely have an unterminated network or the incorrect media transceiver selected. 
+
+3c509B card is not detected on machines with an ISA PnP BIOS. 
+While the updated driver works with most PnP BIOS programs, it does not work
+with all. This can be fixed by disabling PnP support using the 3Com-supplied
+setup program. 
+
+3c509 card is not detected on overclocked machines 
+Increase the delay time in id_read_eeprom() from the current value, 500,
+to an absurdly high value, such as 5000. 
+
+
+(4b) Decoding Status and Error Messages
+
+The bits in the main status register are: 
+
+value 	description
+0x01 	Interrupt latch
+0x02 	Tx overrun, or Rx underrun
+0x04 	Tx complete
+0x08 	Tx FIFO room available
+0x10 	A complete Rx packet has arrived
+0x20 	A Rx packet has started to arrive
+0x40 	The driver has requested an interrupt
+0x80 	Statistics counter nearly full
+
+The bits in the transmit (Tx) status word are: 
+
+value 	description
+0x02 	Out-of-window collision.
+0x04 	Status stack overflow (normally impossible).
+0x08 	16 collisions.
+0x10 	Tx underrun (not enough PCI bus bandwidth).
+0x20 	Tx jabber.
+0x40 	Tx interrupt requested.
+0x80 	Status is valid (this should always be set).
+
+
+When a transmit error occurs the driver produces a status message such as 
+
+   eth0: Transmit error, Tx status register 82
+
+The two values typically seen here are:
+
+0x82 
+Out of window collision. This typically occurs when some other Ethernet
+host is incorrectly set to full duplex on a half duplex network. 
+
+0x88 
+16 collisions. This typically occurs when the network is exceptionally busy
+or when another host doesn't correctly back off after a collision. If this
+error is mixed with 0x82 errors it is the result of a host incorrectly set
+to full duplex (see above).
+
+Both of these errors are the result of network problems that should be
+corrected. They do not represent driver malfunction.
+
+
+(5) Revision history (this file)
+
+28Feb02 v1.0  DR   New; major portions based on Becker original 3c509 docs
+
diff --git a/Documentation/networking/6lowpan.txt b/Documentation/networking/6lowpan.txt
new file mode 100644
index 000000000..2e5a939d7
--- /dev/null
+++ b/Documentation/networking/6lowpan.txt
@@ -0,0 +1,50 @@
+
+Netdev private dataroom for 6lowpan interfaces:
+
+All 6lowpan able net devices, means all interfaces with ARPHRD_6LOWPAN,
+must have "struct lowpan_priv" placed at beginning of netdev_priv.
+
+The priv_size of each interface should be calculate by:
+
+ dev->priv_size = LOWPAN_PRIV_SIZE(LL_6LOWPAN_PRIV_DATA);
+
+Where LL_PRIV_6LOWPAN_DATA is sizeof linklayer 6lowpan private data struct.
+To access the LL_PRIV_6LOWPAN_DATA structure you can cast:
+
+ lowpan_priv(dev)-priv;
+
+to your LL_6LOWPAN_PRIV_DATA structure.
+
+Before registering the lowpan netdev interface you must run:
+
+ lowpan_netdev_setup(dev, LOWPAN_LLTYPE_FOOBAR);
+
+wheres LOWPAN_LLTYPE_FOOBAR is a define for your 6LoWPAN linklayer type of
+enum lowpan_lltypes.
+
+Example to evaluate the private usually you can do:
+
+static inline struct lowpan_priv_foobar *
+lowpan_foobar_priv(struct net_device *dev)
+{
+	return (struct lowpan_priv_foobar *)lowpan_priv(dev)->priv;
+}
+
+switch (dev->type) {
+case ARPHRD_6LOWPAN:
+	lowpan_priv = lowpan_priv(dev);
+	/* do great stuff which is ARPHRD_6LOWPAN related */
+	switch (lowpan_priv->lltype) {
+	case LOWPAN_LLTYPE_FOOBAR:
+		/* do 802.15.4 6LoWPAN handling here */
+		lowpan_foobar_priv(dev)->bar = foo;
+		break;
+	...
+	}
+	break;
+...
+}
+
+In case of generic 6lowpan branch ("net/6lowpan") you can remove the check
+on ARPHRD_6LOWPAN, because you can be sure that these function are called
+by ARPHRD_6LOWPAN interfaces.
diff --git a/Documentation/networking/6pack.txt b/Documentation/networking/6pack.txt
new file mode 100644
index 000000000..8f339428f
--- /dev/null
+++ b/Documentation/networking/6pack.txt
@@ -0,0 +1,175 @@
+This is the 6pack-mini-HOWTO, written by
+
+Andreas Könsgen DG3KQ
+Internet: ajk@comnets.uni-bremen.de
+AMPR-net: dg3kq@db0pra.ampr.org
+AX.25:    dg3kq@db0ach.#nrw.deu.eu
+
+Last update: April 7, 1998
+
+1. What is 6pack, and what are the advantages to KISS?
+
+6pack is a transmission protocol for data exchange between the PC and
+the TNC over a serial line. It can be used as an alternative to KISS.
+
+6pack has two major advantages:
+- The PC is given full control over the radio
+  channel. Special control data is exchanged between the PC and the TNC so
+  that the PC knows at any time if the TNC is receiving data, if a TNC
+  buffer underrun or overrun has occurred, if the PTT is
+  set and so on. This control data is processed at a higher priority than
+  normal data, so a data stream can be interrupted at any time to issue an
+  important event. This helps to improve the channel access and timing 
+  algorithms as everything is computed in the PC. It would even be possible 
+  to experiment with something completely different from the known CSMA and 
+  DAMA channel access methods.
+  This kind of real-time control is especially important to supply several
+  TNCs that are connected between each other and the PC by a daisy chain
+  (however, this feature is not supported yet by the Linux 6pack driver).
+
+- Each packet transferred over the serial line is supplied with a checksum,
+  so it is easy to detect errors due to problems on the serial line.
+  Received packets that are corrupt are not passed on to the AX.25 layer.
+  Damaged packets that the TNC has received from the PC are not transmitted.
+
+More details about 6pack are described in the file 6pack.ps that is located
+in the doc directory of the AX.25 utilities package.
+
+2. Who has developed the 6pack protocol?
+
+The 6pack protocol has been developed by Ekki Plicht DF4OR, Henning Rech
+DF9IC and Gunter Jost DK7WJ. A driver for 6pack, written by Gunter Jost and
+Matthias Welwarsky DG2FEF, comes along with the PC version of FlexNet.
+They have also written a firmware for TNCs to perform the 6pack
+protocol (see section 4 below).
+
+3. Where can I get the latest version of 6pack for LinuX?
+
+At the moment, the 6pack stuff can obtained via anonymous ftp from
+db0bm.automation.fh-aachen.de. In the directory /incoming/dg3kq,
+there is a file named 6pack.tgz.
+
+4. Preparing the TNC for 6pack operation
+
+To be able to use 6pack, a special firmware for the TNC is needed. The EPROM
+of a newly bought TNC does not contain 6pack, so you will have to
+program an EPROM yourself. The image file for 6pack EPROMs should be
+available on any packet radio box where PC/FlexNet can be found. The name of
+the file is 6pack.bin. This file is copyrighted and maintained by the FlexNet
+team. It can be used under the terms of the license that comes along
+with PC/FlexNet. Please do not ask me about the internals of this file as I
+don't know anything about it. I used a textual description of the 6pack
+protocol to program the Linux driver.
+
+TNCs contain a 64kByte EPROM, the lower half of which is used for
+the firmware/KISS. The upper half is either empty or is sometimes
+programmed with software called TAPR. In the latter case, the TNC
+is supplied with a DIP switch so you can easily change between the
+two systems. When programming a new EPROM, one of the systems is replaced
+by 6pack. It is useful to replace TAPR, as this software is rarely used
+nowadays. If your TNC is not equipped with the switch mentioned above, you
+can build in one yourself that switches over the highest address pin
+of the EPROM between HIGH and LOW level. After having inserted the new EPROM
+and switched to 6pack, apply power to the TNC for a first test. The connect
+and the status LED are lit for about a second if the firmware initialises
+the TNC correctly.
+
+5. Building and installing the 6pack driver
+
+The driver has been tested with kernel version 2.1.90. Use with older
+kernels may lead to a compilation error because the interface to a kernel
+function has been changed in the 2.1.8x kernels.
+
+How to turn on 6pack support:
+
+- In the linux kernel configuration program, select the code maturity level
+  options menu and turn on the prompting for development drivers.
+
+- Select the amateur radio support menu and turn on the serial port 6pack
+  driver.
+
+- Compile and install the kernel and the modules.
+
+To use the driver, the kissattach program delivered with the AX.25 utilities
+has to be modified.
+
+- Do a cd to the directory that holds the kissattach sources. Edit the
+  kissattach.c file. At the top, insert the following lines:
+
+  #ifndef N_6PACK
+  #define N_6PACK (N_AX25+1)
+  #endif
+
+  Then find the line
+   
+  int disc = N_AX25;
+
+  and replace N_AX25 by N_6PACK.
+
+- Recompile kissattach. Rename it to spattach to avoid confusions.
+
+Installing the driver:
+
+- Do an insmod 6pack. Look at your /var/log/messages file to check if the 
+  module has printed its initialization message.
+
+- Do a spattach as you would launch kissattach when starting a KISS port.
+  Check if the kernel prints the message '6pack: TNC found'. 
+
+- From here, everything should work as if you were setting up a KISS port.
+  The only difference is that the network device that represents
+  the 6pack port is called sp instead of sl or ax. So, sp0 would be the
+  first 6pack port.
+
+Although the driver has been tested on various platforms, I still declare it
+ALPHA. BE CAREFUL! Sync your disks before insmoding the 6pack module
+and spattaching. Watch out if your computer behaves strangely. Read section
+6 of this file about known problems.
+
+Note that the connect and status LEDs of the TNC are controlled in a
+different way than they are when the TNC is used with PC/FlexNet. When using
+FlexNet, the connect LED is on if there is a connection; the status LED is
+on if there is data in the buffer of the PC's AX.25 engine that has to be
+transmitted. Under Linux, the 6pack layer is beyond the AX.25 layer,
+so the 6pack driver doesn't know anything about connects or data that
+has not yet been transmitted. Therefore the LEDs are controlled
+as they are in KISS mode: The connect LED is turned on if data is transferred
+from the PC to the TNC over the serial line, the status LED if data is
+sent to the PC.
+
+6. Known problems
+
+When testing the driver with 2.0.3x kernels and
+operating with data rates on the radio channel of 9600 Baud or higher,
+the driver may, on certain systems, sometimes print the message '6pack:
+bad checksum', which is due to data loss if the other station sends two
+or more subsequent packets. I have been told that this is due to a problem
+with the serial driver of 2.0.3x kernels. I don't know yet if the problem
+still exists with 2.1.x kernels, as I have heard that the serial driver
+code has been changed with 2.1.x.
+
+When shutting down the sp interface with ifconfig, the kernel crashes if
+there is still an AX.25 connection left over which an IP connection was
+running, even if that IP connection is already closed. The problem does not
+occur when there is a bare AX.25 connection still running. I don't know if
+this is a problem of the 6pack driver or something else in the kernel.
+
+The driver has been tested as a module, not yet as a kernel-builtin driver.
+
+The 6pack protocol supports daisy-chaining of TNCs in a token ring, which is
+connected to one serial port of the PC. This feature is not implemented
+and at least at the moment I won't be able to do it because I do not have
+the opportunity to build a TNC daisy-chain and test it.
+
+Some of the comments in the source code are inaccurate. They are left from
+the SLIP/KISS driver, from which the 6pack driver has been derived.
+I haven't modified or removed them yet -- sorry! The code itself needs
+some cleaning and optimizing. This will be done in a later release.
+
+If you encounter a bug or if you have a question or suggestion concerning the
+driver, feel free to mail me, using the addresses given at the beginning of
+this file.
+
+Have fun!
+
+Andreas
diff --git a/Documentation/networking/LICENSE.qla3xxx b/Documentation/networking/LICENSE.qla3xxx
new file mode 100644
index 000000000..2f2077e34
--- /dev/null
+++ b/Documentation/networking/LICENSE.qla3xxx
@@ -0,0 +1,46 @@
+Copyright (c)  2003-2006 QLogic Corporation
+QLogic Linux Networking HBA Driver
+
+This program includes a device driver for Linux 2.6 that may be
+distributed with QLogic hardware specific firmware binary file.
+You may modify and redistribute the device driver code under the
+GNU General Public License as published by the Free Software
+Foundation (version 2 or a later version).
+
+You may redistribute the hardware specific firmware binary file
+under the following terms:
+
+	1. Redistribution of source code (only if applicable),
+	   must retain the above copyright notice, this list of
+	   conditions and the following disclaimer.
+
+	2. Redistribution in binary form must reproduce the above
+	   copyright notice, this list of conditions and the
+	   following disclaimer in the documentation and/or other
+	   materials provided with the distribution.
+
+	3. The name of QLogic Corporation may not be used to
+	   endorse or promote products derived from this software
+	   without specific prior written permission
+
+REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
+THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
+EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
+PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
+BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
+TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
+ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGE.
+
+USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
+CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
+OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
+TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
+ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
+COMBINATION WITH THIS PROGRAM.
+
diff --git a/Documentation/networking/LICENSE.qlcnic b/Documentation/networking/LICENSE.qlcnic
new file mode 100644
index 000000000..2ae3b6498
--- /dev/null
+++ b/Documentation/networking/LICENSE.qlcnic
@@ -0,0 +1,288 @@
+Copyright (c) 2009-2013 QLogic Corporation
+QLogic Linux qlcnic NIC Driver
+
+You may modify and redistribute the device driver code under the
+GNU General Public License (a copy of which is attached hereto as
+Exhibit A) published by the Free Software Foundation (version 2).
+
+
+EXHIBIT A
+
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
diff --git a/Documentation/networking/LICENSE.qlge b/Documentation/networking/LICENSE.qlge
new file mode 100644
index 000000000..ce64e4d15
--- /dev/null
+++ b/Documentation/networking/LICENSE.qlge
@@ -0,0 +1,288 @@
+Copyright (c) 2003-2011 QLogic Corporation
+QLogic Linux qlge NIC Driver
+
+You may modify and redistribute the device driver code under the
+GNU General Public License (a copy of which is attached hereto as
+Exhibit A) published by the Free Software Foundation (version 2).
+
+
+EXHIBIT A
+
+		    GNU GENERAL PUBLIC LICENSE
+		       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+			    Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+		    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term "modification".)  Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License.  (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code.  (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+			    NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
diff --git a/Documentation/networking/PLIP.txt b/Documentation/networking/PLIP.txt
new file mode 100644
index 000000000..ad7e3f7c3
--- /dev/null
+++ b/Documentation/networking/PLIP.txt
@@ -0,0 +1,215 @@
+PLIP: The Parallel Line Internet Protocol Device
+
+Donald Becker (becker@super.org)
+I.D.A. Supercomputing Research Center, Bowie MD 20715
+
+At some point T. Thorn will probably contribute text,
+Tommy Thorn (tthorn@daimi.aau.dk)
+
+PLIP Introduction
+-----------------
+
+This document describes the parallel port packet pusher for Net/LGX.
+This device interface allows a point-to-point connection between two
+parallel ports to appear as a IP network interface.
+
+What is PLIP?
+=============
+
+PLIP is Parallel Line IP, that is, the transportation of IP packages
+over a parallel port. In the case of a PC, the obvious choice is the
+printer port.  PLIP is a non-standard, but [can use] uses the standard
+LapLink null-printer cable [can also work in turbo mode, with a PLIP
+cable]. [The protocol used to pack IP packages, is a simple one
+initiated by Crynwr.]
+
+Advantages of PLIP
+==================
+
+It's cheap, it's available everywhere, and it's easy.
+
+The PLIP cable is all that's needed to connect two Linux boxes, and it
+can be built for very few bucks.
+
+Connecting two Linux boxes takes only a second's decision and a few
+minutes' work, no need to search for a [supported] netcard. This might
+even be especially important in the case of notebooks, where netcards
+are not easily available.
+
+Not requiring a netcard also means that apart from connecting the
+cables, everything else is software configuration [which in principle
+could be made very easy.]
+
+Disadvantages of PLIP
+=====================
+
+Doesn't work over a modem, like SLIP and PPP. Limited range, 15 m.
+Can only be used to connect three (?) Linux boxes. Doesn't connect to
+an existing Ethernet. Isn't standard (not even de facto standard, like
+SLIP).
+
+Performance
+===========
+
+PLIP easily outperforms Ethernet cards....(ups, I was dreaming, but
+it *is* getting late. EOB)
+
+PLIP driver details
+-------------------
+
+The Linux PLIP driver is an implementation of the original Crynwr protocol,
+that uses the parallel port subsystem of the kernel in order to properly
+share parallel ports between PLIP and other services.
+
+IRQs and trigger timeouts
+=========================
+
+When a parallel port used for a PLIP driver has an IRQ configured to it, the
+PLIP driver is signaled whenever data is sent to it via the cable, such that
+when no data is available, the driver isn't being used.
+
+However, on some machines it is hard, if not impossible, to configure an IRQ
+to a certain parallel port, mainly because it is used by some other device.
+On these machines, the PLIP driver can be used in IRQ-less mode, where
+the PLIP driver would constantly poll the parallel port for data waiting,
+and if such data is available, process it. This mode is less efficient than
+the IRQ mode, because the driver has to check the parallel port many times
+per second, even when no data at all is sent. Some rough measurements
+indicate that there isn't a noticeable performance drop when using IRQ-less
+mode as compared to IRQ mode as far as the data transfer speed is involved.
+There is a performance drop on the machine hosting the driver.
+
+When the PLIP driver is used in IRQ mode, the timeout used for triggering a
+data transfer (the maximal time the PLIP driver would allow the other side
+before announcing a timeout, when trying to handshake a transfer of some
+data) is, by default, 500usec. As IRQ delivery is more or less immediate,
+this timeout is quite sufficient. 
+
+When in IRQ-less mode, the PLIP driver polls the parallel port HZ times
+per second (where HZ is typically 100 on most platforms, and 1024 on an
+Alpha, as of this writing). Between two such polls, there are 10^6/HZ usecs.
+On an i386, for example, 10^6/100 = 10000usec. It is easy to see that it is
+quite possible for the trigger timeout to expire between two such polls, as
+the timeout is only 500usec long. As a result, it is required to change the
+trigger timeout on the *other* side of a PLIP connection, to about
+10^6/HZ usecs. If both sides of a PLIP connection are used in IRQ-less mode,
+this timeout is required on both sides.
+
+It appears that in practice, the trigger timeout can be shorter than in the
+above calculation. It isn't an important issue, unless the wire is faulty,
+in which case a long timeout would stall the machine when, for whatever
+reason, bits are dropped.
+
+A utility that can perform this change in Linux is plipconfig, which is part
+of the net-tools package (its location can be found in the
+Documentation/Changes file). An example command would be
+'plipconfig plipX trigger 10000', where plipX is the appropriate
+PLIP device.
+
+PLIP hardware interconnection
+-----------------------------
+
+PLIP uses several different data transfer methods.  The first (and the
+only one implemented in the early version of the code) uses a standard
+printer "null" cable to transfer data four bits at a time using
+data bit outputs connected to status bit inputs.
+
+The second data transfer method relies on both machines having
+bi-directional parallel ports, rather than output-only ``printer''
+ports.  This allows byte-wide transfers and avoids reconstructing
+nibbles into bytes, leading to much faster transfers.
+
+Parallel Transfer Mode 0 Cable
+==============================
+
+The cable for the first transfer mode is a standard
+printer "null" cable which transfers data four bits at a time using
+data bit outputs of the first port (machine T) connected to the
+status bit inputs of the second port (machine R).  There are five
+status inputs, and they are used as four data inputs and a clock (data
+strobe) input, arranged so that the data input bits appear as contiguous
+bits with standard status register implementation.
+
+A cable that implements this protocol is available commercially as a
+"Null Printer" or "Turbo Laplink" cable.  It can be constructed with
+two DB-25 male connectors symmetrically connected as follows:
+
+    STROBE output	1*
+    D0->ERROR	2 - 15		15 - 2
+    D1->SLCT	3 - 13		13 - 3
+    D2->PAPOUT	4 - 12		12 - 4
+    D3->ACK	5 - 10		10 - 5
+    D4->BUSY	6 - 11		11 - 6
+    D5,D6,D7 are   7*, 8*, 9*
+    AUTOFD output 14*
+    INIT   output 16*
+    SLCTIN	17 - 17
+    extra grounds are 18*,19*,20*,21*,22*,23*,24*
+    GROUND	25 - 25
+* Do not connect these pins on either end
+
+If the cable you are using has a metallic shield it should be
+connected to the metallic DB-25 shell at one end only.
+
+Parallel Transfer Mode 1
+========================
+
+The second data transfer method relies on both machines having
+bi-directional parallel ports, rather than output-only ``printer''
+ports.  This allows byte-wide transfers, and avoids reconstructing
+nibbles into bytes.  This cable should not be used on unidirectional
+``printer'' (as opposed to ``parallel'') ports or when the machine
+isn't configured for PLIP, as it will result in output driver
+conflicts and the (unlikely) possibility of damage.
+
+The cable for this transfer mode should be constructed as follows:
+
+    STROBE->BUSY 1 - 11
+    D0->D0	2 - 2
+    D1->D1	3 - 3
+    D2->D2	4 - 4
+    D3->D3	5 - 5
+    D4->D4	6 - 6
+    D5->D5	7 - 7
+    D6->D6	8 - 8
+    D7->D7	9 - 9
+    INIT -> ACK  16 - 10
+    AUTOFD->PAPOUT 14 - 12
+    SLCT->SLCTIN 13 - 17
+    GND->ERROR	18 - 15
+    extra grounds are 19*,20*,21*,22*,23*,24*
+    GROUND	25 - 25
+* Do not connect these pins on either end
+
+Once again, if the cable you are using has a metallic shield it should
+be connected to the metallic DB-25 shell at one end only.
+
+PLIP Mode 0 transfer protocol
+=============================
+
+The PLIP driver is compatible with the "Crynwr" parallel port transfer
+standard in Mode 0.  That standard specifies the following protocol:
+
+   send header nibble '0x8'
+   count-low octet
+   count-high octet
+   ... data octets
+   checksum octet
+
+Each octet is sent as
+	<wait for rx. '0x1?'>	<send 0x10+(octet&0x0F)>
+	<wait for rx. '0x0?'>	<send 0x00+((octet>>4)&0x0F)>
+
+To start a transfer the transmitting machine outputs a nibble 0x08.
+That raises the ACK line, triggering an interrupt in the receiving
+machine.  The receiving machine disables interrupts and raises its own ACK
+line. 
+
+Restated:
+
+(OUT is bit 0-4, OUT.j is bit j from OUT. IN likewise)
+Send_Byte:
+   OUT := low nibble, OUT.4 := 1
+   WAIT FOR IN.4 = 1
+   OUT := high nibble, OUT.4 := 0
+   WAIT FOR IN.4 = 0
diff --git a/Documentation/networking/README.ipw2100 b/Documentation/networking/README.ipw2100
new file mode 100644
index 000000000..6f85e1d06
--- /dev/null
+++ b/Documentation/networking/README.ipw2100
@@ -0,0 +1,293 @@
+
+Intel(R) PRO/Wireless 2100 Driver for Linux in support of:
+
+Intel(R) PRO/Wireless 2100 Network Connection
+
+Copyright (C) 2003-2006, Intel Corporation
+
+README.ipw2100
+
+Version: git-1.1.5
+Date   : January 25, 2006
+
+Index
+-----------------------------------------------
+0. IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+1. Introduction
+2. Release git-1.1.5 Current Features
+3. Command Line Parameters
+4. Sysfs Helper Files
+5. Radio Kill Switch
+6. Dynamic Firmware
+7. Power Management
+8. Support
+9. License
+
+
+0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+-----------------------------------------------
+
+Important Notice FOR ALL USERS OR DISTRIBUTORS!!!!
+
+Intel wireless LAN adapters are engineered, manufactured, tested, and
+quality checked to ensure that they meet all necessary local and
+governmental regulatory agency requirements for the regions that they
+are designated and/or marked to ship into. Since wireless LANs are
+generally unlicensed devices that share spectrum with radars,
+satellites, and other licensed and unlicensed devices, it is sometimes
+necessary to dynamically detect, avoid, and limit usage to avoid
+interference with these devices. In many instances Intel is required to
+provide test data to prove regional and local compliance to regional and
+governmental regulations before certification or approval to use the
+product is granted. Intel's wireless LAN's EEPROM, firmware, and
+software driver are designed to carefully control parameters that affect
+radio operation and to ensure electromagnetic compliance (EMC). These
+parameters include, without limitation, RF power, spectrum usage,
+channel scanning, and human exposure.
+
+For these reasons Intel cannot permit any manipulation by third parties
+of the software provided in binary format with the wireless WLAN
+adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
+patches, utilities, or code with the Intel wireless LAN adapters that
+have been manipulated by an unauthorized party (i.e., patches,
+utilities, or code (including open source code modifications) which have
+not been validated by Intel), (i) you will be solely responsible for
+ensuring the regulatory compliance of the products, (ii) Intel will bear
+no liability, under any theory of liability for any issues associated
+with the modified products, including without limitation, claims under
+the warranty and/or issues arising from regulatory non-compliance, and
+(iii) Intel will not provide or be required to assist in providing
+support to any third parties for such modified products.
+
+Note: Many regulatory agencies consider Wireless LAN adapters to be
+modules, and accordingly, condition system-level regulatory approval
+upon receipt and review of test data documenting that the antennas and
+system configuration do not cause the EMC and radio operation to be
+non-compliant.
+
+The drivers available for download from SourceForge are provided as a
+part of a development project.  Conformance to local regulatory
+requirements is the responsibility of the individual developer.  As
+such, if you are interested in deploying or shipping a driver as part of
+solution intended to be used for purposes other than development, please
+obtain a tested driver from Intel Customer Support at:
+
+http://www.intel.com/support/wireless/sb/CS-006408.htm
+
+1. Introduction
+-----------------------------------------------
+
+This document provides a brief overview of the features supported by the 
+IPW2100 driver project.  The main project website, where the latest 
+development version of the driver can be found, is:
+
+	http://ipw2100.sourceforge.net
+
+There you can find the not only the latest releases, but also information about
+potential fixes and patches, as well as links to the development mailing list
+for the driver project.
+
+
+2. Release git-1.1.5 Current Supported Features
+-----------------------------------------------
+- Managed (BSS) and Ad-Hoc (IBSS)
+- WEP (shared key and open)
+- Wireless Tools support 
+- 802.1x (tested with XSupplicant 1.0.1)
+
+Enabled (but not supported) features:
+- Monitor/RFMon mode
+- WPA/WPA2
+
+The distinction between officially supported and enabled is a reflection
+on the amount of validation and interoperability testing that has been
+performed on a given feature.
+
+
+3. Command Line Parameters
+-----------------------------------------------
+
+If the driver is built as a module, the following optional parameters are used
+by entering them on the command line with the modprobe command using this
+syntax:
+
+	modprobe ipw2100 [<option>=<VAL1><,VAL2>...]
+
+For example, to disable the radio on driver loading, enter:
+
+	modprobe ipw2100 disable=1
+
+The ipw2100 driver supports the following module parameters:
+
+Name		Value		Example:
+debug		0x0-0xffffffff	debug=1024
+mode		0,1,2		mode=1   /* AdHoc */
+channel		int		channel=3 /* Only valid in AdHoc or Monitor */
+associate	boolean		associate=0 /* Do NOT auto associate */
+disable		boolean		disable=1 /* Do not power the HW */
+
+
+4. Sysfs Helper Files
+---------------------------     
+-----------------------------------------------
+
+There are several ways to control the behavior of the driver.  Many of the 
+general capabilities are exposed through the Wireless Tools (iwconfig).  There
+are a few capabilities that are exposed through entries in the Linux Sysfs.
+
+
+----- Driver Level ------
+For the driver level files, look in /sys/bus/pci/drivers/ipw2100/
+
+  debug_level  
+	
+	This controls the same global as the 'debug' module parameter.  For 
+        information on the various debugging levels available, run the 'dvals'
+	script found in the driver source directory.
+
+	NOTE:  'debug_level' is only enabled if CONFIG_IPW2100_DEBUG is turn
+	       on.
+
+----- Device Level ------
+For the device level files look in
+	
+	/sys/bus/pci/drivers/ipw2100/{PCI-ID}/
+
+For example:
+	/sys/bus/pci/drivers/ipw2100/0000:02:01.0
+
+For the device level files, see /sys/bus/pci/drivers/ipw2100:
+
+  rf_kill
+	read - 
+	0 = RF kill not enabled (radio on)
+	1 = SW based RF kill active (radio off)
+	2 = HW based RF kill active (radio off)
+	3 = Both HW and SW RF kill active (radio off)
+	write -
+	0 = If SW based RF kill active, turn the radio back on
+	1 = If radio is on, activate SW based RF kill
+
+	NOTE: If you enable the SW based RF kill and then toggle the HW
+  	based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+
+
+5. Radio Kill Switch
+-----------------------------------------------
+Most laptops provide the ability for the user to physically disable the radio.
+Some vendors have implemented this as a physical switch that requires no
+software to turn the radio off and on.  On other laptops, however, the switch
+is controlled through a button being pressed and a software driver then making
+calls to turn the radio off and on.  This is referred to as a "software based
+RF kill switch"
+
+See the Sysfs helper file 'rf_kill' for determining the state of the RF switch
+on your system.
+
+
+6. Dynamic Firmware
+-----------------------------------------------
+As the firmware is licensed under a restricted use license, it can not be 
+included within the kernel sources.  To enable the IPW2100 you will need a 
+firmware image to load into the wireless NIC's processors.
+
+You can obtain these images from <http://ipw2100.sf.net/firmware.php>.
+
+See INSTALL for instructions on installing the firmware.
+
+
+7. Power Management
+-----------------------------------------------
+The IPW2100 supports the configuration of the Power Save Protocol 
+through a private wireless extension interface.  The IPW2100 supports 
+the following different modes:
+
+	off	No power management.  Radio is always on.
+	on	Automatic power management
+	1-5	Different levels of power management.  The higher the 
+		number the greater the power savings, but with an impact to 
+		packet latencies. 
+
+Power management works by powering down the radio after a certain 
+interval of time has passed where no packets are passed through the 
+radio.  Once powered down, the radio remains in that state for a given 
+period of time.  For higher power savings, the interval between last 
+packet processed to sleep is shorter and the sleep period is longer.
+
+When the radio is asleep, the access point sending data to the station 
+must buffer packets at the AP until the station wakes up and requests 
+any buffered packets.  If you have an AP that does not correctly support 
+the PSP protocol you may experience packet loss or very poor performance 
+while power management is enabled.  If this is the case, you will need 
+to try and find a firmware update for your AP, or disable power 
+management (via `iwconfig eth1 power off`)
+
+To configure the power level on the IPW2100 you use a combination of 
+iwconfig and iwpriv.  iwconfig is used to turn power management on, off, 
+and set it to auto.
+
+	iwconfig eth1 power off    Disables radio power down
+	iwconfig eth1 power on     Enables radio power management to 
+				   last set level (defaults to AUTO)
+	iwpriv eth1 set_power 0    Sets power level to AUTO and enables 
+				   power management if not previously 
+				   enabled.
+	iwpriv eth1 set_power 1-5  Set the power level as specified, 
+				   enabling power management if not 
+				   previously enabled.
+
+You can view the current power level setting via:
+	
+	iwpriv eth1 get_power
+
+It will return the current period or timeout that is configured as a string
+in the form of xxxx/yyyy (z) where xxxx is the timeout interval (amount of
+time after packet processing), yyyy is the period to sleep (amount of time to 
+wait before powering the radio and querying the access point for buffered
+packets), and z is the 'power level'.  If power management is turned off the
+xxxx/yyyy will be replaced with 'off' -- the level reported will be the active
+level if `iwconfig eth1 power on` is invoked.
+
+
+8. Support
+-----------------------------------------------
+
+For general development information and support,
+go to:
+	
+    http://ipw2100.sf.net/
+
+The ipw2100 1.1.0 driver and firmware can be downloaded from:  
+
+    http://support.intel.com
+
+For installation support on the ipw2100 1.1.0 driver on Linux kernels 
+2.6.8 or greater, email support is available from:  
+
+    http://supportmail.intel.com
+
+9. License
+-----------------------------------------------
+
+  Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it 
+  under the terms of the GNU General Public License (version 2) as 
+  published by the Free Software Foundation.
+  
+  This program is distributed in the hope that it will be useful, but WITHOUT 
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
+  more details.
+  
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc., 59 
+  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+  
+  The full GNU General Public License is included in this distribution in the
+  file called LICENSE.
+  
+  License Contact Information:
+  James P. Ketrenos <ipw2100-admin@linux.intel.com>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
diff --git a/Documentation/networking/README.ipw2200 b/Documentation/networking/README.ipw2200
new file mode 100644
index 000000000..b7658bed4
--- /dev/null
+++ b/Documentation/networking/README.ipw2200
@@ -0,0 +1,472 @@
+
+Intel(R) PRO/Wireless 2915ABG Driver for Linux in support of:
+
+Intel(R) PRO/Wireless 2200BG Network Connection
+Intel(R) PRO/Wireless 2915ABG Network Connection
+
+Note: The Intel(R) PRO/Wireless 2915ABG Driver for Linux and Intel(R)
+PRO/Wireless 2200BG Driver for Linux is a unified driver that works on
+both hardware adapters listed above. In this document the Intel(R)
+PRO/Wireless 2915ABG Driver for Linux will be used to reference the
+unified driver.
+
+Copyright (C) 2004-2006, Intel Corporation
+
+README.ipw2200
+
+Version: 1.1.2
+Date   : March 30, 2006
+
+
+Index
+-----------------------------------------------
+0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+1.   Introduction
+1.1. Overview of features
+1.2. Module parameters
+1.3. Wireless Extension Private Methods
+1.4. Sysfs Helper Files
+1.5. Supported channels
+2.   Ad-Hoc Networking
+3.   Interacting with Wireless Tools
+3.1. iwconfig mode
+3.2. iwconfig sens
+4.   About the Version Numbers
+5.   Firmware installation
+6.   Support
+7.   License
+
+
+0.   IMPORTANT INFORMATION BEFORE USING THIS DRIVER
+-----------------------------------------------
+
+Important Notice FOR ALL USERS OR DISTRIBUTORS!!!! 
+
+Intel wireless LAN adapters are engineered, manufactured, tested, and
+quality checked to ensure that they meet all necessary local and
+governmental regulatory agency requirements for the regions that they
+are designated and/or marked to ship into. Since wireless LANs are
+generally unlicensed devices that share spectrum with radars,
+satellites, and other licensed and unlicensed devices, it is sometimes
+necessary to dynamically detect, avoid, and limit usage to avoid
+interference with these devices. In many instances Intel is required to
+provide test data to prove regional and local compliance to regional and
+governmental regulations before certification or approval to use the
+product is granted. Intel's wireless LAN's EEPROM, firmware, and
+software driver are designed to carefully control parameters that affect
+radio operation and to ensure electromagnetic compliance (EMC). These
+parameters include, without limitation, RF power, spectrum usage,
+channel scanning, and human exposure. 
+
+For these reasons Intel cannot permit any manipulation by third parties
+of the software provided in binary format with the wireless WLAN
+adapters (e.g., the EEPROM and firmware). Furthermore, if you use any
+patches, utilities, or code with the Intel wireless LAN adapters that
+have been manipulated by an unauthorized party (i.e., patches,
+utilities, or code (including open source code modifications) which have
+not been validated by Intel), (i) you will be solely responsible for
+ensuring the regulatory compliance of the products, (ii) Intel will bear
+no liability, under any theory of liability for any issues associated
+with the modified products, including without limitation, claims under
+the warranty and/or issues arising from regulatory non-compliance, and
+(iii) Intel will not provide or be required to assist in providing
+support to any third parties for such modified products.  
+
+Note: Many regulatory agencies consider Wireless LAN adapters to be
+modules, and accordingly, condition system-level regulatory approval
+upon receipt and review of test data documenting that the antennas and
+system configuration do not cause the EMC and radio operation to be
+non-compliant.
+
+The drivers available for download from SourceForge are provided as a 
+part of a development project.  Conformance to local regulatory 
+requirements is the responsibility of the individual developer.  As 
+such, if you are interested in deploying or shipping a driver as part of 
+solution intended to be used for purposes other than development, please 
+obtain a tested driver from Intel Customer Support at:
+
+http://support.intel.com
+
+
+1.   Introduction
+-----------------------------------------------
+The following sections attempt to provide a brief introduction to using 
+the Intel(R) PRO/Wireless 2915ABG Driver for Linux.
+
+This document is not meant to be a comprehensive manual on 
+understanding or using wireless technologies, but should be sufficient 
+to get you moving without wires on Linux.
+
+For information on building and installing the driver, see the INSTALL
+file.
+
+
+1.1. Overview of Features
+-----------------------------------------------
+The current release (1.1.2) supports the following features:
+
++ BSS mode (Infrastructure, Managed)
++ IBSS mode (Ad-Hoc)
++ WEP (OPEN and SHARED KEY mode)
++ 802.1x EAP via wpa_supplicant and xsupplicant
++ Wireless Extension support 
++ Full B and G rate support (2200 and 2915)
++ Full A rate support (2915 only)
++ Transmit power control
++ S state support (ACPI suspend/resume)
+
+The following features are currently enabled, but not officially
+supported:
+
++ WPA
++ long/short preamble support
++ Monitor mode (aka RFMon)
+
+The distinction between officially supported and enabled is a reflection 
+on the amount of validation and interoperability testing that has been
+performed on a given feature. 
+
+
+
+1.2. Command Line Parameters
+-----------------------------------------------
+
+Like many modules used in the Linux kernel, the Intel(R) PRO/Wireless
+2915ABG Driver for Linux allows configuration options to be provided 
+as module parameters.  The most common way to specify a module parameter 
+is via the command line.  
+
+The general form is:
+
+% modprobe ipw2200 parameter=value
+
+Where the supported parameter are:
+
+  associate
+	Set to 0 to disable the auto scan-and-associate functionality of the
+	driver.  If disabled, the driver will not attempt to scan 
+	for and associate to a network until it has been configured with 
+	one or more properties for the target network, for example configuring 
+	the network SSID.  Default is 0 (do not auto-associate)
+	
+	Example: % modprobe ipw2200 associate=0
+
+  auto_create
+	Set to 0 to disable the auto creation of an Ad-Hoc network 
+	matching the channel and network name parameters provided.  
+	Default is 1.
+
+  channel
+	channel number for association.  The normal method for setting
+        the channel would be to use the standard wireless tools
+        (i.e. `iwconfig eth1 channel 10`), but it is useful sometimes
+	to set this while debugging.  Channel 0 means 'ANY'
+
+  debug
+	If using a debug build, this is used to control the amount of debug
+	info is logged.  See the 'dvals' and 'load' script for more info on
+	how to use this (the dvals and load scripts are provided as part 
+	of the ipw2200 development snapshot releases available from the 
+	SourceForge project at http://ipw2200.sf.net)
+  
+  led
+	Can be used to turn on experimental LED code.
+	0 = Off, 1 = On.  Default is 1.
+
+  mode
+	Can be used to set the default mode of the adapter.  
+	0 = Managed, 1 = Ad-Hoc, 2 = Monitor
+
+
+1.3. Wireless Extension Private Methods
+-----------------------------------------------
+
+As an interface designed to handle generic hardware, there are certain 
+capabilities not exposed through the normal Wireless Tool interface.  As 
+such, a provision is provided for a driver to declare custom, or 
+private, methods.  The Intel(R) PRO/Wireless 2915ABG Driver for Linux 
+defines several of these to configure various settings.
+
+The general form of using the private wireless methods is:
+
+	% iwpriv $IFNAME method parameters
+
+Where $IFNAME is the interface name the device is registered with 
+(typically eth1, customized via one of the various network interface
+name managers, such as ifrename)
+
+The supported private methods are:
+
+  get_mode
+	Can be used to report out which IEEE mode the driver is 
+	configured to support.  Example:
+	
+	% iwpriv eth1 get_mode
+	eth1	get_mode:802.11bg (6)
+
+  set_mode
+	Can be used to configure which IEEE mode the driver will 
+	support.  
+
+	Usage:
+	% iwpriv eth1 set_mode {mode}
+	Where {mode} is a number in the range 1-7:
+	1	802.11a (2915 only)
+	2	802.11b
+	3	802.11ab (2915 only)
+	4	802.11g 
+	5	802.11ag (2915 only)
+	6	802.11bg
+	7	802.11abg (2915 only)
+
+  get_preamble
+	Can be used to report configuration of preamble length.
+
+  set_preamble
+	Can be used to set the configuration of preamble length:
+
+	Usage:
+	% iwpriv eth1 set_preamble {mode}
+	Where {mode} is one of:
+	1	Long preamble only
+	0	Auto (long or short based on connection)
+	
+
+1.4. Sysfs Helper Files:
+-----------------------------------------------
+
+The Linux kernel provides a pseudo file system that can be used to 
+access various components of the operating system.  The Intel(R)
+PRO/Wireless 2915ABG Driver for Linux exposes several configuration
+parameters through this mechanism.
+
+An entry in the sysfs can support reading and/or writing.  You can 
+typically query the contents of a sysfs entry through the use of cat, 
+and can set the contents via echo.  For example:
+
+% cat /sys/bus/pci/drivers/ipw2200/debug_level
+
+Will report the current debug level of the driver's logging subsystem 
+(only available if CONFIG_IPW2200_DEBUG was configured when the driver
+was built).
+
+You can set the debug level via:
+
+% echo $VALUE > /sys/bus/pci/drivers/ipw2200/debug_level
+
+Where $VALUE would be a number in the case of this sysfs entry.  The 
+input to sysfs files does not have to be a number.  For example, the 
+firmware loader used by hotplug utilizes sysfs entries for transferring 
+the firmware image from user space into the driver.
+
+The Intel(R) PRO/Wireless 2915ABG Driver for Linux exposes sysfs entries 
+at two levels -- driver level, which apply to all instances of the driver 
+(in the event that there are more than one device installed) and device 
+level, which applies only to the single specific instance.
+
+
+1.4.1 Driver Level Sysfs Helper Files
+-----------------------------------------------
+
+For the driver level files, look in /sys/bus/pci/drivers/ipw2200/
+
+  debug_level  
+	
+	This controls the same global as the 'debug' module parameter
+
+
+
+1.4.2 Device Level Sysfs Helper Files
+-----------------------------------------------
+
+For the device level files, look in
+	
+	/sys/bus/pci/drivers/ipw2200/{PCI-ID}/
+
+For example:
+	/sys/bus/pci/drivers/ipw2200/0000:02:01.0
+
+For the device level files, see /sys/bus/pci/drivers/ipw2200:
+
+  rf_kill
+	read - 
+	0 = RF kill not enabled (radio on)
+	1 = SW based RF kill active (radio off)
+	2 = HW based RF kill active (radio off)
+	3 = Both HW and SW RF kill active (radio off)
+	write -
+	0 = If SW based RF kill active, turn the radio back on
+	1 = If radio is on, activate SW based RF kill
+
+	NOTE: If you enable the SW based RF kill and then toggle the HW
+  	based RF kill from ON -> OFF -> ON, the radio will NOT come back on
+	
+  ucode 
+	read-only access to the ucode version number
+
+  led
+	read -
+	0 = LED code disabled
+	1 = LED code enabled
+	write -
+	0 = Disable LED code
+	1 = Enable LED code
+
+	NOTE: The LED code has been reported to hang some systems when 
+	running ifconfig and is therefore disabled by default.
+
+
+1.5. Supported channels
+-----------------------------------------------
+
+Upon loading the Intel(R) PRO/Wireless 2915ABG Driver for Linux, a
+message stating the detected geography code and the number of 802.11
+channels supported by the card will be displayed in the log.
+
+The geography code corresponds to a regulatory domain as shown in the
+table below.
+
+					  Supported channels
+Code	Geography			802.11bg	802.11a
+
+---	Restricted			11 	 	 0
+ZZF	Custom US/Canada		11	 	 8
+ZZD	Rest of World			13	 	 0
+ZZA	Custom USA & Europe & High	11		13
+ZZB	Custom NA & Europe    		11		13
+ZZC	Custom Japan			11	 	 4
+ZZM	Custom 				11	 	 0
+ZZE	Europe				13		19
+ZZJ	Custom Japan			14	 	 4
+ZZR	Rest of World			14	 	 0
+ZZH	High Band			13	 	 4
+ZZG	Custom Europe			13	 	 4
+ZZK	Europe 				13		24
+ZZL	Europe				11		13
+
+
+2.   Ad-Hoc Networking
+-----------------------------------------------
+
+When using a device in an Ad-Hoc network, it is useful to understand the 
+sequence and requirements for the driver to be able to create, join, or 
+merge networks.
+
+The following attempts to provide enough information so that you can 
+have a consistent experience while using the driver as a member of an 
+Ad-Hoc network.
+
+2.1. Joining an Ad-Hoc Network
+-----------------------------------------------
+
+The easiest way to get onto an Ad-Hoc network is to join one that 
+already exists.
+
+2.2. Creating an Ad-Hoc Network
+-----------------------------------------------
+
+An Ad-Hoc networks is created using the syntax of the Wireless tool.
+
+For Example:
+iwconfig eth1 mode ad-hoc essid testing channel 2
+
+2.3. Merging Ad-Hoc Networks
+-----------------------------------------------
+
+
+3.  Interaction with Wireless Tools
+-----------------------------------------------
+
+3.1 iwconfig mode
+-----------------------------------------------
+
+When configuring the mode of the adapter, all run-time configured parameters
+are reset to the value used when the module was loaded.  This includes
+channels, rates, ESSID, etc.
+
+3.2 iwconfig sens
+-----------------------------------------------
+
+The 'iwconfig ethX sens XX' command will not set the signal sensitivity
+threshold, as described in iwconfig documentation, but rather the number
+of consecutive missed beacons that will trigger handover, i.e. roaming
+to another access point. At the same time, it will set the disassociation
+threshold to 3 times the given value.
+
+
+4.   About the Version Numbers
+-----------------------------------------------
+
+Due to the nature of open source development projects, there are 
+frequently changes being incorporated that have not gone through 
+a complete validation process.  These changes are incorporated into 
+development snapshot releases.
+
+Releases are numbered with a three level scheme: 
+
+	major.minor.development
+
+Any version where the 'development' portion is 0 (for example
+1.0.0, 1.1.0, etc.) indicates a stable version that will be made 
+available for kernel inclusion.
+
+Any version where the 'development' portion is not a 0 (for
+example 1.0.1, 1.1.5, etc.) indicates a development version that is
+being made available for testing and cutting edge users.  The stability 
+and functionality of the development releases are not know.  We make
+efforts to try and keep all snapshots reasonably stable, but due to the
+frequency of their release, and the desire to get those releases 
+available as quickly as possible, unknown anomalies should be expected.
+
+The major version number will be incremented when significant changes
+are made to the driver.  Currently, there are no major changes planned.
+
+5.  Firmware installation
+----------------------------------------------
+
+The driver requires a firmware image, download it and extract the
+files under /lib/firmware (or wherever your hotplug's firmware.agent
+will look for firmware files)
+
+The firmware can be downloaded from the following URL:
+
+    http://ipw2200.sf.net/
+
+
+6.  Support
+-----------------------------------------------
+
+For direct support of the 1.0.0 version, you can contact 
+http://supportmail.intel.com, or you can use the open source project
+support.
+
+For general information and support, go to:
+	
+    http://ipw2200.sf.net/
+
+
+7.  License
+-----------------------------------------------
+
+  Copyright(c) 2003 - 2006 Intel Corporation. All rights reserved.
+
+  This program is free software; you can redistribute it and/or modify it 
+  under the terms of the GNU General Public License version 2 as 
+  published by the Free Software Foundation.
+  
+  This program is distributed in the hope that it will be useful, but WITHOUT 
+  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
+  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
+  more details.
+  
+  You should have received a copy of the GNU General Public License along with
+  this program; if not, write to the Free Software Foundation, Inc., 59 
+  Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+  
+  The full GNU General Public License is included in this distribution in the
+  file called LICENSE.
+  
+  Contact Information:
+  James P. Ketrenos <ipw2100-admin@linux.intel.com>
+  Intel Corporation, 5200 N.E. Elam Young Parkway, Hillsboro, OR 97124-6497
+
diff --git a/Documentation/networking/README.sb1000 b/Documentation/networking/README.sb1000
new file mode 100644
index 000000000..f92c2aac5
--- /dev/null
+++ b/Documentation/networking/README.sb1000
@@ -0,0 +1,207 @@
+sb1000 is a module network device driver for the General Instrument (also known
+as NextLevel) SURFboard1000 internal cable modem board.  This is an ISA card
+which is used by a number of cable TV companies to provide cable modem access.
+It's a one-way downstream-only cable modem, meaning that your upstream net link
+is provided by your regular phone modem.
+
+This driver was written by Franco Venturi <fventuri@mediaone.net>.  He deserves
+a great deal of thanks for this wonderful piece of code!
+
+-----------------------------------------------------------------------------
+
+Support for this device is now a part of the standard Linux kernel.  The
+driver source code file is drivers/net/sb1000.c.  In addition to this
+you will need:
+
+1.) The "cmconfig" program.  This is a utility which supplements "ifconfig"
+to configure the cable modem and network interface (usually called "cm0");
+and
+
+2.) Several PPP scripts which live in /etc/ppp to make connecting via your
+cable modem easy.
+
+   These utilities can be obtained from:
+
+      http://www.jacksonville.net/~fventuri/
+
+   in Franco's original source code distribution .tar.gz file.  Support for
+   the sb1000 driver can be found at:
+
+      http://web.archive.org/web/*/http://home.adelphia.net/~siglercm/sb1000.html
+      http://web.archive.org/web/*/http://linuxpower.cx/~cable/
+
+   along with these utilities.
+
+3.) The standard isapnp tools.  These are necessary to configure your SB1000
+card at boot time (or afterwards by hand) since it's a PnP card.
+
+   If you don't have these installed as a standard part of your Linux
+   distribution, you can find them at:
+
+      http://www.roestock.demon.co.uk/isapnptools/
+
+   or check your Linux distribution binary CD or their web site.  For help with
+   isapnp, pnpdump, or /etc/isapnp.conf, go to:
+
+      http://www.roestock.demon.co.uk/isapnptools/isapnpfaq.html
+
+-----------------------------------------------------------------------------
+
+To make the SB1000 card work, follow these steps:
+
+1.) Run `make config', or `make menuconfig', or `make xconfig', whichever
+you prefer, in the top kernel tree directory to set up your kernel
+configuration.  Make sure to say "Y" to "Prompt for development drivers"
+and to say "M" to the sb1000 driver.  Also say "Y" or "M" to all the standard
+networking questions to get TCP/IP and PPP networking support.
+
+2.) *BEFORE* you build the kernel, edit drivers/net/sb1000.c.  Make sure
+to redefine the value of READ_DATA_PORT to match the I/O address used
+by isapnp to access your PnP cards.  This is the value of READPORT in
+/etc/isapnp.conf or given by the output of pnpdump.
+
+3.) Build and install the kernel and modules as usual.
+
+4.) Boot your new kernel following the usual procedures.
+
+5.) Set up to configure the new SB1000 PnP card by capturing the output
+of "pnpdump" to a file and editing this file to set the correct I/O ports,
+IRQ, and DMA settings for all your PnP cards.  Make sure none of the settings
+conflict with one another.  Then test this configuration by running the
+"isapnp" command with your new config file as the input.  Check for
+errors and fix as necessary.  (As an aside, I use I/O ports 0x110 and
+0x310 and IRQ 11 for my SB1000 card and these work well for me.  YMMV.)
+Then save the finished config file as /etc/isapnp.conf for proper configuration
+on subsequent reboots.
+
+6.) Download the original file sb1000-1.1.2.tar.gz from Franco's site or one of
+the others referenced above.  As root, unpack it into a temporary directory and
+do a `make cmconfig' and then `install -c cmconfig /usr/local/sbin'.  Don't do
+`make install' because it expects to find all the utilities built and ready for
+installation, not just cmconfig.
+
+7.) As root, copy all the files under the ppp/ subdirectory in Franco's
+tar file into /etc/ppp, being careful not to overwrite any files that are
+already in there.  Then modify ppp@gi-on to set the correct login name,
+phone number, and frequency for the cable modem.  Also edit pap-secrets
+to specify your login name and password and any site-specific information
+you need.
+
+8.) Be sure to modify /etc/ppp/firewall to use ipchains instead of
+the older ipfwadm commands from the 2.0.x kernels.  There's a neat utility to
+convert ipfwadm commands to ipchains commands:
+
+   http://users.dhp.com/~whisper/ipfwadm2ipchains/
+
+You may also wish to modify the firewall script to implement a different
+firewalling scheme.
+
+9.) Start the PPP connection via the script /etc/ppp/ppp@gi-on.  You must be
+root to do this.  It's better to use a utility like sudo to execute
+frequently used commands like this with root permissions if possible.  If you
+connect successfully the cable modem interface will come up and you'll see a
+driver message like this at the console:
+
+         cm0: sb1000 at (0x110,0x310), csn 1, S/N 0x2a0d16d8, IRQ 11.
+         sb1000.c:v1.1.2 6/01/98 (fventuri@mediaone.net)
+
+The "ifconfig" command should show two new interfaces, ppp0 and cm0.
+The command "cmconfig cm0" will give you information about the cable modem
+interface.
+
+10.) Try pinging a site via `ping -c 5 www.yahoo.com', for example.  You should
+see packets received.
+
+11.) If you can't get site names (like www.yahoo.com) to resolve into
+IP addresses (like 204.71.200.67), be sure your /etc/resolv.conf file
+has no syntax errors and has the right nameserver IP addresses in it.
+If this doesn't help, try something like `ping -c 5 204.71.200.67' to
+see if the networking is running but the DNS resolution is where the
+problem lies.
+
+12.) If you still have problems, go to the support web sites mentioned above
+and read the information and documentation there.
+
+-----------------------------------------------------------------------------
+
+Common problems:
+
+1.) Packets go out on the ppp0 interface but don't come back on the cm0
+interface.  It looks like I'm connected but I can't even ping any
+numerical IP addresses.  (This happens predominantly on Debian systems due
+to a default boot-time configuration script.)
+
+Solution -- As root `echo 0 > /proc/sys/net/ipv4/conf/cm0/rp_filter' so it
+can share the same IP address as the ppp0 interface.  Note that this
+command should probably be added to the /etc/ppp/cablemodem script
+*right*between* the "/sbin/ifconfig" and "/sbin/cmconfig" commands.
+You may need to do this to /proc/sys/net/ipv4/conf/ppp0/rp_filter as well.
+If you do this to /proc/sys/net/ipv4/conf/default/rp_filter on each reboot
+(in rc.local or some such) then any interfaces can share the same IP
+addresses.
+
+2.) I get "unresolved symbol" error messages on executing `insmod sb1000.o'.
+
+Solution -- You probably have a non-matching kernel source tree and
+/usr/include/linux and /usr/include/asm header files.  Make sure you
+install the correct versions of the header files in these two directories.
+Then rebuild and reinstall the kernel.
+
+3.) When isapnp runs it reports an error, and my SB1000 card isn't working.
+
+Solution -- There's a problem with later versions of isapnp using the "(CHECK)"
+option in the lines that allocate the two I/O addresses for the SB1000 card.
+This first popped up on RH 6.0.  Delete "(CHECK)" for the SB1000 I/O addresses.
+Make sure they don't conflict with any other pieces of hardware first!  Then
+rerun isapnp and go from there.
+
+4.) I can't execute the /etc/ppp/ppp@gi-on file.
+
+Solution -- As root do `chmod ug+x /etc/ppp/ppp@gi-on'.
+
+5.) The firewall script isn't working (with 2.2.x and higher kernels).
+
+Solution -- Use the ipfwadm2ipchains script referenced above to convert the
+/etc/ppp/firewall script from the deprecated ipfwadm commands to ipchains.
+
+6.) I'm getting *tons* of firewall deny messages in the /var/kern.log,
+/var/messages, and/or /var/syslog files, and they're filling up my /var
+partition!!!
+
+Solution -- First, tell your ISP that you're receiving DoS (Denial of Service)
+and/or portscanning (UDP connection attempts) attacks!  Look over the deny
+messages to figure out what the attack is and where it's coming from.  Next,
+edit /etc/ppp/cablemodem and make sure the ",nobroadcast" option is turned on
+to the "cmconfig" command (uncomment that line).  If you're not receiving these
+denied packets on your broadcast interface (IP address xxx.yyy.zzz.255
+typically), then someone is attacking your machine in particular.  Be careful
+out there....
+
+7.) Everything seems to work fine but my computer locks up after a while
+(and typically during a lengthy download through the cable modem)!
+
+Solution -- You may need to add a short delay in the driver to 'slow down' the
+SURFboard because your PC might not be able to keep up with the transfer rate
+of the SB1000. To do this, it's probably best to download Franco's
+sb1000-1.1.2.tar.gz archive and build and install sb1000.o manually.  You'll
+want to edit the 'Makefile' and look for the 'SB1000_DELAY'
+define.  Uncomment those 'CFLAGS' lines (and comment out the default ones)
+and try setting the delay to something like 60 microseconds with:
+'-DSB1000_DELAY=60'.  Then do `make' and as root `make install' and try
+it out.  If it still doesn't work or you like playing with the driver, you may
+try other numbers.  Remember though that the higher the delay, the slower the
+driver (which slows down the rest of the PC too when it is actively
+used). Thanks to Ed Daiga for this tip!
+
+-----------------------------------------------------------------------------
+
+Credits:  This README came from Franco Venturi's original README file which is
+still supplied with his driver .tar.gz archive.  I and all other sb1000 users
+owe Franco a tremendous "Thank you!"  Additional thanks goes to Carl Patten
+and Ralph Bonnell who are now managing the Linux SB1000 web site, and to
+the SB1000 users who reported and helped debug the common problems listed
+above.
+
+
+					Clemmitt Sigler
+					csigler@vt.edu
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
new file mode 100644
index 000000000..ff929cfab
--- /dev/null
+++ b/Documentation/networking/af_xdp.rst
@@ -0,0 +1,312 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======
+AF_XDP
+======
+
+Overview
+========
+
+AF_XDP is an address family that is optimized for high performance
+packet processing.
+
+This document assumes that the reader is familiar with BPF and XDP. If
+not, the Cilium project has an excellent reference guide at
+http://cilium.readthedocs.io/en/latest/bpf/.
+
+Using the XDP_REDIRECT action from an XDP program, the program can
+redirect ingress frames to other XDP enabled netdevs, using the
+bpf_redirect_map() function. AF_XDP sockets enable the possibility for
+XDP programs to redirect frames to a memory buffer in a user-space
+application.
+
+An AF_XDP socket (XSK) is created with the normal socket()
+syscall. Associated with each XSK are two rings: the RX ring and the
+TX ring. A socket can receive packets on the RX ring and it can send
+packets on the TX ring. These rings are registered and sized with the
+setsockopts XDP_RX_RING and XDP_TX_RING, respectively. It is mandatory
+to have at least one of these rings for each socket. An RX or TX
+descriptor ring points to a data buffer in a memory area called a
+UMEM. RX and TX can share the same UMEM so that a packet does not have
+to be copied between RX and TX. Moreover, if a packet needs to be kept
+for a while due to a possible retransmit, the descriptor that points
+to that packet can be changed to point to another and reused right
+away. This again avoids copying data.
+
+The UMEM consists of a number of equally sized chunks. A descriptor in
+one of the rings references a frame by referencing its addr. The addr
+is simply an offset within the entire UMEM region. The user space
+allocates memory for this UMEM using whatever means it feels is most
+appropriate (malloc, mmap, huge pages, etc). This memory area is then
+registered with the kernel using the new setsockopt XDP_UMEM_REG. The
+UMEM also has two rings: the FILL ring and the COMPLETION ring. The
+fill ring is used by the application to send down addr for the kernel
+to fill in with RX packet data. References to these frames will then
+appear in the RX ring once each packet has been received. The
+completion ring, on the other hand, contains frame addr that the
+kernel has transmitted completely and can now be used again by user
+space, for either TX or RX. Thus, the frame addrs appearing in the
+completion ring are addrs that were previously transmitted using the
+TX ring. In summary, the RX and FILL rings are used for the RX path
+and the TX and COMPLETION rings are used for the TX path.
+
+The socket is then finally bound with a bind() call to a device and a
+specific queue id on that device, and it is not until bind is
+completed that traffic starts to flow.
+
+The UMEM can be shared between processes, if desired. If a process
+wants to do this, it simply skips the registration of the UMEM and its
+corresponding two rings, sets the XDP_SHARED_UMEM flag in the bind
+call and submits the XSK of the process it would like to share UMEM
+with as well as its own newly created XSK socket. The new process will
+then receive frame addr references in its own RX ring that point to
+this shared UMEM. Note that since the ring structures are
+single-consumer / single-producer (for performance reasons), the new
+process has to create its own socket with associated RX and TX rings,
+since it cannot share this with the other process. This is also the
+reason that there is only one set of FILL and COMPLETION rings per
+UMEM. It is the responsibility of a single process to handle the UMEM.
+
+How is then packets distributed from an XDP program to the XSKs? There
+is a BPF map called XSKMAP (or BPF_MAP_TYPE_XSKMAP in full). The
+user-space application can place an XSK at an arbitrary place in this
+map. The XDP program can then redirect a packet to a specific index in
+this map and at this point XDP validates that the XSK in that map was
+indeed bound to that device and ring number. If not, the packet is
+dropped. If the map is empty at that index, the packet is also
+dropped. This also means that it is currently mandatory to have an XDP
+program loaded (and one XSK in the XSKMAP) to be able to get any
+traffic to user space through the XSK.
+
+AF_XDP can operate in two different modes: XDP_SKB and XDP_DRV. If the
+driver does not have support for XDP, or XDP_SKB is explicitly chosen
+when loading the XDP program, XDP_SKB mode is employed that uses SKBs
+together with the generic XDP support and copies out the data to user
+space. A fallback mode that works for any network device. On the other
+hand, if the driver has support for XDP, it will be used by the AF_XDP
+code to provide better performance, but there is still a copy of the
+data into user space.
+
+Concepts
+========
+
+In order to use an AF_XDP socket, a number of associated objects need
+to be setup.
+
+Jonathan Corbet has also written an excellent article on LWN,
+"Accelerating networking with AF_XDP". It can be found at
+https://lwn.net/Articles/750845/.
+
+UMEM
+----
+
+UMEM is a region of virtual contiguous memory, divided into
+equal-sized frames. An UMEM is associated to a netdev and a specific
+queue id of that netdev. It is created and configured (chunk size,
+headroom, start address and size) by using the XDP_UMEM_REG setsockopt
+system call. A UMEM is bound to a netdev and queue id, via the bind()
+system call.
+
+An AF_XDP is socket linked to a single UMEM, but one UMEM can have
+multiple AF_XDP sockets. To share an UMEM created via one socket A,
+the next socket B can do this by setting the XDP_SHARED_UMEM flag in
+struct sockaddr_xdp member sxdp_flags, and passing the file descriptor
+of A to struct sockaddr_xdp member sxdp_shared_umem_fd.
+
+The UMEM has two single-producer/single-consumer rings, that are used
+to transfer ownership of UMEM frames between the kernel and the
+user-space application.
+
+Rings
+-----
+
+There are a four different kind of rings: Fill, Completion, RX and
+TX. All rings are single-producer/single-consumer, so the user-space
+application need explicit synchronization of multiple
+processes/threads are reading/writing to them.
+
+The UMEM uses two rings: Fill and Completion. Each socket associated
+with the UMEM must have an RX queue, TX queue or both. Say, that there
+is a setup with four sockets (all doing TX and RX). Then there will be
+one Fill ring, one Completion ring, four TX rings and four RX rings.
+
+The rings are head(producer)/tail(consumer) based rings. A producer
+writes the data ring at the index pointed out by struct xdp_ring
+producer member, and increasing the producer index. A consumer reads
+the data ring at the index pointed out by struct xdp_ring consumer
+member, and increasing the consumer index.
+
+The rings are configured and created via the _RING setsockopt system
+calls and mmapped to user-space using the appropriate offset to mmap()
+(XDP_PGOFF_RX_RING, XDP_PGOFF_TX_RING, XDP_UMEM_PGOFF_FILL_RING and
+XDP_UMEM_PGOFF_COMPLETION_RING).
+
+The size of the rings need to be of size power of two.
+
+UMEM Fill Ring
+~~~~~~~~~~~~~~
+
+The Fill ring is used to transfer ownership of UMEM frames from
+user-space to kernel-space. The UMEM addrs are passed in the ring. As
+an example, if the UMEM is 64k and each chunk is 4k, then the UMEM has
+16 chunks and can pass addrs between 0 and 64k.
+
+Frames passed to the kernel are used for the ingress path (RX rings).
+
+The user application produces UMEM addrs to this ring. Note that the
+kernel will mask the incoming addr. E.g. for a chunk size of 2k, the
+log2(2048) LSB of the addr will be masked off, meaning that 2048, 2050
+and 3000 refers to the same chunk.
+
+
+UMEM Completetion Ring
+~~~~~~~~~~~~~~~~~~~~~~
+
+The Completion Ring is used transfer ownership of UMEM frames from
+kernel-space to user-space. Just like the Fill ring, UMEM indicies are
+used.
+
+Frames passed from the kernel to user-space are frames that has been
+sent (TX ring) and can be used by user-space again.
+
+The user application consumes UMEM addrs from this ring.
+
+
+RX Ring
+~~~~~~~
+
+The RX ring is the receiving side of a socket. Each entry in the ring
+is a struct xdp_desc descriptor. The descriptor contains UMEM offset
+(addr) and the length of the data (len).
+
+If no frames have been passed to kernel via the Fill ring, no
+descriptors will (or can) appear on the RX ring.
+
+The user application consumes struct xdp_desc descriptors from this
+ring.
+
+TX Ring
+~~~~~~~
+
+The TX ring is used to send frames. The struct xdp_desc descriptor is
+filled (index, length and offset) and passed into the ring.
+
+To start the transfer a sendmsg() system call is required. This might
+be relaxed in the future.
+
+The user application produces struct xdp_desc descriptors to this
+ring.
+
+XSKMAP / BPF_MAP_TYPE_XSKMAP
+----------------------------
+
+On XDP side there is a BPF map type BPF_MAP_TYPE_XSKMAP (XSKMAP) that
+is used in conjunction with bpf_redirect_map() to pass the ingress
+frame to a socket.
+
+The user application inserts the socket into the map, via the bpf()
+system call.
+
+Note that if an XDP program tries to redirect to a socket that does
+not match the queue configuration and netdev, the frame will be
+dropped. E.g. an AF_XDP socket is bound to netdev eth0 and
+queue 17. Only the XDP program executing for eth0 and queue 17 will
+successfully pass data to the socket. Please refer to the sample
+application (samples/bpf/) in for an example.
+
+Usage
+=====
+
+In order to use AF_XDP sockets there are two parts needed. The
+user-space application and the XDP program. For a complete setup and
+usage example, please refer to the sample application. The user-space
+side is xdpsock_user.c and the XDP side xdpsock_kern.c.
+
+Naive ring dequeue and enqueue could look like this::
+
+    // struct xdp_rxtx_ring {
+    // 	__u32 *producer;
+    // 	__u32 *consumer;
+    // 	struct xdp_desc *desc;
+    // };
+
+    // struct xdp_umem_ring {
+    // 	__u32 *producer;
+    // 	__u32 *consumer;
+    // 	__u64 *desc;
+    // };
+
+    // typedef struct xdp_rxtx_ring RING;
+    // typedef struct xdp_umem_ring RING;
+
+    // typedef struct xdp_desc RING_TYPE;
+    // typedef __u64 RING_TYPE;
+
+    int dequeue_one(RING *ring, RING_TYPE *item)
+    {
+        __u32 entries = *ring->producer - *ring->consumer;
+
+        if (entries == 0)
+            return -1;
+
+        // read-barrier!
+
+        *item = ring->desc[*ring->consumer & (RING_SIZE - 1)];
+        (*ring->consumer)++;
+        return 0;
+    }
+
+    int enqueue_one(RING *ring, const RING_TYPE *item)
+    {
+        u32 free_entries = RING_SIZE - (*ring->producer - *ring->consumer);
+
+        if (free_entries == 0)
+            return -1;
+
+        ring->desc[*ring->producer & (RING_SIZE - 1)] = *item;
+
+        // write-barrier!
+
+        (*ring->producer)++;
+        return 0;
+    }
+
+
+For a more optimized version, please refer to the sample application.
+
+Sample application
+==================
+
+There is a xdpsock benchmarking/test application included that
+demonstrates how to use AF_XDP sockets with both private and shared
+UMEMs. Say that you would like your UDP traffic from port 4242 to end
+up in queue 16, that we will enable AF_XDP on. Here, we use ethtool
+for this::
+
+      ethtool -N p3p2 rx-flow-hash udp4 fn
+      ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 \
+          action 16
+
+Running the rxdrop benchmark in XDP_DRV mode can then be done
+using::
+
+      samples/bpf/xdpsock -i p3p2 -q 16 -r -N
+
+For XDP_SKB mode, use the switch "-S" instead of "-N" and all options
+can be displayed with "-h", as usual.
+
+Credits
+=======
+
+- Björn Töpel (AF_XDP core)
+- Magnus Karlsson (AF_XDP core)
+- Alexander Duyck
+- Alexei Starovoitov
+- Daniel Borkmann
+- Jesper Dangaard Brouer
+- John Fastabend
+- Jonathan Corbet (LWN coverage)
+- Michael S. Tsirkin
+- Qi Z Zhang
+- Willem de Bruijn
+
diff --git a/Documentation/networking/alias.rst b/Documentation/networking/alias.rst
new file mode 100644
index 000000000..af7c5ee92
--- /dev/null
+++ b/Documentation/networking/alias.rst
@@ -0,0 +1,49 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+IP-Aliasing
+===========
+
+IP-aliases are an obsolete way to manage multiple IP-addresses/masks
+per interface. Newer tools such as iproute2 support multiple
+address/prefixes per interface, but aliases are still supported
+for backwards compatibility.
+
+An alias is formed by adding a colon and a string when running ifconfig.
+This string is usually numeric, but this is not a must.
+
+
+Alias creation
+==============
+
+Alias creation is done by 'magic' interface naming: eg. to create a
+200.1.1.1 alias for eth0 ...
+::
+
+  # ifconfig eth0:0 200.1.1.1  etc,etc....
+	~~ -> request alias #0 creation (if not yet exists) for eth0
+
+The corresponding route is also set up by this command.  Please note:
+The route always points to the base interface.
+
+
+Alias deletion
+==============
+
+The alias is removed by shutting the alias down::
+
+  # ifconfig eth0:0 down
+	~~~~~~~~~~ -> will delete alias
+
+
+Alias (re-)configuring
+======================
+
+Aliases are not real devices, but programs should be able to configure
+and refer to them as usual (ifconfig, route, etc).
+
+
+Relationship with main device
+=============================
+
+If the base device is shut down the added aliases will be deleted too.
diff --git a/Documentation/networking/altera_tse.txt b/Documentation/networking/altera_tse.txt
new file mode 100644
index 000000000..50b8589d1
--- /dev/null
+++ b/Documentation/networking/altera_tse.txt
@@ -0,0 +1,263 @@
+       Altera Triple-Speed Ethernet MAC driver
+
+Copyright (C) 2008-2014 Altera Corporation
+
+This is the driver for the Altera Triple-Speed Ethernet (TSE) controllers
+using the SGDMA and MSGDMA soft DMA IP components. The driver uses the
+platform bus to obtain component resources. The designs used to test this
+driver were built for a Cyclone(R) V SOC FPGA board, a Cyclone(R) V FPGA board,
+and tested with ARM and NIOS processor hosts separately. The anticipated use
+cases are simple communications between an embedded system and an external peer
+for status and simple configuration of the embedded system.
+
+For more information visit www.altera.com and www.rocketboards.org. Support
+forums for the driver may be found on www.rocketboards.org, and a design used
+to test this driver may be found there as well. Support is also available from
+the maintainer of this driver, found in MAINTAINERS.
+
+The Triple-Speed Ethernet, SGDMA, and MSGDMA components are all soft IP
+components that can be assembled and built into an FPGA using the Altera
+Quartus toolchain. Quartus 13.1 and 14.0 were used to build the design that
+this driver was tested against. The sopc2dts tool is used to create the
+device tree for the driver, and may be found at rocketboards.org.
+
+The driver probe function examines the device tree and determines if the
+Triple-Speed Ethernet instance is using an SGDMA or MSGDMA component. The
+probe function then installs the appropriate set of DMA routines to
+initialize, setup transmits, receives, and interrupt handling primitives for
+the respective configurations.
+
+The SGDMA component is to be deprecated in the near future (over the next 1-2
+years as of this writing in early 2014) in favor of the MSGDMA component.
+SGDMA support is included for existing designs and reference in case a
+developer wishes to support their own soft DMA logic and driver support. Any
+new designs should not use the SGDMA.
+
+The SGDMA supports only a single transmit or receive operation at a time, and
+therefore will not perform as well compared to the MSGDMA soft IP. Please
+visit www.altera.com for known, documented SGDMA errata.
+
+Scatter-gather DMA is not supported by the SGDMA or MSGDMA at this time.
+Scatter-gather DMA will be added to a future maintenance update to this
+driver.
+
+Jumbo frames are not supported at this time.
+
+The driver limits PHY operations to 10/100Mbps, and has not yet been fully
+tested for 1Gbps. This support will be added in a future maintenance update.
+
+1) Kernel Configuration
+The kernel configuration option is ALTERA_TSE:
+ Device Drivers ---> Network device support ---> Ethernet driver support --->
+ Altera Triple-Speed Ethernet MAC support (ALTERA_TSE)
+
+2) Driver parameters list:
+	debug: message level (0: no output, 16: all);
+	dma_rx_num: Number of descriptors in the RX list (default is 64);
+	dma_tx_num: Number of descriptors in the TX list (default is 64).
+
+3) Command line options
+Driver parameters can be also passed in command line by using:
+	altera_tse=dma_rx_num:128,dma_tx_num:512
+
+4) Driver information and notes
+
+4.1) Transmit process
+When the driver's transmit routine is called by the kernel, it sets up a
+transmit descriptor by calling the underlying DMA transmit routine (SGDMA or
+MSGDMA), and initiates a transmit operation. Once the transmit is complete, an
+interrupt is driven by the transmit DMA logic. The driver handles the transmit
+completion in the context of the interrupt handling chain by recycling
+resource required to send and track the requested transmit operation.
+
+4.2) Receive process
+The driver will post receive buffers to the receive DMA logic during driver
+initialization. Receive buffers may or may not be queued depending upon the
+underlying DMA logic (MSGDMA is able queue receive buffers, SGDMA is not able
+to queue receive buffers to the SGDMA receive logic). When a packet is
+received, the DMA logic generates an interrupt. The driver handles a receive
+interrupt by obtaining the DMA receive logic status, reaping receive
+completions until no more receive completions are available.
+
+4.3) Interrupt Mitigation
+The driver is able to mitigate the number of its DMA interrupts
+using NAPI for receive operations. Interrupt mitigation is not yet supported
+for transmit operations, but will be added in a future maintenance release.
+
+4.4) Ethtool support
+Ethtool is supported. Driver statistics and internal errors can be taken using:
+ethtool -S ethX command. It is possible to dump registers etc.
+
+4.5) PHY Support
+The driver is compatible with PAL to work with PHY and GPHY devices.
+
+4.7) List of source files:
+ o Kconfig
+ o Makefile
+ o altera_tse_main.c: main network device driver
+ o altera_tse_ethtool.c: ethtool support
+ o altera_tse.h: private driver structure and common definitions
+ o altera_msgdma.h: MSGDMA implementation function definitions
+ o altera_sgdma.h: SGDMA implementation function definitions
+ o altera_msgdma.c: MSGDMA implementation
+ o altera_sgdma.c: SGDMA implementation
+ o altera_sgdmahw.h: SGDMA register and descriptor definitions
+ o altera_msgdmahw.h: MSGDMA register and descriptor definitions
+ o altera_utils.c: Driver utility functions
+ o altera_utils.h: Driver utility function definitions
+
+5) Debug Information
+
+The driver exports debug information such as internal statistics,
+debug information, MAC and DMA registers etc.
+
+A user may use the ethtool support to get statistics:
+e.g. using: ethtool -S ethX (that shows the statistics counters)
+or sees the MAC registers: e.g. using: ethtool -d ethX
+
+The developer can also use the "debug" module parameter to get
+further debug information.
+
+6) Statistics Support
+
+The controller and driver support a mix of IEEE standard defined statistics,
+RFC defined statistics, and driver or Altera defined statistics. The four
+specifications containing the standard definitions for these statistics are
+as follows:
+
+ o IEEE 802.3-2012 - IEEE Standard for Ethernet.
+ o RFC 2863 found at http://www.rfc-editor.org/rfc/rfc2863.txt.
+ o RFC 2819 found at http://www.rfc-editor.org/rfc/rfc2819.txt.
+ o Altera Triple Speed Ethernet User Guide, found at http://www.altera.com
+
+The statistics supported by the TSE and the device driver are as follows:
+
+"tx_packets" is equivalent to aFramesTransmittedOK defined in IEEE 802.3-2012,
+Section 5.2.2.1.2. This statistics is the count of frames that are successfully
+transmitted.
+
+"rx_packets" is equivalent to aFramesReceivedOK defined in IEEE 802.3-2012,
+Section 5.2.2.1.5. This statistic is the count of frames that are successfully
+received. This count does not include any error packets such as CRC errors,
+length errors, or alignment errors.
+
+"rx_crc_errors" is equivalent to aFrameCheckSequenceErrors defined in IEEE
+802.3-2012, Section 5.2.2.1.6. This statistic is the count of frames that are
+an integral number of bytes in length and do not pass the CRC test as the frame
+is received.
+
+"rx_align_errors" is equivalent to aAlignmentErrors defined in IEEE 802.3-2012,
+Section 5.2.2.1.7. This statistic is the count of frames that are not an
+integral number of bytes in length and do not pass the CRC test as the frame is
+received.
+
+"tx_bytes" is equivalent to aOctetsTransmittedOK defined in IEEE 802.3-2012,
+Section 5.2.2.1.8. This statistic is the count of data and pad bytes
+successfully transmitted from the interface.
+
+"rx_bytes" is equivalent to aOctetsReceivedOK defined in IEEE 802.3-2012,
+Section 5.2.2.1.14. This statistic is the count of data and pad bytes
+successfully received by the controller.
+
+"tx_pause" is equivalent to aPAUSEMACCtrlFramesTransmitted defined in IEEE
+802.3-2012, Section 30.3.4.2. This statistic is a count of PAUSE frames
+transmitted from the network controller.
+
+"rx_pause" is equivalent to aPAUSEMACCtrlFramesReceived defined in IEEE
+802.3-2012, Section 30.3.4.3. This statistic is a count of PAUSE frames
+received by the network controller.
+
+"rx_errors" is equivalent to ifInErrors defined in RFC 2863. This statistic is
+a count of the number of packets received containing errors that prevented the
+packet from being delivered to a higher level protocol.
+
+"tx_errors" is equivalent to ifOutErrors defined in RFC 2863. This statistic
+is a count of the number of packets that could not be transmitted due to errors.
+
+"rx_unicast" is equivalent to ifInUcastPkts defined in RFC 2863. This
+statistic is a count of the number of packets received that were not addressed
+to the broadcast address or a multicast group.
+
+"rx_multicast" is equivalent to ifInMulticastPkts defined in RFC 2863. This
+statistic is a count of the number of packets received that were addressed to
+a multicast address group.
+
+"rx_broadcast" is equivalent to ifInBroadcastPkts defined in RFC 2863. This
+statistic is a count of the number of packets received that were addressed to
+the broadcast address.
+
+"tx_discards" is equivalent to ifOutDiscards defined in RFC 2863. This
+statistic is the number of outbound packets not transmitted even though an
+error was not detected. An example of a reason this might occur is to free up
+internal buffer space.
+
+"tx_unicast" is equivalent to ifOutUcastPkts defined in RFC 2863. This
+statistic counts the number of packets transmitted that were not addressed to
+a multicast group or broadcast address.
+
+"tx_multicast" is equivalent to ifOutMulticastPkts defined in RFC 2863. This
+statistic counts the number of packets transmitted that were addressed to a
+multicast group.
+
+"tx_broadcast" is equivalent to ifOutBroadcastPkts defined in RFC 2863. This
+statistic counts the number of packets transmitted that were addressed to a
+broadcast address.
+
+"ether_drops" is equivalent to etherStatsDropEvents defined in RFC 2819.
+This statistic counts the number of packets dropped due to lack of internal
+controller resources.
+
+"rx_total_bytes" is equivalent to etherStatsOctets defined in RFC 2819.
+This statistic counts the total number of bytes received by the controller,
+including error and discarded packets.
+
+"rx_total_packets" is equivalent to etherStatsPkts defined in RFC 2819.
+This statistic counts the total number of packets received by the controller,
+including error, discarded, unicast, multicast, and broadcast packets.
+
+"rx_undersize" is equivalent to etherStatsUndersizePkts defined in RFC 2819.
+This statistic counts the number of correctly formed packets received less
+than 64 bytes long.
+
+"rx_oversize" is equivalent to etherStatsOversizePkts defined in RFC 2819.
+This statistic counts the number of correctly formed packets greater than 1518
+bytes long.
+
+"rx_64_bytes" is equivalent to etherStatsPkts64Octets defined in RFC 2819.
+This statistic counts the total number of packets received that were 64 octets
+in length.
+
+"rx_65_127_bytes" is equivalent to etherStatsPkts65to127Octets defined in RFC
+2819. This statistic counts the total number of packets received that were
+between 65 and 127 octets in length inclusive.
+
+"rx_128_255_bytes" is equivalent to etherStatsPkts128to255Octets defined in
+RFC 2819. This statistic is the total number of packets received that were
+between 128 and 255 octets in length inclusive.
+
+"rx_256_511_bytes" is equivalent to etherStatsPkts256to511Octets defined in
+RFC 2819. This statistic is the total number of packets received that were
+between 256 and 511 octets in length inclusive.
+
+"rx_512_1023_bytes" is equivalent to etherStatsPkts512to1023Octets defined in
+RFC 2819. This statistic is the total number of packets received that were
+between 512 and 1023 octets in length inclusive.
+
+"rx_1024_1518_bytes" is equivalent to etherStatsPkts1024to1518Octets define
+in RFC 2819. This statistic is the total number of packets received that were
+between 1024 and 1518 octets in length inclusive.
+
+"rx_gte_1519_bytes" is a statistic defined specific to the behavior of the
+Altera TSE. This statistics counts the number of received good and errored
+frames between the length of 1519 and the maximum frame length configured
+in the frm_length register. See the Altera TSE User Guide for More details.
+
+"rx_jabbers" is equivalent to etherStatsJabbers defined in RFC 2819. This
+statistic is the total number of packets received that were longer than 1518
+octets, and had either a bad CRC with an integral number of octets (CRC Error)
+or a bad CRC with a non-integral number of octets (Alignment Error).
+
+"rx_runts" is equivalent to etherStatsFragments defined in RFC 2819. This
+statistic is the total number of packets received that were less than 64 octets
+in length and had either a bad CRC with an integral number of octets (CRC
+error) or a bad CRC with a non-integral number of octets (Alignment Error).
diff --git a/Documentation/networking/arcnet-hardware.txt b/Documentation/networking/arcnet-hardware.txt
new file mode 100644
index 000000000..731de4115
--- /dev/null
+++ b/Documentation/networking/arcnet-hardware.txt
@@ -0,0 +1,3133 @@
+ 
+-----------------------------------------------------------------------------
+1) This file is a supplement to arcnet.txt.  Please read that for general
+   driver configuration help.
+-----------------------------------------------------------------------------
+2) This file is no longer Linux-specific.  It should probably be moved out of
+   the kernel sources.  Ideas?
+-----------------------------------------------------------------------------
+
+Because so many people (myself included) seem to have obtained ARCnet cards
+without manuals, this file contains a quick introduction to ARCnet hardware,
+some cabling tips, and a listing of all jumper settings I can find. Please
+e-mail apenwarr@worldvisions.ca with any settings for your particular card,
+or any other information you have!
+
+
+INTRODUCTION TO ARCNET
+----------------------
+
+ARCnet is a network type which works in a way similar to popular Ethernet
+networks but which is also different in some very important ways.
+
+First of all, you can get ARCnet cards in at least two speeds: 2.5 Mbps
+(slower than Ethernet) and 100 Mbps (faster than normal Ethernet).  In fact,
+there are others as well, but these are less common.  The different hardware
+types, as far as I'm aware, are not compatible and so you cannot wire a
+100 Mbps card to a 2.5 Mbps card, and so on.  From what I hear, my driver does
+work with 100 Mbps cards, but I haven't been able to verify this myself,
+since I only have the 2.5 Mbps variety.  It is probably not going to saturate
+your 100 Mbps card.  Stop complaining. :)
+
+You also cannot connect an ARCnet card to any kind of Ethernet card and
+expect it to work.  
+
+There are two "types" of ARCnet - STAR topology and BUS topology.  This
+refers to how the cards are meant to be wired together.  According to most
+available documentation, you can only connect STAR cards to STAR cards and
+BUS cards to BUS cards.  That makes sense, right?  Well, it's not quite
+true; see below under "Cabling."
+
+Once you get past these little stumbling blocks, ARCnet is actually quite a
+well-designed standard.  It uses something called "modified token passing"
+which makes it completely incompatible with so-called "Token Ring" cards,
+but which makes transfers much more reliable than Ethernet does.  In fact,
+ARCnet will guarantee that a packet arrives safely at the destination, and
+even if it can't possibly be delivered properly (ie. because of a cable
+break, or because the destination computer does not exist) it will at least
+tell the sender about it.
+
+Because of the carefully defined action of the "token", it will always make
+a pass around the "ring" within a maximum length of time.  This makes it
+useful for realtime networks.
+
+In addition, all known ARCnet cards have an (almost) identical programming
+interface.  This means that with one ARCnet driver you can support any
+card, whereas with Ethernet each manufacturer uses what is sometimes a
+completely different programming interface, leading to a lot of different,
+sometimes very similar, Ethernet drivers.  Of course, always using the same
+programming interface also means that when high-performance hardware
+facilities like PCI bus mastering DMA appear, it's hard to take advantage of
+them.  Let's not go into that.
+
+One thing that makes ARCnet cards difficult to program for, however, is the
+limit on their packet sizes; standard ARCnet can only send packets that are
+up to 508 bytes in length.  This is smaller than the Internet "bare minimum"
+of 576 bytes, let alone the Ethernet MTU of 1500.  To compensate, an extra
+level of encapsulation is defined by RFC1201, which I call "packet
+splitting," that allows "virtual packets" to grow as large as 64K each,
+although they are generally kept down to the Ethernet-style 1500 bytes.
+
+For more information on the advantages and disadvantages (mostly the
+advantages) of ARCnet networks, you might try the "ARCnet Trade Association"
+WWW page:
+	http://www.arcnet.com
+
+
+CABLING ARCNET NETWORKS
+-----------------------
+
+This section was rewritten by 
+        Vojtech Pavlik     <vojtech@suse.cz>
+using information from several people, including:
+        Avery Pennraun     <apenwarr@worldvisions.ca>
+ 	Stephen A. Wood    <saw@hallc1.cebaf.gov>
+ 	John Paul Morrison <jmorriso@bogomips.ee.ubc.ca>
+ 	Joachim Koenig     <jojo@repas.de>
+and Avery touched it up a bit, at Vojtech's request.
+
+ARCnet (the classic 2.5 Mbps version) can be connected by two different
+types of cabling: coax and twisted pair.  The other ARCnet-type networks
+(100 Mbps TCNS and 320 kbps - 32 Mbps ARCnet Plus) use different types of
+cabling (Type1, Fiber, C1, C4, C5).
+
+For a coax network, you "should" use 93 Ohm RG-62 cable.  But other cables
+also work fine, because ARCnet is a very stable network. I personally use 75
+Ohm TV antenna cable.
+
+Cards for coax cabling are shipped in two different variants: for BUS and
+STAR network topologies.  They are mostly the same.  The only difference
+lies in the hybrid chip installed.  BUS cards use high impedance output,
+while STAR use low impedance.  Low impedance card (STAR) is electrically
+equal to a high impedance one with a terminator installed.
+
+Usually, the ARCnet networks are built up from STAR cards and hubs.  There
+are two types of hubs - active and passive.  Passive hubs are small boxes
+with four BNC connectors containing four 47 Ohm resistors:
+
+   |         | wires
+   R         + junction
+-R-+-R-      R 47 Ohm resistors
+   R
+   |
+
+The shielding is connected together.  Active hubs are much more complicated;
+they are powered and contain electronics to amplify the signal and send it
+to other segments of the net.  They usually have eight connectors.  Active
+hubs come in two variants - dumb and smart.  The dumb variant just
+amplifies, but the smart one decodes to digital and encodes back all packets
+coming through.  This is much better if you have several hubs in the net,
+since many dumb active hubs may worsen the signal quality.
+
+And now to the cabling.  What you can connect together:
+
+1. A card to a card.  This is the simplest way of creating a 2-computer
+   network.
+
+2. A card to a passive hub.  Remember that all unused connectors on the hub
+   must be properly terminated with 93 Ohm (or something else if you don't
+   have the right ones) terminators.
+   	(Avery's note: oops, I didn't know that.  Mine (TV cable) works
+	anyway, though.)
+
+3. A card to an active hub.  Here is no need to terminate the unused
+   connectors except some kind of aesthetic feeling.  But, there may not be
+   more than eleven active hubs between any two computers.  That of course
+   doesn't limit the number of active hubs on the network.
+   
+4. An active hub to another.
+
+5. An active hub to passive hub.
+
+Remember that you cannot connect two passive hubs together.  The power loss
+implied by such a connection is too high for the net to operate reliably.
+
+An example of a typical ARCnet network:
+
+           R                     S - STAR type card              
+    S------H--------A-------S    R - Terminator
+           |        |            H - Hub                         
+           |        |            A - Active hub                  
+           |   S----H----S                                       
+           S        |                                            
+                    |                                            
+                    S                                            
+                                                                          
+The BUS topology is very similar to the one used by Ethernet.  The only
+difference is in cable and terminators: they should be 93 Ohm.  Ethernet
+uses 50 Ohm impedance. You use T connectors to put the computers on a single
+line of cable, the bus. You have to put terminators at both ends of the
+cable. A typical BUS ARCnet network looks like:
+
+    RT----T------T------T------T------TR
+     B    B      B      B      B      B
+
+  B - BUS type card
+  R - Terminator
+  T - T connector
+
+But that is not all! The two types can be connected together.  According to
+the official documentation the only way of connecting them is using an active
+hub:
+
+         A------T------T------TR
+         |      B      B      B
+     S---H---S
+         |
+         S
+
+The official docs also state that you can use STAR cards at the ends of
+BUS network in place of a BUS card and a terminator:
+
+     S------T------T------S
+            B      B
+
+But, according to my own experiments, you can simply hang a BUS type card
+anywhere in middle of a cable in a STAR topology network.  And more - you
+can use the bus card in place of any star card if you use a terminator. Then
+you can build very complicated networks fulfilling all your needs!  An
+example:
+
+                                  S
+                                  |
+           RT------T-------T------H------S
+            B      B       B      |
+                                  |       R
+    S------A------T-------T-------A-------H------TR                    
+           |      B       B       |       |      B                         
+           |   S                 BT       |                                 
+           |   |                  |  S----A-----S
+    S------H---A----S             |       | 
+           |   |      S------T----H---S   |
+           S   S             B    R       S  
+                                                               
+A basically different cabling scheme is used with Twisted Pair cabling. Each
+of the TP cards has two RJ (phone-cord style) connectors.  The cards are
+then daisy-chained together using a cable connecting every two neighboring
+cards.  The ends are terminated with RJ 93 Ohm terminators which plug into
+the empty connectors of cards on the ends of the chain.  An example:
+
+          ___________   ___________
+      _R_|_         _|_|_         _|_R_  
+     |     |       |     |       |     |      
+     |Card |       |Card |       |Card |     
+     |_____|       |_____|       |_____|          
+
+
+There are also hubs for the TP topology.  There is nothing difficult
+involved in using them; you just connect a TP chain to a hub on any end or
+even at both.  This way you can create almost any network configuration. 
+The maximum of 11 hubs between any two computers on the net applies here as
+well.  An example:
+
+    RP-------P--------P--------H-----P------P-----PR
+                               |
+      RP-----H--------P--------H-----P------PR
+             |                 |
+             PR                PR
+
+    R - RJ Terminator
+    P - TP Card
+    H - TP Hub
+
+Like any network, ARCnet has a limited cable length.  These are the maximum
+cable lengths between two active ends (an active end being an active hub or
+a STAR card).
+
+		RG-62       93 Ohm up to 650 m
+		RG-59/U     75 Ohm up to 457 m
+		RG-11/U     75 Ohm up to 533 m
+		IBM Type 1 150 Ohm up to 200 m
+		IBM Type 3 100 Ohm up to 100 m
+
+The maximum length of all cables connected to a passive hub is limited to 65
+meters for RG-62 cabling; less for others.  You can see that using passive
+hubs in a large network is a bad idea. The maximum length of a single "BUS
+Trunk" is about 300 meters for RG-62. The maximum distance between the two
+most distant points of the net is limited to 3000 meters. The maximum length
+of a TP cable between two cards/hubs is 650 meters.
+
+
+SETTING THE JUMPERS
+-------------------
+
+All ARCnet cards should have a total of four or five different settings:
+
+  - the I/O address:  this is the "port" your ARCnet card is on.  Probed
+    values in the Linux ARCnet driver are only from 0x200 through 0x3F0. (If
+    your card has additional ones, which is possible, please tell me.) This
+    should not be the same as any other device on your system.  According to
+    a doc I got from Novell, MS Windows prefers values of 0x300 or more,
+    eating net connections on my system (at least) otherwise.  My guess is
+    this may be because, if your card is at 0x2E0, probing for a serial port
+    at 0x2E8 will reset the card and probably mess things up royally.
+	- Avery's favourite: 0x300.
+
+  - the IRQ: on  8-bit cards, it might be 2 (9), 3, 4, 5, or 7.
+             on 16-bit cards, it might be 2 (9), 3, 4, 5, 7, or 10-15.
+             
+    Make sure this is different from any other card on your system.  Note
+    that IRQ2 is the same as IRQ9, as far as Linux is concerned.  You can
+    "cat /proc/interrupts" for a somewhat complete list of which ones are in
+    use at any given time.  Here is a list of common usages from Vojtech
+    Pavlik <vojtech@suse.cz>:
+    	("Not on bus" means there is no way for a card to generate this
+	interrupt)
+	IRQ  0 - Timer 0 (Not on bus)
+	IRQ  1 - Keyboard (Not on bus)
+	IRQ  2 - IRQ Controller 2 (Not on bus, nor does interrupt the CPU)
+	IRQ  3 - COM2
+	IRQ  4 - COM1
+	IRQ  5 - FREE (LPT2 if you have it; sometimes COM3; maybe PLIP)
+	IRQ  6 - Floppy disk controller
+	IRQ  7 - FREE (LPT1 if you don't use the polling driver; PLIP) 
+	IRQ  8 - Realtime Clock Interrupt (Not on bus)
+	IRQ  9 - FREE (VGA vertical sync interrupt if enabled)
+	IRQ 10 - FREE
+	IRQ 11 - FREE
+	IRQ 12 - FREE
+	IRQ 13 - Numeric Coprocessor (Not on bus)
+	IRQ 14 - Fixed Disk Controller
+	IRQ 15 - FREE (Fixed Disk Controller 2 if you have it) 
+	
+	Note: IRQ 9 is used on some video cards for the "vertical retrace"
+	interrupt.  This interrupt would have been handy for things like
+	video games, as it occurs exactly once per screen refresh, but
+	unfortunately IBM cancelled this feature starting with the original
+	VGA and thus many VGA/SVGA cards do not support it.  For this
+	reason, no modern software uses this interrupt and it can almost
+	always be safely disabled, if your video card supports it at all.
+	
+	If your card for some reason CANNOT disable this IRQ (usually there
+	is a jumper), one solution would be to clip the printed circuit
+	contact on the board: it's the fourth contact from the left on the
+	back side.  I take no responsibility if you try this.
+
+	- Avery's favourite: IRQ2 (actually IRQ9).  Watch that VGA, though.
+
+  - the memory address:  Unlike most cards, ARCnets use "shared memory" for
+    copying buffers around.  Make SURE it doesn't conflict with any other
+    used memory in your system!
+	A0000		- VGA graphics memory (ok if you don't have VGA)
+        B0000		- Monochrome text mode
+        C0000		\  One of these is your VGA BIOS - usually C0000.
+        E0000		/
+        F0000		- System BIOS
+
+    Anything less than 0xA0000 is, well, a BAD idea since it isn't above
+    640k.
+	- Avery's favourite: 0xD0000
+
+  - the station address:  Every ARCnet card has its own "unique" network
+    address from 0 to 255.  Unlike Ethernet, you can set this address
+    yourself with a jumper or switch (or on some cards, with special
+    software).  Since it's only 8 bits, you can only have 254 ARCnet cards
+    on a network.  DON'T use 0 or 255, since these are reserved (although
+    neat stuff will probably happen if you DO use them).  By the way, if you
+    haven't already guessed, don't set this the same as any other ARCnet on
+    your network!
+	- Avery's favourite:  3 and 4.  Not that it matters.
+
+  - There may be ETS1 and ETS2 settings.  These may or may not make a
+    difference on your card (many manuals call them "reserved"), but are
+    used to change the delays used when powering up a computer on the
+    network.  This is only necessary when wiring VERY long range ARCnet
+    networks, on the order of 4km or so; in any case, the only real
+    requirement here is that all cards on the network with ETS1 and ETS2
+    jumpers have them in the same position.  Chris Hindy <chrish@io.org>
+    sent in a chart with actual values for this:
+	ET1	ET2	Response Time	Reconfiguration Time
+	---	---	-------------	--------------------
+	open	open	74.7us		840us
+	open	closed	283.4us		1680us
+	closed	open	561.8us		1680us
+	closed	closed	1118.6us	1680us
+    
+    Make sure you set ETS1 and ETS2 to the SAME VALUE for all cards on your
+    network.
+    
+Also, on many cards (not mine, though) there are red and green LED's. 
+Vojtech Pavlik <vojtech@suse.cz> tells me this is what they mean:
+	GREEN           RED             Status
+	-----		---		------
+	OFF             OFF             Power off
+	OFF             Short flashes   Cabling problems (broken cable or not
+					  terminated)
+	OFF (short)     ON              Card init
+	ON              ON              Normal state - everything OK, nothing
+					  happens
+	ON              Long flashes    Data transfer
+	ON              OFF             Never happens (maybe when wrong ID)
+
+
+The following is all the specific information people have sent me about
+their own particular ARCnet cards.  It is officially a mess, and contains
+huge amounts of duplicated information.  I have no time to fix it.  If you
+want to, PLEASE DO!  Just send me a 'diff -u' of all your changes.
+
+The model # is listed right above specifics for that card, so you should be
+able to use your text viewer's "search" function to find the entry you want. 
+If you don't KNOW what kind of card you have, try looking through the
+various diagrams to see if you can tell.
+
+If your model isn't listed and/or has different settings, PLEASE PLEASE
+tell me.  I had to figure mine out without the manual, and it WASN'T FUN!
+
+Even if your ARCnet model isn't listed, but has the same jumpers as another
+model that is, please e-mail me to say so.
+
+Cards Listed in this file (in this order, mostly):
+
+	Manufacturer	Model #			Bits
+	------------	-------			----
+	SMC		PC100			8
+	SMC		PC110			8
+	SMC		PC120			8
+	SMC		PC130			8
+	SMC		PC270E			8
+	SMC		PC500			16
+	SMC		PC500Longboard		16
+	SMC		PC550Longboard		16
+	SMC		PC600			16
+	SMC		PC710			8
+	SMC?		LCS-8830(-T)		8/16
+	Puredata	PDI507			8
+	CNet Tech	CN120-Series		8
+	CNet Tech	CN160-Series		16
+	Lantech?	UM9065L chipset		8
+	Acer		5210-003		8
+	Datapoint?	LAN-ARC-8		8
+	Topware		TA-ARC/10		8
+	Thomas-Conrad	500-6242-0097 REV A	8
+	Waterloo?	(C)1985 Waterloo Micro. 8
+	No Name		--			8/16
+	No Name		Taiwan R.O.C?		8
+	No Name		Model 9058		8
+	Tiara		Tiara Lancard?		8
+	
+
+** SMC = Standard Microsystems Corp.
+** CNet Tech = CNet Technology, Inc.
+
+
+Unclassified Stuff
+------------------
+  - Please send any other information you can find.
+  
+  - And some other stuff (more info is welcome!):
+     From: root@ultraworld.xs4all.nl (Timo Hilbrink)
+     To: apenwarr@foxnet.net (Avery Pennarun)
+     Date: Wed, 26 Oct 1994 02:10:32 +0000 (GMT)
+     Reply-To: timoh@xs4all.nl
+
+     [...parts deleted...]
+
+     About the jumpers: On my PC130 there is one more jumper, located near the
+     cable-connector and it's for changing to star or bus topology; 
+     closed: star - open: bus
+     On the PC500 are some more jumper-pins, one block labeled with RX,PDN,TXI
+     and another with ALE,LA17,LA18,LA19 these are undocumented..
+
+     [...more parts deleted...]
+
+     --- CUT ---
+
+
+** Standard Microsystems Corp (SMC) **
+PC100, PC110, PC120, PC130 (8-bit cards)
+PC500, PC600 (16-bit cards)
+---------------------------------
+  - mainly from Avery Pennarun <apenwarr@worldvisions.ca>.  Values depicted
+    are from Avery's setup.
+  - special thanks to Timo Hilbrink <timoh@xs4all.nl> for noting that PC120,
+    130, 500, and 600 all have the same switches as Avery's PC100. 
+    PC500/600 have several extra, undocumented pins though. (?)
+  - PC110 settings were verified by Stephen A. Wood <saw@cebaf.gov>
+  - Also, the JP- and S-numbers probably don't match your card exactly.  Try
+    to find jumpers/switches with the same number of settings - it's
+    probably more reliable.
+  
+
+     JP5		       [|]    :    :    :    :
+(IRQ Setting)		      IRQ2  IRQ3 IRQ4 IRQ5 IRQ7
+		Put exactly one jumper on exactly one set of pins.
+
+
+                          1  2   3  4  5  6   7  8  9 10
+     S1                /----------------------------------\
+(I/O and Memory        |  1  1 * 0  0  0  0 * 1  1  0  1  |
+ addresses)            \----------------------------------/
+                          |--|   |--------|   |--------|
+                          (a)       (b)           (m)
+                          
+                WARNING.  It's very important when setting these which way
+                you're holding the card, and which way you think is '1'!
+                
+                If you suspect that your settings are not being made
+		correctly, try reversing the direction or inverting the
+		switch positions.
+
+		a: The first digit of the I/O address.
+			Setting		Value
+			-------		-----
+			00		0
+			01		1
+			10		2
+			11		3
+
+		b: The second digit of the I/O address.
+			Setting		Value
+			-------		-----
+			0000		0
+			0001		1
+			0010		2
+			...		...
+			1110		E
+			1111		F
+
+		The I/O address is in the form ab0.  For example, if
+		a is 0x2 and b is 0xE, the address will be 0x2E0.
+
+		DO NOT SET THIS LESS THAN 0x200!!!!!
+
+
+		m: The first digit of the memory address.
+			Setting		Value
+			-------		-----
+			0000		0
+			0001		1
+			0010		2
+			...		...
+			1110		E
+			1111		F
+
+		The memory address is in the form m0000.  For example, if
+		m is D, the address will be 0xD0000.
+
+		DO NOT SET THIS TO C0000, F0000, OR LESS THAN A0000!
+
+                          1  2  3  4  5  6  7  8
+     S2                /--------------------------\
+(Station Address)      |  1  1  0  0  0  0  0  0  |
+                       \--------------------------/
+
+			Setting		Value
+			-------		-----
+			00000000	00
+			10000000	01
+			01000000	02
+			...
+			01111111	FE
+			11111111	FF
+
+		Note that this is binary with the digits reversed!
+
+		DO NOT SET THIS TO 0 OR 255 (0xFF)!
+
+
+*****************************************************************************
+
+** Standard Microsystems Corp (SMC) **
+PC130E/PC270E (8-bit cards)
+---------------------------
+  - from Juergen Seifert <seifert@htwm.de>
+
+
+STANDARD MICROSYSTEMS CORPORATION (SMC) ARCNET(R)-PC130E/PC270E
+===============================================================
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the following Original SMC Manual 
+
+             "Configuration Guide for
+             ARCNET(R)-PC130E/PC270
+            Network Controller Boards
+                Pub. # 900.044A
+                   June, 1989"
+
+ARCNET is a registered trademark of the Datapoint Corporation
+SMC is a registered trademark of the Standard Microsystems Corporation  
+
+The PC130E is an enhanced version of the PC130 board, is equipped with a 
+standard BNC female connector for connection to RG-62/U coax cable.
+Since this board is designed both for point-to-point connection in star
+networks and for connection to bus networks, it is downwardly compatible 
+with all the other standard boards designed for coax networks (that is,
+the PC120, PC110 and PC100 star topology boards and the PC220, PC210 and 
+PC200 bus topology boards).
+
+The PC270E is an enhanced version of the PC260 board, is equipped with two 
+modular RJ11-type jacks for connection to twisted pair wiring.
+It can be used in a star or a daisy-chained network.
+
+
+         8 7 6 5 4 3 2 1
+    ________________________________________________________________
+   |   |       S1        |                                          |
+   |   |_________________|                                          |
+   |    Offs|Base |I/O Addr                                         |
+   |     RAM Addr |                                              ___|
+   |         ___  ___                                       CR3 |___|
+   |        |   \/   |                                      CR4 |___|
+   |        |  PROM  |                                           ___|
+   |        |        |                                        N |   | 8
+   |        | SOCKET |                                        o |   | 7
+   |        |________|                                        d |   | 6
+   |                   ___________________                    e |   | 5
+   |                  |                   |                   A | S | 4
+   |       |oo| EXT2  |                   |                   d | 2 | 3
+   |       |oo| EXT1  |       SMC         |                   d |   | 2
+   |       |oo| ROM   |      90C63        |                   r |___| 1
+   |       |oo| IRQ7  |                   |               |o|  _____|
+   |       |oo| IRQ5  |                   |               |o| | J1  |
+   |       |oo| IRQ4  |                   |              STAR |_____|
+   |       |oo| IRQ3  |                   |                   | J2  |
+   |       |oo| IRQ2  |___________________|                   |_____|
+   |___                                               ______________|
+       |                                             |
+       |_____________________________________________|
+
+Legend:
+
+SMC 90C63	ARCNET Controller / Transceiver /Logic
+S1	1-3:	I/O Base Address Select
+	4-6:	Memory Base Address Select
+	7-8:	RAM Offset Select
+S2	1-8:	Node ID Select
+EXT		Extended Timeout Select
+ROM		ROM Enable Select
+STAR		Selected - Star Topology	(PC130E only)
+		Deselected - Bus Topology	(PC130E only)
+CR3/CR4		Diagnostic LEDs
+J1		BNC RG62/U Connector		(PC130E only)
+J1		6-position Telephone Jack	(PC270E only)
+J2		6-position Telephone Jack	(PC270E only)
+
+Setting one of the switches to Off/Open means "1", On/Closed means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in group S2 are used to set the node ID.
+These switches work in a way similar to the PC100-series cards; see that
+entry for more information.
+
+
+Setting the I/O Base Address
+----------------------------
+
+The first three switches in switch group S1 are used to select one
+of eight possible I/O Base addresses using the following table
+
+
+   Switch | Hex I/O
+   1 2 3  | Address
+   -------|--------
+   0 0 0  |  260
+   0 0 1  |  290
+   0 1 0  |  2E0  (Manufacturer's default)
+   0 1 1  |  2F0
+   1 0 0  |  300
+   1 0 1  |  350
+   1 1 0  |  380
+   1 1 1  |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer requires 2K of a 16K block of RAM. The base of this
+16K block can be located in any of eight positions.
+Switches 4-6 of switch group S1 select the Base of the 16K block.
+Within that 16K address space, the buffer may be assigned any one of four 
+positions, determined by the offset, switches 7 and 8 of group S1.
+
+   Switch     | Hex RAM | Hex ROM
+   4 5 6  7 8 | Address | Address *)
+   -----------|---------|-----------
+   0 0 0  0 0 |  C0000  |  C2000
+   0 0 0  0 1 |  C0800  |  C2000
+   0 0 0  1 0 |  C1000  |  C2000
+   0 0 0  1 1 |  C1800  |  C2000
+              |         |
+   0 0 1  0 0 |  C4000  |  C6000
+   0 0 1  0 1 |  C4800  |  C6000
+   0 0 1  1 0 |  C5000  |  C6000
+   0 0 1  1 1 |  C5800  |  C6000
+              |         |
+   0 1 0  0 0 |  CC000  |  CE000
+   0 1 0  0 1 |  CC800  |  CE000
+   0 1 0  1 0 |  CD000  |  CE000
+   0 1 0  1 1 |  CD800  |  CE000
+              |         |
+   0 1 1  0 0 |  D0000  |  D2000  (Manufacturer's default)
+   0 1 1  0 1 |  D0800  |  D2000
+   0 1 1  1 0 |  D1000  |  D2000
+   0 1 1  1 1 |  D1800  |  D2000
+              |         |
+   1 0 0  0 0 |  D4000  |  D6000
+   1 0 0  0 1 |  D4800  |  D6000
+   1 0 0  1 0 |  D5000  |  D6000
+   1 0 0  1 1 |  D5800  |  D6000
+              |         |
+   1 0 1  0 0 |  D8000  |  DA000
+   1 0 1  0 1 |  D8800  |  DA000
+   1 0 1  1 0 |  D9000  |  DA000
+   1 0 1  1 1 |  D9800  |  DA000
+              |         |
+   1 1 0  0 0 |  DC000  |  DE000
+   1 1 0  0 1 |  DC800  |  DE000
+   1 1 0  1 0 |  DD000  |  DE000
+   1 1 0  1 1 |  DD800  |  DE000
+              |         |
+   1 1 1  0 0 |  E0000  |  E2000
+   1 1 1  0 1 |  E0800  |  E2000
+   1 1 1  1 0 |  E1000  |  E2000
+   1 1 1  1 1 |  E1800  |  E2000
+  
+*) To enable the 8K Boot PROM install the jumper ROM.
+   The default is jumper ROM not installed.
+
+
+Setting the Timeouts and Interrupt
+----------------------------------
+
+The jumpers labeled EXT1 and EXT2 are used to determine the timeout 
+parameters. These two jumpers are normally left open.
+
+To select a hardware interrupt level set one (only one!) of the jumpers
+IRQ2, IRQ3, IRQ4, IRQ5, IRQ7. The Manufacturer's default is IRQ2.
+ 
+
+Configuring the PC130E for Star or Bus Topology
+-----------------------------------------------
+
+The single jumper labeled STAR is used to configure the PC130E board for 
+star or bus topology.
+When the jumper is installed, the board may be used in a star network, when 
+it is removed, the board can be used in a bus topology.
+
+
+Diagnostic LEDs
+---------------
+
+Two diagnostic LEDs are visible on the rear bracket of the board.
+The green LED monitors the network activity: the red one shows the
+board activity:
+
+ Green  | Status               Red      | Status
+ -------|-------------------   ---------|-------------------
+  on    | normal activity      flash/on | data transfer
+  blink | reconfiguration      off      | no data transfer;
+  off   | defective board or            | incorrect memory or
+        | node ID is zero               | I/O address
+
+
+*****************************************************************************
+
+** Standard Microsystems Corp (SMC) **
+PC500/PC550 Longboard (16-bit cards)
+-------------------------------------
+  - from Juergen Seifert <seifert@htwm.de>
+
+
+STANDARD MICROSYSTEMS CORPORATION (SMC) ARCNET-PC500/PC550 Long Board
+=====================================================================
+
+Note: There is another Version of the PC500 called Short Version, which 
+      is different in hard- and software! The most important differences
+      are:
+      - The long board has no Shared memory.
+      - On the long board the selection of the interrupt is done by binary
+        coded switch, on the short board directly by jumper.
+        
+[Avery's note: pay special attention to that: the long board HAS NO SHARED
+MEMORY.  This means the current Linux-ARCnet driver can't use these cards. 
+I have obtained a PC500Longboard and will be doing some experiments on it in
+the future, but don't hold your breath.  Thanks again to Juergen Seifert for
+his advice about this!]
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the following Original SMC Manual 
+
+             "Configuration Guide for
+             SMC ARCNET-PC500/PC550
+         Series Network Controller Boards
+             Pub. # 900.033 Rev. A
+                November, 1989"
+
+ARCNET is a registered trademark of the Datapoint Corporation
+SMC is a registered trademark of the Standard Microsystems Corporation  
+
+The PC500 is equipped with a standard BNC female connector for connection
+to RG-62/U coax cable.
+The board is designed both for point-to-point connection in star networks
+and for connection to bus networks.
+
+The PC550 is equipped with two modular RJ11-type jacks for connection
+to twisted pair wiring.
+It can be used in a star or a daisy-chained (BUS) network.
+
+       1 
+       0 9 8 7 6 5 4 3 2 1     6 5 4 3 2 1
+    ____________________________________________________________________
+   < |         SW1         | |     SW2     |                            |
+   > |_____________________| |_____________|                            |
+   <   IRQ    |I/O Addr                                                 |
+   >                                                                 ___|
+   <                                                            CR4 |___|
+   >                                                            CR3 |___|
+   <                                                                 ___|
+   >                                                              N |   | 8
+   <                                                              o |   | 7
+   >                                                              d | S | 6
+   <                                                              e | W | 5
+   >                                                              A | 3 | 4
+   <                                                              d |   | 3
+   >                                                              d |   | 2
+   <                                                              r |___| 1
+   >                                                        |o|    _____|
+   <                                                        |o|   | J1  |
+   >  3 1                                                   JP6   |_____|
+   < |o|o| JP2                                                    | J2  |
+   > |o|o|                                                        |_____|
+   <  4 2__                                               ______________|
+   >    |  |                                             |
+   <____|  |_____________________________________________|
+
+Legend:
+
+SW1	1-6:	I/O Base Address Select
+	7-10:	Interrupt Select
+SW2	1-6:	Reserved for Future Use
+SW3	1-8:	Node ID Select
+JP2	1-4:	Extended Timeout Select
+JP6		Selected - Star Topology	(PC500 only)
+		Deselected - Bus Topology	(PC500 only)
+CR3	Green	Monitors Network Activity
+CR4	Red	Monitors Board Activity
+J1		BNC RG62/U Connector		(PC500 only)
+J1		6-position Telephone Jack	(PC550 only)
+J2		6-position Telephone Jack	(PC550 only)
+
+Setting one of the switches to Off/Open means "1", On/Closed means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in group SW3 are used to set the node ID. Each node
+attached to the network must have an unique node ID which must be 
+different from 0.
+Switch 1 serves as the least significant bit (LSB).
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+    Switch | Value
+    -------|-------
+      1    |   1
+      2    |   2
+      3    |   4
+      4    |   8
+      5    |  16
+      6    |  32
+      7    |  64
+      8    | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   8 7 6 5 4 3 2 1 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255 
+
+
+Setting the I/O Base Address
+----------------------------
+
+The first six switches in switch group SW1 are used to select one
+of 32 possible I/O Base addresses using the following table
+
+   Switch       | Hex I/O
+   6 5  4 3 2 1 | Address
+   -------------|--------
+   0 1  0 0 0 0 |  200
+   0 1  0 0 0 1 |  210
+   0 1  0 0 1 0 |  220
+   0 1  0 0 1 1 |  230
+   0 1  0 1 0 0 |  240
+   0 1  0 1 0 1 |  250
+   0 1  0 1 1 0 |  260
+   0 1  0 1 1 1 |  270
+   0 1  1 0 0 0 |  280
+   0 1  1 0 0 1 |  290
+   0 1  1 0 1 0 |  2A0
+   0 1  1 0 1 1 |  2B0
+   0 1  1 1 0 0 |  2C0
+   0 1  1 1 0 1 |  2D0
+   0 1  1 1 1 0 |  2E0 (Manufacturer's default)
+   0 1  1 1 1 1 |  2F0
+   1 1  0 0 0 0 |  300
+   1 1  0 0 0 1 |  310
+   1 1  0 0 1 0 |  320
+   1 1  0 0 1 1 |  330
+   1 1  0 1 0 0 |  340
+   1 1  0 1 0 1 |  350
+   1 1  0 1 1 0 |  360
+   1 1  0 1 1 1 |  370
+   1 1  1 0 0 0 |  380
+   1 1  1 0 0 1 |  390
+   1 1  1 0 1 0 |  3A0
+   1 1  1 0 1 1 |  3B0
+   1 1  1 1 0 0 |  3C0
+   1 1  1 1 0 1 |  3D0
+   1 1  1 1 1 0 |  3E0
+   1 1  1 1 1 1 |  3F0
+
+
+Setting the Interrupt
+---------------------
+
+Switches seven through ten of switch group SW1 are used to select the 
+interrupt level. The interrupt level is binary coded, so selections 
+from 0 to 15 would be possible, but only the following eight values will
+be supported: 3, 4, 5, 7, 9, 10, 11, 12.
+
+   Switch   | IRQ
+   10 9 8 7 | 
+   ---------|-------- 
+    0 0 1 1 |  3
+    0 1 0 0 |  4
+    0 1 0 1 |  5
+    0 1 1 1 |  7
+    1 0 0 1 |  9 (=2) (default)
+    1 0 1 0 | 10
+    1 0 1 1 | 11
+    1 1 0 0 | 12
+
+
+Setting the Timeouts 
+--------------------
+
+The two jumpers JP2 (1-4) are used to determine the timeout parameters. 
+These two jumpers are normally left open.
+Refer to the COM9026 Data Sheet for alternate configurations.
+
+
+Configuring the PC500 for Star or Bus Topology
+----------------------------------------------
+
+The single jumper labeled JP6 is used to configure the PC500 board for 
+star or bus topology.
+When the jumper is installed, the board may be used in a star network, when 
+it is removed, the board can be used in a bus topology.
+
+
+Diagnostic LEDs
+---------------
+
+Two diagnostic LEDs are visible on the rear bracket of the board.
+The green LED monitors the network activity: the red one shows the
+board activity:
+
+ Green  | Status               Red      | Status
+ -------|-------------------   ---------|-------------------
+  on    | normal activity      flash/on | data transfer
+  blink | reconfiguration      off      | no data transfer;
+  off   | defective board or            | incorrect memory or
+        | node ID is zero               | I/O address
+
+
+*****************************************************************************
+
+** SMC **
+PC710 (8-bit card)
+------------------
+  - from J.S. van Oosten <jvoosten@compiler.tdcnet.nl>
+  
+Note: this data is gathered by experimenting and looking at info of other
+cards. However, I'm sure I got 99% of the settings right.
+
+The SMC710 card resembles the PC270 card, but is much more basic (i.e. no
+LEDs, RJ11 jacks, etc.) and 8 bit. Here's a little drawing:
+
+    _______________________________________   
+   | +---------+  +---------+              |____
+   | |   S2    |  |   S1    |              |
+   | +---------+  +---------+              |
+   |                                       |
+   |  +===+    __                          |
+   |  | R |   |  | X-tal                 ###___
+   |  | O |   |__|                      ####__'|
+   |  | M |    ||                        ###
+   |  +===+                                |
+   |                                       |
+   |   .. JP1   +----------+               |
+   |   ..       | big chip |               |   
+   |   ..       |  90C63   |               |
+   |   ..       |          |               |
+   |   ..       +----------+               |
+    -------                     -----------
+           |||||||||||||||||||||
+
+The row of jumpers at JP1 actually consists of 8 jumpers, (sometimes
+labelled) the same as on the PC270, from top to bottom: EXT2, EXT1, ROM,
+IRQ7, IRQ5, IRQ4, IRQ3, IRQ2 (gee, wonder what they would do? :-) )
+
+S1 and S2 perform the same function as on the PC270, only their numbers
+are swapped (S1 is the nodeaddress, S2 sets IO- and RAM-address).
+
+I know it works when connected to a PC110 type ARCnet board.
+
+	
+*****************************************************************************
+
+** Possibly SMC **
+LCS-8830(-T) (8 and 16-bit cards)
+---------------------------------
+  - from Mathias Katzer <mkatzer@HRZ.Uni-Bielefeld.DE>
+  - Marek Michalkiewicz <marekm@i17linuxb.ists.pwr.wroc.pl> says the
+    LCS-8830 is slightly different from LCS-8830-T.  These are 8 bit, BUS
+    only (the JP0 jumper is hardwired), and BNC only.
+	
+This is a LCS-8830-T made by SMC, I think ('SMC' only appears on one PLCC,
+nowhere else, not even on the few Xeroxed sheets from the manual).
+
+SMC ARCnet Board Type LCS-8830-T
+
+   ------------------------------------
+  |                                    |
+  |              JP3 88  8 JP2         |
+  |       #####      | \               |
+  |       #####    ET1 ET2          ###|
+  |                              8  ###|
+  |  U3   SW 1                  JP0 ###|  Phone Jacks
+  |  --                             ###|
+  | |  |                               |
+  | |  |   SW2                         |
+  | |  |                               |
+  | |  |  #####                        |
+  |  --   #####                       ####  BNC Connector 
+  |                                   ####
+  |   888888 JP1                       |
+  |   234567                           |
+   --                           -------
+     |||||||||||||||||||||||||||
+      --------------------------
+
+
+SW1: DIP-Switches for Station Address
+SW2: DIP-Switches for Memory Base and I/O Base addresses
+
+JP0: If closed, internal termination on (default open)
+JP1: IRQ Jumpers
+JP2: Boot-ROM enabled if closed
+JP3: Jumpers for response timeout
+ 
+U3: Boot-ROM Socket          
+
+
+ET1 ET2     Response Time     Idle Time    Reconfiguration Time
+
+               78                86               840
+ X            285               316              1680
+     X        563               624              1680
+ X   X       1130              1237              1680
+
+(X means closed jumper)
+
+(DIP-Switch downwards means "0")
+
+The station address is binary-coded with SW1.
+
+The I/O base address is coded with DIP-Switches 6,7 and 8 of SW2:
+
+Switches        Base
+678             Address
+000		260-26f
+100		290-29f
+010		2e0-2ef
+110		2f0-2ff
+001		300-30f
+101		350-35f
+011		380-38f
+111 		3e0-3ef
+
+
+DIP Switches 1-5 of SW2 encode the RAM and ROM Address Range:
+
+Switches        RAM           ROM
+12345           Address Range  Address Range
+00000		C:0000-C:07ff	C:2000-C:3fff
+10000		C:0800-C:0fff
+01000		C:1000-C:17ff
+11000		C:1800-C:1fff
+00100		C:4000-C:47ff	C:6000-C:7fff
+10100		C:4800-C:4fff
+01100		C:5000-C:57ff 
+11100		C:5800-C:5fff
+00010		C:C000-C:C7ff	C:E000-C:ffff
+10010		C:C800-C:Cfff
+01010		C:D000-C:D7ff
+11010		C:D800-C:Dfff
+00110		D:0000-D:07ff	D:2000-D:3fff
+10110		D:0800-D:0fff
+01110		D:1000-D:17ff
+11110		D:1800-D:1fff
+00001		D:4000-D:47ff	D:6000-D:7fff
+10001		D:4800-D:4fff
+01001		D:5000-D:57ff
+11001		D:5800-D:5fff
+00101		D:8000-D:87ff	D:A000-D:bfff
+10101		D:8800-D:8fff
+01101		D:9000-D:97ff
+11101		D:9800-D:9fff 
+00011		D:C000-D:c7ff	D:E000-D:ffff
+10011		D:C800-D:cfff
+01011		D:D000-D:d7ff
+11011		D:D800-D:dfff
+00111		E:0000-E:07ff	E:2000-E:3fff
+10111		E:0800-E:0fff
+01111		E:1000-E:17ff
+11111		E:1800-E:1fff
+
+
+*****************************************************************************
+
+** PureData Corp **
+PDI507 (8-bit card)
+--------------------
+  - from Mark Rejhon <mdrejhon@magi.com> (slight modifications by Avery)
+  - Avery's note: I think PDI508 cards (but definitely NOT PDI508Plus cards)
+    are mostly the same as this.  PDI508Plus cards appear to be mainly
+    software-configured.
+
+Jumpers:
+	There is a jumper array at the bottom of the card, near the edge
+        connector.  This array is labelled J1.  They control the IRQs and
+        something else.  Put only one jumper on the IRQ pins.
+
+	ETS1, ETS2 are for timing on very long distance networks.  See the
+	more general information near the top of this file.
+
+	There is a J2 jumper on two pins.  A jumper should be put on them,
+        since it was already there when I got the card.  I don't know what
+        this jumper is for though.
+
+	There is a two-jumper array for J3.  I don't know what it is for,
+        but there were already two jumpers on it when I got the card.  It's
+        a six pin grid in a two-by-three fashion.  The jumpers were
+        configured as follows:
+
+	   .-------.
+	 o | o   o |
+	   :-------:    ------> Accessible end of card with connectors
+	 o | o   o |             in this direction ------->
+	   `-------'
+
+Carl de Billy <CARL@carainfo.com> explains J3 and J4:
+
+	J3 Diagram:
+
+           .-------.
+         o | o   o |
+           :-------:    TWIST Technology
+         o | o   o |
+           `-------'
+           .-------.
+           | o   o | o
+           :-------:    COAX Technology
+           | o   o | o
+           `-------'
+
+  - If using coax cable in a bus topology the J4 jumper must be removed;
+    place it on one pin.
+
+  - If using bus topology with twisted pair wiring move the J3 
+    jumpers so they connect the middle pin and the pins closest to the RJ11
+    Connectors.  Also the J4 jumper must be removed; place it on one pin of
+    J4 jumper for storage.
+
+  - If using  star topology with twisted pair wiring move the J3 
+    jumpers so they connect the middle pin and the pins closest to the RJ11
+    connectors.
+
+
+DIP Switches:
+
+	The DIP switches accessible on the accessible end of the card while
+        it is installed, is used to set the ARCnet address.  There are 8
+        switches.  Use an address from 1 to 254.
+
+	Switch No.
+	12345678	ARCnet address
+	-----------------------------------------
+	00000000	FF  	(Don't use this!)
+	00000001	FE
+	00000010	FD
+	....
+	11111101	2	
+	11111110	1
+	11111111	0	(Don't use this!)
+
+	There is another array of eight DIP switches at the top of the
+        card.  There are five labelled MS0-MS4 which seem to control the
+        memory address, and another three labelled IO0-IO2 which seem to
+        control the base I/O address of the card.
+
+	This was difficult to test by trial and error, and the I/O addresses
+        are in a weird order.  This was tested by setting the DIP switches,
+        rebooting the computer, and attempting to load ARCETHER at various
+        addresses (mostly between 0x200 and 0x400).  The address that caused
+        the red transmit LED to blink, is the one that I thought works.
+
+	Also, the address 0x3D0 seem to have a special meaning, since the
+        ARCETHER packet driver loaded fine, but without the red LED
+        blinking.  I don't know what 0x3D0 is for though.  I recommend using
+        an address of 0x300 since Windows may not like addresses below
+        0x300.
+
+	IO Switch No.
+	210             I/O address
+	-------------------------------
+	111             0x260
+	110             0x290
+	101             0x2E0
+	100             0x2F0
+	011             0x300
+	010             0x350
+	001             0x380
+	000             0x3E0
+
+	The memory switches set a reserved address space of 0x1000 bytes
+        (0x100 segment units, or 4k).  For example if I set an address of
+        0xD000, it will use up addresses 0xD000 to 0xD100.
+
+	The memory switches were tested by booting using QEMM386 stealth,
+        and using LOADHI to see what address automatically became excluded
+        from the upper memory regions, and then attempting to load ARCETHER
+        using these addresses.
+
+	I recommend using an ARCnet memory address of 0xD000, and putting
+        the EMS page frame at 0xC000 while using QEMM stealth mode.  That
+        way, you get contiguous high memory from 0xD100 almost all the way
+        the end of the megabyte.
+
+	Memory Switch 0 (MS0) didn't seem to work properly when set to OFF
+        on my card.  It could be malfunctioning on my card.  Experiment with
+        it ON first, and if it doesn't work, set it to OFF.  (It may be a
+        modifier for the 0x200 bit?)
+
+	MS Switch No.
+	43210           Memory address
+	--------------------------------
+	00001           0xE100  (guessed - was not detected by QEMM)
+	00011           0xE000  (guessed - was not detected by QEMM)
+	00101           0xDD00
+	00111           0xDC00
+	01001           0xD900
+	01011           0xD800
+	01101           0xD500
+	01111           0xD400
+	10001           0xD100
+	10011           0xD000
+	10101           0xCD00
+	10111           0xCC00
+	11001           0xC900 (guessed - crashes tested system)
+	11011           0xC800 (guessed - crashes tested system)
+	11101           0xC500 (guessed - crashes tested system)
+	11111           0xC400 (guessed - crashes tested system)
+	
+	
+*****************************************************************************
+
+** CNet Technology Inc. **
+120 Series (8-bit cards)
+------------------------
+  - from Juergen Seifert <seifert@htwm.de>
+
+
+CNET TECHNOLOGY INC. (CNet) ARCNET 120A SERIES
+==============================================
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the following Original CNet Manual 
+
+              "ARCNET
+            USER'S MANUAL 
+                for
+               CN120A
+               CN120AB
+               CN120TP
+               CN120ST
+               CN120SBT
+             P/N:12-01-0007
+             Revision 3.00"
+
+ARCNET is a registered trademark of the Datapoint Corporation
+
+P/N 120A   ARCNET 8 bit XT/AT Star
+P/N 120AB  ARCNET 8 bit XT/AT Bus
+P/N 120TP  ARCNET 8 bit XT/AT Twisted Pair
+P/N 120ST  ARCNET 8 bit XT/AT Star, Twisted Pair
+P/N 120SBT ARCNET 8 bit XT/AT Star, Bus, Twisted Pair
+
+    __________________________________________________________________
+   |                                                                  |
+   |                                                               ___|
+   |                                                          LED |___|
+   |                                                               ___|
+   |                                                            N |   | ID7
+   |                                                            o |   | ID6
+   |                                                            d | S | ID5
+   |                                                            e | W | ID4
+   |                     ___________________                    A | 2 | ID3
+   |                    |                   |                   d |   | ID2
+   |                    |                   |  1 2 3 4 5 6 7 8  d |   | ID1
+   |                    |                   | _________________ r |___| ID0
+   |                    |      90C65        ||       SW1       |  ____|
+   |  JP 8 7            |                   ||_________________| |    |
+   |    |o|o|  JP1      |                   |                    | J2 |
+   |    |o|o|  |oo|     |                   |         JP 1 1 1   |    |
+   |   ______________   |                   |            0 1 2   |____|
+   |  |  PROM        |  |___________________|           |o|o|o|  _____|
+   |  >  SOCKET      |  JP 6 5 4 3 2                    |o|o|o| | J1  |
+   |  |______________|    |o|o|o|o|o|                   |o|o|o| |_____|
+   |_____                 |o|o|o|o|o|                   ______________|
+         |                                             |
+         |_____________________________________________|
+
+Legend:
+
+90C65       ARCNET Probe
+S1  1-5:    Base Memory Address Select
+    6-8:    Base I/O Address Select
+S2  1-8:    Node ID Select (ID0-ID7)
+JP1     ROM Enable Select
+JP2     IRQ2
+JP3     IRQ3
+JP4     IRQ4
+JP5     IRQ5
+JP6     IRQ7
+JP7/JP8     ET1, ET2 Timeout Parameters
+JP10/JP11   Coax / Twisted Pair Select  (CN120ST/SBT only)
+JP12        Terminator Select       (CN120AB/ST/SBT only)
+J1      BNC RG62/U Connector        (all except CN120TP)
+J2      Two 6-position Telephone Jack   (CN120TP/ST/SBT only)
+
+Setting one of the switches to Off means "1", On means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW2 are used to set the node ID. Each node attached
+to the network must have an unique node ID which must be different from 0.
+Switch 1 (ID0) serves as the least significant bit (LSB).
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+   Switch | Label | Value
+   -------|-------|-------
+     1    | ID0   |   1
+     2    | ID1   |   2
+     3    | ID2   |   4
+     4    | ID3   |   8
+     5    | ID4   |  16
+     6    | ID5   |  32
+     7    | ID6   |  64
+     8    | ID7   | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   8 7 6 5 4 3 2 1 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255
+
+
+Setting the I/O Base Address
+----------------------------
+
+The last three switches in switch block SW1 are used to select one
+of eight possible I/O Base addresses using the following table
+
+
+   Switch      | Hex I/O
+    6   7   8  | Address
+   ------------|--------
+   ON  ON  ON  |  260
+   OFF ON  ON  |  290
+   ON  OFF ON  |  2E0  (Manufacturer's default)
+   OFF OFF ON  |  2F0
+   ON  ON  OFF |  300
+   OFF ON  OFF |  350
+   ON  OFF OFF |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer (RAM) requires 2K. The base of this buffer can be 
+located in any of eight positions. The address of the Boot Prom is
+memory base + 8K or memory base + 0x2000.
+Switches 1-5 of switch block SW1 select the Memory Base address.
+
+   Switch              | Hex RAM | Hex ROM
+    1   2   3   4   5  | Address | Address *)
+   --------------------|---------|-----------
+   ON  ON  ON  ON  ON  |  C0000  |  C2000
+   ON  ON  OFF ON  ON  |  C4000  |  C6000
+   ON  ON  ON  OFF ON  |  CC000  |  CE000
+   ON  ON  OFF OFF ON  |  D0000  |  D2000  (Manufacturer's default)
+   ON  ON  ON  ON  OFF |  D4000  |  D6000
+   ON  ON  OFF ON  OFF |  D8000  |  DA000
+   ON  ON  ON  OFF OFF |  DC000  |  DE000
+   ON  ON  OFF OFF OFF |  E0000  |  E2000
+  
+*) To enable the Boot ROM install the jumper JP1
+
+Note: Since the switches 1 and 2 are always set to ON it may be possible
+      that they can be used to add an offset of 2K, 4K or 6K to the base
+      address, but this feature is not documented in the manual and I
+      haven't tested it yet.
+
+
+Setting the Interrupt Line
+--------------------------
+
+To select a hardware interrupt level install one (only one!) of the jumpers
+JP2, JP3, JP4, JP5, JP6. JP2 is the default.
+
+   Jumper | IRQ     
+   -------|-----
+     2    |  2
+     3    |  3
+     4    |  4
+     5    |  5
+     6    |  7
+
+
+Setting the Internal Terminator on CN120AB/TP/SBT
+--------------------------------------------------
+
+The jumper JP12 is used to enable the internal terminator. 
+
+                         -----
+       0                |  0  |     
+     -----   ON         |     |  ON
+    |  0  |             |  0  |
+    |     |  OFF         -----   OFF
+    |  0  |                0
+     -----
+   Terminator          Terminator 
+    disabled            enabled
+  
+
+Selecting the Connector Type on CN120ST/SBT
+-------------------------------------------
+
+     JP10    JP11        JP10    JP11
+                         -----   -----
+       0       0        |  0  | |  0  |       
+     -----   -----      |     | |     |
+    |  0  | |  0  |     |  0  | |  0  |
+    |     | |     |      -----   -----
+    |  0  | |  0  |        0       0 
+     -----   -----
+     Coaxial Cable       Twisted Pair Cable 
+       (Default)
+
+
+Setting the Timeout Parameters
+------------------------------
+
+The jumpers labeled EXT1 and EXT2 are used to determine the timeout 
+parameters. These two jumpers are normally left open.
+
+
+
+*****************************************************************************
+
+** CNet Technology Inc. **
+160 Series (16-bit cards)
+-------------------------
+  - from Juergen Seifert <seifert@htwm.de>
+
+CNET TECHNOLOGY INC. (CNet) ARCNET 160A SERIES
+==============================================
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the following Original CNet Manual 
+
+              "ARCNET
+            USER'S MANUAL 
+                for
+               CN160A
+               CN160AB
+               CN160TP
+             P/N:12-01-0006
+             Revision 3.00"
+
+ARCNET is a registered trademark of the Datapoint Corporation
+
+P/N 160A   ARCNET 16 bit XT/AT Star
+P/N 160AB  ARCNET 16 bit XT/AT Bus
+P/N 160TP  ARCNET 16 bit XT/AT Twisted Pair
+
+   ___________________________________________________________________
+  <                             _________________________          ___|
+  >               |oo| JP2     |                         |    LED |___|
+  <               |oo| JP1     |        9026             |    LED |___|
+  >                            |_________________________|         ___|
+  <                                                             N |   | ID7
+  >                                                      1      o |   | ID6
+  <                                    1 2 3 4 5 6 7 8 9 0      d | S | ID5
+  >         _______________           _____________________     e | W | ID4
+  <        |     PROM      |         |         SW1         |    A | 2 | ID3
+  >        >    SOCKET     |         |_____________________|    d |   | ID2
+  <        |_______________|          | IO-Base   | MEM   |     d |   | ID1
+  >                                                             r |___| ID0
+  <                                                               ____|
+  >                                                              |    |
+  <                                                              | J1 |
+  >                                                              |    |
+  <                                                              |____|
+  >                            1 1 1 1                                |
+  <  3 4 5 6 7      JP     8 9 0 1 2 3                                |
+  > |o|o|o|o|o|           |o|o|o|o|o|o|                               |
+  < |o|o|o|o|o| __        |o|o|o|o|o|o|                    ___________|
+  >            |  |                                       |
+  <____________|  |_______________________________________|
+
+Legend:
+
+9026            ARCNET Probe
+SW1 1-6:    Base I/O Address Select
+    7-10:   Base Memory Address Select
+SW2 1-8:    Node ID Select (ID0-ID7)
+JP1/JP2     ET1, ET2 Timeout Parameters
+JP3-JP13    Interrupt Select
+J1      BNC RG62/U Connector        (CN160A/AB only)
+J1      Two 6-position Telephone Jack   (CN160TP only)
+LED
+
+Setting one of the switches to Off means "1", On means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW2 are used to set the node ID. Each node attached
+to the network must have an unique node ID which must be different from 0.
+Switch 1 (ID0) serves as the least significant bit (LSB).
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+   Switch | Label | Value
+   -------|-------|-------
+     1    | ID0   |   1
+     2    | ID1   |   2
+     3    | ID2   |   4
+     4    | ID3   |   8
+     5    | ID4   |  16
+     6    | ID5   |  32
+     7    | ID6   |  64
+     8    | ID7   | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   8 7 6 5 4 3 2 1 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255
+
+
+Setting the I/O Base Address
+----------------------------
+
+The first six switches in switch block SW1 are used to select the I/O Base
+address using the following table:
+
+             Switch        | Hex I/O
+    1   2   3   4   5   6  | Address
+   ------------------------|--------
+   OFF ON  ON  OFF OFF ON  |  260
+   OFF ON  OFF ON  ON  OFF |  290
+   OFF ON  OFF OFF OFF ON  |  2E0  (Manufacturer's default)
+   OFF ON  OFF OFF OFF OFF |  2F0
+   OFF OFF ON  ON  ON  ON  |  300
+   OFF OFF ON  OFF ON  OFF |  350
+   OFF OFF OFF ON  ON  ON  |  380
+   OFF OFF OFF OFF OFF ON  |  3E0
+
+Note: Other IO-Base addresses seem to be selectable, but only the above
+      combinations are documented.
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The switches 7-10 of switch block SW1 are used to select the Memory
+Base address of the RAM (2K) and the PROM.
+
+   Switch          | Hex RAM | Hex ROM
+    7   8   9  10  | Address | Address
+   ----------------|---------|-----------
+   OFF OFF ON  ON  |  C0000  |  C8000
+   OFF OFF ON  OFF |  D0000  |  D8000 (Default)
+   OFF OFF OFF ON  |  E0000  |  E8000
+
+Note: Other MEM-Base addresses seem to be selectable, but only the above
+      combinations are documented.
+
+
+Setting the Interrupt Line
+--------------------------
+
+To select a hardware interrupt level install one (only one!) of the jumpers
+JP3 through JP13 using the following table:
+
+   Jumper | IRQ     
+   -------|-----------------
+     3    |  14
+     4    |  15
+     5    |  12
+     6    |  11
+     7    |  10
+     8    |   3
+     9    |   4
+    10    |   5
+    11    |   6
+    12    |   7
+    13    |   2 (=9) Default!
+
+Note:  - Do not use JP11=IRQ6, it may conflict with your Floppy Disk
+         Controller
+       - Use JP3=IRQ14 only, if you don't have an IDE-, MFM-, or RLL-
+         Hard Disk, it may conflict with their controllers
+
+
+Setting the Timeout Parameters
+------------------------------
+
+The jumpers labeled JP1 and JP2 are used to determine the timeout
+parameters. These two jumpers are normally left open.
+
+
+*****************************************************************************
+
+** Lantech **
+8-bit card, unknown model
+-------------------------
+  - from Vlad Lungu <vlungu@ugal.ro> - his e-mail address seemed broken at
+    the time I tried to reach him.  Sorry Vlad, if you didn't get my reply.
+
+   ________________________________________________________________
+   |   1         8                                                 |
+   |   ___________                                               __|
+   |   |   SW1    |                                         LED |__|
+   |   |__________|                                                |
+   |                                                            ___|
+   |                _____________________                       |S | 8
+   |                |                   |                       |W |
+   |                |                   |                       |2 |
+   |                |                   |                       |__| 1
+   |                |      UM9065L      |     |o|  JP4         ____|____
+   |                |                   |     |o|              |  CN    |
+   |                |                   |                      |________|
+   |                |                   |                          |
+   |                |___________________|                          |
+   |                                                               |
+   |                                                               |
+   |      _____________                                            |
+   |      |            |                                           |
+   |      |    PROM    |        |ooooo|  JP6                       |
+   |      |____________|        |ooooo|                            |
+   |_____________                                             _   _|
+                |____________________________________________| |__|
+
+
+UM9065L : ARCnet Controller
+
+SW 1    : Shared Memory Address and I/O Base
+
+        ON=0
+
+        12345|Memory Address
+        -----|--------------
+        00001|  D4000
+        00010|  CC000
+        00110|  D0000
+        01110|  D1000
+        01101|  D9000
+        10010|  CC800
+        10011|  DC800
+        11110|  D1800
+
+It seems that the bits are considered in reverse order.  Also, you must
+observe that some of those addresses are unusual and I didn't probe them; I
+used a memory dump in DOS to identify them.  For the 00000 configuration and
+some others that I didn't write here the card seems to conflict with the
+video card (an S3 GENDAC). I leave the full decoding of those addresses to
+you.
+
+        678| I/O Address
+        ---|------------
+        000|    260
+        001|    failed probe
+        010|    2E0
+        011|    380
+        100|    290
+        101|    350
+        110|    failed probe
+        111|    3E0
+
+SW 2  : Node ID (binary coded)
+
+JP 4  : Boot PROM enable   CLOSE - enabled
+                           OPEN  - disabled
+
+JP 6  : IRQ set (ONLY ONE jumper on 1-5 for IRQ 2-6)
+
+
+*****************************************************************************
+
+** Acer **
+8-bit card, Model 5210-003
+--------------------------
+  - from Vojtech Pavlik <vojtech@suse.cz> using portions of the existing
+    arcnet-hardware file.
+
+This is a 90C26 based card.  Its configuration seems similar to the SMC
+PC100, but has some additional jumpers I don't know the meaning of.
+
+               __
+              |  |
+   ___________|__|_________________________
+  |         |      |                       |
+  |         | BNC  |                       |
+  |         |______|                    ___|
+  |  _____________________             |___  
+  | |                     |                |
+  | | Hybrid IC           |                |
+  | |                     |       o|o J1   |
+  | |_____________________|       8|8      |
+  |                               8|8 J5   |
+  |                               o|o      |
+  |                               8|8      |
+  |__                             8|8      |
+ (|__| LED                        o|o      |
+  |                               8|8      |
+  |                               8|8 J15  |
+  |                                        |
+  |                    _____               |
+  |                   |     |   _____      |
+  |                   |     |  |     |  ___|
+  |                   |     |  |     | |    
+  |  _____            | ROM |  | UFS | |    
+  | |     |           |     |  |     | |   
+  | |     |     ___   |     |  |     | |   
+  | |     |    |   |  |__.__|  |__.__| |   
+  | | NCR |    |XTL|   _____    _____  |   
+  | |     |    |___|  |     |  |     | |   
+  | |90C26|           |     |  |     | |   
+  | |     |           | RAM |  | UFS | |   
+  | |     | J17 o|o   |     |  |     | |   
+  | |     | J16 o|o   |     |  |     | |   
+  | |__.__|           |__.__|  |__.__| |   
+  |  ___                               |   
+  | |   |8                             |   
+  | |SW2|                              |   
+  | |   |                              |   
+  | |___|1                             |   
+  |  ___                               |   
+  | |   |10           J18 o|o          |   
+  | |   |                 o|o          |   
+  | |SW1|                 o|o          |   
+  | |   |             J21 o|o          |   
+  | |___|1                             |   
+  |                                    |   
+  |____________________________________|   
+
+
+Legend:
+
+90C26       ARCNET Chip
+XTL         20 MHz Crystal
+SW1 1-6     Base I/O Address Select
+    7-10    Memory Address Select
+SW2 1-8     Node ID Select (ID0-ID7)
+J1-J5       IRQ Select
+J6-J21      Unknown (Probably extra timeouts & ROM enable ...)
+LED1        Activity LED 
+BNC         Coax connector (STAR ARCnet)
+RAM         2k of SRAM
+ROM         Boot ROM socket
+UFS         Unidentified Flying Sockets
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW2 are used to set the node ID. Each node attached
+to the network must have an unique node ID which must not be 0.
+Switch 1 (ID0) serves as the least significant bit (LSB).
+
+Setting one of the switches to OFF means "1", ON means "0".
+
+The node ID is the sum of the values of all switches set to "1"
+These values are:
+
+   Switch | Value
+   -------|-------
+     1    |   1
+     2    |   2
+     3    |   4
+     4    |   8
+     5    |  16
+     6    |  32
+     7    |  64
+     8    | 128
+
+Don't set this to 0 or 255; these values are reserved.
+
+
+Setting the I/O Base Address
+----------------------------
+
+The switches 1 to 6 of switch block SW1 are used to select one
+of 32 possible I/O Base addresses using the following tables
+   
+          | Hex
+   Switch | Value
+   -------|-------
+     1    | 200  
+     2    | 100  
+     3    |  80  
+     4    |  40  
+     5    |  20  
+     6    |  10 
+
+The I/O address is sum of all switches set to "1". Remember that
+the I/O address space bellow 0x200 is RESERVED for mainboard, so
+switch 1 should be ALWAYS SET TO OFF. 
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer (RAM) requires 2K. The base of this buffer can be
+located in any of sixteen positions. However, the addresses below
+A0000 are likely to cause system hang because there's main RAM.
+
+Jumpers 7-10 of switch block SW1 select the Memory Base address.
+
+   Switch          | Hex RAM
+    7   8   9  10  | Address
+   ----------------|---------
+   OFF OFF OFF OFF |  F0000 (conflicts with main BIOS)
+   OFF OFF OFF ON  |  E0000 
+   OFF OFF ON  OFF |  D0000
+   OFF OFF ON  ON  |  C0000 (conflicts with video BIOS)
+   OFF ON  OFF OFF |  B0000 (conflicts with mono video)
+   OFF ON  OFF ON  |  A0000 (conflicts with graphics)
+
+
+Setting the Interrupt Line
+--------------------------
+
+Jumpers 1-5 of the jumper block J1 control the IRQ level. ON means 
+shorted, OFF means open.
+
+    Jumper              |  IRQ
+    1   2   3   4   5   |
+   ----------------------------
+    ON  OFF OFF OFF OFF |  7
+    OFF ON  OFF OFF OFF |  5
+    OFF OFF ON  OFF OFF |  4
+    OFF OFF OFF ON  OFF |  3
+    OFF OFF OFF OFF ON  |  2
+
+
+Unknown jumpers & sockets
+-------------------------
+
+I know nothing about these. I just guess that J16&J17 are timeout
+jumpers and maybe one of J18-J21 selects ROM. Also J6-J10 and
+J11-J15 are connecting IRQ2-7 to some pins on the UFSs. I can't
+guess the purpose.
+
+
+*****************************************************************************
+
+** Datapoint? **
+LAN-ARC-8, an 8-bit card
+------------------------
+  - from Vojtech Pavlik <vojtech@suse.cz>
+
+This is another SMC 90C65-based ARCnet card. I couldn't identify the
+manufacturer, but it might be DataPoint, because the card has the
+original arcNet logo in its upper right corner.
+
+          _______________________________________________________
+         |                         _________                     |
+         |                        |   SW2   | ON      arcNet     |
+         |                        |_________| OFF             ___|
+         |  _____________         1 ______  8                |   | 8  
+         | |             | SW1     | XTAL | ____________     | S |    
+         | > RAM (2k)    |         |______||            |    | W |    
+         | |_____________|                 |      H     |    | 3 |    
+         |                        _________|_____ y     |    |___| 1  
+         |  _________            |         |     |b     |        |    
+         | |_________|           |         |     |r     |        |    
+         |                       |     SMC |     |i     |        |    
+         |                       |    90C65|     |d     |        |      
+         |  _________            |         |     |      |        |
+         | |   SW1   | ON        |         |     |I     |        |
+         | |_________| OFF       |_________|_____/C     |   _____|
+         |  1       8                      |            |  |     |___
+         |  ______________                 |            |  | BNC |___|
+         | |              |                |____________|  |_____|
+         | > EPROM SOCKET |              _____________           |
+         | |______________|             |_____________|          |
+         |                                         ______________|
+         |                                        | 
+         |________________________________________|
+
+Legend:
+
+90C65       ARCNET Chip 
+SW1 1-5:    Base Memory Address Select
+    6-8:    Base I/O Address Select
+SW2 1-8:    Node ID Select
+SW3 1-5:    IRQ Select   
+    6-7:    Extra Timeout
+    8  :    ROM Enable   
+BNC         Coax connector
+XTAL        20 MHz Crystal
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW3 are used to set the node ID. Each node attached
+to the network must have an unique node ID which must not be 0.
+Switch 1 serves as the least significant bit (LSB).
+
+Setting one of the switches to Off means "1", On means "0".
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+   Switch | Value
+   -------|-------
+     1    |   1
+     2    |   2
+     3    |   4
+     4    |   8
+     5    |  16
+     6    |  32
+     7    |  64
+     8    | 128
+
+
+Setting the I/O Base Address
+----------------------------
+
+The last three switches in switch block SW1 are used to select one
+of eight possible I/O Base addresses using the following table
+
+
+   Switch      | Hex I/O
+    6   7   8  | Address
+   ------------|--------
+   ON  ON  ON  |  260
+   OFF ON  ON  |  290
+   ON  OFF ON  |  2E0  (Manufacturer's default)
+   OFF OFF ON  |  2F0
+   ON  ON  OFF |  300
+   OFF ON  OFF |  350
+   ON  OFF OFF |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer (RAM) requires 2K. The base of this buffer can be 
+located in any of eight positions. The address of the Boot Prom is
+memory base + 0x2000.
+Jumpers 3-5 of switch block SW1 select the Memory Base address.
+
+   Switch              | Hex RAM | Hex ROM
+    1   2   3   4   5  | Address | Address *)
+   --------------------|---------|-----------
+   ON  ON  ON  ON  ON  |  C0000  |  C2000
+   ON  ON  OFF ON  ON  |  C4000  |  C6000
+   ON  ON  ON  OFF ON  |  CC000  |  CE000
+   ON  ON  OFF OFF ON  |  D0000  |  D2000  (Manufacturer's default)
+   ON  ON  ON  ON  OFF |  D4000  |  D6000
+   ON  ON  OFF ON  OFF |  D8000  |  DA000
+   ON  ON  ON  OFF OFF |  DC000  |  DE000
+   ON  ON  OFF OFF OFF |  E0000  |  E2000
+  
+*) To enable the Boot ROM set the switch 8 of switch block SW3 to position ON.
+
+The switches 1 and 2 probably add 0x0800 and 0x1000 to RAM base address.
+
+
+Setting the Interrupt Line
+--------------------------
+
+Switches 1-5 of the switch block SW3 control the IRQ level.
+
+    Jumper              |  IRQ
+    1   2   3   4   5   |
+   ----------------------------
+    ON  OFF OFF OFF OFF |  3
+    OFF ON  OFF OFF OFF |  4
+    OFF OFF ON  OFF OFF |  5
+    OFF OFF OFF ON  OFF |  7
+    OFF OFF OFF OFF ON  |  2
+
+
+Setting the Timeout Parameters
+------------------------------
+
+The switches 6-7 of the switch block SW3 are used to determine the timeout
+parameters.  These two switches are normally left in the OFF position.
+
+
+*****************************************************************************
+
+** Topware **
+8-bit card, TA-ARC/10
+-------------------------
+  - from Vojtech Pavlik <vojtech@suse.cz>
+
+This is another very similar 90C65 card. Most of the switches and jumpers
+are the same as on other clones.
+
+ _____________________________________________________________________
+|  ___________   |                         |            ______        |
+| |SW2 NODE ID|  |                         |           | XTAL |       |
+| |___________|  |  Hybrid IC              |           |______|       |
+|  ___________   |                         |                        __|    
+| |SW1 MEM+I/O|  |_________________________|                   LED1|__|)   
+| |___________|           1 2                                         |     
+|                     J3 |o|o| TIMEOUT                          ______|    
+|     ______________     |o|o|                                 |      |    
+|    |              |  ___________________                     | RJ   |    
+|    > EPROM SOCKET | |                   \                    |------|     
+|J2  |______________| |                    |                   |      |    
+||o|                  |                    |                   |______|
+||o| ROM ENABLE       |        SMC         |    _________             |
+|     _____________   |       90C65        |   |_________|       _____|    
+|    |             |  |                    |                    |     |___ 
+|    > RAM (2k)    |  |                    |                    | BNC |___|
+|    |_____________|  |                    |                    |_____|    
+|                     |____________________|                          |    
+| ________ IRQ 2 3 4 5 7                  ___________                 |
+||________|   |o|o|o|o|o|                |___________|                |
+|________   J1|o|o|o|o|o|                               ______________|
+         |                                             |
+         |_____________________________________________|
+
+Legend:
+
+90C65       ARCNET Chip
+XTAL        20 MHz Crystal
+SW1 1-5     Base Memory Address Select
+    6-8     Base I/O Address Select
+SW2 1-8     Node ID Select (ID0-ID7)
+J1          IRQ Select
+J2          ROM Enable
+J3          Extra Timeout
+LED1        Activity LED 
+BNC         Coax connector (BUS ARCnet)
+RJ          Twisted Pair Connector (daisy chain)
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW2 are used to set the node ID. Each node attached to
+the network must have an unique node ID which must not be 0.  Switch 1 (ID0)
+serves as the least significant bit (LSB).
+
+Setting one of the switches to Off means "1", On means "0".
+
+The node ID is the sum of the values of all switches set to "1"
+These values are:
+
+   Switch | Label | Value
+   -------|-------|-------
+     1    | ID0   |   1
+     2    | ID1   |   2
+     3    | ID2   |   4
+     4    | ID3   |   8
+     5    | ID4   |  16
+     6    | ID5   |  32
+     7    | ID6   |  64
+     8    | ID7   | 128
+
+Setting the I/O Base Address
+----------------------------
+
+The last three switches in switch block SW1 are used to select one
+of eight possible I/O Base addresses using the following table:
+
+
+   Switch      | Hex I/O
+    6   7   8  | Address
+   ------------|--------
+   ON  ON  ON  |  260  (Manufacturer's default)
+   OFF ON  ON  |  290
+   ON  OFF ON  |  2E0                         
+   OFF OFF ON  |  2F0
+   ON  ON  OFF |  300
+   OFF ON  OFF |  350
+   ON  OFF OFF |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer (RAM) requires 2K. The base of this buffer can be
+located in any of eight positions. The address of the Boot Prom is
+memory base + 0x2000.
+Jumpers 3-5 of switch block SW1 select the Memory Base address.
+
+   Switch              | Hex RAM | Hex ROM
+    1   2   3   4   5  | Address | Address *)
+   --------------------|---------|-----------
+   ON  ON  ON  ON  ON  |  C0000  |  C2000
+   ON  ON  OFF ON  ON  |  C4000  |  C6000  (Manufacturer's default) 
+   ON  ON  ON  OFF ON  |  CC000  |  CE000
+   ON  ON  OFF OFF ON  |  D0000  |  D2000  
+   ON  ON  ON  ON  OFF |  D4000  |  D6000
+   ON  ON  OFF ON  OFF |  D8000  |  DA000
+   ON  ON  ON  OFF OFF |  DC000  |  DE000
+   ON  ON  OFF OFF OFF |  E0000  |  E2000
+
+*) To enable the Boot ROM short the jumper J2.
+
+The jumpers 1 and 2 probably add 0x0800 and 0x1000 to RAM address.
+
+
+Setting the Interrupt Line
+--------------------------
+
+Jumpers 1-5 of the jumper block J1 control the IRQ level.  ON means
+shorted, OFF means open.
+
+    Jumper              |  IRQ
+    1   2   3   4   5   |
+   ----------------------------
+    ON  OFF OFF OFF OFF |  2
+    OFF ON  OFF OFF OFF |  3
+    OFF OFF ON  OFF OFF |  4
+    OFF OFF OFF ON  OFF |  5
+    OFF OFF OFF OFF ON  |  7
+
+
+Setting the Timeout Parameters
+------------------------------
+
+The jumpers J3 are used to set the timeout parameters. These two 
+jumpers are normally left open.
+
+  
+*****************************************************************************
+
+** Thomas-Conrad **
+Model #500-6242-0097 REV A (8-bit card)
+---------------------------------------
+  - from Lars Karlsson <100617.3473@compuserve.com>
+
+     ________________________________________________________
+   |          ________   ________                           |_____
+   |         |........| |........|                            |
+   |         |________| |________|                         ___|
+   |            SW 3       SW 1                           |   |
+   |         Base I/O   Base Addr.                Station |   |
+   |                                              address |   |
+   |    ______                                    switch  |   |
+   |   |      |                                           |   |
+   |   |      |                                           |___|    
+   |   |      |                                 ______        |___._
+   |   |______|                                |______|         ____| BNC
+   |                                            Jumper-        _____| Connector
+   |   Main chip                                block  _    __|   '  
+   |                                                  | |  |    RJ Connector
+   |                                                  |_|  |    with 110 Ohm
+   |                                                       |__  Terminator
+   |    ___________                                         __|
+   |   |...........|                                       |    RJ-jack
+   |   |...........|    _____                              |    (unused)
+   |   |___________|   |_____|                             |__
+   |  Boot PROM socket IRQ-jumpers                            |_  Diagnostic
+   |________                                       __          _| LED (red)
+            | | | | | | | | | | | | | | | | | | | |  |        |
+            | | | | | | | | | | | | | | | | | | | |  |________|
+                                                              |
+                                                              |
+
+And here are the settings for some of the switches and jumpers on the cards.
+
+
+          I/O
+
+         1 2 3 4 5 6 7 8
+
+2E0----- 0 0 0 1 0 0 0 1
+2F0----- 0 0 0 1 0 0 0 0
+300----- 0 0 0 0 1 1 1 1
+350----- 0 0 0 0 1 1 1 0
+
+"0" in the above example means switch is off "1" means that it is on.
+
+
+    ShMem address.
+
+      1 2 3 4 5 6 7 8
+
+CX00--0 0 1 1 | |   |
+DX00--0 0 1 0       |
+X000--------- 1 1   |
+X400--------- 1 0   |
+X800--------- 0 1   |
+XC00--------- 0 0   
+ENHANCED----------- 1
+COMPATIBLE--------- 0
+
+
+       IRQ
+
+
+   3 4 5 7 2
+   . . . . .
+   . . . . .
+
+
+There is a DIP-switch with 8 switches, used to set the shared memory address
+to be used. The first 6 switches set the address, the 7th doesn't have any
+function, and the 8th switch is used to select "compatible" or "enhanced".
+When I got my two cards, one of them had this switch set to "enhanced". That
+card didn't work at all, it wasn't even recognized by the driver. The other
+card had this switch set to "compatible" and it behaved absolutely normally. I
+guess that the switch on one of the cards, must have been changed accidentally
+when the card was taken out of its former host. The question remains
+unanswered, what is the purpose of the "enhanced" position?
+
+[Avery's note: "enhanced" probably either disables shared memory (use IO
+ports instead) or disables IO ports (use memory addresses instead).  This
+varies by the type of card involved.  I fail to see how either of these
+enhance anything.  Send me more detailed information about this mode, or
+just use "compatible" mode instead.]
+
+
+*****************************************************************************
+
+** Waterloo Microsystems Inc. ?? **
+8-bit card (C) 1985
+-------------------
+  - from Robert Michael Best <rmb117@cs.usask.ca>
+
+[Avery's note: these don't work with my driver for some reason.  These cards
+SEEM to have settings similar to the PDI508Plus, which is
+software-configured and doesn't work with my driver either.  The "Waterloo
+chip" is a boot PROM, probably designed specifically for the University of
+Waterloo.  If you have any further information about this card, please
+e-mail me.]
+
+The probe has not been able to detect the card on any of the J2 settings,
+and I tried them again with the "Waterloo" chip removed.
+ 
+ _____________________________________________________________________
+| \/  \/              ___  __ __                                      |
+| C4  C4     |^|     | M ||  ^  ||^|                                  |
+| --  --     |_|     | 5 ||     || | C3                               |
+| \/  \/      C10    |___||     ||_|                                  | 
+| C4  C4             _  _ |     |                 ??                  | 
+| --  --            | \/ ||     |                                     | 
+|                   |    ||     |                                     | 
+|                   |    ||  C1 |                                     | 
+|                   |    ||     |  \/                            _____|    
+|                   | C6 ||     |  C9                           |     |___ 
+|                   |    ||     |  --                           | BNC |___| 
+|                   |    ||     |          >C7|                 |_____|
+|                   |    ||     |                                     |
+| __ __             |____||_____|       1 2 3     6                   |
+||  ^  |     >C4|                      |o|o|o|o|o|o| J2    >C4|       |
+||     |                               |o|o|o|o|o|o|                  |
+|| C2  |     >C4|                                          >C4|       |
+||     |                                   >C8|                       |
+||     |       2 3 4 5 6 7  IRQ                            >C4|       |
+||_____|      |o|o|o|o|o|o| J3                                        |
+|_______      |o|o|o|o|o|o|                            _______________|
+        |                                             |
+        |_____________________________________________|
+
+C1 -- "COM9026
+       SMC 8638"
+      In a chip socket.
+
+C2 -- "@Copyright
+       Waterloo Microsystems Inc.
+       1985"
+      In a chip Socket with info printed on a label covering a round window
+      showing the circuit inside. (The window indicates it is an EPROM chip.)
+
+C3 -- "COM9032
+       SMC 8643"
+      In a chip socket.
+
+C4 -- "74LS"
+      9 total no sockets.
+
+M5 -- "50006-136
+       20.000000 MHZ
+       MTQ-T1-S3
+       0 M-TRON 86-40"
+      Metallic case with 4 pins, no socket.
+
+C6 -- "MOSTEK@TC8643
+       MK6116N-20
+       MALAYSIA"
+      No socket.
+
+C7 -- No stamp or label but in a 20 pin chip socket.
+
+C8 -- "PAL10L8CN
+       8623"
+      In a 20 pin socket.
+
+C9 -- "PAl16R4A-2CN
+       8641"
+      In a 20 pin socket.
+
+C10 -- "M8640
+          NMC
+        9306N"
+       In an 8 pin socket.
+
+?? -- Some components on a smaller board and attached with 20 pins all 
+      along the side closest to the BNC connector.  The are coated in a dark 
+      resin.
+
+On the board there are two jumper banks labeled J2 and J3. The 
+manufacturer didn't put a J1 on the board. The two boards I have both 
+came with a jumper box for each bank.
+
+J2 -- Numbered 1 2 3 4 5 6. 
+      4 and 5 are not stamped due to solder points.
+       
+J3 -- IRQ 2 3 4 5 6 7
+
+The board itself has a maple leaf stamped just above the irq jumpers 
+and "-2 46-86" beside C2. Between C1 and C6 "ASS 'Y 300163" and "@1986 
+CORMAN CUSTOM ELECTRONICS CORP." stamped just below the BNC connector.
+Below that "MADE IN CANADA"
+
+  
+*****************************************************************************
+
+** No Name **
+8-bit cards, 16-bit cards
+-------------------------
+  - from Juergen Seifert <seifert@htwm.de>
+  
+NONAME 8-BIT ARCNET
+===================
+
+I have named this ARCnet card "NONAME", since there is no name of any
+manufacturer on the Installation manual nor on the shipping box. The only
+hint to the existence of a manufacturer at all is written in copper,
+it is "Made in Taiwan"
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the Original
+                    "ARCnet Installation Manual"
+
+
+    ________________________________________________________________
+   | |STAR| BUS| T/P|                                               |
+   | |____|____|____|                                               |
+   |                            _____________________               |
+   |                           |                     |              |
+   |                           |                     |              |
+   |                           |                     |              |
+   |                           |        SMC          |              |
+   |                           |                     |              |
+   |                           |       COM90C65      |              |
+   |                           |                     |              |
+   |                           |                     |              |
+   |                           |__________-__________|              |
+   |                                                           _____|
+   |      _______________                                     |  CN |
+   |     | PROM          |                                    |_____|
+   |     > SOCKET        |                                          |
+   |     |_______________|         1 2 3 4 5 6 7 8  1 2 3 4 5 6 7 8 |
+   |                               _______________  _______________ |
+   |           |o|o|o|o|o|o|o|o|  |      SW1      ||      SW2      ||
+   |           |o|o|o|o|o|o|o|o|  |_______________||_______________||
+   |___         2 3 4 5 7 E E R        Node ID       IOB__|__MEM____|
+       |        \ IRQ   / T T O                      |
+       |__________________1_2_M______________________|
+
+Legend:
+
+COM90C65:       ARCnet Probe
+S1  1-8:    Node ID Select
+S2  1-3:    I/O Base Address Select
+    4-6:    Memory Base Address Select
+    7-8:    RAM Offset Select
+ET1, ET2    Extended Timeout Select
+ROM     ROM Enable Select
+CN              RG62 Coax Connector
+STAR| BUS | T/P Three fields for placing a sign (colored circle)
+                indicating the topology of the card
+
+Setting one of the switches to Off means "1", On means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in group SW1 are used to set the node ID.
+Each node attached to the network must have an unique node ID which
+must be different from 0.
+Switch 8 serves as the least significant bit (LSB).
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+    Switch | Value
+    -------|-------
+      8    |   1
+      7    |   2
+      6    |   4
+      5    |   8
+      4    |  16
+      3    |  32
+      2    |  64
+      1    | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   1 2 3 4 5 6 7 8 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255
+
+
+Setting the I/O Base Address
+----------------------------
+
+The first three switches in switch group SW2 are used to select one
+of eight possible I/O Base addresses using the following table
+
+   Switch      | Hex I/O
+    1   2   3  | Address
+   ------------|--------
+   ON  ON  ON  |  260
+   ON  ON  OFF |  290
+   ON  OFF ON  |  2E0  (Manufacturer's default)
+   ON  OFF OFF |  2F0
+   OFF ON  ON  |  300
+   OFF ON  OFF |  350
+   OFF OFF ON  |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer requires 2K of a 16K block of RAM. The base of this
+16K block can be located in any of eight positions.
+Switches 4-6 of switch group SW2 select the Base of the 16K block.
+Within that 16K address space, the buffer may be assigned any one of four
+positions, determined by the offset, switches 7 and 8 of group SW2.
+
+   Switch     | Hex RAM | Hex ROM
+   4 5 6  7 8 | Address | Address *)
+   -----------|---------|-----------
+   0 0 0  0 0 |  C0000  |  C2000
+   0 0 0  0 1 |  C0800  |  C2000
+   0 0 0  1 0 |  C1000  |  C2000
+   0 0 0  1 1 |  C1800  |  C2000
+              |         |
+   0 0 1  0 0 |  C4000  |  C6000
+   0 0 1  0 1 |  C4800  |  C6000
+   0 0 1  1 0 |  C5000  |  C6000
+   0 0 1  1 1 |  C5800  |  C6000
+              |         |
+   0 1 0  0 0 |  CC000  |  CE000
+   0 1 0  0 1 |  CC800  |  CE000
+   0 1 0  1 0 |  CD000  |  CE000
+   0 1 0  1 1 |  CD800  |  CE000
+              |         |
+   0 1 1  0 0 |  D0000  |  D2000  (Manufacturer's default)
+   0 1 1  0 1 |  D0800  |  D2000
+   0 1 1  1 0 |  D1000  |  D2000
+   0 1 1  1 1 |  D1800  |  D2000
+              |         |
+   1 0 0  0 0 |  D4000  |  D6000
+   1 0 0  0 1 |  D4800  |  D6000
+   1 0 0  1 0 |  D5000  |  D6000
+   1 0 0  1 1 |  D5800  |  D6000
+              |         |
+   1 0 1  0 0 |  D8000  |  DA000
+   1 0 1  0 1 |  D8800  |  DA000
+   1 0 1  1 0 |  D9000  |  DA000
+   1 0 1  1 1 |  D9800  |  DA000
+              |         |
+   1 1 0  0 0 |  DC000  |  DE000
+   1 1 0  0 1 |  DC800  |  DE000
+   1 1 0  1 0 |  DD000  |  DE000
+   1 1 0  1 1 |  DD800  |  DE000
+              |         |
+   1 1 1  0 0 |  E0000  |  E2000
+   1 1 1  0 1 |  E0800  |  E2000
+   1 1 1  1 0 |  E1000  |  E2000
+   1 1 1  1 1 |  E1800  |  E2000
+  
+*) To enable the 8K Boot PROM install the jumper ROM.
+   The default is jumper ROM not installed.
+
+
+Setting Interrupt Request Lines (IRQ)
+-------------------------------------
+
+To select a hardware interrupt level set one (only one!) of the jumpers
+IRQ2, IRQ3, IRQ4, IRQ5 or IRQ7. The manufacturer's default is IRQ2.
+ 
+
+Setting the Timeouts
+--------------------
+
+The two jumpers labeled ET1 and ET2 are used to determine the timeout
+parameters (response and reconfiguration time). Every node in a network
+must be set to the same timeout values.
+
+   ET1 ET2 | Response Time (us) | Reconfiguration Time (ms)
+   --------|--------------------|--------------------------
+   Off Off |        78          |          840   (Default)
+   Off On  |       285          |         1680
+   On  Off |       563          |         1680
+   On  On  |      1130          |         1680
+
+On means jumper installed, Off means jumper not installed
+
+
+NONAME 16-BIT ARCNET
+====================
+
+The manual of my 8-Bit NONAME ARCnet Card contains another description
+of a 16-Bit Coax / Twisted Pair Card. This description is incomplete,
+because there are missing two pages in the manual booklet. (The table
+of contents reports pages ... 2-9, 2-11, 2-12, 3-1, ... but inside
+the booklet there is a different way of counting ... 2-9, 2-10, A-1,
+(empty page), 3-1, ..., 3-18, A-1 (again), A-2)
+Also the picture of the board layout is not as good as the picture of
+8-Bit card, because there isn't any letter like "SW1" written to the
+picture.
+Should somebody have such a board, please feel free to complete this
+description or to send a mail to me!
+
+This description has been written by Juergen Seifert <seifert@htwm.de>
+using information from the Original
+                    "ARCnet Installation Manual"
+
+
+   ___________________________________________________________________
+  <                    _________________  _________________           |
+  >                   |       SW?       ||      SW?        |          |
+  <                   |_________________||_________________|          |
+  >                       ____________________                        |
+  <                      |                    |                       |
+  >                      |                    |                       |
+  <                      |                    |                       |
+  >                      |                    |                       |
+  <                      |                    |                       |
+  >                      |                    |                       |
+  <                      |                    |                       |
+  >                      |____________________|                       |
+  <                                                               ____|
+  >                       ____________________                   |    |
+  <                      |                    |                  | J1 |
+  >                      |                    <                  |    |
+  <                      |____________________|  ? ? ? ? ? ?     |____|
+  >                                             |o|o|o|o|o|o|         |
+  <                                             |o|o|o|o|o|o|         |
+  >                                                                   |
+  <             __                                         ___________|
+  >            |  |                                       |
+  <____________|  |_______________________________________|
+
+
+Setting one of the switches to Off means "1", On means "0".
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in group SW2 are used to set the node ID.
+Each node attached to the network must have an unique node ID which
+must be different from 0.
+Switch 8 serves as the least significant bit (LSB).
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+    Switch | Value
+    -------|-------
+      8    |   1
+      7    |   2
+      6    |   4
+      5    |   8
+      4    |  16
+      3    |  32
+      2    |  64
+      1    | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   1 2 3 4 5 6 7 8 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255
+
+
+Setting the I/O Base Address
+----------------------------
+
+The first three switches in switch group SW1 are used to select one
+of eight possible I/O Base addresses using the following table
+
+   Switch      | Hex I/O
+    3   2   1  | Address
+   ------------|--------
+   ON  ON  ON  |  260
+   ON  ON  OFF |  290
+   ON  OFF ON  |  2E0  (Manufacturer's default)
+   ON  OFF OFF |  2F0
+   OFF ON  ON  |  300
+   OFF ON  OFF |  350
+   OFF OFF ON  |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer requires 2K of a 16K block of RAM. The base of this
+16K block can be located in any of eight positions.
+Switches 6-8 of switch group SW1 select the Base of the 16K block.
+Within that 16K address space, the buffer may be assigned any one of four
+positions, determined by the offset, switches 4 and 5 of group SW1.
+
+   Switch     | Hex RAM | Hex ROM
+   8 7 6  5 4 | Address | Address
+   -----------|---------|-----------
+   0 0 0  0 0 |  C0000  |  C2000
+   0 0 0  0 1 |  C0800  |  C2000
+   0 0 0  1 0 |  C1000  |  C2000
+   0 0 0  1 1 |  C1800  |  C2000
+              |         |
+   0 0 1  0 0 |  C4000  |  C6000
+   0 0 1  0 1 |  C4800  |  C6000
+   0 0 1  1 0 |  C5000  |  C6000
+   0 0 1  1 1 |  C5800  |  C6000
+              |         |
+   0 1 0  0 0 |  CC000  |  CE000
+   0 1 0  0 1 |  CC800  |  CE000
+   0 1 0  1 0 |  CD000  |  CE000
+   0 1 0  1 1 |  CD800  |  CE000
+              |         |
+   0 1 1  0 0 |  D0000  |  D2000  (Manufacturer's default)
+   0 1 1  0 1 |  D0800  |  D2000
+   0 1 1  1 0 |  D1000  |  D2000
+   0 1 1  1 1 |  D1800  |  D2000
+              |         |
+   1 0 0  0 0 |  D4000  |  D6000
+   1 0 0  0 1 |  D4800  |  D6000
+   1 0 0  1 0 |  D5000  |  D6000
+   1 0 0  1 1 |  D5800  |  D6000
+              |         |
+   1 0 1  0 0 |  D8000  |  DA000
+   1 0 1  0 1 |  D8800  |  DA000
+   1 0 1  1 0 |  D9000  |  DA000
+   1 0 1  1 1 |  D9800  |  DA000
+              |         |
+   1 1 0  0 0 |  DC000  |  DE000
+   1 1 0  0 1 |  DC800  |  DE000
+   1 1 0  1 0 |  DD000  |  DE000
+   1 1 0  1 1 |  DD800  |  DE000
+              |         |
+   1 1 1  0 0 |  E0000  |  E2000
+   1 1 1  0 1 |  E0800  |  E2000
+   1 1 1  1 0 |  E1000  |  E2000
+   1 1 1  1 1 |  E1800  |  E2000
+  
+
+Setting Interrupt Request Lines (IRQ)
+-------------------------------------
+
+??????????????????????????????????????
+
+
+Setting the Timeouts
+--------------------
+
+??????????????????????????????????????
+
+
+*****************************************************************************
+
+** No Name **
+8-bit cards ("Made in Taiwan R.O.C.")
+-----------
+  - from Vojtech Pavlik <vojtech@suse.cz>
+
+I have named this ARCnet card "NONAME", since I got only the card with
+no manual at all and the only text identifying the manufacturer is 
+"MADE IN TAIWAN R.O.C" printed on the card.
+
+          ____________________________________________________________
+         |                 1 2 3 4 5 6 7 8                            |
+         | |o|o| JP1       o|o|o|o|o|o|o|o| ON                        |
+         |  +              o|o|o|o|o|o|o|o|                        ___|
+         |  _____________  o|o|o|o|o|o|o|o| OFF         _____     |   | ID7
+         | |             | SW1                         |     |    |   | ID6
+         | > RAM (2k)    |        ____________________ |  H  |    | S | ID5
+         | |_____________|       |                    ||  y  |    | W | ID4
+         |                       |                    ||  b  |    | 2 | ID3
+         |                       |                    ||  r  |    |   | ID2
+         |                       |                    ||  i  |    |   | ID1
+         |                       |       90C65        ||  d  |    |___| ID0
+         |      SW3              |                    ||     |        |      
+         | |o|o|o|o|o|o|o|o| ON  |                    ||  I  |        |
+         | |o|o|o|o|o|o|o|o|     |                    ||  C  |        |
+         | |o|o|o|o|o|o|o|o| OFF |____________________||     |   _____|
+         |  1 2 3 4 5 6 7 8                            |     |  |     |___
+         |  ______________                             |     |  | BNC |___|
+         | |              |                            |_____|  |_____|
+         | > EPROM SOCKET |                                           |
+         | |______________|                                           |
+         |                                              ______________|
+         |                                             |
+         |_____________________________________________|
+
+Legend:
+
+90C65       ARCNET Chip 
+SW1 1-5:    Base Memory Address Select
+    6-8:    Base I/O Address Select
+SW2 1-8:    Node ID Select (ID0-ID7)
+SW3 1-5:    IRQ Select   
+    6-7:    Extra Timeout
+    8  :    ROM Enable   
+JP1         Led connector
+BNC         Coax connector
+
+Although the jumpers SW1 and SW3 are marked SW, not JP, they are jumpers, not 
+switches.
+
+Setting the jumpers to ON means connecting the upper two pins, off the bottom 
+two - or - in case of IRQ setting, connecting none of them at all.
+
+Setting the Node ID
+-------------------
+
+The eight switches in SW2 are used to set the node ID. Each node attached
+to the network must have an unique node ID which must not be 0.
+Switch 1 (ID0) serves as the least significant bit (LSB).
+
+Setting one of the switches to Off means "1", On means "0".
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+
+   Switch | Label | Value
+   -------|-------|-------
+     1    | ID0   |   1
+     2    | ID1   |   2
+     3    | ID2   |   4
+     4    | ID3   |   8
+     5    | ID4   |  16
+     6    | ID5   |  32
+     7    | ID6   |  64
+     8    | ID7   | 128
+
+Some Examples:
+
+    Switch         | Hex     | Decimal 
+   8 7 6 5 4 3 2 1 | Node ID | Node ID
+   ----------------|---------|---------
+   0 0 0 0 0 0 0 0 |    not allowed
+   0 0 0 0 0 0 0 1 |    1    |    1 
+   0 0 0 0 0 0 1 0 |    2    |    2
+   0 0 0 0 0 0 1 1 |    3    |    3
+       . . .       |         |
+   0 1 0 1 0 1 0 1 |   55    |   85
+       . . .       |         |
+   1 0 1 0 1 0 1 0 |   AA    |  170
+       . . .       |         |  
+   1 1 1 1 1 1 0 1 |   FD    |  253
+   1 1 1 1 1 1 1 0 |   FE    |  254
+   1 1 1 1 1 1 1 1 |   FF    |  255
+
+
+Setting the I/O Base Address
+----------------------------
+
+The last three switches in switch block SW1 are used to select one
+of eight possible I/O Base addresses using the following table
+
+
+   Switch      | Hex I/O
+    6   7   8  | Address
+   ------------|--------
+   ON  ON  ON  |  260
+   OFF ON  ON  |  290
+   ON  OFF ON  |  2E0  (Manufacturer's default)
+   OFF OFF ON  |  2F0
+   ON  ON  OFF |  300
+   OFF ON  OFF |  350
+   ON  OFF OFF |  380
+   OFF OFF OFF |  3E0
+
+
+Setting the Base Memory (RAM) buffer Address
+--------------------------------------------
+
+The memory buffer (RAM) requires 2K. The base of this buffer can be 
+located in any of eight positions. The address of the Boot Prom is
+memory base + 0x2000.
+Jumpers 3-5 of jumper block SW1 select the Memory Base address.
+
+   Switch              | Hex RAM | Hex ROM
+    1   2   3   4   5  | Address | Address *)
+   --------------------|---------|-----------
+   ON  ON  ON  ON  ON  |  C0000  |  C2000
+   ON  ON  OFF ON  ON  |  C4000  |  C6000
+   ON  ON  ON  OFF ON  |  CC000  |  CE000
+   ON  ON  OFF OFF ON  |  D0000  |  D2000  (Manufacturer's default)
+   ON  ON  ON  ON  OFF |  D4000  |  D6000
+   ON  ON  OFF ON  OFF |  D8000  |  DA000
+   ON  ON  ON  OFF OFF |  DC000  |  DE000
+   ON  ON  OFF OFF OFF |  E0000  |  E2000
+  
+*) To enable the Boot ROM set the jumper 8 of jumper block SW3 to position ON.
+
+The jumpers 1 and 2 probably add 0x0800, 0x1000 and 0x1800 to RAM adders.
+
+Setting the Interrupt Line
+--------------------------
+
+Jumpers 1-5 of the jumper block SW3 control the IRQ level.
+
+    Jumper              |  IRQ
+    1   2   3   4   5   |
+   ----------------------------
+    ON  OFF OFF OFF OFF |  2
+    OFF ON  OFF OFF OFF |  3
+    OFF OFF ON  OFF OFF |  4
+    OFF OFF OFF ON  OFF |  5
+    OFF OFF OFF OFF ON  |  7
+
+
+Setting the Timeout Parameters
+------------------------------
+
+The jumpers 6-7 of the jumper block SW3 are used to determine the timeout 
+parameters. These two jumpers are normally left in the OFF position.
+
+
+*****************************************************************************
+
+** No Name **
+(Generic Model 9058)
+--------------------
+  - from Andrew J. Kroll <ag784@freenet.buffalo.edu>
+  - Sorry this sat in my to-do box for so long, Andrew! (yikes - over a
+    year!)
+                                                                      _____
+                                                                     |    <
+                                                                     | .---'
+    ________________________________________________________________ | |
+   |                           |     SW2     |                      |  |
+   |   ___________             |_____________|                      |  |
+   |  |           |              1 2 3 4 5 6                     ___|  |
+   |  >  6116 RAM |         _________                         8 |   |  |
+   |  |___________|        |20MHzXtal|                        7 |   |  |
+   |                       |_________|       __________       6 | S |  |
+   |    74LS373                             |          |-     5 | W |  |
+   |   _________                            |      E   |-     4 |   |  |
+   |   >_______|              ______________|..... P   |-     3 | 3 |  |
+   |                         |              |    : O   |-     2 |   |  |
+   |                         |              |    : X   |-     1 |___|  |
+   |   ________________      |              |    : Y   |-           |  |
+   |  |      SW1       |     |      SL90C65 |    :     |-           |  |
+   |  |________________|     |              |    : B   |-           |  |
+   |    1 2 3 4 5 6 7 8      |              |    : O   |-           |  |
+   |                         |_________o____|..../ A   |-    _______|  |
+   |    ____________________                |      R   |-   |       |------,   
+   |   |                    |               |      D   |-   |  BNC  |   #  |
+   |   > 2764 PROM SOCKET   |               |__________|-   |_______|------'
+   |   |____________________|              _________                |  |
+   |                                       >________| <- 74LS245    |  |
+   |                                                                |  |
+   |___                                               ______________|  |
+       |H H H H H H H H H H H H H H H H H H H H H H H|               | |
+       |U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U_U|               | |
+                                                                      \|
+Legend:
+
+SL90C65 	ARCNET Controller / Transceiver /Logic
+SW1	1-5:	IRQ Select
+	  6:	ET1
+	  7:	ET2
+	  8:	ROM ENABLE 
+SW2	1-3:    Memory Buffer/PROM Address
+	3-6:	I/O Address Map
+SW3	1-8:	Node ID Select
+BNC		BNC RG62/U Connection 
+		*I* have had success using RG59B/U with *NO* terminators!
+		What gives?!
+
+SW1: Timeouts, Interrupt and ROM
+---------------------------------
+
+To select a hardware interrupt level set one (only one!) of the dip switches
+up (on) SW1...(switches 1-5)
+IRQ3, IRQ4, IRQ5, IRQ7, IRQ2. The Manufacturer's default is IRQ2.
+
+The switches on SW1 labeled EXT1 (switch 6) and EXT2 (switch 7)
+are used to determine the timeout parameters. These two dip switches
+are normally left off (down).
+
+   To enable the 8K Boot PROM position SW1 switch 8 on (UP) labeled ROM.
+   The default is jumper ROM not installed.
+
+
+Setting the I/O Base Address
+----------------------------
+
+The last three switches in switch group SW2 are used to select one
+of eight possible I/O Base addresses using the following table
+
+
+   Switch | Hex I/O
+   4 5 6  | Address
+   -------|--------
+   0 0 0  |  260
+   0 0 1  |  290
+   0 1 0  |  2E0  (Manufacturer's default)
+   0 1 1  |  2F0
+   1 0 0  |  300
+   1 0 1  |  350
+   1 1 0  |  380
+   1 1 1  |  3E0
+
+
+Setting the Base Memory Address (RAM & ROM)
+-------------------------------------------
+
+The memory buffer requires 2K of a 16K block of RAM. The base of this
+16K block can be located in any of eight positions.
+Switches 1-3 of switch group SW2 select the Base of the 16K block.
+(0 = DOWN, 1 = UP)
+I could, however, only verify two settings...
+
+   Switch| Hex RAM | Hex ROM
+   1 2 3 | Address | Address
+   ------|---------|-----------
+   0 0 0 |  E0000  |  E2000
+   0 0 1 |  D0000  |  D2000  (Manufacturer's default)
+   0 1 0 |  ?????  |  ?????
+   0 1 1 |  ?????  |  ?????  
+   1 0 0 |  ?????  |  ?????
+   1 0 1 |  ?????  |  ?????
+   1 1 0 |  ?????  |  ?????
+   1 1 1 |  ?????  |  ?????
+
+
+Setting the Node ID
+-------------------
+
+The eight switches in group SW3 are used to set the node ID.
+Each node attached to the network must have an unique node ID which
+must be different from 0.
+Switch 1 serves as the least significant bit (LSB).
+switches in the DOWN position are OFF (0) and in the UP position are ON (1)
+
+The node ID is the sum of the values of all switches set to "1"  
+These values are:
+    Switch | Value
+    -------|-------
+      1    |   1
+      2    |   2
+      3    |   4
+      4    |   8
+      5    |  16
+      6    |  32
+      7    |  64
+      8    | 128
+
+Some Examples:
+
+    Switch#     |   Hex   | Decimal 
+8 7 6 5 4 3 2 1 | Node ID | Node ID
+----------------|---------|---------
+0 0 0 0 0 0 0 0 |    not allowed  <-.
+0 0 0 0 0 0 0 1 |    1    |    1    | 
+0 0 0 0 0 0 1 0 |    2    |    2    |
+0 0 0 0 0 0 1 1 |    3    |    3    |
+    . . .       |         |         |
+0 1 0 1 0 1 0 1 |   55    |   85    |
+    . . .       |         |         + Don't use 0 or 255!
+1 0 1 0 1 0 1 0 |   AA    |  170    |
+    . . .       |         |         |
+1 1 1 1 1 1 0 1 |   FD    |  253    |
+1 1 1 1 1 1 1 0 |   FE    |  254    |
+1 1 1 1 1 1 1 1 |   FF    |  255  <-'
+  
+
+*****************************************************************************
+
+** Tiara **
+(model unknown)
+-------------------------
+  - from Christoph Lameter <christoph@lameter.com>
+  
+
+Here is information about my card as far as I could figure it out:
+----------------------------------------------- tiara
+Tiara LanCard of Tiara Computer Systems.
+
++----------------------------------------------+
+!           ! Transmitter Unit !               !
+!           +------------------+             -------
+!          MEM                              Coax Connector
+!  ROM    7654321 <- I/O                     -------
+!  :  :   +--------+                           !
+!  :  :   ! 90C66LJ!                         +++
+!  :  :   !        !                         !D  Switch to set
+!  :  :   !        !                         !I  the Nodenumber
+!  :  :   +--------+                         !P
+!                                            !++
+!         234567 <- IRQ                      !
++------------!!!!!!!!!!!!!!!!!!!!!!!!--------+
+             !!!!!!!!!!!!!!!!!!!!!!!!
+
+0 = Jumper Installed
+1 = Open
+
+Top Jumper line Bit 7 = ROM Enable 654=Memory location 321=I/O
+
+Settings for Memory Location (Top Jumper Line)
+456     Address selected
+000	C0000
+001     C4000
+010     CC000
+011     D0000
+100     D4000
+101     D8000
+110     DC000     
+111     E0000
+
+Settings for I/O Address (Top Jumper Line)
+123     Port
+000	260
+001	290
+010	2E0
+011	2F0
+100	300
+101	350
+110	380
+111	3E0
+
+Settings for IRQ Selection (Lower Jumper Line)
+234567
+011111 IRQ 2
+101111 IRQ 3
+110111 IRQ 4
+111011 IRQ 5
+111110 IRQ 7
+
+*****************************************************************************
+
+
+Other Cards
+-----------
+
+I have no information on other models of ARCnet cards at the moment.  Please
+send any and all info to:
+	apenwarr@worldvisions.ca
+
+Thanks.
diff --git a/Documentation/networking/arcnet.txt b/Documentation/networking/arcnet.txt
new file mode 100644
index 000000000..aff97f47c
--- /dev/null
+++ b/Documentation/networking/arcnet.txt
@@ -0,0 +1,556 @@
+----------------------------------------------------------------------------
+NOTE:  See also arcnet-hardware.txt in this directory for jumper-setting
+and cabling information if you're like many of us and didn't happen to get a
+manual with your ARCnet card.
+----------------------------------------------------------------------------
+
+Since no one seems to listen to me otherwise, perhaps a poem will get your
+attention:
+		This driver's getting fat and beefy,
+		But my cat is still named Fifi.
+
+Hmm, I think I'm allowed to call that a poem, even though it's only two
+lines.  Hey, I'm in Computer Science, not English.  Give me a break.
+
+The point is:  I REALLY REALLY REALLY REALLY REALLY want to hear from you if
+you test this and get it working.  Or if you don't.  Or anything.
+
+ARCnet 0.32 ALPHA first made it into the Linux kernel 1.1.80 - this was
+nice, but after that even FEWER people started writing to me because they
+didn't even have to install the patch.  <sigh>
+
+Come on, be a sport!  Send me a success report!
+
+(hey, that was even better than my original poem... this is getting bad!)
+
+
+--------
+WARNING:
+--------
+
+If you don't e-mail me about your success/failure soon, I may be forced to
+start SINGING.  And we don't want that, do we?
+
+(You know, it might be argued that I'm pushing this point a little too much. 
+If you think so, why not flame me in a quick little e-mail?  Please also
+include the type of card(s) you're using, software, size of network, and
+whether it's working or not.)
+
+My e-mail address is: apenwarr@worldvisions.ca
+
+
+---------------------------------------------------------------------------
+
+			
+These are the ARCnet drivers for Linux.
+
+
+This new release (2.91) has been put together by David Woodhouse 
+<dwmw2@infradead.org>, in an attempt to tidy up the driver after adding support
+for yet another chipset. Now the generic support has been separated from the
+individual chipset drivers, and the source files aren't quite so packed with
+#ifdefs! I've changed this file a bit, but kept it in the first person from
+Avery, because I didn't want to completely rewrite it.
+
+The previous release resulted from many months of on-and-off effort from me
+(Avery Pennarun), many bug reports/fixes and suggestions from others, and in
+particular a lot of input and coding from Tomasz Motylewski.  Starting with
+ARCnet 2.10 ALPHA, Tomasz's all-new-and-improved RFC1051 support has been
+included and seems to be working fine!
+
+
+Where do I discuss these drivers?
+---------------------------------
+
+Tomasz has been so kind as to set up a new and improved mailing list. 
+Subscribe by sending a message with the BODY "subscribe linux-arcnet YOUR
+REAL NAME" to listserv@tichy.ch.uj.edu.pl.  Then, to submit messages to the
+list, mail to linux-arcnet@tichy.ch.uj.edu.pl.
+
+There are archives of the mailing list at:
+	http://epistolary.org/mailman/listinfo.cgi/arcnet
+
+The people on linux-net@vger.kernel.org (now defunct, replaced by
+netdev@vger.kernel.org) have also been known to be very helpful, especially
+when we're talking about ALPHA Linux kernels that may or may not work right
+in the first place.
+
+
+Other Drivers and Info
+----------------------
+
+You can try my ARCNET page on the World Wide Web at:
+	http://www.qis.net/~jschmitz/arcnet/	
+
+Also, SMC (one of the companies that makes ARCnet cards) has a WWW site you
+might be interested in, which includes several drivers for various cards
+including ARCnet.  Try:
+	http://www.smc.com/
+	
+Performance Technologies makes various network software that supports
+ARCnet:
+	http://www.perftech.com/ or ftp to ftp.perftech.com.
+	
+Novell makes a networking stack for DOS which includes ARCnet drivers.  Try
+FTPing to ftp.novell.com.
+
+You can get the Crynwr packet driver collection (including arcether.com, the
+one you'll want to use with ARCnet cards) from
+oak.oakland.edu:/simtel/msdos/pktdrvr. It won't work perfectly on a 386+
+without patches, though, and also doesn't like several cards.  Fixed
+versions are available on my WWW page, or via e-mail if you don't have WWW
+access. 
+
+
+Installing the Driver
+---------------------
+
+All you will need to do in order to install the driver is:
+	make config
+		(be sure to choose ARCnet in the network devices 
+		and at least one chipset driver.)
+	make clean
+	make zImage
+	
+If you obtained this ARCnet package as an upgrade to the ARCnet driver in
+your current kernel, you will need to first copy arcnet.c over the one in
+the linux/drivers/net directory.
+
+You will know the driver is installed properly if you get some ARCnet
+messages when you reboot into the new Linux kernel.
+
+There are four chipset options:
+
+ 1. Standard ARCnet COM90xx chipset.
+
+This is the normal ARCnet card, which you've probably got. This is the only
+chipset driver which will autoprobe if not told where the card is.
+It following options on the command line:
+ com90xx=[<io>[,<irq>[,<shmem>]]][,<name>] | <name>
+
+If you load the chipset support as a module, the options are:
+ io=<io> irq=<irq> shmem=<shmem> device=<name>
+
+To disable the autoprobe, just specify "com90xx=" on the kernel command line.
+To specify the name alone, but allow autoprobe, just put "com90xx=<name>"
+
+ 2. ARCnet COM20020 chipset.
+
+This is the new chipset from SMC with support for promiscuous mode (packet 
+sniffing), extra diagnostic information, etc. Unfortunately, there is no
+sensible method of autoprobing for these cards. You must specify the I/O
+address on the kernel command line.
+The command line options are:
+ com20020=<io>[,<irq>[,<node_ID>[,backplane[,CKP[,timeout]]]]][,name]
+
+If you load the chipset support as a module, the options are:
+ io=<io> irq=<irq> node=<node_ID> backplane=<backplane> clock=<CKP>
+ timeout=<timeout> device=<name>
+
+The COM20020 chipset allows you to set the node ID in software, overriding the
+default which is still set in DIP switches on the card. If you don't have the
+COM20020 data sheets, and you don't know what the other three options refer
+to, then they won't interest you - forget them.
+
+ 3. ARCnet COM90xx chipset in IO-mapped mode.
+
+This will also work with the normal ARCnet cards, but doesn't use the shared
+memory. It performs less well than the above driver, but is provided in case
+you have a card which doesn't support shared memory, or (strangely) in case
+you have so many ARCnet cards in your machine that you run out of shmem slots.
+If you don't give the IO address on the kernel command line, then the driver
+will not find the card.
+The command line options are:
+ com90io=<io>[,<irq>][,<name>] 
+
+If you load the chipset support as a module, the options are:
+ io=<io> irq=<irq> device=<name>
+
+ 4. ARCnet RIM I cards.
+
+These are COM90xx chips which are _completely_ memory mapped. The support for
+these is not tested. If you have one, please mail the author with a success 
+report. All options must be specified, except the device name.
+Command line options:
+ arcrimi=<shmem>,<irq>,<node_ID>[,<name>]
+
+If you load the chipset support as a module, the options are:
+ shmem=<shmem> irq=<irq> node=<node_ID> device=<name>
+
+
+Loadable Module Support
+-----------------------
+
+Configure and rebuild Linux.  When asked, answer 'm' to "Generic ARCnet 
+support" and to support for your ARCnet chipset if you want to use the
+loadable module. You can also say 'y' to "Generic ARCnet support" and 'm' 
+to the chipset support if you wish.
+
+	make config
+	make clean	
+	make zImage
+	make modules
+	
+If you're using a loadable module, you need to use insmod to load it, and
+you can specify various characteristics of your card on the command
+line.  (In recent versions of the driver, autoprobing is much more reliable
+and works as a module, so most of this is now unnecessary.)
+
+For example:
+	cd /usr/src/linux/modules
+	insmod arcnet.o
+	insmod com90xx.o
+	insmod com20020.o io=0x2e0 device=eth1
+	
+
+Using the Driver
+----------------
+
+If you build your kernel with ARCnet COM90xx support included, it should 
+probe for your card automatically when you boot. If you use a different
+chipset driver complied into the kernel, you must give the necessary options
+on the kernel command line, as detailed above.
+
+Go read the NET-2-HOWTO and ETHERNET-HOWTO for Linux; they should be
+available where you picked up this driver.  Think of your ARCnet as a
+souped-up (or down, as the case may be) Ethernet card.
+
+By the way, be sure to change all references from "eth0" to "arc0" in the
+HOWTOs.  Remember that ARCnet isn't a "true" Ethernet, and the device name
+is DIFFERENT.
+
+
+Multiple Cards in One Computer
+------------------------------
+
+Linux has pretty good support for this now, but since I've been busy, the
+ARCnet driver has somewhat suffered in this respect. COM90xx support, if 
+compiled into the kernel, will (try to) autodetect all the installed cards. 
+
+If you have other cards, with support compiled into the kernel, then you can 
+just repeat the options on the kernel command line, e.g.:
+LILO: linux com20020=0x2e0 com20020=0x380 com90io=0x260
+
+If you have the chipset support built as a loadable module, then you need to 
+do something like this:
+	insmod -o arc0 com90xx
+	insmod -o arc1 com20020 io=0x2e0
+	insmod -o arc2 com90xx
+The ARCnet drivers will now sort out their names automatically.
+
+
+How do I get it to work with...?
+--------------------------------
+
+NFS: Should be fine linux->linux, just pretend you're using Ethernet cards. 
+        oak.oakland.edu:/simtel/msdos/nfs has some nice DOS clients.  There
+        is also a DOS-based NFS server called SOSS.  It doesn't multitask
+        quite the way Linux does (actually, it doesn't multitask AT ALL) but
+        you never know what you might need.
+        
+        With AmiTCP (and possibly others), you may need to set the following
+        options in your Amiga nfstab:  MD 1024 MR 1024 MW 1024
+        (Thanks to Christian Gottschling <ferksy@indigo.tng.oche.de>
+	for this.)
+	
+	Probably these refer to maximum NFS data/read/write block sizes.  I
+	don't know why the defaults on the Amiga didn't work; write to me if
+	you know more.
+
+DOS: If you're using the freeware arcether.com, you might want to install
+        the driver patch from my web page.  It helps with PC/TCP, and also
+        can get arcether to load if it timed out too quickly during
+        initialization.  In fact, if you use it on a 386+ you REALLY need
+        the patch, really.
+	
+Windows:  See DOS :)  Trumpet Winsock works fine with either the Novell or
+	Arcether client, assuming you remember to load winpkt of course.
+
+LAN Manager and Windows for Workgroups: These programs use protocols that
+        are incompatible with the Internet standard.  They try to pretend
+        the cards are Ethernet, and confuse everyone else on the network. 
+        
+        However, v2.00 and higher of the Linux ARCnet driver supports this
+        protocol via the 'arc0e' device.  See the section on "Multiprotocol
+        Support" for more information.
+
+	Using the freeware Samba server and clients for Linux, you can now
+	interface quite nicely with TCP/IP-based WfWg or Lan Manager
+	networks.
+	
+Windows 95: Tools are included with Win95 that let you use either the LANMAN
+	style network drivers (NDIS) or Novell drivers (ODI) to handle your
+	ARCnet packets.  If you use ODI, you'll need to use the 'arc0'
+	device with Linux.  If you use NDIS, then try the 'arc0e' device. 
+	See the "Multiprotocol Support" section below if you need arc0e,
+	you're completely insane, and/or you need to build some kind of
+	hybrid network that uses both encapsulation types.
+
+OS/2: I've been told it works under Warp Connect with an ARCnet driver from
+	SMC.  You need to use the 'arc0e' interface for this.  If you get
+	the SMC driver to work with the TCP/IP stuff included in the
+	"normal" Warp Bonus Pack, let me know.
+
+	ftp.microsoft.com also has a freeware "Lan Manager for OS/2" client
+	which should use the same protocol as WfWg does.  I had no luck
+	installing it under Warp, however.  Please mail me with any results.
+
+NetBSD/AmiTCP: These use an old version of the Internet standard ARCnet
+	protocol (RFC1051) which is compatible with the Linux driver v2.10
+	ALPHA and above using the arc0s device. (See "Multiprotocol ARCnet"
+	below.)  ** Newer versions of NetBSD apparently support RFC1201.
+
+
+Using Multiprotocol ARCnet
+--------------------------
+
+The ARCnet driver v2.10 ALPHA supports three protocols, each on its own
+"virtual network device":
+
+	arc0  - RFC1201 protocol, the official Internet standard which just
+		happens to be 100% compatible with Novell's TRXNET driver. 
+		Version 1.00 of the ARCnet driver supported _only_ this
+		protocol.  arc0 is the fastest of the three protocols (for
+		whatever reason), and allows larger packets to be used
+		because it supports RFC1201 "packet splitting" operations. 
+		Unless you have a specific need to use a different protocol,
+		I strongly suggest that you stick with this one.
+		
+	arc0e - "Ethernet-Encapsulation" which sends packets over ARCnet
+		that are actually a lot like Ethernet packets, including the
+		6-byte hardware addresses.  This protocol is compatible with
+		Microsoft's NDIS ARCnet driver, like the one in WfWg and
+		LANMAN.  Because the MTU of 493 is actually smaller than the
+		one "required" by TCP/IP (576), there is a chance that some
+		network operations will not function properly.  The Linux
+		TCP/IP layer can compensate in most cases, however, by
+		automatically fragmenting the TCP/IP packets to make them
+		fit.  arc0e also works slightly more slowly than arc0, for
+		reasons yet to be determined.  (Probably it's the smaller
+		MTU that does it.)
+		
+	arc0s - The "[s]imple" RFC1051 protocol is the "previous" Internet
+		standard that is completely incompatible with the new
+		standard.  Some software today, however, continues to
+		support the old standard (and only the old standard)
+		including NetBSD and AmiTCP.  RFC1051 also does not support
+		RFC1201's packet splitting, and the MTU of 507 is still
+		smaller than the Internet "requirement," so it's quite
+		possible that you may run into problems.  It's also slower
+		than RFC1201 by about 25%, for the same reason as arc0e.
+		
+		The arc0s support was contributed by Tomasz Motylewski
+		and modified somewhat by me.  Bugs are probably my fault.
+
+You can choose not to compile arc0e and arc0s into the driver if you want -
+this will save you a bit of memory and avoid confusion when eg. trying to
+use the "NFS-root" stuff in recent Linux kernels.
+
+The arc0e and arc0s devices are created automatically when you first
+ifconfig the arc0 device.  To actually use them, though, you need to also
+ifconfig the other virtual devices you need.  There are a number of ways you
+can set up your network then:
+
+
+1. Single Protocol.
+
+   This is the simplest way to configure your network: use just one of the
+   two available protocols.  As mentioned above, it's a good idea to use
+   only arc0 unless you have a good reason (like some other software, ie.
+   WfWg, that only works with arc0e).
+   
+   If you need only arc0, then the following commands should get you going:
+   	ifconfig arc0 MY.IP.ADD.RESS
+   	route add MY.IP.ADD.RESS arc0
+   	route add -net SUB.NET.ADD.RESS arc0
+   	[add other local routes here]
+   	
+   If you need arc0e (and only arc0e), it's a little different:
+   	ifconfig arc0 MY.IP.ADD.RESS
+   	ifconfig arc0e MY.IP.ADD.RESS
+   	route add MY.IP.ADD.RESS arc0e
+   	route add -net SUB.NET.ADD.RESS arc0e
+   
+   arc0s works much the same way as arc0e.
+
+
+2. More than one protocol on the same wire.
+
+   Now things start getting confusing.  To even try it, you may need to be
+   partly crazy.  Here's what *I* did. :) Note that I don't include arc0s in
+   my home network; I don't have any NetBSD or AmiTCP computers, so I only
+   use arc0s during limited testing.
+
+   I have three computers on my home network; two Linux boxes (which prefer
+   RFC1201 protocol, for reasons listed above), and one XT that can't run
+   Linux but runs the free Microsoft LANMAN Client instead.
+
+   Worse, one of the Linux computers (freedom) also has a modem and acts as
+   a router to my Internet provider.  The other Linux box (insight) also has
+   its own IP address and needs to use freedom as its default gateway.  The
+   XT (patience), however, does not have its own Internet IP address and so
+   I assigned it one on a "private subnet" (as defined by RFC1597).
+
+   To start with, take a simple network with just insight and freedom. 
+   Insight needs to:
+   	- talk to freedom via RFC1201 (arc0) protocol, because I like it
+	  more and it's faster.
+	- use freedom as its Internet gateway.
+	
+   That's pretty easy to do.  Set up insight like this:
+   	ifconfig arc0 insight
+   	route add insight arc0
+   	route add freedom arc0	/* I would use the subnet here (like I said
+					to to in "single protocol" above),
+   					but the rest of the subnet
+   					unfortunately lies across the PPP
+   					link on freedom, which confuses
+   					things. */
+   	route add default gw freedom
+   	
+   And freedom gets configured like so:
+   	ifconfig arc0 freedom
+   	route add freedom arc0
+   	route add insight arc0
+   	/* and default gateway is configured by pppd */
+   	
+   Great, now insight talks to freedom directly on arc0, and sends packets
+   to the Internet through freedom.  If you didn't know how to do the above,
+   you should probably stop reading this section now because it only gets
+   worse.
+
+   Now, how do I add patience into the network?  It will be using LANMAN
+   Client, which means I need the arc0e device.  It needs to be able to talk
+   to both insight and freedom, and also use freedom as a gateway to the
+   Internet.  (Recall that patience has a "private IP address" which won't
+   work on the Internet; that's okay, I configured Linux IP masquerading on
+   freedom for this subnet).
+   
+   So patience (necessarily; I don't have another IP number from my
+   provider) has an IP address on a different subnet than freedom and
+   insight, but needs to use freedom as an Internet gateway.  Worse, most
+   DOS networking programs, including LANMAN, have braindead networking
+   schemes that rely completely on the netmask and a 'default gateway' to
+   determine how to route packets.  This means that to get to freedom or
+   insight, patience WILL send through its default gateway, regardless of
+   the fact that both freedom and insight (courtesy of the arc0e device)
+   could understand a direct transmission.
+   
+   I compensate by giving freedom an extra IP address - aliased 'gatekeeper'
+   - that is on my private subnet, the same subnet that patience is on.  I
+   then define gatekeeper to be the default gateway for patience.
+   
+   To configure freedom (in addition to the commands above):
+   	ifconfig arc0e gatekeeper
+   	route add gatekeeper arc0e
+   	route add patience arc0e
+   
+   This way, freedom will send all packets for patience through arc0e,
+   giving its IP address as gatekeeper (on the private subnet).  When it
+   talks to insight or the Internet, it will use its "freedom" Internet IP
+   address.
+   
+   You will notice that we haven't configured the arc0e device on insight. 
+   This would work, but is not really necessary, and would require me to
+   assign insight another special IP number from my private subnet.  Since
+   both insight and patience are using freedom as their default gateway, the
+   two can already talk to each other.
+   
+   It's quite fortunate that I set things up like this the first time (cough
+   cough) because it's really handy when I boot insight into DOS.  There, it
+   runs the Novell ODI protocol stack, which only works with RFC1201 ARCnet. 
+   In this mode it would be impossible for insight to communicate directly
+   with patience, since the Novell stack is incompatible with Microsoft's
+   Ethernet-Encap.  Without changing any settings on freedom or patience, I
+   simply set freedom as the default gateway for insight (now in DOS,
+   remember) and all the forwarding happens "automagically" between the two
+   hosts that would normally not be able to communicate at all.
+   
+   For those who like diagrams, I have created two "virtual subnets" on the
+   same physical ARCnet wire.  You can picture it like this:
+   
+                                                    
+          [RFC1201 NETWORK]                   [ETHER-ENCAP NETWORK]
+      (registered Internet subnet)           (RFC1597 private subnet)
+  
+                             (IP Masquerade)
+          /---------------\         *            /---------------\
+          |               |         *            |               |
+          |               +-Freedom-*-Gatekeeper-+               |
+          |               |    |    *            |               |
+          \-------+-------/    |    *            \-------+-------/
+                  |            |                         |
+               Insight         |                      Patience
+                           (Internet)
+
+
+
+It works: what now?
+-------------------
+
+Send mail describing your setup, preferably including driver version, kernel
+version, ARCnet card model, CPU type, number of systems on your network, and
+list of software in use to me at the following address:
+	apenwarr@worldvisions.ca
+
+I do send (sometimes automated) replies to all messages I receive.  My email
+can be weird (and also usually gets forwarded all over the place along the
+way to me), so if you don't get a reply within a reasonable time, please
+resend.
+
+
+It doesn't work: what now?
+--------------------------
+
+Do the same as above, but also include the output of the ifconfig and route
+commands, as well as any pertinent log entries (ie. anything that starts
+with "arcnet:" and has shown up since the last reboot) in your mail.
+
+If you want to try fixing it yourself (I strongly recommend that you mail me
+about the problem first, since it might already have been solved) you may
+want to try some of the debug levels available.  For heavy testing on
+D_DURING or more, it would be a REALLY good idea to kill your klogd daemon
+first!  D_DURING displays 4-5 lines for each packet sent or received.  D_TX,
+D_RX, and D_SKB actually DISPLAY each packet as it is sent or received,
+which is obviously quite big.
+
+Starting with v2.40 ALPHA, the autoprobe routines have changed
+significantly.  In particular, they won't tell you why the card was not
+found unless you turn on the D_INIT_REASONS debugging flag.
+
+Once the driver is running, you can run the arcdump shell script (available
+from me or in the full ARCnet package, if you have it) as root to list the
+contents of the arcnet buffers at any time.  To make any sense at all out of
+this, you should grab the pertinent RFCs. (some are listed near the top of
+arcnet.c).  arcdump assumes your card is at 0xD0000.  If it isn't, edit the
+script.
+
+Buffers 0 and 1 are used for receiving, and Buffers 2 and 3 are for sending. 
+Ping-pong buffers are implemented both ways.
+
+If your debug level includes D_DURING and you did NOT define SLOW_XMIT_COPY,
+the buffers are cleared to a constant value of 0x42 every time the card is
+reset (which should only happen when you do an ifconfig up, or when Linux
+decides that the driver is broken).  During a transmit, unused parts of the
+buffer will be cleared to 0x42 as well.  This is to make it easier to figure
+out which bytes are being used by a packet.
+
+You can change the debug level without recompiling the kernel by typing:
+	ifconfig arc0 down metric 1xxx
+	/etc/rc.d/rc.inet1
+where "xxx" is the debug level you want.  For example, "metric 1015" would put
+you at debug level 15.  Debug level 7 is currently the default.
+
+Note that the debug level is (starting with v1.90 ALPHA) a binary
+combination of different debug flags; so debug level 7 is really 1+2+4 or
+D_NORMAL+D_EXTRA+D_INIT.  To include D_DURING, you would add 16 to this,
+resulting in debug level 23.
+
+If you don't understand that, you probably don't want to know anyway. 
+E-mail me about your problem.
+
+
+I want to send money: what now?
+-------------------------------
+
+Go take a nap or something.  You'll feel better in the morning.
diff --git a/Documentation/networking/atm.txt b/Documentation/networking/atm.txt
new file mode 100644
index 000000000..82921cee7
--- /dev/null
+++ b/Documentation/networking/atm.txt
@@ -0,0 +1,8 @@
+In order to use anything but the most primitive functions of ATM,
+several user-mode programs are required to assist the kernel. These
+programs and related material can be found via the ATM on Linux Web
+page at http://linux-atm.sourceforge.net/
+
+If you encounter problems with ATM, please report them on the ATM
+on Linux mailing list. Subscription information, archives, etc.,
+can be found on http://linux-atm.sourceforge.net/
diff --git a/Documentation/networking/ax25.txt b/Documentation/networking/ax25.txt
new file mode 100644
index 000000000..8257dbf9b
--- /dev/null
+++ b/Documentation/networking/ax25.txt
@@ -0,0 +1,10 @@
+To use the amateur radio protocols within Linux you will need to get a
+suitable copy of the AX.25 Utilities. More detailed information about
+AX.25, NET/ROM and ROSE, associated programs and and utilities can be
+found on http://www.linux-ax25.org.
+
+There is an active mailing list for discussing Linux amateur radio matters
+called linux-hams@vger.kernel.org. To subscribe to it, send a message to
+majordomo@vger.kernel.org with the words "subscribe linux-hams" in the body
+of the message, the subject field is ignored.  You don't need to be
+subscribed to post but of course that means you might miss an answer.
diff --git a/Documentation/networking/batman-adv.rst b/Documentation/networking/batman-adv.rst
new file mode 100644
index 000000000..245fb6c0a
--- /dev/null
+++ b/Documentation/networking/batman-adv.rst
@@ -0,0 +1,222 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
+batman-adv
+==========
+
+Batman advanced is a new approach to wireless networking which does no longer
+operate on the IP basis. Unlike the batman daemon, which exchanges information
+using UDP packets and sets routing tables, batman-advanced operates on ISO/OSI
+Layer 2 only and uses and routes (or better: bridges) Ethernet Frames. It
+emulates a virtual network switch of all nodes participating. Therefore all
+nodes appear to be link local, thus all higher operating protocols won't be
+affected by any changes within the network. You can run almost any protocol
+above batman advanced, prominent examples are: IPv4, IPv6, DHCP, IPX.
+
+Batman advanced was implemented as a Linux kernel driver to reduce the overhead
+to a minimum. It does not depend on any (other) network driver, and can be used
+on wifi as well as ethernet lan, vpn, etc ... (anything with ethernet-style
+layer 2).
+
+
+Configuration
+=============
+
+Load the batman-adv module into your kernel::
+
+  $ insmod batman-adv.ko
+
+The module is now waiting for activation. You must add some interfaces on which
+batman can operate. After loading the module batman advanced will scan your
+systems interfaces to search for compatible interfaces. Once found, it will
+create subfolders in the ``/sys`` directories of each supported interface,
+e.g.::
+
+  $ ls /sys/class/net/eth0/batman_adv/
+  elp_interval iface_status mesh_iface throughput_override
+
+If an interface does not have the ``batman_adv`` subfolder, it probably is not
+supported. Not supported interfaces are: loopback, non-ethernet and batman's
+own interfaces.
+
+Note: After the module was loaded it will continuously watch for new
+interfaces to verify the compatibility. There is no need to reload the module
+if you plug your USB wifi adapter into your machine after batman advanced was
+initially loaded.
+
+The batman-adv soft-interface can be created using the iproute2 tool ``ip``::
+
+  $ ip link add name bat0 type batadv
+
+To activate a given interface simply attach it to the ``bat0`` interface::
+
+  $ ip link set dev eth0 master bat0
+
+Repeat this step for all interfaces you wish to add. Now batman starts
+using/broadcasting on this/these interface(s).
+
+By reading the "iface_status" file you can check its status::
+
+  $ cat /sys/class/net/eth0/batman_adv/iface_status
+  active
+
+To deactivate an interface you have to detach it from the "bat0" interface::
+
+  $ ip link set dev eth0 nomaster
+
+
+All mesh wide settings can be found in batman's own interface folder::
+
+  $ ls /sys/class/net/bat0/mesh/
+  aggregated_ogms       fragmentation isolation_mark routing_algo
+  ap_isolation          gw_bandwidth  log_level      vlan0
+  bonding               gw_mode       multicast_mode
+  bridge_loop_avoidance gw_sel_class  network_coding
+  distributed_arp_table hop_penalty   orig_interval
+
+There is a special folder for debugging information::
+
+  $ ls /sys/kernel/debug/batman_adv/bat0/
+  bla_backbone_table log         neighbors         transtable_local
+  bla_claim_table    mcast_flags originators
+  dat_cache          nc          socket
+  gateways           nc_nodes    transtable_global
+
+Some of the files contain all sort of status information regarding the mesh
+network. For example, you can view the table of originators (mesh
+participants) with::
+
+  $ cat /sys/kernel/debug/batman_adv/bat0/originators
+
+Other files allow to change batman's behaviour to better fit your requirements.
+For instance, you can check the current originator interval (value in
+milliseconds which determines how often batman sends its broadcast packets)::
+
+  $ cat /sys/class/net/bat0/mesh/orig_interval
+  1000
+
+and also change its value::
+
+  $ echo 3000 > /sys/class/net/bat0/mesh/orig_interval
+
+In very mobile scenarios, you might want to adjust the originator interval to a
+lower value. This will make the mesh more responsive to topology changes, but
+will also increase the overhead.
+
+
+Usage
+=====
+
+To make use of your newly created mesh, batman advanced provides a new
+interface "bat0" which you should use from this point on. All interfaces added
+to batman advanced are not relevant any longer because batman handles them for
+you. Basically, one "hands over" the data by using the batman interface and
+batman will make sure it reaches its destination.
+
+The "bat0" interface can be used like any other regular interface. It needs an
+IP address which can be either statically configured or dynamically (by using
+DHCP or similar services)::
+
+  NodeA: ip link set up dev bat0
+  NodeA: ip addr add 192.168.0.1/24 dev bat0
+
+  NodeB: ip link set up dev bat0
+  NodeB: ip addr add 192.168.0.2/24 dev bat0
+  NodeB: ping 192.168.0.1
+
+Note: In order to avoid problems remove all IP addresses previously assigned to
+interfaces now used by batman advanced, e.g.::
+
+  $ ip addr flush dev eth0
+
+
+Logging/Debugging
+=================
+
+All error messages, warnings and information messages are sent to the kernel
+log. Depending on your operating system distribution this can be read in one of
+a number of ways. Try using the commands: ``dmesg``, ``logread``, or looking in
+the files ``/var/log/kern.log`` or ``/var/log/syslog``. All batman-adv messages
+are prefixed with "batman-adv:" So to see just these messages try::
+
+  $ dmesg | grep batman-adv
+
+When investigating problems with your mesh network, it is sometimes necessary to
+see more detail debug messages. This must be enabled when compiling the
+batman-adv module. When building batman-adv as part of kernel, use "make
+menuconfig" and enable the option ``B.A.T.M.A.N. debugging``
+(``CONFIG_BATMAN_ADV_DEBUG=y``).
+
+Those additional debug messages can be accessed using a special file in
+debugfs::
+
+  $ cat /sys/kernel/debug/batman_adv/bat0/log
+
+The additional debug output is by default disabled. It can be enabled during
+run time. Following log_levels are defined:
+
+.. flat-table::
+
+   * - 0
+     - All debug output disabled
+   * - 1
+     - Enable messages related to routing / flooding / broadcasting
+   * - 2
+     - Enable messages related to route added / changed / deleted
+   * - 4
+     - Enable messages related to translation table operations
+   * - 8
+     - Enable messages related to bridge loop avoidance
+   * - 16
+     - Enable messages related to DAT, ARP snooping and parsing
+   * - 32
+     - Enable messages related to network coding
+   * - 64
+     - Enable messages related to multicast
+   * - 128
+     - Enable messages related to throughput meter
+   * - 255
+     - Enable all messages
+
+The debug output can be changed at runtime using the file
+``/sys/class/net/bat0/mesh/log_level``. e.g.::
+
+  $ echo 6 > /sys/class/net/bat0/mesh/log_level
+
+will enable debug messages for when routes change.
+
+Counters for different types of packets entering and leaving the batman-adv
+module are available through ethtool::
+
+  $ ethtool --statistics bat0
+
+
+batctl
+======
+
+As batman advanced operates on layer 2, all hosts participating in the virtual
+switch are completely transparent for all protocols above layer 2. Therefore
+the common diagnosis tools do not work as expected. To overcome these problems,
+batctl was created. At the moment the batctl contains ping, traceroute, tcpdump
+and interfaces to the kernel module settings.
+
+For more information, please see the manpage (``man batctl``).
+
+batctl is available on https://www.open-mesh.org/
+
+
+Contact
+=======
+
+Please send us comments, experiences, questions, anything :)
+
+IRC:
+  #batman on irc.freenode.org
+Mailing-list:
+  b.a.t.m.a.n@open-mesh.org (optional subscription at
+  https://lists.open-mesh.org/mm/listinfo/b.a.t.m.a.n)
+
+You can also contact the Authors:
+
+* Marek Lindner <mareklindner@neomailbox.ch>
+* Simon Wunderlich <sw@simonwunderlich.de>
diff --git a/Documentation/networking/baycom.txt b/Documentation/networking/baycom.txt
new file mode 100644
index 000000000..688f18fd4
--- /dev/null
+++ b/Documentation/networking/baycom.txt
@@ -0,0 +1,158 @@
+		    LINUX DRIVERS FOR BAYCOM MODEMS
+
+       Thomas M. Sailer, HB9JNX/AE4WA, <sailer@ife.ee.ethz.ch>
+
+!!NEW!! (04/98) The drivers for the baycom modems have been split into
+separate drivers as they did not share any code, and the driver
+and device names have changed.
+
+This document describes the Linux Kernel Drivers for simple Baycom style
+amateur radio modems. 
+
+The following drivers are available:
+
+baycom_ser_fdx:
+  This driver supports the SER12 modems either full or half duplex.
+  Its baud rate may be changed via the `baud' module parameter,
+  therefore it supports just about every bit bang modem on a
+  serial port. Its devices are called bcsf0 through bcsf3.
+  This is the recommended driver for SER12 type modems,
+  however if you have a broken UART clone that does not have working
+  delta status bits, you may try baycom_ser_hdx. 
+
+baycom_ser_hdx: 
+  This is an alternative driver for SER12 type modems.
+  It only supports half duplex, and only 1200 baud. Its devices
+  are called bcsh0 through bcsh3. Use this driver only if baycom_ser_fdx
+  does not work with your UART.
+
+baycom_par:
+  This driver supports the par96 and picpar modems.
+  Its devices are called bcp0 through bcp3.
+
+baycom_epp:
+  This driver supports the EPP modem.
+  Its devices are called bce0 through bce3.
+  This driver is work-in-progress.
+
+The following modems are supported:
+
+ser12:  This is a very simple 1200 baud AFSK modem. The modem consists only
+        of a modulator/demodulator chip, usually a TI TCM3105. The computer
+        is responsible for regenerating the receiver bit clock, as well as
+        for handling the HDLC protocol. The modem connects to a serial port,
+        hence the name. Since the serial port is not used as an async serial
+        port, the kernel driver for serial ports cannot be used, and this
+        driver only supports standard serial hardware (8250, 16450, 16550)
+
+par96:  This is a modem for 9600 baud FSK compatible to the G3RUH standard.
+        The modem does all the filtering and regenerates the receiver clock.
+        Data is transferred from and to the PC via a shift register.
+        The shift register is filled with 16 bits and an interrupt is signalled.
+        The PC then empties the shift register in a burst. This modem connects
+        to the parallel port, hence the name. The modem leaves the 
+        implementation of the HDLC protocol and the scrambler polynomial to
+        the PC.
+
+picpar: This is a redesign of the par96 modem by Henning Rech, DF9IC. The modem
+        is protocol compatible to par96, but uses only three low power ICs
+        and can therefore be fed from the parallel port and does not require
+        an additional power supply. Furthermore, it incorporates a carrier
+        detect circuitry.
+
+EPP:    This is a high-speed modem adaptor that connects to an enhanced parallel port.
+        Its target audience is users working over a high speed hub (76.8kbit/s).
+
+eppfpga: This is a redesign of the EPP adaptor.
+
+
+
+All of the above modems only support half duplex communications. However,
+the driver supports the KISS (see below) fullduplex command. It then simply
+starts to send as soon as there's a packet to transmit and does not care
+about DCD, i.e. it starts to send even if there's someone else on the channel.
+This command is required by some implementations of the DAMA channel 
+access protocol.
+
+
+The Interface of the drivers
+
+Unlike previous drivers, these drivers are no longer character devices,
+but they are now true kernel network interfaces. Installation is therefore
+simple. Once installed, four interfaces named bc{sf,sh,p,e}[0-3] are available.
+sethdlc from the ax25 utilities may be used to set driver states etc.
+Users of userland AX.25 stacks may use the net2kiss utility (also available
+in the ax25 utilities package) to convert packets of a network interface
+to a KISS stream on a pseudo tty. There's also a patch available from
+me for WAMPES which allows attaching a kernel network interface directly.
+
+
+Configuring the driver
+
+Every time a driver is inserted into the kernel, it has to know which
+modems it should access at which ports. This can be done with the setbaycom
+utility. If you are only using one modem, you can also configure the
+driver from the insmod command line (or by means of an option line in
+/etc/modprobe.d/*.conf).
+
+Examples:
+  modprobe baycom_ser_fdx mode="ser12*" iobase=0x3f8 irq=4
+  sethdlc -i bcsf0 -p mode "ser12*" io 0x3f8 irq 4
+
+Both lines configure the first port to drive a ser12 modem at the first
+serial port (COM1 under DOS). The * in the mode parameter instructs the driver to use
+the software DCD algorithm (see below).
+
+  insmod baycom_par mode="picpar" iobase=0x378
+  sethdlc -i bcp0 -p mode "picpar" io 0x378
+
+Both lines configure the first port to drive a picpar modem at the
+first parallel port (LPT1 under DOS). (Note: picpar implies
+hardware DCD, par96 implies software DCD).
+
+The channel access parameters can be set with sethdlc -a or kissparms.
+Note that both utilities interpret the values slightly differently.
+
+
+Hardware DCD versus Software DCD
+
+To avoid collisions on the air, the driver must know when the channel is
+busy. This is the task of the DCD circuitry/software. The driver may either
+utilise a software DCD algorithm (options=1) or use a DCD signal from
+the hardware (options=0).
+
+ser12:  if software DCD is utilised, the radio's squelch should always be
+        open. It is highly recommended to use the software DCD algorithm,
+        as it is much faster than most hardware squelch circuitry. The
+        disadvantage is a slightly higher load on the system.
+
+par96:  the software DCD algorithm for this type of modem is rather poor.
+        The modem simply does not provide enough information to implement
+        a reasonable DCD algorithm in software. Therefore, if your radio
+        feeds the DCD input of the PAR96 modem, the use of the hardware
+        DCD circuitry is recommended.
+
+picpar: the picpar modem features a builtin DCD hardware, which is highly
+        recommended.
+
+
+
+Compatibility with the rest of the Linux kernel
+
+The serial driver and the baycom serial drivers compete
+for the same hardware resources. Of course only one driver can access a given
+interface at a time. The serial driver grabs all interfaces it can find at
+startup time. Therefore the baycom drivers subsequently won't be able to
+access a serial port. You might therefore find it necessary to release
+a port owned by the serial driver with 'setserial /dev/ttyS# uart none', where
+# is the number of the interface. The baycom drivers do not reserve any
+ports at startup, unless one is specified on the 'insmod' command line. Another
+method to solve the problem is to compile all drivers as modules and
+leave it to kmod to load the correct driver depending on the application.
+
+The parallel port drivers (baycom_par, baycom_epp) now use the parport subsystem
+to arbitrate the ports between different client drivers.
+
+vy 73s de
+Tom Sailer, sailer@ife.ee.ethz.ch
+hb9jnx @ hb9w.ampr.org
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
new file mode 100644
index 000000000..4035a495c
--- /dev/null
+++ b/Documentation/networking/bonding.txt
@@ -0,0 +1,2828 @@
+
+		Linux Ethernet Bonding Driver HOWTO
+
+		Latest update: 27 April 2011
+
+Initial release : Thomas Davis <tadavis at lbl.gov>
+Corrections, HA extensions : 2000/10/03-15 :
+  - Willy Tarreau <willy at meta-x.org>
+  - Constantine Gavrilov <const-g at xpert.com>
+  - Chad N. Tindel <ctindel at ieee dot org>
+  - Janice Girouard <girouard at us dot ibm dot com>
+  - Jay Vosburgh <fubar at us dot ibm dot com>
+
+Reorganized and updated Feb 2005 by Jay Vosburgh
+Added Sysfs information: 2006/04/24
+  - Mitch Williams <mitch.a.williams at intel.com>
+
+Introduction
+============
+
+	The Linux bonding driver provides a method for aggregating
+multiple network interfaces into a single logical "bonded" interface.
+The behavior of the bonded interfaces depends upon the mode; generally
+speaking, modes provide either hot standby or load balancing services.
+Additionally, link integrity monitoring may be performed.
+	
+	The bonding driver originally came from Donald Becker's
+beowulf patches for kernel 2.0. It has changed quite a bit since, and
+the original tools from extreme-linux and beowulf sites will not work
+with this version of the driver.
+
+	For new versions of the driver, updated userspace tools, and
+who to ask for help, please follow the links at the end of this file.
+
+Table of Contents
+=================
+
+1. Bonding Driver Installation
+
+2. Bonding Driver Options
+
+3. Configuring Bonding Devices
+3.1	Configuration with Sysconfig Support
+3.1.1		Using DHCP with Sysconfig
+3.1.2		Configuring Multiple Bonds with Sysconfig
+3.2	Configuration with Initscripts Support
+3.2.1		Using DHCP with Initscripts
+3.2.2		Configuring Multiple Bonds with Initscripts
+3.3	Configuring Bonding Manually with Ifenslave
+3.3.1		Configuring Multiple Bonds Manually
+3.4	Configuring Bonding Manually via Sysfs
+3.5	Configuration with Interfaces Support
+3.6	Overriding Configuration for Special Cases
+3.7 Configuring LACP for 802.3ad mode in a more secure way
+
+4. Querying Bonding Configuration
+4.1	Bonding Configuration
+4.2	Network Configuration
+
+5. Switch Configuration
+
+6. 802.1q VLAN Support
+
+7. Link Monitoring
+7.1	ARP Monitor Operation
+7.2	Configuring Multiple ARP Targets
+7.3	MII Monitor Operation
+
+8. Potential Trouble Sources
+8.1	Adventures in Routing
+8.2	Ethernet Device Renaming
+8.3	Painfully Slow Or No Failed Link Detection By Miimon
+
+9. SNMP agents
+
+10. Promiscuous mode
+
+11. Configuring Bonding for High Availability
+11.1	High Availability in a Single Switch Topology
+11.2	High Availability in a Multiple Switch Topology
+11.2.1		HA Bonding Mode Selection for Multiple Switch Topology
+11.2.2		HA Link Monitoring for Multiple Switch Topology
+
+12. Configuring Bonding for Maximum Throughput
+12.1	Maximum Throughput in a Single Switch Topology
+12.1.1		MT Bonding Mode Selection for Single Switch Topology
+12.1.2		MT Link Monitoring for Single Switch Topology
+12.2	Maximum Throughput in a Multiple Switch Topology
+12.2.1		MT Bonding Mode Selection for Multiple Switch Topology
+12.2.2		MT Link Monitoring for Multiple Switch Topology
+
+13. Switch Behavior Issues
+13.1	Link Establishment and Failover Delays
+13.2	Duplicated Incoming Packets
+
+14. Hardware Specific Considerations
+14.1	IBM BladeCenter
+
+15. Frequently Asked Questions
+
+16. Resources and Links
+
+
+1. Bonding Driver Installation
+==============================
+
+	Most popular distro kernels ship with the bonding driver
+already available as a module. If your distro does not, or you
+have need to compile bonding from source (e.g., configuring and
+installing a mainline kernel from kernel.org), you'll need to perform
+the following steps:
+
+1.1 Configure and build the kernel with bonding
+-----------------------------------------------
+
+	The current version of the bonding driver is available in the
+drivers/net/bonding subdirectory of the most recent kernel source
+(which is available on http://kernel.org).  Most users "rolling their
+own" will want to use the most recent kernel from kernel.org.
+
+	Configure kernel with "make menuconfig" (or "make xconfig" or
+"make config"), then select "Bonding driver support" in the "Network
+device support" section.  It is recommended that you configure the
+driver as module since it is currently the only way to pass parameters
+to the driver or configure more than one bonding device.
+
+	Build and install the new kernel and modules.
+
+1.2 Bonding Control Utility
+-------------------------------------
+
+	 It is recommended to configure bonding via iproute2 (netlink)
+or sysfs, the old ifenslave control utility is obsolete.
+
+2. Bonding Driver Options
+=========================
+
+	Options for the bonding driver are supplied as parameters to the
+bonding module at load time, or are specified via sysfs.
+
+	Module options may be given as command line arguments to the
+insmod or modprobe command, but are usually specified in either the
+/etc/modprobe.d/*.conf configuration files, or in a distro-specific
+configuration file (some of which are detailed in the next section).
+
+	Details on bonding support for sysfs is provided in the
+"Configuring Bonding Manually via Sysfs" section, below.
+
+	The available bonding driver parameters are listed below. If a
+parameter is not specified the default value is used.  When initially
+configuring a bond, it is recommended "tail -f /var/log/messages" be
+run in a separate window to watch for bonding driver error messages.
+
+	It is critical that either the miimon or arp_interval and
+arp_ip_target parameters be specified, otherwise serious network
+degradation will occur during link failures.  Very few devices do not
+support at least miimon, so there is really no reason not to use it.
+
+	Options with textual values will accept either the text name
+or, for backwards compatibility, the option value.  E.g.,
+"mode=802.3ad" and "mode=4" set the same mode.
+
+	The parameters are as follows:
+
+active_slave
+
+	Specifies the new active slave for modes that support it
+	(active-backup, balance-alb and balance-tlb).  Possible values
+	are the name of any currently enslaved interface, or an empty
+	string.  If a name is given, the slave and its link must be up in order
+	to be selected as the new active slave.  If an empty string is
+	specified, the current active slave is cleared, and a new active
+	slave is selected automatically.
+
+	Note that this is only available through the sysfs interface. No module
+	parameter by this name exists.
+
+	The normal value of this option is the name of the currently
+	active slave, or the empty string if there is no active slave or
+	the current mode does not use an active slave.
+
+ad_actor_sys_prio
+
+	In an AD system, this specifies the system priority. The allowed range
+	is 1 - 65535. If the value is not specified, it takes 65535 as the
+	default value.
+
+	This parameter has effect only in 802.3ad mode and is available through
+	SysFs interface.
+
+ad_actor_system
+
+	In an AD system, this specifies the mac-address for the actor in
+	protocol packet exchanges (LACPDUs). The value cannot be a multicast
+	address. If the all-zeroes MAC is specified, bonding will internally
+	use the MAC of the bond itself. It is preferred to have the
+	local-admin bit set for this mac but driver does not enforce it. If
+	the value is not given then system defaults to using the masters'
+	mac address as actors' system address.
+
+	This parameter has effect only in 802.3ad mode and is available through
+	SysFs interface.
+
+ad_select
+
+	Specifies the 802.3ad aggregation selection logic to use.  The
+	possible values and their effects are:
+
+	stable or 0
+
+		The active aggregator is chosen by largest aggregate
+		bandwidth.
+
+		Reselection of the active aggregator occurs only when all
+		slaves of the active aggregator are down or the active
+		aggregator has no slaves.
+
+		This is the default value.
+
+	bandwidth or 1
+
+		The active aggregator is chosen by largest aggregate
+		bandwidth.  Reselection occurs if:
+
+		- A slave is added to or removed from the bond
+
+		- Any slave's link state changes
+
+		- Any slave's 802.3ad association state changes
+
+		- The bond's administrative state changes to up
+
+	count or 2
+
+		The active aggregator is chosen by the largest number of
+		ports (slaves).  Reselection occurs as described under the
+		"bandwidth" setting, above.
+
+	The bandwidth and count selection policies permit failover of
+	802.3ad aggregations when partial failure of the active aggregator
+	occurs.  This keeps the aggregator with the highest availability
+	(either in bandwidth or in number of ports) active at all times.
+
+	This option was added in bonding version 3.4.0.
+
+ad_user_port_key
+
+	In an AD system, the port-key has three parts as shown below -
+
+	   Bits   Use
+	   00     Duplex
+	   01-05  Speed
+	   06-15  User-defined
+
+	This defines the upper 10 bits of the port key. The values can be
+	from 0 - 1023. If not given, the system defaults to 0.
+
+	This parameter has effect only in 802.3ad mode and is available through
+	SysFs interface.
+
+all_slaves_active
+
+	Specifies that duplicate frames (received on inactive ports) should be
+	dropped (0) or delivered (1).
+
+	Normally, bonding will drop duplicate frames (received on inactive
+	ports), which is desirable for most users. But there are some times
+	it is nice to allow duplicate frames to be delivered.
+
+	The default value is 0 (drop duplicate frames received on inactive
+	ports).
+
+arp_interval
+
+	Specifies the ARP link monitoring frequency in milliseconds.
+
+	The ARP monitor works by periodically checking the slave
+	devices to determine whether they have sent or received
+	traffic recently (the precise criteria depends upon the
+	bonding mode, and the state of the slave).  Regular traffic is
+	generated via ARP probes issued for the addresses specified by
+	the arp_ip_target option.
+
+	This behavior can be modified by the arp_validate option,
+	below.
+
+	If ARP monitoring is used in an etherchannel compatible mode
+	(modes 0 and 2), the switch should be configured in a mode
+	that evenly distributes packets across all links. If the
+	switch is configured to distribute the packets in an XOR
+	fashion, all replies from the ARP targets will be received on
+	the same link which could cause the other team members to
+	fail.  ARP monitoring should not be used in conjunction with
+	miimon.  A value of 0 disables ARP monitoring.  The default
+	value is 0.
+
+arp_ip_target
+
+	Specifies the IP addresses to use as ARP monitoring peers when
+	arp_interval is > 0.  These are the targets of the ARP request
+	sent to determine the health of the link to the targets.
+	Specify these values in ddd.ddd.ddd.ddd format.  Multiple IP
+	addresses must be separated by a comma.  At least one IP
+	address must be given for ARP monitoring to function.  The
+	maximum number of targets that can be specified is 16.  The
+	default value is no IP addresses.
+
+arp_validate
+
+	Specifies whether or not ARP probes and replies should be
+	validated in any mode that supports arp monitoring, or whether
+	non-ARP traffic should be filtered (disregarded) for link
+	monitoring purposes.
+
+	Possible values are:
+
+	none or 0
+
+		No validation or filtering is performed.
+
+	active or 1
+
+		Validation is performed only for the active slave.
+
+	backup or 2
+
+		Validation is performed only for backup slaves.
+
+	all or 3
+
+		Validation is performed for all slaves.
+
+	filter or 4
+
+		Filtering is applied to all slaves. No validation is
+		performed.
+
+	filter_active or 5
+
+		Filtering is applied to all slaves, validation is performed
+		only for the active slave.
+
+	filter_backup or 6
+
+		Filtering is applied to all slaves, validation is performed
+		only for backup slaves.
+
+	Validation:
+
+	Enabling validation causes the ARP monitor to examine the incoming
+	ARP requests and replies, and only consider a slave to be up if it
+	is receiving the appropriate ARP traffic.
+
+	For an active slave, the validation checks ARP replies to confirm
+	that they were generated by an arp_ip_target.  Since backup slaves
+	do not typically receive these replies, the validation performed
+	for backup slaves is on the broadcast ARP request sent out via the
+	active slave.  It is possible that some switch or network
+	configurations may result in situations wherein the backup slaves
+	do not receive the ARP requests; in such a situation, validation
+	of backup slaves must be disabled.
+
+	The validation of ARP requests on backup slaves is mainly helping
+	bonding to decide which slaves are more likely to work in case of
+	the active slave failure, it doesn't really guarantee that the
+	backup slave will work if it's selected as the next active slave.
+
+	Validation is useful in network configurations in which multiple
+	bonding hosts are concurrently issuing ARPs to one or more targets
+	beyond a common switch.  Should the link between the switch and
+	target fail (but not the switch itself), the probe traffic
+	generated by the multiple bonding instances will fool the standard
+	ARP monitor into considering the links as still up.  Use of
+	validation can resolve this, as the ARP monitor will only consider
+	ARP requests and replies associated with its own instance of
+	bonding.
+
+	Filtering:
+
+	Enabling filtering causes the ARP monitor to only use incoming ARP
+	packets for link availability purposes.  Arriving packets that are
+	not ARPs are delivered normally, but do not count when determining
+	if a slave is available.
+
+	Filtering operates by only considering the reception of ARP
+	packets (any ARP packet, regardless of source or destination) when
+	determining if a slave has received traffic for link availability
+	purposes.
+
+	Filtering is useful in network configurations in which significant
+	levels of third party broadcast traffic would fool the standard
+	ARP monitor into considering the links as still up.  Use of
+	filtering can resolve this, as only ARP traffic is considered for
+	link availability purposes.
+
+	This option was added in bonding version 3.1.0.
+
+arp_all_targets
+
+	Specifies the quantity of arp_ip_targets that must be reachable
+	in order for the ARP monitor to consider a slave as being up.
+	This option affects only active-backup mode for slaves with
+	arp_validation enabled.
+
+	Possible values are:
+
+	any or 0
+
+		consider the slave up only when any of the arp_ip_targets
+		is reachable
+
+	all or 1
+
+		consider the slave up only when all of the arp_ip_targets
+		are reachable
+
+downdelay
+
+	Specifies the time, in milliseconds, to wait before disabling
+	a slave after a link failure has been detected.  This option
+	is only valid for the miimon link monitor.  The downdelay
+	value should be a multiple of the miimon value; if not, it
+	will be rounded down to the nearest multiple.  The default
+	value is 0.
+
+fail_over_mac
+
+	Specifies whether active-backup mode should set all slaves to
+	the same MAC address at enslavement (the traditional
+	behavior), or, when enabled, perform special handling of the
+	bond's MAC address in accordance with the selected policy.
+
+	Possible values are:
+
+	none or 0
+
+		This setting disables fail_over_mac, and causes
+		bonding to set all slaves of an active-backup bond to
+		the same MAC address at enslavement time.  This is the
+		default.
+
+	active or 1
+
+		The "active" fail_over_mac policy indicates that the
+		MAC address of the bond should always be the MAC
+		address of the currently active slave.  The MAC
+		address of the slaves is not changed; instead, the MAC
+		address of the bond changes during a failover.
+
+		This policy is useful for devices that cannot ever
+		alter their MAC address, or for devices that refuse
+		incoming broadcasts with their own source MAC (which
+		interferes with the ARP monitor).
+
+		The down side of this policy is that every device on
+		the network must be updated via gratuitous ARP,
+		vs. just updating a switch or set of switches (which
+		often takes place for any traffic, not just ARP
+		traffic, if the switch snoops incoming traffic to
+		update its tables) for the traditional method.  If the
+		gratuitous ARP is lost, communication may be
+		disrupted.
+
+		When this policy is used in conjunction with the mii
+		monitor, devices which assert link up prior to being
+		able to actually transmit and receive are particularly
+		susceptible to loss of the gratuitous ARP, and an
+		appropriate updelay setting may be required.
+
+	follow or 2
+
+		The "follow" fail_over_mac policy causes the MAC
+		address of the bond to be selected normally (normally
+		the MAC address of the first slave added to the bond).
+		However, the second and subsequent slaves are not set
+		to this MAC address while they are in a backup role; a
+		slave is programmed with the bond's MAC address at
+		failover time (and the formerly active slave receives
+		the newly active slave's MAC address).
+
+		This policy is useful for multiport devices that
+		either become confused or incur a performance penalty
+		when multiple ports are programmed with the same MAC
+		address.
+
+
+	The default policy is none, unless the first slave cannot
+	change its MAC address, in which case the active policy is
+	selected by default.
+
+	This option may be modified via sysfs only when no slaves are
+	present in the bond.
+
+	This option was added in bonding version 3.2.0.  The "follow"
+	policy was added in bonding version 3.3.0.
+
+lacp_rate
+
+	Option specifying the rate in which we'll ask our link partner
+	to transmit LACPDU packets in 802.3ad mode.  Possible values
+	are:
+
+	slow or 0
+		Request partner to transmit LACPDUs every 30 seconds
+
+	fast or 1
+		Request partner to transmit LACPDUs every 1 second
+
+	The default is slow.
+
+max_bonds
+
+	Specifies the number of bonding devices to create for this
+	instance of the bonding driver.  E.g., if max_bonds is 3, and
+	the bonding driver is not already loaded, then bond0, bond1
+	and bond2 will be created.  The default value is 1.  Specifying
+	a value of 0 will load bonding, but will not create any devices.
+
+miimon
+
+	Specifies the MII link monitoring frequency in milliseconds.
+	This determines how often the link state of each slave is
+	inspected for link failures.  A value of zero disables MII
+	link monitoring.  A value of 100 is a good starting point.
+	The use_carrier option, below, affects how the link state is
+	determined.  See the High Availability section for additional
+	information.  The default value is 0.
+
+min_links
+
+	Specifies the minimum number of links that must be active before
+	asserting carrier. It is similar to the Cisco EtherChannel min-links
+	feature. This allows setting the minimum number of member ports that
+	must be up (link-up state) before marking the bond device as up
+	(carrier on). This is useful for situations where higher level services
+	such as clustering want to ensure a minimum number of low bandwidth
+	links are active before switchover. This option only affect 802.3ad
+	mode.
+
+	The default value is 0. This will cause carrier to be asserted (for
+	802.3ad mode) whenever there is an active aggregator, regardless of the
+	number of available links in that aggregator. Note that, because an
+	aggregator cannot be active without at least one available link,
+	setting this option to 0 or to 1 has the exact same effect.
+
+mode
+
+	Specifies one of the bonding policies. The default is
+	balance-rr (round robin).  Possible values are:
+
+	balance-rr or 0
+
+		Round-robin policy: Transmit packets in sequential
+		order from the first available slave through the
+		last.  This mode provides load balancing and fault
+		tolerance.
+
+	active-backup or 1
+
+		Active-backup policy: Only one slave in the bond is
+		active.  A different slave becomes active if, and only
+		if, the active slave fails.  The bond's MAC address is
+		externally visible on only one port (network adapter)
+		to avoid confusing the switch.
+
+		In bonding version 2.6.2 or later, when a failover
+		occurs in active-backup mode, bonding will issue one
+		or more gratuitous ARPs on the newly active slave.
+		One gratuitous ARP is issued for the bonding master
+		interface and each VLAN interfaces configured above
+		it, provided that the interface has at least one IP
+		address configured.  Gratuitous ARPs issued for VLAN
+		interfaces are tagged with the appropriate VLAN id.
+
+		This mode provides fault tolerance.  The primary
+		option, documented below, affects the behavior of this
+		mode.
+
+	balance-xor or 2
+
+		XOR policy: Transmit based on the selected transmit
+		hash policy.  The default policy is a simple [(source
+		MAC address XOR'd with destination MAC address XOR
+		packet type ID) modulo slave count].  Alternate transmit
+		policies may be	selected via the xmit_hash_policy option,
+		described below.
+
+		This mode provides load balancing and fault tolerance.
+
+	broadcast or 3
+
+		Broadcast policy: transmits everything on all slave
+		interfaces.  This mode provides fault tolerance.
+
+	802.3ad or 4
+
+		IEEE 802.3ad Dynamic link aggregation.  Creates
+		aggregation groups that share the same speed and
+		duplex settings.  Utilizes all slaves in the active
+		aggregator according to the 802.3ad specification.
+
+		Slave selection for outgoing traffic is done according
+		to the transmit hash policy, which may be changed from
+		the default simple XOR policy via the xmit_hash_policy
+		option, documented below.  Note that not all transmit
+		policies may be 802.3ad compliant, particularly in
+		regards to the packet mis-ordering requirements of
+		section 43.2.4 of the 802.3ad standard.  Differing
+		peer implementations will have varying tolerances for
+		noncompliance.
+
+		Prerequisites:
+
+		1. Ethtool support in the base drivers for retrieving
+		the speed and duplex of each slave.
+
+		2. A switch that supports IEEE 802.3ad Dynamic link
+		aggregation.
+
+		Most switches will require some type of configuration
+		to enable 802.3ad mode.
+
+	balance-tlb or 5
+
+		Adaptive transmit load balancing: channel bonding that
+		does not require any special switch support.
+
+		In tlb_dynamic_lb=1 mode; the outgoing traffic is
+		distributed according to the current load (computed
+		relative to the speed) on each slave.
+
+		In tlb_dynamic_lb=0 mode; the load balancing based on
+		current load is disabled and the load is distributed
+		only using the hash distribution.
+
+		Incoming traffic is received by the current slave.
+		If the receiving slave fails, another slave takes over
+		the MAC address of the failed receiving slave.
+
+		Prerequisite:
+
+		Ethtool support in the base drivers for retrieving the
+		speed of each slave.
+
+	balance-alb or 6
+
+		Adaptive load balancing: includes balance-tlb plus
+		receive load balancing (rlb) for IPV4 traffic, and
+		does not require any special switch support.  The
+		receive load balancing is achieved by ARP negotiation.
+		The bonding driver intercepts the ARP Replies sent by
+		the local system on their way out and overwrites the
+		source hardware address with the unique hardware
+		address of one of the slaves in the bond such that
+		different peers use different hardware addresses for
+		the server.
+
+		Receive traffic from connections created by the server
+		is also balanced.  When the local system sends an ARP
+		Request the bonding driver copies and saves the peer's
+		IP information from the ARP packet.  When the ARP
+		Reply arrives from the peer, its hardware address is
+		retrieved and the bonding driver initiates an ARP
+		reply to this peer assigning it to one of the slaves
+		in the bond.  A problematic outcome of using ARP
+		negotiation for balancing is that each time that an
+		ARP request is broadcast it uses the hardware address
+		of the bond.  Hence, peers learn the hardware address
+		of the bond and the balancing of receive traffic
+		collapses to the current slave.  This is handled by
+		sending updates (ARP Replies) to all the peers with
+		their individually assigned hardware address such that
+		the traffic is redistributed.  Receive traffic is also
+		redistributed when a new slave is added to the bond
+		and when an inactive slave is re-activated.  The
+		receive load is distributed sequentially (round robin)
+		among the group of highest speed slaves in the bond.
+
+		When a link is reconnected or a new slave joins the
+		bond the receive traffic is redistributed among all
+		active slaves in the bond by initiating ARP Replies
+		with the selected MAC address to each of the
+		clients. The updelay parameter (detailed below) must
+		be set to a value equal or greater than the switch's
+		forwarding delay so that the ARP Replies sent to the
+		peers will not be blocked by the switch.
+
+		Prerequisites:
+
+		1. Ethtool support in the base drivers for retrieving
+		the speed of each slave.
+
+		2. Base driver support for setting the hardware
+		address of a device while it is open.  This is
+		required so that there will always be one slave in the
+		team using the bond hardware address (the
+		curr_active_slave) while having a unique hardware
+		address for each slave in the bond.  If the
+		curr_active_slave fails its hardware address is
+		swapped with the new curr_active_slave that was
+		chosen.
+
+num_grat_arp
+num_unsol_na
+
+	Specify the number of peer notifications (gratuitous ARPs and
+	unsolicited IPv6 Neighbor Advertisements) to be issued after a
+	failover event.  As soon as the link is up on the new slave
+	(possibly immediately) a peer notification is sent on the
+	bonding device and each VLAN sub-device.  This is repeated at
+	each link monitor interval (arp_interval or miimon, whichever
+	is active) if the number is greater than 1.
+
+	The valid range is 0 - 255; the default value is 1.  These options
+	affect only the active-backup mode.  These options were added for
+	bonding versions 3.3.0 and 3.4.0 respectively.
+
+	From Linux 3.0 and bonding version 3.7.1, these notifications
+	are generated by the ipv4 and ipv6 code and the numbers of
+	repetitions cannot be set independently.
+
+packets_per_slave
+
+	Specify the number of packets to transmit through a slave before
+	moving to the next one. When set to 0 then a slave is chosen at
+	random.
+
+	The valid range is 0 - 65535; the default value is 1. This option
+	has effect only in balance-rr mode.
+
+primary
+
+	A string (eth0, eth2, etc) specifying which slave is the
+	primary device.  The specified device will always be the
+	active slave while it is available.  Only when the primary is
+	off-line will alternate devices be used.  This is useful when
+	one slave is preferred over another, e.g., when one slave has
+	higher throughput than another.
+
+	The primary option is only valid for active-backup(1),
+	balance-tlb (5) and balance-alb (6) mode.
+
+primary_reselect
+
+	Specifies the reselection policy for the primary slave.  This
+	affects how the primary slave is chosen to become the active slave
+	when failure of the active slave or recovery of the primary slave
+	occurs.  This option is designed to prevent flip-flopping between
+	the primary slave and other slaves.  Possible values are:
+
+	always or 0 (default)
+
+		The primary slave becomes the active slave whenever it
+		comes back up.
+
+	better or 1
+
+		The primary slave becomes the active slave when it comes
+		back up, if the speed and duplex of the primary slave is
+		better than the speed and duplex of the current active
+		slave.
+
+	failure or 2
+
+		The primary slave becomes the active slave only if the
+		current active slave fails and the primary slave is up.
+
+	The primary_reselect setting is ignored in two cases:
+
+		If no slaves are active, the first slave to recover is
+		made the active slave.
+
+		When initially enslaved, the primary slave is always made
+		the active slave.
+
+	Changing the primary_reselect policy via sysfs will cause an
+	immediate selection of the best active slave according to the new
+	policy.  This may or may not result in a change of the active
+	slave, depending upon the circumstances.
+
+	This option was added for bonding version 3.6.0.
+
+tlb_dynamic_lb
+
+	Specifies if dynamic shuffling of flows is enabled in tlb
+	mode. The value has no effect on any other modes.
+
+	The default behavior of tlb mode is to shuffle active flows across
+	slaves based on the load in that interval. This gives nice lb
+	characteristics but can cause packet reordering. If re-ordering is
+	a concern use this variable to disable flow shuffling and rely on
+	load balancing provided solely by the hash distribution.
+	xmit-hash-policy can be used to select the appropriate hashing for
+	the setup.
+
+	The sysfs entry can be used to change the setting per bond device
+	and the initial value is derived from the module parameter. The
+	sysfs entry is allowed to be changed only if the bond device is
+	down.
+
+	The default value is "1" that enables flow shuffling while value "0"
+	disables it. This option was added in bonding driver 3.7.1
+
+
+updelay
+
+	Specifies the time, in milliseconds, to wait before enabling a
+	slave after a link recovery has been detected.  This option is
+	only valid for the miimon link monitor.  The updelay value
+	should be a multiple of the miimon value; if not, it will be
+	rounded down to the nearest multiple.  The default value is 0.
+
+use_carrier
+
+	Specifies whether or not miimon should use MII or ETHTOOL
+	ioctls vs. netif_carrier_ok() to determine the link
+	status. The MII or ETHTOOL ioctls are less efficient and
+	utilize a deprecated calling sequence within the kernel.  The
+	netif_carrier_ok() relies on the device driver to maintain its
+	state with netif_carrier_on/off; at this writing, most, but
+	not all, device drivers support this facility.
+
+	If bonding insists that the link is up when it should not be,
+	it may be that your network device driver does not support
+	netif_carrier_on/off.  The default state for netif_carrier is
+	"carrier on," so if a driver does not support netif_carrier,
+	it will appear as if the link is always up.  In this case,
+	setting use_carrier to 0 will cause bonding to revert to the
+	MII / ETHTOOL ioctl method to determine the link state.
+
+	A value of 1 enables the use of netif_carrier_ok(), a value of
+	0 will use the deprecated MII / ETHTOOL ioctls.  The default
+	value is 1.
+
+xmit_hash_policy
+
+	Selects the transmit hash policy to use for slave selection in
+	balance-xor, 802.3ad, and tlb modes.  Possible values are:
+
+	layer2
+
+		Uses XOR of hardware MAC addresses and packet type ID
+		field to generate the hash. The formula is
+
+		hash = source MAC XOR destination MAC XOR packet type ID
+		slave number = hash modulo slave count
+
+		This algorithm will place all traffic to a particular
+		network peer on the same slave.
+
+		This algorithm is 802.3ad compliant.
+
+	layer2+3
+
+		This policy uses a combination of layer2 and layer3
+		protocol information to generate the hash.
+
+		Uses XOR of hardware MAC addresses and IP addresses to
+		generate the hash.  The formula is
+
+		hash = source MAC XOR destination MAC XOR packet type ID
+		hash = hash XOR source IP XOR destination IP
+		hash = hash XOR (hash RSHIFT 16)
+		hash = hash XOR (hash RSHIFT 8)
+		And then hash is reduced modulo slave count.
+
+		If the protocol is IPv6 then the source and destination
+		addresses are first hashed using ipv6_addr_hash.
+
+		This algorithm will place all traffic to a particular
+		network peer on the same slave.  For non-IP traffic,
+		the formula is the same as for the layer2 transmit
+		hash policy.
+
+		This policy is intended to provide a more balanced
+		distribution of traffic than layer2 alone, especially
+		in environments where a layer3 gateway device is
+		required to reach most destinations.
+
+		This algorithm is 802.3ad compliant.
+
+	layer3+4
+
+		This policy uses upper layer protocol information,
+		when available, to generate the hash.  This allows for
+		traffic to a particular network peer to span multiple
+		slaves, although a single connection will not span
+		multiple slaves.
+
+		The formula for unfragmented TCP and UDP packets is
+
+		hash = source port, destination port (as in the header)
+		hash = hash XOR source IP XOR destination IP
+		hash = hash XOR (hash RSHIFT 16)
+		hash = hash XOR (hash RSHIFT 8)
+		And then hash is reduced modulo slave count.
+
+		If the protocol is IPv6 then the source and destination
+		addresses are first hashed using ipv6_addr_hash.
+
+		For fragmented TCP or UDP packets and all other IPv4 and
+		IPv6 protocol traffic, the source and destination port
+		information is omitted.  For non-IP traffic, the
+		formula is the same as for the layer2 transmit hash
+		policy.
+
+		This algorithm is not fully 802.3ad compliant.  A
+		single TCP or UDP conversation containing both
+		fragmented and unfragmented packets will see packets
+		striped across two interfaces.  This may result in out
+		of order delivery.  Most traffic types will not meet
+		this criteria, as TCP rarely fragments traffic, and
+		most UDP traffic is not involved in extended
+		conversations.  Other implementations of 802.3ad may
+		or may not tolerate this noncompliance.
+
+	encap2+3
+
+		This policy uses the same formula as layer2+3 but it
+		relies on skb_flow_dissect to obtain the header fields
+		which might result in the use of inner headers if an
+		encapsulation protocol is used. For example this will
+		improve the performance for tunnel users because the
+		packets will be distributed according to the encapsulated
+		flows.
+
+	encap3+4
+
+		This policy uses the same formula as layer3+4 but it
+		relies on skb_flow_dissect to obtain the header fields
+		which might result in the use of inner headers if an
+		encapsulation protocol is used. For example this will
+		improve the performance for tunnel users because the
+		packets will be distributed according to the encapsulated
+		flows.
+
+	The default value is layer2.  This option was added in bonding
+	version 2.6.3.  In earlier versions of bonding, this parameter
+	does not exist, and the layer2 policy is the only policy.  The
+	layer2+3 value was added for bonding version 3.2.2.
+
+resend_igmp
+
+	Specifies the number of IGMP membership reports to be issued after
+	a failover event. One membership report is issued immediately after
+	the failover, subsequent packets are sent in each 200ms interval.
+
+	The valid range is 0 - 255; the default value is 1. A value of 0
+	prevents the IGMP membership report from being issued in response
+	to the failover event.
+
+	This option is useful for bonding modes balance-rr (0), active-backup
+	(1), balance-tlb (5) and balance-alb (6), in which a failover can
+	switch the IGMP traffic from one slave to another.  Therefore a fresh
+	IGMP report must be issued to cause the switch to forward the incoming
+	IGMP traffic over the newly selected slave.
+
+	This option was added for bonding version 3.7.0.
+
+lp_interval
+
+	Specifies the number of seconds between instances where the bonding
+	driver sends learning packets to each slaves peer switch.
+
+	The valid range is 1 - 0x7fffffff; the default value is 1. This Option
+	has effect only in balance-tlb and balance-alb modes.
+
+3. Configuring Bonding Devices
+==============================
+
+	You can configure bonding using either your distro's network
+initialization scripts, or manually using either iproute2 or the
+sysfs interface.  Distros generally use one of three packages for the
+network initialization scripts: initscripts, sysconfig or interfaces.
+Recent versions of these packages have support for bonding, while older
+versions do not.
+
+	We will first describe the options for configuring bonding for
+distros using versions of initscripts, sysconfig and interfaces with full
+or partial support for bonding, then provide information on enabling
+bonding without support from the network initialization scripts (i.e.,
+older versions of initscripts or sysconfig).
+
+	If you're unsure whether your distro uses sysconfig,
+initscripts or interfaces, or don't know if it's new enough, have no fear.
+Determining this is fairly straightforward.
+
+	First, look for a file called interfaces in /etc/network directory.
+If this file is present in your system, then your system use interfaces. See
+Configuration with Interfaces Support.
+
+	Else, issue the command:
+
+$ rpm -qf /sbin/ifup
+
+	It will respond with a line of text starting with either
+"initscripts" or "sysconfig," followed by some numbers.  This is the
+package that provides your network initialization scripts.
+
+	Next, to determine if your installation supports bonding,
+issue the command:
+
+$ grep ifenslave /sbin/ifup
+
+	If this returns any matches, then your initscripts or
+sysconfig has support for bonding.
+
+3.1 Configuration with Sysconfig Support
+----------------------------------------
+
+	This section applies to distros using a version of sysconfig
+with bonding support, for example, SuSE Linux Enterprise Server 9.
+
+	SuSE SLES 9's networking configuration system does support
+bonding, however, at this writing, the YaST system configuration
+front end does not provide any means to work with bonding devices.
+Bonding devices can be managed by hand, however, as follows.
+
+	First, if they have not already been configured, configure the
+slave devices.  On SLES 9, this is most easily done by running the
+yast2 sysconfig configuration utility.  The goal is for to create an
+ifcfg-id file for each slave device.  The simplest way to accomplish
+this is to configure the devices for DHCP (this is only to get the
+file ifcfg-id file created; see below for some issues with DHCP).  The
+name of the configuration file for each device will be of the form:
+
+ifcfg-id-xx:xx:xx:xx:xx:xx
+
+	Where the "xx" portion will be replaced with the digits from
+the device's permanent MAC address.
+
+	Once the set of ifcfg-id-xx:xx:xx:xx:xx:xx files has been
+created, it is necessary to edit the configuration files for the slave
+devices (the MAC addresses correspond to those of the slave devices).
+Before editing, the file will contain multiple lines, and will look
+something like this:
+
+BOOTPROTO='dhcp'
+STARTMODE='on'
+USERCTL='no'
+UNIQUE='XNzu.WeZGOGF+4wE'
+_nm_name='bus-pci-0001:61:01.0'
+
+	Change the BOOTPROTO and STARTMODE lines to the following:
+
+BOOTPROTO='none'
+STARTMODE='off'
+
+	Do not alter the UNIQUE or _nm_name lines.  Remove any other
+lines (USERCTL, etc).
+
+	Once the ifcfg-id-xx:xx:xx:xx:xx:xx files have been modified,
+it's time to create the configuration file for the bonding device
+itself.  This file is named ifcfg-bondX, where X is the number of the
+bonding device to create, starting at 0.  The first such file is
+ifcfg-bond0, the second is ifcfg-bond1, and so on.  The sysconfig
+network configuration system will correctly start multiple instances
+of bonding.
+
+	The contents of the ifcfg-bondX file is as follows:
+
+BOOTPROTO="static"
+BROADCAST="10.0.2.255"
+IPADDR="10.0.2.10"
+NETMASK="255.255.0.0"
+NETWORK="10.0.2.0"
+REMOTE_IPADDR=""
+STARTMODE="onboot"
+BONDING_MASTER="yes"
+BONDING_MODULE_OPTS="mode=active-backup miimon=100"
+BONDING_SLAVE0="eth0"
+BONDING_SLAVE1="bus-pci-0000:06:08.1"
+
+	Replace the sample BROADCAST, IPADDR, NETMASK and NETWORK
+values with the appropriate values for your network.
+
+	The STARTMODE specifies when the device is brought online.
+The possible values are:
+
+	onboot:	 The device is started at boot time.  If you're not
+		 sure, this is probably what you want.
+
+	manual:	 The device is started only when ifup is called
+		 manually.  Bonding devices may be configured this
+		 way if you do not wish them to start automatically
+		 at boot for some reason.
+
+	hotplug: The device is started by a hotplug event.  This is not
+		 a valid choice for a bonding device.
+
+	off or ignore: The device configuration is ignored.
+
+	The line BONDING_MASTER='yes' indicates that the device is a
+bonding master device.  The only useful value is "yes."
+
+	The contents of BONDING_MODULE_OPTS are supplied to the
+instance of the bonding module for this device.  Specify the options
+for the bonding mode, link monitoring, and so on here.  Do not include
+the max_bonds bonding parameter; this will confuse the configuration
+system if you have multiple bonding devices.
+
+	Finally, supply one BONDING_SLAVEn="slave device" for each
+slave.  where "n" is an increasing value, one for each slave.  The
+"slave device" is either an interface name, e.g., "eth0", or a device
+specifier for the network device.  The interface name is easier to
+find, but the ethN names are subject to change at boot time if, e.g.,
+a device early in the sequence has failed.  The device specifiers
+(bus-pci-0000:06:08.1 in the example above) specify the physical
+network device, and will not change unless the device's bus location
+changes (for example, it is moved from one PCI slot to another).  The
+example above uses one of each type for demonstration purposes; most
+configurations will choose one or the other for all slave devices.
+
+	When all configuration files have been modified or created,
+networking must be restarted for the configuration changes to take
+effect.  This can be accomplished via the following:
+
+# /etc/init.d/network restart
+
+	Note that the network control script (/sbin/ifdown) will
+remove the bonding module as part of the network shutdown processing,
+so it is not necessary to remove the module by hand if, e.g., the
+module parameters have changed.
+
+	Also, at this writing, YaST/YaST2 will not manage bonding
+devices (they do not show bonding interfaces on its list of network
+devices).  It is necessary to edit the configuration file by hand to
+change the bonding configuration.
+
+	Additional general options and details of the ifcfg file
+format can be found in an example ifcfg template file:
+
+/etc/sysconfig/network/ifcfg.template
+
+	Note that the template does not document the various BONDING_
+settings described above, but does describe many of the other options.
+
+3.1.1 Using DHCP with Sysconfig
+-------------------------------
+
+	Under sysconfig, configuring a device with BOOTPROTO='dhcp'
+will cause it to query DHCP for its IP address information.  At this
+writing, this does not function for bonding devices; the scripts
+attempt to obtain the device address from DHCP prior to adding any of
+the slave devices.  Without active slaves, the DHCP requests are not
+sent to the network.
+
+3.1.2 Configuring Multiple Bonds with Sysconfig
+-----------------------------------------------
+
+	The sysconfig network initialization system is capable of
+handling multiple bonding devices.  All that is necessary is for each
+bonding instance to have an appropriately configured ifcfg-bondX file
+(as described above).  Do not specify the "max_bonds" parameter to any
+instance of bonding, as this will confuse sysconfig.  If you require
+multiple bonding devices with identical parameters, create multiple
+ifcfg-bondX files.
+
+	Because the sysconfig scripts supply the bonding module
+options in the ifcfg-bondX file, it is not necessary to add them to
+the system /etc/modules.d/*.conf configuration files.
+
+3.2 Configuration with Initscripts Support
+------------------------------------------
+
+	This section applies to distros using a recent version of
+initscripts with bonding support, for example, Red Hat Enterprise Linux
+version 3 or later, Fedora, etc.  On these systems, the network
+initialization scripts have knowledge of bonding, and can be configured to
+control bonding devices.  Note that older versions of the initscripts
+package have lower levels of support for bonding; this will be noted where
+applicable.
+
+	These distros will not automatically load the network adapter
+driver unless the ethX device is configured with an IP address.
+Because of this constraint, users must manually configure a
+network-script file for all physical adapters that will be members of
+a bondX link.  Network script files are located in the directory:
+
+/etc/sysconfig/network-scripts
+
+	The file name must be prefixed with "ifcfg-eth" and suffixed
+with the adapter's physical adapter number.  For example, the script
+for eth0 would be named /etc/sysconfig/network-scripts/ifcfg-eth0.
+Place the following text in the file:
+
+DEVICE=eth0
+USERCTL=no
+ONBOOT=yes
+MASTER=bond0
+SLAVE=yes
+BOOTPROTO=none
+
+	The DEVICE= line will be different for every ethX device and
+must correspond with the name of the file, i.e., ifcfg-eth1 must have
+a device line of DEVICE=eth1.  The setting of the MASTER= line will
+also depend on the final bonding interface name chosen for your bond.
+As with other network devices, these typically start at 0, and go up
+one for each device, i.e., the first bonding instance is bond0, the
+second is bond1, and so on.
+
+	Next, create a bond network script.  The file name for this
+script will be /etc/sysconfig/network-scripts/ifcfg-bondX where X is
+the number of the bond.  For bond0 the file is named "ifcfg-bond0",
+for bond1 it is named "ifcfg-bond1", and so on.  Within that file,
+place the following text:
+
+DEVICE=bond0
+IPADDR=192.168.1.1
+NETMASK=255.255.255.0
+NETWORK=192.168.1.0
+BROADCAST=192.168.1.255
+ONBOOT=yes
+BOOTPROTO=none
+USERCTL=no
+
+	Be sure to change the networking specific lines (IPADDR,
+NETMASK, NETWORK and BROADCAST) to match your network configuration.
+
+	For later versions of initscripts, such as that found with Fedora
+7 (or later) and Red Hat Enterprise Linux version 5 (or later), it is possible,
+and, indeed, preferable, to specify the bonding options in the ifcfg-bond0
+file, e.g. a line of the format:
+
+BONDING_OPTS="mode=active-backup arp_interval=60 arp_ip_target=192.168.1.254"
+
+	will configure the bond with the specified options.  The options
+specified in BONDING_OPTS are identical to the bonding module parameters
+except for the arp_ip_target field when using versions of initscripts older
+than and 8.57 (Fedora 8) and 8.45.19 (Red Hat Enterprise Linux 5.2).  When
+using older versions each target should be included as a separate option and
+should be preceded by a '+' to indicate it should be added to the list of
+queried targets, e.g.,
+
+	arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2
+
+	is the proper syntax to specify multiple targets.  When specifying
+options via BONDING_OPTS, it is not necessary to edit /etc/modprobe.d/*.conf.
+
+	For even older versions of initscripts that do not support
+BONDING_OPTS, it is necessary to edit /etc/modprobe.d/*.conf, depending upon
+your distro) to load the bonding module with your desired options when the
+bond0 interface is brought up.  The following lines in /etc/modprobe.d/*.conf
+will load the bonding module, and select its options:
+
+alias bond0 bonding
+options bond0 mode=balance-alb miimon=100
+
+	Replace the sample parameters with the appropriate set of
+options for your configuration.
+
+	Finally run "/etc/rc.d/init.d/network restart" as root.  This
+will restart the networking subsystem and your bond link should be now
+up and running.
+
+3.2.1 Using DHCP with Initscripts
+---------------------------------
+
+	Recent versions of initscripts (the versions supplied with Fedora
+Core 3 and Red Hat Enterprise Linux 4, or later versions, are reported to
+work) have support for assigning IP information to bonding devices via
+DHCP.
+
+	To configure bonding for DHCP, configure it as described
+above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
+and add a line consisting of "TYPE=Bonding".  Note that the TYPE value
+is case sensitive.
+
+3.2.2 Configuring Multiple Bonds with Initscripts
+-------------------------------------------------
+
+	Initscripts packages that are included with Fedora 7 and Red Hat
+Enterprise Linux 5 support multiple bonding interfaces by simply
+specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is the
+number of the bond.  This support requires sysfs support in the kernel,
+and a bonding driver of version 3.0.0 or later.  Other configurations may
+not support this method for specifying multiple bonding interfaces; for
+those instances, see the "Configuring Multiple Bonds Manually" section,
+below.
+
+3.3 Configuring Bonding Manually with iproute2
+-----------------------------------------------
+
+	This section applies to distros whose network initialization
+scripts (the sysconfig or initscripts package) do not have specific
+knowledge of bonding.  One such distro is SuSE Linux Enterprise Server
+version 8.
+
+	The general method for these systems is to place the bonding
+module parameters into a config file in /etc/modprobe.d/ (as
+appropriate for the installed distro), then add modprobe and/or
+`ip link` commands to the system's global init script.  The name of
+the global init script differs; for sysconfig, it is
+/etc/init.d/boot.local and for initscripts it is /etc/rc.d/rc.local.
+
+	For example, if you wanted to make a simple bond of two e100
+devices (presumed to be eth0 and eth1), and have it persist across
+reboots, edit the appropriate file (/etc/init.d/boot.local or
+/etc/rc.d/rc.local), and add the following:
+
+modprobe bonding mode=balance-alb miimon=100
+modprobe e100
+ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
+ip link set eth0 master bond0
+ip link set eth1 master bond0
+
+	Replace the example bonding module parameters and bond0
+network configuration (IP address, netmask, etc) with the appropriate
+values for your configuration.
+
+	Unfortunately, this method will not provide support for the
+ifup and ifdown scripts on the bond devices.  To reload the bonding
+configuration, it is necessary to run the initialization script, e.g.,
+
+# /etc/init.d/boot.local
+
+	or
+
+# /etc/rc.d/rc.local
+
+	It may be desirable in such a case to create a separate script
+which only initializes the bonding configuration, then call that
+separate script from within boot.local.  This allows for bonding to be
+enabled without re-running the entire global init script.
+
+	To shut down the bonding devices, it is necessary to first
+mark the bonding device itself as being down, then remove the
+appropriate device driver modules.  For our example above, you can do
+the following:
+
+# ifconfig bond0 down
+# rmmod bonding
+# rmmod e100
+
+	Again, for convenience, it may be desirable to create a script
+with these commands.
+
+
+3.3.1 Configuring Multiple Bonds Manually
+-----------------------------------------
+
+	This section contains information on configuring multiple
+bonding devices with differing options for those systems whose network
+initialization scripts lack support for configuring multiple bonds.
+
+	If you require multiple bonding devices, but all with the same
+options, you may wish to use the "max_bonds" module parameter,
+documented above.
+
+	To create multiple bonding devices with differing options, it is
+preferable to use bonding parameters exported by sysfs, documented in the
+section below.
+
+	For versions of bonding without sysfs support, the only means to
+provide multiple instances of bonding with differing options is to load
+the bonding driver multiple times.  Note that current versions of the
+sysconfig network initialization scripts handle this automatically; if
+your distro uses these scripts, no special action is needed.  See the
+section Configuring Bonding Devices, above, if you're not sure about your
+network initialization scripts.
+
+	To load multiple instances of the module, it is necessary to
+specify a different name for each instance (the module loading system
+requires that every loaded module, even multiple instances of the same
+module, have a unique name).  This is accomplished by supplying multiple
+sets of bonding options in /etc/modprobe.d/*.conf, for example:
+
+alias bond0 bonding
+options bond0 -o bond0 mode=balance-rr miimon=100
+
+alias bond1 bonding
+options bond1 -o bond1 mode=balance-alb miimon=50
+
+	will load the bonding module two times.  The first instance is
+named "bond0" and creates the bond0 device in balance-rr mode with an
+miimon of 100.  The second instance is named "bond1" and creates the
+bond1 device in balance-alb mode with an miimon of 50.
+
+	In some circumstances (typically with older distributions),
+the above does not work, and the second bonding instance never sees
+its options.  In that case, the second options line can be substituted
+as follows:
+
+install bond1 /sbin/modprobe --ignore-install bonding -o bond1 \
+	mode=balance-alb miimon=50
+
+	This may be repeated any number of times, specifying a new and
+unique name in place of bond1 for each subsequent instance.
+
+	It has been observed that some Red Hat supplied kernels are unable
+to rename modules at load time (the "-o bond1" part).  Attempts to pass
+that option to modprobe will produce an "Operation not permitted" error.
+This has been reported on some Fedora Core kernels, and has been seen on
+RHEL 4 as well.  On kernels exhibiting this problem, it will be impossible
+to configure multiple bonds with differing parameters (as they are older
+kernels, and also lack sysfs support).
+
+3.4 Configuring Bonding Manually via Sysfs
+------------------------------------------
+
+	Starting with version 3.0.0, Channel Bonding may be configured
+via the sysfs interface.  This interface allows dynamic configuration
+of all bonds in the system without unloading the module.  It also
+allows for adding and removing bonds at runtime.  Ifenslave is no
+longer required, though it is still supported.
+
+	Use of the sysfs interface allows you to use multiple bonds
+with different configurations without having to reload the module.
+It also allows you to use multiple, differently configured bonds when
+bonding is compiled into the kernel.
+
+	You must have the sysfs filesystem mounted to configure
+bonding this way.  The examples in this document assume that you
+are using the standard mount point for sysfs, e.g. /sys.  If your
+sysfs filesystem is mounted elsewhere, you will need to adjust the
+example paths accordingly.
+
+Creating and Destroying Bonds
+-----------------------------
+To add a new bond foo:
+# echo +foo > /sys/class/net/bonding_masters
+
+To remove an existing bond bar:
+# echo -bar > /sys/class/net/bonding_masters
+
+To show all existing bonds:
+# cat /sys/class/net/bonding_masters
+
+NOTE: due to 4K size limitation of sysfs files, this list may be
+truncated if you have more than a few hundred bonds.  This is unlikely
+to occur under normal operating conditions.
+
+Adding and Removing Slaves
+--------------------------
+	Interfaces may be enslaved to a bond using the file
+/sys/class/net/<bond>/bonding/slaves.  The semantics for this file
+are the same as for the bonding_masters file.
+
+To enslave interface eth0 to bond bond0:
+# ifconfig bond0 up
+# echo +eth0 > /sys/class/net/bond0/bonding/slaves
+
+To free slave eth0 from bond bond0:
+# echo -eth0 > /sys/class/net/bond0/bonding/slaves
+
+	When an interface is enslaved to a bond, symlinks between the
+two are created in the sysfs filesystem.  In this case, you would get
+/sys/class/net/bond0/slave_eth0 pointing to /sys/class/net/eth0, and
+/sys/class/net/eth0/master pointing to /sys/class/net/bond0.
+
+	This means that you can tell quickly whether or not an
+interface is enslaved by looking for the master symlink.  Thus:
+# echo -eth0 > /sys/class/net/eth0/master/bonding/slaves
+will free eth0 from whatever bond it is enslaved to, regardless of
+the name of the bond interface.
+
+Changing a Bond's Configuration
+-------------------------------
+	Each bond may be configured individually by manipulating the
+files located in /sys/class/net/<bond name>/bonding
+
+	The names of these files correspond directly with the command-
+line parameters described elsewhere in this file, and, with the
+exception of arp_ip_target, they accept the same values.  To see the
+current setting, simply cat the appropriate file.
+
+	A few examples will be given here; for specific usage
+guidelines for each parameter, see the appropriate section in this
+document.
+
+To configure bond0 for balance-alb mode:
+# ifconfig bond0 down
+# echo 6 > /sys/class/net/bond0/bonding/mode
+ - or -
+# echo balance-alb > /sys/class/net/bond0/bonding/mode
+	NOTE: The bond interface must be down before the mode can be
+changed.
+
+To enable MII monitoring on bond0 with a 1 second interval:
+# echo 1000 > /sys/class/net/bond0/bonding/miimon
+	NOTE: If ARP monitoring is enabled, it will disabled when MII
+monitoring is enabled, and vice-versa.
+
+To add ARP targets:
+# echo +192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
+# echo +192.168.0.101 > /sys/class/net/bond0/bonding/arp_ip_target
+	NOTE:  up to 16 target addresses may be specified.
+
+To remove an ARP target:
+# echo -192.168.0.100 > /sys/class/net/bond0/bonding/arp_ip_target
+
+To configure the interval between learning packet transmits:
+# echo 12 > /sys/class/net/bond0/bonding/lp_interval
+	NOTE: the lp_interval is the number of seconds between instances where
+the bonding driver sends learning packets to each slaves peer switch.  The
+default interval is 1 second.
+
+Example Configuration
+---------------------
+	We begin with the same example that is shown in section 3.3,
+executed with sysfs, and without using ifenslave.
+
+	To make a simple bond of two e100 devices (presumed to be eth0
+and eth1), and have it persist across reboots, edit the appropriate
+file (/etc/init.d/boot.local or /etc/rc.d/rc.local), and add the
+following:
+
+modprobe bonding
+modprobe e100
+echo balance-alb > /sys/class/net/bond0/bonding/mode
+ifconfig bond0 192.168.1.1 netmask 255.255.255.0 up
+echo 100 > /sys/class/net/bond0/bonding/miimon
+echo +eth0 > /sys/class/net/bond0/bonding/slaves
+echo +eth1 > /sys/class/net/bond0/bonding/slaves
+
+	To add a second bond, with two e1000 interfaces in
+active-backup mode, using ARP monitoring, add the following lines to
+your init script:
+
+modprobe e1000
+echo +bond1 > /sys/class/net/bonding_masters
+echo active-backup > /sys/class/net/bond1/bonding/mode
+ifconfig bond1 192.168.2.1 netmask 255.255.255.0 up
+echo +192.168.2.100 /sys/class/net/bond1/bonding/arp_ip_target
+echo 2000 > /sys/class/net/bond1/bonding/arp_interval
+echo +eth2 > /sys/class/net/bond1/bonding/slaves
+echo +eth3 > /sys/class/net/bond1/bonding/slaves
+
+3.5 Configuration with Interfaces Support
+-----------------------------------------
+
+        This section applies to distros which use /etc/network/interfaces file
+to describe network interface configuration, most notably Debian and it's
+derivatives.
+
+	The ifup and ifdown commands on Debian don't support bonding out of
+the box. The ifenslave-2.6 package should be installed to provide bonding
+support.  Once installed, this package will provide bond-* options to be used
+into /etc/network/interfaces.
+
+	Note that ifenslave-2.6 package will load the bonding module and use
+the ifenslave command when appropriate.
+
+Example Configurations
+----------------------
+
+In /etc/network/interfaces, the following stanza will configure bond0, in
+active-backup mode, with eth0 and eth1 as slaves.
+
+auto bond0
+iface bond0 inet dhcp
+	bond-slaves eth0 eth1
+	bond-mode active-backup
+	bond-miimon 100
+	bond-primary eth0 eth1
+
+If the above configuration doesn't work, you might have a system using
+upstart for system startup. This is most notably true for recent
+Ubuntu versions. The following stanza in /etc/network/interfaces will
+produce the same result on those systems.
+
+auto bond0
+iface bond0 inet dhcp
+	bond-slaves none
+	bond-mode active-backup
+	bond-miimon 100
+
+auto eth0
+iface eth0 inet manual
+	bond-master bond0
+	bond-primary eth0 eth1
+
+auto eth1
+iface eth1 inet manual
+	bond-master bond0
+	bond-primary eth0 eth1
+
+For a full list of bond-* supported options in /etc/network/interfaces and some
+more advanced examples tailored to you particular distros, see the files in
+/usr/share/doc/ifenslave-2.6.
+
+3.6 Overriding Configuration for Special Cases
+----------------------------------------------
+
+When using the bonding driver, the physical port which transmits a frame is
+typically selected by the bonding driver, and is not relevant to the user or
+system administrator.  The output port is simply selected using the policies of
+the selected bonding mode.  On occasion however, it is helpful to direct certain
+classes of traffic to certain physical interfaces on output to implement
+slightly more complex policies.  For example, to reach a web server over a
+bonded interface in which eth0 connects to a private network, while eth1
+connects via a public network, it may be desirous to bias the bond to send said
+traffic over eth0 first, using eth1 only as a fall back, while all other traffic
+can safely be sent over either interface.  Such configurations may be achieved
+using the traffic control utilities inherent in linux.
+
+By default the bonding driver is multiqueue aware and 16 queues are created
+when the driver initializes (see Documentation/networking/multiqueue.txt
+for details).  If more or less queues are desired the module parameter
+tx_queues can be used to change this value.  There is no sysfs parameter
+available as the allocation is done at module init time.
+
+The output of the file /proc/net/bonding/bondX has changed so the output Queue
+ID is now printed for each slave:
+
+Bonding Mode: fault-tolerance (active-backup)
+Primary Slave: None
+Currently Active Slave: eth0
+MII Status: up
+MII Polling Interval (ms): 0
+Up Delay (ms): 0
+Down Delay (ms): 0
+
+Slave Interface: eth0
+MII Status: up
+Link Failure Count: 0
+Permanent HW addr: 00:1a:a0:12:8f:cb
+Slave queue ID: 0
+
+Slave Interface: eth1
+MII Status: up
+Link Failure Count: 0
+Permanent HW addr: 00:1a:a0:12:8f:cc
+Slave queue ID: 2
+
+The queue_id for a slave can be set using the command:
+
+# echo "eth1:2" > /sys/class/net/bond0/bonding/queue_id
+
+Any interface that needs a queue_id set should set it with multiple calls
+like the one above until proper priorities are set for all interfaces.  On
+distributions that allow configuration via initscripts, multiple 'queue_id'
+arguments can be added to BONDING_OPTS to set all needed slave queues.
+
+These queue id's can be used in conjunction with the tc utility to configure
+a multiqueue qdisc and filters to bias certain traffic to transmit on certain
+slave devices.  For instance, say we wanted, in the above configuration to
+force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output
+device. The following commands would accomplish this:
+
+# tc qdisc add dev bond0 handle 1 root multiq
+
+# tc filter add dev bond0 protocol ip parent 1: prio 1 u32 match ip dst \
+	192.168.1.100 action skbedit queue_mapping 2
+
+These commands tell the kernel to attach a multiqueue queue discipline to the
+bond0 interface and filter traffic enqueued to it, such that packets with a dst
+ip of 192.168.1.100 have their output queue mapping value overwritten to 2.
+This value is then passed into the driver, causing the normal output path
+selection policy to be overridden, selecting instead qid 2, which maps to eth1.
+
+Note that qid values begin at 1.  Qid 0 is reserved to initiate to the driver
+that normal output policy selection should take place.  One benefit to simply
+leaving the qid for a slave to 0 is the multiqueue awareness in the bonding
+driver that is now present.  This awareness allows tc filters to be placed on
+slave devices as well as bond devices and the bonding driver will simply act as
+a pass-through for selecting output queues on the slave device rather than 
+output port selection.
+
+This feature first appeared in bonding driver version 3.7.0 and support for
+output slave selection was limited to round-robin and active-backup modes.
+
+3.7 Configuring LACP for 802.3ad mode in a more secure way
+----------------------------------------------------------
+
+When using 802.3ad bonding mode, the Actor (host) and Partner (switch)
+exchange LACPDUs.  These LACPDUs cannot be sniffed, because they are
+destined to link local mac addresses (which switches/bridges are not
+supposed to forward).  However, most of the values are easily predictable
+or are simply the machine's MAC address (which is trivially known to all
+other hosts in the same L2).  This implies that other machines in the L2
+domain can spoof LACPDU packets from other hosts to the switch and potentially
+cause mayhem by joining (from the point of view of the switch) another
+machine's aggregate, thus receiving a portion of that hosts incoming
+traffic and / or spoofing traffic from that machine themselves (potentially
+even successfully terminating some portion of flows). Though this is not
+a likely scenario, one could avoid this possibility by simply configuring
+few bonding parameters:
+
+   (a) ad_actor_system : You can set a random mac-address that can be used for
+       these LACPDU exchanges. The value can not be either NULL or Multicast.
+       Also it's preferable to set the local-admin bit. Following shell code
+       generates a random mac-address as described above.
+
+       # sys_mac_addr=$(printf '%02x:%02x:%02x:%02x:%02x:%02x' \
+                                $(( (RANDOM & 0xFE) | 0x02 )) \
+                                $(( RANDOM & 0xFF )) \
+                                $(( RANDOM & 0xFF )) \
+                                $(( RANDOM & 0xFF )) \
+                                $(( RANDOM & 0xFF )) \
+                                $(( RANDOM & 0xFF )))
+       # echo $sys_mac_addr > /sys/class/net/bond0/bonding/ad_actor_system
+
+   (b) ad_actor_sys_prio : Randomize the system priority. The default value
+       is 65535, but system can take the value from 1 - 65535. Following shell
+       code generates random priority and sets it.
+
+       # sys_prio=$(( 1 + RANDOM + RANDOM ))
+       # echo $sys_prio > /sys/class/net/bond0/bonding/ad_actor_sys_prio
+
+   (c) ad_user_port_key : Use the user portion of the port-key. The default
+       keeps this empty. These are the upper 10 bits of the port-key and value
+       ranges from 0 - 1023. Following shell code generates these 10 bits and
+       sets it.
+
+       # usr_port_key=$(( RANDOM & 0x3FF ))
+       # echo $usr_port_key > /sys/class/net/bond0/bonding/ad_user_port_key
+
+
+4 Querying Bonding Configuration
+=================================
+
+4.1 Bonding Configuration
+-------------------------
+
+	Each bonding device has a read-only file residing in the
+/proc/net/bonding directory.  The file contents include information
+about the bonding configuration, options and state of each slave.
+
+	For example, the contents of /proc/net/bonding/bond0 after the
+driver is loaded with parameters of mode=0 and miimon=1000 is
+generally as follows:
+
+	Ethernet Channel Bonding Driver: 2.6.1 (October 29, 2004)
+        Bonding Mode: load balancing (round-robin)
+        Currently Active Slave: eth0
+        MII Status: up
+        MII Polling Interval (ms): 1000
+        Up Delay (ms): 0
+        Down Delay (ms): 0
+
+        Slave Interface: eth1
+        MII Status: up
+        Link Failure Count: 1
+
+        Slave Interface: eth0
+        MII Status: up
+        Link Failure Count: 1
+
+	The precise format and contents will change depending upon the
+bonding configuration, state, and version of the bonding driver.
+
+4.2 Network configuration
+-------------------------
+
+	The network configuration can be inspected using the ifconfig
+command.  Bonding devices will have the MASTER flag set; Bonding slave
+devices will have the SLAVE flag set.  The ifconfig output does not
+contain information on which slaves are associated with which masters.
+
+	In the example below, the bond0 interface is the master
+(MASTER) while eth0 and eth1 are slaves (SLAVE). Notice all slaves of
+bond0 have the same MAC address (HWaddr) as bond0 for all modes except
+TLB and ALB that require a unique MAC address for each slave.
+
+# /sbin/ifconfig
+bond0     Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
+          inet addr:XXX.XXX.XXX.YYY  Bcast:XXX.XXX.XXX.255  Mask:255.255.252.0
+          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
+          RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
+          collisions:0 txqueuelen:0
+
+eth0      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
+          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
+          RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
+          collisions:0 txqueuelen:100
+          Interrupt:10 Base address:0x1080
+
+eth1      Link encap:Ethernet  HWaddr 00:C0:F0:1F:37:B4
+          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
+          RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
+          collisions:0 txqueuelen:100
+          Interrupt:9 Base address:0x1400
+
+5. Switch Configuration
+=======================
+
+	For this section, "switch" refers to whatever system the
+bonded devices are directly connected to (i.e., where the other end of
+the cable plugs into).  This may be an actual dedicated switch device,
+or it may be another regular system (e.g., another computer running
+Linux),
+
+	The active-backup, balance-tlb and balance-alb modes do not
+require any specific configuration of the switch.
+
+	The 802.3ad mode requires that the switch have the appropriate
+ports configured as an 802.3ad aggregation.  The precise method used
+to configure this varies from switch to switch, but, for example, a
+Cisco 3550 series switch requires that the appropriate ports first be
+grouped together in a single etherchannel instance, then that
+etherchannel is set to mode "lacp" to enable 802.3ad (instead of
+standard EtherChannel).
+
+	The balance-rr, balance-xor and broadcast modes generally
+require that the switch have the appropriate ports grouped together.
+The nomenclature for such a group differs between switches, it may be
+called an "etherchannel" (as in the Cisco example, above), a "trunk
+group" or some other similar variation.  For these modes, each switch
+will also have its own configuration options for the switch's transmit
+policy to the bond.  Typical choices include XOR of either the MAC or
+IP addresses.  The transmit policy of the two peers does not need to
+match.  For these three modes, the bonding mode really selects a
+transmit policy for an EtherChannel group; all three will interoperate
+with another EtherChannel group.
+
+
+6. 802.1q VLAN Support
+======================
+
+	It is possible to configure VLAN devices over a bond interface
+using the 8021q driver.  However, only packets coming from the 8021q
+driver and passing through bonding will be tagged by default.  Self
+generated packets, for example, bonding's learning packets or ARP
+packets generated by either ALB mode or the ARP monitor mechanism, are
+tagged internally by bonding itself.  As a result, bonding must
+"learn" the VLAN IDs configured above it, and use those IDs to tag
+self generated packets.
+
+	For reasons of simplicity, and to support the use of adapters
+that can do VLAN hardware acceleration offloading, the bonding
+interface declares itself as fully hardware offloading capable, it gets
+the add_vid/kill_vid notifications to gather the necessary
+information, and it propagates those actions to the slaves.  In case
+of mixed adapter types, hardware accelerated tagged packets that
+should go through an adapter that is not offloading capable are
+"un-accelerated" by the bonding driver so the VLAN tag sits in the
+regular location.
+
+	VLAN interfaces *must* be added on top of a bonding interface
+only after enslaving at least one slave.  The bonding interface has a
+hardware address of 00:00:00:00:00:00 until the first slave is added.
+If the VLAN interface is created prior to the first enslavement, it
+would pick up the all-zeroes hardware address.  Once the first slave
+is attached to the bond, the bond device itself will pick up the
+slave's hardware address, which is then available for the VLAN device.
+
+	Also, be aware that a similar problem can occur if all slaves
+are released from a bond that still has one or more VLAN interfaces on
+top of it.  When a new slave is added, the bonding interface will
+obtain its hardware address from the first slave, which might not
+match the hardware address of the VLAN interfaces (which was
+ultimately copied from an earlier slave).
+
+	There are two methods to insure that the VLAN device operates
+with the correct hardware address if all slaves are removed from a
+bond interface:
+
+	1. Remove all VLAN interfaces then recreate them
+
+	2. Set the bonding interface's hardware address so that it
+matches the hardware address of the VLAN interfaces.
+
+	Note that changing a VLAN interface's HW address would set the
+underlying device -- i.e. the bonding interface -- to promiscuous
+mode, which might not be what you want.
+
+
+7. Link Monitoring
+==================
+
+	The bonding driver at present supports two schemes for
+monitoring a slave device's link state: the ARP monitor and the MII
+monitor.
+
+	At the present time, due to implementation restrictions in the
+bonding driver itself, it is not possible to enable both ARP and MII
+monitoring simultaneously.
+
+7.1 ARP Monitor Operation
+-------------------------
+
+	The ARP monitor operates as its name suggests: it sends ARP
+queries to one or more designated peer systems on the network, and
+uses the response as an indication that the link is operating.  This
+gives some assurance that traffic is actually flowing to and from one
+or more peers on the local network.
+
+	The ARP monitor relies on the device driver itself to verify
+that traffic is flowing.  In particular, the driver must keep up to
+date the last receive time, dev->last_rx.  Drivers that use NETIF_F_LLTX
+flag must also update netdev_queue->trans_start.  If they do not, then the
+ARP monitor will immediately fail any slaves using that driver, and
+those slaves will stay down.  If networking monitoring (tcpdump, etc)
+shows the ARP requests and replies on the network, then it may be that
+your device driver is not updating last_rx and trans_start.
+
+7.2 Configuring Multiple ARP Targets
+------------------------------------
+
+	While ARP monitoring can be done with just one target, it can
+be useful in a High Availability setup to have several targets to
+monitor.  In the case of just one target, the target itself may go
+down or have a problem making it unresponsive to ARP requests.  Having
+an additional target (or several) increases the reliability of the ARP
+monitoring.
+
+	Multiple ARP targets must be separated by commas as follows:
+
+# example options for ARP monitoring with three targets
+alias bond0 bonding
+options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9
+
+	For just a single target the options would resemble:
+
+# example options for ARP monitoring with one target
+alias bond0 bonding
+options bond0 arp_interval=60 arp_ip_target=192.168.0.100
+
+
+7.3 MII Monitor Operation
+-------------------------
+
+	The MII monitor monitors only the carrier state of the local
+network interface.  It accomplishes this in one of three ways: by
+depending upon the device driver to maintain its carrier state, by
+querying the device's MII registers, or by making an ethtool query to
+the device.
+
+	If the use_carrier module parameter is 1 (the default value),
+then the MII monitor will rely on the driver for carrier state
+information (via the netif_carrier subsystem).  As explained in the
+use_carrier parameter information, above, if the MII monitor fails to
+detect carrier loss on the device (e.g., when the cable is physically
+disconnected), it may be that the driver does not support
+netif_carrier.
+
+	If use_carrier is 0, then the MII monitor will first query the
+device's (via ioctl) MII registers and check the link state.  If that
+request fails (not just that it returns carrier down), then the MII
+monitor will make an ethtool ETHOOL_GLINK request to attempt to obtain
+the same information.  If both methods fail (i.e., the driver either
+does not support or had some error in processing both the MII register
+and ethtool requests), then the MII monitor will assume the link is
+up.
+
+8. Potential Sources of Trouble
+===============================
+
+8.1 Adventures in Routing
+-------------------------
+
+	When bonding is configured, it is important that the slave
+devices not have routes that supersede routes of the master (or,
+generally, not have routes at all).  For example, suppose the bonding
+device bond0 has two slaves, eth0 and eth1, and the routing table is
+as follows:
+
+Kernel IP routing table
+Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
+10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth0
+10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 eth1
+10.0.0.0        0.0.0.0         255.255.0.0     U        40 0          0 bond0
+127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo
+
+	This routing configuration will likely still update the
+receive/transmit times in the driver (needed by the ARP monitor), but
+may bypass the bonding driver (because outgoing traffic to, in this
+case, another host on network 10 would use eth0 or eth1 before bond0).
+
+	The ARP monitor (and ARP itself) may become confused by this
+configuration, because ARP requests (generated by the ARP monitor)
+will be sent on one interface (bond0), but the corresponding reply
+will arrive on a different interface (eth0).  This reply looks to ARP
+as an unsolicited ARP reply (because ARP matches replies on an
+interface basis), and is discarded.  The MII monitor is not affected
+by the state of the routing table.
+
+	The solution here is simply to insure that slaves do not have
+routes of their own, and if for some reason they must, those routes do
+not supersede routes of their master.  This should generally be the
+case, but unusual configurations or errant manual or automatic static
+route additions may cause trouble.
+
+8.2 Ethernet Device Renaming
+----------------------------
+
+	On systems with network configuration scripts that do not
+associate physical devices directly with network interface names (so
+that the same physical device always has the same "ethX" name), it may
+be necessary to add some special logic to config files in
+/etc/modprobe.d/.
+
+	For example, given a modules.conf containing the following:
+
+alias bond0 bonding
+options bond0 mode=some-mode miimon=50
+alias eth0 tg3
+alias eth1 tg3
+alias eth2 e1000
+alias eth3 e1000
+
+	If neither eth0 and eth1 are slaves to bond0, then when the
+bond0 interface comes up, the devices may end up reordered.  This
+happens because bonding is loaded first, then its slave device's
+drivers are loaded next.  Since no other drivers have been loaded,
+when the e1000 driver loads, it will receive eth0 and eth1 for its
+devices, but the bonding configuration tries to enslave eth2 and eth3
+(which may later be assigned to the tg3 devices).
+
+	Adding the following:
+
+add above bonding e1000 tg3
+
+	causes modprobe to load e1000 then tg3, in that order, when
+bonding is loaded.  This command is fully documented in the
+modules.conf manual page.
+
+	On systems utilizing modprobe an equivalent problem can occur.
+In this case, the following can be added to config files in
+/etc/modprobe.d/ as:
+
+softdep bonding pre: tg3 e1000
+
+	This will load tg3 and e1000 modules before loading the bonding one.
+Full documentation on this can be found in the modprobe.d and modprobe
+manual pages.
+
+8.3. Painfully Slow Or No Failed Link Detection By Miimon
+---------------------------------------------------------
+
+	By default, bonding enables the use_carrier option, which
+instructs bonding to trust the driver to maintain carrier state.
+
+	As discussed in the options section, above, some drivers do
+not support the netif_carrier_on/_off link state tracking system.
+With use_carrier enabled, bonding will always see these links as up,
+regardless of their actual state.
+
+	Additionally, other drivers do support netif_carrier, but do
+not maintain it in real time, e.g., only polling the link state at
+some fixed interval.  In this case, miimon will detect failures, but
+only after some long period of time has expired.  If it appears that
+miimon is very slow in detecting link failures, try specifying
+use_carrier=0 to see if that improves the failure detection time.  If
+it does, then it may be that the driver checks the carrier state at a
+fixed interval, but does not cache the MII register values (so the
+use_carrier=0 method of querying the registers directly works).  If
+use_carrier=0 does not improve the failover, then the driver may cache
+the registers, or the problem may be elsewhere.
+
+	Also, remember that miimon only checks for the device's
+carrier state.  It has no way to determine the state of devices on or
+beyond other ports of a switch, or if a switch is refusing to pass
+traffic while still maintaining carrier on.
+
+9. SNMP agents
+===============
+
+	If running SNMP agents, the bonding driver should be loaded
+before any network drivers participating in a bond.  This requirement
+is due to the interface index (ipAdEntIfIndex) being associated to
+the first interface found with a given IP address.  That is, there is
+only one ipAdEntIfIndex for each IP address.  For example, if eth0 and
+eth1 are slaves of bond0 and the driver for eth0 is loaded before the
+bonding driver, the interface for the IP address will be associated
+with the eth0 interface.  This configuration is shown below, the IP
+address 192.168.1.1 has an interface index of 2 which indexes to eth0
+in the ifDescr table (ifDescr.2).
+
+     interfaces.ifTable.ifEntry.ifDescr.1 = lo
+     interfaces.ifTable.ifEntry.ifDescr.2 = eth0
+     interfaces.ifTable.ifEntry.ifDescr.3 = eth1
+     interfaces.ifTable.ifEntry.ifDescr.4 = eth2
+     interfaces.ifTable.ifEntry.ifDescr.5 = eth3
+     interfaces.ifTable.ifEntry.ifDescr.6 = bond0
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
+
+	This problem is avoided by loading the bonding driver before
+any network drivers participating in a bond.  Below is an example of
+loading the bonding driver first, the IP address 192.168.1.1 is
+correctly associated with ifDescr.2.
+
+     interfaces.ifTable.ifEntry.ifDescr.1 = lo
+     interfaces.ifTable.ifEntry.ifDescr.2 = bond0
+     interfaces.ifTable.ifEntry.ifDescr.3 = eth0
+     interfaces.ifTable.ifEntry.ifDescr.4 = eth1
+     interfaces.ifTable.ifEntry.ifDescr.5 = eth2
+     interfaces.ifTable.ifEntry.ifDescr.6 = eth3
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
+     ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
+
+	While some distributions may not report the interface name in
+ifDescr, the association between the IP address and IfIndex remains
+and SNMP functions such as Interface_Scan_Next will report that
+association.
+
+10. Promiscuous mode
+====================
+
+	When running network monitoring tools, e.g., tcpdump, it is
+common to enable promiscuous mode on the device, so that all traffic
+is seen (instead of seeing only traffic destined for the local host).
+The bonding driver handles promiscuous mode changes to the bonding
+master device (e.g., bond0), and propagates the setting to the slave
+devices.
+
+	For the balance-rr, balance-xor, broadcast, and 802.3ad modes,
+the promiscuous mode setting is propagated to all slaves.
+
+	For the active-backup, balance-tlb and balance-alb modes, the
+promiscuous mode setting is propagated only to the active slave.
+
+	For balance-tlb mode, the active slave is the slave currently
+receiving inbound traffic.
+
+	For balance-alb mode, the active slave is the slave used as a
+"primary."  This slave is used for mode-specific control traffic, for
+sending to peers that are unassigned or if the load is unbalanced.
+
+	For the active-backup, balance-tlb and balance-alb modes, when
+the active slave changes (e.g., due to a link failure), the
+promiscuous setting will be propagated to the new active slave.
+
+11. Configuring Bonding for High Availability
+=============================================
+
+	High Availability refers to configurations that provide
+maximum network availability by having redundant or backup devices,
+links or switches between the host and the rest of the world.  The
+goal is to provide the maximum availability of network connectivity
+(i.e., the network always works), even though other configurations
+could provide higher throughput.
+
+11.1 High Availability in a Single Switch Topology
+--------------------------------------------------
+
+	If two hosts (or a host and a single switch) are directly
+connected via multiple physical links, then there is no availability
+penalty to optimizing for maximum bandwidth.  In this case, there is
+only one switch (or peer), so if it fails, there is no alternative
+access to fail over to.  Additionally, the bonding load balance modes
+support link monitoring of their members, so if individual links fail,
+the load will be rebalanced across the remaining devices.
+
+	See Section 12, "Configuring Bonding for Maximum Throughput"
+for information on configuring bonding with one peer device.
+
+11.2 High Availability in a Multiple Switch Topology
+----------------------------------------------------
+
+	With multiple switches, the configuration of bonding and the
+network changes dramatically.  In multiple switch topologies, there is
+a trade off between network availability and usable bandwidth.
+
+	Below is a sample network, configured to maximize the
+availability of the network:
+
+                |                                     |
+                |port3                           port3|
+          +-----+----+                          +-----+----+
+          |          |port2       ISL      port2|          |
+          | switch A +--------------------------+ switch B |
+          |          |                          |          |
+          +-----+----+                          +-----++---+
+                |port1                           port1|
+                |             +-------+               |
+                +-------------+ host1 +---------------+
+                         eth0 +-------+ eth1
+
+	In this configuration, there is a link between the two
+switches (ISL, or inter switch link), and multiple ports connecting to
+the outside world ("port3" on each switch).  There is no technical
+reason that this could not be extended to a third switch.
+
+11.2.1 HA Bonding Mode Selection for Multiple Switch Topology
+-------------------------------------------------------------
+
+	In a topology such as the example above, the active-backup and
+broadcast modes are the only useful bonding modes when optimizing for
+availability; the other modes require all links to terminate on the
+same peer for them to behave rationally.
+
+active-backup: This is generally the preferred mode, particularly if
+	the switches have an ISL and play together well.  If the
+	network configuration is such that one switch is specifically
+	a backup switch (e.g., has lower capacity, higher cost, etc),
+	then the primary option can be used to insure that the
+	preferred link is always used when it is available.
+
+broadcast: This mode is really a special purpose mode, and is suitable
+	only for very specific needs.  For example, if the two
+	switches are not connected (no ISL), and the networks beyond
+	them are totally independent.  In this case, if it is
+	necessary for some specific one-way traffic to reach both
+	independent networks, then the broadcast mode may be suitable.
+
+11.2.2 HA Link Monitoring Selection for Multiple Switch Topology
+----------------------------------------------------------------
+
+	The choice of link monitoring ultimately depends upon your
+switch.  If the switch can reliably fail ports in response to other
+failures, then either the MII or ARP monitors should work.  For
+example, in the above example, if the "port3" link fails at the remote
+end, the MII monitor has no direct means to detect this.  The ARP
+monitor could be configured with a target at the remote end of port3,
+thus detecting that failure without switch support.
+
+	In general, however, in a multiple switch topology, the ARP
+monitor can provide a higher level of reliability in detecting end to
+end connectivity failures (which may be caused by the failure of any
+individual component to pass traffic for any reason).  Additionally,
+the ARP monitor should be configured with multiple targets (at least
+one for each switch in the network).  This will insure that,
+regardless of which switch is active, the ARP monitor has a suitable
+target to query.
+
+	Note, also, that of late many switches now support a functionality
+generally referred to as "trunk failover."  This is a feature of the
+switch that causes the link state of a particular switch port to be set
+down (or up) when the state of another switch port goes down (or up).
+Its purpose is to propagate link failures from logically "exterior" ports
+to the logically "interior" ports that bonding is able to monitor via
+miimon.  Availability and configuration for trunk failover varies by
+switch, but this can be a viable alternative to the ARP monitor when using
+suitable switches.
+
+12. Configuring Bonding for Maximum Throughput
+==============================================
+
+12.1 Maximizing Throughput in a Single Switch Topology
+------------------------------------------------------
+
+	In a single switch configuration, the best method to maximize
+throughput depends upon the application and network environment.  The
+various load balancing modes each have strengths and weaknesses in
+different environments, as detailed below.
+
+	For this discussion, we will break down the topologies into
+two categories.  Depending upon the destination of most traffic, we
+categorize them into either "gatewayed" or "local" configurations.
+
+	In a gatewayed configuration, the "switch" is acting primarily
+as a router, and the majority of traffic passes through this router to
+other networks.  An example would be the following:
+
+
+     +----------+                     +----------+
+     |          |eth0            port1|          | to other networks
+     | Host A   +---------------------+ router   +------------------->
+     |          +---------------------+          | Hosts B and C are out
+     |          |eth1            port2|          | here somewhere
+     +----------+                     +----------+
+
+	The router may be a dedicated router device, or another host
+acting as a gateway.  For our discussion, the important point is that
+the majority of traffic from Host A will pass through the router to
+some other network before reaching its final destination.
+
+	In a gatewayed network configuration, although Host A may
+communicate with many other systems, all of its traffic will be sent
+and received via one other peer on the local network, the router.
+
+	Note that the case of two systems connected directly via
+multiple physical links is, for purposes of configuring bonding, the
+same as a gatewayed configuration.  In that case, it happens that all
+traffic is destined for the "gateway" itself, not some other network
+beyond the gateway.
+
+	In a local configuration, the "switch" is acting primarily as
+a switch, and the majority of traffic passes through this switch to
+reach other stations on the same network.  An example would be the
+following:
+
+    +----------+            +----------+       +--------+
+    |          |eth0   port1|          +-------+ Host B |
+    |  Host A  +------------+  switch  |port3  +--------+
+    |          +------------+          |                  +--------+
+    |          |eth1   port2|          +------------------+ Host C |
+    +----------+            +----------+port4             +--------+
+
+
+	Again, the switch may be a dedicated switch device, or another
+host acting as a gateway.  For our discussion, the important point is
+that the majority of traffic from Host A is destined for other hosts
+on the same local network (Hosts B and C in the above example).
+
+	In summary, in a gatewayed configuration, traffic to and from
+the bonded device will be to the same MAC level peer on the network
+(the gateway itself, i.e., the router), regardless of its final
+destination.  In a local configuration, traffic flows directly to and
+from the final destinations, thus, each destination (Host B, Host C)
+will be addressed directly by their individual MAC addresses.
+
+	This distinction between a gatewayed and a local network
+configuration is important because many of the load balancing modes
+available use the MAC addresses of the local network source and
+destination to make load balancing decisions.  The behavior of each
+mode is described below.
+
+
+12.1.1 MT Bonding Mode Selection for Single Switch Topology
+-----------------------------------------------------------
+
+	This configuration is the easiest to set up and to understand,
+although you will have to decide which bonding mode best suits your
+needs.  The trade offs for each mode are detailed below:
+
+balance-rr: This mode is the only mode that will permit a single
+	TCP/IP connection to stripe traffic across multiple
+	interfaces. It is therefore the only mode that will allow a
+	single TCP/IP stream to utilize more than one interface's
+	worth of throughput.  This comes at a cost, however: the
+	striping generally results in peer systems receiving packets out
+	of order, causing TCP/IP's congestion control system to kick
+	in, often by retransmitting segments.
+
+	It is possible to adjust TCP/IP's congestion limits by
+	altering the net.ipv4.tcp_reordering sysctl parameter.  The
+	usual default value is 3. But keep in mind TCP stack is able
+	to automatically increase this when it detects reorders.
+
+	Note that the fraction of packets that will be delivered out of
+	order is highly variable, and is unlikely to be zero.  The level
+	of reordering depends upon a variety of factors, including the
+	networking interfaces, the switch, and the topology of the
+	configuration.  Speaking in general terms, higher speed network
+	cards produce more reordering (due to factors such as packet
+	coalescing), and a "many to many" topology will reorder at a
+	higher rate than a "many slow to one fast" configuration.
+
+	Many switches do not support any modes that stripe traffic
+	(instead choosing a port based upon IP or MAC level addresses);
+	for those devices, traffic for a particular connection flowing
+	through the switch to a balance-rr bond will not utilize greater
+	than one interface's worth of bandwidth.
+
+	If you are utilizing protocols other than TCP/IP, UDP for
+	example, and your application can tolerate out of order
+	delivery, then this mode can allow for single stream datagram
+	performance that scales near linearly as interfaces are added
+	to the bond.
+
+	This mode requires the switch to have the appropriate ports
+	configured for "etherchannel" or "trunking."
+
+active-backup: There is not much advantage in this network topology to
+	the active-backup mode, as the inactive backup devices are all
+	connected to the same peer as the primary.  In this case, a
+	load balancing mode (with link monitoring) will provide the
+	same level of network availability, but with increased
+	available bandwidth.  On the plus side, active-backup mode
+	does not require any configuration of the switch, so it may
+	have value if the hardware available does not support any of
+	the load balance modes.
+
+balance-xor: This mode will limit traffic such that packets destined
+	for specific peers will always be sent over the same
+	interface.  Since the destination is determined by the MAC
+	addresses involved, this mode works best in a "local" network
+	configuration (as described above), with destinations all on
+	the same local network.  This mode is likely to be suboptimal
+	if all your traffic is passed through a single router (i.e., a
+	"gatewayed" network configuration, as described above).
+
+	As with balance-rr, the switch ports need to be configured for
+	"etherchannel" or "trunking."
+
+broadcast: Like active-backup, there is not much advantage to this
+	mode in this type of network topology.
+
+802.3ad: This mode can be a good choice for this type of network
+	topology.  The 802.3ad mode is an IEEE standard, so all peers
+	that implement 802.3ad should interoperate well.  The 802.3ad
+	protocol includes automatic configuration of the aggregates,
+	so minimal manual configuration of the switch is needed
+	(typically only to designate that some set of devices is
+	available for 802.3ad).  The 802.3ad standard also mandates
+	that frames be delivered in order (within certain limits), so
+	in general single connections will not see misordering of
+	packets.  The 802.3ad mode does have some drawbacks: the
+	standard mandates that all devices in the aggregate operate at
+	the same speed and duplex.  Also, as with all bonding load
+	balance modes other than balance-rr, no single connection will
+	be able to utilize more than a single interface's worth of
+	bandwidth.  
+
+	Additionally, the linux bonding 802.3ad implementation
+	distributes traffic by peer (using an XOR of MAC addresses
+	and packet type ID), so in a "gatewayed" configuration, all
+	outgoing traffic will generally use the same device.  Incoming
+	traffic may also end up on a single device, but that is
+	dependent upon the balancing policy of the peer's 802.3ad
+	implementation.  In a "local" configuration, traffic will be
+	distributed across the devices in the bond.
+
+	Finally, the 802.3ad mode mandates the use of the MII monitor,
+	therefore, the ARP monitor is not available in this mode.
+
+balance-tlb: The balance-tlb mode balances outgoing traffic by peer.
+	Since the balancing is done according to MAC address, in a
+	"gatewayed" configuration (as described above), this mode will
+	send all traffic across a single device.  However, in a
+	"local" network configuration, this mode balances multiple
+	local network peers across devices in a vaguely intelligent
+	manner (not a simple XOR as in balance-xor or 802.3ad mode),
+	so that mathematically unlucky MAC addresses (i.e., ones that
+	XOR to the same value) will not all "bunch up" on a single
+	interface.
+
+	Unlike 802.3ad, interfaces may be of differing speeds, and no
+	special switch configuration is required.  On the down side,
+	in this mode all incoming traffic arrives over a single
+	interface, this mode requires certain ethtool support in the
+	network device driver of the slave interfaces, and the ARP
+	monitor is not available.
+
+balance-alb: This mode is everything that balance-tlb is, and more.
+	It has all of the features (and restrictions) of balance-tlb,
+	and will also balance incoming traffic from local network
+	peers (as described in the Bonding Module Options section,
+	above).
+
+	The only additional down side to this mode is that the network
+	device driver must support changing the hardware address while
+	the device is open.
+
+12.1.2 MT Link Monitoring for Single Switch Topology
+----------------------------------------------------
+
+	The choice of link monitoring may largely depend upon which
+mode you choose to use.  The more advanced load balancing modes do not
+support the use of the ARP monitor, and are thus restricted to using
+the MII monitor (which does not provide as high a level of end to end
+assurance as the ARP monitor).
+
+12.2 Maximum Throughput in a Multiple Switch Topology
+-----------------------------------------------------
+
+	Multiple switches may be utilized to optimize for throughput
+when they are configured in parallel as part of an isolated network
+between two or more systems, for example:
+
+                       +-----------+
+                       |  Host A   | 
+                       +-+---+---+-+
+                         |   |   |
+                +--------+   |   +---------+
+                |            |             |
+         +------+---+  +-----+----+  +-----+----+
+         | Switch A |  | Switch B |  | Switch C |
+         +------+---+  +-----+----+  +-----+----+
+                |            |             |
+                +--------+   |   +---------+
+                         |   |   |
+                       +-+---+---+-+
+                       |  Host B   | 
+                       +-----------+
+
+	In this configuration, the switches are isolated from one
+another.  One reason to employ a topology such as this is for an
+isolated network with many hosts (a cluster configured for high
+performance, for example), using multiple smaller switches can be more
+cost effective than a single larger switch, e.g., on a network with 24
+hosts, three 24 port switches can be significantly less expensive than
+a single 72 port switch.
+
+	If access beyond the network is required, an individual host
+can be equipped with an additional network device connected to an
+external network; this host then additionally acts as a gateway.
+
+12.2.1 MT Bonding Mode Selection for Multiple Switch Topology
+-------------------------------------------------------------
+
+	In actual practice, the bonding mode typically employed in
+configurations of this type is balance-rr.  Historically, in this
+network configuration, the usual caveats about out of order packet
+delivery are mitigated by the use of network adapters that do not do
+any kind of packet coalescing (via the use of NAPI, or because the
+device itself does not generate interrupts until some number of
+packets has arrived).  When employed in this fashion, the balance-rr
+mode allows individual connections between two hosts to effectively
+utilize greater than one interface's bandwidth.
+
+12.2.2 MT Link Monitoring for Multiple Switch Topology
+------------------------------------------------------
+
+	Again, in actual practice, the MII monitor is most often used
+in this configuration, as performance is given preference over
+availability.  The ARP monitor will function in this topology, but its
+advantages over the MII monitor are mitigated by the volume of probes
+needed as the number of systems involved grows (remember that each
+host in the network is configured with bonding).
+
+13. Switch Behavior Issues
+==========================
+
+13.1 Link Establishment and Failover Delays
+-------------------------------------------
+
+	Some switches exhibit undesirable behavior with regard to the
+timing of link up and down reporting by the switch.
+
+	First, when a link comes up, some switches may indicate that
+the link is up (carrier available), but not pass traffic over the
+interface for some period of time.  This delay is typically due to
+some type of autonegotiation or routing protocol, but may also occur
+during switch initialization (e.g., during recovery after a switch
+failure).  If you find this to be a problem, specify an appropriate
+value to the updelay bonding module option to delay the use of the
+relevant interface(s).
+
+	Second, some switches may "bounce" the link state one or more
+times while a link is changing state.  This occurs most commonly while
+the switch is initializing.  Again, an appropriate updelay value may
+help.
+
+	Note that when a bonding interface has no active links, the
+driver will immediately reuse the first link that goes up, even if the
+updelay parameter has been specified (the updelay is ignored in this
+case).  If there are slave interfaces waiting for the updelay timeout
+to expire, the interface that first went into that state will be
+immediately reused.  This reduces down time of the network if the
+value of updelay has been overestimated, and since this occurs only in
+cases with no connectivity, there is no additional penalty for
+ignoring the updelay.
+
+	In addition to the concerns about switch timings, if your
+switches take a long time to go into backup mode, it may be desirable
+to not activate a backup interface immediately after a link goes down.
+Failover may be delayed via the downdelay bonding module option.
+
+13.2 Duplicated Incoming Packets
+--------------------------------
+
+	NOTE: Starting with version 3.0.2, the bonding driver has logic to
+suppress duplicate packets, which should largely eliminate this problem.
+The following description is kept for reference.
+
+	It is not uncommon to observe a short burst of duplicated
+traffic when the bonding device is first used, or after it has been
+idle for some period of time.  This is most easily observed by issuing
+a "ping" to some other host on the network, and noticing that the
+output from ping flags duplicates (typically one per slave).
+
+	For example, on a bond in active-backup mode with five slaves
+all connected to one switch, the output may appear as follows:
+
+# ping -n 10.0.4.2
+PING 10.0.4.2 (10.0.4.2) from 10.0.3.10 : 56(84) bytes of data.
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.7 ms
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=1 ttl=64 time=13.8 ms (DUP!)
+64 bytes from 10.0.4.2: icmp_seq=2 ttl=64 time=0.216 ms
+64 bytes from 10.0.4.2: icmp_seq=3 ttl=64 time=0.267 ms
+64 bytes from 10.0.4.2: icmp_seq=4 ttl=64 time=0.222 ms
+
+	This is not due to an error in the bonding driver, rather, it
+is a side effect of how many switches update their MAC forwarding
+tables.  Initially, the switch does not associate the MAC address in
+the packet with a particular switch port, and so it may send the
+traffic to all ports until its MAC forwarding table is updated.  Since
+the interfaces attached to the bond may occupy multiple ports on a
+single switch, when the switch (temporarily) floods the traffic to all
+ports, the bond device receives multiple copies of the same packet
+(one per slave device).
+
+	The duplicated packet behavior is switch dependent, some
+switches exhibit this, and some do not.  On switches that display this
+behavior, it can be induced by clearing the MAC forwarding table (on
+most Cisco switches, the privileged command "clear mac address-table
+dynamic" will accomplish this).
+
+14. Hardware Specific Considerations
+====================================
+
+	This section contains additional information for configuring
+bonding on specific hardware platforms, or for interfacing bonding
+with particular switches or other devices.
+
+14.1 IBM BladeCenter
+--------------------
+
+	This applies to the JS20 and similar systems.
+
+	On the JS20 blades, the bonding driver supports only
+balance-rr, active-backup, balance-tlb and balance-alb modes.  This is
+largely due to the network topology inside the BladeCenter, detailed
+below.
+
+JS20 network adapter information
+--------------------------------
+
+	All JS20s come with two Broadcom Gigabit Ethernet ports
+integrated on the planar (that's "motherboard" in IBM-speak).  In the
+BladeCenter chassis, the eth0 port of all JS20 blades is hard wired to
+I/O Module #1; similarly, all eth1 ports are wired to I/O Module #2.
+An add-on Broadcom daughter card can be installed on a JS20 to provide
+two more Gigabit Ethernet ports.  These ports, eth2 and eth3, are
+wired to I/O Modules 3 and 4, respectively.
+
+	Each I/O Module may contain either a switch or a passthrough
+module (which allows ports to be directly connected to an external
+switch).  Some bonding modes require a specific BladeCenter internal
+network topology in order to function; these are detailed below.
+
+	Additional BladeCenter-specific networking information can be
+found in two IBM Redbooks (www.ibm.com/redbooks):
+
+"IBM eServer BladeCenter Networking Options"
+"IBM eServer BladeCenter Layer 2-7 Network Switching"
+
+BladeCenter networking configuration
+------------------------------------
+
+	Because a BladeCenter can be configured in a very large number
+of ways, this discussion will be confined to describing basic
+configurations.
+
+	Normally, Ethernet Switch Modules (ESMs) are used in I/O
+modules 1 and 2.  In this configuration, the eth0 and eth1 ports of a
+JS20 will be connected to different internal switches (in the
+respective I/O modules).
+
+	A passthrough module (OPM or CPM, optical or copper,
+passthrough module) connects the I/O module directly to an external
+switch.  By using PMs in I/O module #1 and #2, the eth0 and eth1
+interfaces of a JS20 can be redirected to the outside world and
+connected to a common external switch.
+
+	Depending upon the mix of ESMs and PMs, the network will
+appear to bonding as either a single switch topology (all PMs) or as a
+multiple switch topology (one or more ESMs, zero or more PMs).  It is
+also possible to connect ESMs together, resulting in a configuration
+much like the example in "High Availability in a Multiple Switch
+Topology," above.
+
+Requirements for specific modes
+-------------------------------
+
+	The balance-rr mode requires the use of passthrough modules
+for devices in the bond, all connected to an common external switch.
+That switch must be configured for "etherchannel" or "trunking" on the
+appropriate ports, as is usual for balance-rr.
+
+	The balance-alb and balance-tlb modes will function with
+either switch modules or passthrough modules (or a mix).  The only
+specific requirement for these modes is that all network interfaces
+must be able to reach all destinations for traffic sent over the
+bonding device (i.e., the network must converge at some point outside
+the BladeCenter).
+
+	The active-backup mode has no additional requirements.
+
+Link monitoring issues
+----------------------
+
+	When an Ethernet Switch Module is in place, only the ARP
+monitor will reliably detect link loss to an external switch.  This is
+nothing unusual, but examination of the BladeCenter cabinet would
+suggest that the "external" network ports are the ethernet ports for
+the system, when it fact there is a switch between these "external"
+ports and the devices on the JS20 system itself.  The MII monitor is
+only able to detect link failures between the ESM and the JS20 system.
+
+	When a passthrough module is in place, the MII monitor does
+detect failures to the "external" port, which is then directly
+connected to the JS20 system.
+
+Other concerns
+--------------
+
+	The Serial Over LAN (SoL) link is established over the primary
+ethernet (eth0) only, therefore, any loss of link to eth0 will result
+in losing your SoL connection.  It will not fail over with other
+network traffic, as the SoL system is beyond the control of the
+bonding driver.
+
+	It may be desirable to disable spanning tree on the switch
+(either the internal Ethernet Switch Module, or an external switch) to
+avoid fail-over delay issues when using bonding.
+
+	
+15. Frequently Asked Questions
+==============================
+
+1.  Is it SMP safe?
+
+	Yes. The old 2.0.xx channel bonding patch was not SMP safe.
+The new driver was designed to be SMP safe from the start.
+
+2.  What type of cards will work with it?
+
+	Any Ethernet type cards (you can even mix cards - a Intel
+EtherExpress PRO/100 and a 3com 3c905b, for example).  For most modes,
+devices need not be of the same speed.
+
+	Starting with version 3.2.1, bonding also supports Infiniband
+slaves in active-backup mode.
+
+3.  How many bonding devices can I have?
+
+	There is no limit.
+
+4.  How many slaves can a bonding device have?
+
+	This is limited only by the number of network interfaces Linux
+supports and/or the number of network cards you can place in your
+system.
+
+5.  What happens when a slave link dies?
+
+	If link monitoring is enabled, then the failing device will be
+disabled.  The active-backup mode will fail over to a backup link, and
+other modes will ignore the failed link.  The link will continue to be
+monitored, and should it recover, it will rejoin the bond (in whatever
+manner is appropriate for the mode). See the sections on High
+Availability and the documentation for each mode for additional
+information.
+	
+	Link monitoring can be enabled via either the miimon or
+arp_interval parameters (described in the module parameters section,
+above).  In general, miimon monitors the carrier state as sensed by
+the underlying network device, and the arp monitor (arp_interval)
+monitors connectivity to another host on the local network.
+
+	If no link monitoring is configured, the bonding driver will
+be unable to detect link failures, and will assume that all links are
+always available.  This will likely result in lost packets, and a
+resulting degradation of performance.  The precise performance loss
+depends upon the bonding mode and network configuration.
+
+6.  Can bonding be used for High Availability?
+
+	Yes.  See the section on High Availability for details.
+
+7.  Which switches/systems does it work with?
+
+	The full answer to this depends upon the desired mode.
+
+	In the basic balance modes (balance-rr and balance-xor), it
+works with any system that supports etherchannel (also called
+trunking).  Most managed switches currently available have such
+support, and many unmanaged switches as well.
+
+	The advanced balance modes (balance-tlb and balance-alb) do
+not have special switch requirements, but do need device drivers that
+support specific features (described in the appropriate section under
+module parameters, above).
+
+	In 802.3ad mode, it works with systems that support IEEE
+802.3ad Dynamic Link Aggregation.  Most managed and many unmanaged
+switches currently available support 802.3ad.
+
+        The active-backup mode should work with any Layer-II switch.
+
+8.  Where does a bonding device get its MAC address from?
+
+	When using slave devices that have fixed MAC addresses, or when
+the fail_over_mac option is enabled, the bonding device's MAC address is
+the MAC address of the active slave.
+
+	For other configurations, if not explicitly configured (with
+ifconfig or ip link), the MAC address of the bonding device is taken from
+its first slave device.  This MAC address is then passed to all following
+slaves and remains persistent (even if the first slave is removed) until
+the bonding device is brought down or reconfigured.
+
+	If you wish to change the MAC address, you can set it with
+ifconfig or ip link:
+
+# ifconfig bond0 hw ether 00:11:22:33:44:55
+
+# ip link set bond0 address 66:77:88:99:aa:bb
+
+	The MAC address can be also changed by bringing down/up the
+device and then changing its slaves (or their order):
+
+# ifconfig bond0 down ; modprobe -r bonding
+# ifconfig bond0 .... up
+# ifenslave bond0 eth...
+
+	This method will automatically take the address from the next
+slave that is added.
+
+	To restore your slaves' MAC addresses, you need to detach them
+from the bond (`ifenslave -d bond0 eth0'). The bonding driver will
+then restore the MAC addresses that the slaves had before they were
+enslaved.
+
+16. Resources and Links
+=======================
+
+	The latest version of the bonding driver can be found in the latest
+version of the linux kernel, found on http://kernel.org
+
+	The latest version of this document can be found in the latest kernel
+source (named Documentation/networking/bonding.txt).
+
+	Discussions regarding the usage of the bonding driver take place on the
+bonding-devel mailing list, hosted at sourceforge.net. If you have questions or
+problems, post them to the list.  The list address is:
+
+bonding-devel@lists.sourceforge.net
+
+	The administrative interface (to subscribe or unsubscribe) can
+be found at:
+
+https://lists.sourceforge.net/lists/listinfo/bonding-devel
+
+	Discussions regarding the development of the bonding driver take place
+on the main Linux network mailing list, hosted at vger.kernel.org. The list
+address is:
+
+netdev@vger.kernel.org
+
+	The administrative interface (to subscribe or unsubscribe) can
+be found at:
+
+http://vger.kernel.org/vger-lists.html#netdev
+
+Donald Becker's Ethernet Drivers and diag programs may be found at :
+ - http://web.archive.org/web/*/http://www.scyld.com/network/ 
+
+You will also find a lot of information regarding Ethernet, NWay, MII,
+etc. at www.scyld.com.
+
+-- END --
diff --git a/Documentation/networking/bridge.rst b/Documentation/networking/bridge.rst
new file mode 100644
index 000000000..4aef9cddd
--- /dev/null
+++ b/Documentation/networking/bridge.rst
@@ -0,0 +1,21 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Ethernet Bridging
+=================
+
+In order to use the Ethernet bridging functionality, you'll need the
+userspace tools.
+
+Documentation for Linux bridging is on:
+   http://www.linuxfoundation.org/collaborate/workgroups/networking/bridge
+
+The bridge-utilities are maintained at:
+   git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/bridge-utils.git
+
+Additionally, the iproute2 utilities can be used to configure
+bridge devices.
+
+If you still have questions, don't hesitate to post to the mailing list 
+(more info https://lists.linux-foundation.org/mailman/listinfo/bridge).
+
diff --git a/Documentation/networking/caif/Linux-CAIF.txt b/Documentation/networking/caif/Linux-CAIF.txt
new file mode 100644
index 000000000..0aa4bd381
--- /dev/null
+++ b/Documentation/networking/caif/Linux-CAIF.txt
@@ -0,0 +1,175 @@
+Linux CAIF
+===========
+copyright (C) ST-Ericsson AB 2010
+Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
+License terms: GNU General Public License (GPL) version 2
+
+
+Introduction
+------------
+CAIF is a MUX protocol used by ST-Ericsson cellular modems for
+communication between Modem and host. The host processes can open virtual AT
+channels, initiate GPRS Data connections, Video channels and Utility Channels.
+The Utility Channels are general purpose pipes between modem and host.
+
+ST-Ericsson modems support a number of transports between modem
+and host. Currently, UART and Loopback are available for Linux.
+
+
+Architecture:
+------------
+The implementation of CAIF is divided into:
+* CAIF Socket Layer and GPRS IP Interface.
+* CAIF Core Protocol Implementation
+* CAIF Link Layer, implemented as NET devices.
+
+
+  RTNL
+   !
+   !	      +------+	 +------+
+   !	     +------+!	+------+!
+   !	     !	IP  !!	!Socket!!
+   +-------> !interf!+	! API  !+	<- CAIF Client APIs
+   !	     +------+	+------!
+   !		!	    !
+   !		+-----------+
+   !		      !
+   !		   +------+		<- CAIF Core Protocol
+   !		   ! CAIF !
+   !		   ! Core !
+   !		   +------+
+   !	   +----------!---------+
+   !	   !	      !		!
+   !	+------+   +-----+   +------+
+   +--> ! HSI  !   ! TTY !   ! USB  !	<- Link Layer (Net Devices)
+	+------+   +-----+   +------+
+
+
+
+I M P L E M E N T A T I O N
+===========================
+
+
+CAIF Core Protocol Layer
+=========================================
+
+CAIF Core layer implements the CAIF protocol as defined by ST-Ericsson.
+It implements the CAIF protocol stack in a layered approach, where
+each layer described in the specification is implemented as a separate layer.
+The architecture is inspired by the design patterns "Protocol Layer" and
+"Protocol Packet".
+
+== CAIF structure ==
+The Core CAIF implementation contains:
+      -	Simple implementation of CAIF.
+      -	Layered architecture (a la Streams), each layer in the CAIF
+	specification is implemented in a separate c-file.
+      -	Clients must call configuration function to add PHY layer.
+      -	Clients must implement CAIF layer to consume/produce
+	CAIF payload with receive and transmit functions.
+      -	Clients must call configuration function to add and connect the
+	Client layer.
+      - When receiving / transmitting CAIF Packets (cfpkt), ownership is passed
+	to the called function (except for framing layers' receive function)
+
+Layered Architecture
+--------------------
+The CAIF protocol can be divided into two parts: Support functions and Protocol
+Implementation. The support functions include:
+
+      - CFPKT CAIF Packet. Implementation of CAIF Protocol Packet. The
+	CAIF Packet has functions for creating, destroying and adding content
+	and for adding/extracting header and trailers to protocol packets.
+
+The CAIF Protocol implementation contains:
+
+      - CFCNFG CAIF Configuration layer. Configures the CAIF Protocol
+	Stack and provides a Client interface for adding Link-Layer and
+	Driver interfaces on top of the CAIF Stack.
+
+      - CFCTRL CAIF Control layer. Encodes and Decodes control messages
+	such as enumeration and channel setup. Also matches request and
+	response messages.
+
+      - CFSERVL General CAIF Service Layer functionality; handles flow
+	control and remote shutdown requests.
+
+      - CFVEI CAIF VEI layer. Handles CAIF AT Channels on VEI (Virtual
+	External Interface). This layer encodes/decodes VEI frames.
+
+      - CFDGML CAIF Datagram layer. Handles CAIF Datagram layer (IP
+	traffic), encodes/decodes Datagram frames.
+
+      - CFMUX CAIF Mux layer. Handles multiplexing between multiple
+	physical bearers and multiple channels such as VEI, Datagram, etc.
+	The MUX keeps track of the existing CAIF Channels and
+	Physical Instances and selects the appropriate instance based
+	on Channel-Id and Physical-ID.
+
+      - CFFRML CAIF Framing layer. Handles Framing i.e. Frame length
+	and frame checksum.
+
+      - CFSERL CAIF Serial layer. Handles concatenation/split of frames
+	into CAIF Frames with correct length.
+
+
+
+		    +---------+
+		    | Config  |
+		    | CFCNFG  |
+		    +---------+
+			 !
+    +---------+	    +---------+	    +---------+
+    |	AT    |	    | Control |	    | Datagram|
+    | CFVEIL  |	    | CFCTRL  |	    | CFDGML  |
+    +---------+	    +---------+	    +---------+
+	   \_____________!______________/
+			 !
+		    +---------+
+		    |	MUX   |
+		    |	      |
+		    +---------+
+		    _____!_____
+		   /	       \
+	    +---------+	    +---------+
+	    | CFFRML  |	    | CFFRML  |
+	    | Framing |	    | Framing |
+	    +---------+	    +---------+
+		 !		!
+	    +---------+	    +---------+
+	    |	      |	    | Serial  |
+	    |	      |	    | CFSERL  |
+	    +---------+	    +---------+
+
+
+In this layered approach the following "rules" apply.
+      - All layers embed the same structure "struct cflayer"
+      - A layer does not depend on any other layer's private data.
+      - Layers are stacked by setting the pointers
+		  layer->up , layer->dn
+      -	In order to send data upwards, each layer should do
+		 layer->up->receive(layer->up, packet);
+      - In order to send data downwards, each layer should do
+		 layer->dn->transmit(layer->dn, packet);
+
+
+CAIF Socket and IP interface
+===========================
+
+The IP interface and CAIF socket API are implemented on top of the
+CAIF Core protocol. The IP Interface and CAIF socket have an instance of
+'struct cflayer', just like the CAIF Core protocol stack.
+Net device and Socket implement the 'receive()' function defined by
+'struct cflayer', just like the rest of the CAIF stack. In this way, transmit and
+receive of packets is handled as by the rest of the layers: the 'dn->transmit()'
+function is called in order to transmit data.
+
+Configuration of Link Layer
+---------------------------
+The Link Layer is implemented as Linux network devices (struct net_device).
+Payload handling and registration is done using standard Linux mechanisms.
+
+The CAIF Protocol relies on a loss-less link layer without implementing
+retransmission. This implies that packet drops must not happen.
+Therefore a flow-control mechanism is implemented where the physical
+interface can initiate flow stop for all CAIF Channels.
diff --git a/Documentation/networking/caif/README b/Documentation/networking/caif/README
new file mode 100644
index 000000000..757ccfaa1
--- /dev/null
+++ b/Documentation/networking/caif/README
@@ -0,0 +1,109 @@
+Copyright (C) ST-Ericsson AB 2010
+Author: Sjur Brendeland/ sjur.brandeland@stericsson.com
+License terms: GNU General Public License (GPL) version 2
+---------------------------------------------------------
+
+=== Start ===
+If you have compiled CAIF for modules do:
+
+$modprobe crc_ccitt
+$modprobe caif
+$modprobe caif_socket
+$modprobe chnl_net
+
+
+=== Preparing the setup with a STE modem ===
+
+If you are working on integration of CAIF you should make sure
+that the kernel is built with module support.
+
+There are some things that need to be tweaked to get the host TTY correctly
+set up to talk to the modem.
+Since the CAIF stack is running in the kernel and we want to use the existing
+TTY, we are installing our physical serial driver as a line discipline above
+the TTY device.
+
+To achieve this we need to install the N_CAIF ldisc from user space.
+The benefit is that we can hook up to any TTY.
+
+The use of Start-of-frame-extension (STX) must also be set as
+module parameter "ser_use_stx".
+
+Normally Frame Checksum is always used on UART, but this is also provided as a
+module parameter "ser_use_fcs".
+
+$ modprobe caif_serial ser_ttyname=/dev/ttyS0 ser_use_stx=yes
+$ ifconfig caif_ttyS0 up
+
+PLEASE NOTE: 	There is a limitation in Android shell.
+		It only accepts one argument to insmod/modprobe!
+
+=== Trouble shooting ===
+
+There are debugfs parameters provided for serial communication.
+/sys/kernel/debug/caif_serial/<tty-name>/
+
+* ser_state:   Prints the bit-mask status where
+  - 0x02 means SENDING, this is a transient state.
+  - 0x10 means FLOW_OFF_SENT, i.e. the previous frame has not been sent
+	and is blocking further send operation. Flow OFF has been propagated
+	to all CAIF Channels using this TTY.
+
+* tty_status: Prints the bit-mask tty status information
+  - 0x01 - tty->warned is on.
+  - 0x02 - tty->low_latency is on.
+  - 0x04 - tty->packed is on.
+  - 0x08 - tty->flow_stopped is on.
+  - 0x10 - tty->hw_stopped is on.
+  - 0x20 - tty->stopped is on.
+
+* last_tx_msg: Binary blob Prints the last transmitted frame.
+	This can be printed with
+	$od --format=x1 /sys/kernel/debug/caif_serial/<tty>/last_rx_msg.
+	The first two tx messages sent look like this. Note: The initial
+	byte 02 is start of frame extension (STX) used for re-syncing
+	upon errors.
+
+  - Enumeration:
+        0000000  02 05 00 00 03 01 d2 02
+                 |  |     |  |  |  |
+                 STX(1)   |  |  |  |
+                    Length(2)|  |  |
+                          Control Channel(1)
+                             Command:Enumeration(1)
+                                Link-ID(1)
+                                    Checksum(2)
+  - Channel Setup:
+        0000000  02 07 00 00 00 21 a1 00 48 df
+                 |  |     |  |  |  |  |  |
+                 STX(1)   |  |  |  |  |  |
+                    Length(2)|  |  |  |  |
+                          Control Channel(1)
+                             Command:Channel Setup(1)
+                                Channel Type(1)
+                                    Priority and Link-ID(1)
+				      Endpoint(1)
+					  Checksum(2)
+
+* last_rx_msg: Prints the last transmitted frame.
+	The RX messages for LinkSetup look almost identical but they have the
+	bit 0x20 set in the command bit, and Channel Setup has added one byte
+	before Checksum containing Channel ID.
+	NOTE: Several CAIF Messages might be concatenated. The maximum debug
+	buffer size is 128 bytes.
+
+== Error Scenarios:
+- last_tx_msg contains channel setup message and last_rx_msg is empty ->
+  The host seems to be able to send over the UART, at least the CAIF ldisc get
+  notified that sending is completed.
+
+- last_tx_msg contains enumeration message and last_rx_msg is empty ->
+  The host is not able to send the message from UART, the tty has not been
+  able to complete the transmit operation.
+
+- if /sys/kernel/debug/caif_serial/<tty>/tty_status is non-zero there
+  might be problems transmitting over UART.
+  E.g. host and modem wiring is not correct you will typically see
+  tty_status = 0x10 (hw_stopped) and ser_state = 0x10 (FLOW_OFF_SENT).
+  You will probably see the enumeration message in last_tx_message
+  and empty last_rx_message.
diff --git a/Documentation/networking/caif/spi_porting.txt b/Documentation/networking/caif/spi_porting.txt
new file mode 100644
index 000000000..9efd0687d
--- /dev/null
+++ b/Documentation/networking/caif/spi_porting.txt
@@ -0,0 +1,208 @@
+- CAIF SPI porting -
+
+- CAIF SPI basics:
+
+Running CAIF over SPI needs some extra setup, owing to the nature of SPI.
+Two extra GPIOs have been added in order to negotiate the transfers
+ between the master and the slave. The minimum requirement for running
+CAIF over SPI is a SPI slave chip and two GPIOs (more details below).
+Please note that running as a slave implies that you need to keep up
+with the master clock. An overrun or underrun event is fatal.
+
+- CAIF SPI framework:
+
+To make porting as easy as possible, the CAIF SPI has been divided in
+two parts. The first part (called the interface part) deals with all
+generic functionality such as length framing, SPI frame negotiation
+and SPI frame delivery and transmission. The other part is the CAIF
+SPI slave device part, which is the module that you have to write if
+you want to run SPI CAIF on a new hardware. This part takes care of
+the physical hardware, both with regard to SPI and to GPIOs.
+
+- Implementing a CAIF SPI device:
+
+	- Functionality provided by the CAIF SPI slave device:
+
+	In order to implement a SPI device you will, as a minimum,
+	need to implement the following
+	functions:
+
+	int (*init_xfer) (struct cfspi_xfer * xfer, struct cfspi_dev *dev):
+
+	This function is called by the CAIF SPI interface to give
+	you a chance to set up your hardware to be ready to receive
+	a stream of data from the master. The xfer structure contains
+	both physical and logical addresses, as well as the total length
+	of the transfer in both directions.The dev parameter can be used
+	to map to different CAIF SPI slave devices.
+
+	void (*sig_xfer) (bool xfer, struct cfspi_dev *dev):
+
+	This function is called by the CAIF SPI interface when the output
+	(SPI_INT) GPIO needs to change state. The boolean value of the xfer
+	variable indicates whether the GPIO should be asserted (HIGH) or
+	deasserted (LOW). The dev parameter can be used to map to different CAIF
+	SPI slave devices.
+
+	- Functionality provided by the CAIF SPI interface:
+
+	void (*ss_cb) (bool assert, struct cfspi_ifc *ifc);
+
+	This function is called by the CAIF SPI slave device in order to
+	signal a change of state of the input GPIO (SS) to the interface.
+	Only active edges are mandatory to be reported.
+	This function can be called from IRQ context (recommended in order
+	not to introduce latency). The ifc parameter should be the pointer
+	returned from the platform probe function in the SPI device structure.
+
+	void (*xfer_done_cb) (struct cfspi_ifc *ifc);
+
+	This function is called by the CAIF SPI slave device in order to
+	report that a transfer is completed. This function should only be
+	called once both the transmission and the reception are completed.
+	This function can be called from IRQ context (recommended in order
+	not to introduce latency). The ifc parameter should be the pointer
+	returned from the platform probe function in the SPI device structure.
+
+	- Connecting the bits and pieces:
+
+		- Filling in the SPI slave device structure:
+
+		Connect the necessary callback functions.
+		Indicate clock speed (used to calculate toggle delays).
+		Chose a suitable name (helps debugging if you use several CAIF
+		SPI slave devices).
+		Assign your private data (can be used to map to your structure).
+
+		- Filling in the SPI slave platform device structure:
+		Add name of driver to connect to ("cfspi_sspi").
+		Assign the SPI slave device structure as platform data.
+
+- Padding:
+
+In order to optimize throughput, a number of SPI padding options are provided.
+Padding can be enabled independently for uplink and downlink transfers.
+Padding can be enabled for the head, the tail and for the total frame size.
+The padding needs to be correctly configured on both sides of the link.
+The padding can be changed via module parameters in cfspi_sspi.c or via
+the sysfs directory of the cfspi_sspi driver (before device registration).
+
+- CAIF SPI device template:
+
+/*
+ *	Copyright (C) ST-Ericsson AB 2010
+ *	Author: Daniel Martensson / Daniel.Martensson@stericsson.com
+ *	License terms: GNU General Public License (GPL), version 2.
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/wait.h>
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+#include <net/caif/caif_spi.h>
+
+MODULE_LICENSE("GPL");
+
+struct sspi_struct {
+	struct cfspi_dev sdev;
+	struct cfspi_xfer *xfer;
+};
+
+static struct sspi_struct slave;
+static struct platform_device slave_device;
+
+static irqreturn_t sspi_irq(int irq, void *arg)
+{
+	/* You only need to trigger on an edge to the active state of the
+	 * SS signal. Once a edge is detected, the ss_cb() function should be
+	 * called with the parameter assert set to true. It is OK
+	 * (and even advised) to call the ss_cb() function in IRQ context in
+	 * order not to add any delay. */
+
+	return IRQ_HANDLED;
+}
+
+static void sspi_complete(void *context)
+{
+	/* Normally the DMA or the SPI framework will call you back
+	 * in something similar to this. The only thing you need to
+	 * do is to call the xfer_done_cb() function, providing the pointer
+	 * to the CAIF SPI interface. It is OK to call this function
+	 * from IRQ context. */
+}
+
+static int sspi_init_xfer(struct cfspi_xfer *xfer, struct cfspi_dev *dev)
+{
+	/* Store transfer info. For a normal implementation you should
+	 * set up your DMA here and make sure that you are ready to
+	 * receive the data from the master SPI. */
+
+	struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
+
+	sspi->xfer = xfer;
+
+	return 0;
+}
+
+void sspi_sig_xfer(bool xfer, struct cfspi_dev *dev)
+{
+	/* If xfer is true then you should assert the SPI_INT to indicate to
+	 * the master that you are ready to receive the data from the master
+	 * SPI. If xfer is false then you should de-assert SPI_INT to indicate
+	 * that the transfer is done.
+	 */
+
+	struct sspi_struct *sspi = (struct sspi_struct *)dev->priv;
+}
+
+static void sspi_release(struct device *dev)
+{
+	/*
+	 * Here you should release your SPI device resources.
+	 */
+}
+
+static int __init sspi_init(void)
+{
+	/* Here you should initialize your SPI device by providing the
+	 * necessary functions, clock speed, name and private data. Once
+	 * done, you can register your device with the
+	 * platform_device_register() function. This function will return
+	 * with the CAIF SPI interface initialized. This is probably also
+	 * the place where you should set up your GPIOs, interrupts and SPI
+	 * resources. */
+
+	int res = 0;
+
+	/* Initialize slave device. */
+	slave.sdev.init_xfer = sspi_init_xfer;
+	slave.sdev.sig_xfer = sspi_sig_xfer;
+	slave.sdev.clk_mhz = 13;
+	slave.sdev.priv = &slave;
+	slave.sdev.name = "spi_sspi";
+	slave_device.dev.release = sspi_release;
+
+	/* Initialize platform device. */
+	slave_device.name = "cfspi_sspi";
+	slave_device.dev.platform_data = &slave.sdev;
+
+	/* Register platform device. */
+	res = platform_device_register(&slave_device);
+	if (res) {
+		printk(KERN_WARNING "sspi_init: failed to register dev.\n");
+		return -ENODEV;
+	}
+
+	return res;
+}
+
+static void __exit sspi_exit(void)
+{
+	platform_device_del(&slave_device);
+}
+
+module_init(sspi_init);
+module_exit(sspi_exit);
diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst
new file mode 100644
index 000000000..2fd0b51a8
--- /dev/null
+++ b/Documentation/networking/can.rst
@@ -0,0 +1,1437 @@
+===================================
+SocketCAN - Controller Area Network
+===================================
+
+Overview / What is SocketCAN
+============================
+
+The socketcan package is an implementation of CAN protocols
+(Controller Area Network) for Linux.  CAN is a networking technology
+which has widespread use in automation, embedded devices, and
+automotive fields.  While there have been other CAN implementations
+for Linux based on character devices, SocketCAN uses the Berkeley
+socket API, the Linux network stack and implements the CAN device
+drivers as network interfaces.  The CAN socket API has been designed
+as similar as possible to the TCP/IP protocols to allow programmers,
+familiar with network programming, to easily learn how to use CAN
+sockets.
+
+
+.. _socketcan-motivation:
+
+Motivation / Why Using the Socket API
+=====================================
+
+There have been CAN implementations for Linux before SocketCAN so the
+question arises, why we have started another project.  Most existing
+implementations come as a device driver for some CAN hardware, they
+are based on character devices and provide comparatively little
+functionality.  Usually, there is only a hardware-specific device
+driver which provides a character device interface to send and
+receive raw CAN frames, directly to/from the controller hardware.
+Queueing of frames and higher-level transport protocols like ISO-TP
+have to be implemented in user space applications.  Also, most
+character-device implementations support only one single process to
+open the device at a time, similar to a serial interface.  Exchanging
+the CAN controller requires employment of another device driver and
+often the need for adaption of large parts of the application to the
+new driver's API.
+
+SocketCAN was designed to overcome all of these limitations.  A new
+protocol family has been implemented which provides a socket interface
+to user space applications and which builds upon the Linux network
+layer, enabling use all of the provided queueing functionality.  A device
+driver for CAN controller hardware registers itself with the Linux
+network layer as a network device, so that CAN frames from the
+controller can be passed up to the network layer and on to the CAN
+protocol family module and also vice-versa.  Also, the protocol family
+module provides an API for transport protocol modules to register, so
+that any number of transport protocols can be loaded or unloaded
+dynamically.  In fact, the can core module alone does not provide any
+protocol and cannot be used without loading at least one additional
+protocol module.  Multiple sockets can be opened at the same time,
+on different or the same protocol module and they can listen/send
+frames on different or the same CAN IDs.  Several sockets listening on
+the same interface for frames with the same CAN ID are all passed the
+same received matching CAN frames.  An application wishing to
+communicate using a specific transport protocol, e.g. ISO-TP, just
+selects that protocol when opening the socket, and then can read and
+write application data byte streams, without having to deal with
+CAN-IDs, frames, etc.
+
+Similar functionality visible from user-space could be provided by a
+character device, too, but this would lead to a technically inelegant
+solution for a couple of reasons:
+
+* **Intricate usage:**  Instead of passing a protocol argument to
+  socket(2) and using bind(2) to select a CAN interface and CAN ID, an
+  application would have to do all these operations using ioctl(2)s.
+
+* **Code duplication:**  A character device cannot make use of the Linux
+  network queueing code, so all that code would have to be duplicated
+  for CAN networking.
+
+* **Abstraction:**  In most existing character-device implementations, the
+  hardware-specific device driver for a CAN controller directly
+  provides the character device for the application to work with.
+  This is at least very unusual in Unix systems for both, char and
+  block devices.  For example you don't have a character device for a
+  certain UART of a serial interface, a certain sound chip in your
+  computer, a SCSI or IDE controller providing access to your hard
+  disk or tape streamer device.  Instead, you have abstraction layers
+  which provide a unified character or block device interface to the
+  application on the one hand, and a interface for hardware-specific
+  device drivers on the other hand.  These abstractions are provided
+  by subsystems like the tty layer, the audio subsystem or the SCSI
+  and IDE subsystems for the devices mentioned above.
+
+  The easiest way to implement a CAN device driver is as a character
+  device without such a (complete) abstraction layer, as is done by most
+  existing drivers.  The right way, however, would be to add such a
+  layer with all the functionality like registering for certain CAN
+  IDs, supporting several open file descriptors and (de)multiplexing
+  CAN frames between them, (sophisticated) queueing of CAN frames, and
+  providing an API for device drivers to register with.  However, then
+  it would be no more difficult, or may be even easier, to use the
+  networking framework provided by the Linux kernel, and this is what
+  SocketCAN does.
+
+The use of the networking framework of the Linux kernel is just the
+natural and most appropriate way to implement CAN for Linux.
+
+
+.. _socketcan-concept:
+
+SocketCAN Concept
+=================
+
+As described in :ref:`socketcan-motivation` the main goal of SocketCAN is to
+provide a socket interface to user space applications which builds
+upon the Linux network layer. In contrast to the commonly known
+TCP/IP and ethernet networking, the CAN bus is a broadcast-only(!)
+medium that has no MAC-layer addressing like ethernet. The CAN-identifier
+(can_id) is used for arbitration on the CAN-bus. Therefore the CAN-IDs
+have to be chosen uniquely on the bus. When designing a CAN-ECU
+network the CAN-IDs are mapped to be sent by a specific ECU.
+For this reason a CAN-ID can be treated best as a kind of source address.
+
+
+.. _socketcan-receive-lists:
+
+Receive Lists
+-------------
+
+The network transparent access of multiple applications leads to the
+problem that different applications may be interested in the same
+CAN-IDs from the same CAN network interface. The SocketCAN core
+module - which implements the protocol family CAN - provides several
+high efficient receive lists for this reason. If e.g. a user space
+application opens a CAN RAW socket, the raw protocol module itself
+requests the (range of) CAN-IDs from the SocketCAN core that are
+requested by the user. The subscription and unsubscription of
+CAN-IDs can be done for specific CAN interfaces or for all(!) known
+CAN interfaces with the can_rx_(un)register() functions provided to
+CAN protocol modules by the SocketCAN core (see :ref:`socketcan-core-module`).
+To optimize the CPU usage at runtime the receive lists are split up
+into several specific lists per device that match the requested
+filter complexity for a given use-case.
+
+
+.. _socketcan-local-loopback1:
+
+Local Loopback of Sent Frames
+-----------------------------
+
+As known from other networking concepts the data exchanging
+applications may run on the same or different nodes without any
+change (except for the according addressing information):
+
+.. code::
+
+	 ___   ___   ___                   _______   ___
+	| _ | | _ | | _ |                 | _   _ | | _ |
+	||A|| ||B|| ||C||                 ||A| |B|| ||C||
+	|___| |___| |___|                 |_______| |___|
+	  |     |     |                       |       |
+	-----------------(1)- CAN bus -(2)---------------
+
+To ensure that application A receives the same information in the
+example (2) as it would receive in example (1) there is need for
+some kind of local loopback of the sent CAN frames on the appropriate
+node.
+
+The Linux network devices (by default) just can handle the
+transmission and reception of media dependent frames. Due to the
+arbitration on the CAN bus the transmission of a low prio CAN-ID
+may be delayed by the reception of a high prio CAN frame. To
+reflect the correct [#f1]_ traffic on the node the loopback of the sent
+data has to be performed right after a successful transmission. If
+the CAN network interface is not capable of performing the loopback for
+some reason the SocketCAN core can do this task as a fallback solution.
+See :ref:`socketcan-local-loopback1` for details (recommended).
+
+The loopback functionality is enabled by default to reflect standard
+networking behaviour for CAN applications. Due to some requests from
+the RT-SocketCAN group the loopback optionally may be disabled for each
+separate socket. See sockopts from the CAN RAW sockets in :ref:`socketcan-raw-sockets`.
+
+.. [#f1] you really like to have this when you're running analyser
+       tools like 'candump' or 'cansniffer' on the (same) node.
+
+
+.. _socketcan-network-problem-notifications:
+
+Network Problem Notifications
+-----------------------------
+
+The use of the CAN bus may lead to several problems on the physical
+and media access control layer. Detecting and logging of these lower
+layer problems is a vital requirement for CAN users to identify
+hardware issues on the physical transceiver layer as well as
+arbitration problems and error frames caused by the different
+ECUs. The occurrence of detected errors are important for diagnosis
+and have to be logged together with the exact timestamp. For this
+reason the CAN interface driver can generate so called Error Message
+Frames that can optionally be passed to the user application in the
+same way as other CAN frames. Whenever an error on the physical layer
+or the MAC layer is detected (e.g. by the CAN controller) the driver
+creates an appropriate error message frame. Error messages frames can
+be requested by the user application using the common CAN filter
+mechanisms. Inside this filter definition the (interested) type of
+errors may be selected. The reception of error messages is disabled
+by default. The format of the CAN error message frame is briefly
+described in the Linux header file "include/uapi/linux/can/error.h".
+
+
+How to use SocketCAN
+====================
+
+Like TCP/IP, you first need to open a socket for communicating over a
+CAN network. Since SocketCAN implements a new protocol family, you
+need to pass PF_CAN as the first argument to the socket(2) system
+call. Currently, there are two CAN protocols to choose from, the raw
+socket protocol and the broadcast manager (BCM). So to open a socket,
+you would write::
+
+    s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
+
+and::
+
+    s = socket(PF_CAN, SOCK_DGRAM, CAN_BCM);
+
+respectively.  After the successful creation of the socket, you would
+normally use the bind(2) system call to bind the socket to a CAN
+interface (which is different from TCP/IP due to different addressing
+- see :ref:`socketcan-concept`). After binding (CAN_RAW) or connecting (CAN_BCM)
+the socket, you can read(2) and write(2) from/to the socket or use
+send(2), sendto(2), sendmsg(2) and the recv* counterpart operations
+on the socket as usual. There are also CAN specific socket options
+described below.
+
+The basic CAN frame structure and the sockaddr structure are defined
+in include/linux/can.h:
+
+.. code-block:: C
+
+    struct can_frame {
+            canid_t can_id;  /* 32 bit CAN_ID + EFF/RTR/ERR flags */
+            __u8    can_dlc; /* frame payload length in byte (0 .. 8) */
+            __u8    __pad;   /* padding */
+            __u8    __res0;  /* reserved / padding */
+            __u8    __res1;  /* reserved / padding */
+            __u8    data[8] __attribute__((aligned(8)));
+    };
+
+The alignment of the (linear) payload data[] to a 64bit boundary
+allows the user to define their own structs and unions to easily access
+the CAN payload. There is no given byteorder on the CAN bus by
+default. A read(2) system call on a CAN_RAW socket transfers a
+struct can_frame to the user space.
+
+The sockaddr_can structure has an interface index like the
+PF_PACKET socket, that also binds to a specific interface:
+
+.. code-block:: C
+
+    struct sockaddr_can {
+            sa_family_t can_family;
+            int         can_ifindex;
+            union {
+                    /* transport protocol class address info (e.g. ISOTP) */
+                    struct { canid_t rx_id, tx_id; } tp;
+
+                    /* reserved for future CAN protocols address information */
+            } can_addr;
+    };
+
+To determine the interface index an appropriate ioctl() has to
+be used (example for CAN_RAW sockets without error checking):
+
+.. code-block:: C
+
+    int s;
+    struct sockaddr_can addr;
+    struct ifreq ifr;
+
+    s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
+
+    strcpy(ifr.ifr_name, "can0" );
+    ioctl(s, SIOCGIFINDEX, &ifr);
+
+    addr.can_family = AF_CAN;
+    addr.can_ifindex = ifr.ifr_ifindex;
+
+    bind(s, (struct sockaddr *)&addr, sizeof(addr));
+
+    (..)
+
+To bind a socket to all(!) CAN interfaces the interface index must
+be 0 (zero). In this case the socket receives CAN frames from every
+enabled CAN interface. To determine the originating CAN interface
+the system call recvfrom(2) may be used instead of read(2). To send
+on a socket that is bound to 'any' interface sendto(2) is needed to
+specify the outgoing interface.
+
+Reading CAN frames from a bound CAN_RAW socket (see above) consists
+of reading a struct can_frame:
+
+.. code-block:: C
+
+    struct can_frame frame;
+
+    nbytes = read(s, &frame, sizeof(struct can_frame));
+
+    if (nbytes < 0) {
+            perror("can raw socket read");
+            return 1;
+    }
+
+    /* paranoid check ... */
+    if (nbytes < sizeof(struct can_frame)) {
+            fprintf(stderr, "read: incomplete CAN frame\n");
+            return 1;
+    }
+
+    /* do something with the received CAN frame */
+
+Writing CAN frames can be done similarly, with the write(2) system call::
+
+    nbytes = write(s, &frame, sizeof(struct can_frame));
+
+When the CAN interface is bound to 'any' existing CAN interface
+(addr.can_ifindex = 0) it is recommended to use recvfrom(2) if the
+information about the originating CAN interface is needed:
+
+.. code-block:: C
+
+    struct sockaddr_can addr;
+    struct ifreq ifr;
+    socklen_t len = sizeof(addr);
+    struct can_frame frame;
+
+    nbytes = recvfrom(s, &frame, sizeof(struct can_frame),
+                      0, (struct sockaddr*)&addr, &len);
+
+    /* get interface name of the received CAN frame */
+    ifr.ifr_ifindex = addr.can_ifindex;
+    ioctl(s, SIOCGIFNAME, &ifr);
+    printf("Received a CAN frame from interface %s", ifr.ifr_name);
+
+To write CAN frames on sockets bound to 'any' CAN interface the
+outgoing interface has to be defined certainly:
+
+.. code-block:: C
+
+    strcpy(ifr.ifr_name, "can0");
+    ioctl(s, SIOCGIFINDEX, &ifr);
+    addr.can_ifindex = ifr.ifr_ifindex;
+    addr.can_family  = AF_CAN;
+
+    nbytes = sendto(s, &frame, sizeof(struct can_frame),
+                    0, (struct sockaddr*)&addr, sizeof(addr));
+
+An accurate timestamp can be obtained with an ioctl(2) call after reading
+a message from the socket:
+
+.. code-block:: C
+
+    struct timeval tv;
+    ioctl(s, SIOCGSTAMP, &tv);
+
+The timestamp has a resolution of one microsecond and is set automatically
+at the reception of a CAN frame.
+
+Remark about CAN FD (flexible data rate) support:
+
+Generally the handling of CAN FD is very similar to the formerly described
+examples. The new CAN FD capable CAN controllers support two different
+bitrates for the arbitration phase and the payload phase of the CAN FD frame
+and up to 64 bytes of payload. This extended payload length breaks all the
+kernel interfaces (ABI) which heavily rely on the CAN frame with fixed eight
+bytes of payload (struct can_frame) like the CAN_RAW socket. Therefore e.g.
+the CAN_RAW socket supports a new socket option CAN_RAW_FD_FRAMES that
+switches the socket into a mode that allows the handling of CAN FD frames
+and (legacy) CAN frames simultaneously (see :ref:`socketcan-rawfd`).
+
+The struct canfd_frame is defined in include/linux/can.h:
+
+.. code-block:: C
+
+    struct canfd_frame {
+            canid_t can_id;  /* 32 bit CAN_ID + EFF/RTR/ERR flags */
+            __u8    len;     /* frame payload length in byte (0 .. 64) */
+            __u8    flags;   /* additional flags for CAN FD */
+            __u8    __res0;  /* reserved / padding */
+            __u8    __res1;  /* reserved / padding */
+            __u8    data[64] __attribute__((aligned(8)));
+    };
+
+The struct canfd_frame and the existing struct can_frame have the can_id,
+the payload length and the payload data at the same offset inside their
+structures. This allows to handle the different structures very similar.
+When the content of a struct can_frame is copied into a struct canfd_frame
+all structure elements can be used as-is - only the data[] becomes extended.
+
+When introducing the struct canfd_frame it turned out that the data length
+code (DLC) of the struct can_frame was used as a length information as the
+length and the DLC has a 1:1 mapping in the range of 0 .. 8. To preserve
+the easy handling of the length information the canfd_frame.len element
+contains a plain length value from 0 .. 64. So both canfd_frame.len and
+can_frame.can_dlc are equal and contain a length information and no DLC.
+For details about the distinction of CAN and CAN FD capable devices and
+the mapping to the bus-relevant data length code (DLC), see :ref:`socketcan-can-fd-driver`.
+
+The length of the two CAN(FD) frame structures define the maximum transfer
+unit (MTU) of the CAN(FD) network interface and skbuff data length. Two
+definitions are specified for CAN specific MTUs in include/linux/can.h:
+
+.. code-block:: C
+
+  #define CAN_MTU   (sizeof(struct can_frame))   == 16  => 'legacy' CAN frame
+  #define CANFD_MTU (sizeof(struct canfd_frame)) == 72  => CAN FD frame
+
+
+.. _socketcan-raw-sockets:
+
+RAW Protocol Sockets with can_filters (SOCK_RAW)
+------------------------------------------------
+
+Using CAN_RAW sockets is extensively comparable to the commonly
+known access to CAN character devices. To meet the new possibilities
+provided by the multi user SocketCAN approach, some reasonable
+defaults are set at RAW socket binding time:
+
+- The filters are set to exactly one filter receiving everything
+- The socket only receives valid data frames (=> no error message frames)
+- The loopback of sent CAN frames is enabled (see :ref:`socketcan-local-loopback2`)
+- The socket does not receive its own sent frames (in loopback mode)
+
+These default settings may be changed before or after binding the socket.
+To use the referenced definitions of the socket options for CAN_RAW
+sockets, include <linux/can/raw.h>.
+
+
+.. _socketcan-rawfilter:
+
+RAW socket option CAN_RAW_FILTER
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The reception of CAN frames using CAN_RAW sockets can be controlled
+by defining 0 .. n filters with the CAN_RAW_FILTER socket option.
+
+The CAN filter structure is defined in include/linux/can.h:
+
+.. code-block:: C
+
+    struct can_filter {
+            canid_t can_id;
+            canid_t can_mask;
+    };
+
+A filter matches, when:
+
+.. code-block:: C
+
+    <received_can_id> & mask == can_id & mask
+
+which is analogous to known CAN controllers hardware filter semantics.
+The filter can be inverted in this semantic, when the CAN_INV_FILTER
+bit is set in can_id element of the can_filter structure. In
+contrast to CAN controller hardware filters the user may set 0 .. n
+receive filters for each open socket separately:
+
+.. code-block:: C
+
+    struct can_filter rfilter[2];
+
+    rfilter[0].can_id   = 0x123;
+    rfilter[0].can_mask = CAN_SFF_MASK;
+    rfilter[1].can_id   = 0x200;
+    rfilter[1].can_mask = 0x700;
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_FILTER, &rfilter, sizeof(rfilter));
+
+To disable the reception of CAN frames on the selected CAN_RAW socket:
+
+.. code-block:: C
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_FILTER, NULL, 0);
+
+To set the filters to zero filters is quite obsolete as to not read
+data causes the raw socket to discard the received CAN frames. But
+having this 'send only' use-case we may remove the receive list in the
+Kernel to save a little (really a very little!) CPU usage.
+
+CAN Filter Usage Optimisation
+.............................
+
+The CAN filters are processed in per-device filter lists at CAN frame
+reception time. To reduce the number of checks that need to be performed
+while walking through the filter lists the CAN core provides an optimized
+filter handling when the filter subscription focusses on a single CAN ID.
+
+For the possible 2048 SFF CAN identifiers the identifier is used as an index
+to access the corresponding subscription list without any further checks.
+For the 2^29 possible EFF CAN identifiers a 10 bit XOR folding is used as
+hash function to retrieve the EFF table index.
+
+To benefit from the optimized filters for single CAN identifiers the
+CAN_SFF_MASK or CAN_EFF_MASK have to be set into can_filter.mask together
+with set CAN_EFF_FLAG and CAN_RTR_FLAG bits. A set CAN_EFF_FLAG bit in the
+can_filter.mask makes clear that it matters whether a SFF or EFF CAN ID is
+subscribed. E.g. in the example from above:
+
+.. code-block:: C
+
+    rfilter[0].can_id   = 0x123;
+    rfilter[0].can_mask = CAN_SFF_MASK;
+
+both SFF frames with CAN ID 0x123 and EFF frames with 0xXXXXX123 can pass.
+
+To filter for only 0x123 (SFF) and 0x12345678 (EFF) CAN identifiers the
+filter has to be defined in this way to benefit from the optimized filters:
+
+.. code-block:: C
+
+    struct can_filter rfilter[2];
+
+    rfilter[0].can_id   = 0x123;
+    rfilter[0].can_mask = (CAN_EFF_FLAG | CAN_RTR_FLAG | CAN_SFF_MASK);
+    rfilter[1].can_id   = 0x12345678 | CAN_EFF_FLAG;
+    rfilter[1].can_mask = (CAN_EFF_FLAG | CAN_RTR_FLAG | CAN_EFF_MASK);
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_FILTER, &rfilter, sizeof(rfilter));
+
+
+RAW Socket Option CAN_RAW_ERR_FILTER
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As described in :ref:`socketcan-network-problem-notifications` the CAN interface driver can generate so
+called Error Message Frames that can optionally be passed to the user
+application in the same way as other CAN frames. The possible
+errors are divided into different error classes that may be filtered
+using the appropriate error mask. To register for every possible
+error condition CAN_ERR_MASK can be used as value for the error mask.
+The values for the error mask are defined in linux/can/error.h:
+
+.. code-block:: C
+
+    can_err_mask_t err_mask = ( CAN_ERR_TX_TIMEOUT | CAN_ERR_BUSOFF );
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
+               &err_mask, sizeof(err_mask));
+
+
+RAW Socket Option CAN_RAW_LOOPBACK
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To meet multi user needs the local loopback is enabled by default
+(see :ref:`socketcan-local-loopback1` for details). But in some embedded use-cases
+(e.g. when only one application uses the CAN bus) this loopback
+functionality can be disabled (separately for each socket):
+
+.. code-block:: C
+
+    int loopback = 0; /* 0 = disabled, 1 = enabled (default) */
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_LOOPBACK, &loopback, sizeof(loopback));
+
+
+RAW socket option CAN_RAW_RECV_OWN_MSGS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When the local loopback is enabled, all the sent CAN frames are
+looped back to the open CAN sockets that registered for the CAN
+frames' CAN-ID on this given interface to meet the multi user
+needs. The reception of the CAN frames on the same socket that was
+sending the CAN frame is assumed to be unwanted and therefore
+disabled by default. This default behaviour may be changed on
+demand:
+
+.. code-block:: C
+
+    int recv_own_msgs = 1; /* 0 = disabled (default), 1 = enabled */
+
+    setsockopt(s, SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS,
+               &recv_own_msgs, sizeof(recv_own_msgs));
+
+
+.. _socketcan-rawfd:
+
+RAW Socket Option CAN_RAW_FD_FRAMES
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+CAN FD support in CAN_RAW sockets can be enabled with a new socket option
+CAN_RAW_FD_FRAMES which is off by default. When the new socket option is
+not supported by the CAN_RAW socket (e.g. on older kernels), switching the
+CAN_RAW_FD_FRAMES option returns the error -ENOPROTOOPT.
+
+Once CAN_RAW_FD_FRAMES is enabled the application can send both CAN frames
+and CAN FD frames. OTOH the application has to handle CAN and CAN FD frames
+when reading from the socket:
+
+.. code-block:: C
+
+    CAN_RAW_FD_FRAMES enabled:  CAN_MTU and CANFD_MTU are allowed
+    CAN_RAW_FD_FRAMES disabled: only CAN_MTU is allowed (default)
+
+Example:
+
+.. code-block:: C
+
+    [ remember: CANFD_MTU == sizeof(struct canfd_frame) ]
+
+    struct canfd_frame cfd;
+
+    nbytes = read(s, &cfd, CANFD_MTU);
+
+    if (nbytes == CANFD_MTU) {
+            printf("got CAN FD frame with length %d\n", cfd.len);
+            /* cfd.flags contains valid data */
+    } else if (nbytes == CAN_MTU) {
+            printf("got legacy CAN frame with length %d\n", cfd.len);
+            /* cfd.flags is undefined */
+    } else {
+            fprintf(stderr, "read: invalid CAN(FD) frame\n");
+            return 1;
+    }
+
+    /* the content can be handled independently from the received MTU size */
+
+    printf("can_id: %X data length: %d data: ", cfd.can_id, cfd.len);
+    for (i = 0; i < cfd.len; i++)
+            printf("%02X ", cfd.data[i]);
+
+When reading with size CANFD_MTU only returns CAN_MTU bytes that have
+been received from the socket a legacy CAN frame has been read into the
+provided CAN FD structure. Note that the canfd_frame.flags data field is
+not specified in the struct can_frame and therefore it is only valid in
+CANFD_MTU sized CAN FD frames.
+
+Implementation hint for new CAN applications:
+
+To build a CAN FD aware application use struct canfd_frame as basic CAN
+data structure for CAN_RAW based applications. When the application is
+executed on an older Linux kernel and switching the CAN_RAW_FD_FRAMES
+socket option returns an error: No problem. You'll get legacy CAN frames
+or CAN FD frames and can process them the same way.
+
+When sending to CAN devices make sure that the device is capable to handle
+CAN FD frames by checking if the device maximum transfer unit is CANFD_MTU.
+The CAN device MTU can be retrieved e.g. with a SIOCGIFMTU ioctl() syscall.
+
+
+RAW socket option CAN_RAW_JOIN_FILTERS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The CAN_RAW socket can set multiple CAN identifier specific filters that
+lead to multiple filters in the af_can.c filter processing. These filters
+are indenpendent from each other which leads to logical OR'ed filters when
+applied (see :ref:`socketcan-rawfilter`).
+
+This socket option joines the given CAN filters in the way that only CAN
+frames are passed to user space that matched *all* given CAN filters. The
+semantic for the applied filters is therefore changed to a logical AND.
+
+This is useful especially when the filterset is a combination of filters
+where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or
+CAN ID ranges from the incoming traffic.
+
+
+RAW Socket Returned Message Flags
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When using recvmsg() call, the msg->msg_flags may contain following flags:
+
+MSG_DONTROUTE:
+	set when the received frame was created on the local host.
+
+MSG_CONFIRM:
+	set when the frame was sent via the socket it is received on.
+	This flag can be interpreted as a 'transmission confirmation' when the
+	CAN driver supports the echo of frames on driver level, see
+	:ref:`socketcan-local-loopback1` and :ref:`socketcan-local-loopback2`.
+	In order to receive such messages, CAN_RAW_RECV_OWN_MSGS must be set.
+
+
+Broadcast Manager Protocol Sockets (SOCK_DGRAM)
+-----------------------------------------------
+
+The Broadcast Manager protocol provides a command based configuration
+interface to filter and send (e.g. cyclic) CAN messages in kernel space.
+
+Receive filters can be used to down sample frequent messages; detect events
+such as message contents changes, packet length changes, and do time-out
+monitoring of received messages.
+
+Periodic transmission tasks of CAN frames or a sequence of CAN frames can be
+created and modified at runtime; both the message content and the two
+possible transmit intervals can be altered.
+
+A BCM socket is not intended for sending individual CAN frames using the
+struct can_frame as known from the CAN_RAW socket. Instead a special BCM
+configuration message is defined. The basic BCM configuration message used
+to communicate with the broadcast manager and the available operations are
+defined in the linux/can/bcm.h include. The BCM message consists of a
+message header with a command ('opcode') followed by zero or more CAN frames.
+The broadcast manager sends responses to user space in the same form:
+
+.. code-block:: C
+
+    struct bcm_msg_head {
+            __u32 opcode;                   /* command */
+            __u32 flags;                    /* special flags */
+            __u32 count;                    /* run 'count' times with ival1 */
+            struct timeval ival1, ival2;    /* count and subsequent interval */
+            canid_t can_id;                 /* unique can_id for task */
+            __u32 nframes;                  /* number of can_frames following */
+            struct can_frame frames[0];
+    };
+
+The aligned payload 'frames' uses the same basic CAN frame structure defined
+at the beginning of :ref:`socketcan-rawfd` and in the include/linux/can.h include. All
+messages to the broadcast manager from user space have this structure.
+
+Note a CAN_BCM socket must be connected instead of bound after socket
+creation (example without error checking):
+
+.. code-block:: C
+
+    int s;
+    struct sockaddr_can addr;
+    struct ifreq ifr;
+
+    s = socket(PF_CAN, SOCK_DGRAM, CAN_BCM);
+
+    strcpy(ifr.ifr_name, "can0");
+    ioctl(s, SIOCGIFINDEX, &ifr);
+
+    addr.can_family = AF_CAN;
+    addr.can_ifindex = ifr.ifr_ifindex;
+
+    connect(s, (struct sockaddr *)&addr, sizeof(addr));
+
+    (..)
+
+The broadcast manager socket is able to handle any number of in flight
+transmissions or receive filters concurrently. The different RX/TX jobs are
+distinguished by the unique can_id in each BCM message. However additional
+CAN_BCM sockets are recommended to communicate on multiple CAN interfaces.
+When the broadcast manager socket is bound to 'any' CAN interface (=> the
+interface index is set to zero) the configured receive filters apply to any
+CAN interface unless the sendto() syscall is used to overrule the 'any' CAN
+interface index. When using recvfrom() instead of read() to retrieve BCM
+socket messages the originating CAN interface is provided in can_ifindex.
+
+
+Broadcast Manager Operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The opcode defines the operation for the broadcast manager to carry out,
+or details the broadcast managers response to several events, including
+user requests.
+
+Transmit Operations (user space to broadcast manager):
+
+TX_SETUP:
+	Create (cyclic) transmission task.
+
+TX_DELETE:
+	Remove (cyclic) transmission task, requires only can_id.
+
+TX_READ:
+	Read properties of (cyclic) transmission task for can_id.
+
+TX_SEND:
+	Send one CAN frame.
+
+Transmit Responses (broadcast manager to user space):
+
+TX_STATUS:
+	Reply to TX_READ request (transmission task configuration).
+
+TX_EXPIRED:
+	Notification when counter finishes sending at initial interval
+	'ival1'. Requires the TX_COUNTEVT flag to be set at TX_SETUP.
+
+Receive Operations (user space to broadcast manager):
+
+RX_SETUP:
+	Create RX content filter subscription.
+
+RX_DELETE:
+	Remove RX content filter subscription, requires only can_id.
+
+RX_READ:
+	Read properties of RX content filter subscription for can_id.
+
+Receive Responses (broadcast manager to user space):
+
+RX_STATUS:
+	Reply to RX_READ request (filter task configuration).
+
+RX_TIMEOUT:
+	Cyclic message is detected to be absent (timer ival1 expired).
+
+RX_CHANGED:
+	BCM message with updated CAN frame (detected content change).
+	Sent on first message received or on receipt of revised CAN messages.
+
+
+Broadcast Manager Message Flags
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When sending a message to the broadcast manager the 'flags' element may
+contain the following flag definitions which influence the behaviour:
+
+SETTIMER:
+	Set the values of ival1, ival2 and count
+
+STARTTIMER:
+	Start the timer with the actual values of ival1, ival2
+	and count. Starting the timer leads simultaneously to emit a CAN frame.
+
+TX_COUNTEVT:
+	Create the message TX_EXPIRED when count expires
+
+TX_ANNOUNCE:
+	A change of data by the process is emitted immediately.
+
+TX_CP_CAN_ID:
+	Copies the can_id from the message header to each
+	subsequent frame in frames. This is intended as usage simplification. For
+	TX tasks the unique can_id from the message header may differ from the
+	can_id(s) stored for transmission in the subsequent struct can_frame(s).
+
+RX_FILTER_ID:
+	Filter by can_id alone, no frames required (nframes=0).
+
+RX_CHECK_DLC:
+	A change of the DLC leads to an RX_CHANGED.
+
+RX_NO_AUTOTIMER:
+	Prevent automatically starting the timeout monitor.
+
+RX_ANNOUNCE_RESUME:
+	If passed at RX_SETUP and a receive timeout occurred, a
+	RX_CHANGED message will be generated when the (cyclic) receive restarts.
+
+TX_RESET_MULTI_IDX:
+	Reset the index for the multiple frame transmission.
+
+RX_RTR_FRAME:
+	Send reply for RTR-request (placed in op->frames[0]).
+
+
+Broadcast Manager Transmission Timers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Periodic transmission configurations may use up to two interval timers.
+In this case the BCM sends a number of messages ('count') at an interval
+'ival1', then continuing to send at another given interval 'ival2'. When
+only one timer is needed 'count' is set to zero and only 'ival2' is used.
+When SET_TIMER and START_TIMER flag were set the timers are activated.
+The timer values can be altered at runtime when only SET_TIMER is set.
+
+
+Broadcast Manager message sequence transmission
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Up to 256 CAN frames can be transmitted in a sequence in the case of a cyclic
+TX task configuration. The number of CAN frames is provided in the 'nframes'
+element of the BCM message head. The defined number of CAN frames are added
+as array to the TX_SETUP BCM configuration message:
+
+.. code-block:: C
+
+    /* create a struct to set up a sequence of four CAN frames */
+    struct {
+            struct bcm_msg_head msg_head;
+            struct can_frame frame[4];
+    } mytxmsg;
+
+    (..)
+    mytxmsg.msg_head.nframes = 4;
+    (..)
+
+    write(s, &mytxmsg, sizeof(mytxmsg));
+
+With every transmission the index in the array of CAN frames is increased
+and set to zero at index overflow.
+
+
+Broadcast Manager Receive Filter Timers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The timer values ival1 or ival2 may be set to non-zero values at RX_SETUP.
+When the SET_TIMER flag is set the timers are enabled:
+
+ival1:
+	Send RX_TIMEOUT when a received message is not received again within
+	the given time. When START_TIMER is set at RX_SETUP the timeout detection
+	is activated directly - even without a former CAN frame reception.
+
+ival2:
+	Throttle the received message rate down to the value of ival2. This
+	is useful to reduce messages for the application when the signal inside the
+	CAN frame is stateless as state changes within the ival2 periode may get
+	lost.
+
+Broadcast Manager Multiplex Message Receive Filter
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To filter for content changes in multiplex message sequences an array of more
+than one CAN frames can be passed in a RX_SETUP configuration message. The
+data bytes of the first CAN frame contain the mask of relevant bits that
+have to match in the subsequent CAN frames with the received CAN frame.
+If one of the subsequent CAN frames is matching the bits in that frame data
+mark the relevant content to be compared with the previous received content.
+Up to 257 CAN frames (multiplex filter bit mask CAN frame plus 256 CAN
+filters) can be added as array to the TX_SETUP BCM configuration message:
+
+.. code-block:: C
+
+    /* usually used to clear CAN frame data[] - beware of endian problems! */
+    #define U64_DATA(p) (*(unsigned long long*)(p)->data)
+
+    struct {
+            struct bcm_msg_head msg_head;
+            struct can_frame frame[5];
+    } msg;
+
+    msg.msg_head.opcode  = RX_SETUP;
+    msg.msg_head.can_id  = 0x42;
+    msg.msg_head.flags   = 0;
+    msg.msg_head.nframes = 5;
+    U64_DATA(&msg.frame[0]) = 0xFF00000000000000ULL; /* MUX mask */
+    U64_DATA(&msg.frame[1]) = 0x01000000000000FFULL; /* data mask (MUX 0x01) */
+    U64_DATA(&msg.frame[2]) = 0x0200FFFF000000FFULL; /* data mask (MUX 0x02) */
+    U64_DATA(&msg.frame[3]) = 0x330000FFFFFF0003ULL; /* data mask (MUX 0x33) */
+    U64_DATA(&msg.frame[4]) = 0x4F07FC0FF0000000ULL; /* data mask (MUX 0x4F) */
+
+    write(s, &msg, sizeof(msg));
+
+
+Broadcast Manager CAN FD Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The programming API of the CAN_BCM depends on struct can_frame which is
+given as array directly behind the bcm_msg_head structure. To follow this
+schema for the CAN FD frames a new flag 'CAN_FD_FRAME' in the bcm_msg_head
+flags indicates that the concatenated CAN frame structures behind the
+bcm_msg_head are defined as struct canfd_frame:
+
+.. code-block:: C
+
+    struct {
+            struct bcm_msg_head msg_head;
+            struct canfd_frame frame[5];
+    } msg;
+
+    msg.msg_head.opcode  = RX_SETUP;
+    msg.msg_head.can_id  = 0x42;
+    msg.msg_head.flags   = CAN_FD_FRAME;
+    msg.msg_head.nframes = 5;
+    (..)
+
+When using CAN FD frames for multiplex filtering the MUX mask is still
+expected in the first 64 bit of the struct canfd_frame data section.
+
+
+Connected Transport Protocols (SOCK_SEQPACKET)
+----------------------------------------------
+
+(to be written)
+
+
+Unconnected Transport Protocols (SOCK_DGRAM)
+--------------------------------------------
+
+(to be written)
+
+
+.. _socketcan-core-module:
+
+SocketCAN Core Module
+=====================
+
+The SocketCAN core module implements the protocol family
+PF_CAN. CAN protocol modules are loaded by the core module at
+runtime. The core module provides an interface for CAN protocol
+modules to subscribe needed CAN IDs (see :ref:`socketcan-receive-lists`).
+
+
+can.ko Module Params
+--------------------
+
+- **stats_timer**:
+  To calculate the SocketCAN core statistics
+  (e.g. current/maximum frames per second) this 1 second timer is
+  invoked at can.ko module start time by default. This timer can be
+  disabled by using stattimer=0 on the module commandline.
+
+- **debug**:
+  (removed since SocketCAN SVN r546)
+
+
+procfs content
+--------------
+
+As described in :ref:`socketcan-receive-lists` the SocketCAN core uses several filter
+lists to deliver received CAN frames to CAN protocol modules. These
+receive lists, their filters and the count of filter matches can be
+checked in the appropriate receive list. All entries contain the
+device and a protocol module identifier::
+
+    foo@bar:~$ cat /proc/net/can/rcvlist_all
+
+    receive list 'rx_all':
+      (vcan3: no entry)
+      (vcan2: no entry)
+      (vcan1: no entry)
+      device   can_id   can_mask  function  userdata   matches  ident
+       vcan0     000    00000000  f88e6370  f6c6f400         0  raw
+      (any: no entry)
+
+In this example an application requests any CAN traffic from vcan0::
+
+    rcvlist_all - list for unfiltered entries (no filter operations)
+    rcvlist_eff - list for single extended frame (EFF) entries
+    rcvlist_err - list for error message frames masks
+    rcvlist_fil - list for mask/value filters
+    rcvlist_inv - list for mask/value filters (inverse semantic)
+    rcvlist_sff - list for single standard frame (SFF) entries
+
+Additional procfs files in /proc/net/can::
+
+    stats       - SocketCAN core statistics (rx/tx frames, match ratios, ...)
+    reset_stats - manual statistic reset
+    version     - prints the SocketCAN core version and the ABI version
+
+
+Writing Own CAN Protocol Modules
+--------------------------------
+
+To implement a new protocol in the protocol family PF_CAN a new
+protocol has to be defined in include/linux/can.h .
+The prototypes and definitions to use the SocketCAN core can be
+accessed by including include/linux/can/core.h .
+In addition to functions that register the CAN protocol and the
+CAN device notifier chain there are functions to subscribe CAN
+frames received by CAN interfaces and to send CAN frames::
+
+    can_rx_register   - subscribe CAN frames from a specific interface
+    can_rx_unregister - unsubscribe CAN frames from a specific interface
+    can_send          - transmit a CAN frame (optional with local loopback)
+
+For details see the kerneldoc documentation in net/can/af_can.c or
+the source code of net/can/raw.c or net/can/bcm.c .
+
+
+CAN Network Drivers
+===================
+
+Writing a CAN network device driver is much easier than writing a
+CAN character device driver. Similar to other known network device
+drivers you mainly have to deal with:
+
+- TX: Put the CAN frame from the socket buffer to the CAN controller.
+- RX: Put the CAN frame from the CAN controller to the socket buffer.
+
+See e.g. at Documentation/networking/netdevices.txt . The differences
+for writing CAN network device driver are described below:
+
+
+General Settings
+----------------
+
+.. code-block:: C
+
+    dev->type  = ARPHRD_CAN; /* the netdevice hardware type */
+    dev->flags = IFF_NOARP;  /* CAN has no arp */
+
+    dev->mtu = CAN_MTU; /* sizeof(struct can_frame) -> legacy CAN interface */
+
+    or alternative, when the controller supports CAN with flexible data rate:
+    dev->mtu = CANFD_MTU; /* sizeof(struct canfd_frame) -> CAN FD interface */
+
+The struct can_frame or struct canfd_frame is the payload of each socket
+buffer (skbuff) in the protocol family PF_CAN.
+
+
+.. _socketcan-local-loopback2:
+
+Local Loopback of Sent Frames
+-----------------------------
+
+As described in :ref:`socketcan-local-loopback1` the CAN network device driver should
+support a local loopback functionality similar to the local echo
+e.g. of tty devices. In this case the driver flag IFF_ECHO has to be
+set to prevent the PF_CAN core from locally echoing sent frames
+(aka loopback) as fallback solution::
+
+    dev->flags = (IFF_NOARP | IFF_ECHO);
+
+
+CAN Controller Hardware Filters
+-------------------------------
+
+To reduce the interrupt load on deep embedded systems some CAN
+controllers support the filtering of CAN IDs or ranges of CAN IDs.
+These hardware filter capabilities vary from controller to
+controller and have to be identified as not feasible in a multi-user
+networking approach. The use of the very controller specific
+hardware filters could make sense in a very dedicated use-case, as a
+filter on driver level would affect all users in the multi-user
+system. The high efficient filter sets inside the PF_CAN core allow
+to set different multiple filters for each socket separately.
+Therefore the use of hardware filters goes to the category 'handmade
+tuning on deep embedded systems'. The author is running a MPC603e
+@133MHz with four SJA1000 CAN controllers from 2002 under heavy bus
+load without any problems ...
+
+
+The Virtual CAN Driver (vcan)
+-----------------------------
+
+Similar to the network loopback devices, vcan offers a virtual local
+CAN interface. A full qualified address on CAN consists of
+
+- a unique CAN Identifier (CAN ID)
+- the CAN bus this CAN ID is transmitted on (e.g. can0)
+
+so in common use cases more than one virtual CAN interface is needed.
+
+The virtual CAN interfaces allow the transmission and reception of CAN
+frames without real CAN controller hardware. Virtual CAN network
+devices are usually named 'vcanX', like vcan0 vcan1 vcan2 ...
+When compiled as a module the virtual CAN driver module is called vcan.ko
+
+Since Linux Kernel version 2.6.24 the vcan driver supports the Kernel
+netlink interface to create vcan network devices. The creation and
+removal of vcan network devices can be managed with the ip(8) tool::
+
+  - Create a virtual CAN network interface:
+       $ ip link add type vcan
+
+  - Create a virtual CAN network interface with a specific name 'vcan42':
+       $ ip link add dev vcan42 type vcan
+
+  - Remove a (virtual CAN) network interface 'vcan42':
+       $ ip link del vcan42
+
+
+The CAN Network Device Driver Interface
+---------------------------------------
+
+The CAN network device driver interface provides a generic interface
+to setup, configure and monitor CAN network devices. The user can then
+configure the CAN device, like setting the bit-timing parameters, via
+the netlink interface using the program "ip" from the "IPROUTE2"
+utility suite. The following chapter describes briefly how to use it.
+Furthermore, the interface uses a common data structure and exports a
+set of common functions, which all real CAN network device drivers
+should use. Please have a look to the SJA1000 or MSCAN driver to
+understand how to use them. The name of the module is can-dev.ko.
+
+
+Netlink interface to set/get devices properties
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The CAN device must be configured via netlink interface. The supported
+netlink message types are defined and briefly described in
+"include/linux/can/netlink.h". CAN link support for the program "ip"
+of the IPROUTE2 utility suite is available and it can be used as shown
+below:
+
+Setting CAN device properties::
+
+    $ ip link set can0 type can help
+    Usage: ip link set DEVICE type can
+        [ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
+        [ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
+          phase-seg2 PHASE-SEG2 [ sjw SJW ] ]
+
+        [ dbitrate BITRATE [ dsample-point SAMPLE-POINT] ] |
+        [ dtq TQ dprop-seg PROP_SEG dphase-seg1 PHASE-SEG1
+          dphase-seg2 PHASE-SEG2 [ dsjw SJW ] ]
+
+        [ loopback { on | off } ]
+        [ listen-only { on | off } ]
+        [ triple-sampling { on | off } ]
+        [ one-shot { on | off } ]
+        [ berr-reporting { on | off } ]
+        [ fd { on | off } ]
+        [ fd-non-iso { on | off } ]
+        [ presume-ack { on | off } ]
+
+        [ restart-ms TIME-MS ]
+        [ restart ]
+
+        Where: BITRATE       := { 1..1000000 }
+               SAMPLE-POINT  := { 0.000..0.999 }
+               TQ            := { NUMBER }
+               PROP-SEG      := { 1..8 }
+               PHASE-SEG1    := { 1..8 }
+               PHASE-SEG2    := { 1..8 }
+               SJW           := { 1..4 }
+               RESTART-MS    := { 0 | NUMBER }
+
+Display CAN device details and statistics::
+
+    $ ip -details -statistics link show can0
+    2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP qlen 10
+      link/can
+      can <TRIPLE-SAMPLING> state ERROR-ACTIVE restart-ms 100
+      bitrate 125000 sample_point 0.875
+      tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
+      sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
+      clock 8000000
+      re-started bus-errors arbit-lost error-warn error-pass bus-off
+      41         17457      0          41         42         41
+      RX: bytes  packets  errors  dropped overrun mcast
+      140859     17608    17457   0       0       0
+      TX: bytes  packets  errors  dropped carrier collsns
+      861        112      0       41      0       0
+
+More info to the above output:
+
+"<TRIPLE-SAMPLING>"
+	Shows the list of selected CAN controller modes: LOOPBACK,
+	LISTEN-ONLY, or TRIPLE-SAMPLING.
+
+"state ERROR-ACTIVE"
+	The current state of the CAN controller: "ERROR-ACTIVE",
+	"ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF" or "STOPPED"
+
+"restart-ms 100"
+	Automatic restart delay time. If set to a non-zero value, a
+	restart of the CAN controller will be triggered automatically
+	in case of a bus-off condition after the specified delay time
+	in milliseconds. By default it's off.
+
+"bitrate 125000 sample-point 0.875"
+	Shows the real bit-rate in bits/sec and the sample-point in the
+	range 0.000..0.999. If the calculation of bit-timing parameters
+	is enabled in the kernel (CONFIG_CAN_CALC_BITTIMING=y), the
+	bit-timing can be defined by setting the "bitrate" argument.
+	Optionally the "sample-point" can be specified. By default it's
+	0.000 assuming CIA-recommended sample-points.
+
+"tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1"
+	Shows the time quanta in ns, propagation segment, phase buffer
+	segment 1 and 2 and the synchronisation jump width in units of
+	tq. They allow to define the CAN bit-timing in a hardware
+	independent format as proposed by the Bosch CAN 2.0 spec (see
+	chapter 8 of http://www.semiconductors.bosch.de/pdf/can2spec.pdf).
+
+"sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1 clock 8000000"
+	Shows the bit-timing constants of the CAN controller, here the
+	"sja1000". The minimum and maximum values of the time segment 1
+	and 2, the synchronisation jump width in units of tq, the
+	bitrate pre-scaler and the CAN system clock frequency in Hz.
+	These constants could be used for user-defined (non-standard)
+	bit-timing calculation algorithms in user-space.
+
+"re-started bus-errors arbit-lost error-warn error-pass bus-off"
+	Shows the number of restarts, bus and arbitration lost errors,
+	and the state changes to the error-warning, error-passive and
+	bus-off state. RX overrun errors are listed in the "overrun"
+	field of the standard network statistics.
+
+Setting the CAN Bit-Timing
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The CAN bit-timing parameters can always be defined in a hardware
+independent format as proposed in the Bosch CAN 2.0 specification
+specifying the arguments "tq", "prop_seg", "phase_seg1", "phase_seg2"
+and "sjw"::
+
+    $ ip link set canX type can tq 125 prop-seg 6 \
+				phase-seg1 7 phase-seg2 2 sjw 1
+
+If the kernel option CONFIG_CAN_CALC_BITTIMING is enabled, CIA
+recommended CAN bit-timing parameters will be calculated if the bit-
+rate is specified with the argument "bitrate"::
+
+    $ ip link set canX type can bitrate 125000
+
+Note that this works fine for the most common CAN controllers with
+standard bit-rates but may *fail* for exotic bit-rates or CAN system
+clock frequencies. Disabling CONFIG_CAN_CALC_BITTIMING saves some
+space and allows user-space tools to solely determine and set the
+bit-timing parameters. The CAN controller specific bit-timing
+constants can be used for that purpose. They are listed by the
+following command::
+
+    $ ip -details link show can0
+    ...
+      sja1000: clock 8000000 tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
+
+
+Starting and Stopping the CAN Network Device
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A CAN network device is started or stopped as usual with the command
+"ifconfig canX up/down" or "ip link set canX up/down". Be aware that
+you *must* define proper bit-timing parameters for real CAN devices
+before you can start it to avoid error-prone default settings::
+
+    $ ip link set canX up type can bitrate 125000
+
+A device may enter the "bus-off" state if too many errors occurred on
+the CAN bus. Then no more messages are received or sent. An automatic
+bus-off recovery can be enabled by setting the "restart-ms" to a
+non-zero value, e.g.::
+
+    $ ip link set canX type can restart-ms 100
+
+Alternatively, the application may realize the "bus-off" condition
+by monitoring CAN error message frames and do a restart when
+appropriate with the command::
+
+    $ ip link set canX type can restart
+
+Note that a restart will also create a CAN error message frame (see
+also :ref:`socketcan-network-problem-notifications`).
+
+
+.. _socketcan-can-fd-driver:
+
+CAN FD (Flexible Data Rate) Driver Support
+------------------------------------------
+
+CAN FD capable CAN controllers support two different bitrates for the
+arbitration phase and the payload phase of the CAN FD frame. Therefore a
+second bit timing has to be specified in order to enable the CAN FD bitrate.
+
+Additionally CAN FD capable CAN controllers support up to 64 bytes of
+payload. The representation of this length in can_frame.can_dlc and
+canfd_frame.len for userspace applications and inside the Linux network
+layer is a plain value from 0 .. 64 instead of the CAN 'data length code'.
+The data length code was a 1:1 mapping to the payload length in the legacy
+CAN frames anyway. The payload length to the bus-relevant DLC mapping is
+only performed inside the CAN drivers, preferably with the helper
+functions can_dlc2len() and can_len2dlc().
+
+The CAN netdevice driver capabilities can be distinguished by the network
+devices maximum transfer unit (MTU)::
+
+  MTU = 16 (CAN_MTU)   => sizeof(struct can_frame)   => 'legacy' CAN device
+  MTU = 72 (CANFD_MTU) => sizeof(struct canfd_frame) => CAN FD capable device
+
+The CAN device MTU can be retrieved e.g. with a SIOCGIFMTU ioctl() syscall.
+N.B. CAN FD capable devices can also handle and send legacy CAN frames.
+
+When configuring CAN FD capable CAN controllers an additional 'data' bitrate
+has to be set. This bitrate for the data phase of the CAN FD frame has to be
+at least the bitrate which was configured for the arbitration phase. This
+second bitrate is specified analogue to the first bitrate but the bitrate
+setting keywords for the 'data' bitrate start with 'd' e.g. dbitrate,
+dsample-point, dsjw or dtq and similar settings. When a data bitrate is set
+within the configuration process the controller option "fd on" can be
+specified to enable the CAN FD mode in the CAN controller. This controller
+option also switches the device MTU to 72 (CANFD_MTU).
+
+The first CAN FD specification presented as whitepaper at the International
+CAN Conference 2012 needed to be improved for data integrity reasons.
+Therefore two CAN FD implementations have to be distinguished today:
+
+- ISO compliant:     The ISO 11898-1:2015 CAN FD implementation (default)
+- non-ISO compliant: The CAN FD implementation following the 2012 whitepaper
+
+Finally there are three types of CAN FD controllers:
+
+1. ISO compliant (fixed)
+2. non-ISO compliant (fixed, like the M_CAN IP core v3.0.1 in m_can.c)
+3. ISO/non-ISO CAN FD controllers (switchable, like the PEAK PCAN-USB FD)
+
+The current ISO/non-ISO mode is announced by the CAN controller driver via
+netlink and displayed by the 'ip' tool (controller option FD-NON-ISO).
+The ISO/non-ISO-mode can be altered by setting 'fd-non-iso {on|off}' for
+switchable CAN FD controllers only.
+
+Example configuring 500 kbit/s arbitration bitrate and 4 Mbit/s data bitrate::
+
+    $ ip link set can0 up type can bitrate 500000 sample-point 0.75 \
+                                   dbitrate 4000000 dsample-point 0.8 fd on
+    $ ip -details link show can0
+    5: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UNKNOWN \
+             mode DEFAULT group default qlen 10
+    link/can  promiscuity 0
+    can <FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
+          bitrate 500000 sample-point 0.750
+          tq 50 prop-seg 14 phase-seg1 15 phase-seg2 10 sjw 1
+          pcan_usb_pro_fd: tseg1 1..64 tseg2 1..16 sjw 1..16 brp 1..1024 \
+          brp-inc 1
+          dbitrate 4000000 dsample-point 0.800
+          dtq 12 dprop-seg 7 dphase-seg1 8 dphase-seg2 4 dsjw 1
+          pcan_usb_pro_fd: dtseg1 1..16 dtseg2 1..8 dsjw 1..4 dbrp 1..1024 \
+          dbrp-inc 1
+          clock 80000000
+
+Example when 'fd-non-iso on' is added on this switchable CAN FD adapter::
+
+   can <FD,FD-NON-ISO> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
+
+
+Supported CAN Hardware
+----------------------
+
+Please check the "Kconfig" file in "drivers/net/can" to get an actual
+list of the support CAN hardware. On the SocketCAN project website
+(see :ref:`socketcan-resources`) there might be further drivers available, also for
+older kernel versions.
+
+
+.. _socketcan-resources:
+
+SocketCAN Resources
+===================
+
+The Linux CAN / SocketCAN project resources (project site / mailing list)
+are referenced in the MAINTAINERS file in the Linux source tree.
+Search for CAN NETWORK [LAYERS|DRIVERS].
+
+Credits
+=======
+
+- Oliver Hartkopp (PF_CAN core, filters, drivers, bcm, SJA1000 driver)
+- Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan)
+- Jan Kizka (RT-SocketCAN core, Socket-API reconciliation)
+- Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews, CAN device driver interface, MSCAN driver)
+- Robert Schwebel (design reviews, PTXdist integration)
+- Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers)
+- Benedikt Spranger (reviews)
+- Thomas Gleixner (LKML reviews, coding style, posting hints)
+- Andrey Volkov (kernel subtree structure, ioctls, MSCAN driver)
+- Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003)
+- Klaus Hitschler (PEAK driver integration)
+- Uwe Koppe (CAN netdevices with PF_PACKET approach)
+- Michael Schulze (driver layer loopback requirement, RT CAN drivers review)
+- Pavel Pisa (Bit-timing calculation)
+- Sascha Hauer (SJA1000 platform driver)
+- Sebastian Haas (SJA1000 EMS PCI driver)
+- Markus Plessing (SJA1000 EMS PCI driver)
+- Per Dalen (SJA1000 Kvaser PCI driver)
+- Sam Ravnborg (reviews, coding style, kbuild help)
diff --git a/Documentation/networking/can_ucan_protocol.rst b/Documentation/networking/can_ucan_protocol.rst
new file mode 100644
index 000000000..4cef88d24
--- /dev/null
+++ b/Documentation/networking/can_ucan_protocol.rst
@@ -0,0 +1,332 @@
+=================
+The UCAN Protocol
+=================
+
+UCAN is the protocol used by the microcontroller-based USB-CAN
+adapter that is integrated on System-on-Modules from Theobroma Systems
+and that is also available as a standalone USB stick.
+
+The UCAN protocol has been designed to be hardware-independent.
+It is modeled closely after how Linux represents CAN devices
+internally. All multi-byte integers are encoded as Little Endian.
+
+All structures mentioned in this document are defined in
+``drivers/net/can/usb/ucan.c``.
+
+USB Endpoints
+=============
+
+UCAN devices use three USB endpoints:
+
+CONTROL endpoint
+  The driver sends device management commands on this endpoint
+
+IN endpoint
+  The device sends CAN data frames and CAN error frames
+
+OUT endpoint
+  The driver sends CAN data frames on the out endpoint
+
+
+CONTROL Messages
+================
+
+UCAN devices are configured using vendor requests on the control pipe.
+
+To support multiple CAN interfaces in a single USB device all
+configuration commands target the corresponding interface in the USB
+descriptor.
+
+The driver uses ``ucan_ctrl_command_in/out`` and
+``ucan_device_request_in`` to deliver commands to the device.
+
+Setup Packet
+------------
+
+=================  =====================================================
+``bmRequestType``  Direction | Vendor | (Interface or Device)
+``bRequest``       Command Number
+``wValue``         Subcommand Number (16 Bit) or 0 if not used
+``wIndex``         USB Interface Index (0 for device commands)
+``wLength``        * Host to Device - Number of bytes to transmit
+                   * Device to Host - Maximum Number of bytes to
+                     receive. If the device send less. Commom ZLP
+                     semantics are used.
+=================  =====================================================
+
+Error Handling
+--------------
+
+The device indicates failed control commands by stalling the
+pipe.
+
+Device Commands
+---------------
+
+UCAN_DEVICE_GET_FW_STRING
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*Dev2Host; optional*
+
+Request the device firmware string.
+
+
+Interface Commands
+------------------
+
+UCAN_COMMAND_START
+~~~~~~~~~~~~~~~~~~
+
+*Host2Dev; mandatory*
+
+Bring the CAN interface up.
+
+Payload Format
+  ``ucan_ctl_payload_t.cmd_start``
+
+====  ============================
+mode  or mask of ``UCAN_MODE_*``
+====  ============================
+
+UCAN_COMMAND_STOP
+~~~~~~~~~~~~~~~~~~
+
+*Host2Dev; mandatory*
+
+Stop the CAN interface
+
+Payload Format
+  *empty*
+
+UCAN_COMMAND_RESET
+~~~~~~~~~~~~~~~~~~
+
+*Host2Dev; mandatory*
+
+Reset the CAN controller (including error counters)
+
+Payload Format
+  *empty*
+
+UCAN_COMMAND_GET
+~~~~~~~~~~~~~~~~
+
+*Host2Dev; mandatory*
+
+Get Information from the Device
+
+Subcommands
+^^^^^^^^^^^
+
+UCAN_COMMAND_GET_INFO
+  Request the device information structure ``ucan_ctl_payload_t.device_info``.
+
+  See the ``device_info`` field for details, and
+  ``uapi/linux/can/netlink.h`` for an explanation of the
+  ``can_bittiming fields``.
+
+  Payload Format
+    ``ucan_ctl_payload_t.device_info``
+
+UCAN_COMMAND_GET_PROTOCOL_VERSION
+
+  Request the device protocol version
+  ``ucan_ctl_payload_t.protocol_version``. The current protocol version is 3.
+
+  Payload Format
+    ``ucan_ctl_payload_t.protocol_version``
+
+.. note:: Devices that do not implement this command use the old
+          protocol version 1
+
+UCAN_COMMAND_SET_BITTIMING
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+*Host2Dev; mandatory*
+
+Setup bittiming by sending the the structure
+``ucan_ctl_payload_t.cmd_set_bittiming`` (see ``struct bittiming`` for
+details)
+
+Payload Format
+  ``ucan_ctl_payload_t.cmd_set_bittiming``.
+
+UCAN_SLEEP/WAKE
+~~~~~~~~~~~~~~~
+
+*Host2Dev; optional*
+
+Configure sleep and wake modes. Not yet supported by the driver.
+
+UCAN_FILTER
+~~~~~~~~~~~
+
+*Host2Dev; optional*
+
+Setup hardware CAN filters. Not yet supported by the driver.
+
+Allowed interface commands
+--------------------------
+
+==================  ===================  ==================
+Legal Device State  Command              New Device State
+==================  ===================  ==================
+stopped             SET_BITTIMING        stopped
+stopped             START                started
+started             STOP or RESET        stopped
+stopped             STOP or RESET        stopped
+started             RESTART              started
+any                 GET                  *no change*
+==================  ===================  ==================
+
+IN Message Format
+=================
+
+A data packet on the USB IN endpoint contains one or more
+``ucan_message_in`` values. If multiple messages are batched in a USB
+data packet, the ``len`` field can be used to jump to the next
+``ucan_message_in`` value (take care to sanity-check the ``len`` value
+against the actual data size).
+
+.. _can_ucan_in_message_len:
+
+``len`` field
+-------------
+
+Each ``ucan_message_in`` must be aligned to a 4-byte boundary (relative
+to the start of the start of the data buffer). That means that there
+may be padding bytes between multiple ``ucan_message_in`` values:
+
+.. code::
+
+    +----------------------------+ < 0
+    |                            |
+    |   struct ucan_message_in   |
+    |                            |
+    +----------------------------+ < len
+              [padding]
+    +----------------------------+ < round_up(len, 4)
+    |                            |
+    |   struct ucan_message_in   |
+    |                            |
+    +----------------------------+
+                [...]
+
+``type`` field
+--------------
+
+The ``type`` field specifies the type of the message.
+
+UCAN_IN_RX
+~~~~~~~~~~
+
+``subtype``
+  zero
+
+Data received from the CAN bus (ID + payload).
+
+UCAN_IN_TX_COMPLETE
+~~~~~~~~~~~~~~~~~~~
+
+``subtype``
+  zero
+
+The CAN device has sent a message to the CAN bus. It answers with a
+list of of tuples <echo-ids, flags>.
+
+The echo-id identifies the frame from (echos the id from a previous
+UCAN_OUT_TX message). The flag indicates the result of the
+transmission. Whereas a set Bit 0 indicates success. All other bits
+are reserved and set to zero.
+
+Flow Control
+------------
+
+When receiving CAN messages there is no flow control on the USB
+buffer. The driver has to handle inbound message quickly enough to
+avoid drops. I case the device buffer overflow the condition is
+reported by sending corresponding error frames (see
+:ref:`can_ucan_error_handling`)
+
+
+OUT Message Format
+==================
+
+A data packet on the USB OUT endpoint contains one or more ``struct
+ucan_message_out`` values. If multiple messages are batched into one
+data packet, the device uses the ``len`` field to jump to the next
+ucan_message_out value. Each ucan_message_out must be aligned to 4
+bytes (relative to the start of the data buffer). The mechanism is
+same as described in :ref:`can_ucan_in_message_len`.
+
+.. code::
+
+    +----------------------------+ < 0
+    |                            |
+    |   struct ucan_message_out  |
+    |                            |
+    +----------------------------+ < len
+              [padding]
+    +----------------------------+ < round_up(len, 4)
+    |                            |
+    |   struct ucan_message_out  |
+    |                            |
+    +----------------------------+
+                [...]
+
+``type`` field
+--------------
+
+In protocol version 3 only ``UCAN_OUT_TX`` is defined, others are used
+only by legacy devices (protocol version 1).
+
+UCAN_OUT_TX
+~~~~~~~~~~~
+``subtype``
+  echo id to be replied within a CAN_IN_TX_COMPLETE message
+
+Transmit a CAN frame. (parameters: ``id``, ``data``)
+
+Flow Control
+------------
+
+When the device outbound buffers are full it starts sending *NAKs* on
+the *OUT* pipe until more buffers are available. The driver stops the
+queue when a certain threshold of out packets are incomplete.
+
+.. _can_ucan_error_handling:
+
+CAN Error Handling
+==================
+
+If error reporting is turned on the device encodes errors into CAN
+error frames (see ``uapi/linux/can/error.h``) and sends it using the
+IN endpoint. The driver updates its error statistics and forwards
+it.
+
+Although UCAN devices can suppress error frames completely, in Linux
+the driver is always interested. Hence, the device is always started with
+the ``UCAN_MODE_BERR_REPORT`` set. Filtering those messages for the
+user space is done by the driver.
+
+Bus OFF
+-------
+
+- The device does not recover from bus of automatically.
+- Bus OFF is indicated by an error frame (see ``uapi/linux/can/error.h``)
+- Bus OFF recovery is started by ``UCAN_COMMAND_RESTART``
+- Once Bus OFF recover is completed the device sends an error frame
+  indicating that it is on ERROR-ACTIVE state.
+- During Bus OFF no frames are sent by the device.
+- During Bus OFF transmission requests from the host are completed
+  immediately with the success bit left unset.
+
+Example Conversation
+====================
+
+#) Device is connected to USB
+#) Host sends command ``UCAN_COMMAND_RESET``, subcmd 0
+#) Host sends command ``UCAN_COMMAND_GET``, subcmd ``UCAN_COMMAND_GET_INFO``
+#) Device sends ``UCAN_IN_DEVICE_INFO``
+#) Host sends command ``UCAN_OUT_SET_BITTIMING``
+#) Host sends command ``UCAN_COMMAND_START``, subcmd 0, mode ``UCAN_MODE_BERR_REPORT``
diff --git a/Documentation/networking/cdc_mbim.txt b/Documentation/networking/cdc_mbim.txt
new file mode 100644
index 000000000..4e68f0bc5
--- /dev/null
+++ b/Documentation/networking/cdc_mbim.txt
@@ -0,0 +1,339 @@
+     cdc_mbim - Driver for CDC MBIM Mobile Broadband modems
+    ========================================================
+
+The cdc_mbim driver supports USB devices conforming to the "Universal
+Serial Bus Communications Class Subclass Specification for Mobile
+Broadband Interface Model" [1], which is a further development of
+"Universal Serial Bus Communications Class Subclass Specifications for
+Network Control Model Devices" [2] optimized for Mobile Broadband
+devices, aka "3G/LTE modems".
+
+
+Command Line Parameters
+=======================
+
+The cdc_mbim driver has no parameters of its own.  But the probing
+behaviour for NCM 1.0 backwards compatible MBIM functions (an
+"NCM/MBIM function" as defined in section 3.2 of [1]) is affected
+by a cdc_ncm driver parameter:
+
+prefer_mbim
+-----------
+Type:          Boolean
+Valid Range:   N/Y (0-1)
+Default Value: Y (MBIM is preferred)
+
+This parameter sets the system policy for NCM/MBIM functions.  Such
+functions will be handled by either the cdc_ncm driver or the cdc_mbim
+driver depending on the prefer_mbim setting.  Setting prefer_mbim=N
+makes the cdc_mbim driver ignore these functions and lets the cdc_ncm
+driver handle them instead.
+
+The parameter is writable, and can be changed at any time. A manual
+unbind/bind is required to make the change effective for NCM/MBIM
+functions bound to the "wrong" driver
+
+
+Basic usage
+===========
+
+MBIM functions are inactive when unmanaged. The cdc_mbim driver only
+provides a userspace interface to the MBIM control channel, and will
+not participate in the management of the function. This implies that a
+userspace MBIM management application always is required to enable a
+MBIM function.
+
+Such userspace applications includes, but are not limited to:
+ - mbimcli (included with the libmbim [3] library), and
+ - ModemManager [4]
+
+Establishing a MBIM IP session reequires at least these actions by the
+management application:
+ - open the control channel
+ - configure network connection settings
+ - connect to network
+ - configure IP interface
+
+Management application development
+----------------------------------
+The driver <-> userspace interfaces are described below.  The MBIM
+control channel protocol is described in [1].
+
+
+MBIM control channel userspace ABI
+==================================
+
+/dev/cdc-wdmX character device
+------------------------------
+The driver creates a two-way pipe to the MBIM function control channel
+using the cdc-wdm driver as a subdriver.  The userspace end of the
+control channel pipe is a /dev/cdc-wdmX character device.
+
+The cdc_mbim driver does not process or police messages on the control
+channel.  The channel is fully delegated to the userspace management
+application.  It is therefore up to this application to ensure that it
+complies with all the control channel requirements in [1].
+
+The cdc-wdmX device is created as a child of the MBIM control
+interface USB device.  The character device associated with a specific
+MBIM function can be looked up using sysfs.  For example:
+
+ bjorn@nemi:~$ ls /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc
+ cdc-wdm0
+
+ bjorn@nemi:~$ grep . /sys/bus/usb/drivers/cdc_mbim/2-4:2.12/usbmisc/cdc-wdm0/dev
+ 180:0
+
+
+USB configuration descriptors
+-----------------------------
+The wMaxControlMessage field of the CDC MBIM functional descriptor
+limits the maximum control message size. The managament application is
+responsible for negotiating a control message size complying with the
+requirements in section 9.3.1 of [1], taking this descriptor field
+into consideration.
+
+The userspace application can access the CDC MBIM functional
+descriptor of a MBIM function using either of the two USB
+configuration descriptor kernel interfaces described in [6] or [7].
+
+See also the ioctl documentation below.
+
+
+Fragmentation
+-------------
+The userspace application is responsible for all control message
+fragmentation and defragmentaion, as described in section 9.5 of [1].
+
+
+/dev/cdc-wdmX write()
+---------------------
+The MBIM control messages from the management application *must not*
+exceed the negotiated control message size.
+
+
+/dev/cdc-wdmX read()
+--------------------
+The management application *must* accept control messages of up the
+negotiated control message size.
+
+
+/dev/cdc-wdmX ioctl()
+--------------------
+IOCTL_WDM_MAX_COMMAND: Get Maximum Command Size
+This ioctl returns the wMaxControlMessage field of the CDC MBIM
+functional descriptor for MBIM devices.  This is intended as a
+convenience, eliminating the need to parse the USB descriptors from
+userspace.
+
+	#include <stdio.h>
+	#include <fcntl.h>
+	#include <sys/ioctl.h>
+	#include <linux/types.h>
+	#include <linux/usb/cdc-wdm.h>
+	int main()
+	{
+		__u16 max;
+		int fd = open("/dev/cdc-wdm0", O_RDWR);
+		if (!ioctl(fd, IOCTL_WDM_MAX_COMMAND, &max))
+			printf("wMaxControlMessage is %d\n", max);
+	}
+
+
+Custom device services
+----------------------
+The MBIM specification allows vendors to freely define additional
+services.  This is fully supported by the cdc_mbim driver.
+
+Support for new MBIM services, including vendor specified services, is
+implemented entirely in userspace, like the rest of the MBIM control
+protocol
+
+New services should be registered in the MBIM Registry [5].
+
+
+
+MBIM data channel userspace ABI
+===============================
+
+wwanY network device
+--------------------
+The cdc_mbim driver represents the MBIM data channel as a single
+network device of the "wwan" type. This network device is initially
+mapped to MBIM IP session 0.
+
+
+Multiplexed IP sessions (IPS)
+-----------------------------
+MBIM allows multiplexing up to 256 IP sessions over a single USB data
+channel.  The cdc_mbim driver models such IP sessions as 802.1q VLAN
+subdevices of the master wwanY device, mapping MBIM IP session Z to
+VLAN ID Z for all values of Z greater than 0.
+
+The device maximum Z is given in the MBIM_DEVICE_CAPS_INFO structure
+described in section 10.5.1 of [1].
+
+The userspace management application is responsible for adding new
+VLAN links prior to establishing MBIM IP sessions where the SessionId
+is greater than 0. These links can be added by using the normal VLAN
+kernel interfaces, either ioctl or netlink.
+
+For example, adding a link for a MBIM IP session with SessionId 3:
+
+  ip link add link wwan0 name wwan0.3 type vlan id 3
+
+The driver will automatically map the "wwan0.3" network device to MBIM
+IP session 3.
+
+
+Device Service Streams (DSS)
+----------------------------
+MBIM also allows up to 256 non-IP data streams to be multiplexed over
+the same shared USB data channel.  The cdc_mbim driver models these
+sessions as another set of 802.1q VLAN subdevices of the master wwanY
+device, mapping MBIM DSS session A to VLAN ID (256 + A) for all values
+of A.
+
+The device maximum A is given in the MBIM_DEVICE_SERVICES_INFO
+structure described in section 10.5.29 of [1].
+
+The DSS VLAN subdevices are used as a practical interface between the
+shared MBIM data channel and a MBIM DSS aware userspace application.
+It is not intended to be presented as-is to an end user. The
+assumption is that a userspace application initiating a DSS session
+also takes care of the necessary framing of the DSS data, presenting
+the stream to the end user in an appropriate way for the stream type.
+
+The network device ABI requires a dummy ethernet header for every DSS
+data frame being transported.  The contents of this header is
+arbitrary, with the following exceptions:
+ - TX frames using an IP protocol (0x0800 or 0x86dd) will be dropped
+ - RX frames will have the protocol field set to ETH_P_802_3 (but will
+   not be properly formatted 802.3 frames)
+ - RX frames will have the destination address set to the hardware
+   address of the master device
+
+The DSS supporting userspace management application is responsible for
+adding the dummy ethernet header on TX and stripping it on RX.
+
+This is a simple example using tools commonly available, exporting
+DssSessionId 5 as a pty character device pointed to by a /dev/nmea
+symlink:
+
+  ip link add link wwan0 name wwan0.dss5 type vlan id 261
+  ip link set dev wwan0.dss5 up
+  socat INTERFACE:wwan0.dss5,type=2 PTY:,echo=0,link=/dev/nmea
+
+This is only an example, most suitable for testing out a DSS
+service. Userspace applications supporting specific MBIM DSS services
+are expected to use the tools and programming interfaces required by
+that service.
+
+Note that adding VLAN links for DSS sessions is entirely optional.  A
+management application may instead choose to bind a packet socket
+directly to the master network device, using the received VLAN tags to
+map frames to the correct DSS session and adding 18 byte VLAN ethernet
+headers with the appropriate tag on TX.  In this case using a socket
+filter is recommended, matching only the DSS VLAN subset. This avoid
+unnecessary copying of unrelated IP session data to userspace.  For
+example:
+
+  static struct sock_filter dssfilter[] = {
+	/* use special negative offsets to get VLAN tag */
+	BPF_STMT(BPF_LD|BPF_B|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG_PRESENT),
+	BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, 1, 0, 6), /* true */
+
+	/* verify DSS VLAN range */
+	BPF_STMT(BPF_LD|BPF_H|BPF_ABS, SKF_AD_OFF + SKF_AD_VLAN_TAG),
+	BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 256, 0, 4),	/* 256 is first DSS VLAN */
+	BPF_JUMP(BPF_JMP|BPF_JGE|BPF_K, 512, 3, 0),	/* 511 is last DSS VLAN */
+
+	/* verify ethertype */
+        BPF_STMT(BPF_LD|BPF_H|BPF_ABS, 2 * ETH_ALEN),
+        BPF_JUMP(BPF_JMP|BPF_JEQ|BPF_K, ETH_P_802_3, 0, 1),
+
+        BPF_STMT(BPF_RET|BPF_K, (u_int)-1),	/* accept */
+        BPF_STMT(BPF_RET|BPF_K, 0),		/* ignore */
+  };
+
+
+
+Tagged IP session 0 VLAN
+------------------------
+As described above, MBIM IP session 0 is treated as special by the
+driver.  It is initially mapped to untagged frames on the wwanY
+network device.
+
+This mapping implies a few restrictions on multiplexed IPS and DSS
+sessions, which may not always be practical:
+ - no IPS or DSS session can use a frame size greater than the MTU on
+   IP session 0
+ - no IPS or DSS session can be in the up state unless the network
+   device representing IP session 0 also is up
+
+These problems can be avoided by optionally making the driver map IP
+session 0 to a VLAN subdevice, similar to all other IP sessions.  This
+behaviour is triggered by adding a VLAN link for the magic VLAN ID
+4094.  The driver will then immediately start mapping MBIM IP session
+0 to this VLAN, and will drop untagged frames on the master wwanY
+device.
+
+Tip: It might be less confusing to the end user to name this VLAN
+subdevice after the MBIM SessionID instead of the VLAN ID.  For
+example:
+
+  ip link add link wwan0 name wwan0.0 type vlan id 4094
+
+
+VLAN mapping
+------------
+
+Summarizing the cdc_mbim driver mapping described above, we have this
+relationship between VLAN tags on the wwanY network device and MBIM
+sessions on the shared USB data channel:
+
+  VLAN ID       MBIM type   MBIM SessionID           Notes
+  ---------------------------------------------------------
+  untagged      IPS         0                        a)
+  1 - 255       IPS         1 - 255 <VLANID>
+  256 - 511     DSS         0 - 255 <VLANID - 256>
+  512 - 4093                                         b)
+  4094          IPS         0                        c)
+
+    a) if no VLAN ID 4094 link exists, else dropped
+    b) unsupported VLAN range, unconditionally dropped
+    c) if a VLAN ID 4094 link exists, else dropped
+
+
+
+
+References
+==========
+
+[1] USB Implementers Forum, Inc. - "Universal Serial Bus
+      Communications Class Subclass Specification for Mobile Broadband
+      Interface Model", Revision 1.0 (Errata 1), May 1, 2013
+      - http://www.usb.org/developers/docs/devclass_docs/
+
+[2] USB Implementers Forum, Inc. - "Universal Serial Bus
+      Communications Class Subclass Specifications for Network Control
+      Model Devices", Revision 1.0 (Errata 1), November 24, 2010
+      - http://www.usb.org/developers/docs/devclass_docs/
+
+[3] libmbim - "a glib-based library for talking to WWAN modems and
+      devices which speak the Mobile Interface Broadband Model (MBIM)
+      protocol"
+      - http://www.freedesktop.org/wiki/Software/libmbim/
+
+[4] ModemManager - "a DBus-activated daemon which controls mobile
+      broadband (2G/3G/4G) devices and connections"
+      - http://www.freedesktop.org/wiki/Software/ModemManager/
+
+[5] "MBIM (Mobile Broadband Interface Model) Registry"
+       - http://compliance.usb.org/mbim/
+
+[6] "/sys/kernel/debug/usb/devices output format"
+       - Documentation/driver-api/usb/usb.rst
+
+[7] "/sys/bus/usb/devices/.../descriptors"
+       - Documentation/ABI/stable/sysfs-bus-usb
diff --git a/Documentation/networking/checksum-offloads.txt b/Documentation/networking/checksum-offloads.txt
new file mode 100644
index 000000000..27bc09cfc
--- /dev/null
+++ b/Documentation/networking/checksum-offloads.txt
@@ -0,0 +1,122 @@
+Checksum Offloads in the Linux Networking Stack
+
+
+Introduction
+============
+
+This document describes a set of techniques in the Linux networking stack
+ to take advantage of checksum offload capabilities of various NICs.
+
+The following technologies are described:
+ * TX Checksum Offload
+ * LCO: Local Checksum Offload
+ * RCO: Remote Checksum Offload
+
+Things that should be documented here but aren't yet:
+ * RX Checksum Offload
+ * CHECKSUM_UNNECESSARY conversion
+
+
+TX Checksum Offload
+===================
+
+The interface for offloading a transmit checksum to a device is explained
+ in detail in comments near the top of include/linux/skbuff.h.
+In brief, it allows to request the device fill in a single ones-complement
+ checksum defined by the sk_buff fields skb->csum_start and
+ skb->csum_offset.  The device should compute the 16-bit ones-complement
+ checksum (i.e. the 'IP-style' checksum) from csum_start to the end of the
+ packet, and fill in the result at (csum_start + csum_offset).
+Because csum_offset cannot be negative, this ensures that the previous
+ value of the checksum field is included in the checksum computation, thus
+ it can be used to supply any needed corrections to the checksum (such as
+ the sum of the pseudo-header for UDP or TCP).
+This interface only allows a single checksum to be offloaded.  Where
+ encapsulation is used, the packet may have multiple checksum fields in
+ different header layers, and the rest will have to be handled by another
+ mechanism such as LCO or RCO.
+CRC32c can also be offloaded using this interface, by means of filling
+ skb->csum_start and skb->csum_offset as described above, and setting
+ skb->csum_not_inet: see skbuff.h comment (section 'D') for more details.
+No offloading of the IP header checksum is performed; it is always done in
+ software.  This is OK because when we build the IP header, we obviously
+ have it in cache, so summing it isn't expensive.  It's also rather short.
+The requirements for GSO are more complicated, because when segmenting an
+ encapsulated packet both the inner and outer checksums may need to be
+ edited or recomputed for each resulting segment.  See the skbuff.h comment
+ (section 'E') for more details.
+
+A driver declares its offload capabilities in netdev->hw_features; see
+ Documentation/networking/netdev-features.txt for more.  Note that a device
+ which only advertises NETIF_F_IP[V6]_CSUM must still obey the csum_start
+ and csum_offset given in the SKB; if it tries to deduce these itself in
+ hardware (as some NICs do) the driver should check that the values in the
+ SKB match those which the hardware will deduce, and if not, fall back to
+ checksumming in software instead (with skb_csum_hwoffload_help() or one of
+ the skb_checksum_help() / skb_crc32c_csum_help functions, as mentioned in
+ include/linux/skbuff.h).
+
+The stack should, for the most part, assume that checksum offload is
+ supported by the underlying device.  The only place that should check is
+ validate_xmit_skb(), and the functions it calls directly or indirectly.
+ That function compares the offload features requested by the SKB (which
+ may include other offloads besides TX Checksum Offload) and, if they are
+ not supported or enabled on the device (determined by netdev->features),
+ performs the corresponding offload in software.  In the case of TX
+ Checksum Offload, that means calling skb_csum_hwoffload_help(skb, features).
+
+
+LCO: Local Checksum Offload
+===========================
+
+LCO is a technique for efficiently computing the outer checksum of an
+ encapsulated datagram when the inner checksum is due to be offloaded.
+The ones-complement sum of a correctly checksummed TCP or UDP packet is
+ equal to the complement of the sum of the pseudo header, because everything
+ else gets 'cancelled out' by the checksum field.  This is because the sum was
+ complemented before being written to the checksum field.
+More generally, this holds in any case where the 'IP-style' ones complement
+ checksum is used, and thus any checksum that TX Checksum Offload supports.
+That is, if we have set up TX Checksum Offload with a start/offset pair, we
+ know that after the device has filled in that checksum, the ones
+ complement sum from csum_start to the end of the packet will be equal to
+ the complement of whatever value we put in the checksum field beforehand.
+ This allows us to compute the outer checksum without looking at the payload:
+ we simply stop summing when we get to csum_start, then add the complement of
+ the 16-bit word at (csum_start + csum_offset).
+Then, when the true inner checksum is filled in (either by hardware or by
+ skb_checksum_help()), the outer checksum will become correct by virtue of
+ the arithmetic.
+
+LCO is performed by the stack when constructing an outer UDP header for an
+ encapsulation such as VXLAN or GENEVE, in udp_set_csum().  Similarly for
+ the IPv6 equivalents, in udp6_set_csum().
+It is also performed when constructing an IPv4 GRE header, in
+ net/ipv4/ip_gre.c:build_header().  It is *not* currently performed when
+ constructing an IPv6 GRE header; the GRE checksum is computed over the
+ whole packet in net/ipv6/ip6_gre.c:ip6gre_xmit2(), but it should be
+ possible to use LCO here as IPv6 GRE still uses an IP-style checksum.
+All of the LCO implementations use a helper function lco_csum(), in
+ include/linux/skbuff.h.
+
+LCO can safely be used for nested encapsulations; in this case, the outer
+ encapsulation layer will sum over both its own header and the 'middle'
+ header.  This does mean that the 'middle' header will get summed multiple
+ times, but there doesn't seem to be a way to avoid that without incurring
+ bigger costs (e.g. in SKB bloat).
+
+
+RCO: Remote Checksum Offload
+============================
+
+RCO is a technique for eliding the inner checksum of an encapsulated
+ datagram, allowing the outer checksum to be offloaded.  It does, however,
+ involve a change to the encapsulation protocols, which the receiver must
+ also support.  For this reason, it is disabled by default.
+RCO is detailed in the following Internet-Drafts:
+https://tools.ietf.org/html/draft-herbert-remotecsumoffload-00
+https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
+In Linux, RCO is implemented individually in each encapsulation protocol,
+ and most tunnel types have flags controlling its use.  For instance, VXLAN
+ has the flag VXLAN_F_REMCSUM_TX (per struct vxlan_rdst) to indicate that
+ RCO should be used when transmitting to a given remote destination.
diff --git a/Documentation/networking/conf.py b/Documentation/networking/conf.py
new file mode 100644
index 000000000..40f69e67a
--- /dev/null
+++ b/Documentation/networking/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "Linux Networking Documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+    ('index', 'networking.tex', project,
+     'The kernel development community', 'manual'),
+]
diff --git a/Documentation/networking/cops.txt b/Documentation/networking/cops.txt
new file mode 100644
index 000000000..3e344b448
--- /dev/null
+++ b/Documentation/networking/cops.txt
@@ -0,0 +1,63 @@
+Text File for the COPS LocalTalk Linux driver (cops.c).
+	By Jay Schulist <jschlst@samba.org>
+
+This driver has two modes and they are: Dayna mode and Tangent mode.
+Each mode corresponds with the type of card. It has been found
+that there are 2 main types of cards and all other cards are
+the same and just have different names or only have minor differences
+such as more IO ports. As this driver is tested it will
+become more clear exactly what cards are supported. 
+
+Right now these cards are known to work with the COPS driver. The
+LT-200 cards work in a somewhat more limited capacity than the
+DL200 cards, which work very well and are in use by many people.
+
+TANGENT driver mode:
+	Tangent ATB-II, Novell NL-1000, Daystar Digital LT-200
+DAYNA driver mode:
+	Dayna DL2000/DaynaTalk PC (Half Length), COPS LT-95,
+	Farallon PhoneNET PC III, Farallon PhoneNET PC II
+Other cards possibly supported mode unknown though:
+	Dayna DL2000 (Full length)
+
+The COPS driver defaults to using Dayna mode. To change the driver's 
+mode if you built a driver with dual support use board_type=1 or
+board_type=2 for Dayna or Tangent with insmod.
+
+** Operation/loading of the driver.
+Use modprobe like this:	/sbin/modprobe cops.o (IO #) (IRQ #)
+If you do not specify any options the driver will try and use the IO = 0x240,
+IRQ = 5. As of right now I would only use IRQ 5 for the card, if autoprobing.
+
+To load multiple COPS driver Localtalk cards you can do one of the following.
+
+insmod cops io=0x240 irq=5
+insmod -o cops2 cops io=0x260 irq=3
+
+Or in lilo.conf put something like this:
+	append="ether=5,0x240,lt0 ether=3,0x260,lt1"
+
+Then bring up the interface with ifconfig. It will look something like this:
+lt0       Link encap:UNSPEC  HWaddr 00-00-00-00-00-00-00-F7-00-00-00-00-00-00-00-00
+          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
+          UP BROADCAST RUNNING NOARP MULTICAST  MTU:600  Metric:1
+          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 coll:0
+
+** Netatalk Configuration
+You will need to configure atalkd with something like the following to make
+it work with the cops.c driver.
+
+* For single LTalk card use.
+dummy -seed -phase 2 -net 2000 -addr 2000.10 -zone "1033"
+lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
+
+* For multiple cards, Ethernet and LocalTalk.
+eth0 -seed -phase 2 -net 3000 -addr 3000.20 -zone "1033"
+lt0 -seed -phase 1 -net 1000 -addr 1000.50 -zone "1033"
+
+* For multiple LocalTalk cards, and an Ethernet card.
+* Order seems to matter here, Ethernet last.
+lt0 -seed -phase 1 -net 1000 -addr 1000.10 -zone "LocalTalk1"
+lt1 -seed -phase 1 -net 2000 -addr 2000.20 -zone "LocalTalk2"
+eth0 -seed -phase 2 -net 3000 -addr 3000.30 -zone "EtherTalk"
diff --git a/Documentation/networking/cs89x0.txt b/Documentation/networking/cs89x0.txt
new file mode 100644
index 000000000..0e190180e
--- /dev/null
+++ b/Documentation/networking/cs89x0.txt
@@ -0,0 +1,624 @@
+
+NOTE
+----
+
+This document was contributed by Cirrus Logic for kernel 2.2.5.  This version
+has been updated for 2.3.48 by Andrew Morton.
+
+Cirrus make a copy of this driver available at their website, as
+described below.  In general, you should use the driver version which
+comes with your Linux distribution.
+
+
+
+CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
+Linux Network Interface Driver ver. 2.00 <kernel 2.3.48>
+===============================================================================
+ 
+
+TABLE OF CONTENTS
+
+1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
+    1.1 Product Overview 
+    1.2 Driver Description
+	1.2.1 Driver Name
+	1.2.2 File in the Driver Package
+    1.3 System Requirements
+    1.4 Licensing Information
+
+2.0 ADAPTER INSTALLATION and CONFIGURATION
+    2.1 CS8900-based Adapter Configuration
+    2.2 CS8920-based Adapter Configuration 
+
+3.0 LOADING THE DRIVER AS A MODULE
+
+4.0 COMPILING THE DRIVER
+    4.1 Compiling the Driver as a Loadable Module
+    4.2 Compiling the driver to support memory mode
+    4.3 Compiling the driver to support Rx DMA 
+
+5.0 TESTING AND TROUBLESHOOTING
+    5.1 Known Defects and Limitations
+    5.2 Testing the Adapter
+        5.2.1 Diagnostic Self-Test
+        5.2.2 Diagnostic Network Test
+    5.3 Using the Adapter's LEDs
+    5.4 Resolving I/O Conflicts
+
+6.0 TECHNICAL SUPPORT
+    6.1 Contacting Cirrus Logic's Technical Support
+    6.2 Information Required Before Contacting Technical Support
+    6.3 Obtaining the Latest Driver Version
+    6.4 Current maintainer
+    6.5 Kernel boot parameters
+
+
+1.0 CIRRUS LOGIC LAN CS8900/CS8920 ETHERNET ADAPTERS
+===============================================================================
+
+
+1.1 PRODUCT OVERVIEW
+
+The CS8900-based ISA Ethernet Adapters from Cirrus Logic follow 
+IEEE 802.3 standards and support half or full-duplex operation in ISA bus 
+computers on 10 Mbps Ethernet networks.  The adapters are designed for operation 
+in 16-bit ISA or EISA bus expansion slots and are available in 
+10BaseT-only or 3-media configurations (10BaseT, 10Base2, and AUI for 10Base-5 
+or fiber networks).  
+
+CS8920-based adapters are similar to the CS8900-based adapter with additional 
+features for Plug and Play (PnP) support and Wakeup Frame recognition.  As 
+such, the configuration procedures differ somewhat between the two types of 
+adapters.  Refer to the "Adapter Configuration" section for details on 
+configuring both types of adapters.
+
+
+1.2 DRIVER DESCRIPTION
+
+The CS8900/CS8920 Ethernet Adapter driver for Linux supports the Linux
+v2.3.48 or greater kernel.  It can be compiled directly into the kernel
+or loaded at run-time as a device driver module.
+
+1.2.1 Driver Name: cs89x0
+
+1.2.2 Files in the Driver Archive:
+
+The files in the driver at Cirrus' website include:
+
+  readme.txt         - this file
+  build              - batch file to compile cs89x0.c.
+  cs89x0.c           - driver C code
+  cs89x0.h           - driver header file
+  cs89x0.o           - pre-compiled module (for v2.2.5 kernel)
+  config/Config.in   - sample file to include cs89x0 driver in the kernel.
+  config/Makefile    - sample file to include cs89x0 driver in the kernel.
+  config/Space.c     - sample file to include cs89x0 driver in the kernel.
+
+
+
+1.3 SYSTEM REQUIREMENTS
+
+The following hardware is required:
+
+   * Cirrus Logic LAN (CS8900/20-based) Ethernet ISA Adapter   
+
+   * IBM or IBM-compatible PC with:
+     * An 80386 or higher processor
+     * 16 bytes of contiguous IO space available between 210h - 370h
+     * One available IRQ (5,10,11,or 12 for the CS8900, 3-7,9-15 for CS8920).
+
+   * Appropriate cable (and connector for AUI, 10BASE-2) for your network
+     topology.
+
+The following software is required:
+
+* LINUX kernel version 2.3.48 or higher
+
+   * CS8900/20 Setup Utility (DOS-based)
+
+   * LINUX kernel sources for your kernel (if compiling into kernel)
+
+   * GNU Toolkit (gcc and make) v2.6 or above (if compiling into kernel 
+     or a module)   
+
+
+
+1.4 LICENSING INFORMATION
+
+This program is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free Software
+Foundation, version 1.
+
+This program is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for 
+more details.
+
+For a full copy of the GNU General Public License, write to the Free Software
+Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+
+
+2.0 ADAPTER INSTALLATION and CONFIGURATION
+===============================================================================
+
+Both the CS8900 and CS8920-based adapters can be configured using parameters 
+stored in an on-board EEPROM. You must use the DOS-based CS8900/20 Setup 
+Utility if you want to change the adapter's configuration in EEPROM.  
+
+When loading the driver as a module, you can specify many of the adapter's 
+configuration parameters on the command-line to override the EEPROM's settings 
+or for interface configuration when an EEPROM is not used. (CS8920-based 
+adapters must use an EEPROM.) See Section 3.0 LOADING THE DRIVER AS A MODULE.
+
+Since the CS8900/20 Setup Utility is a DOS-based application, you must install 
+and configure the adapter in a DOS-based system using the CS8900/20 Setup 
+Utility before installation in the target LINUX system.  (Not required if 
+installing a CS8900-based adapter and the default configuration is acceptable.)
+     
+
+2.1 CS8900-BASED ADAPTER CONFIGURATION
+
+CS8900-based adapters shipped from Cirrus Logic have been configured 
+with the following "default" settings:
+
+  Operation Mode:      Memory Mode
+  IRQ:                 10
+  Base I/O Address:    300
+  Memory Base Address: D0000
+  Optimization:	       DOS Client
+  Transmission Mode:   Half-duplex
+  BootProm:            None
+  Media Type:	       Autodetect (3-media cards) or 
+                       10BASE-T (10BASE-T only adapter)
+
+You should only change the default configuration settings if conflicts with 
+another adapter exists. To change the adapter's configuration, run the 
+CS8900/20 Setup Utility. 
+
+
+2.2 CS8920-BASED ADAPTER CONFIGURATION
+
+CS8920-based adapters are shipped from Cirrus Logic configured as Plug
+and Play (PnP) enabled.  However, since the cs89x0 driver does NOT
+support PnP, you must install the CS8920 adapter in a DOS-based PC and
+run the CS8900/20 Setup Utility to disable PnP and configure the
+adapter before installation in the target Linux system.  Failure to do
+this will leave the adapter inactive and the driver will be unable to
+communicate with the adapter.  
+
+
+        **************************************************************** 
+        *                    CS8920-BASED ADAPTERS:                    *
+        *                                                              * 
+        * CS8920-BASED ADAPTERS ARE PLUG and PLAY ENABLED BY DEFAULT.  * 
+        * THE CS89X0 DRIVER DOES NOT SUPPORT PnP. THEREFORE, YOU MUST  *
+        * RUN THE CS8900/20 SETUP UTILITY TO DISABLE PnP SUPPORT AND   *
+        * TO ACTIVATE THE ADAPTER.                                     *
+        ****************************************************************
+
+
+
+
+3.0 LOADING THE DRIVER AS A MODULE
+===============================================================================
+
+If the driver is compiled as a loadable module, you can load the driver module
+with the 'modprobe' command.  Many of the adapter's configuration parameters can 
+be specified as command-line arguments to the load command.  This facility 
+provides a means to override the EEPROM's settings or for interface 
+configuration when an EEPROM is not used.
+
+Example:
+
+    insmod cs89x0.o io=0x200 irq=0xA media=aui
+
+This example loads the module and configures the adapter to use an IO port base
+address of 200h, interrupt 10, and use the AUI media connection.  The following
+configuration options are available on the command line:
+
+* io=###               - specify IO address (200h-360h)
+* irq=##               - specify interrupt level
+* use_dma=1            - Enable DMA
+* dma=#                - specify dma channel (Driver is compiled to support
+                         Rx DMA only)
+* dmasize=# (16 or 64) - DMA size 16K or 64K.  Default value is set to 16.
+* media=rj45           - specify media type
+   or media=bnc
+   or media=aui
+   or media=auto
+* duplex=full          - specify forced half/full/autonegotiate duplex
+   or duplex=half
+   or duplex=auto
+* debug=#              - debug level (only available if the driver was compiled
+                         for debugging)
+
+NOTES:
+
+a) If an EEPROM is present, any specified command-line parameter
+   will override the corresponding configuration value stored in
+   EEPROM.
+
+b) The "io" parameter must be specified on the command-line.  
+
+c) The driver's hardware probe routine is designed to avoid
+   writing to I/O space until it knows that there is a cs89x0
+   card at the written addresses.  This could cause problems
+   with device probing.  To avoid this behaviour, add one
+   to the `io=' module parameter.  This doesn't actually change
+   the I/O address, but it is a flag to tell the driver
+   to partially initialise the hardware before trying to
+   identify the card.  This could be dangerous if you are
+   not sure that there is a cs89x0 card at the provided address.
+
+   For example, to scan for an adapter located at IO base 0x300,
+   specify an IO address of 0x301.  
+
+d) The "duplex=auto" parameter is only supported for the CS8920.
+
+e) The minimum command-line configuration required if an EEPROM is
+   not present is:
+
+   io 
+   irq 
+   media type (no autodetect)
+
+f) The following additional parameters are CS89XX defaults (values
+   used with no EEPROM or command-line argument).
+
+   * DMA Burst = enabled
+   * IOCHRDY Enabled = enabled
+   * UseSA = enabled
+   * CS8900 defaults to half-duplex if not specified on command-line
+   * CS8920 defaults to autoneg if not specified on command-line
+   * Use reset defaults for other config parameters
+   * dma_mode = 0
+
+g) You can use ifconfig to set the adapter's Ethernet address.
+
+h) Many Linux distributions use the 'modprobe' command to load
+   modules.  This program uses the '/etc/conf.modules' file to
+   determine configuration information which is passed to a driver
+   module when it is loaded.  All the configuration options which are
+   described above may be placed within /etc/conf.modules.
+
+   For example:
+
+   > cat /etc/conf.modules
+   ...
+   alias eth0 cs89x0
+   options cs89x0 io=0x0200 dma=5 use_dma=1
+   ...
+
+   In this example we are telling the module system that the
+   ethernet driver for this machine should use the cs89x0 driver.  We
+   are asking 'modprobe' to pass the 'io', 'dma' and 'use_dma'
+   arguments to the driver when it is loaded.
+
+i) Cirrus recommend that the cs89x0 use the ISA DMA channels 5, 6 or
+   7.  You will probably find that other DMA channels will not work.
+
+j) The cs89x0 supports DMA for receiving only.  DMA mode is
+   significantly more efficient.  Flooding a 400 MHz Celeron machine
+   with large ping packets consumes 82% of its CPU capacity in non-DMA
+   mode.  With DMA this is reduced to 45%.
+
+k) If your Linux kernel was compiled with inbuilt plug-and-play
+   support you will be able to find information about the cs89x0 card
+   with the command
+
+   cat /proc/isapnp
+
+l) If during DMA operation you find erratic behavior or network data
+   corruption you should use your PC's BIOS to slow the EISA bus clock.
+
+m) If the cs89x0 driver is compiled directly into the kernel
+   (non-modular) then its I/O address is automatically determined by
+   ISA bus probing.  The IRQ number, media options, etc are determined
+   from the card's EEPROM.
+
+n) If the cs89x0 driver is compiled directly into the kernel, DMA
+   mode may be selected by providing the kernel with a boot option
+   'cs89x0_dma=N' where 'N' is the desired DMA channel number (5, 6 or 7).
+
+   Kernel boot options may be provided on the LILO command line:
+
+	LILO boot: linux cs89x0_dma=5
+
+   or they may be placed in /etc/lilo.conf:
+
+	image=/boot/bzImage-2.3.48
+	  append="cs89x0_dma=5"
+	  label=linux
+	  root=/dev/hda5
+	  read-only
+
+   The DMA Rx buffer size is hardwired to 16 kbytes in this mode.
+   (64k mode is not available).
+
+
+4.0 COMPILING THE DRIVER
+===============================================================================
+
+The cs89x0 driver can be compiled directly into the kernel or compiled into
+a loadable device driver module.
+
+
+4.1 COMPILING THE DRIVER AS A LOADABLE MODULE
+
+To compile the driver into a loadable module, use the following command 
+(single command line, without quotes):
+
+"gcc -D__KERNEL__ -I/usr/src/linux/include -I/usr/src/linux/net/inet -Wall 
+-Wstrict-prototypes -O2 -fomit-frame-pointer -DMODULE -DCONFIG_MODVERSIONS 
+-c cs89x0.c"
+
+4.2 COMPILING THE DRIVER TO SUPPORT MEMORY MODE
+
+Support for memory mode was not carried over into the 2.3 series kernels.
+
+4.3 COMPILING THE DRIVER TO SUPPORT Rx DMA
+
+The compile-time optionality for DMA was removed in the 2.3 kernel
+series.  DMA support is now unconditionally part of the driver.  It is
+enabled by the 'use_dma=1' module option.
+
+
+5.0 TESTING AND TROUBLESHOOTING
+===============================================================================
+
+5.1 KNOWN DEFECTS and LIMITATIONS
+
+Refer to the RELEASE.TXT file distributed as part of this archive for a list of 
+known defects, driver limitations, and work arounds.
+
+
+5.2 TESTING THE ADAPTER
+
+Once the adapter has been installed and configured, the diagnostic option of 
+the CS8900/20 Setup Utility can be used to test the functionality of the 
+adapter and its network connection.  Use the diagnostics 'Self Test' option to
+test the functionality of the adapter with the hardware configuration you have
+assigned. You can use the diagnostics 'Network Test' to test the ability of the
+adapter to communicate across the Ethernet with another PC equipped with a 
+CS8900/20-based adapter card (it must also be running the CS8900/20 Setup 
+Utility).
+
+         NOTE: The Setup Utility's diagnostics are designed to run in a
+         DOS-only operating system environment.  DO NOT run the diagnostics 
+         from a DOS or command prompt session under Windows 95, Windows NT, 
+         OS/2, or other operating system.
+
+To run the diagnostics tests on the CS8900/20 adapter:
+
+   1.) Boot DOS on the PC and start the CS8900/20 Setup Utility.
+
+   2.) The adapter's current configuration is displayed.  Hit the ENTER key to
+       get to the main menu.
+
+   4.) Select 'Diagnostics' (ALT-G) from the main menu.  
+       * Select 'Self-Test' to test the adapter's basic functionality.
+       * Select 'Network Test' to test the network connection and cabling.
+
+
+5.2.1 DIAGNOSTIC SELF-TEST
+
+The diagnostic self-test checks the adapter's basic functionality as well as 
+its ability to communicate across the ISA bus based on the system resources 
+assigned during hardware configuration.  The following tests are performed:
+
+   * IO Register Read/Write Test
+     The IO Register Read/Write test insures that the CS8900/20 can be 
+     accessed in IO mode, and that the IO base address is correct.
+
+   * Shared Memory Test
+     The Shared Memory test insures the CS8900/20 can be accessed in memory 
+     mode and that the range of memory addresses assigned does not conflict 
+     with other devices in the system.
+
+   * Interrupt Test
+     The Interrupt test insures there are no conflicts with the assigned IRQ
+     signal.
+
+   * EEPROM Test
+     The EEPROM test insures the EEPROM can be read.
+
+   * Chip RAM Test
+     The Chip RAM test insures the 4K of memory internal to the CS8900/20 is
+     working properly.
+
+   * Internal Loop-back Test
+     The Internal Loop Back test insures the adapter's transmitter and 
+     receiver are operating properly.  If this test fails, make sure the 
+     adapter's cable is connected to the network (check for LED activity for 
+     example).
+
+   * Boot PROM Test
+     The Boot PROM  test insures the Boot PROM is present, and can be read.
+     Failure indicates the Boot PROM  was not successfully read due to a
+     hardware problem or due to a conflicts on the Boot PROM address
+     assignment. (Test only applies if the adapter is configured to use the
+     Boot PROM option.)
+
+Failure of a test item indicates a possible system resource conflict with 
+another device on the ISA bus.  In this case, you should use the Manual Setup 
+option to reconfigure the adapter by selecting a different value for the system
+resource that failed.
+
+
+5.2.2 DIAGNOSTIC NETWORK TEST
+
+The Diagnostic Network Test verifies a working network connection by 
+transferring data between two CS8900/20 adapters installed in different PCs 
+on the same network. (Note: the diagnostic network test should not be run 
+between two nodes across a router.) 
+
+This test requires that each of the two PCs have a CS8900/20-based adapter
+installed and have the CS8900/20 Setup Utility running.  The first PC is 
+configured as a Responder and the other PC is configured as an Initiator.  
+Once the Initiator is started, it sends data frames to the Responder which 
+returns the frames to the Initiator.
+
+The total number of frames received and transmitted are displayed on the 
+Initiator's display, along with a count of the number of frames received and 
+transmitted OK or in error.  The test can be terminated anytime by the user at 
+either PC.
+
+To setup the Diagnostic Network Test:
+
+    1.) Select a PC with a CS8900/20-based adapter and a known working network
+        connection to act as the Responder.  Run the CS8900/20 Setup Utility 
+        and select 'Diagnostics -> Network Test -> Responder' from the main 
+        menu.  Hit ENTER to start the Responder.
+
+    2.) Return to the PC with the CS8900/20-based adapter you want to test and
+        start the CS8900/20 Setup Utility. 
+
+    3.) From the main menu, Select 'Diagnostic -> Network Test -> Initiator'.
+        Hit ENTER to start the test.
+ 
+You may stop the test on the Initiator at any time while allowing the Responder
+to continue running.  In this manner, you can move to additional PCs and test 
+them by starting the Initiator on another PC without having to stop/start the 
+Responder.
+ 
+
+
+5.3 USING THE ADAPTER'S LEDs
+
+The 2 and 3-media adapters have two LEDs visible on the back end of the board 
+located near the 10Base-T connector.  
+
+Link Integrity LED: A "steady" ON of the green LED indicates a valid 10Base-T 
+connection.  (Only applies to 10Base-T.  The green LED has no significance for
+a 10Base-2 or AUI connection.)
+
+TX/RX LED: The yellow LED lights briefly each time the adapter transmits or 
+receives data. (The yellow LED will appear to "flicker" on a typical network.)
+
+
+5.4 RESOLVING I/O CONFLICTS
+
+An IO conflict occurs when two or more adapter use the same ISA resource (IO 
+address, memory address or IRQ).  You can usually detect an IO conflict in one 
+of four ways after installing and or configuring the CS8900/20-based adapter:
+
+    1.) The system does not boot properly (or at all).
+
+    2.) The driver cannot communicate with the adapter, reporting an "Adapter
+        not found" error message.
+
+    3.) You cannot connect to the network or the driver will not load.
+
+    4.) If you have configured the adapter to run in memory mode but the driver
+        reports it is using IO mode when loading, this is an indication of a
+        memory address conflict.
+
+If an IO conflict occurs, run the CS8900/20 Setup Utility and perform a 
+diagnostic self-test.  Normally, the ISA resource in conflict will fail the 
+self-test.  If so, reconfigure the adapter selecting another choice for the 
+resource in conflict.  Run the diagnostics again to check for further IO 
+conflicts.
+
+In some cases, such as when the PC will not boot, it may be necessary to remove
+the adapter and reconfigure it by installing it in another PC to run the 
+CS8900/20 Setup Utility.  Once reinstalled in the target system, run the 
+diagnostics self-test to ensure the new configuration is free of conflicts 
+before loading the driver again.
+
+When manually configuring the adapter, keep in mind the typical ISA system 
+resource usage as indicated in the tables below.
+
+I/O Address    	Device                        IRQ      Device
+-----------    	--------                      ---      --------
+ 200-20F       	Game I/O adapter               3       COM2, Bus Mouse
+ 230-23F       	Bus Mouse                      4       COM1
+ 270-27F       	LPT3: third parallel port      5       LPT2
+ 2F0-2FF       	COM2: second serial port       6       Floppy Disk controller
+ 320-32F       	Fixed disk controller          7       LPT1
+                                      	       8       Real-time Clock
+                                                 9       EGA/VGA display adapter    
+                                                12       Mouse (PS/2)                              
+Memory Address  Device                          13       Math Coprocessor
+--------------  ---------------------           14       Hard Disk controller
+A000-BFFF	EGA Graphics Adapter
+A000-C7FF	VGA Graphics Adapter
+B000-BFFF	Mono Graphics Adapter
+B800-BFFF	Color Graphics Adapter
+E000-FFFF	AT BIOS
+
+
+
+
+6.0 TECHNICAL SUPPORT
+===============================================================================
+
+6.1 CONTACTING CIRRUS LOGIC'S TECHNICAL SUPPORT
+
+Cirrus Logic's CS89XX Technical Application Support can be reached at:
+
+Telephone  :(800) 888-5016 (from inside U.S. and Canada)
+           :(512) 442-7555 (from outside the U.S. and Canada)
+Fax        :(512) 912-3871
+Email      :ethernet@crystal.cirrus.com
+WWW        :http://www.cirrus.com
+
+
+6.2 INFORMATION REQUIRED BEFORE CONTACTING TECHNICAL SUPPORT
+
+Before contacting Cirrus Logic for technical support, be prepared to provide as 
+Much of the following information as possible. 
+
+1.) Adapter type (CRD8900, CDB8900, CDB8920, etc.)
+
+2.) Adapter configuration
+
+    * IO Base, Memory Base, IO or memory mode enabled, IRQ, DMA channel
+    * Plug and Play enabled/disabled (CS8920-based adapters only)
+    * Configured for media auto-detect or specific media type (which type).    
+
+3.) PC System's Configuration
+
+    * Plug and Play system (yes/no)
+    * BIOS (make and version)
+    * System make and model
+    * CPU (type and speed)
+    * System RAM
+    * SCSI Adapter
+
+4.) Software
+
+    * CS89XX driver and version
+    * Your network operating system and version
+    * Your system's OS version 
+    * Version of all protocol support files
+
+5.) Any Error Message displayed.
+
+
+
+6.3 OBTAINING THE LATEST DRIVER VERSION
+
+You can obtain the latest CS89XX drivers and support software from Cirrus Logic's 
+Web site.  You can also contact Cirrus Logic's Technical Support (email:
+ethernet@crystal.cirrus.com) and request that you be registered for automatic 
+software-update notification.
+
+Cirrus Logic maintains a web page at http://www.cirrus.com with the
+latest drivers and technical publications.
+
+
+6.4 Current maintainer
+
+In February 2000 the maintenance of this driver was assumed by Andrew
+Morton.
+
+6.5 Kernel module parameters
+
+For use in embedded environments with no cs89x0 EEPROM, the kernel boot
+parameter `cs89x0_media=' has been implemented.  Usage is:
+
+	cs89x0_media=rj45    or
+	cs89x0_media=aui     or
+	cs89x0_media=bnc
+
diff --git a/Documentation/networking/cxacru-cf.py b/Documentation/networking/cxacru-cf.py
new file mode 100644
index 000000000..b41d29839
--- /dev/null
+++ b/Documentation/networking/cxacru-cf.py
@@ -0,0 +1,48 @@
+#!/usr/bin/env python
+# Copyright 2009 Simon Arlott
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 2 of the License, or (at your option)
+# any later version.
+#
+# This program is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; if not, write to the Free Software Foundation, Inc., 59
+# Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+#
+# Usage: cxacru-cf.py < cxacru-cf.bin
+# Output: values string suitable for the sysfs adsl_config attribute
+#
+# Warning: cxacru-cf.bin with MD5 hash cdbac2689969d5ed5d4850f117702110
+# contains mis-aligned values which will stop the modem from being able
+# to make a connection. If the first and last two bytes are removed then
+# the values become valid, but the modulation will be forced to ANSI
+# T1.413 only which may not be appropriate.
+#
+# The original binary format is a packed list of le32 values.
+
+import sys
+import struct
+
+i = 0
+while True:
+	buf = sys.stdin.read(4)
+
+	if len(buf) == 0:
+		break
+	elif len(buf) != 4:
+		sys.stdout.write("\n")
+		sys.stderr.write("Error: read {0} not 4 bytes\n".format(len(buf)))
+		sys.exit(1)
+
+	if i > 0:
+		sys.stdout.write(" ")
+	sys.stdout.write("{0:x}={1}".format(i, struct.unpack("<I", buf)[0]))
+	i += 1
+
+sys.stdout.write("\n")
diff --git a/Documentation/networking/cxacru.txt b/Documentation/networking/cxacru.txt
new file mode 100644
index 000000000..2cce04457
--- /dev/null
+++ b/Documentation/networking/cxacru.txt
@@ -0,0 +1,100 @@
+Firmware is required for this device: http://accessrunner.sourceforge.net/
+
+While it is capable of managing/maintaining the ADSL connection without the
+module loaded, the device will sometimes stop responding after unloading the
+driver and it is necessary to unplug/remove power to the device to fix this.
+
+Note: support for cxacru-cf.bin has been removed. It was not loaded correctly
+so it had no effect on the device configuration. Fixing it could have stopped
+existing devices working when an invalid configuration is supplied.
+
+There is a script cxacru-cf.py to convert an existing file to the sysfs form.
+
+Detected devices will appear as ATM devices named "cxacru". In /sys/class/atm/
+these are directories named cxacruN where N is the device number. A symlink
+named device points to the USB interface device's directory which contains
+several sysfs attribute files for retrieving device statistics:
+
+* adsl_controller_version
+
+* adsl_headend
+* adsl_headend_environment
+	Information about the remote headend.
+
+* adsl_config
+	Configuration writing interface.
+	Write parameters in hexadecimal format <index>=<value>,
+	separated by whitespace, e.g.:
+		"1=0 a=5"
+	Up to 7 parameters at a time will be sent and the modem will restart
+	the ADSL connection when any value is set. These are logged for future
+	reference.
+
+* downstream_attenuation (dB)
+* downstream_bits_per_frame
+* downstream_rate (kbps)
+* downstream_snr_margin (dB)
+	Downstream stats.
+
+* upstream_attenuation (dB)
+* upstream_bits_per_frame
+* upstream_rate (kbps)
+* upstream_snr_margin (dB)
+* transmitter_power (dBm/Hz)
+	Upstream stats.
+
+* downstream_crc_errors
+* downstream_fec_errors
+* downstream_hec_errors
+* upstream_crc_errors
+* upstream_fec_errors
+* upstream_hec_errors
+	Error counts.
+
+* line_startable
+	Indicates that ADSL support on the device
+	is/can be enabled, see adsl_start.
+
+* line_status
+	"initialising"
+	"down"
+	"attempting to activate"
+	"training"
+	"channel analysis"
+	"exchange"
+	"waiting"
+	"up"
+
+	Changes between "down" and "attempting to activate"
+	if there is no signal.
+
+* link_status
+	"not connected"
+	"connected"
+	"lost"
+
+* mac_address
+
+* modulation
+	"" (when not connected)
+	"ANSI T1.413"
+	"ITU-T G.992.1 (G.DMT)"
+	"ITU-T G.992.2 (G.LITE)"
+
+* startup_attempts
+	Count of total attempts to initialise ADSL.
+
+To enable/disable ADSL, the following can be written to the adsl_state file:
+	"start"
+	"stop
+	"restart" (stops, waits 1.5s, then starts)
+	"poll" (used to resume status polling if it was disabled due to failure)
+
+Changes in adsl/line state are reported via kernel log messages:
+	[4942145.150704] ATM dev 0: ADSL state: running
+	[4942243.663766] ATM dev 0: ADSL line: down
+	[4942249.665075] ATM dev 0: ADSL line: attempting to activate
+	[4942253.654954] ATM dev 0: ADSL line: training
+	[4942255.666387] ATM dev 0: ADSL line: channel analysis
+	[4942259.656262] ATM dev 0: ADSL line: exchange
+	[2635357.696901] ATM dev 0: ADSL line: up (8128 kb/s down | 832 kb/s up)
diff --git a/Documentation/networking/cxgb.txt b/Documentation/networking/cxgb.txt
new file mode 100644
index 000000000..20a887615
--- /dev/null
+++ b/Documentation/networking/cxgb.txt
@@ -0,0 +1,352 @@
+                 Chelsio N210 10Gb Ethernet Network Controller
+
+                         Driver Release Notes for Linux
+
+                                 Version 2.1.1
+
+                                 June 20, 2005
+
+CONTENTS
+========
+ INTRODUCTION
+ FEATURES
+ PERFORMANCE
+ DRIVER MESSAGES
+ KNOWN ISSUES
+ SUPPORT
+
+
+INTRODUCTION
+============
+
+ This document describes the Linux driver for Chelsio 10Gb Ethernet Network
+ Controller. This driver supports the Chelsio N210 NIC and is backward
+ compatible with the Chelsio N110 model 10Gb NICs.
+
+
+FEATURES
+========
+
+ Adaptive Interrupts (adaptive-rx)
+ ---------------------------------
+
+  This feature provides an adaptive algorithm that adjusts the interrupt
+  coalescing parameters, allowing the driver to dynamically adapt the latency
+  settings to achieve the highest performance during various types of network
+  load.
+
+  The interface used to control this feature is ethtool. Please see the
+  ethtool manpage for additional usage information.
+
+  By default, adaptive-rx is disabled.
+  To enable adaptive-rx:
+
+      ethtool -C <interface> adaptive-rx on
+
+  To disable adaptive-rx, use ethtool:
+
+      ethtool -C <interface> adaptive-rx off
+
+  After disabling adaptive-rx, the timer latency value will be set to 50us.
+  You may set the timer latency after disabling adaptive-rx:
+
+      ethtool -C <interface> rx-usecs <microseconds>
+
+  An example to set the timer latency value to 100us on eth0:
+
+      ethtool -C eth0 rx-usecs 100
+
+  You may also provide a timer latency value while disabling adaptive-rx:
+
+      ethtool -C <interface> adaptive-rx off rx-usecs <microseconds>
+
+  If adaptive-rx is disabled and a timer latency value is specified, the timer
+  will be set to the specified value until changed by the user or until
+  adaptive-rx is enabled.
+
+  To view the status of the adaptive-rx and timer latency values:
+
+      ethtool -c <interface>
+
+
+ TCP Segmentation Offloading (TSO) Support
+ -----------------------------------------
+
+  This feature, also known as "large send", enables a system's protocol stack
+  to offload portions of outbound TCP processing to a network interface card
+  thereby reducing system CPU utilization and enhancing performance.
+
+  The interface used to control this feature is ethtool version 1.8 or higher.
+  Please see the ethtool manpage for additional usage information.
+
+  By default, TSO is enabled.
+  To disable TSO:
+
+      ethtool -K <interface> tso off
+
+  To enable TSO:
+
+      ethtool -K <interface> tso on
+
+  To view the status of TSO:
+
+      ethtool -k <interface>
+
+
+PERFORMANCE
+===========
+
+ The following information is provided as an example of how to change system
+ parameters for "performance tuning" an what value to use. You may or may not
+ want to change these system parameters, depending on your server/workstation
+ application. Doing so is not warranted in any way by Chelsio Communications,
+ and is done at "YOUR OWN RISK". Chelsio will not be held responsible for loss
+ of data or damage to equipment.
+
+ Your distribution may have a different way of doing things, or you may prefer
+ a different method. These commands are shown only to provide an example of
+ what to do and are by no means definitive.
+
+ Making any of the following system changes will only last until you reboot
+ your system. You may want to write a script that runs at boot-up which
+ includes the optimal settings for your system.
+
+  Setting PCI Latency Timer:
+      setpci -d 1425:* 0x0c.l=0x0000F800
+
+  Disabling TCP timestamp:
+      sysctl -w net.ipv4.tcp_timestamps=0
+
+  Disabling SACK:
+      sysctl -w net.ipv4.tcp_sack=0
+
+  Setting large number of incoming connection requests:
+      sysctl -w net.ipv4.tcp_max_syn_backlog=3000
+
+  Setting maximum receive socket buffer size:
+      sysctl -w net.core.rmem_max=1024000
+
+  Setting maximum send socket buffer size:
+      sysctl -w net.core.wmem_max=1024000
+
+  Set smp_affinity (on a multiprocessor system) to a single CPU:
+      echo 1 > /proc/irq/<interrupt_number>/smp_affinity
+
+  Setting default receive socket buffer size:
+      sysctl -w net.core.rmem_default=524287
+
+  Setting default send socket buffer size:
+      sysctl -w net.core.wmem_default=524287
+
+  Setting maximum option memory buffers:
+      sysctl -w net.core.optmem_max=524287
+
+  Setting maximum backlog (# of unprocessed packets before kernel drops):
+      sysctl -w net.core.netdev_max_backlog=300000
+
+  Setting TCP read buffers (min/default/max):
+      sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
+
+  Setting TCP write buffers (min/pressure/max):
+      sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
+
+  Setting TCP buffer space (min/pressure/max):
+      sysctl -w net.ipv4.tcp_mem="10000000 10000000 10000000"
+
+  TCP window size for single connections:
+   The receive buffer (RX_WINDOW) size must be at least as large as the
+   Bandwidth-Delay Product of the communication link between the sender and
+   receiver. Due to the variations of RTT, you may want to increase the buffer
+   size up to 2 times the Bandwidth-Delay Product. Reference page 289 of
+   "TCP/IP Illustrated, Volume 1, The Protocols" by W. Richard Stevens.
+   At 10Gb speeds, use the following formula:
+       RX_WINDOW >= 1.25MBytes * RTT(in milliseconds)
+       Example for RTT with 100us: RX_WINDOW = (1,250,000 * 0.1) = 125,000
+   RX_WINDOW sizes of 256KB - 512KB should be sufficient.
+   Setting the min, max, and default receive buffer (RX_WINDOW) size:
+       sysctl -w net.ipv4.tcp_rmem="<min> <default> <max>"
+
+  TCP window size for multiple connections:
+   The receive buffer (RX_WINDOW) size may be calculated the same as single
+   connections, but should be divided by the number of connections. The
+   smaller window prevents congestion and facilitates better pacing,
+   especially if/when MAC level flow control does not work well or when it is
+   not supported on the machine. Experimentation may be necessary to attain
+   the correct value. This method is provided as a starting point for the
+   correct receive buffer size.
+   Setting the min, max, and default receive buffer (RX_WINDOW) size is
+   performed in the same manner as single connection.
+
+
+DRIVER MESSAGES
+===============
+
+ The following messages are the most common messages logged by syslog. These
+ may be found in /var/log/messages.
+
+  Driver up:
+     Chelsio Network Driver - version 2.1.1
+
+  NIC detected:
+     eth#: Chelsio N210 1x10GBaseX NIC (rev #), PCIX 133MHz/64-bit
+
+  Link up:
+     eth#: link is up at 10 Gbps, full duplex
+
+  Link down:
+     eth#: link is down
+
+
+KNOWN ISSUES
+============
+
+ These issues have been identified during testing. The following information
+ is provided as a workaround to the problem. In some cases, this problem is
+ inherent to Linux or to a particular Linux Distribution and/or hardware
+ platform.
+
+  1. Large number of TCP retransmits on a multiprocessor (SMP) system.
+
+      On a system with multiple CPUs, the interrupt (IRQ) for the network
+      controller may be bound to more than one CPU. This will cause TCP
+      retransmits if the packet data were to be split across different CPUs
+      and re-assembled in a different order than expected.
+
+      To eliminate the TCP retransmits, set smp_affinity on the particular
+      interrupt to a single CPU. You can locate the interrupt (IRQ) used on
+      the N110/N210 by using ifconfig:
+          ifconfig <dev_name> | grep Interrupt
+      Set the smp_affinity to a single CPU:
+          echo 1 > /proc/irq/<interrupt_number>/smp_affinity
+
+      It is highly suggested that you do not run the irqbalance daemon on your
+      system, as this will change any smp_affinity setting you have applied.
+      The irqbalance daemon runs on a 10 second interval and binds interrupts
+      to the least loaded CPU determined by the daemon. To disable this daemon:
+          chkconfig --level 2345 irqbalance off
+
+      By default, some Linux distributions enable the kernel feature,
+      irqbalance, which performs the same function as the daemon. To disable
+      this feature, add the following line to your bootloader:
+          noirqbalance
+
+          Example using the Grub bootloader:
+              title Red Hat Enterprise Linux AS (2.4.21-27.ELsmp)
+              root (hd0,0)
+              kernel /vmlinuz-2.4.21-27.ELsmp ro root=/dev/hda3 noirqbalance
+              initrd /initrd-2.4.21-27.ELsmp.img
+
+  2. After running insmod, the driver is loaded and the incorrect network
+     interface is brought up without running ifup.
+
+      When using 2.4.x kernels, including RHEL kernels, the Linux kernel
+      invokes a script named "hotplug". This script is primarily used to
+      automatically bring up USB devices when they are plugged in, however,
+      the script also attempts to automatically bring up a network interface
+      after loading the kernel module. The hotplug script does this by scanning
+      the ifcfg-eth# config files in /etc/sysconfig/network-scripts, looking
+      for HWADDR=<mac_address>.
+
+      If the hotplug script does not find the HWADDRR within any of the
+      ifcfg-eth# files, it will bring up the device with the next available
+      interface name. If this interface is already configured for a different
+      network card, your new interface will have incorrect IP address and
+      network settings.
+
+      To solve this issue, you can add the HWADDR=<mac_address> key to the
+      interface config file of your network controller.
+
+      To disable this "hotplug" feature, you may add the driver (module name)
+      to the "blacklist" file located in /etc/hotplug. It has been noted that
+      this does not work for network devices because the net.agent script
+      does not use the blacklist file. Simply remove, or rename, the net.agent
+      script located in /etc/hotplug to disable this feature.
+
+  3. Transport Protocol (TP) hangs when running heavy multi-connection traffic
+     on an AMD Opteron system with HyperTransport PCI-X Tunnel chipset.
+
+      If your AMD Opteron system uses the AMD-8131 HyperTransport PCI-X Tunnel
+      chipset, you may experience the "133-Mhz Mode Split Completion Data
+      Corruption" bug identified by AMD while using a 133Mhz PCI-X card on the
+      bus PCI-X bus.
+
+      AMD states, "Under highly specific conditions, the AMD-8131 PCI-X Tunnel
+      can provide stale data via split completion cycles to a PCI-X card that
+      is operating at 133 Mhz", causing data corruption.
+
+      AMD's provides three workarounds for this problem, however, Chelsio
+      recommends the first option for best performance with this bug:
+
+        For 133Mhz secondary bus operation, limit the transaction length and
+        the number of outstanding transactions, via BIOS configuration
+        programming of the PCI-X card, to the following:
+
+           Data Length (bytes): 1k
+           Total allowed outstanding transactions: 2
+
+      Please refer to AMD 8131-HT/PCI-X Errata 26310 Rev 3.08 August 2004,
+      section 56, "133-MHz Mode Split Completion Data Corruption" for more
+      details with this bug and workarounds suggested by AMD.
+
+      It may be possible to work outside AMD's recommended PCI-X settings, try
+      increasing the Data Length to 2k bytes for increased performance. If you
+      have issues with these settings, please revert to the "safe" settings
+      and duplicate the problem before submitting a bug or asking for support.
+
+      NOTE: The default setting on most systems is 8 outstanding transactions
+            and 2k bytes data length.
+
+  4. On multiprocessor systems, it has been noted that an application which
+     is handling 10Gb networking can switch between CPUs causing degraded
+     and/or unstable performance.
+
+      If running on an SMP system and taking performance measurements, it
+      is suggested you either run the latest netperf-2.4.0+ or use a binding
+      tool such as Tim Hockin's procstate utilities (runon)
+      <http://www.hockin.org/~thockin/procstate/>.
+
+      Binding netserver and netperf (or other applications) to particular
+      CPUs will have a significant difference in performance measurements.
+      You may need to experiment which CPU to bind the application to in
+      order to achieve the best performance for your system.
+
+      If you are developing an application designed for 10Gb networking,
+      please keep in mind you may want to look at kernel functions
+      sched_setaffinity & sched_getaffinity to bind your application.
+
+      If you are just running user-space applications such as ftp, telnet,
+      etc., you may want to try the runon tool provided by Tim Hockin's
+      procstate utility. You could also try binding the interface to a
+      particular CPU: runon 0 ifup eth0
+
+
+SUPPORT
+=======
+
+ If you have problems with the software or hardware, please contact our
+ customer support team via email at support@chelsio.com or check our website
+ at http://www.chelsio.com
+
+===============================================================================
+
+ Chelsio Communications
+ 370 San Aleso Ave.
+ Suite 100
+ Sunnyvale, CA 94085
+ http://www.chelsio.com
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License, version 2, as
+published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License along
+with this program; if not, write to the Free Software Foundation, Inc.,
+59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+
+THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
+WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+ Copyright (c) 2003-2005 Chelsio Communications. All rights reserved.
+
+===============================================================================
diff --git a/Documentation/networking/dccp.txt b/Documentation/networking/dccp.txt
new file mode 100644
index 000000000..55c575fca
--- /dev/null
+++ b/Documentation/networking/dccp.txt
@@ -0,0 +1,207 @@
+DCCP protocol
+=============
+
+
+Contents
+========
+- Introduction
+- Missing features
+- Socket options
+- Sysctl variables
+- IOCTLs
+- Other tunables
+- Notes
+
+
+Introduction
+============
+Datagram Congestion Control Protocol (DCCP) is an unreliable, connection
+oriented protocol designed to solve issues present in UDP and TCP, particularly
+for real-time and multimedia (streaming) traffic.
+It divides into a base protocol (RFC 4340) and pluggable congestion control
+modules called CCIDs. Like pluggable TCP congestion control, at least one CCID
+needs to be enabled in order for the protocol to function properly. In the Linux
+implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as
+the TCP-friendly CCID3 (RFC 4342), are optional.
+For a brief introduction to CCIDs and suggestions for choosing a CCID to match
+given applications, see section 10 of RFC 4340.
+
+It has a base protocol and pluggable congestion control IDs (CCIDs).
+
+DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
+is at http://www.ietf.org/html.charters/dccp-charter.html
+
+
+Missing features
+================
+The Linux DCCP implementation does not currently support all the features that are
+specified in RFCs 4340...42.
+
+The known bugs are at:
+	http://www.linuxfoundation.org/collaborate/workgroups/networking/todo#DCCP
+
+For more up-to-date versions of the DCCP implementation, please consider using
+the experimental DCCP test tree; instructions for checking this out are on:
+http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree
+
+
+Socket options
+==============
+DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes
+a policy ID as argument and can only be set before the connection (i.e. changes
+during an established connection are not supported). Currently, two policies are
+defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special,
+and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an
+u32 priority value as ancillary data to sendmsg(), where higher numbers indicate
+a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to
+be formatted using a cmsg(3) message header filled in as follows:
+	cmsg->cmsg_level = SOL_DCCP;
+	cmsg->cmsg_type	 = DCCP_SCM_PRIORITY;
+	cmsg->cmsg_len	 = CMSG_LEN(sizeof(uint32_t));	/* or CMSG_LEN(4) */
+
+DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero
+value is always interpreted as unbounded queue length. If different from zero,
+the interpretation of this parameter depends on the current dequeuing policy
+(see above): the "simple" policy will enforce a fixed queue size by returning
+EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the
+lowest-priority packet first. The default value for this parameter is
+initialised from /proc/sys/net/dccp/default/tx_qlen.
+
+DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
+service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
+the socket will fall back to 0 (which means that no meaningful service code
+is present). On active sockets this is set before connect(); specifying more
+than one code has no effect (all subsequent service codes are ignored). The
+case is different for passive sockets, where multiple service codes (up to 32)
+can be set before calling bind().
+
+DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
+size (application payload size) in bytes, see RFC 4340, section 14.
+
+DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
+supported by the endpoint. The option value is an array of type uint8_t whose
+size is passed as option length. The minimum array size is 4 elements, the
+value returned in the optlen argument always reflects the true number of
+built-in CCIDs.
+
+DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
+time, combining the operation of the next two socket options. This option is
+preferable over the latter two, since often applications will use the same
+type of CCID for both directions; and mixed use of CCIDs is not currently well
+understood. This socket option takes as argument at least one uint8_t value, or
+an array of uint8_t values, which must match available CCIDS (see above). CCIDs
+must be registered on the socket before calling connect() or listen().
+
+DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
+the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
+Please note that the getsockopt argument type here is `int', not uint8_t.
+
+DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.
+
+DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
+timewait state when closing the connection (RFC 4340, 8.3). The usual case is
+that the closing server sends a CloseReq, whereupon the client holds timewait
+state. When this boolean socket option is on, the server sends a Close instead
+and will enter TIMEWAIT. This option must be set after accept() returns.
+
+DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
+partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
+always cover the entire packet and that only fully covered application data is
+accepted by the receiver. Hence, when using this feature on the sender, it must
+be enabled at the receiver, too with suitable choice of CsCov.
+
+DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
+	range 0..15 are acceptable. The default setting is 0 (full coverage),
+	values between 1..15 indicate partial coverage.
+DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
+	sets a threshold, where again values 0..15 are acceptable. The default
+	of 0 means that all packets with a partial coverage will be discarded.
+	Values in the range 1..15 indicate that packets with minimally such a
+	coverage value are also acceptable. The higher the number, the more
+	restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
+	settings are inherited to the child socket after accept().
+
+The following two options apply to CCID 3 exclusively and are getsockopt()-only.
+In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.
+DCCP_SOCKOPT_CCID_RX_INFO
+	Returns a `struct tfrc_rx_info' in optval; the buffer for optval and
+	optlen must be set to at least sizeof(struct tfrc_rx_info).
+DCCP_SOCKOPT_CCID_TX_INFO
+	Returns a `struct tfrc_tx_info' in optval; the buffer for optval and
+	optlen must be set to at least sizeof(struct tfrc_tx_info).
+
+On unidirectional connections it is useful to close the unused half-connection
+via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.
+
+
+Sysctl variables
+================
+Several DCCP default parameters can be managed by the following sysctls
+(sysctl net.dccp.default or /proc/sys/net/dccp/default):
+
+request_retries
+	The number of active connection initiation retries (the number of
+	Requests minus one) before timing out. In addition, it also governs
+	the behaviour of the other, passive side: this variable also sets
+	the number of times DCCP repeats sending a Response when the initial
+	handshake does not progress from RESPOND to OPEN (i.e. when no Ack
+	is received after the initial Request).  This value should be greater
+	than 0, suggested is less than 10. Analogue of tcp_syn_retries.
+
+retries1
+	How often a DCCP Response is retransmitted until the listening DCCP
+	side considers its connecting peer dead. Analogue of tcp_retries1.
+
+retries2
+	The number of times a general DCCP packet is retransmitted. This has
+	importance for retransmitted acknowledgments and feature negotiation,
+	data packets are never retransmitted. Analogue of tcp_retries2.
+
+tx_ccid = 2
+	Default CCID for the sender-receiver half-connection. Depending on the
+	choice of CCID, the Send Ack Vector feature is enabled automatically.
+
+rx_ccid = 2
+	Default CCID for the receiver-sender half-connection; see tx_ccid.
+
+seq_window = 100
+	The initial sequence window (sec. 7.5.2) of the sender. This influences
+	the local ackno validity and the remote seqno validity windows (7.5.1).
+	Values in the range Wmin = 32 (RFC 4340, 7.5.2) up to 2^32-1 can be set.
+
+tx_qlen = 5
+	The size of the transmit buffer in packets. A value of 0 corresponds
+	to an unbounded transmit buffer.
+
+sync_ratelimit = 125 ms
+	The timeout between subsequent DCCP-Sync packets sent in response to
+	sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
+	of this parameter is milliseconds; a value of 0 disables rate-limiting.
+
+
+IOCTLS
+======
+FIONREAD
+	Works as in udp(7): returns in the `int' argument pointer the size of
+	the next pending datagram in bytes, or 0 when no datagram is pending.
+
+
+Other tunables
+==============
+Per-route rto_min support
+	CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
+	of the RTO timer. This setting can be modified via the 'rto_min' option
+	of iproute2; for example:
+		> ip route change 10.0.0.0/24   rto_min 250j dev wlan0
+		> ip route add    10.0.0.254/32 rto_min 800j dev wlan0
+		> ip route show dev wlan0
+	CCID-3 also supports the rto_min setting: it is used to define the lower
+	bound for the expiry of the nofeedback timer. This can be useful on LANs
+	with very low RTTs (e.g., loopback, Gbit ethernet).
+
+
+Notes
+=====
+DCCP does not travel through NAT successfully at present on many boxes. This is
+because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
+support for DCCP has been added.
diff --git a/Documentation/networking/dctcp.txt b/Documentation/networking/dctcp.txt
new file mode 100644
index 000000000..13a857753
--- /dev/null
+++ b/Documentation/networking/dctcp.txt
@@ -0,0 +1,44 @@
+DCTCP (DataCenter TCP)
+----------------------
+
+DCTCP is an enhancement to the TCP congestion control algorithm for data
+center networks and leverages Explicit Congestion Notification (ECN) in
+the data center network to provide multi-bit feedback to the end hosts.
+
+To enable it on end hosts:
+
+  sysctl -w net.ipv4.tcp_congestion_control=dctcp
+  sysctl -w net.ipv4.tcp_ecn_fallback=0 (optional)
+
+All switches in the data center network running DCTCP must support ECN
+marking and be configured for marking when reaching defined switch buffer
+thresholds. The default ECN marking threshold heuristic for DCTCP on
+switches is 20 packets (30KB) at 1Gbps, and 65 packets (~100KB) at 10Gbps,
+but might need further careful tweaking.
+
+For more details, see below documents:
+
+Paper:
+
+The algorithm is further described in detail in the following two
+SIGCOMM/SIGMETRICS papers:
+
+ i) Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye,
+    Parveen Patel, Balaji Prabhakar, Sudipta Sengupta, and Murari Sridharan:
+      "Data Center TCP (DCTCP)", Data Center Networks session
+      Proc. ACM SIGCOMM, New Delhi, 2010.
+    http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp-final.pdf
+    http://www.sigcomm.org/ccr/papers/2010/October/1851275.1851192
+
+ii) Mohammad Alizadeh, Adel Javanmard, and Balaji Prabhakar:
+      "Analysis of DCTCP: Stability, Convergence, and Fairness"
+      Proc. ACM SIGMETRICS, San Jose, 2011.
+    http://simula.stanford.edu/~alizade/Site/DCTCP_files/dctcp_analysis-full.pdf
+
+IETF informational draft:
+
+  http://tools.ietf.org/html/draft-bensley-tcpm-dctcp-00
+
+DCTCP site:
+
+  http://simula.stanford.edu/~alizade/Site/DCTCP.html
diff --git a/Documentation/networking/de4x5.txt b/Documentation/networking/de4x5.txt
new file mode 100644
index 000000000..c8e4ca9b2
--- /dev/null
+++ b/Documentation/networking/de4x5.txt
@@ -0,0 +1,178 @@
+    Originally,   this  driver  was    written  for the  Digital   Equipment
+    Corporation series of EtherWORKS Ethernet cards:
+
+        DE425 TP/COAX EISA
+	DE434 TP PCI
+	DE435 TP/COAX/AUI PCI
+	DE450 TP/COAX/AUI PCI
+	DE500 10/100 PCI Fasternet
+
+    but it  will  now attempt  to  support all  cards which   conform to the
+    Digital Semiconductor   SROM   Specification.    The  driver   currently
+    recognises the following chips:
+
+        DC21040  (no SROM) 
+	DC21041[A]  
+	DC21140[A] 
+	DC21142 
+	DC21143 
+
+    So far the driver is known to work with the following cards:
+
+        KINGSTON
+	Linksys
+	ZNYX342
+	SMC8432
+	SMC9332 (w/new SROM)
+	ZNYX31[45]
+	ZNYX346 10/100 4 port (can act as a 10/100 bridge!) 
+
+    The driver has been tested on a relatively busy network using the DE425,
+    DE434, DE435 and DE500 cards and benchmarked with 'ttcp': it transferred
+    16M of data to a DECstation 5000/200 as follows:
+
+                TCP           UDP
+             TX     RX     TX     RX
+    DE425   1030k  997k   1170k  1128k
+    DE434   1063k  995k   1170k  1125k
+    DE435   1063k  995k   1170k  1125k
+    DE500   1063k  998k   1170k  1125k  in 10Mb/s mode
+
+    All  values are typical (in   kBytes/sec) from a  sample  of 4 for  each
+    measurement. Their error is +/-20k on a quiet (private) network and also
+    depend on what load the CPU has.
+
+    =========================================================================
+
+    The ability to load this  driver as a loadable  module has been included
+    and used extensively  during the driver development  (to save those long
+    reboot sequences).  Loadable module support  under PCI and EISA has been
+    achieved by letting the driver autoprobe as if it were compiled into the
+    kernel. Do make sure  you're not sharing  interrupts with anything  that
+    cannot accommodate  interrupt  sharing!
+
+    To utilise this ability, you have to do 8 things:
+
+    0) have a copy of the loadable modules code installed on your system.
+    1) copy de4x5.c from the  /linux/drivers/net directory to your favourite
+    temporary directory.
+    2) for fixed  autoprobes (not  recommended),  edit the source code  near
+    line 5594 to reflect the I/O address  you're using, or assign these when
+    loading by:
+
+                   insmod de4x5 io=0xghh           where g = bus number
+		                                        hh = device number   
+
+       NB: autoprobing for modules is now supported by default. You may just
+           use:
+
+                   insmod de4x5
+
+           to load all available boards. For a specific board, still use
+	   the 'io=?' above.
+    3) compile  de4x5.c, but include -DMODULE in  the command line to ensure
+    that the correct bits are compiled (see end of source code).
+    4) if you are wanting to add a new  card, goto 5. Otherwise, recompile a
+    kernel with the de4x5 configuration turned off and reboot.
+    5) insmod de4x5 [io=0xghh]
+    6) run the net startup bits for your new eth?? interface(s) manually 
+    (usually /etc/rc.inet[12] at boot time). 
+    7) enjoy!
+
+    To unload a module, turn off the associated interface(s) 
+    'ifconfig eth?? down' then 'rmmod de4x5'.
+
+    Automedia detection is included so that in  principle you can disconnect
+    from, e.g.  TP, reconnect  to BNC  and  things will still work  (after a
+    pause whilst the   driver figures out   where its media went).  My tests
+    using ping showed that it appears to work....
+
+    By  default,  the driver will  now   autodetect any  DECchip based card.
+    Should you have a need to restrict the driver to DIGITAL only cards, you
+    can compile with a  DEC_ONLY define, or if  loading as a module, use the
+    'dec_only=1'  parameter. 
+
+    I've changed the timing routines to  use the kernel timer and scheduling
+    functions  so that the  hangs  and other assorted problems that occurred
+    while autosensing the  media  should be gone.  A  bonus  for the DC21040
+    auto  media sense algorithm is  that it can now  use one that is more in
+    line with the  rest (the DC21040  chip doesn't  have a hardware  timer).
+    The downside is the 1 'jiffies' (10ms) resolution.
+
+    IEEE 802.3u MII interface code has  been added in anticipation that some
+    products may use it in the future.
+
+    The SMC9332 card  has a non-compliant SROM  which needs fixing -  I have
+    patched this  driver to detect it  because the SROM format used complies
+    to a previous DEC-STD format.
+
+    I have removed the buffer copies needed for receive on Intels.  I cannot
+    remove them for   Alphas since  the  Tulip hardware   only does longword
+    aligned  DMA transfers  and  the  Alphas get   alignment traps with  non
+    longword aligned data copies (which makes them really slow). No comment.
+
+    I  have added SROM decoding  routines to make this  driver work with any
+    card that  supports the Digital  Semiconductor SROM spec. This will help
+    all  cards running the dc2114x  series chips in particular.  Cards using
+    the dc2104x  chips should run correctly with  the basic  driver.  I'm in
+    debt to <mjacob@feral.com> for the  testing and feedback that helped get
+    this feature working.  So far we have  tested KINGSTON, SMC8432, SMC9332
+    (with the latest SROM complying  with the SROM spec  V3: their first was
+    broken), ZNYX342  and  LinkSys. ZNYX314 (dual  21041  MAC) and  ZNYX 315
+    (quad 21041 MAC)  cards also  appear  to work despite their  incorrectly
+    wired IRQs.
+
+    I have added a temporary fix for interrupt problems when some SCSI cards
+    share the same interrupt as the DECchip based  cards. The problem occurs
+    because  the SCSI card wants to  grab the interrupt  as a fast interrupt
+    (runs the   service routine with interrupts turned   off) vs.  this card
+    which really needs to run the service routine with interrupts turned on.
+    This driver will  now   add the interrupt service   routine  as  a  fast
+    interrupt if it   is bounced from the   slow interrupt.  THIS IS NOT   A
+    RECOMMENDED WAY TO RUN THE DRIVER  and has been done  for a limited time
+    until  people   sort  out their  compatibility    issues and the  kernel
+    interrupt  service code  is  fixed.   YOU  SHOULD SEPARATE OUT  THE FAST
+    INTERRUPT CARDS FROM THE SLOW INTERRUPT CARDS to ensure that they do not
+    run on the same interrupt. PCMCIA/CardBus is another can of worms...
+
+    Finally, I think  I have really  fixed  the module  loading problem with
+    more than one DECchip based  card.  As a  side effect, I don't mess with
+    the  device structure any  more which means that  if more than 1 card in
+    2.0.x is    installed (4  in   2.1.x),  the  user   will have   to  edit
+    linux/drivers/net/Space.c  to make room for  them. Hence, module loading
+    is  the preferred way to use   this driver, since  it  doesn't have this
+    limitation.
+
+    Where SROM media  detection is used and  full duplex is specified in the
+    SROM,  the feature is  ignored unless  lp->params.fdx  is set at compile
+    time  OR during  a   module load  (insmod  de4x5   args='eth??:fdx' [see
+    below]).  This is because there  is no way  to automatically detect full
+    duplex   links  except through   autonegotiation.    When I  include the
+    autonegotiation feature in  the SROM autoconf  code, this detection will
+    occur automatically for that case.
+
+    Command line  arguments are  now allowed, similar to  passing  arguments
+    through LILO. This will allow a per adapter board set  up of full duplex
+    and media. The only lexical constraints are:  the board name (dev->name)
+    appears in  the list before its parameters.  The list of parameters ends
+    either at the end of the parameter list or with another board name.  The
+    following parameters are allowed:
+
+            fdx        for full duplex
+	    autosense  to set the media/speed; with the following 
+	               sub-parameters:
+		       TP, TP_NW, BNC, AUI, BNC_AUI, 100Mb, 10Mb, AUTO
+
+    Case sensitivity is important  for  the sub-parameters. They *must*   be
+    upper case. Examples:
+
+        insmod de4x5 args='eth1:fdx autosense=BNC eth0:autosense=100Mb'.
+
+    For a compiled in driver, in linux/drivers/net/CONFIG, place e.g.
+	DE4X5_OPTS = -DDE4X5_PARM='"eth0:fdx autosense=AUI eth2:autosense=TP"' 
+
+    Yes,  I know full duplex  isn't permissible on BNC  or AUI; they're just
+    examples. By default, full duplex is turned  off and AUTO is the default
+    autosense setting. In  reality, I expect only the  full duplex option to
+    be used. Note the use of single quotes in the two examples above and the
+    lack of commas to separate items.
diff --git a/Documentation/networking/decnet.txt b/Documentation/networking/decnet.txt
new file mode 100644
index 000000000..e12a4900c
--- /dev/null
+++ b/Documentation/networking/decnet.txt
@@ -0,0 +1,232 @@
+                    Linux DECnet Networking Layer Information
+                   ===========================================
+
+1) Other documentation....
+
+   o Project Home Pages
+       http://www.chygwyn.com/                      	    - Kernel info
+       http://linux-decnet.sourceforge.net/                - Userland tools
+       http://www.sourceforge.net/projects/linux-decnet/   - Status page
+
+2) Configuring the kernel
+
+Be sure to turn on the following options:
+
+    CONFIG_DECNET (obviously)
+    CONFIG_PROC_FS (to see what's going on)
+    CONFIG_SYSCTL (for easy configuration)
+
+if you want to try out router support (not properly debugged yet)
+you'll need the following options as well...
+
+    CONFIG_DECNET_ROUTER (to be able to add/delete routes)
+    CONFIG_NETFILTER (will be required for the DECnet routing daemon)
+
+    CONFIG_DECNET_ROUTE_FWMARK is optional
+
+Don't turn on SIOCGIFCONF support for DECnet unless you are really sure
+that you need it, in general you won't and it can cause ifconfig to
+malfunction.
+
+Run time configuration has changed slightly from the 2.4 system. If you
+want to configure an endnode, then the simplified procedure is as follows:
+
+ o Set the MAC address on your ethernet card before starting _any_ other
+   network protocols.
+
+As soon as your network card is brought into the UP state, DECnet should
+start working. If you need something more complicated or are unsure how
+to set the MAC address, see the next section. Also all configurations which
+worked with 2.4 will work under 2.5 with no change.
+
+3) Command line options
+
+You can set a DECnet address on the kernel command line for compatibility
+with the 2.4 configuration procedure, but in general it's not needed any more.
+If you do st a DECnet address on the command line, it has only one purpose
+which is that its added to the addresses on the loopback device.
+
+With 2.4 kernels, DECnet would only recognise addresses as local if they
+were added to the loopback device. In 2.5, any local interface address
+can be used to loop back to the local machine. Of course this does not
+prevent you adding further addresses to the loopback device if you
+want to.
+
+N.B. Since the address list of an interface determines the addresses for
+which "hello" messages are sent, if you don't set an address on the loopback
+interface then you won't see any entries in /proc/net/neigh for the local
+host until such time as you start a connection. This doesn't affect the
+operation of the local communications in any other way though.
+
+The kernel command line takes options looking like the following:
+
+    decnet.addr=1,2
+
+the two numbers are the node address 1,2 = 1.2 For 2.2.xx kernels
+and early 2.3.xx kernels, you must use a comma when specifying the
+DECnet address like this. For more recent 2.3.xx kernels, you may
+use almost any character except space, although a `.` would be the most
+obvious choice :-)
+
+There used to be a third number specifying the node type. This option
+has gone away in favour of a per interface node type. This is now set
+using /proc/sys/net/decnet/conf/<dev>/forwarding. This file can be
+set with a single digit, 0=EndNode, 1=L1 Router and  2=L2 Router.
+
+There are also equivalent options for modules. The node address can
+also be set through the /proc/sys/net/decnet/ files, as can other system
+parameters.
+
+Currently the only supported devices are ethernet and ip_gre. The
+ethernet address of your ethernet card has to be set according to the DECnet
+address of the node in order for it to be autoconfigured (and then appear in
+/proc/net/decnet_dev). There is a utility available at the above
+FTP sites called dn2ethaddr which can compute the correct ethernet
+address to use. The address can be set by ifconfig either before or
+at the time the device is brought up. If you are using RedHat you can
+add the line:
+
+    MACADDR=AA:00:04:00:03:04
+
+or something similar, to /etc/sysconfig/network-scripts/ifcfg-eth0 or
+wherever your network card's configuration lives. Setting the MAC address
+of your ethernet card to an address starting with "hi-ord" will cause a
+DECnet address which matches to be added to the interface (which you can
+verify with iproute2).
+
+The default device for routing can be set through the /proc filesystem
+by setting /proc/sys/net/decnet/default_device to the
+device you want DECnet to route packets out of when no specific route
+is available. Usually this will be eth0, for example:
+
+    echo -n "eth0" >/proc/sys/net/decnet/default_device
+
+If you don't set the default device, then it will default to the first
+ethernet card which has been autoconfigured as described above. You can
+confirm that by looking in the default_device file of course.
+
+There is a list of what the other files under /proc/sys/net/decnet/ do
+on the kernel patch web site (shown above).
+
+4) Run time kernel configuration
+
+This is either done through the sysctl/proc interface (see the kernel web
+pages for details on what the various options do) or through the iproute2
+package in the same way as IPv4/6 configuration is performed.
+
+Documentation for iproute2 is included with the package, although there is
+as yet no specific section on DECnet, most of the features apply to both
+IP and DECnet, albeit with DECnet addresses instead of IP addresses and
+a reduced functionality.
+
+If you want to configure a DECnet router you'll need the iproute2 package
+since its the _only_ way to add and delete routes currently. Eventually
+there will be a routing daemon to send and receive routing messages for
+each interface and update the kernel routing tables accordingly. The
+routing daemon will use netfilter to listen to routing packets, and
+rtnetlink to update the kernels routing tables. 
+
+The DECnet raw socket layer has been removed since it was there purely
+for use by the routing daemon which will now use netfilter (a much cleaner
+and more generic solution) instead.
+
+5) How can I tell if its working ?
+
+Here is a quick guide of what to look for in order to know if your DECnet
+kernel subsystem is working.
+
+   - Is the node address set (see /proc/sys/net/decnet/node_address)
+   - Is the node of the correct type 
+                             (see /proc/sys/net/decnet/conf/<dev>/forwarding)
+   - Is the Ethernet MAC address of each Ethernet card set to match
+     the DECnet address. If in doubt use the dn2ethaddr utility available
+     at the ftp archive.
+   - If the previous two steps are satisfied, and the Ethernet card is up,
+     you should find that it is listed in /proc/net/decnet_dev and also
+     that it appears as a directory in /proc/sys/net/decnet/conf/. The
+     loopback device (lo) should also appear and is required to communicate
+     within a node.
+   - If you have any DECnet routers on your network, they should appear
+     in /proc/net/decnet_neigh, otherwise this file will only contain the
+     entry for the node itself (if it doesn't check to see if lo is up).
+   - If you want to send to any node which is not listed in the
+     /proc/net/decnet_neigh file, you'll need to set the default device
+     to point to an Ethernet card with connection to a router. This is
+     again done with the /proc/sys/net/decnet/default_device file.
+   - Try starting a simple server and client, like the dnping/dnmirror
+     over the loopback interface. With luck they should communicate.
+     For this step and those after, you'll need the DECnet library
+     which can be obtained from the above ftp sites as well as the
+     actual utilities themselves.
+   - If this seems to work, then try talking to a node on your local
+     network, and see if you can obtain the same results.
+   - At this point you are on your own... :-)
+
+6) How to send a bug report
+
+If you've found a bug and want to report it, then there are several things
+you can do to help me work out exactly what it is that is wrong. Useful
+information (_most_ of which _is_ _essential_) includes:
+
+ - What kernel version are you running ?
+ - What version of the patch are you running ?
+ - How far though the above set of tests can you get ?
+ - What is in the /proc/decnet* files and /proc/sys/net/decnet/* files ?
+ - Which services are you running ?
+ - Which client caused the problem ?
+ - How much data was being transferred ?
+ - Was the network congested ?
+ - How can the problem be reproduced ?
+ - Can you use tcpdump to get a trace ? (N.B. Most (all?) versions of 
+   tcpdump don't understand how to dump DECnet properly, so including
+   the hex listing of the packet contents is _essential_, usually the -x flag.
+   You may also need to increase the length grabbed with the -s flag. The
+   -e flag also provides very useful information (ethernet MAC addresses))
+
+7) MAC FAQ
+
+A quick FAQ on ethernet MAC addresses to explain how Linux and DECnet
+interact and how to get the best performance from your hardware. 
+
+Ethernet cards are designed to normally only pass received network frames 
+to a host computer when they are addressed to it, or to the broadcast address.
+
+Linux has an interface which allows the setting of extra addresses for
+an ethernet card to listen to. If the ethernet card supports it, the
+filtering operation will be done in hardware, if not the extra unwanted packets
+received will be discarded by the host computer. In the latter case,
+significant processor time and bus bandwidth can be used up on a busy
+network (see the NAPI documentation for a longer explanation of these
+effects).
+
+DECnet makes use of this interface to allow running DECnet on an ethernet 
+card which has already been configured using TCP/IP (presumably using the 
+built in MAC address of the card, as usual) and/or to allow multiple DECnet
+addresses on each physical interface. If you do this, be aware that if your
+ethernet card doesn't support perfect hashing in its MAC address filter
+then your computer will be doing more work than required. Some cards
+will simply set themselves into promiscuous mode in order to receive
+packets from the DECnet specified addresses. So if you have one of these
+cards its better to set the MAC address of the card as described above
+to gain the best efficiency. Better still is to use a card which supports
+NAPI as well.
+
+
+8) Mailing list
+
+If you are keen to get involved in development, or want to ask questions
+about configuration, or even just report bugs, then there is a mailing
+list that you can join, details are at:
+
+http://sourceforge.net/mail/?group_id=4993
+
+9) Legal Info
+
+The Linux DECnet project team have placed their code under the GPL. The
+software is provided "as is" and without warranty express or implied.
+DECnet is a trademark of Compaq. This software is not a product of
+Compaq. We acknowledge the help of people at Compaq in providing extra
+documentation above and beyond what was previously publicly available.
+
+Steve Whitehouse <SteveW@ACM.org>
+
diff --git a/Documentation/networking/dl2k.txt b/Documentation/networking/dl2k.txt
new file mode 100644
index 000000000..cba74f7a3
--- /dev/null
+++ b/Documentation/networking/dl2k.txt
@@ -0,0 +1,282 @@
+
+    D-Link DL2000-based Gigabit Ethernet Adapter Installation
+    for Linux
+    May 23, 2002
+
+Contents
+========
+ - Compatibility List
+ - Quick Install
+ - Compiling the Driver
+ - Installing the Driver
+ - Option parameter
+ - Configuration Script Sample
+ - Troubleshooting
+
+
+Compatibility List
+=================
+Adapter Support:
+
+D-Link DGE-550T Gigabit Ethernet Adapter.
+D-Link DGE-550SX Gigabit Ethernet Adapter.
+D-Link DL2000-based Gigabit Ethernet Adapter.
+
+
+The driver support Linux kernel 2.4.7 later. We had tested it
+on the environments below.
+
+ . Red Hat v6.2 (update kernel to 2.4.7)
+ . Red Hat v7.0 (update kernel to 2.4.7)
+ . Red Hat v7.1 (kernel 2.4.7)
+ . Red Hat v7.2 (kernel 2.4.7-10)
+
+
+Quick Install
+=============
+Install linux driver as following command:
+
+1. make all
+2. insmod dl2k.ko
+3. ifconfig eth0 up 10.xxx.xxx.xxx netmask 255.0.0.0
+		    ^^^^^^^^^^^^^^^\	    ^^^^^^^^\
+				    IP		     NETMASK
+Now eth0 should active, you can test it by "ping" or get more information by
+"ifconfig". If tested ok, continue the next step.
+
+4. cp dl2k.ko /lib/modules/`uname -r`/kernel/drivers/net
+5. Add the following line to /etc/modprobe.d/dl2k.conf:
+	alias eth0 dl2k
+6. Run depmod to updated module indexes.
+7. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0
+   located at /etc/sysconfig/network-scripts or create it manually.
+   [see - Configuration Script Sample]
+8. Driver will automatically load and configure at next boot time.
+
+Compiling the Driver
+====================
+  In Linux, NIC drivers are most commonly configured as loadable modules.
+The approach of building a monolithic kernel has become obsolete. The driver
+can be compiled as part of a monolithic kernel, but is strongly discouraged.
+The remainder of this section assumes the driver is built as a loadable module.
+In the Linux environment, it is a good idea to rebuild the driver from the
+source instead of relying on a precompiled version. This approach provides
+better reliability since a precompiled driver might depend on libraries or
+kernel features that are not present in a given Linux installation.
+
+The 3 files necessary to build Linux device driver are dl2k.c, dl2k.h and
+Makefile. To compile, the Linux installation must include the gcc compiler,
+the kernel source, and the kernel headers. The Linux driver supports Linux
+Kernels 2.4.7. Copy the files to a directory and enter the following command
+to compile and link the driver:
+
+CD-ROM drive
+------------
+
+[root@XXX /] mkdir cdrom
+[root@XXX /] mount -r -t iso9660 -o conv=auto /dev/cdrom /cdrom
+[root@XXX /] cd root
+[root@XXX /root] mkdir dl2k
+[root@XXX /root] cd dl2k
+[root@XXX dl2k] cp /cdrom/linux/dl2k.tgz /root/dl2k
+[root@XXX dl2k] tar xfvz dl2k.tgz
+[root@XXX dl2k] make all
+
+Floppy disc drive
+-----------------
+
+[root@XXX /] cd root
+[root@XXX /root] mkdir dl2k
+[root@XXX /root] cd dl2k
+[root@XXX dl2k] mcopy a:/linux/dl2k.tgz /root/dl2k
+[root@XXX dl2k] tar xfvz dl2k.tgz
+[root@XXX dl2k] make all
+
+Installing the Driver
+=====================
+
+  Manual Installation
+  -------------------
+  Once the driver has been compiled, it must be loaded, enabled, and bound
+  to a protocol stack in order to establish network connectivity. To load a
+  module enter the command:
+
+  insmod dl2k.o
+
+  or
+
+  insmod dl2k.o <optional parameter>	; add parameter
+
+  ===============================================================
+   example: insmod dl2k.o media=100mbps_hd
+   or	    insmod dl2k.o media=3
+   or	    insmod dl2k.o media=3,2	; for 2 cards
+  ===============================================================
+
+  Please reference the list of the command line parameters supported by
+  the Linux device driver below.
+
+  The insmod command only loads the driver and gives it a name of the form
+  eth0, eth1, etc. To bring the NIC into an operational state,
+  it is necessary to issue the following command:
+
+  ifconfig eth0 up
+
+  Finally, to bind the driver to the active protocol (e.g., TCP/IP with
+  Linux), enter the following command:
+
+  ifup eth0
+
+  Note that this is meaningful only if the system can find a configuration
+  script that contains the necessary network information. A sample will be
+  given in the next paragraph.
+
+  The commands to unload a driver are as follows:
+
+  ifdown eth0
+  ifconfig eth0 down
+  rmmod dl2k.o
+
+  The following are the commands to list the currently loaded modules and
+  to see the current network configuration.
+
+  lsmod
+  ifconfig
+
+
+  Automated Installation
+  ----------------------
+  This section describes how to install the driver such that it is
+  automatically loaded and configured at boot time. The following description
+  is based on a Red Hat 6.0/7.0 distribution, but it can easily be ported to
+  other distributions as well.
+
+  Red Hat v6.x/v7.x
+  -----------------
+  1. Copy dl2k.o to the network modules directory, typically
+     /lib/modules/2.x.x-xx/net or /lib/modules/2.x.x/kernel/drivers/net.
+  2. Locate the boot module configuration file, most commonly in the
+     /etc/modprobe.d/ directory. Add the following lines:
+
+     alias ethx dl2k
+     options dl2k <optional parameters>
+
+     where ethx will be eth0 if the NIC is the only ethernet adapter, eth1 if
+     one other ethernet adapter is installed, etc. Refer to the table in the
+     previous section for the list of optional parameters.
+  3. Locate the network configuration scripts, normally the
+     /etc/sysconfig/network-scripts directory, and create a configuration
+     script named ifcfg-ethx that contains network information.
+  4. Note that for most Linux distributions, Red Hat included, a configuration
+     utility with a graphical user interface is provided to perform steps 2
+     and 3 above.
+
+
+Parameter Description
+=====================
+You can install this driver without any additional parameter. However, if you
+are going to have extensive functions then it is necessary to set extra
+parameter. Below is a list of the command line parameters supported by the
+Linux device
+driver.
+
+mtu=packet_size			- Specifies the maximum packet size. default
+				  is 1500.
+
+media=media_type		- Specifies the media type the NIC operates at.
+				  autosense	Autosensing active media.
+				  10mbps_hd	10Mbps half duplex.
+				  10mbps_fd	10Mbps full duplex.
+				  100mbps_hd	100Mbps half duplex.
+				  100mbps_fd	100Mbps full duplex.
+				  1000mbps_fd	1000Mbps full duplex.
+				  1000mbps_hd	1000Mbps half duplex.
+				  0		Autosensing active media.
+				  1		10Mbps half duplex.
+				  2		10Mbps full duplex.
+				  3		100Mbps half duplex.
+				  4		100Mbps full duplex.
+				  5          	1000Mbps half duplex.
+				  6          	1000Mbps full duplex.
+
+				  By default, the NIC operates at autosense.
+				  1000mbps_fd and 1000mbps_hd types are only
+				  available for fiber adapter.
+
+vlan=n				- Specifies the VLAN ID. If vlan=0, the
+				  Virtual Local Area Network (VLAN) function is
+				  disable.
+
+jumbo=[0|1]			- Specifies the jumbo frame support. If jumbo=1,
+				  the NIC accept jumbo frames. By default, this
+				  function is disabled.
+				  Jumbo frame usually improve the performance
+				  int gigabit.
+				  This feature need jumbo frame compatible 
+				  remote.
+				  
+rx_coalesce=m			- Number of rx frame handled each interrupt.
+rx_timeout=n			- Rx DMA wait time for an interrupt. 
+				  If set rx_coalesce > 0, hardware only assert 
+				  an interrupt for m frames. Hardware won't 
+				  assert rx interrupt until m frames received or
+				  reach timeout of n * 640 nano seconds. 
+				  Set proper rx_coalesce and rx_timeout can 
+				  reduce congestion collapse and overload which
+				  has been a bottleneck for high speed network.
+				  
+				  For example, rx_coalesce=10 rx_timeout=800.
+				  that is, hardware assert only 1 interrupt 
+				  for 10 frames received or timeout of 512 us. 
+
+tx_coalesce=n			- Number of tx frame handled each interrupt.
+				  Set n > 1 can reduce the interrupts 
+				  congestion usually lower performance of
+				  high speed network card. Default is 16.
+				  
+tx_flow=[1|0]			- Specifies the Tx flow control. If tx_flow=0, 
+				  the Tx flow control disable else driver
+				  autodetect.
+rx_flow=[1|0]			- Specifies the Rx flow control. If rx_flow=0, 
+				  the Rx flow control enable else driver
+				  autodetect.
+
+
+Configuration Script Sample
+===========================
+Here is a sample of a simple configuration script:
+
+DEVICE=eth0
+USERCTL=no
+ONBOOT=yes
+POOTPROTO=none
+BROADCAST=207.200.5.255
+NETWORK=207.200.5.0
+NETMASK=255.255.255.0
+IPADDR=207.200.5.2
+
+
+Troubleshooting
+===============
+Q1. Source files contain ^ M behind every line.
+	Make sure all files are Unix file format (no LF). Try the following
+    shell command to convert files.
+
+	cat dl2k.c | col -b > dl2k.tmp
+	mv dl2k.tmp dl2k.c
+
+	OR
+
+	cat dl2k.c | tr -d "\r" > dl2k.tmp
+	mv dl2k.tmp dl2k.c
+
+Q2: Could not find header files (*.h) ?
+	To compile the driver, you need kernel header files. After
+    installing the kernel source, the header files are usually located in
+    /usr/src/linux/include, which is the default include directory configured
+    in Makefile. For some distributions, there is a copy of header files in
+    /usr/src/include/linux and /usr/src/include/asm, that you can change the
+    INCLUDEDIR in Makefile to /usr/include without installing kernel source.
+	Note that RH 7.0 didn't provide correct header files in /usr/include,
+    including those files will make a wrong version driver.
+
diff --git a/Documentation/networking/dm9000.txt b/Documentation/networking/dm9000.txt
new file mode 100644
index 000000000..5552e2e57
--- /dev/null
+++ b/Documentation/networking/dm9000.txt
@@ -0,0 +1,167 @@
+DM9000 Network driver
+=====================
+
+Copyright 2008 Simtec Electronics,
+	  Ben Dooks <ben@simtec.co.uk> <ben-linux@fluff.org>
+
+
+Introduction
+------------
+
+This file describes how to use the DM9000 platform-device based network driver
+that is contained in the files drivers/net/dm9000.c and drivers/net/dm9000.h.
+
+The driver supports three DM9000 variants, the DM9000E which is the first chip
+supported as well as the newer DM9000A and DM9000B devices. It is currently
+maintained and tested by Ben Dooks, who should be CC: to any patches for this
+driver.
+
+
+Defining the platform device
+----------------------------
+
+The minimum set of resources attached to the platform device are as follows:
+
+    1) The physical address of the address register
+    2) The physical address of the data register
+    3) The IRQ line the device's interrupt pin is connected to.
+
+These resources should be specified in that order, as the ordering of the
+two address regions is important (the driver expects these to be address
+and then data).
+
+An example from arch/arm/mach-s3c2410/mach-bast.c is:
+
+static struct resource bast_dm9k_resource[] = {
+	[0] = {
+		.start = S3C2410_CS5 + BAST_PA_DM9000,
+		.end   = S3C2410_CS5 + BAST_PA_DM9000 + 3,
+		.flags = IORESOURCE_MEM,
+	},
+	[1] = {
+		.start = S3C2410_CS5 + BAST_PA_DM9000 + 0x40,
+		.end   = S3C2410_CS5 + BAST_PA_DM9000 + 0x40 + 0x3f,
+		.flags = IORESOURCE_MEM,
+	},
+	[2] = {
+		.start = IRQ_DM9000,
+		.end   = IRQ_DM9000,
+		.flags = IORESOURCE_IRQ | IORESOURCE_IRQ_HIGHLEVEL,
+	}
+};
+
+static struct platform_device bast_device_dm9k = {
+	.name		= "dm9000",
+	.id		= 0,
+	.num_resources	= ARRAY_SIZE(bast_dm9k_resource),
+	.resource	= bast_dm9k_resource,
+};
+
+Note the setting of the IRQ trigger flag in bast_dm9k_resource[2].flags,
+as this will generate a warning if it is not present. The trigger from
+the flags field will be passed to request_irq() when registering the IRQ
+handler to ensure that the IRQ is setup correctly.
+
+This shows a typical platform device, without the optional configuration
+platform data supplied. The next example uses the same resources, but adds
+the optional platform data to pass extra configuration data:
+
+static struct dm9000_plat_data bast_dm9k_platdata = {
+	.flags		= DM9000_PLATF_16BITONLY,
+};
+
+static struct platform_device bast_device_dm9k = {
+	.name		= "dm9000",
+	.id		= 0,
+	.num_resources	= ARRAY_SIZE(bast_dm9k_resource),
+	.resource	= bast_dm9k_resource,
+	.dev		= {
+		.platform_data = &bast_dm9k_platdata,
+	}
+};
+
+The platform data is defined in include/linux/dm9000.h and described below.
+
+
+Platform data
+-------------
+
+Extra platform data for the DM9000 can describe the IO bus width to the
+device, whether or not an external PHY is attached to the device and
+the availability of an external configuration EEPROM.
+
+The flags for the platform data .flags field are as follows:
+
+DM9000_PLATF_8BITONLY
+
+	The IO should be done with 8bit operations.
+
+DM9000_PLATF_16BITONLY
+
+	The IO should be done with 16bit operations.
+
+DM9000_PLATF_32BITONLY
+
+	The IO should be done with 32bit operations.
+
+DM9000_PLATF_EXT_PHY
+
+	The chip is connected to an external PHY.
+
+DM9000_PLATF_NO_EEPROM
+
+	This can be used to signify that the board does not have an
+	EEPROM, or that the EEPROM should be hidden from the user.
+
+DM9000_PLATF_SIMPLE_PHY
+
+	Switch to using the simpler PHY polling method which does not
+	try and read the MII PHY state regularly. This is only available
+	when using the internal PHY. See the section on link state polling
+	for more information.
+
+	The config symbol DM9000_FORCE_SIMPLE_PHY_POLL, Kconfig entry
+	"Force simple NSR based PHY polling" allows this flag to be
+	forced on at build time.
+
+
+PHY Link state polling
+----------------------
+
+The driver keeps track of the link state and informs the network core
+about link (carrier) availability. This is managed by several methods
+depending on the version of the chip and on which PHY is being used.
+
+For the internal PHY, the original (and currently default) method is
+to read the MII state, either when the status changes if we have the
+necessary interrupt support in the chip or every two seconds via a
+periodic timer.
+
+To reduce the overhead for the internal PHY, there is now the option
+of using the DM9000_FORCE_SIMPLE_PHY_POLL config, or DM9000_PLATF_SIMPLE_PHY
+platform data option to read the summary information without the
+expensive MII accesses. This method is faster, but does not print
+as much information.
+
+When using an external PHY, the driver currently has to poll the MII
+link status as there is no method for getting an interrupt on link change.
+
+
+DM9000A / DM9000B
+-----------------
+
+These chips are functionally similar to the DM9000E and are supported easily
+by the same driver. The features are:
+
+   1) Interrupt on internal PHY state change. This means that the periodic
+      polling of the PHY status may be disabled on these devices when using
+      the internal PHY.
+
+   2) TCP/UDP checksum offloading, which the driver does not currently support.
+
+
+ethtool
+-------
+
+The driver supports the ethtool interface for access to the driver
+state information, the PHY state and the EEPROM.
diff --git a/Documentation/networking/dmfe.txt b/Documentation/networking/dmfe.txt
new file mode 100644
index 000000000..25320bf19
--- /dev/null
+++ b/Documentation/networking/dmfe.txt
@@ -0,0 +1,66 @@
+Note: This driver doesn't have a maintainer.
+
+Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver for Linux.
+
+This program is free software; you can redistribute it and/or
+modify it under the terms of the GNU General   Public License
+as published by the Free Software Foundation; either version 2
+of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+
+This driver provides kernel support for Davicom DM9102(A)/DM9132/DM9801 ethernet cards ( CNET
+10/100 ethernet cards uses Davicom chipset too, so this driver supports CNET cards too ).If you
+didn't compile this driver as a module, it will automatically load itself on boot and print a
+line similar to :
+
+	dmfe: Davicom DM9xxx net driver, version 1.36.4 (2002-01-17)
+
+If you compiled this driver as a module, you have to load it on boot.You can load it with command :
+
+	insmod dmfe
+
+This way it will autodetect the device mode.This is the suggested way to load the module.Or you can pass
+a mode= setting to module while loading, like :
+
+	insmod dmfe mode=0 # Force 10M Half Duplex
+	insmod dmfe mode=1 # Force 100M Half Duplex
+	insmod dmfe mode=4 # Force 10M Full Duplex
+	insmod dmfe mode=5 # Force 100M Full Duplex
+
+Next you should configure your network interface with a command similar to :
+
+	ifconfig eth0 172.22.3.18
+                      ^^^^^^^^^^^
+		     Your IP Address
+
+Then you may have to modify the default routing table with command :
+
+	route add default eth0
+
+
+Now your ethernet card should be up and running.
+
+
+TODO:
+
+Implement pci_driver::suspend() and pci_driver::resume() power management methods.
+Check on 64 bit boxes.
+Check and fix on big endian boxes.
+Test and make sure PCI latency is now correct for all cases.
+
+
+Authors:
+
+Sten Wang <sten_wang@davicom.com.tw >   : Original Author
+
+Contributors:
+
+Marcelo Tosatti <marcelo@conectiva.com.br>
+Alan Cox <alan@lxorguk.ukuu.org.uk>
+Jeff Garzik <jgarzik@pobox.com>
+Vojtech Pavlik <vojtech@suse.cz>
diff --git a/Documentation/networking/dns_resolver.txt b/Documentation/networking/dns_resolver.txt
new file mode 100644
index 000000000..eaa8f9a6f
--- /dev/null
+++ b/Documentation/networking/dns_resolver.txt
@@ -0,0 +1,157 @@
+			     ===================
+			     DNS Resolver Module
+			     ===================
+
+Contents:
+
+ - Overview.
+ - Compilation.
+ - Setting up.
+ - Usage.
+ - Mechanism.
+ - Debugging.
+
+
+========
+OVERVIEW
+========
+
+The DNS resolver module provides a way for kernel services to make DNS queries
+by way of requesting a key of key type dns_resolver.  These queries are
+upcalled to userspace through /sbin/request-key.
+
+These routines must be supported by userspace tools dns.upcall, cifs.upcall and
+request-key.  It is under development and does not yet provide the full feature
+set.  The features it does support include:
+
+ (*) Implements the dns_resolver key_type to contact userspace.
+
+It does not yet support the following AFS features:
+
+ (*) Dns query support for AFSDB resource record.
+
+This code is extracted from the CIFS filesystem.
+
+
+===========
+COMPILATION
+===========
+
+The module should be enabled by turning on the kernel configuration options:
+
+	CONFIG_DNS_RESOLVER	- tristate "DNS Resolver support"
+
+
+==========
+SETTING UP
+==========
+
+To set up this facility, the /etc/request-key.conf file must be altered so that
+/sbin/request-key can appropriately direct the upcalls.  For example, to handle
+basic dname to IPv4/IPv6 address resolution, the following line should be
+added:
+
+	#OP	TYPE		DESC	CO-INFO	PROGRAM ARG1 ARG2 ARG3 ...
+	#======	============	=======	=======	==========================
+	create	dns_resolver  	*	*	/usr/sbin/cifs.upcall %k
+
+To direct a query for query type 'foo', a line of the following should be added
+before the more general line given above as the first match is the one taken.
+
+	create	dns_resolver  	foo:*	*	/usr/sbin/dns.foo %k
+
+
+=====
+USAGE
+=====
+
+To make use of this facility, one of the following functions that are
+implemented in the module can be called after doing:
+
+	#include <linux/dns_resolver.h>
+
+ (1) int dns_query(const char *type, const char *name, size_t namelen,
+		   const char *options, char **_result, time_t *_expiry);
+
+     This is the basic access function.  It looks for a cached DNS query and if
+     it doesn't find it, it upcalls to userspace to make a new DNS query, which
+     may then be cached.  The key description is constructed as a string of the
+     form:
+
+		[<type>:]<name>
+
+     where <type> optionally specifies the particular upcall program to invoke,
+     and thus the type of query to do, and <name> specifies the string to be
+     looked up.  The default query type is a straight hostname to IP address
+     set lookup.
+
+     The name parameter is not required to be a NUL-terminated string, and its
+     length should be given by the namelen argument.
+
+     The options parameter may be NULL or it may be a set of options
+     appropriate to the query type.
+
+     The return value is a string appropriate to the query type.  For instance,
+     for the default query type it is just a list of comma-separated IPv4 and
+     IPv6 addresses.  The caller must free the result.
+
+     The length of the result string is returned on success, and a negative
+     error code is returned otherwise.  -EKEYREJECTED will be returned if the
+     DNS lookup failed.
+
+     If _expiry is non-NULL, the expiry time (TTL) of the result will be
+     returned also.
+
+The kernel maintains an internal keyring in which it caches looked up keys.
+This can be cleared by any process that has the CAP_SYS_ADMIN capability by
+the use of KEYCTL_KEYRING_CLEAR on the keyring ID.
+
+
+===============================
+READING DNS KEYS FROM USERSPACE
+===============================
+
+Keys of dns_resolver type can be read from userspace using keyctl_read() or
+"keyctl read/print/pipe".
+
+
+=========
+MECHANISM
+=========
+
+The dnsresolver module registers a key type called "dns_resolver".  Keys of
+this type are used to transport and cache DNS lookup results from userspace.
+
+When dns_query() is invoked, it calls request_key() to search the local
+keyrings for a cached DNS result.  If that fails to find one, it upcalls to
+userspace to get a new result.
+
+Upcalls to userspace are made through the request_key() upcall vector, and are
+directed by means of configuration lines in /etc/request-key.conf that tell
+/sbin/request-key what program to run to instantiate the key.
+
+The upcall handler program is responsible for querying the DNS, processing the
+result into a form suitable for passing to the keyctl_instantiate_key()
+routine.  This then passes the data to dns_resolver_instantiate() which strips
+off and processes any options included in the data, and then attaches the
+remainder of the string to the key as its payload.
+
+The upcall handler program should set the expiry time on the key to that of the
+lowest TTL of all the records it has extracted a result from.  This means that
+the key will be discarded and recreated when the data it holds has expired.
+
+dns_query() returns a copy of the value attached to the key, or an error if
+that is indicated instead.
+
+See <file:Documentation/security/keys/request-key.rst> for further
+information about request-key function.
+
+
+=========
+DEBUGGING
+=========
+
+Debugging messages can be turned on dynamically by writing a 1 into the
+following file:
+
+        /sys/module/dnsresolver/parameters/debug
diff --git a/Documentation/networking/dpaa.txt b/Documentation/networking/dpaa.txt
new file mode 100644
index 000000000..f88194f71
--- /dev/null
+++ b/Documentation/networking/dpaa.txt
@@ -0,0 +1,260 @@
+The QorIQ DPAA Ethernet Driver
+==============================
+
+Authors:
+Madalin Bucur <madalin.bucur@nxp.com>
+Camelia Groza <camelia.groza@nxp.com>
+
+Contents
+========
+
+	- DPAA Ethernet Overview
+	- DPAA Ethernet Supported SoCs
+	- Configuring DPAA Ethernet in your kernel
+	- DPAA Ethernet Frame Processing
+	- DPAA Ethernet Features
+	- DPAA IRQ Affinity and Receive Side Scaling
+	- Debugging
+
+DPAA Ethernet Overview
+======================
+
+DPAA stands for Data Path Acceleration Architecture and it is a
+set of networking acceleration IPs that are available on several
+generations of SoCs, both on PowerPC and ARM64.
+
+The Freescale DPAA architecture consists of a series of hardware blocks
+that support Ethernet connectivity. The Ethernet driver depends upon the
+following drivers in the Linux kernel:
+
+ - Peripheral Access Memory Unit (PAMU) (* needed only for PPC platforms)
+    drivers/iommu/fsl_*
+ - Frame Manager (FMan)
+    drivers/net/ethernet/freescale/fman
+ - Queue Manager (QMan), Buffer Manager (BMan)
+    drivers/soc/fsl/qbman
+
+A simplified view of the dpaa_eth interfaces mapped to FMan MACs:
+
+  dpaa_eth       /eth0\     ...       /ethN\
+  driver        |      |             |      |
+  -------------   ----   -----------   ----   -------------
+       -Ports  / Tx  Rx \    ...    / Tx  Rx \
+  FMan        |          |         |          |
+       -MACs  |   MAC0   |         |   MACN   |
+             /   dtsec0   \  ...  /   dtsecN   \ (or tgec)
+            /              \     /              \(or memac)
+  ---------  --------------  ---  --------------  ---------
+      FMan, FMan Port, FMan SP, FMan MURAM drivers
+  ---------------------------------------------------------
+      FMan HW blocks: MURAM, MACs, Ports, SP
+  ---------------------------------------------------------
+
+The dpaa_eth relation to the QMan, BMan and FMan:
+              ________________________________
+  dpaa_eth   /            eth0                \
+  driver    /                                  \
+  ---------   -^-   -^-   -^-   ---    ---------
+  QMan driver / \   / \   / \  \   /  | BMan    |
+             |Rx | |Rx | |Tx | |Tx |  | driver  |
+  ---------  |Dfl| |Err| |Cnf| |FQs|  |         |
+  QMan HW    |FQ | |FQ | |FQs| |   |  |         |
+             /   \ /   \ /   \  \ /   |         |
+  ---------   ---   ---   ---   -v-    ---------
+            |        FMan QMI         |         |
+            | FMan HW       FMan BMI  | BMan HW |
+              -----------------------   --------
+
+where the acronyms used above (and in the code) are:
+DPAA = Data Path Acceleration Architecture
+FMan = DPAA Frame Manager
+QMan = DPAA Queue Manager
+BMan = DPAA Buffers Manager
+QMI = QMan interface in FMan
+BMI = BMan interface in FMan
+FMan SP = FMan Storage Profiles
+MURAM = Multi-user RAM in FMan
+FQ = QMan Frame Queue
+Rx Dfl FQ = default reception FQ
+Rx Err FQ = Rx error frames FQ
+Tx Cnf FQ = Tx confirmation FQs
+Tx FQs = transmission frame queues
+dtsec = datapath three speed Ethernet controller (10/100/1000 Mbps)
+tgec = ten gigabit Ethernet controller (10 Gbps)
+memac = multirate Ethernet MAC (10/100/1000/10000)
+
+DPAA Ethernet Supported SoCs
+============================
+
+The DPAA drivers enable the Ethernet controllers present on the following SoCs:
+
+# PPC
+P1023
+P2041
+P3041
+P4080
+P5020
+P5040
+T1023
+T1024
+T1040
+T1042
+T2080
+T4240
+B4860
+
+# ARM
+LS1043A
+LS1046A
+
+Configuring DPAA Ethernet in your kernel
+========================================
+
+To enable the DPAA Ethernet driver, the following Kconfig options are required:
+
+# common for arch/arm64 and arch/powerpc platforms
+CONFIG_FSL_DPAA=y
+CONFIG_FSL_FMAN=y
+CONFIG_FSL_DPAA_ETH=y
+CONFIG_FSL_XGMAC_MDIO=y
+
+# for arch/powerpc only
+CONFIG_FSL_PAMU=y
+
+# common options needed for the PHYs used on the RDBs
+CONFIG_VITESSE_PHY=y
+CONFIG_REALTEK_PHY=y
+CONFIG_AQUANTIA_PHY=y
+
+DPAA Ethernet Frame Processing
+==============================
+
+On Rx, buffers for the incoming frames are retrieved from one of the three
+existing buffers pools. The driver initializes and seeds these, each with
+buffers of different sizes: 1KB, 2KB and 4KB.
+
+On Tx, all transmitted frames are returned to the driver through Tx
+confirmation frame queues. The driver is then responsible for freeing the
+buffers. In order to do this properly, a backpointer is added to the buffer
+before transmission that points to the skb. When the buffer returns to the
+driver on a confirmation FQ, the skb can be correctly consumed.
+
+DPAA Ethernet Features
+======================
+
+Currently the DPAA Ethernet driver enables the basic features required for
+a Linux Ethernet driver. The support for advanced features will be added
+gradually.
+
+The driver has Rx and Tx checksum offloading for UDP and TCP. Currently the Rx
+checksum offload feature is enabled by default and cannot be controlled through
+ethtool. Also, rx-flow-hash and rx-hashing was added. The addition of RSS
+provides a big performance boost for the forwarding scenarios, allowing
+different traffic flows received by one interface to be processed by different
+CPUs in parallel.
+
+The driver has support for multiple prioritized Tx traffic classes. Priorities
+range from 0 (lowest) to 3 (highest). These are mapped to HW workqueues with
+strict priority levels. Each traffic class contains NR_CPU TX queues. By
+default, only one traffic class is enabled and the lowest priority Tx queues
+are used. Higher priority traffic classes can be enabled with the mqprio
+qdisc. For example, all four traffic classes are enabled on an interface with
+the following command. Furthermore, skb priority levels are mapped to traffic
+classes as follows:
+
+	* priorities 0 to 3 - traffic class 0 (low priority)
+	* priorities 4 to 7 - traffic class 1 (medium-low priority)
+	* priorities 8 to 11 - traffic class 2 (medium-high priority)
+	* priorities 12 to 15 - traffic class 3 (high priority)
+
+tc qdisc add dev <int> root handle 1: \
+	 mqprio num_tc 4 map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 hw 1
+
+DPAA IRQ Affinity and Receive Side Scaling
+==========================================
+
+Traffic coming on the DPAA Rx queues or on the DPAA Tx confirmation
+queues is seen by the CPU as ingress traffic on a certain portal.
+The DPAA QMan portal interrupts are affined each to a certain CPU.
+The same portal interrupt services all the QMan portal consumers.
+
+By default the DPAA Ethernet driver enables RSS, making use of the
+DPAA FMan Parser and Keygen blocks to distribute traffic on 128
+hardware frame queues using a hash on IP v4/v6 source and destination
+and L4 source and destination ports, in present in the received frame.
+When RSS is disabled, all traffic received by a certain interface is
+received on the default Rx frame queue. The default DPAA Rx frame
+queues are configured to put the received traffic into a pool channel
+that allows any available CPU portal to dequeue the ingress traffic.
+The default frame queues have the HOLDACTIVE option set, ensuring that
+traffic bursts from a certain queue are serviced by the same CPU.
+This ensures a very low rate of frame reordering. A drawback of this
+is that only one CPU at a time can service the traffic received by a
+certain interface when RSS is not enabled.
+
+To implement RSS, the DPAA Ethernet driver allocates an extra set of
+128 Rx frame queues that are configured to dedicated channels, in a
+round-robin manner. The mapping of the frame queues to CPUs is now
+hardcoded, there is no indirection table to move traffic for a certain
+FQ (hash result) to another CPU. The ingress traffic arriving on one
+of these frame queues will arrive at the same portal and will always
+be processed by the same CPU. This ensures intra-flow order preservation
+and workload distribution for multiple traffic flows.
+
+RSS can be turned off for a certain interface using ethtool, i.e.
+
+	# ethtool -N fm1-mac9 rx-flow-hash tcp4 ""
+
+To turn it back on, one needs to set rx-flow-hash for tcp4/6 or udp4/6:
+
+	# ethtool -N fm1-mac9 rx-flow-hash udp4 sfdn
+
+There is no independent control for individual protocols, any command
+run for one of tcp4|udp4|ah4|esp4|sctp4|tcp6|udp6|ah6|esp6|sctp6 is
+going to control the rx-flow-hashing for all protocols on that interface.
+
+Besides using the FMan Keygen computed hash for spreading traffic on the
+128 Rx FQs, the DPAA Ethernet driver also sets the skb hash value when
+the NETIF_F_RXHASH feature is on (active by default). This can be turned
+on or off through ethtool, i.e.:
+
+	# ethtool -K fm1-mac9 rx-hashing off
+	# ethtool -k fm1-mac9 | grep hash
+	receive-hashing: off
+	# ethtool -K fm1-mac9 rx-hashing on
+	Actual changes:
+	receive-hashing: on
+	# ethtool -k fm1-mac9 | grep hash
+	receive-hashing: on
+
+Please note that Rx hashing depends upon the rx-flow-hashing being on
+for that interface - turning off rx-flow-hashing will also disable the
+rx-hashing (without ethtool reporting it as off as that depends on the
+NETIF_F_RXHASH feature flag).
+
+Debugging
+=========
+
+The following statistics are exported for each interface through ethtool:
+
+	- interrupt count per CPU
+	- Rx packets count per CPU
+	- Tx packets count per CPU
+	- Tx confirmed packets count per CPU
+	- Tx S/G frames count per CPU
+	- Tx error count per CPU
+	- Rx error count per CPU
+	- Rx error count per type
+	- congestion related statistics:
+		- congestion status
+		- time spent in congestion
+		- number of time the device entered congestion
+		- dropped packets count per cause
+
+The driver also exports the following information in sysfs:
+
+	- the FQ IDs for each FQ type
+	/sys/devices/platform/dpaa-ethernet.0/net/<int>/fqids
+
+	- the IDs of the buffer pools in use
+	/sys/devices/platform/dpaa-ethernet.0/net/<int>/bpids
diff --git a/Documentation/networking/dpaa2/dpio-driver.rst b/Documentation/networking/dpaa2/dpio-driver.rst
new file mode 100644
index 000000000..135881041
--- /dev/null
+++ b/Documentation/networking/dpaa2/dpio-driver.rst
@@ -0,0 +1,158 @@
+.. include:: <isonum.txt>
+
+DPAA2 DPIO (Data Path I/O) Overview
+===================================
+
+:Copyright: |copy| 2016-2018 NXP
+
+This document provides an overview of the Freescale DPAA2 DPIO
+drivers
+
+Introduction
+============
+
+A DPAA2 DPIO (Data Path I/O) is a hardware object that provides
+interfaces to enqueue and dequeue frames to/from network interfaces
+and other accelerators.  A DPIO also provides hardware buffer
+pool management for network interfaces.
+
+This document provides an overview the Linux DPIO driver, its
+subcomponents, and its APIs.
+
+See Documentation/networking/dpaa2/overview.rst for a general overview of DPAA2
+and the general DPAA2 driver architecture in Linux.
+
+Driver Overview
+---------------
+
+The DPIO driver is bound to DPIO objects discovered on the fsl-mc bus and
+provides services that:
+  A) allow other drivers, such as the Ethernet driver, to enqueue and dequeue
+     frames for their respective objects
+  B) allow drivers to register callbacks for data availability notifications
+     when data becomes available on a queue or channel
+  C) allow drivers to manage hardware buffer pools
+
+The Linux DPIO driver consists of 3 primary components--
+   DPIO object driver-- fsl-mc driver that manages the DPIO object
+
+   DPIO service-- provides APIs to other Linux drivers for services
+
+   QBman portal interface-- sends portal commands, gets responses
+::
+
+          fsl-mc          other
+           bus           drivers
+            |               |
+        +---+----+   +------+-----+
+        |DPIO obj|   |DPIO service|
+        | driver |---|  (DPIO)    |
+        +--------+   +------+-----+
+                            |
+                     +------+-----+
+                     |    QBman   |
+                     | portal i/f |
+                     +------------+
+                            |
+                         hardware
+
+
+The diagram below shows how the DPIO driver components fit with the other
+DPAA2 Linux driver components::
+                                                   +------------+
+                                                   | OS Network |
+                                                   |   Stack    |
+                 +------------+                    +------------+
+                 | Allocator  |. . . . . . .       |  Ethernet  |
+                 |(DPMCP,DPBP)|                    |   (DPNI)   |
+                 +-.----------+                    +---+---+----+
+                  .          .                         ^   |
+                 .            .           <data avail, |   |<enqueue,
+                .              .           tx confirm> |   | dequeue>
+    +-------------+             .                      |   |
+    | DPRC driver |              .    +--------+ +------------+
+    |   (DPRC)    |               . . |DPIO obj| |DPIO service|
+    +----------+--+                   | driver |-|  (DPIO)    |
+               |                      +--------+ +------+-----+
+               |<dev add/remove>                 +------|-----+
+               |                                 |   QBman    |
+          +----+--------------+                  | portal i/f |
+          |   MC-bus driver   |                  +------------+
+          |                   |                     |
+          | /soc/fsl-mc       |                     |
+          +-------------------+                     |
+                                                    |
+ =========================================|=========|========================
+                                        +-+--DPIO---|-----------+
+                                        |           |           |
+                                        |        QBman Portal   |
+                                        +-----------------------+
+
+ ============================================================================
+
+
+DPIO Object Driver (dpio-driver.c)
+----------------------------------
+
+   The dpio-driver component registers with the fsl-mc bus to handle objects of
+   type "dpio".  The implementation of probe() handles basic initialization
+   of the DPIO including mapping of the DPIO regions (the QBman SW portal)
+   and initializing interrupts and registering irq handlers.  The dpio-driver
+   registers the probed DPIO with dpio-service.
+
+DPIO service  (dpio-service.c, dpaa2-io.h)
+------------------------------------------
+
+   The dpio service component provides queuing, notification, and buffers
+   management services to DPAA2 drivers, such as the Ethernet driver.  A system
+   will typically allocate 1 DPIO object per CPU to allow queuing operations
+   to happen simultaneously across all CPUs.
+
+   Notification handling
+      dpaa2_io_service_register()
+
+      dpaa2_io_service_deregister()
+
+      dpaa2_io_service_rearm()
+
+   Queuing
+      dpaa2_io_service_pull_fq()
+
+      dpaa2_io_service_pull_channel()
+
+      dpaa2_io_service_enqueue_fq()
+
+      dpaa2_io_service_enqueue_qd()
+
+      dpaa2_io_store_create()
+
+      dpaa2_io_store_destroy()
+
+      dpaa2_io_store_next()
+
+   Buffer pool management
+      dpaa2_io_service_release()
+
+      dpaa2_io_service_acquire()
+
+QBman portal interface (qbman-portal.c)
+---------------------------------------
+
+   The qbman-portal component provides APIs to do the low level hardware
+   bit twiddling for operations such as:
+      -initializing Qman software portals
+
+      -building and sending portal commands
+
+      -portal interrupt configuration and processing
+
+   The qbman-portal APIs are not public to other drivers, and are
+   only used by dpio-service.
+
+Other (dpaa2-fd.h, dpaa2-global.h)
+----------------------------------
+
+   Frame descriptor and scatter-gather definitions and the APIs used to
+   manipulate them are defined in dpaa2-fd.h.
+
+   Dequeue result struct and parsing APIs are defined in dpaa2-global.h.
diff --git a/Documentation/networking/dpaa2/index.rst b/Documentation/networking/dpaa2/index.rst
new file mode 100644
index 000000000..10bea113a
--- /dev/null
+++ b/Documentation/networking/dpaa2/index.rst
@@ -0,0 +1,9 @@
+===================
+DPAA2 Documentation
+===================
+
+.. toctree::
+   :maxdepth: 1
+
+   overview
+   dpio-driver
diff --git a/Documentation/networking/dpaa2/overview.rst b/Documentation/networking/dpaa2/overview.rst
new file mode 100644
index 000000000..d638b5a8a
--- /dev/null
+++ b/Documentation/networking/dpaa2/overview.rst
@@ -0,0 +1,405 @@
+.. include:: <isonum.txt>
+
+=========================================================
+DPAA2 (Data Path Acceleration Architecture Gen2) Overview
+=========================================================
+
+:Copyright: |copy| 2015 Freescale Semiconductor Inc.
+:Copyright: |copy| 2018 NXP
+
+This document provides an overview of the Freescale DPAA2 architecture
+and how it is integrated into the Linux kernel.
+
+Introduction
+============
+
+DPAA2 is a hardware architecture designed for high-speeed network
+packet processing.  DPAA2 consists of sophisticated mechanisms for
+processing Ethernet packets, queue management, buffer management,
+autonomous L2 switching, virtual Ethernet bridging, and accelerator
+(e.g. crypto) sharing.
+
+A DPAA2 hardware component called the Management Complex (or MC) manages the
+DPAA2 hardware resources.  The MC provides an object-based abstraction for
+software drivers to use the DPAA2 hardware.
+The MC uses DPAA2 hardware resources such as queues, buffer pools, and
+network ports to create functional objects/devices such as network
+interfaces, an L2 switch, or accelerator instances.
+The MC provides memory-mapped I/O command interfaces (MC portals)
+which DPAA2 software drivers use to operate on DPAA2 objects.
+
+The diagram below shows an overview of the DPAA2 resource management
+architecture::
+
+	+--------------------------------------+
+	|                  OS                  |
+	|                        DPAA2 drivers |
+	|                             |        |
+	+-----------------------------|--------+
+	                              |
+	                              | (create,discover,connect
+	                              |  config,use,destroy)
+	                              |
+	                 DPAA2        |
+	+------------------------| mc portal |-+
+	|                             |        |
+	|   +- - - - - - - - - - - - -V- - -+  |
+	|   |                               |  |
+	|   |   Management Complex (MC)     |  |
+	|   |                               |  |
+	|   +- - - - - - - - - - - - - - - -+  |
+	|                                      |
+	| Hardware                  Hardware   |
+	| Resources                 Objects    |
+	| ---------                 -------    |
+	| -queues                   -DPRC      |
+	| -buffer pools             -DPMCP     |
+	| -Eth MACs/ports           -DPIO      |
+	| -network interface        -DPNI      |
+	|  profiles                 -DPMAC     |
+	| -queue portals            -DPBP      |
+	| -MC portals                ...       |
+	|  ...                                 |
+	|                                      |
+	+--------------------------------------+
+
+
+The MC mediates operations such as create, discover,
+connect, configuration, and destroy.  Fast-path operations
+on data, such as packet transmit/receive, are not mediated by
+the MC and are done directly using memory mapped regions in
+DPIO objects.
+
+Overview of DPAA2 Objects
+=========================
+
+The section provides a brief overview of some key DPAA2 objects.
+A simple scenario is described illustrating the objects involved
+in creating a network interfaces.
+
+DPRC (Datapath Resource Container)
+----------------------------------
+
+A DPRC is a container object that holds all the other
+types of DPAA2 objects.  In the example diagram below there
+are 8 objects of 5 types (DPMCP, DPIO, DPBP, DPNI, and DPMAC)
+in the container.
+
+::
+
+	+---------------------------------------------------------+
+	| DPRC                                                    |
+	|                                                         |
+	|  +-------+  +-------+  +-------+  +-------+  +-------+  |
+	|  | DPMCP |  | DPIO  |  | DPBP  |  | DPNI  |  | DPMAC |  |
+	|  +-------+  +-------+  +-------+  +---+---+  +---+---+  |
+	|  | DPMCP |  | DPIO  |                                   |
+	|  +-------+  +-------+                                   |
+	|  | DPMCP |                                              |
+	|  +-------+                                              |
+	|                                                         |
+	+---------------------------------------------------------+
+
+From the point of view of an OS, a DPRC behaves similar to a plug and
+play bus, like PCI.  DPRC commands can be used to enumerate the contents
+of the DPRC, discover the hardware objects present (including mappable
+regions and interrupts).
+
+::
+
+	DPRC.1 (bus)
+	   |
+	   +--+--------+-------+-------+-------+
+	      |        |       |       |       |
+	    DPMCP.1  DPIO.1  DPBP.1  DPNI.1  DPMAC.1
+	    DPMCP.2  DPIO.2
+	    DPMCP.3
+
+Hardware objects can be created and destroyed dynamically, providing
+the ability to hot plug/unplug objects in and out of the DPRC.
+
+A DPRC has a mappable MMIO region (an MC portal) that can be used
+to send MC commands.  It has an interrupt for status events (like
+hotplug).
+All objects in a container share the same hardware "isolation context".
+This means that with respect to an IOMMU the isolation granularity
+is at the DPRC (container) level, not at the individual object
+level.
+
+DPRCs can be defined statically and populated with objects
+via a config file passed to the MC when firmware starts it.
+
+DPAA2 Objects for an Ethernet Network Interface
+-----------------------------------------------
+
+A typical Ethernet NIC is monolithic-- the NIC device contains TX/RX
+queuing mechanisms, configuration mechanisms, buffer management,
+physical ports, and interrupts.  DPAA2 uses a more granular approach
+utilizing multiple hardware objects.  Each object provides specialized
+functions. Groups of these objects are used by software to provide
+Ethernet network interface functionality.  This approach provides
+efficient use of finite hardware resources, flexibility, and
+performance advantages.
+
+The diagram below shows the objects needed for a simple
+network interface configuration on a system with 2 CPUs.
+
+::
+
+	+---+---+ +---+---+
+	   CPU0     CPU1
+	+---+---+ +---+---+
+	    |         |
+	+---+---+ +---+---+
+	   DPIO     DPIO
+	+---+---+ +---+---+
+	    \     /
+	     \   /
+	      \ /
+	   +---+---+
+	      DPNI  --- DPBP,DPMCP
+	   +---+---+
+	       |
+	       |
+	   +---+---+
+	     DPMAC
+	   +---+---+
+	       |
+	   port/PHY
+
+Below the objects are described.  For each object a brief description
+is provided along with a summary of the kinds of operations the object
+supports and a summary of key resources of the object (MMIO regions
+and IRQs).
+
+DPMAC (Datapath Ethernet MAC)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Represents an Ethernet MAC, a hardware device that connects to an Ethernet
+PHY and allows physical transmission and reception of Ethernet frames.
+
+- MMIO regions: none
+- IRQs: DPNI link change
+- commands: set link up/down, link config, get stats,
+  IRQ config, enable, reset
+
+DPNI (Datapath Network Interface)
+Contains TX/RX queues, network interface configuration, and RX buffer pool
+configuration mechanisms.  The TX/RX queues are in memory and are identified
+by queue number.
+
+- MMIO regions: none
+- IRQs: link state
+- commands: port config, offload config, queue config,
+  parse/classify config, IRQ config, enable, reset
+
+DPIO (Datapath I/O)
+~~~~~~~~~~~~~~~~~~~
+Provides interfaces to enqueue and dequeue
+packets and do hardware buffer pool management operations.  The DPAA2
+architecture separates the mechanism to access queues (the DPIO object)
+from the queues themselves.  The DPIO provides an MMIO interface to
+enqueue/dequeue packets.  To enqueue something a descriptor is written
+to the DPIO MMIO region, which includes the target queue number.
+There will typically be one DPIO assigned to each CPU.  This allows all
+CPUs to simultaneously perform enqueue/dequeued operations.  DPIOs are
+expected to be shared by different DPAA2 drivers.
+
+- MMIO regions: queue operations, buffer management
+- IRQs: data availability, congestion notification, buffer
+  pool depletion
+- commands: IRQ config, enable, reset
+
+DPBP (Datapath Buffer Pool)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Represents a hardware buffer pool.
+
+- MMIO regions: none
+- IRQs: none
+- commands: enable, reset
+
+DPMCP (Datapath MC Portal)
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+Provides an MC command portal.
+Used by drivers to send commands to the MC to manage
+objects.
+
+- MMIO regions: MC command portal
+- IRQs: command completion
+- commands: IRQ config, enable, reset
+
+Object Connections
+==================
+Some objects have explicit relationships that must
+be configured:
+
+- DPNI <--> DPMAC
+- DPNI <--> DPNI
+- DPNI <--> L2-switch-port
+
+    A DPNI must be connected to something such as a DPMAC,
+    another DPNI, or L2 switch port.  The DPNI connection
+    is made via a DPRC command.
+
+::
+
+              +-------+  +-------+
+              | DPNI  |  | DPMAC |
+              +---+---+  +---+---+
+                  |          |
+                  +==========+
+
+- DPNI <--> DPBP
+
+    A network interface requires a 'buffer pool' (DPBP
+    object) which provides a list of pointers to memory
+    where received Ethernet data is to be copied.  The
+    Ethernet driver configures the DPBPs associated with
+    the network interface.
+
+Interrupts
+==========
+All interrupts generated by DPAA2 objects are message
+interrupts.  At the hardware level message interrupts
+generated by devices will normally have 3 components--
+1) a non-spoofable 'device-id' expressed on the hardware
+bus, 2) an address, 3) a data value.
+
+In the case of DPAA2 devices/objects, all objects in the
+same container/DPRC share the same 'device-id'.
+For ARM-based SoC this is the same as the stream ID.
+
+
+DPAA2 Linux Drivers Overview
+============================
+
+This section provides an overview of the Linux kernel drivers for
+DPAA2-- 1) the bus driver and associated "DPAA2 infrastructure"
+drivers and 2) functional object drivers (such as Ethernet).
+
+As described previously, a DPRC is a container that holds the other
+types of DPAA2 objects.  It is functionally similar to a plug-and-play
+bus controller.
+Each object in the DPRC is a Linux "device" and is bound to a driver.
+The diagram below shows the Linux drivers involved in a networking
+scenario and the objects bound to each driver.  A brief description
+of each driver follows.
+
+::
+
+	                                     +------------+
+	                                     | OS Network |
+	                                     |   Stack    |
+	         +------------+              +------------+
+	         | Allocator  |. . . . . . . |  Ethernet  |
+	         |(DPMCP,DPBP)|              |   (DPNI)   |
+	         +-.----------+              +---+---+----+
+	          .          .                   ^   |
+	         .            .     <data avail, |   | <enqueue,
+	        .              .     tx confirm> |   | dequeue>
+	+-------------+         .                |   |
+	| DPRC driver |          .           +---+---V----+     +---------+
+	|   (DPRC)    |           . . . . . .| DPIO driver|     |   MAC   |
+	+----------+--+                      |  (DPIO)    |     | (DPMAC) |
+	           |                         +------+-----+     +-----+---+
+	           |<dev add/remove>                |                 |
+	           |                                |                 |
+	  +--------+----------+                     |              +--+---+
+	  |   MC-bus driver   |                     |              | PHY  |
+	  |                   |                     |              |driver|
+	  |   /bus/fsl-mc     |                     |              +--+---+
+	  +-------------------+                     |                 |
+	                                            |                 |
+	========================= HARDWARE =========|=================|======
+	                                          DPIO                |
+	                                            |                 |
+	                                          DPNI---DPBP         |
+	                                            |                 |
+	                                          DPMAC               |
+	                                            |                 |
+	                                           PHY ---------------+
+	============================================|========================
+
+A brief description of each driver is provided below.
+
+MC-bus driver
+-------------
+The MC-bus driver is a platform driver and is probed from a
+node in the device tree (compatible "fsl,qoriq-mc") passed in by boot
+firmware.  It is responsible for bootstrapping the DPAA2 kernel
+infrastructure.
+Key functions include:
+
+- registering a new bus type named "fsl-mc" with the kernel,
+  and implementing bus call-backs (e.g. match/uevent/dev_groups)
+- implementing APIs for DPAA2 driver registration and for device
+  add/remove
+- creates an MSI IRQ domain
+- doing a 'device add' to expose the 'root' DPRC, in turn triggering
+  a bind of the root DPRC to the DPRC driver
+
+The binding for the MC-bus device-tree node can be consulted at
+*Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt*.
+The sysfs bind/unbind interfaces for the MC-bus can be consulted at
+*Documentation/ABI/testing/sysfs-bus-fsl-mc*.
+
+DPRC driver
+-----------
+The DPRC driver is bound to DPRC objects and does runtime management
+of a bus instance.  It performs the initial bus scan of the DPRC
+and handles interrupts for container events such as hot plug by
+re-scanning the DPRC.
+
+Allocator
+---------
+Certain objects such as DPMCP and DPBP are generic and fungible,
+and are intended to be used by other drivers.  For example,
+the DPAA2 Ethernet driver needs:
+
+- DPMCPs to send MC commands, to configure network interfaces
+- DPBPs for network buffer pools
+
+The allocator driver registers for these allocatable object types
+and those objects are bound to the allocator when the bus is probed.
+The allocator maintains a pool of objects that are available for
+allocation by other DPAA2 drivers.
+
+DPIO driver
+-----------
+The DPIO driver is bound to DPIO objects and provides services that allow
+other drivers such as the Ethernet driver to enqueue and dequeue data for
+their respective objects.
+Key services include:
+
+- data availability notifications
+- hardware queuing operations (enqueue and dequeue of data)
+- hardware buffer pool management
+
+To transmit a packet the Ethernet driver puts data on a queue and
+invokes a DPIO API.  For receive, the Ethernet driver registers
+a data availability notification callback.  To dequeue a packet
+a DPIO API is used.
+There is typically one DPIO object per physical CPU for optimum
+performance, allowing different CPUs to simultaneously enqueue
+and dequeue data.
+
+The DPIO driver operates on behalf of all DPAA2 drivers
+active in the kernel--  Ethernet, crypto, compression,
+etc.
+
+Ethernet driver
+---------------
+The Ethernet driver is bound to a DPNI and implements the kernel
+interfaces needed to connect the DPAA2 network interface to
+the network stack.
+Each DPNI corresponds to a Linux network interface.
+
+MAC driver
+----------
+An Ethernet PHY is an off-chip, board specific component and is managed
+by the appropriate PHY driver via an mdio bus.  The MAC driver
+plays a role of being a proxy between the PHY driver and the
+MC.  It does this proxy via the MC commands to a DPMAC object.
+If the PHY driver signals a link change, the MAC driver notifies
+the MC via a DPMAC command.  If a network interface is brought
+up or down, the MC notifies the DPMAC driver via an interrupt and
+the driver can take appropriate action.
diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt
new file mode 100644
index 000000000..da59e2884
--- /dev/null
+++ b/Documentation/networking/driver.txt
@@ -0,0 +1,93 @@
+Document about softnet driver issues
+
+Transmit path guidelines:
+
+1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
+   any normal circumstances.  It is considered a hard error unless
+   there is no way your device can tell ahead of time when it's
+   transmit function will become busy.
+
+   Instead it must maintain the queue properly.  For example,
+   for a driver implementing scatter-gather this means:
+
+	static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb,
+					       struct net_device *dev)
+	{
+		struct drv *dp = netdev_priv(dev);
+
+		lock_tx(dp);
+		...
+		/* This is a hard error log it. */
+		if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) {
+			netif_stop_queue(dev);
+			unlock_tx(dp);
+			printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
+			       dev->name);
+			return NETDEV_TX_BUSY;
+		}
+
+		... queue packet to card ...
+		... update tx consumer index ...
+
+		if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1))
+			netif_stop_queue(dev);
+
+		...
+		unlock_tx(dp);
+		...
+		return NETDEV_TX_OK;
+	}
+
+   And then at the end of your TX reclamation event handling:
+
+	if (netif_queue_stopped(dp->dev) &&
+            TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1))
+		netif_wake_queue(dp->dev);
+
+   For a non-scatter-gather supporting card, the three tests simply become:
+
+		/* This is a hard error log it. */
+		if (TX_BUFFS_AVAIL(dp) <= 0)
+
+   and:
+
+		if (TX_BUFFS_AVAIL(dp) == 0)
+
+   and:
+
+	if (netif_queue_stopped(dp->dev) &&
+            TX_BUFFS_AVAIL(dp) > 0)
+		netif_wake_queue(dp->dev);
+
+2) An ndo_start_xmit method must not modify the shared parts of a
+   cloned SKB.
+
+3) Do not forget that once you return NETDEV_TX_OK from your
+   ndo_start_xmit method, it is your driver's responsibility to free
+   up the SKB and in some finite amount of time.
+
+   For example, this means that it is not allowed for your TX
+   mitigation scheme to let TX packets "hang out" in the TX
+   ring unreclaimed forever if no new TX packets are sent.
+   This error can deadlock sockets waiting for send buffer room
+   to be freed up.
+
+   If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you
+   must not keep any reference to that SKB and you must not attempt
+   to free it up.
+
+Probing guidelines:
+
+1) Any hardware layer address you obtain for your device should
+   be verified.  For example, for ethernet check it with
+   linux/etherdevice.h:is_valid_ether_addr()
+
+Close/stop guidelines:
+
+1) After the ndo_stop routine has been called, the hardware must
+   not receive or transmit any data.  All in flight packets must
+   be aborted. If necessary, poll or wait for completion of 
+   any reset commands.
+
+2) The ndo_stop routine will be called by unregister_netdevice
+   if device is still UP.
diff --git a/Documentation/networking/dsa/bcm_sf2.txt b/Documentation/networking/dsa/bcm_sf2.txt
new file mode 100644
index 000000000..eba3a2431
--- /dev/null
+++ b/Documentation/networking/dsa/bcm_sf2.txt
@@ -0,0 +1,114 @@
+Broadcom Starfighter 2 Ethernet switch driver
+=============================================
+
+Broadcom's Starfighter 2 Ethernet switch hardware block is commonly found and
+deployed in the following products:
+
+- xDSL gateways such as BCM63138
+- streaming/multimedia Set Top Box such as BCM7445
+- Cable Modem/residential gateways such as BCM7145/BCM3390
+
+The switch is typically deployed in a configuration involving between 5 to 13
+ports, offering a range of built-in and customizable interfaces:
+
+- single integrated Gigabit PHY
+- quad integrated Gigabit PHY
+- quad external Gigabit PHY w/ MDIO multiplexer
+- integrated MoCA PHY
+- several external MII/RevMII/GMII/RGMII interfaces
+
+The switch also supports specific congestion control features which allow MoCA
+fail-over not to lose packets during a MoCA role re-election, as well as out of
+band back-pressure to the host CPU network interface when downstream interfaces
+are connected at a lower speed.
+
+The switch hardware block is typically interfaced using MMIO accesses and
+contains a bunch of sub-blocks/registers:
+
+* SWITCH_CORE: common switch registers
+* SWITCH_REG: external interfaces switch register
+* SWITCH_MDIO: external MDIO bus controller (there is another one in SWITCH_CORE,
+  which is used for indirect PHY accesses)
+* SWITCH_INDIR_RW: 64-bits wide register helper block
+* SWITCH_INTRL2_0/1: Level-2 interrupt controllers
+* SWITCH_ACB: Admission control block
+* SWITCH_FCB: Fail-over control block
+
+Implementation details
+======================
+
+The driver is located in drivers/net/dsa/bcm_sf2.c and is implemented as a DSA
+driver; see Documentation/networking/dsa/dsa.txt for details on the subsystem
+and what it provides.
+
+The SF2 switch is configured to enable a Broadcom specific 4-bytes switch tag
+which gets inserted by the switch for every packet forwarded to the CPU
+interface, conversely, the CPU network interface should insert a similar tag for
+packets entering the CPU port. The tag format is described in
+net/dsa/tag_brcm.c.
+
+Overall, the SF2 driver is a fairly regular DSA driver; there are a few
+specifics covered below.
+
+Device Tree probing
+-------------------
+
+The DSA platform device driver is probed using a specific compatible string
+provided in net/dsa/dsa.c. The reason for that is because the DSA subsystem gets
+registered as a platform device driver currently. DSA will provide the needed
+device_node pointers which are then accessible by the switch driver setup
+function to setup resources such as register ranges and interrupts. This
+currently works very well because none of the of_* functions utilized by the
+driver require a struct device to be bound to a struct device_node, but things
+may change in the future.
+
+MDIO indirect accesses
+----------------------
+
+Due to a limitation in how Broadcom switches have been designed, external
+Broadcom switches connected to a SF2 require the use of the DSA slave MDIO bus
+in order to properly configure them. By default, the SF2 pseudo-PHY address, and
+an external switch pseudo-PHY address will both be snooping for incoming MDIO
+transactions, since they are at the same address (30), resulting in some kind of
+"double" programming. Using DSA, and setting ds->phys_mii_mask accordingly, we
+selectively divert reads and writes towards external Broadcom switches
+pseudo-PHY addresses. Newer revisions of the SF2 hardware have introduced a
+configurable pseudo-PHY address which circumvents the initial design limitation.
+
+Multimedia over CoAxial (MoCA) interfaces
+-----------------------------------------
+
+MoCA interfaces are fairly specific and require the use of a firmware blob which
+gets loaded onto the MoCA processor(s) for packet processing. The switch
+hardware contains logic which will assert/de-assert link states accordingly for
+the MoCA interface whenever the MoCA coaxial cable gets disconnected or the
+firmware gets reloaded. The SF2 driver relies on such events to properly set its
+MoCA interface carrier state and properly report this to the networking stack.
+
+The MoCA interfaces are supported using the PHY library's fixed PHY/emulated PHY
+device and the switch driver registers a fixed_link_update callback for such
+PHYs which reflects the link state obtained from the interrupt handler.
+
+
+Power Management
+----------------
+
+Whenever possible, the SF2 driver tries to minimize the overall switch power
+consumption by applying a combination of:
+
+- turning off internal buffers/memories
+- disabling packet processing logic
+- putting integrated PHYs in IDDQ/low-power
+- reducing the switch core clock based on the active port count
+- enabling and advertising EEE
+- turning off RGMII data processing logic when the link goes down
+
+Wake-on-LAN
+-----------
+
+Wake-on-LAN is currently implemented by utilizing the host processor Ethernet
+MAC controller wake-on logic. Whenever Wake-on-LAN is requested, an intersection
+between the user request and the supported host Ethernet interface WoL
+capabilities is done and the intersection result gets configured. During
+system-wide suspend/resume, only ports not participating in Wake-on-LAN are
+disabled.
diff --git a/Documentation/networking/dsa/dsa.txt b/Documentation/networking/dsa/dsa.txt
new file mode 100644
index 000000000..25170ad7d
--- /dev/null
+++ b/Documentation/networking/dsa/dsa.txt
@@ -0,0 +1,601 @@
+Distributed Switch Architecture
+===============================
+
+Introduction
+============
+
+This document describes the Distributed Switch Architecture (DSA) subsystem
+design principles, limitations, interactions with other subsystems, and how to
+develop drivers for this subsystem as well as a TODO for developers interested
+in joining the effort.
+
+Design principles
+=================
+
+The Distributed Switch Architecture is a subsystem which was primarily designed
+to support Marvell Ethernet switches (MV88E6xxx, a.k.a Linkstreet product line)
+using Linux, but has since evolved to support other vendors as well.
+
+The original philosophy behind this design was to be able to use unmodified
+Linux tools such as bridge, iproute2, ifconfig to work transparently whether
+they configured/queried a switch port network device or a regular network
+device.
+
+An Ethernet switch is typically comprised of multiple front-panel ports, and one
+or more CPU or management port. The DSA subsystem currently relies on the
+presence of a management port connected to an Ethernet controller capable of
+receiving Ethernet frames from the switch. This is a very common setup for all
+kinds of Ethernet switches found in Small Home and Office products: routers,
+gateways, or even top-of-the rack switches. This host Ethernet controller will
+be later referred to as "master" and "cpu" in DSA terminology and code.
+
+The D in DSA stands for Distributed, because the subsystem has been designed
+with the ability to configure and manage cascaded switches on top of each other
+using upstream and downstream Ethernet links between switches. These specific
+ports are referred to as "dsa" ports in DSA terminology and code. A collection
+of multiple switches connected to each other is called a "switch tree".
+
+For each front-panel port, DSA will create specialized network devices which are
+used as controlling and data-flowing endpoints for use by the Linux networking
+stack. These specialized network interfaces are referred to as "slave" network
+interfaces in DSA terminology and code.
+
+The ideal case for using DSA is when an Ethernet switch supports a "switch tag"
+which is a hardware feature making the switch insert a specific tag for each
+Ethernet frames it received to/from specific ports to help the management
+interface figure out:
+
+- what port is this frame coming from
+- what was the reason why this frame got forwarded
+- how to send CPU originated traffic to specific ports
+
+The subsystem does support switches not capable of inserting/stripping tags, but
+the features might be slightly limited in that case (traffic separation relies
+on Port-based VLAN IDs).
+
+Note that DSA does not currently create network interfaces for the "cpu" and
+"dsa" ports because:
+
+- the "cpu" port is the Ethernet switch facing side of the management
+  controller, and as such, would create a duplication of feature, since you
+  would get two interfaces for the same conduit: master netdev, and "cpu" netdev
+
+- the "dsa" port(s) are just conduits between two or more switches, and as such
+  cannot really be used as proper network interfaces either, only the
+  downstream, or the top-most upstream interface makes sense with that model
+
+Switch tagging protocols
+------------------------
+
+DSA currently supports 5 different tagging protocols, and a tag-less mode as
+well. The different protocols are implemented in:
+
+net/dsa/tag_trailer.c: Marvell's 4 trailer tag mode (legacy)
+net/dsa/tag_dsa.c: Marvell's original DSA tag
+net/dsa/tag_edsa.c: Marvell's enhanced DSA tag
+net/dsa/tag_brcm.c: Broadcom's 4 bytes tag
+net/dsa/tag_qca.c: Qualcomm's 2 bytes tag
+
+The exact format of the tag protocol is vendor specific, but in general, they
+all contain something which:
+
+- identifies which port the Ethernet frame came from/should be sent to
+- provides a reason why this frame was forwarded to the management interface
+
+Master network devices
+----------------------
+
+Master network devices are regular, unmodified Linux network device drivers for
+the CPU/management Ethernet interface. Such a driver might occasionally need to
+know whether DSA is enabled (e.g.: to enable/disable specific offload features),
+but the DSA subsystem has been proven to work with industry standard drivers:
+e1000e, mv643xx_eth etc. without having to introduce modifications to these
+drivers. Such network devices are also often referred to as conduit network
+devices since they act as a pipe between the host processor and the hardware
+Ethernet switch.
+
+Networking stack hooks
+----------------------
+
+When a master netdev is used with DSA, a small hook is placed in in the
+networking stack is in order to have the DSA subsystem process the Ethernet
+switch specific tagging protocol. DSA accomplishes this by registering a
+specific (and fake) Ethernet type (later becoming skb->protocol) with the
+networking stack, this is also known as a ptype or packet_type. A typical
+Ethernet Frame receive sequence looks like this:
+
+Master network device (e.g.: e1000e):
+
+Receive interrupt fires:
+- receive function is invoked
+- basic packet processing is done: getting length, status etc.
+- packet is prepared to be processed by the Ethernet layer by calling
+  eth_type_trans
+
+net/ethernet/eth.c:
+
+eth_type_trans(skb, dev)
+	if (dev->dsa_ptr != NULL)
+		-> skb->protocol = ETH_P_XDSA
+
+drivers/net/ethernet/*:
+
+netif_receive_skb(skb)
+	-> iterate over registered packet_type
+		-> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()
+
+net/dsa/dsa.c:
+	-> dsa_switch_rcv()
+		-> invoke switch tag specific protocol handler in
+		   net/dsa/tag_*.c
+
+net/dsa/tag_*.c:
+	-> inspect and strip switch tag protocol to determine originating port
+	-> locate per-port network device
+	-> invoke eth_type_trans() with the DSA slave network device
+	-> invoked netif_receive_skb()
+
+Past this point, the DSA slave network devices get delivered regular Ethernet
+frames that can be processed by the networking stack.
+
+Slave network devices
+---------------------
+
+Slave network devices created by DSA are stacked on top of their master network
+device, each of these network interfaces will be responsible for being a
+controlling and data-flowing end-point for each front-panel port of the switch.
+These interfaces are specialized in order to:
+
+- insert/remove the switch tag protocol (if it exists) when sending traffic
+  to/from specific switch ports
+- query the switch for ethtool operations: statistics, link state,
+  Wake-on-LAN, register dumps...
+- external/internal PHY management: link, auto-negotiation etc.
+
+These slave network devices have custom net_device_ops and ethtool_ops function
+pointers which allow DSA to introduce a level of layering between the networking
+stack/ethtool, and the switch driver implementation.
+
+Upon frame transmission from these slave network devices, DSA will look up which
+switch tagging protocol is currently registered with these network devices, and
+invoke a specific transmit routine which takes care of adding the relevant
+switch tag in the Ethernet frames.
+
+These frames are then queued for transmission using the master network device
+ndo_start_xmit() function, since they contain the appropriate switch tag, the
+Ethernet switch will be able to process these incoming frames from the
+management interface and delivers these frames to the physical switch port.
+
+Graphical representation
+------------------------
+
+Summarized, this is basically how DSA looks like from a network device
+perspective:
+
+
+			|---------------------------
+			| CPU network device (eth0)|
+			----------------------------
+			| <tag added by switch     |
+			|                          |
+			|                          |
+			|        tag added by CPU> |
+		|--------------------------------------------|
+		| Switch driver				     |
+		|--------------------------------------------|
+                    ||        ||         ||
+		|-------|  |-------|  |-------|
+		| sw0p0 |  | sw0p1 |  | sw0p2 |
+		|-------|  |-------|  |-------|
+
+Slave MDIO bus
+--------------
+
+In order to be able to read to/from a switch PHY built into it, DSA creates a
+slave MDIO bus which allows a specific switch driver to divert and intercept
+MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
+switches, these functions would utilize direct or indirect PHY addressing mode
+to return standard MII registers from the switch builtin PHYs, allowing the PHY
+library and/or to return link status, link partner pages, auto-negotiation
+results etc..
+
+For Ethernet switches which have both external and internal MDIO busses, the
+slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
+internal or external MDIO devices this switch might be connected to: internal
+PHYs, external PHYs, or even external switches.
+
+Data structures
+---------------
+
+DSA data structures are defined in include/net/dsa.h as well as
+net/dsa/dsa_priv.h.
+
+dsa_chip_data: platform data configuration for a given switch device, this
+structure describes a switch device's parent device, its address, as well as
+various properties of its ports: names/labels, and finally a routing table
+indication (when cascading switches)
+
+dsa_platform_data: platform device configuration data which can reference a
+collection of dsa_chip_data structure if multiples switches are cascaded, the
+master network device this switch tree is attached to needs to be referenced
+
+dsa_switch_tree: structure assigned to the master network device under
+"dsa_ptr", this structure references a dsa_platform_data structure as well as
+the tagging protocol supported by the switch tree, and which receive/transmit
+function hooks should be invoked, information about the directly attached switch
+is also provided: CPU port. Finally, a collection of dsa_switch are referenced
+to address individual switches in the tree.
+
+dsa_switch: structure describing a switch device in the tree, referencing a
+dsa_switch_tree as a backpointer, slave network devices, master network device,
+and a reference to the backing dsa_switch_ops
+
+dsa_switch_ops: structure referencing function pointers, see below for a full
+description.
+
+Design limitations
+==================
+
+DSA is a platform device driver
+-------------------------------
+
+DSA is implemented as a DSA platform device driver which is convenient because
+it will register the entire DSA switch tree attached to a master network device
+in one-shot, facilitating the device creation and simplifying the device driver
+model a bit, this comes however with a number of limitations:
+
+- building DSA and its switch drivers as modules is currently not working
+- the device driver parenting does not necessarily reflect the original
+  bus/device the switch can be created from
+- supporting non-MDIO and non-MMIO (platform) switches is not possible
+
+Limits on the number of devices and ports
+-----------------------------------------
+
+DSA currently limits the number of maximum switches within a tree to 4
+(DSA_MAX_SWITCHES), and the number of ports per switch to 12 (DSA_MAX_PORTS).
+These limits could be extended to support larger configurations would this need
+arise.
+
+Lack of CPU/DSA network devices
+-------------------------------
+
+DSA does not currently create slave network devices for the CPU or DSA ports, as
+described before. This might be an issue in the following cases:
+
+- inability to fetch switch CPU port statistics counters using ethtool, which
+  can make it harder to debug MDIO switch connected using xMII interfaces
+
+- inability to configure the CPU port link parameters based on the Ethernet
+  controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/
+
+- inability to configure specific VLAN IDs / trunking VLANs between switches
+  when using a cascaded setup
+
+Common pitfalls using DSA setups
+--------------------------------
+
+Once a master network device is configured to use DSA (dev->dsa_ptr becomes
+non-NULL), and the switch behind it expects a tagging protocol, this network
+interface can only exclusively be used as a conduit interface. Sending packets
+directly through this interface (e.g.: opening a socket using this interface)
+will not make us go through the switch tagging protocol transmit function, so
+the Ethernet switch on the other end, expecting a tag will typically drop this
+frame.
+
+Slave network devices check that the master network device is UP before allowing
+you to administratively bring UP these slave network devices. A common
+configuration mistake is forgetting to bring UP the master network device first.
+
+Interactions with other subsystems
+==================================
+
+DSA currently leverages the following subsystems:
+
+- MDIO/PHY library: drivers/net/phy/phy.c, mdio_bus.c
+- Switchdev: net/switchdev/*
+- Device Tree for various of_* functions
+
+MDIO/PHY library
+----------------
+
+Slave network devices exposed by DSA may or may not be interfacing with PHY
+devices (struct phy_device as defined in include/linux/phy.h), but the DSA
+subsystem deals with all possible combinations:
+
+- internal PHY devices, built into the Ethernet switch hardware
+- external PHY devices, connected via an internal or external MDIO bus
+- internal PHY devices, connected via an internal MDIO bus
+- special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a
+  fixed PHYs
+
+The PHY configuration is done by the dsa_slave_phy_setup() function and the
+logic basically looks like this:
+
+- if Device Tree is used, the PHY device is looked up using the standard
+  "phy-handle" property, if found, this PHY device is created and registered
+  using of_phy_connect()
+
+- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
+  the definition of a non-MDIO managed PHY as defined in
+  Documentation/devicetree/bindings/net/fixed-link.txt, the PHY is registered
+  and connected transparently using the special fixed MDIO bus driver
+
+- finally, if the PHY is built into the switch, as is very common with
+  standalone switch packages, the PHY is probed using the slave MII bus created
+  by DSA
+
+
+SWITCHDEV
+---------
+
+DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and
+more specifically with its VLAN filtering portion when configuring VLANs on top
+of per-port slave network devices. Since DSA primarily deals with
+MDIO-connected switches, although not exclusively, SWITCHDEV's
+prepare/abort/commit phases are often simplified into a prepare phase which
+checks whether the operation is supported by the DSA switch driver, and a commit
+phase which applies the changes.
+
+As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
+objects.
+
+Device Tree
+-----------
+
+DSA features a standardized binding which is documented in
+Documentation/devicetree/bindings/net/dsa/dsa.txt. PHY/MDIO library helper
+functions such as of_get_phy_mode(), of_phy_connect() are also used to query
+per-port PHY specific details: interface connection, MDIO bus location etc..
+
+Driver development
+==================
+
+DSA switch drivers need to implement a dsa_switch_ops structure which will
+contain the various members described below.
+
+register_switch_driver() registers this dsa_switch_ops in its internal list
+of drivers to probe for. unregister_switch_driver() does the exact opposite.
+
+Unless requested differently by setting the priv_size member accordingly, DSA
+does not allocate any driver private context space.
+
+Switch configuration
+--------------------
+
+- tag_protocol: this is to indicate what kind of tagging protocol is supported,
+  should be a valid value from the dsa_tag_protocol enum
+
+- probe: probe routine which will be invoked by the DSA platform device upon
+  registration to test for the presence/absence of a switch device. For MDIO
+  devices, it is recommended to issue a read towards internal registers using
+  the switch pseudo-PHY and return whether this is a supported device. For other
+  buses, return a non-NULL string
+
+- setup: setup function for the switch, this function is responsible for setting
+  up the dsa_switch_ops private structure with all it needs: register maps,
+  interrupts, mutexes, locks etc.. This function is also expected to properly
+  configure the switch to separate all network interfaces from each other, that
+  is, they should be isolated by the switch hardware itself, typically by creating
+  a Port-based VLAN ID for each port and allowing only the CPU port and the
+  specific port to be in the forwarding vector. Ports that are unused by the
+  platform should be disabled. Past this function, the switch is expected to be
+  fully configured and ready to serve any kind of request. It is recommended
+  to issue a software reset of the switch during this setup function in order to
+  avoid relying on what a previous software agent such as a bootloader/firmware
+  may have previously configured.
+
+PHY devices and link management
+-------------------------------
+
+- get_phy_flags: Some switches are interfaced to various kinds of Ethernet PHYs,
+  if the PHY library PHY driver needs to know about information it cannot obtain
+  on its own (e.g.: coming from switch memory mapped registers), this function
+  should return a 32-bits bitmask of "flags", that is private between the switch
+  driver and the Ethernet PHY driver in drivers/net/phy/*.
+
+- phy_read: Function invoked by the DSA slave MDIO bus when attempting to read
+  the switch port MDIO registers. If unavailable, return 0xffff for each read.
+  For builtin switch Ethernet PHYs, this function should allow reading the link
+  status, auto-negotiation results, link partner pages etc..
+
+- phy_write: Function invoked by the DSA slave MDIO bus when attempting to write
+  to the switch port MDIO registers. If unavailable return a negative error
+  code.
+
+- adjust_link: Function invoked by the PHY library when a slave network device
+  is attached to a PHY device. This function is responsible for appropriately
+  configuring the switch port link parameters: speed, duplex, pause based on
+  what the phy_device is providing.
+
+- fixed_link_update: Function invoked by the PHY library, and specifically by
+  the fixed PHY driver asking the switch driver for link parameters that could
+  not be auto-negotiated, or obtained by reading the PHY registers through MDIO.
+  This is particularly useful for specific kinds of hardware such as QSGMII,
+  MoCA or other kinds of non-MDIO managed PHYs where out of band link
+  information is obtained
+
+Ethtool operations
+------------------
+
+- get_strings: ethtool function used to query the driver's strings, will
+  typically return statistics strings, private flags strings etc.
+
+- get_ethtool_stats: ethtool function used to query per-port statistics and
+  return their values. DSA overlays slave network devices general statistics:
+  RX/TX counters from the network device, with switch driver specific statistics
+  per port
+
+- get_sset_count: ethtool function used to query the number of statistics items
+
+- get_wol: ethtool function used to obtain Wake-on-LAN settings per-port, this
+  function may, for certain implementations also query the master network device
+  Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN
+
+- set_wol: ethtool function used to configure Wake-on-LAN settings per-port,
+  direct counterpart to set_wol with similar restrictions
+
+- set_eee: ethtool function which is used to configure a switch port EEE (Green
+  Ethernet) settings, can optionally invoke the PHY library to enable EEE at the
+  PHY level if relevant. This function should enable EEE at the switch port MAC
+  controller and data-processing logic
+
+- get_eee: ethtool function which is used to query a switch port EEE settings,
+  this function should return the EEE state of the switch port MAC controller
+  and data-processing logic as well as query the PHY for its currently configured
+  EEE settings
+
+- get_eeprom_len: ethtool function returning for a given switch the EEPROM
+  length/size in bytes
+
+- get_eeprom: ethtool function returning for a given switch the EEPROM contents
+
+- set_eeprom: ethtool function writing specified data to a given switch EEPROM
+
+- get_regs_len: ethtool function returning the register length for a given
+  switch
+
+- get_regs: ethtool function returning the Ethernet switch internal register
+  contents. This function might require user-land code in ethtool to
+  pretty-print register values and registers
+
+Power management
+----------------
+
+- suspend: function invoked by the DSA platform device when the system goes to
+  suspend, should quiesce all Ethernet switch activities, but keep ports
+  participating in Wake-on-LAN active as well as additional wake-up logic if
+  supported
+
+- resume: function invoked by the DSA platform device when the system resumes,
+  should resume all Ethernet switch activities and re-configure the switch to be
+  in a fully active state
+
+- port_enable: function invoked by the DSA slave network device ndo_open
+  function when a port is administratively brought up, this function should be
+  fully enabling a given switch port. DSA takes care of marking the port with
+  BR_STATE_BLOCKING if the port is a bridge member, or BR_STATE_FORWARDING if it
+  was not, and propagating these changes down to the hardware
+
+- port_disable: function invoked by the DSA slave network device ndo_close
+  function when a port is administratively brought down, this function should be
+  fully disabling a given switch port. DSA takes care of marking the port with
+  BR_STATE_DISABLED and propagating changes to the hardware if this port is
+  disabled while being a bridge member
+
+Bridge layer
+------------
+
+- port_bridge_join: bridge layer function invoked when a given switch port is
+  added to a bridge, this function should be doing the necessary at the switch
+  level to permit the joining port from being added to the relevant logical
+  domain for it to ingress/egress traffic with other members of the bridge.
+
+- port_bridge_leave: bridge layer function invoked when a given switch port is
+  removed from a bridge, this function should be doing the necessary at the
+  switch level to deny the leaving port from ingress/egress traffic from the
+  remaining bridge members. When the port leaves the bridge, it should be aged
+  out at the switch hardware for the switch to (re) learn MAC addresses behind
+  this port.
+
+- port_stp_state_set: bridge layer function invoked when a given switch port STP
+  state is computed by the bridge layer and should be propagated to switch
+  hardware to forward/block/learn traffic. The switch driver is responsible for
+  computing a STP state change based on current and asked parameters and perform
+  the relevant ageing based on the intersection results
+
+Bridge VLAN filtering
+---------------------
+
+- port_vlan_filtering: bridge layer function invoked when the bridge gets
+  configured for turning on or off VLAN filtering. If nothing specific needs to
+  be done at the hardware level, this callback does not need to be implemented.
+  When VLAN filtering is turned on, the hardware must be programmed with
+  rejecting 802.1Q frames which have VLAN IDs outside of the programmed allowed
+  VLAN ID map/rules.  If there is no PVID programmed into the switch port,
+  untagged frames must be rejected as well. When turned off the switch must
+  accept any 802.1Q frames irrespective of their VLAN ID, and untagged frames are
+  allowed.
+
+- port_vlan_prepare: bridge layer function invoked when the bridge prepares the
+  configuration of a VLAN on the given port. If the operation is not supported
+  by the hardware, this function should return -EOPNOTSUPP to inform the bridge
+  code to fallback to a software implementation. No hardware setup must be done
+  in this function. See port_vlan_add for this and details.
+
+- port_vlan_add: bridge layer function invoked when a VLAN is configured
+  (tagged or untagged) for the given switch port
+
+- port_vlan_del: bridge layer function invoked when a VLAN is removed from the
+  given switch port
+
+- port_vlan_dump: bridge layer function invoked with a switchdev callback
+  function that the driver has to call for each VLAN the given port is a member
+  of. A switchdev object is used to carry the VID and bridge flags.
+
+- port_fdb_prepare: bridge layer function invoked when the bridge prepares the
+  installation of a Forwarding Database entry. If the operation is not
+  supported, this function should return -EOPNOTSUPP to inform the bridge code
+  to fallback to a software implementation. No hardware setup must be done in
+  this function. See port_fdb_add for this and details.
+
+- port_fdb_add: bridge layer function invoked when the bridge wants to install a
+  Forwarding Database entry, the switch hardware should be programmed with the
+  specified address in the specified VLAN Id in the forwarding database
+  associated with this VLAN ID
+
+Note: VLAN ID 0 corresponds to the port private database, which, in the context
+of DSA, would be the its port-based VLAN, used by the associated bridge device.
+
+- port_fdb_del: bridge layer function invoked when the bridge wants to remove a
+  Forwarding Database entry, the switch hardware should be programmed to delete
+  the specified MAC address from the specified VLAN ID if it was mapped into
+  this port forwarding database
+
+- port_fdb_dump: bridge layer function invoked with a switchdev callback
+  function that the driver has to call for each MAC address known to be behind
+  the given port. A switchdev object is used to carry the VID and FDB info.
+
+- port_mdb_prepare: bridge layer function invoked when the bridge prepares the
+  installation of a multicast database entry. If the operation is not supported,
+  this function should return -EOPNOTSUPP to inform the bridge code to fallback
+  to a software implementation. No hardware setup must be done in this function.
+  See port_fdb_add for this and details.
+
+- port_mdb_add: bridge layer function invoked when the bridge wants to install
+  a multicast database entry, the switch hardware should be programmed with the
+  specified address in the specified VLAN ID in the forwarding database
+  associated with this VLAN ID.
+
+Note: VLAN ID 0 corresponds to the port private database, which, in the context
+of DSA, would be the its port-based VLAN, used by the associated bridge device.
+
+- port_mdb_del: bridge layer function invoked when the bridge wants to remove a
+  multicast database entry, the switch hardware should be programmed to delete
+  the specified MAC address from the specified VLAN ID if it was mapped into
+  this port forwarding database.
+
+- port_mdb_dump: bridge layer function invoked with a switchdev callback
+  function that the driver has to call for each MAC address known to be behind
+  the given port. A switchdev object is used to carry the VID and MDB info.
+
+TODO
+====
+
+Making SWITCHDEV and DSA converge towards an unified codebase
+-------------------------------------------------------------
+
+SWITCHDEV properly takes care of abstracting the networking stack with offload
+capable hardware, but does not enforce a strict switch device driver model. On
+the other DSA enforces a fairly strict device driver model, and deals with most
+of the switch specific. At some point we should envision a merger between these
+two subsystems and get the best of both worlds.
+
+Other hanging fruits
+--------------------
+
+- making the number of ports fully dynamic and not dependent on DSA_MAX_PORTS
+- allowing more than one CPU/management interface:
+  http://comments.gmane.org/gmane.linux.network/365657
+- porting more drivers from other vendors:
+  http://comments.gmane.org/gmane.linux.network/365510
diff --git a/Documentation/networking/dsa/lan9303.txt b/Documentation/networking/dsa/lan9303.txt
new file mode 100644
index 000000000..144b02b95
--- /dev/null
+++ b/Documentation/networking/dsa/lan9303.txt
@@ -0,0 +1,37 @@
+LAN9303 Ethernet switch driver
+==============================
+
+The LAN9303 is a three port 10/100 Mbps ethernet switch with integrated phys for
+the two external ethernet ports. The third port is an RMII/MII interface to a
+host master network interface (e.g. fixed link).
+
+
+Driver details
+==============
+
+The driver is implemented as a DSA driver, see
+Documentation/networking/dsa/dsa.txt.
+
+See Documentation/devicetree/bindings/net/dsa/lan9303.txt for device tree
+binding.
+
+The LAN9303 can be managed both via MDIO and I2C, both supported by this driver.
+
+At startup the driver configures the device to provide two separate network
+interfaces (which is the default state of a DSA device). Due to HW limitations,
+no HW MAC learning takes place in this mode.
+
+When both user ports are joined to the same bridge, the normal HW MAC learning
+is enabled. This means that unicast traffic is forwarded in HW. Broadcast and
+multicast is flooded in HW. STP is also supported in this mode. The driver
+support fdb/mdb operations as well, meaning IGMP snooping is supported.
+
+If one of the user ports leave the bridge, the ports goes back to the initial
+separated operation.
+
+
+Driver limitations
+==================
+
+ - Support for VLAN filtering is not implemented
+ - The HW does not support VLAN-specific fdb entries
diff --git a/Documentation/networking/e100.rst b/Documentation/networking/e100.rst
new file mode 100644
index 000000000..f81111eba
--- /dev/null
+++ b/Documentation/networking/e100.rst
@@ -0,0 +1,186 @@
+==============================================================
+Linux* Base Driver for the Intel(R) PRO/100 Family of Adapters
+==============================================================
+
+June 1, 2018
+
+Contents
+========
+
+- In This Release
+- Identifying Your Adapter
+- Building and Installation
+- Driver Configuration Parameters
+- Additional Configurations
+- Known Issues
+- Support
+
+
+In This Release
+===============
+
+This file describes the Linux* Base Driver for the Intel(R) PRO/100 Family of
+Adapters. This driver includes support for Itanium(R)2-based systems.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your Intel PRO/100 adapter.
+
+The following features are now available in supported kernels:
+ - Native VLANs
+ - Channel Bonding (teaming)
+ - SNMP
+
+Channel Bonding documentation can be found in the Linux kernel source:
+/Documentation/networking/bonding.txt
+
+
+Identifying Your Adapter
+========================
+
+For information on how to identify your adapter, and for the latest Intel
+network drivers, refer to the Intel Support website:
+http://www.intel.com/support
+
+Driver Configuration Parameters
+===============================
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+Rx Descriptors:
+   Number of receive descriptors. A receive descriptor is a data
+   structure that describes a receive buffer and its attributes to the network
+   controller. The data in the descriptor is used by the controller to write
+   data from the controller to host memory. In the 3.x.x driver the valid range
+   for this parameter is 64-256. The default value is 256. This parameter can be
+   changed using the command::
+
+     ethtool -G eth? rx n
+
+   Where n is the number of desired Rx descriptors.
+
+Tx Descriptors:
+   Number of transmit descriptors. A transmit descriptor is a data
+   structure that describes a transmit buffer and its attributes to the network
+   controller. The data in the descriptor is used by the controller to read
+   data from the host memory to the controller. In the 3.x.x driver the valid
+   range for this parameter is 64-256. The default value is 128. This parameter
+   can be changed using the command::
+
+     ethtool -G eth? tx n
+
+   Where n is the number of desired Tx descriptors.
+
+Speed/Duplex:
+   The driver auto-negotiates the link speed and duplex settings by
+   default. The ethtool utility can be used as follows to force speed/duplex.::
+
+     ethtool -s eth?  autoneg off speed {10|100} duplex {full|half}
+
+   NOTE: setting the speed/duplex to incorrect values will cause the link to
+   fail.
+
+Event Log Message Level:
+   The driver uses the message level flag to log events
+   to syslog. The message level can be set at driver load time. It can also be
+   set using the command::
+
+     ethtool -s eth? msglvl n
+
+
+Additional Configurations
+=========================
+
+Configuring the Driver on Different Distributions
+-------------------------------------------------
+
+Configuring a network driver to load properly when the system is started
+is distribution dependent.  Typically, the configuration process involves
+adding an alias line to `/etc/modprobe.d/*.conf` as well as editing other
+system startup scripts and/or configuration files.  Many popular Linux
+distributions ship with tools to make these changes for you.  To learn
+the proper way to configure a network device for your system, refer to
+your distribution documentation.  If during this process you are asked
+for the driver or module name, the name for the Linux Base Driver for
+the Intel PRO/100 Family of Adapters is e100.
+
+As an example, if you install the e100 driver for two PRO/100 adapters
+(eth0 and eth1), add the following to a configuration file in
+/etc/modprobe.d/::
+
+       alias eth0 e100
+       alias eth1 e100
+
+Viewing Link Messages
+---------------------
+
+In order to see link messages and other Intel driver information on your
+console, you must set the dmesg level up to six.  This can be done by
+entering the following on the command line before loading the e100
+driver::
+
+       dmesg -n 6
+
+If you wish to see all messages issued by the driver, including debug
+messages, set the dmesg level to eight.
+
+NOTE: This setting is not saved across reboots.
+
+ethtool
+-------
+
+The driver utilizes the ethtool interface for driver configuration and
+diagnostics, as well as displaying statistical information.  The ethtool
+version 1.6 or later is required for this functionality.
+
+The latest release of ethtool can be found from
+https://www.kernel.org/pub/software/network/ethtool/
+
+Enabling Wake on LAN* (WoL)
+---------------------------
+WoL is provided through the ethtool* utility.  For instructions on
+enabling WoL with ethtool, refer to the ethtool man page.  WoL will be
+enabled on the system during the next shut down or reboot.  For this
+driver version, in order to enable WoL, the e100 driver must be loaded
+when shutting down or rebooting the system.
+
+NAPI
+----
+
+NAPI (Rx polling mode) is supported in the e100 driver.
+
+See https://wiki.linuxfoundation.org/networking/napi for more
+information on NAPI.
+
+Multiple Interfaces on Same Ethernet Broadcast Network
+------------------------------------------------------
+
+Due to the default ARP behavior on Linux, it is not possible to have one
+system on two IP networks in the same Ethernet broadcast domain
+(non-partitioned switch) behave as expected.  All Ethernet interfaces
+will respond to IP traffic for any IP address assigned to the system.
+This results in unbalanced receive traffic.
+
+If you have multiple interfaces in a server, either turn on ARP
+filtering by
+
+(1) entering::
+
+	echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+
+    (this only works if your kernel's version is higher than 2.4.5), or
+
+(2) installing the interfaces in separate broadcast domains (either
+    in different switches or in a switch partitioned to VLANs).
+
+
+Support
+=======
+For general information, go to the Intel support website at:
+http://www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+http://sourceforge.net/projects/e1000
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the issue
+to e1000-devel@lists.sf.net.
diff --git a/Documentation/networking/e1000.rst b/Documentation/networking/e1000.rst
new file mode 100644
index 000000000..f10dd4086
--- /dev/null
+++ b/Documentation/networking/e1000.rst
@@ -0,0 +1,461 @@
+===========================================================
+Linux* Base Driver for Intel(R) Ethernet Network Connection
+===========================================================
+
+Intel Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Command Line Parameters
+- Speed and Duplex Configuration
+- Additional Configurations
+- Support
+
+Identifying Your Adapter
+========================
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+For the latest Intel network drivers for Linux, refer to the following
+website.  In the search field, enter your adapter name or type, or use the
+networking link on the left to search for your adapter:
+
+    http://support.intel.com/support/go/network/adapter/home.htm
+
+Command Line Parameters
+=======================
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+NOTES:
+	For more information about the AutoNeg, Duplex, and Speed
+        parameters, see the "Speed and Duplex Configuration" section in
+        this document.
+
+        For more information about the InterruptThrottleRate,
+        RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
+        parameters, see the application note at:
+        http://www.intel.com/design/network/applnots/ap450.htm
+
+AutoNeg
+-------
+
+(Supported only on adapters with copper connections)
+
+:Valid Range:   0x01-0x0F, 0x20-0x2F
+:Default Value: 0x2F
+
+This parameter is a bit-mask that specifies the speed and duplex settings
+advertised by the adapter.  When this parameter is used, the Speed and
+Duplex parameters must not be specified.
+
+NOTE:
+       Refer to the Speed and Duplex section of this readme for more
+       information on the AutoNeg parameter.
+
+Duplex
+------
+
+(Supported only on adapters with copper connections)
+
+:Valid Range:   0-2 (0=auto-negotiate, 1=half, 2=full)
+:Default Value: 0
+
+This defines the direction in which data is allowed to flow.  Can be
+either one or two-directional.  If both Duplex and the link partner are
+set to auto-negotiate, the board auto-detects the correct duplex.  If the
+link partner is forced (either full or half), Duplex defaults to half-
+duplex.
+
+FlowControl
+-----------
+
+:Valid Range:   0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
+:Default Value: Reads flow control settings from the EEPROM
+
+This parameter controls the automatic generation(Tx) and response(Rx)
+to Ethernet PAUSE frames.
+
+InterruptThrottleRate
+---------------------
+
+(not supported on Intel(R) 82542, 82543 or 82544-based adapters)
+
+:Valid Range:
+   0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
+   4=simplified balancing)
+:Default Value: 3
+
+The driver can limit the amount of interrupts per second that the adapter
+will generate for incoming packets. It does this by writing a value to the
+adapter that is based on the maximum amount of interrupts that the adapter
+will generate per second.
+
+Setting InterruptThrottleRate to a value greater or equal to 100
+will program the adapter to send out a maximum of that many interrupts
+per second, even if more packets have come in. This reduces interrupt
+load on the system and can lower CPU utilization under heavy load,
+but will increase latency as packets are not processed as quickly.
+
+The default behaviour of the driver previously assumed a static
+InterruptThrottleRate value of 8000, providing a good fallback value for
+all traffic types,but lacking in small packet performance and latency.
+The hardware can handle many more small packets per second however, and
+for this reason an adaptive interrupt moderation algorithm was implemented.
+
+Since 7.3.x, the driver has two adaptive modes (setting 1 or 3) in which
+it dynamically adjusts the InterruptThrottleRate value based on the traffic
+that it receives. After determining the type of incoming traffic in the last
+timeframe, it will adjust the InterruptThrottleRate to an appropriate value
+for that traffic.
+
+The algorithm classifies the incoming traffic every interval into
+classes.  Once the class is determined, the InterruptThrottleRate value is
+adjusted to suit that traffic type the best. There are three classes defined:
+"Bulk traffic", for large amounts of packets of normal size; "Low latency",
+for small amounts of traffic and/or a significant percentage of small
+packets; and "Lowest latency", for almost completely small packets or
+minimal traffic.
+
+In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
+for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
+latency" or "Lowest latency" class, the InterruptThrottleRate is increased
+stepwise to 20000. This default mode is suitable for most applications.
+
+For situations where low latency is vital such as cluster or
+grid computing, the algorithm can reduce latency even more when
+InterruptThrottleRate is set to mode 1. In this mode, which operates
+the same as mode 3, the InterruptThrottleRate will be increased stepwise to
+70000 for traffic in class "Lowest latency".
+
+In simplified mode the interrupt rate is based on the ratio of TX and
+RX traffic.  If the bytes per second rate is approximately equal, the
+interrupt rate will drop as low as 2000 interrupts per second.  If the
+traffic is mostly transmit or mostly receive, the interrupt rate could
+be as high as 8000.
+
+Setting InterruptThrottleRate to 0 turns off any interrupt moderation
+and may improve small packet latency, but is generally not suitable
+for bulk throughput traffic.
+
+NOTE:
+       InterruptThrottleRate takes precedence over the TxAbsIntDelay and
+       RxAbsIntDelay parameters.  In other words, minimizing the receive
+       and/or transmit absolute delays does not force the controller to
+       generate more interrupts than what the Interrupt Throttle Rate
+       allows.
+
+CAUTION:
+          If you are using the Intel(R) PRO/1000 CT Network Connection
+          (controller 82547), setting InterruptThrottleRate to a value
+          greater than 75,000, may hang (stop transmitting) adapters
+          under certain network conditions.  If this occurs a NETDEV
+          WATCHDOG message is logged in the system event log.  In
+          addition, the controller is automatically reset, restoring
+          the network connection.  To eliminate the potential for the
+          hang, ensure that InterruptThrottleRate is set no greater
+          than 75,000 and is not set to 0.
+
+NOTE:
+       When e1000 is loaded with default settings and multiple adapters
+       are in use simultaneously, the CPU utilization may increase non-
+       linearly.  In order to limit the CPU utilization without impacting
+       the overall throughput, we recommend that you load the driver as
+       follows::
+
+           modprobe e1000 InterruptThrottleRate=3000,3000,3000
+
+       This sets the InterruptThrottleRate to 3000 interrupts/sec for
+       the first, second, and third instances of the driver.  The range
+       of 2000 to 3000 interrupts per second works on a majority of
+       systems and is a good starting point, but the optimal value will
+       be platform-specific.  If CPU utilization is not a concern, use
+       RX_POLLING (NAPI) and default driver settings.
+
+RxDescriptors
+-------------
+
+:Valid Range:
+ - 48-256 for 82542 and 82543-based adapters
+ - 48-4096 for all other supported adapters
+:Default Value: 256
+
+This value specifies the number of receive buffer descriptors allocated
+by the driver.  Increasing this value allows the driver to buffer more
+incoming packets, at the expense of increased system memory utilization.
+
+Each descriptor is 16 bytes.  A receive buffer is also allocated for each
+descriptor and can be either 2048, 4096, 8192, or 16384 bytes, depending
+on the MTU setting. The maximum MTU size is 16110.
+
+NOTE:
+       MTU designates the frame size.  It only needs to be set for Jumbo
+       Frames.  Depending on the available system resources, the request
+       for a higher number of receive descriptors may be denied.  In this
+       case, use a lower number.
+
+RxIntDelay
+----------
+
+:Valid Range:   0-65535 (0=off)
+:Default Value: 0
+
+This value delays the generation of receive interrupts in units of 1.024
+microseconds.  Receive interrupt reduction can improve CPU efficiency if
+properly tuned for specific network traffic.  Increasing this value adds
+extra latency to frame reception and can end up decreasing the throughput
+of TCP traffic.  If the system is reporting dropped receives, this value
+may be set too high, causing the driver to run out of available receive
+descriptors.
+
+CAUTION:
+          When setting RxIntDelay to a value other than 0, adapters may
+          hang (stop transmitting) under certain network conditions.  If
+          this occurs a NETDEV WATCHDOG message is logged in the system
+          event log.  In addition, the controller is automatically reset,
+          restoring the network connection.  To eliminate the potential
+          for the hang ensure that RxIntDelay is set to 0.
+
+RxAbsIntDelay
+-------------
+
+(This parameter is supported only on 82540, 82545 and later adapters.)
+
+:Valid Range:   0-65535 (0=off)
+:Default Value: 128
+
+This value, in units of 1.024 microseconds, limits the delay in which a
+receive interrupt is generated.  Useful only if RxIntDelay is non-zero,
+this value ensures that an interrupt is generated after the initial
+packet is received within the set amount of time.  Proper tuning,
+along with RxIntDelay, may improve traffic throughput in specific network
+conditions.
+
+Speed
+-----
+
+(This parameter is supported only on adapters with copper connections.)
+
+:Valid Settings: 0, 10, 100, 1000
+:Default Value:  0 (auto-negotiate at all supported speeds)
+
+Speed forces the line speed to the specified value in megabits per second
+(Mbps).  If this parameter is not specified or is set to 0 and the link
+partner is set to auto-negotiate, the board will auto-detect the correct
+speed.  Duplex should also be set when Speed is set to either 10 or 100.
+
+TxDescriptors
+-------------
+
+:Valid Range:
+  - 48-256 for 82542 and 82543-based adapters
+  - 48-4096 for all other supported adapters
+:Default Value: 256
+
+This value is the number of transmit descriptors allocated by the driver.
+Increasing this value allows the driver to queue more transmits.  Each
+descriptor is 16 bytes.
+
+NOTE:
+       Depending on the available system resources, the request for a
+       higher number of transmit descriptors may be denied.  In this case,
+       use a lower number.
+
+TxIntDelay
+----------
+
+:Valid Range:   0-65535 (0=off)
+:Default Value: 8
+
+This value delays the generation of transmit interrupts in units of
+1.024 microseconds.  Transmit interrupt reduction can improve CPU
+efficiency if properly tuned for specific network traffic.  If the
+system is reporting dropped transmits, this value may be set too high
+causing the driver to run out of available transmit descriptors.
+
+TxAbsIntDelay
+-------------
+
+(This parameter is supported only on 82540, 82545 and later adapters.)
+
+:Valid Range:   0-65535 (0=off)
+:Default Value: 32
+
+This value, in units of 1.024 microseconds, limits the delay in which a
+transmit interrupt is generated.  Useful only if TxIntDelay is non-zero,
+this value ensures that an interrupt is generated after the initial
+packet is sent on the wire within the set amount of time.  Proper tuning,
+along with TxIntDelay, may improve traffic throughput in specific
+network conditions.
+
+XsumRX
+------
+
+(This parameter is NOT supported on the 82542-based adapter.)
+
+:Valid Range:   0-1
+:Default Value: 1
+
+A value of '1' indicates that the driver should enable IP checksum
+offload for received packets (both UDP and TCP) to the adapter hardware.
+
+Copybreak
+---------
+
+:Valid Range:   0-xxxxxxx (0=off)
+:Default Value: 256
+:Usage: modprobe e1000.ko copybreak=128
+
+Driver copies all packets below or equaling this size to a fresh RX
+buffer before handing it up the stack.
+
+This parameter is different than other parameters, in that it is a
+single (not 1,1,1 etc.) parameter applied to all driver instances and
+it is also available during runtime at
+/sys/module/e1000/parameters/copybreak
+
+SmartPowerDownEnable
+--------------------
+
+:Valid Range: 0-1
+:Default Value:  0 (disabled)
+
+Allows PHY to turn off in lower power states. The user can turn off
+this parameter in supported chipsets.
+
+Speed and Duplex Configuration
+==============================
+
+Three keywords are used to control the speed and duplex configuration.
+These keywords are Speed, Duplex, and AutoNeg.
+
+If the board uses a fiber interface, these keywords are ignored, and the
+fiber interface board only links at 1000 Mbps full-duplex.
+
+For copper-based boards, the keywords interact as follows:
+
+- The default operation is auto-negotiate.  The board advertises all
+  supported speed and duplex combinations, and it links at the highest
+  common speed and duplex mode IF the link partner is set to auto-negotiate.
+
+- If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps
+  is advertised (The 1000BaseT spec requires auto-negotiation.)
+
+- If Speed = 10 or 100, then both Speed and Duplex should be set.  Auto-
+  negotiation is disabled, and the AutoNeg parameter is ignored.  Partner
+  SHOULD also be forced.
+
+The AutoNeg parameter is used when more control is required over the
+auto-negotiation process.  It should be used when you wish to control which
+speed and duplex combinations are advertised during the auto-negotiation
+process.
+
+The parameter may be specified as either a decimal or hexadecimal value as
+determined by the bitmap below.
+
+============== ====== ====== ======= ======= ====== ====== ======= ======
+Bit position   7      6      5       4       3      2      1       0
+Decimal Value  128    64     32      16      8      4      2       1
+Hex value      80     40     20      10      8      4      2       1
+Speed (Mbps)   N/A    N/A    1000    N/A     100    100    10      10
+Duplex                       Full            Full   Half   Full    Half
+============== ====== ====== ======= ======= ====== ====== ======= ======
+
+Some examples of using AutoNeg::
+
+  modprobe e1000 AutoNeg=0x01 (Restricts autonegotiation to 10 Half)
+  modprobe e1000 AutoNeg=1 (Same as above)
+  modprobe e1000 AutoNeg=0x02 (Restricts autonegotiation to 10 Full)
+  modprobe e1000 AutoNeg=0x03 (Restricts autonegotiation to 10 Half or 10 Full)
+  modprobe e1000 AutoNeg=0x04 (Restricts autonegotiation to 100 Half)
+  modprobe e1000 AutoNeg=0x05 (Restricts autonegotiation to 10 Half or 100
+  Half)
+  modprobe e1000 AutoNeg=0x020 (Restricts autonegotiation to 1000 Full)
+  modprobe e1000 AutoNeg=32 (Same as above)
+
+Note that when this parameter is used, Speed and Duplex must not be specified.
+
+If the link partner is forced to a specific speed and duplex, then this
+parameter should not be used.  Instead, use the Speed and Duplex parameters
+previously mentioned to force the adapter to the same speed and duplex.
+
+Additional Configurations
+=========================
+
+Jumbo Frames
+------------
+
+  Jumbo Frames support is enabled by changing the MTU to a value larger than
+  the default of 1500.  Use the ifconfig command to increase the MTU size.
+  For example::
+
+       ifconfig eth<x> mtu 9000 up
+
+  This setting is not saved across reboots.  It can be made permanent if
+  you add::
+
+       MTU=9000
+
+  to the file /etc/sysconfig/network-scripts/ifcfg-eth<x>.  This example
+  applies to the Red Hat distributions; other distributions may store this
+  setting in a different location.
+
+Notes:
+  Degradation in throughput performance may be observed in some Jumbo frames
+  environments. If this is observed, increasing the application's socket buffer
+  size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
+  See the specific application manual and /usr/src/linux*/Documentation/
+  networking/ip-sysctl.txt for more details.
+
+  - The maximum MTU setting for Jumbo Frames is 16110.  This value coincides
+    with the maximum Jumbo Frames size of 16128.
+
+  - Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
+    poor performance or loss of link.
+
+  - Adapters based on the Intel(R) 82542 and 82573V/E controller do not
+    support Jumbo Frames. These correspond to the following product names::
+
+     Intel(R) PRO/1000 Gigabit Server Adapter
+     Intel(R) PRO/1000 PM Network Connection
+
+ethtool
+-------
+
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information.  The ethtool
+  version 1.6 or later is required for this functionality.
+
+  The latest release of ethtool can be found from
+  https://www.kernel.org/pub/software/network/ethtool/
+
+Enabling Wake on LAN* (WoL)
+---------------------------
+
+  WoL is configured through the ethtool* utility.
+
+  WoL will be enabled on the system during the next shut down or reboot.
+  For this driver version, in order to enable WoL, the e1000 driver must be
+  loaded when shutting down or rebooting the system.
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/e1000e.txt b/Documentation/networking/e1000e.txt
new file mode 100644
index 000000000..12089547b
--- /dev/null
+++ b/Documentation/networking/e1000e.txt
@@ -0,0 +1,312 @@
+Linux* Driver for Intel(R) Ethernet Network Connection
+======================================================
+
+Intel Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Command Line Parameters
+- Additional Configurations
+- Support
+
+Identifying Your Adapter
+========================
+
+The e1000e driver supports all PCI Express Intel(R) Gigabit Network
+Connections, except those that are 82575, 82576 and 82580-based*.
+
+* NOTE: The Intel(R) PRO/1000 P Dual Port Server Adapter is supported by
+  the e1000 driver, not the e1000e driver due to the 82546 part being used
+  behind a PCI Express bridge.
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+For the latest Intel network drivers for Linux, refer to the following
+website.  In the search field, enter your adapter name or type, or use the
+networking link on the left to search for your adapter:
+
+    http://support.intel.com/support/go/network/adapter/home.htm
+
+Command Line Parameters
+=======================
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+NOTES:  For more information about the InterruptThrottleRate,
+        RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
+        parameters, see the application note at:
+        http://www.intel.com/design/network/applnots/ap450.htm
+
+InterruptThrottleRate
+---------------------
+Valid Range:   0,1,3,4,100-100000 (0=off, 1=dynamic, 3=dynamic conservative,
+                                   4=simplified balancing)
+Default Value: 3
+
+The driver can limit the amount of interrupts per second that the adapter
+will generate for incoming packets. It does this by writing a value to the
+adapter that is based on the maximum amount of interrupts that the adapter
+will generate per second.
+
+Setting InterruptThrottleRate to a value greater or equal to 100
+will program the adapter to send out a maximum of that many interrupts
+per second, even if more packets have come in. This reduces interrupt
+load on the system and can lower CPU utilization under heavy load,
+but will increase latency as packets are not processed as quickly.
+
+The default behaviour of the driver previously assumed a static
+InterruptThrottleRate value of 8000, providing a good fallback value for
+all traffic types, but lacking in small packet performance and latency.
+The hardware can handle many more small packets per second however, and
+for this reason an adaptive interrupt moderation algorithm was implemented.
+
+The driver has two adaptive modes (setting 1 or 3) in which
+it dynamically adjusts the InterruptThrottleRate value based on the traffic
+that it receives. After determining the type of incoming traffic in the last
+timeframe, it will adjust the InterruptThrottleRate to an appropriate value
+for that traffic.
+
+The algorithm classifies the incoming traffic every interval into
+classes.  Once the class is determined, the InterruptThrottleRate value is
+adjusted to suit that traffic type the best. There are three classes defined:
+"Bulk traffic", for large amounts of packets of normal size; "Low latency",
+for small amounts of traffic and/or a significant percentage of small
+packets; and "Lowest latency", for almost completely small packets or
+minimal traffic.
+
+In dynamic conservative mode, the InterruptThrottleRate value is set to 4000
+for traffic that falls in class "Bulk traffic". If traffic falls in the "Low
+latency" or "Lowest latency" class, the InterruptThrottleRate is increased
+stepwise to 20000. This default mode is suitable for most applications.
+
+For situations where low latency is vital such as cluster or
+grid computing, the algorithm can reduce latency even more when
+InterruptThrottleRate is set to mode 1. In this mode, which operates
+the same as mode 3, the InterruptThrottleRate will be increased stepwise to
+70000 for traffic in class "Lowest latency".
+
+In simplified mode the interrupt rate is based on the ratio of TX and
+RX traffic.  If the bytes per second rate is approximately equal, the
+interrupt rate will drop as low as 2000 interrupts per second.  If the
+traffic is mostly transmit or mostly receive, the interrupt rate could
+be as high as 8000.
+
+Setting InterruptThrottleRate to 0 turns off any interrupt moderation
+and may improve small packet latency, but is generally not suitable
+for bulk throughput traffic.
+
+NOTE:  InterruptThrottleRate takes precedence over the TxAbsIntDelay and
+       RxAbsIntDelay parameters.  In other words, minimizing the receive
+       and/or transmit absolute delays does not force the controller to
+       generate more interrupts than what the Interrupt Throttle Rate
+       allows.
+
+NOTE:  When e1000e is loaded with default settings and multiple adapters
+       are in use simultaneously, the CPU utilization may increase non-
+       linearly.  In order to limit the CPU utilization without impacting
+       the overall throughput, we recommend that you load the driver as
+       follows:
+
+           modprobe e1000e InterruptThrottleRate=3000,3000,3000
+
+       This sets the InterruptThrottleRate to 3000 interrupts/sec for
+       the first, second, and third instances of the driver.  The range
+       of 2000 to 3000 interrupts per second works on a majority of
+       systems and is a good starting point, but the optimal value will
+       be platform-specific.  If CPU utilization is not a concern, use
+       RX_POLLING (NAPI) and default driver settings.
+
+RxIntDelay
+----------
+Valid Range:   0-65535 (0=off)
+Default Value: 0
+
+This value delays the generation of receive interrupts in units of 1.024
+microseconds.  Receive interrupt reduction can improve CPU efficiency if
+properly tuned for specific network traffic.  Increasing this value adds
+extra latency to frame reception and can end up decreasing the throughput
+of TCP traffic.  If the system is reporting dropped receives, this value
+may be set too high, causing the driver to run out of available receive
+descriptors.
+
+CAUTION:  When setting RxIntDelay to a value other than 0, adapters may
+          hang (stop transmitting) under certain network conditions.  If
+          this occurs a NETDEV WATCHDOG message is logged in the system
+          event log.  In addition, the controller is automatically reset,
+          restoring the network connection.  To eliminate the potential
+          for the hang ensure that RxIntDelay is set to 0.
+
+RxAbsIntDelay
+-------------
+Valid Range:   0-65535 (0=off)
+Default Value: 8
+
+This value, in units of 1.024 microseconds, limits the delay in which a
+receive interrupt is generated.  Useful only if RxIntDelay is non-zero,
+this value ensures that an interrupt is generated after the initial
+packet is received within the set amount of time.  Proper tuning,
+along with RxIntDelay, may improve traffic throughput in specific network
+conditions.
+
+TxIntDelay
+----------
+Valid Range:   0-65535 (0=off)
+Default Value: 8
+
+This value delays the generation of transmit interrupts in units of
+1.024 microseconds.  Transmit interrupt reduction can improve CPU
+efficiency if properly tuned for specific network traffic.  If the
+system is reporting dropped transmits, this value may be set too high
+causing the driver to run out of available transmit descriptors.
+
+TxAbsIntDelay
+-------------
+Valid Range:   0-65535 (0=off)
+Default Value: 32
+
+This value, in units of 1.024 microseconds, limits the delay in which a
+transmit interrupt is generated.  Useful only if TxIntDelay is non-zero,
+this value ensures that an interrupt is generated after the initial
+packet is sent on the wire within the set amount of time.  Proper tuning,
+along with TxIntDelay, may improve traffic throughput in specific
+network conditions.
+
+Copybreak
+---------
+Valid Range:   0-xxxxxxx (0=off)
+Default Value: 256
+
+Driver copies all packets below or equaling this size to a fresh RX
+buffer before handing it up the stack.
+
+This parameter is different than other parameters, in that it is a
+single (not 1,1,1 etc.) parameter applied to all driver instances and
+it is also available during runtime at
+/sys/module/e1000e/parameters/copybreak
+
+SmartPowerDownEnable
+--------------------
+Valid Range: 0-1
+Default Value:  0 (disabled)
+
+Allows PHY to turn off in lower power states. The user can set this parameter
+in supported chipsets.
+
+KumeranLockLoss
+---------------
+Valid Range: 0-1
+Default Value: 1 (enabled)
+
+This workaround skips resetting the PHY at shutdown for the initial
+silicon releases of ICH8 systems.
+
+IntMode
+-------
+Valid Range: 0-2 (0=legacy, 1=MSI, 2=MSI-X)
+Default Value: 2
+
+Allows changing the interrupt mode at module load time, without requiring a
+recompile. If the driver load fails to enable a specific interrupt mode, the
+driver will try other interrupt modes, from least to most compatible.  The
+interrupt order is MSI-X, MSI, Legacy.  If specifying MSI (IntMode=1)
+interrupts, only MSI and Legacy will be attempted.
+
+CrcStripping
+------------
+Valid Range: 0-1
+Default Value: 1 (enabled)
+
+Strip the CRC from received packets before sending up the network stack.  If
+you have a machine with a BMC enabled but cannot receive IPMI traffic after
+loading or enabling the driver, try disabling this feature.
+
+WriteProtectNVM
+---------------
+Valid Range: 0,1
+Default Value: 1
+
+If set to 1, configure the hardware to ignore all write/erase cycles to the
+GbE region in the ICHx NVM (in order to prevent accidental corruption of the
+NVM). This feature can be disabled by setting the parameter to 0 during initial
+driver load.
+NOTE: The machine must be power cycled (full off/on) when enabling NVM writes
+via setting the parameter to zero. Once the NVM has been locked (via the
+parameter at 1 when the driver loads) it cannot be unlocked except via power
+cycle.
+
+Additional Configurations
+=========================
+
+  Jumbo Frames
+  ------------
+  Jumbo Frames support is enabled by changing the MTU to a value larger than
+  the default of 1500.  Use the ifconfig command to increase the MTU size.
+  For example:
+
+       ifconfig eth<x> mtu 9000 up
+
+  This setting is not saved across reboots.
+
+  Notes:
+
+  - The maximum MTU setting for Jumbo Frames is 9216.  This value coincides
+    with the maximum Jumbo Frames size of 9234 bytes.
+
+  - Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
+    poor performance or loss of link.
+
+  - Some adapters limit Jumbo Frames sized packets to a maximum of
+    4096 bytes and some adapters do not support Jumbo Frames.
+
+  - Jumbo Frames cannot be configured on an 82579-based Network device, if
+    MACSec is enabled on the system.
+
+  ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information.  We
+  strongly recommend downloading the latest version of ethtool at:
+
+  https://kernel.org/pub/software/network/ethtool/
+
+  NOTE: When validating enable/disable tests on some parts (82578, for example)
+  you need to add a few seconds between tests when working with ethtool.
+
+  Speed and Duplex
+  ----------------
+  Speed and Duplex are configured through the ethtool* utility. For
+  instructions,  refer to the ethtool man page.
+
+  Enabling Wake on LAN* (WoL)
+  ---------------------------
+  WoL is configured through the ethtool* utility. For instructions on
+  enabling WoL with ethtool, refer to the ethtool man page.
+
+  WoL will be enabled on the system during the next shut down or reboot.
+  For this driver version, in order to enable WoL, the e1000e driver must be
+  loaded when shutting down or rebooting the system.
+
+  In most cases Wake On LAN is only supported on port A for multiple port
+  adapters. To verify if a port supports Wake on Lan run ethtool eth<X>.
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/ena.txt b/Documentation/networking/ena.txt
new file mode 100644
index 000000000..2b4b6f57e
--- /dev/null
+++ b/Documentation/networking/ena.txt
@@ -0,0 +1,305 @@
+Linux kernel driver for Elastic Network Adapter (ENA) family:
+=============================================================
+
+Overview:
+=========
+ENA is a networking interface designed to make good use of modern CPU
+features and system architectures.
+
+The ENA device exposes a lightweight management interface with a
+minimal set of memory mapped registers and extendable command set
+through an Admin Queue.
+
+The driver supports a range of ENA devices, is link-speed independent
+(i.e., the same driver is used for 10GbE, 25GbE, 40GbE, etc.), and has
+a negotiated and extendable feature set.
+
+Some ENA devices support SR-IOV. This driver is used for both the
+SR-IOV Physical Function (PF) and Virtual Function (VF) devices.
+
+ENA devices enable high speed and low overhead network traffic
+processing by providing multiple Tx/Rx queue pairs (the maximum number
+is advertised by the device via the Admin Queue), a dedicated MSI-X
+interrupt vector per Tx/Rx queue pair, adaptive interrupt moderation,
+and CPU cacheline optimized data placement.
+
+The ENA driver supports industry standard TCP/IP offload features such
+as checksum offload and TCP transmit segmentation offload (TSO).
+Receive-side scaling (RSS) is supported for multi-core scaling.
+
+The ENA driver and its corresponding devices implement health
+monitoring mechanisms such as watchdog, enabling the device and driver
+to recover in a manner transparent to the application, as well as
+debug logs.
+
+Some of the ENA devices support a working mode called Low-latency
+Queue (LLQ), which saves several more microseconds.
+
+Supported PCI vendor ID/device IDs:
+===================================
+1d0f:0ec2 - ENA PF
+1d0f:1ec2 - ENA PF with LLQ support
+1d0f:ec20 - ENA VF
+1d0f:ec21 - ENA VF with LLQ support
+
+ENA Source Code Directory Structure:
+====================================
+ena_com.[ch]      - Management communication layer. This layer is
+                    responsible for the handling all the management
+                    (admin) communication between the device and the
+                    driver.
+ena_eth_com.[ch]  - Tx/Rx data path.
+ena_admin_defs.h  - Definition of ENA management interface.
+ena_eth_io_defs.h - Definition of ENA data path interface.
+ena_common_defs.h - Common definitions for ena_com layer.
+ena_regs_defs.h   - Definition of ENA PCI memory-mapped (MMIO) registers.
+ena_netdev.[ch]   - Main Linux kernel driver.
+ena_syfsfs.[ch]   - Sysfs files.
+ena_ethtool.c     - ethtool callbacks.
+ena_pci_id_tbl.h  - Supported device IDs.
+
+Management Interface:
+=====================
+ENA management interface is exposed by means of:
+- PCIe Configuration Space
+- Device Registers
+- Admin Queue (AQ) and Admin Completion Queue (ACQ)
+- Asynchronous Event Notification Queue (AENQ)
+
+ENA device MMIO Registers are accessed only during driver
+initialization and are not involved in further normal device
+operation.
+
+AQ is used for submitting management commands, and the
+results/responses are reported asynchronously through ACQ.
+
+ENA introduces a very small set of management commands with room for
+vendor-specific extensions. Most of the management operations are
+framed in a generic Get/Set feature command.
+
+The following admin queue commands are supported:
+- Create I/O submission queue
+- Create I/O completion queue
+- Destroy I/O submission queue
+- Destroy I/O completion queue
+- Get feature
+- Set feature
+- Configure AENQ
+- Get statistics
+
+Refer to ena_admin_defs.h for the list of supported Get/Set Feature
+properties.
+
+The Asynchronous Event Notification Queue (AENQ) is a uni-directional
+queue used by the ENA device to send to the driver events that cannot
+be reported using ACQ. AENQ events are subdivided into groups. Each
+group may have multiple syndromes, as shown below
+
+The events are:
+	Group			Syndrome
+	Link state change	- X -
+	Fatal error		- X -
+	Notification		Suspend traffic
+	Notification		Resume traffic
+	Keep-Alive		- X -
+
+ACQ and AENQ share the same MSI-X vector.
+
+Keep-Alive is a special mechanism that allows monitoring of the
+device's health. The driver maintains a watchdog (WD) handler which,
+if fired, logs the current state and statistics then resets and
+restarts the ENA device and driver. A Keep-Alive event is delivered by
+the device every second. The driver re-arms the WD upon reception of a
+Keep-Alive event. A missed Keep-Alive event causes the WD handler to
+fire.
+
+Data Path Interface:
+====================
+I/O operations are based on Tx and Rx Submission Queues (Tx SQ and Rx
+SQ correspondingly). Each SQ has a completion queue (CQ) associated
+with it.
+
+The SQs and CQs are implemented as descriptor rings in contiguous
+physical memory.
+
+The ENA driver supports two Queue Operation modes for Tx SQs:
+- Regular mode
+  * In this mode the Tx SQs reside in the host's memory. The ENA
+    device fetches the ENA Tx descriptors and packet data from host
+    memory.
+- Low Latency Queue (LLQ) mode or "push-mode".
+  * In this mode the driver pushes the transmit descriptors and the
+    first 128 bytes of the packet directly to the ENA device memory
+    space. The rest of the packet payload is fetched by the
+    device. For this operation mode, the driver uses a dedicated PCI
+    device memory BAR, which is mapped with write-combine capability.
+
+The Rx SQs support only the regular mode.
+
+Note: Not all ENA devices support LLQ, and this feature is negotiated
+      with the device upon initialization. If the ENA device does not
+      support LLQ mode, the driver falls back to the regular mode.
+
+The driver supports multi-queue for both Tx and Rx. This has various
+benefits:
+- Reduced CPU/thread/process contention on a given Ethernet interface.
+- Cache miss rate on completion is reduced, particularly for data
+  cache lines that hold the sk_buff structures.
+- Increased process-level parallelism when handling received packets.
+- Increased data cache hit rate, by steering kernel processing of
+  packets to the CPU, where the application thread consuming the
+  packet is running.
+- In hardware interrupt re-direction.
+
+Interrupt Modes:
+================
+The driver assigns a single MSI-X vector per queue pair (for both Tx
+and Rx directions). The driver assigns an additional dedicated MSI-X vector
+for management (for ACQ and AENQ).
+
+Management interrupt registration is performed when the Linux kernel
+probes the adapter, and it is de-registered when the adapter is
+removed. I/O queue interrupt registration is performed when the Linux
+interface of the adapter is opened, and it is de-registered when the
+interface is closed.
+
+The management interrupt is named:
+   ena-mgmnt@pci:<PCI domain:bus:slot.function>
+and for each queue pair, an interrupt is named:
+   <interface name>-Tx-Rx-<queue index>
+
+The ENA device operates in auto-mask and auto-clear interrupt
+modes. That is, once MSI-X is delivered to the host, its Cause bit is
+automatically cleared and the interrupt is masked. The interrupt is
+unmasked by the driver after NAPI processing is complete.
+
+Interrupt Moderation:
+=====================
+ENA driver and device can operate in conventional or adaptive interrupt
+moderation mode.
+
+In conventional mode the driver instructs device to postpone interrupt
+posting according to static interrupt delay value. The interrupt delay
+value can be configured through ethtool(8). The following ethtool
+parameters are supported by the driver: tx-usecs, rx-usecs
+
+In adaptive interrupt moderation mode the interrupt delay value is
+updated by the driver dynamically and adjusted every NAPI cycle
+according to the traffic nature.
+
+By default ENA driver applies adaptive coalescing on Rx traffic and
+conventional coalescing on Tx traffic.
+
+Adaptive coalescing can be switched on/off through ethtool(8)
+adaptive_rx on|off parameter.
+
+The driver chooses interrupt delay value according to the number of
+bytes and packets received between interrupt unmasking and interrupt
+posting. The driver uses interrupt delay table that subdivides the
+range of received bytes/packets into 5 levels and assigns interrupt
+delay value to each level.
+
+The user can enable/disable adaptive moderation, modify the interrupt
+delay table and restore its default values through sysfs.
+
+The rx_copybreak is initialized by default to ENA_DEFAULT_RX_COPYBREAK
+and can be configured by the ETHTOOL_STUNABLE command of the
+SIOCETHTOOL ioctl.
+
+SKB:
+The driver-allocated SKB for frames received from Rx handling using
+NAPI context. The allocation method depends on the size of the packet.
+If the frame length is larger than rx_copybreak, napi_get_frags()
+is used, otherwise netdev_alloc_skb_ip_align() is used, the buffer
+content is copied (by CPU) to the SKB, and the buffer is recycled.
+
+Statistics:
+===========
+The user can obtain ENA device and driver statistics using ethtool.
+The driver can collect regular or extended statistics (including
+per-queue stats) from the device.
+
+In addition the driver logs the stats to syslog upon device reset.
+
+MTU:
+====
+The driver supports an arbitrarily large MTU with a maximum that is
+negotiated with the device. The driver configures MTU using the
+SetFeature command (ENA_ADMIN_MTU property). The user can change MTU
+via ip(8) and similar legacy tools.
+
+Stateless Offloads:
+===================
+The ENA driver supports:
+- TSO over IPv4/IPv6
+- TSO with ECN
+- IPv4 header checksum offload
+- TCP/UDP over IPv4/IPv6 checksum offloads
+
+RSS:
+====
+- The ENA device supports RSS that allows flexible Rx traffic
+  steering.
+- Toeplitz and CRC32 hash functions are supported.
+- Different combinations of L2/L3/L4 fields can be configured as
+  inputs for hash functions.
+- The driver configures RSS settings using the AQ SetFeature command
+  (ENA_ADMIN_RSS_HASH_FUNCTION, ENA_ADMIN_RSS_HASH_INPUT and
+  ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG properties).
+- If the NETIF_F_RXHASH flag is set, the 32-bit result of the hash
+  function delivered in the Rx CQ descriptor is set in the received
+  SKB.
+- The user can provide a hash key, hash function, and configure the
+  indirection table through ethtool(8).
+
+DATA PATH:
+==========
+Tx:
+---
+end_start_xmit() is called by the stack. This function does the following:
+- Maps data buffers (skb->data and frags).
+- Populates ena_buf for the push buffer (if the driver and device are
+  in push mode.)
+- Prepares ENA bufs for the remaining frags.
+- Allocates a new request ID from the empty req_id ring. The request
+  ID is the index of the packet in the Tx info. This is used for
+  out-of-order TX completions.
+- Adds the packet to the proper place in the Tx ring.
+- Calls ena_com_prepare_tx(), an ENA communication layer that converts
+  the ena_bufs to ENA descriptors (and adds meta ENA descriptors as
+  needed.)
+  * This function also copies the ENA descriptors and the push buffer
+    to the Device memory space (if in push mode.)
+- Writes doorbell to the ENA device.
+- When the ENA device finishes sending the packet, a completion
+  interrupt is raised.
+- The interrupt handler schedules NAPI.
+- The ena_clean_tx_irq() function is called. This function handles the
+  completion descriptors generated by the ENA, with a single
+  completion descriptor per completed packet.
+  * req_id is retrieved from the completion descriptor. The tx_info of
+    the packet is retrieved via the req_id. The data buffers are
+    unmapped and req_id is returned to the empty req_id ring.
+  * The function stops when the completion descriptors are completed or
+    the budget is reached.
+
+Rx:
+---
+- When a packet is received from the ENA device.
+- The interrupt handler schedules NAPI.
+- The ena_clean_rx_irq() function is called. This function calls
+  ena_rx_pkt(), an ENA communication layer function, which returns the
+  number of descriptors used for a new unhandled packet, and zero if
+  no new packet is found.
+- Then it calls the ena_clean_rx_irq() function.
+- ena_eth_rx_skb() checks packet length:
+  * If the packet is small (len < rx_copybreak), the driver allocates
+    a SKB for the new packet, and copies the packet payload into the
+    SKB data buffer.
+    - In this way the original data buffer is not passed to the stack
+      and is reused for future Rx packets.
+  * Otherwise the function unmaps the Rx buffer, then allocates the
+    new SKB structure and hooks the Rx buffer to the SKB frags.
+- The new SKB is updated with the necessary information (protocol,
+  checksum hw verify result, etc.), and then passed to the network
+  stack, using the NAPI interface function napi_gro_receive().
diff --git a/Documentation/networking/eql.txt b/Documentation/networking/eql.txt
new file mode 100644
index 000000000..0f1550150
--- /dev/null
+++ b/Documentation/networking/eql.txt
@@ -0,0 +1,528 @@
+  EQL Driver: Serial IP Load Balancing HOWTO
+  Simon "Guru Aleph-Null" Janes, simon@ncm.com
+  v1.1, February 27, 1995
+
+  This is the manual for the EQL device driver. EQL is a software device
+  that lets you load-balance IP serial links (SLIP or uncompressed PPP)
+  to increase your bandwidth. It will not reduce your latency (i.e. ping
+  times) except in the case where you already have lots of traffic on
+  your link, in which it will help them out. This driver has been tested
+  with the 1.1.75 kernel, and is known to have patched cleanly with
+  1.1.86.  Some testing with 1.1.92 has been done with the v1.1 patch
+  which was only created to patch cleanly in the very latest kernel
+  source trees. (Yes, it worked fine.)
+
+  1.  Introduction
+
+  Which is worse? A huge fee for a 56K leased line or two phone lines?
+  It's probably the former.  If you find yourself craving more bandwidth,
+  and have a ISP that is flexible, it is now possible to bind modems
+  together to work as one point-to-point link to increase your
+  bandwidth.  All without having to have a special black box on either
+  side.
+
+
+  The eql driver has only been tested with the Livingston PortMaster-2e
+  terminal server. I do not know if other terminal servers support load-
+  balancing, but I do know that the PortMaster does it, and does it
+  almost as well as the eql driver seems to do it (-- Unfortunately, in
+  my testing so far, the Livingston PortMaster 2e's load-balancing is a
+  good 1 to 2 KB/s slower than the test machine working with a 28.8 Kbps
+  and 14.4 Kbps connection.  However, I am not sure that it really is
+  the PortMaster, or if it's Linux's TCP drivers. I'm told that Linux's
+  TCP implementation is pretty fast though.--)
+
+
+  I suggest to ISPs out there that it would probably be fair to charge
+  a load-balancing client 75% of the cost of the second line and 50% of
+  the cost of the third line etc...
+
+
+  Hey, we can all dream you know...
+
+
+  2.  Kernel Configuration
+
+  Here I describe the general steps of getting a kernel up and working
+  with the eql driver.	From patching, building, to installing.
+
+
+  2.1.	Patching The Kernel
+
+  If you do not have or cannot get a copy of the kernel with the eql
+  driver folded into it, get your copy of the driver from
+  ftp://slaughter.ncm.com/pub/Linux/LOAD_BALANCING/eql-1.1.tar.gz.
+  Unpack this archive someplace obvious like /usr/local/src/.  It will
+  create the following files:
+
+
+
+       ______________________________________________________________________
+       -rw-r--r-- guru/ncm	198 Jan 19 18:53 1995 eql-1.1/NO-WARRANTY
+       -rw-r--r-- guru/ncm	30620 Feb 27 21:40 1995 eql-1.1/eql-1.1.patch
+       -rwxr-xr-x guru/ncm	16111 Jan 12 22:29 1995 eql-1.1/eql_enslave
+       -rw-r--r-- guru/ncm	2195 Jan 10 21:48 1995 eql-1.1/eql_enslave.c
+       ______________________________________________________________________
+
+  Unpack a recent kernel (something after 1.1.92) someplace convenient
+  like say /usr/src/linux-1.1.92.eql. Use symbolic links to point
+  /usr/src/linux to this development directory.
+
+
+  Apply the patch by running the commands:
+
+
+       ______________________________________________________________________
+       cd /usr/src
+       patch </usr/local/src/eql-1.1/eql-1.1.patch
+       ______________________________________________________________________
+
+
+
+
+
+  2.2.	Building The Kernel
+
+  After patching the kernel, run make config and configure the kernel
+  for your hardware.
+
+
+  After configuration, make and install according to your habit.
+
+
+  3.  Network Configuration
+
+  So far, I have only used the eql device with the DSLIP SLIP connection
+  manager by Matt Dillon (-- "The man who sold his soul to code so much
+  so quickly."--) .  How you configure it for other "connection"
+  managers is up to you.  Most other connection managers that I've seen
+  don't do a very good job when it comes to handling more than one
+  connection.
+
+
+  3.1.	/etc/rc.d/rc.inet1
+
+  In rc.inet1, ifconfig the eql device to the IP address you usually use
+  for your machine, and the MTU you prefer for your SLIP lines.	One
+  could argue that MTU should be roughly half the usual size for two
+  modems, one-third for three, one-fourth for four, etc...  But going
+  too far below 296 is probably overkill. Here is an example ifconfig
+  command that sets up the eql device:
+
+
+
+       ______________________________________________________________________
+       ifconfig eql 198.67.33.239 mtu 1006
+       ______________________________________________________________________
+
+
+
+
+
+  Once the eql device is up and running, add a static default route to
+  it in the routing table using the cool new route syntax that makes
+  life so much easier:
+
+
+
+       ______________________________________________________________________
+       route add default eql
+       ______________________________________________________________________
+
+
+  3.2.	Enslaving Devices By Hand
+
+  Enslaving devices by hand requires two utility programs: eql_enslave
+  and eql_emancipate (-- eql_emancipate hasn't been written because when
+  an enslaved device "dies", it is automatically taken out of the queue.
+  I haven't found a good reason to write it yet... other than for
+  completeness, but that isn't a good motivator is it?--)
+
+
+  The syntax for enslaving a device is "eql_enslave <master-name>
+  <slave-name> <estimated-bps>".  Here are some example enslavings:
+
+
+
+       ______________________________________________________________________
+       eql_enslave eql sl0 28800
+       eql_enslave eql ppp0 14400
+       eql_enslave eql sl1 57600
+       ______________________________________________________________________
+
+
+
+
+
+  When you want to free a device from its life of slavery, you can
+  either down the device with ifconfig (eql will automatically bury the
+  dead slave and remove it from its queue) or use eql_emancipate to free
+  it. (-- Or just ifconfig it down, and the eql driver will take it out
+  for you.--)
+
+
+
+       ______________________________________________________________________
+       eql_emancipate eql sl0
+       eql_emancipate eql ppp0
+       eql_emancipate eql sl1
+       ______________________________________________________________________
+
+
+
+
+
+  3.3.	DSLIP Configuration for the eql Device
+
+  The general idea is to bring up and keep up as many SLIP connections
+  as you need, automatically.
+
+
+  3.3.1.  /etc/slip/runslip.conf
+
+  Here is an example runslip.conf:
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  ______________________________________________________________________
+  name		sl-line-1
+  enabled
+  baud		38400
+  mtu		576
+  ducmd		-e /etc/slip/dialout/cua2-288.xp -t 9
+  command	 eql_enslave eql $interface 28800
+  address	 198.67.33.239
+  line		/dev/cua2
+
+  name		sl-line-2
+  enabled
+  baud		38400
+  mtu		576
+  ducmd		-e /etc/slip/dialout/cua3-288.xp -t 9
+  command	 eql_enslave eql $interface 28800
+  address	 198.67.33.239
+  line		/dev/cua3
+  ______________________________________________________________________
+
+
+
+
+
+  3.4.	Using PPP and the eql Device
+
+  I have not yet done any load-balancing testing for PPP devices, mainly
+  because I don't have a PPP-connection manager like SLIP has with
+  DSLIP. I did find a good tip from LinuxNET:Billy for PPP performance:
+  make sure you have asyncmap set to something so that control
+  characters are not escaped.
+
+
+  I tried to fix up a PPP script/system for redialing lost PPP
+  connections for use with the eql driver the weekend of Feb 25-26 '95
+  (Hereafter known as the 8-hour PPP Hate Festival).  Perhaps later this
+  year.
+
+
+  4.  About the Slave Scheduler Algorithm
+
+  The slave scheduler probably could be replaced with a dozen other
+  things and push traffic much faster.	The formula in the current set
+  up of the driver was tuned to handle slaves with wildly different
+  bits-per-second "priorities".
+
+
+  All testing I have done was with two 28.8 V.FC modems, one connecting
+  at 28800 bps or slower, and the other connecting at 14400 bps all the
+  time.
+
+
+  One version of the scheduler was able to push 5.3 K/s through the
+  28800 and 14400 connections, but when the priorities on the links were
+  very wide apart (57600 vs. 14400) the "faster" modem received all
+  traffic and the "slower" modem starved.
+
+
+  5.  Testers' Reports
+
+  Some people have experimented with the eql device with newer
+  kernels (than 1.1.75).  I have since updated the driver to patch
+  cleanly in newer kernels because of the removal of the old "slave-
+  balancing" driver config option.
+
+
+  o  icee from LinuxNET patched 1.1.86 without any rejects and was able
+     to boot the kernel and enslave a couple of ISDN PPP links.
+
+  5.1.	Randolph Bentson's Test Report
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  From bentson@grieg.seaslug.org Wed Feb  8 19:08:09 1995
+  Date: Tue, 7 Feb 95 22:57 PST
+  From: Randolph Bentson <bentson@grieg.seaslug.org>
+  To: guru@ncm.com
+  Subject: EQL driver tests
+
+
+  I have been checking out your eql driver.  (Nice work, that!)
+  Although you may already done this performance testing, here
+  are some data I've discovered.
+
+  Randolph Bentson
+  bentson@grieg.seaslug.org
+
+  ---------------------------------------------------------
+
+
+  A pseudo-device driver, EQL, written by Simon Janes, can be used
+  to bundle multiple SLIP connections into what appears to be a
+  single connection.  This allows one to improve dial-up network
+  connectivity gradually, without having to buy expensive DSU/CSU
+  hardware and services.
+
+  I have done some testing of this software, with two goals in
+  mind: first, to ensure it actually works as described and
+  second, as a method of exercising my device driver.
+
+  The following performance measurements were derived from a set
+  of SLIP connections run between two Linux systems (1.1.84) using
+  a 486DX2/66 with a Cyclom-8Ys and a 486SLC/40 with a Cyclom-16Y.
+  (Ports 0,1,2,3 were used.  A later configuration will distribute
+  port selection across the different Cirrus chips on the boards.)
+  Once a link was established, I timed a binary ftp transfer of
+  289284 bytes of data.	If there were no overhead (packet headers,
+  inter-character and inter-packet delays, etc.) the transfers
+  would take the following times:
+
+      bits/sec	seconds
+      345600	8.3
+      234600	12.3
+      172800	16.7
+      153600	18.8
+      76800	37.6
+      57600	50.2
+      38400	75.3
+      28800	100.4
+      19200	150.6
+      9600	301.3
+
+  A single line running at the lower speeds and with large packets
+  comes to within 2% of this.  Performance is limited for the higher
+  speeds (as predicted by the Cirrus databook) to an aggregate of
+  about 160 kbits/sec.	The next round of testing will distribute
+  the load across two or more Cirrus chips.
+
+  The good news is that one gets nearly the full advantage of the
+  second, third, and fourth line's bandwidth.  (The bad news is
+  that the connection establishment seemed fragile for the higher
+  speeds.  Once established, the connection seemed robust enough.)
+
+  #lines  speed	mtu  seconds	theory  actual  %of
+	 kbit/sec      duration	speed	speed	max
+  3	115200  900	_	345600
+  3	115200  400	18.1	345600  159825  46
+  2	115200  900	_	230400
+  2	115200  600	18.1	230400  159825  69
+  2	115200  400	19.3	230400  149888  65
+  4	57600	900	_	234600
+  4	57600	600	_	234600
+  4	57600	400	_	234600
+  3	57600	600	20.9	172800  138413  80
+  3	57600	900	21.2	172800  136455  78
+  3	115200  600	21.7	345600  133311  38
+  3	57600	400	22.5	172800  128571  74
+  4	38400	900	25.2	153600  114795  74
+  4	38400	600	26.4	153600  109577  71
+  4	38400	400	27.3	153600  105965  68
+  2	57600	900	29.1	115200  99410.3 86
+  1	115200  900	30.7	115200  94229.3 81
+  2	57600	600	30.2	115200  95789.4 83
+  3	38400	900	30.3	115200  95473.3 82
+  3	38400	600	31.2	115200  92719.2 80
+  1	115200  600	31.3	115200  92423	80
+  2	57600	400	32.3	115200  89561.6 77
+  1	115200  400	32.8	115200  88196.3 76
+  3	38400	400	33.5	115200  86353.4 74
+  2	38400	900	43.7	76800	66197.7 86
+  2	38400	600	44	76800	65746.4 85
+  2	38400	400	47.2	76800	61289	79
+  4	19200	900	50.8	76800	56945.7 74
+  4	19200	400	53.2	76800	54376.7 70
+  4	19200	600	53.7	76800	53870.4 70
+  1	57600	900	54.6	57600	52982.4 91
+  1	57600	600	56.2	57600	51474	89
+  3	19200	900	60.5	57600	47815.5 83
+  1	57600	400	60.2	57600	48053.8 83
+  3	19200	600	62	57600	46658.7 81
+  3	19200	400	64.7	57600	44711.6 77
+  1	38400	900	79.4	38400	36433.8 94
+  1	38400	600	82.4	38400	35107.3 91
+  2	19200	900	84.4	38400	34275.4 89
+  1	38400	400	86.8	38400	33327.6 86
+  2	19200	600	87.6	38400	33023.3 85
+  2	19200	400	91.2	38400	31719.7 82
+  4	9600	900	94.7	38400	30547.4 79
+  4	9600	400	106	38400	27290.9 71
+  4	9600	600	110	38400	26298.5 68
+  3	9600	900	118	28800	24515.6 85
+  3	9600	600	120	28800	24107	83
+  3	9600	400	131	28800	22082.7 76
+  1	19200	900	155	19200	18663.5 97
+  1	19200	600	161	19200	17968	93
+  1	19200	400	170	19200	17016.7 88
+  2	9600	600	176	19200	16436.6 85
+  2	9600	900	180	19200	16071.3 83
+  2	9600	400	181	19200	15982.5 83
+  1	9600	900	305	9600	9484.72 98
+  1	9600	600	314	9600	9212.87 95
+  1	9600	400	332	9600	8713.37 90
+
+
+
+
+
+  5.2.	Anthony Healy's Report
+
+
+
+
+
+
+
+  Date: Mon, 13 Feb 1995 16:17:29 +1100 (EST)
+  From: Antony Healey <ahealey@st.nepean.uws.edu.au>
+  To: Simon Janes <guru@ncm.com>
+  Subject: Re: Load Balancing
+
+  Hi Simon,
+	  I've installed your patch and it works great. I have trialed
+	  it over twin SL/IP lines, just over null modems, but I was
+	  able to data at over 48Kb/s [ISDN link -Simon]. I managed a
+	  transfer of up to 7.5 Kbyte/s on one go, but averaged around
+	  6.4 Kbyte/s, which I think is pretty cool.  :)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/Documentation/networking/failover.rst b/Documentation/networking/failover.rst
new file mode 100644
index 000000000..f0c8483cd
--- /dev/null
+++ b/Documentation/networking/failover.rst
@@ -0,0 +1,18 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========
+FAILOVER
+========
+
+Overview
+========
+
+The failover module provides a generic interface for paravirtual drivers
+to register a netdev and a set of ops with a failover instance. The ops
+are used as event handlers that get called to handle netdev register/
+unregister/link change/name change events on slave pci ethernet devices
+with the same mac address as the failover netdev.
+
+This enables paravirtual drivers to use a VF as an accelerated low latency
+datapath. It also allows live migration of VMs with direct attached VFs by
+failing over to the paravirtual datapath when the VF is unplugged.
diff --git a/Documentation/networking/fib_trie.txt b/Documentation/networking/fib_trie.txt
new file mode 100644
index 000000000..fe7193885
--- /dev/null
+++ b/Documentation/networking/fib_trie.txt
@@ -0,0 +1,145 @@
+			LC-trie implementation notes.
+
+Node types
+----------
+leaf 
+	An end node with data. This has a copy of the relevant key, along
+	with 'hlist' with routing table entries sorted by prefix length.
+	See struct leaf and struct leaf_info.
+
+trie node or tnode
+	An internal node, holding an array of child (leaf or tnode) pointers,
+	indexed	through a subset of the key. See Level Compression.
+
+A few concepts explained
+------------------------
+Bits (tnode) 
+	The number of bits in the key segment used for indexing into the
+	child array - the "child index". See Level Compression.
+
+Pos (tnode)
+	The position (in the key) of the key segment used for indexing into
+	the child array. See Path Compression.
+
+Path Compression / skipped bits
+	Any given tnode is linked to from the child array of its parent, using
+	a segment of the key specified by the parent's "pos" and "bits" 
+	In certain cases, this tnode's own "pos" will not be immediately
+	adjacent to the parent (pos+bits), but there will be some bits
+	in the key skipped over because they represent a single path with no
+	deviations. These "skipped bits" constitute Path Compression.
+	Note that the search algorithm will simply skip over these bits when
+	searching, making it necessary to save the keys in the leaves to
+	verify that they actually do match the key we are searching for.
+
+Level Compression / child arrays
+	the trie is kept level balanced moving, under certain conditions, the
+	children of a full child (see "full_children") up one level, so that
+	instead of a pure binary tree, each internal node ("tnode") may
+	contain an arbitrarily large array of links to several children.
+	Conversely, a tnode with a mostly empty	child array (see empty_children)
+	may be "halved", having some of its children moved downwards one level,
+	in order to avoid ever-increasing child arrays.
+
+empty_children
+	the number of positions in the child array of a given tnode that are
+	NULL.
+
+full_children
+	the number of children of a given tnode that aren't path compressed.
+	(in other words, they aren't NULL or leaves and their "pos" is equal
+	to this	tnode's "pos"+"bits").
+
+	(The word "full" here is used more in the sense of "complete" than
+	as the opposite of "empty", which might be a tad confusing.)
+
+Comments
+---------
+
+We have tried to keep the structure of the code as close to fib_hash as 
+possible to allow verification and help up reviewing. 
+
+fib_find_node()
+	A good start for understanding this code. This function implements a
+	straightforward trie lookup.
+
+fib_insert_node()
+	Inserts a new leaf node in the trie. This is bit more complicated than
+	fib_find_node(). Inserting a new node means we might have to run the
+	level compression algorithm on part of the trie.
+
+trie_leaf_remove()
+	Looks up a key, deletes it and runs the level compression algorithm.
+
+trie_rebalance()
+	The key function for the dynamic trie after any change in the trie
+	it is run to optimize and reorganize. It will walk the trie upwards
+	towards the root from a given tnode, doing a resize() at each step
+	to implement level compression.
+
+resize()
+	Analyzes a tnode and optimizes the child array size by either inflating
+	or shrinking it repeatedly until it fulfills the criteria for optimal
+	level compression. This part follows the original paper pretty closely
+	and there may be some room for experimentation here.
+
+inflate()
+	Doubles the size of the child array within a tnode. Used by resize().
+
+halve()
+	Halves the size of the child array within a tnode - the inverse of
+	inflate(). Used by resize();
+
+fn_trie_insert(), fn_trie_delete(), fn_trie_select_default()
+	The route manipulation functions. Should conform pretty closely to the
+	corresponding functions in fib_hash.
+
+fn_trie_flush()
+	This walks the full trie (using nextleaf()) and searches for empty
+	leaves which have to be removed.
+
+fn_trie_dump()
+	Dumps the routing table ordered by prefix length. This is somewhat
+	slower than the corresponding fib_hash function, as we have to walk the
+	entire trie for each prefix length. In comparison, fib_hash is organized
+	as one "zone"/hash per prefix length.
+
+Locking
+-------
+
+fib_lock is used for an RW-lock in the same way that this is done in fib_hash.
+However, the functions are somewhat separated for other possible locking
+scenarios. It might conceivably be possible to run trie_rebalance via RCU
+to avoid read_lock in the fn_trie_lookup() function.
+
+Main lookup mechanism
+---------------------
+fn_trie_lookup() is the main lookup function.
+
+The lookup is in its simplest form just like fib_find_node(). We descend the
+trie, key segment by key segment, until we find a leaf. check_leaf() does
+the fib_semantic_match in the leaf's sorted prefix hlist.
+
+If we find a match, we are done.
+
+If we don't find a match, we enter prefix matching mode. The prefix length,
+starting out at the same as the key length, is reduced one step at a time,
+and we backtrack upwards through the trie trying to find a longest matching
+prefix. The goal is always to reach a leaf and get a positive result from the
+fib_semantic_match mechanism.
+
+Inside each tnode, the search for longest matching prefix consists of searching
+through the child array, chopping off (zeroing) the least significant "1" of
+the child index until we find a match or the child index consists of nothing but
+zeros.
+
+At this point we backtrack (t->stats.backtrack++) up the trie, continuing to
+chop off part of the key in order to find the longest matching prefix.
+
+At this point we will repeatedly descend subtries to look for a match, and there
+are some optimizations available that can provide us with "shortcuts" to avoid
+descending into dead ends. Look for "HL_OPTIMIZE" sections in the code.
+
+To alleviate any doubts about the correctness of the route selection process,
+a new netlink operation has been added. Look for NETLINK_FIB_LOOKUP, which
+gives userland access to fib_lookup().
diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
new file mode 100644
index 000000000..e6b4ebb2b
--- /dev/null
+++ b/Documentation/networking/filter.txt
@@ -0,0 +1,1476 @@
+Linux Socket Filtering aka Berkeley Packet Filter (BPF)
+=======================================================
+
+Introduction
+------------
+
+Linux Socket Filtering (LSF) is derived from the Berkeley Packet Filter.
+Though there are some distinct differences between the BSD and Linux
+Kernel filtering, but when we speak of BPF or LSF in Linux context, we
+mean the very same mechanism of filtering in the Linux kernel.
+
+BPF allows a user-space program to attach a filter onto any socket and
+allow or disallow certain types of data to come through the socket. LSF
+follows exactly the same filter code structure as BSD's BPF, so referring
+to the BSD bpf.4 manpage is very helpful in creating filters.
+
+On Linux, BPF is much simpler than on BSD. One does not have to worry
+about devices or anything like that. You simply create your filter code,
+send it to the kernel via the SO_ATTACH_FILTER option and if your filter
+code passes the kernel check on it, you then immediately begin filtering
+data on that socket.
+
+You can also detach filters from your socket via the SO_DETACH_FILTER
+option. This will probably not be used much since when you close a socket
+that has a filter on it the filter is automagically removed. The other
+less common case may be adding a different filter on the same socket where
+you had another filter that is still running: the kernel takes care of
+removing the old one and placing your new one in its place, assuming your
+filter has passed the checks, otherwise if it fails the old filter will
+remain on that socket.
+
+SO_LOCK_FILTER option allows to lock the filter attached to a socket. Once
+set, a filter cannot be removed or changed. This allows one process to
+setup a socket, attach a filter, lock it then drop privileges and be
+assured that the filter will be kept until the socket is closed.
+
+The biggest user of this construct might be libpcap. Issuing a high-level
+filter command like `tcpdump -i em1 port 22` passes through the libpcap
+internal compiler that generates a structure that can eventually be loaded
+via SO_ATTACH_FILTER to the kernel. `tcpdump -i em1 port 22 -ddd`
+displays what is being placed into this structure.
+
+Although we were only speaking about sockets here, BPF in Linux is used
+in many more places. There's xt_bpf for netfilter, cls_bpf in the kernel
+qdisc layer, SECCOMP-BPF (SECure COMPuting [1]), and lots of other places
+such as team driver, PTP code, etc where BPF is being used.
+
+ [1] Documentation/userspace-api/seccomp_filter.rst
+
+Original BPF paper:
+
+Steven McCanne and Van Jacobson. 1993. The BSD packet filter: a new
+architecture for user-level packet capture. In Proceedings of the
+USENIX Winter 1993 Conference Proceedings on USENIX Winter 1993
+Conference Proceedings (USENIX'93). USENIX Association, Berkeley,
+CA, USA, 2-2. [http://www.tcpdump.org/papers/bpf-usenix93.pdf]
+
+Structure
+---------
+
+User space applications include <linux/filter.h> which contains the
+following relevant structures:
+
+struct sock_filter {	/* Filter block */
+	__u16	code;   /* Actual filter code */
+	__u8	jt;	/* Jump true */
+	__u8	jf;	/* Jump false */
+	__u32	k;      /* Generic multiuse field */
+};
+
+Such a structure is assembled as an array of 4-tuples, that contains
+a code, jt, jf and k value. jt and jf are jump offsets and k a generic
+value to be used for a provided code.
+
+struct sock_fprog {			/* Required for SO_ATTACH_FILTER. */
+	unsigned short		   len;	/* Number of filter blocks */
+	struct sock_filter __user *filter;
+};
+
+For socket filtering, a pointer to this structure (as shown in
+follow-up example) is being passed to the kernel through setsockopt(2).
+
+Example
+-------
+
+#include <sys/socket.h>
+#include <sys/types.h>
+#include <arpa/inet.h>
+#include <linux/if_ether.h>
+/* ... */
+
+/* From the example above: tcpdump -i em1 port 22 -dd */
+struct sock_filter code[] = {
+	{ 0x28,  0,  0, 0x0000000c },
+	{ 0x15,  0,  8, 0x000086dd },
+	{ 0x30,  0,  0, 0x00000014 },
+	{ 0x15,  2,  0, 0x00000084 },
+	{ 0x15,  1,  0, 0x00000006 },
+	{ 0x15,  0, 17, 0x00000011 },
+	{ 0x28,  0,  0, 0x00000036 },
+	{ 0x15, 14,  0, 0x00000016 },
+	{ 0x28,  0,  0, 0x00000038 },
+	{ 0x15, 12, 13, 0x00000016 },
+	{ 0x15,  0, 12, 0x00000800 },
+	{ 0x30,  0,  0, 0x00000017 },
+	{ 0x15,  2,  0, 0x00000084 },
+	{ 0x15,  1,  0, 0x00000006 },
+	{ 0x15,  0,  8, 0x00000011 },
+	{ 0x28,  0,  0, 0x00000014 },
+	{ 0x45,  6,  0, 0x00001fff },
+	{ 0xb1,  0,  0, 0x0000000e },
+	{ 0x48,  0,  0, 0x0000000e },
+	{ 0x15,  2,  0, 0x00000016 },
+	{ 0x48,  0,  0, 0x00000010 },
+	{ 0x15,  0,  1, 0x00000016 },
+	{ 0x06,  0,  0, 0x0000ffff },
+	{ 0x06,  0,  0, 0x00000000 },
+};
+
+struct sock_fprog bpf = {
+	.len = ARRAY_SIZE(code),
+	.filter = code,
+};
+
+sock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+if (sock < 0)
+	/* ... bail out ... */
+
+ret = setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &bpf, sizeof(bpf));
+if (ret < 0)
+	/* ... bail out ... */
+
+/* ... */
+close(sock);
+
+The above example code attaches a socket filter for a PF_PACKET socket
+in order to let all IPv4/IPv6 packets with port 22 pass. The rest will
+be dropped for this socket.
+
+The setsockopt(2) call to SO_DETACH_FILTER doesn't need any arguments
+and SO_LOCK_FILTER for preventing the filter to be detached, takes an
+integer value with 0 or 1.
+
+Note that socket filters are not restricted to PF_PACKET sockets only,
+but can also be used on other socket families.
+
+Summary of system calls:
+
+ * setsockopt(sockfd, SOL_SOCKET, SO_ATTACH_FILTER, &val, sizeof(val));
+ * setsockopt(sockfd, SOL_SOCKET, SO_DETACH_FILTER, &val, sizeof(val));
+ * setsockopt(sockfd, SOL_SOCKET, SO_LOCK_FILTER,   &val, sizeof(val));
+
+Normally, most use cases for socket filtering on packet sockets will be
+covered by libpcap in high-level syntax, so as an application developer
+you should stick to that. libpcap wraps its own layer around all that.
+
+Unless i) using/linking to libpcap is not an option, ii) the required BPF
+filters use Linux extensions that are not supported by libpcap's compiler,
+iii) a filter might be more complex and not cleanly implementable with
+libpcap's compiler, or iv) particular filter codes should be optimized
+differently than libpcap's internal compiler does; then in such cases
+writing such a filter "by hand" can be of an alternative. For example,
+xt_bpf and cls_bpf users might have requirements that could result in
+more complex filter code, or one that cannot be expressed with libpcap
+(e.g. different return codes for various code paths). Moreover, BPF JIT
+implementors may wish to manually write test cases and thus need low-level
+access to BPF code as well.
+
+BPF engine and instruction set
+------------------------------
+
+Under tools/bpf/ there's a small helper tool called bpf_asm which can
+be used to write low-level filters for example scenarios mentioned in the
+previous section. Asm-like syntax mentioned here has been implemented in
+bpf_asm and will be used for further explanations (instead of dealing with
+less readable opcodes directly, principles are the same). The syntax is
+closely modelled after Steven McCanne's and Van Jacobson's BPF paper.
+
+The BPF architecture consists of the following basic elements:
+
+  Element          Description
+
+  A                32 bit wide accumulator
+  X                32 bit wide X register
+  M[]              16 x 32 bit wide misc registers aka "scratch memory
+                   store", addressable from 0 to 15
+
+A program, that is translated by bpf_asm into "opcodes" is an array that
+consists of the following elements (as already mentioned):
+
+  op:16, jt:8, jf:8, k:32
+
+The element op is a 16 bit wide opcode that has a particular instruction
+encoded. jt and jf are two 8 bit wide jump targets, one for condition
+"jump if true", the other one "jump if false". Eventually, element k
+contains a miscellaneous argument that can be interpreted in different
+ways depending on the given instruction in op.
+
+The instruction set consists of load, store, branch, alu, miscellaneous
+and return instructions that are also represented in bpf_asm syntax. This
+table lists all bpf_asm instructions available resp. what their underlying
+opcodes as defined in linux/filter.h stand for:
+
+  Instruction      Addressing mode      Description
+
+  ld               1, 2, 3, 4, 10       Load word into A
+  ldi              4                    Load word into A
+  ldh              1, 2                 Load half-word into A
+  ldb              1, 2                 Load byte into A
+  ldx              3, 4, 5, 10          Load word into X
+  ldxi             4                    Load word into X
+  ldxb             5                    Load byte into X
+
+  st               3                    Store A into M[]
+  stx              3                    Store X into M[]
+
+  jmp              6                    Jump to label
+  ja               6                    Jump to label
+  jeq              7, 8                 Jump on A == k
+  jneq             8                    Jump on A != k
+  jne              8                    Jump on A != k
+  jlt              8                    Jump on A <  k
+  jle              8                    Jump on A <= k
+  jgt              7, 8                 Jump on A >  k
+  jge              7, 8                 Jump on A >= k
+  jset             7, 8                 Jump on A &  k
+
+  add              0, 4                 A + <x>
+  sub              0, 4                 A - <x>
+  mul              0, 4                 A * <x>
+  div              0, 4                 A / <x>
+  mod              0, 4                 A % <x>
+  neg                                   !A
+  and              0, 4                 A & <x>
+  or               0, 4                 A | <x>
+  xor              0, 4                 A ^ <x>
+  lsh              0, 4                 A << <x>
+  rsh              0, 4                 A >> <x>
+
+  tax                                   Copy A into X
+  txa                                   Copy X into A
+
+  ret              4, 9                 Return
+
+The next table shows addressing formats from the 2nd column:
+
+  Addressing mode  Syntax               Description
+
+   0               x/%x                 Register X
+   1               [k]                  BHW at byte offset k in the packet
+   2               [x + k]              BHW at the offset X + k in the packet
+   3               M[k]                 Word at offset k in M[]
+   4               #k                   Literal value stored in k
+   5               4*([k]&0xf)          Lower nibble * 4 at byte offset k in the packet
+   6               L                    Jump label L
+   7               #k,Lt,Lf             Jump to Lt if true, otherwise jump to Lf
+   8               #k,Lt                Jump to Lt if predicate is true
+   9               a/%a                 Accumulator A
+  10               extension            BPF extension
+
+The Linux kernel also has a couple of BPF extensions that are used along
+with the class of load instructions by "overloading" the k argument with
+a negative offset + a particular extension offset. The result of such BPF
+extensions are loaded into A.
+
+Possible BPF extensions are shown in the following table:
+
+  Extension                             Description
+
+  len                                   skb->len
+  proto                                 skb->protocol
+  type                                  skb->pkt_type
+  poff                                  Payload start offset
+  ifidx                                 skb->dev->ifindex
+  nla                                   Netlink attribute of type X with offset A
+  nlan                                  Nested Netlink attribute of type X with offset A
+  mark                                  skb->mark
+  queue                                 skb->queue_mapping
+  hatype                                skb->dev->type
+  rxhash                                skb->hash
+  cpu                                   raw_smp_processor_id()
+  vlan_tci                              skb_vlan_tag_get(skb)
+  vlan_avail                            skb_vlan_tag_present(skb)
+  vlan_tpid                             skb->vlan_proto
+  rand                                  prandom_u32()
+
+These extensions can also be prefixed with '#'.
+Examples for low-level BPF:
+
+** ARP packets:
+
+  ldh [12]
+  jne #0x806, drop
+  ret #-1
+  drop: ret #0
+
+** IPv4 TCP packets:
+
+  ldh [12]
+  jne #0x800, drop
+  ldb [23]
+  jneq #6, drop
+  ret #-1
+  drop: ret #0
+
+** (Accelerated) VLAN w/ id 10:
+
+  ld vlan_tci
+  jneq #10, drop
+  ret #-1
+  drop: ret #0
+
+** icmp random packet sampling, 1 in 4
+  ldh [12]
+  jne #0x800, drop
+  ldb [23]
+  jneq #1, drop
+  # get a random uint32 number
+  ld rand
+  mod #4
+  jneq #1, drop
+  ret #-1
+  drop: ret #0
+
+** SECCOMP filter example:
+
+  ld [4]                  /* offsetof(struct seccomp_data, arch) */
+  jne #0xc000003e, bad    /* AUDIT_ARCH_X86_64 */
+  ld [0]                  /* offsetof(struct seccomp_data, nr) */
+  jeq #15, good           /* __NR_rt_sigreturn */
+  jeq #231, good          /* __NR_exit_group */
+  jeq #60, good           /* __NR_exit */
+  jeq #0, good            /* __NR_read */
+  jeq #1, good            /* __NR_write */
+  jeq #5, good            /* __NR_fstat */
+  jeq #9, good            /* __NR_mmap */
+  jeq #14, good           /* __NR_rt_sigprocmask */
+  jeq #13, good           /* __NR_rt_sigaction */
+  jeq #35, good           /* __NR_nanosleep */
+  bad: ret #0             /* SECCOMP_RET_KILL_THREAD */
+  good: ret #0x7fff0000   /* SECCOMP_RET_ALLOW */
+
+The above example code can be placed into a file (here called "foo"), and
+then be passed to the bpf_asm tool for generating opcodes, output that xt_bpf
+and cls_bpf understands and can directly be loaded with. Example with above
+ARP code:
+
+$ ./bpf_asm foo
+4,40 0 0 12,21 0 1 2054,6 0 0 4294967295,6 0 0 0,
+
+In copy and paste C-like output:
+
+$ ./bpf_asm -c foo
+{ 0x28,  0,  0, 0x0000000c },
+{ 0x15,  0,  1, 0x00000806 },
+{ 0x06,  0,  0, 0xffffffff },
+{ 0x06,  0,  0, 0000000000 },
+
+In particular, as usage with xt_bpf or cls_bpf can result in more complex BPF
+filters that might not be obvious at first, it's good to test filters before
+attaching to a live system. For that purpose, there's a small tool called
+bpf_dbg under tools/bpf/ in the kernel source directory. This debugger allows
+for testing BPF filters against given pcap files, single stepping through the
+BPF code on the pcap's packets and to do BPF machine register dumps.
+
+Starting bpf_dbg is trivial and just requires issuing:
+
+# ./bpf_dbg
+
+In case input and output do not equal stdin/stdout, bpf_dbg takes an
+alternative stdin source as a first argument, and an alternative stdout
+sink as a second one, e.g. `./bpf_dbg test_in.txt test_out.txt`.
+
+Other than that, a particular libreadline configuration can be set via
+file "~/.bpf_dbg_init" and the command history is stored in the file
+"~/.bpf_dbg_history".
+
+Interaction in bpf_dbg happens through a shell that also has auto-completion
+support (follow-up example commands starting with '>' denote bpf_dbg shell).
+The usual workflow would be to ...
+
+> load bpf 6,40 0 0 12,21 0 3 2048,48 0 0 23,21 0 1 1,6 0 0 65535,6 0 0 0
+  Loads a BPF filter from standard output of bpf_asm, or transformed via
+  e.g. `tcpdump -iem1 -ddd port 22 | tr '\n' ','`. Note that for JIT
+  debugging (next section), this command creates a temporary socket and
+  loads the BPF code into the kernel. Thus, this will also be useful for
+  JIT developers.
+
+> load pcap foo.pcap
+  Loads standard tcpdump pcap file.
+
+> run [<n>]
+bpf passes:1 fails:9
+  Runs through all packets from a pcap to account how many passes and fails
+  the filter will generate. A limit of packets to traverse can be given.
+
+> disassemble
+l0:	ldh [12]
+l1:	jeq #0x800, l2, l5
+l2:	ldb [23]
+l3:	jeq #0x1, l4, l5
+l4:	ret #0xffff
+l5:	ret #0
+  Prints out BPF code disassembly.
+
+> dump
+/* { op, jt, jf, k }, */
+{ 0x28,  0,  0, 0x0000000c },
+{ 0x15,  0,  3, 0x00000800 },
+{ 0x30,  0,  0, 0x00000017 },
+{ 0x15,  0,  1, 0x00000001 },
+{ 0x06,  0,  0, 0x0000ffff },
+{ 0x06,  0,  0, 0000000000 },
+  Prints out C-style BPF code dump.
+
+> breakpoint 0
+breakpoint at: l0:	ldh [12]
+> breakpoint 1
+breakpoint at: l1:	jeq #0x800, l2, l5
+  ...
+  Sets breakpoints at particular BPF instructions. Issuing a `run` command
+  will walk through the pcap file continuing from the current packet and
+  break when a breakpoint is being hit (another `run` will continue from
+  the currently active breakpoint executing next instructions):
+
+  > run
+  -- register dump --
+  pc:       [0]                       <-- program counter
+  code:     [40] jt[0] jf[0] k[12]    <-- plain BPF code of current instruction
+  curr:     l0:	ldh [12]              <-- disassembly of current instruction
+  A:        [00000000][0]             <-- content of A (hex, decimal)
+  X:        [00000000][0]             <-- content of X (hex, decimal)
+  M[0,15]:  [00000000][0]             <-- folded content of M (hex, decimal)
+  -- packet dump --                   <-- Current packet from pcap (hex)
+  len: 42
+    0: 00 19 cb 55 55 a4 00 14 a4 43 78 69 08 06 00 01
+   16: 08 00 06 04 00 01 00 14 a4 43 78 69 0a 3b 01 26
+   32: 00 00 00 00 00 00 0a 3b 01 01
+  (breakpoint)
+  >
+
+> breakpoint
+breakpoints: 0 1
+  Prints currently set breakpoints.
+
+> step [-<n>, +<n>]
+  Performs single stepping through the BPF program from the current pc
+  offset. Thus, on each step invocation, above register dump is issued.
+  This can go forwards and backwards in time, a plain `step` will break
+  on the next BPF instruction, thus +1. (No `run` needs to be issued here.)
+
+> select <n>
+  Selects a given packet from the pcap file to continue from. Thus, on
+  the next `run` or `step`, the BPF program is being evaluated against
+  the user pre-selected packet. Numbering starts just as in Wireshark
+  with index 1.
+
+> quit
+#
+  Exits bpf_dbg.
+
+JIT compiler
+------------
+
+The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC, PowerPC,
+ARM, ARM64, MIPS and s390 and can be enabled through CONFIG_BPF_JIT. The JIT
+compiler is transparently invoked for each attached filter from user space
+or for internal kernel users if it has been previously enabled by root:
+
+  echo 1 > /proc/sys/net/core/bpf_jit_enable
+
+For JIT developers, doing audits etc, each compile run can output the generated
+opcode image into the kernel log via:
+
+  echo 2 > /proc/sys/net/core/bpf_jit_enable
+
+Example output from dmesg:
+
+[ 3389.935842] flen=6 proglen=70 pass=3 image=ffffffffa0069c8f
+[ 3389.935847] JIT code: 00000000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 68
+[ 3389.935849] JIT code: 00000010: 44 2b 4f 6c 4c 8b 87 d8 00 00 00 be 0c 00 00 00
+[ 3389.935850] JIT code: 00000020: e8 1d 94 ff e0 3d 00 08 00 00 75 16 be 17 00 00
+[ 3389.935851] JIT code: 00000030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff ff 00 00
+[ 3389.935852] JIT code: 00000040: eb 02 31 c0 c9 c3
+
+When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 1 and
+setting any other value than that will return in failure. This is even the case for
+setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel log
+is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is the
+generally recommended approach instead.
+
+In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
+generating disassembly out of the kernel log's hexdump:
+
+# ./bpf_jit_disasm
+70 bytes emitted from JIT compiler (pass:3, flen:6)
+ffffffffa0069c8f + <x>:
+   0:	push   %rbp
+   1:	mov    %rsp,%rbp
+   4:	sub    $0x60,%rsp
+   8:	mov    %rbx,-0x8(%rbp)
+   c:	mov    0x68(%rdi),%r9d
+  10:	sub    0x6c(%rdi),%r9d
+  14:	mov    0xd8(%rdi),%r8
+  1b:	mov    $0xc,%esi
+  20:	callq  0xffffffffe0ff9442
+  25:	cmp    $0x800,%eax
+  2a:	jne    0x0000000000000042
+  2c:	mov    $0x17,%esi
+  31:	callq  0xffffffffe0ff945e
+  36:	cmp    $0x1,%eax
+  39:	jne    0x0000000000000042
+  3b:	mov    $0xffff,%eax
+  40:	jmp    0x0000000000000044
+  42:	xor    %eax,%eax
+  44:	leaveq
+  45:	retq
+
+Issuing option `-o` will "annotate" opcodes to resulting assembler
+instructions, which can be very useful for JIT developers:
+
+# ./bpf_jit_disasm -o
+70 bytes emitted from JIT compiler (pass:3, flen:6)
+ffffffffa0069c8f + <x>:
+   0:	push   %rbp
+	55
+   1:	mov    %rsp,%rbp
+	48 89 e5
+   4:	sub    $0x60,%rsp
+	48 83 ec 60
+   8:	mov    %rbx,-0x8(%rbp)
+	48 89 5d f8
+   c:	mov    0x68(%rdi),%r9d
+	44 8b 4f 68
+  10:	sub    0x6c(%rdi),%r9d
+	44 2b 4f 6c
+  14:	mov    0xd8(%rdi),%r8
+	4c 8b 87 d8 00 00 00
+  1b:	mov    $0xc,%esi
+	be 0c 00 00 00
+  20:	callq  0xffffffffe0ff9442
+	e8 1d 94 ff e0
+  25:	cmp    $0x800,%eax
+	3d 00 08 00 00
+  2a:	jne    0x0000000000000042
+	75 16
+  2c:	mov    $0x17,%esi
+	be 17 00 00 00
+  31:	callq  0xffffffffe0ff945e
+	e8 28 94 ff e0
+  36:	cmp    $0x1,%eax
+	83 f8 01
+  39:	jne    0x0000000000000042
+	75 07
+  3b:	mov    $0xffff,%eax
+	b8 ff ff 00 00
+  40:	jmp    0x0000000000000044
+	eb 02
+  42:	xor    %eax,%eax
+	31 c0
+  44:	leaveq
+	c9
+  45:	retq
+	c3
+
+For BPF JIT developers, bpf_jit_disasm, bpf_asm and bpf_dbg provides a useful
+toolchain for developing and testing the kernel's JIT compiler.
+
+BPF kernel internals
+--------------------
+Internally, for the kernel interpreter, a different instruction set
+format with similar underlying principles from BPF described in previous
+paragraphs is being used. However, the instruction set format is modelled
+closer to the underlying architecture to mimic native instruction sets, so
+that a better performance can be achieved (more details later). This new
+ISA is called 'eBPF' or 'internal BPF' interchangeably. (Note: eBPF which
+originates from [e]xtended BPF is not the same as BPF extensions! While
+eBPF is an ISA, BPF extensions date back to classic BPF's 'overloading'
+of BPF_LD | BPF_{B,H,W} | BPF_ABS instruction.)
+
+It is designed to be JITed with one to one mapping, which can also open up
+the possibility for GCC/LLVM compilers to generate optimized eBPF code through
+an eBPF backend that performs almost as fast as natively compiled code.
+
+The new instruction set was originally designed with the possible goal in
+mind to write programs in "restricted C" and compile into eBPF with a optional
+GCC/LLVM backend, so that it can just-in-time map to modern 64-bit CPUs with
+minimal performance overhead over two steps, that is, C -> eBPF -> native code.
+
+Currently, the new format is being used for running user BPF programs, which
+includes seccomp BPF, classic socket filters, cls_bpf traffic classifier,
+team driver's classifier for its load-balancing mode, netfilter's xt_bpf
+extension, PTP dissector/classifier, and much more. They are all internally
+converted by the kernel into the new instruction set representation and run
+in the eBPF interpreter. For in-kernel handlers, this all works transparently
+by using bpf_prog_create() for setting up the filter, resp.
+bpf_prog_destroy() for destroying it. The macro
+BPF_PROG_RUN(filter, ctx) transparently invokes eBPF interpreter or JITed
+code to run the filter. 'filter' is a pointer to struct bpf_prog that we
+got from bpf_prog_create(), and 'ctx' the given context (e.g.
+skb pointer). All constraints and restrictions from bpf_check_classic() apply
+before a conversion to the new layout is being done behind the scenes!
+
+Currently, the classic BPF format is being used for JITing on most 32-bit
+architectures, whereas x86-64, aarch64, s390x, powerpc64, sparc64, arm32 perform
+JIT compilation from eBPF instruction set.
+
+Some core changes of the new internal format:
+
+- Number of registers increase from 2 to 10:
+
+  The old format had two registers A and X, and a hidden frame pointer. The
+  new layout extends this to be 10 internal registers and a read-only frame
+  pointer. Since 64-bit CPUs are passing arguments to functions via registers
+  the number of args from eBPF program to in-kernel function is restricted
+  to 5 and one register is used to accept return value from an in-kernel
+  function. Natively, x86_64 passes first 6 arguments in registers, aarch64/
+  sparcv9/mips64 have 7 - 8 registers for arguments; x86_64 has 6 callee saved
+  registers, and aarch64/sparcv9/mips64 have 11 or more callee saved registers.
+
+  Therefore, eBPF calling convention is defined as:
+
+    * R0	- return value from in-kernel function, and exit value for eBPF program
+    * R1 - R5	- arguments from eBPF program to in-kernel function
+    * R6 - R9	- callee saved registers that in-kernel function will preserve
+    * R10	- read-only frame pointer to access stack
+
+  Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64,
+  etc, and eBPF calling convention maps directly to ABIs used by the kernel on
+  64-bit architectures.
+
+  On 32-bit architectures JIT may map programs that use only 32-bit arithmetic
+  and may let more complex programs to be interpreted.
+
+  R0 - R5 are scratch registers and eBPF program needs spill/fill them if
+  necessary across calls. Note that there is only one eBPF program (== one
+  eBPF main routine) and it cannot call other eBPF functions, it can only
+  call predefined in-kernel functions, though.
+
+- Register width increases from 32-bit to 64-bit:
+
+  Still, the semantics of the original 32-bit ALU operations are preserved
+  via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower
+  subregisters that zero-extend into 64-bit if they are being written to.
+  That behavior maps directly to x86_64 and arm64 subregister definition, but
+  makes other JITs more difficult.
+
+  32-bit architectures run 64-bit internal BPF programs via interpreter.
+  Their JITs may convert BPF programs that only use 32-bit subregisters into
+  native instruction set and let the rest being interpreted.
+
+  Operation is 64-bit, because on 64-bit architectures, pointers are also
+  64-bit wide, and we want to pass 64-bit values in/out of kernel functions,
+  so 32-bit eBPF registers would otherwise require to define register-pair
+  ABI, thus, there won't be able to use a direct eBPF register to HW register
+  mapping and JIT would need to do combine/split/move operations for every
+  register in and out of the function, which is complex, bug prone and slow.
+  Another reason is the use of atomic 64-bit counters.
+
+- Conditional jt/jf targets replaced with jt/fall-through:
+
+  While the original design has constructs such as "if (cond) jump_true;
+  else jump_false;", they are being replaced into alternative constructs like
+  "if (cond) jump_true; /* else fall-through */".
+
+- Introduces bpf_call insn and register passing convention for zero overhead
+  calls from/to other kernel functions:
+
+  Before an in-kernel function call, the internal BPF program needs to
+  place function arguments into R1 to R5 registers to satisfy calling
+  convention, then the interpreter will take them from registers and pass
+  to in-kernel function. If R1 - R5 registers are mapped to CPU registers
+  that are used for argument passing on given architecture, the JIT compiler
+  doesn't need to emit extra moves. Function arguments will be in the correct
+  registers and BPF_CALL instruction will be JITed as single 'call' HW
+  instruction. This calling convention was picked to cover common call
+  situations without performance penalty.
+
+  After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has
+  a return value of the function. Since R6 - R9 are callee saved, their state
+  is preserved across the call.
+
+  For example, consider three C functions:
+
+  u64 f1() { return (*_f2)(1); }
+  u64 f2(u64 a) { return f3(a + 1, a); }
+  u64 f3(u64 a, u64 b) { return a - b; }
+
+  GCC can compile f1, f3 into x86_64:
+
+  f1:
+    movl $1, %edi
+    movq _f2(%rip), %rax
+    jmp  *%rax
+  f3:
+    movq %rdi, %rax
+    subq %rsi, %rax
+    ret
+
+  Function f2 in eBPF may look like:
+
+  f2:
+    bpf_mov R2, R1
+    bpf_add R1, 1
+    bpf_call f3
+    bpf_exit
+
+  If f2 is JITed and the pointer stored to '_f2'. The calls f1 -> f2 -> f3 and
+  returns will be seamless. Without JIT, __bpf_prog_run() interpreter needs to
+  be used to call into f2.
+
+  For practical reasons all eBPF programs have only one argument 'ctx' which is
+  already placed into R1 (e.g. on __bpf_prog_run() startup) and the programs
+  can call kernel functions with up to 5 arguments. Calls with 6 or more arguments
+  are currently not supported, but these restrictions can be lifted if necessary
+  in the future.
+
+  On 64-bit architectures all register map to HW registers one to one. For
+  example, x86_64 JIT compiler can map them as ...
+
+    R0 - rax
+    R1 - rdi
+    R2 - rsi
+    R3 - rdx
+    R4 - rcx
+    R5 - r8
+    R6 - rbx
+    R7 - r13
+    R8 - r14
+    R9 - r15
+    R10 - rbp
+
+  ... since x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument passing
+  and rbx, r12 - r15 are callee saved.
+
+  Then the following internal BPF pseudo-program:
+
+    bpf_mov R6, R1 /* save ctx */
+    bpf_mov R2, 2
+    bpf_mov R3, 3
+    bpf_mov R4, 4
+    bpf_mov R5, 5
+    bpf_call foo
+    bpf_mov R7, R0 /* save foo() return value */
+    bpf_mov R1, R6 /* restore ctx for next call */
+    bpf_mov R2, 6
+    bpf_mov R3, 7
+    bpf_mov R4, 8
+    bpf_mov R5, 9
+    bpf_call bar
+    bpf_add R0, R7
+    bpf_exit
+
+  After JIT to x86_64 may look like:
+
+    push %rbp
+    mov %rsp,%rbp
+    sub $0x228,%rsp
+    mov %rbx,-0x228(%rbp)
+    mov %r13,-0x220(%rbp)
+    mov %rdi,%rbx
+    mov $0x2,%esi
+    mov $0x3,%edx
+    mov $0x4,%ecx
+    mov $0x5,%r8d
+    callq foo
+    mov %rax,%r13
+    mov %rbx,%rdi
+    mov $0x2,%esi
+    mov $0x3,%edx
+    mov $0x4,%ecx
+    mov $0x5,%r8d
+    callq bar
+    add %r13,%rax
+    mov -0x228(%rbp),%rbx
+    mov -0x220(%rbp),%r13
+    leaveq
+    retq
+
+  Which is in this example equivalent in C to:
+
+    u64 bpf_filter(u64 ctx)
+    {
+        return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9);
+    }
+
+  In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64
+  arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in proper
+  registers and place their return value into '%rax' which is R0 in eBPF.
+  Prologue and epilogue are emitted by JIT and are implicit in the
+  interpreter. R0-R5 are scratch registers, so eBPF program needs to preserve
+  them across the calls as defined by calling convention.
+
+  For example the following program is invalid:
+
+    bpf_mov R1, 1
+    bpf_call foo
+    bpf_mov R0, R1
+    bpf_exit
+
+  After the call the registers R1-R5 contain junk values and cannot be read.
+  An in-kernel eBPF verifier is used to validate internal BPF programs.
+
+Also in the new design, eBPF is limited to 4096 insns, which means that any
+program will terminate quickly and will only call a fixed number of kernel
+functions. Original BPF and the new format are two operand instructions,
+which helps to do one-to-one mapping between eBPF insn and x86 insn during JIT.
+
+The input context pointer for invoking the interpreter function is generic,
+its content is defined by a specific use case. For seccomp register R1 points
+to seccomp_data, for converted BPF filters R1 points to a skb.
+
+A program, that is translated internally consists of the following elements:
+
+  op:16, jt:8, jf:8, k:32    ==>    op:8, dst_reg:4, src_reg:4, off:16, imm:32
+
+So far 87 internal BPF instructions were implemented. 8-bit 'op' opcode field
+has room for new instructions. Some of them may use 16/24/32 byte encoding. New
+instructions must be multiple of 8 bytes to preserve backward compatibility.
+
+Internal BPF is a general purpose RISC instruction set. Not every register and
+every instruction are used during translation from original BPF to new format.
+For example, socket filters are not using 'exclusive add' instruction, but
+tracing filters may do to maintain counters of events, for example. Register R9
+is not used by socket filters either, but more complex filters may be running
+out of registers and would have to resort to spill/fill to stack.
+
+Internal BPF can used as generic assembler for last step performance
+optimizations, socket filters and seccomp are using it as assembler. Tracing
+filters may use it as assembler to generate code from kernel. In kernel usage
+may not be bounded by security considerations, since generated internal BPF code
+may be optimizing internal code path and not being exposed to the user space.
+Safety of internal BPF can come from a verifier (TBD). In such use cases as
+described, it may be used as safe instruction set.
+
+Just like the original BPF, the new format runs within a controlled environment,
+is deterministic and the kernel can easily prove that. The safety of the program
+can be determined in two steps: first step does depth-first-search to disallow
+loops and other CFG validation; second step starts from the first insn and
+descends all possible paths. It simulates execution of every insn and observes
+the state change of registers and stack.
+
+eBPF opcode encoding
+--------------------
+
+eBPF is reusing most of the opcode encoding from classic to simplify conversion
+of classic BPF to eBPF. For arithmetic and jump instructions the 8-bit 'code'
+field is divided into three parts:
+
+  +----------------+--------+--------------------+
+  |   4 bits       |  1 bit |   3 bits           |
+  | operation code | source | instruction class  |
+  +----------------+--------+--------------------+
+  (MSB)                                      (LSB)
+
+Three LSB bits store instruction class which is one of:
+
+  Classic BPF classes:    eBPF classes:
+
+  BPF_LD    0x00          BPF_LD    0x00
+  BPF_LDX   0x01          BPF_LDX   0x01
+  BPF_ST    0x02          BPF_ST    0x02
+  BPF_STX   0x03          BPF_STX   0x03
+  BPF_ALU   0x04          BPF_ALU   0x04
+  BPF_JMP   0x05          BPF_JMP   0x05
+  BPF_RET   0x06          [ class 6 unused, for future if needed ]
+  BPF_MISC  0x07          BPF_ALU64 0x07
+
+When BPF_CLASS(code) == BPF_ALU or BPF_JMP, 4th bit encodes source operand ...
+
+  BPF_K     0x00
+  BPF_X     0x08
+
+ * in classic BPF, this means:
+
+  BPF_SRC(code) == BPF_X - use register X as source operand
+  BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
+
+ * in eBPF, this means:
+
+  BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand
+  BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand
+
+... and four MSB bits store operation code.
+
+If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of:
+
+  BPF_ADD   0x00
+  BPF_SUB   0x10
+  BPF_MUL   0x20
+  BPF_DIV   0x30
+  BPF_OR    0x40
+  BPF_AND   0x50
+  BPF_LSH   0x60
+  BPF_RSH   0x70
+  BPF_NEG   0x80
+  BPF_MOD   0x90
+  BPF_XOR   0xa0
+  BPF_MOV   0xb0  /* eBPF only: mov reg to reg */
+  BPF_ARSH  0xc0  /* eBPF only: sign extending shift right */
+  BPF_END   0xd0  /* eBPF only: endianness conversion */
+
+If BPF_CLASS(code) == BPF_JMP, BPF_OP(code) is one of:
+
+  BPF_JA    0x00
+  BPF_JEQ   0x10
+  BPF_JGT   0x20
+  BPF_JGE   0x30
+  BPF_JSET  0x40
+  BPF_JNE   0x50  /* eBPF only: jump != */
+  BPF_JSGT  0x60  /* eBPF only: signed '>' */
+  BPF_JSGE  0x70  /* eBPF only: signed '>=' */
+  BPF_CALL  0x80  /* eBPF only: function call */
+  BPF_EXIT  0x90  /* eBPF only: function return */
+  BPF_JLT   0xa0  /* eBPF only: unsigned '<' */
+  BPF_JLE   0xb0  /* eBPF only: unsigned '<=' */
+  BPF_JSLT  0xc0  /* eBPF only: signed '<' */
+  BPF_JSLE  0xd0  /* eBPF only: signed '<=' */
+
+So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF
+and eBPF. There are only two registers in classic BPF, so it means A += X.
+In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly,
+BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and analogous
+src_reg = (u32) src_reg ^ (u32) imm32 in eBPF.
+
+Classic BPF is using BPF_MISC class to represent A = X and X = A moves.
+eBPF is using BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no
+BPF_MISC operations in eBPF, the class 7 is used as BPF_ALU64 to mean
+exactly the same operations as BPF_ALU, but with 64-bit wide operands
+instead. So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.:
+dst_reg = dst_reg + src_reg
+
+Classic BPF wastes the whole BPF_RET class to represent a single 'ret'
+operation. Classic BPF_RET | BPF_K means copy imm32 into return register
+and perform function exit. eBPF is modeled to match CPU, so BPF_JMP | BPF_EXIT
+in eBPF means function exit only. The eBPF program needs to store return
+value into register R0 before doing a BPF_EXIT. Class 6 in eBPF is currently
+unused and reserved for future use.
+
+For load and store instructions the 8-bit 'code' field is divided as:
+
+  +--------+--------+-------------------+
+  | 3 bits | 2 bits |   3 bits          |
+  |  mode  |  size  | instruction class |
+  +--------+--------+-------------------+
+  (MSB)                             (LSB)
+
+Size modifier is one of ...
+
+  BPF_W   0x00    /* word */
+  BPF_H   0x08    /* half word */
+  BPF_B   0x10    /* byte */
+  BPF_DW  0x18    /* eBPF only, double word */
+
+... which encodes size of load/store operation:
+
+ B  - 1 byte
+ H  - 2 byte
+ W  - 4 byte
+ DW - 8 byte (eBPF only)
+
+Mode modifier is one of:
+
+  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+  BPF_ABS  0x20
+  BPF_IND  0x40
+  BPF_MEM  0x60
+  BPF_LEN  0x80  /* classic BPF only, reserved in eBPF */
+  BPF_MSH  0xa0  /* classic BPF only, reserved in eBPF */
+  BPF_XADD 0xc0  /* eBPF only, exclusive add */
+
+eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
+(BPF_IND | <size> | BPF_LD) which are used to access packet data.
+
+They had to be carried over from classic to have strong performance of
+socket filters running in eBPF interpreter. These instructions can only
+be used when interpreter context is a pointer to 'struct sk_buff' and
+have seven implicit operands. Register R6 is an implicit input that must
+contain pointer to sk_buff. Register R0 is an implicit output which contains
+the data fetched from the packet. Registers R1-R5 are scratch registers
+and must not be used to store the data across BPF_ABS | BPF_LD or
+BPF_IND | BPF_LD instructions.
+
+These instructions have implicit program exit condition as well. When
+eBPF program is trying to access the data beyond the packet boundary,
+the interpreter will abort the execution of the program. JIT compilers
+therefore must preserve this property. src_reg and imm32 fields are
+explicit inputs to these instructions.
+
+For example:
+
+  BPF_IND | BPF_W | BPF_LD means:
+
+    R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
+    and R1 - R5 were scratched.
+
+Unlike classic BPF instruction set, eBPF has generic load/store operations:
+
+BPF_MEM | <size> | BPF_STX:  *(size *) (dst_reg + off) = src_reg
+BPF_MEM | <size> | BPF_ST:   *(size *) (dst_reg + off) = imm32
+BPF_MEM | <size> | BPF_LDX:  dst_reg = *(size *) (src_reg + off)
+BPF_XADD | BPF_W  | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
+2 byte atomic increments are not supported.
+
+eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
+of two consecutive 'struct bpf_insn' 8-byte blocks and interpreted as single
+instruction that loads 64-bit immediate value into a dst_reg.
+Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
+32-bit immediate value into a register.
+
+eBPF verifier
+-------------
+The safety of the eBPF program is determined in two steps.
+
+First step does DAG check to disallow loops and other CFG validation.
+In particular it will detect programs that have unreachable instructions.
+(though classic BPF checker allows them)
+
+Second step starts from the first insn and descends all possible paths.
+It simulates execution of every insn and observes the state change of
+registers and stack.
+
+At the start of the program the register R1 contains a pointer to context
+and has type PTR_TO_CTX.
+If verifier sees an insn that does R2=R1, then R2 has now type
+PTR_TO_CTX as well and can be used on the right hand side of expression.
+If R1=PTR_TO_CTX and insn is R2=R1+R1, then R2=SCALAR_VALUE,
+since addition of two valid pointers makes invalid pointer.
+(In 'secure' mode verifier will reject any type of pointer arithmetic to make
+sure that kernel addresses don't leak to unprivileged users)
+
+If register was never written to, it's not readable:
+  bpf_mov R0 = R2
+  bpf_exit
+will be rejected, since R2 is unreadable at the start of the program.
+
+After kernel function call, R1-R5 are reset to unreadable and
+R0 has a return type of the function.
+
+Since R6-R9 are callee saved, their state is preserved across the call.
+  bpf_mov R6 = 1
+  bpf_call foo
+  bpf_mov R0 = R6
+  bpf_exit
+is a correct program. If there was R1 instead of R6, it would have
+been rejected.
+
+load/store instructions are allowed only with registers of valid types, which
+are PTR_TO_CTX, PTR_TO_MAP, PTR_TO_STACK. They are bounds and alignment checked.
+For example:
+ bpf_mov R1 = 1
+ bpf_mov R2 = 2
+ bpf_xadd *(u32 *)(R1 + 3) += R2
+ bpf_exit
+will be rejected, since R1 doesn't have a valid pointer type at the time of
+execution of instruction bpf_xadd.
+
+At the start R1 type is PTR_TO_CTX (a pointer to generic 'struct bpf_context')
+A callback is used to customize verifier to restrict eBPF program access to only
+certain fields within ctx structure with specified size and alignment.
+
+For example, the following insn:
+  bpf_ld R0 = *(u32 *)(R6 + 8)
+intends to load a word from address R6 + 8 and store it into R0
+If R6=PTR_TO_CTX, via is_valid_access() callback the verifier will know
+that offset 8 of size 4 bytes can be accessed for reading, otherwise
+the verifier will reject the program.
+If R6=PTR_TO_STACK, then access should be aligned and be within
+stack bounds, which are [-MAX_BPF_STACK, 0). In this example offset is 8,
+so it will fail verification, since it's out of bounds.
+
+The verifier will allow eBPF program to read data from stack only after
+it wrote into it.
+Classic BPF verifier does similar check with M[0-15] memory slots.
+For example:
+  bpf_ld R0 = *(u32 *)(R10 - 4)
+  bpf_exit
+is invalid program.
+Though R10 is correct read-only register and has type PTR_TO_STACK
+and R10 - 4 is within stack bounds, there were no stores into that location.
+
+Pointer register spill/fill is tracked as well, since four (R6-R9)
+callee saved registers may not be enough for some programs.
+
+Allowed function calls are customized with bpf_verifier_ops->get_func_proto()
+The eBPF verifier will check that registers match argument constraints.
+After the call register R0 will be set to return type of the function.
+
+Function calls is a main mechanism to extend functionality of eBPF programs.
+Socket filters may let programs to call one set of functions, whereas tracing
+filters may allow completely different set.
+
+If a function made accessible to eBPF program, it needs to be thought through
+from safety point of view. The verifier will guarantee that the function is
+called with valid arguments.
+
+seccomp vs socket filters have different security restrictions for classic BPF.
+Seccomp solves this by two stage verifier: classic BPF verifier is followed
+by seccomp verifier. In case of eBPF one configurable verifier is shared for
+all use cases.
+
+See details of eBPF verifier in kernel/bpf/verifier.c
+
+Register value tracking
+-----------------------
+In order to determine the safety of an eBPF program, the verifier must track
+the range of possible values in each register and also in each stack slot.
+This is done with 'struct bpf_reg_state', defined in include/linux/
+bpf_verifier.h, which unifies tracking of scalar and pointer values.  Each
+register state has a type, which is either NOT_INIT (the register has not been
+written to), SCALAR_VALUE (some value which is not usable as a pointer), or a
+pointer type.  The types of pointers describe their base, as follows:
+    PTR_TO_CTX          Pointer to bpf_context.
+    CONST_PTR_TO_MAP    Pointer to struct bpf_map.  "Const" because arithmetic
+                        on these pointers is forbidden.
+    PTR_TO_MAP_VALUE    Pointer to the value stored in a map element.
+    PTR_TO_MAP_VALUE_OR_NULL
+                        Either a pointer to a map value, or NULL; map accesses
+                        (see section 'eBPF maps', below) return this type,
+                        which becomes a PTR_TO_MAP_VALUE when checked != NULL.
+                        Arithmetic on these pointers is forbidden.
+    PTR_TO_STACK        Frame pointer.
+    PTR_TO_PACKET       skb->data.
+    PTR_TO_PACKET_END   skb->data + headlen; arithmetic forbidden.
+However, a pointer may be offset from this base (as a result of pointer
+arithmetic), and this is tracked in two parts: the 'fixed offset' and 'variable
+offset'.  The former is used when an exactly-known value (e.g. an immediate
+operand) is added to a pointer, while the latter is used for values which are
+not exactly known.  The variable offset is also used in SCALAR_VALUEs, to track
+the range of possible values in the register.
+The verifier's knowledge about the variable offset consists of:
+* minimum and maximum values as unsigned
+* minimum and maximum values as signed
+* knowledge of the values of individual bits, in the form of a 'tnum': a u64
+'mask' and a u64 'value'.  1s in the mask represent bits whose value is unknown;
+1s in the value represent bits known to be 1.  Bits known to be 0 have 0 in both
+mask and value; no bit should ever be 1 in both.  For example, if a byte is read
+into a register from memory, the register's top 56 bits are known zero, while
+the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
+then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
+0x1ff), because of potential carries.
+
+Besides arithmetic, the register state can also be updated by conditional
+branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
+it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
+branch it will have a umax_value of 8.  A signed compare (with BPF_JSGT or
+BPF_JSGE) would instead update the signed minimum/maximum values.  Information
+from the signed and unsigned bounds can be combined; for instance if a value is
+first tested < 8 and then tested s> 4, the verifier will conclude that the value
+is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
+
+PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
+pointers sharing that same variable offset.  This is important for packet range
+checks: after adding a variable to a packet pointer register A, if you then copy
+it to another register B and then add a constant 4 to A, both registers will
+share the same 'id' but the A will have a fixed offset of +4.  Then if A is
+bounds-checked and found to be less than a PTR_TO_PACKET_END, the register B is
+now known to have a safe range of at least 4 bytes.  See 'Direct packet access',
+below, for more on PTR_TO_PACKET ranges.
+
+The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
+the pointer returned from a map lookup.  This means that when one copy is
+checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
+As well as range-checking, the tracked information is also used for enforcing
+alignment of pointer accesses.  For instance, on most systems the packet pointer
+is 2 bytes after a 4-byte alignment.  If a program adds 14 bytes to that to jump
+over the Ethernet header, then reads IHL and addes (IHL * 4), the resulting
+pointer will have a variable offset known to be 4n+2 for some n, so adding the 2
+bytes (NET_IP_ALIGN) gives a 4-byte alignment and so word-sized accesses through
+that pointer are safe.
+
+Direct packet access
+--------------------
+In cls_bpf and act_bpf programs the verifier allows direct access to the packet
+data via skb->data and skb->data_end pointers.
+Ex:
+1:  r4 = *(u32 *)(r1 +80)  /* load skb->data_end */
+2:  r3 = *(u32 *)(r1 +76)  /* load skb->data */
+3:  r5 = r3
+4:  r5 += 14
+5:  if r5 > r4 goto pc+16
+R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
+6:  r0 = *(u16 *)(r3 +12) /* access 12 and 13 bytes of the packet */
+
+this 2byte load from the packet is safe to do, since the program author
+did check 'if (skb->data + 14 > skb->data_end) goto err' at insn #5 which
+means that in the fall-through case the register R3 (which points to skb->data)
+has at least 14 directly accessible bytes. The verifier marks it
+as R3=pkt(id=0,off=0,r=14).
+id=0 means that no additional variables were added to the register.
+off=0 means that no additional constants were added.
+r=14 is the range of safe access which means that bytes [R3, R3 + 14) are ok.
+Note that R5 is marked as R5=pkt(id=0,off=14,r=14). It also points
+to the packet data, but constant 14 was added to the register, so
+it now points to 'skb->data + 14' and accessible range is [R5, R5 + 14 - 14)
+which is zero bytes.
+
+More complex packet access may look like:
+ R0=inv1 R1=ctx R3=pkt(id=0,off=0,r=14) R4=pkt_end R5=pkt(id=0,off=14,r=14) R10=fp
+ 6:  r0 = *(u8 *)(r3 +7) /* load 7th byte from the packet */
+ 7:  r4 = *(u8 *)(r3 +12)
+ 8:  r4 *= 14
+ 9:  r3 = *(u32 *)(r1 +76) /* load skb->data */
+10:  r3 += r4
+11:  r2 = r1
+12:  r2 <<= 48
+13:  r2 >>= 48
+14:  r3 += r2
+15:  r2 = r3
+16:  r2 += 8
+17:  r1 = *(u32 *)(r1 +80) /* load skb->data_end */
+18:  if r2 > r1 goto pc+2
+ R0=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R1=pkt_end R2=pkt(id=2,off=8,r=8) R3=pkt(id=2,off=0,r=8) R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)) R5=pkt(id=0,off=14,r=14) R10=fp
+19:  r1 = *(u8 *)(r3 +4)
+The state of the register R3 is R3=pkt(id=2,off=0,r=8)
+id=2 means that two 'r3 += rX' instructions were seen, so r3 points to some
+offset within a packet and since the program author did
+'if (r3 + 8 > r1) goto err' at insn #18, the safe range is [R3, R3 + 8).
+The verifier only allows 'add'/'sub' operations on packet registers. Any other
+operation will set the register state to 'SCALAR_VALUE' and it won't be
+available for direct packet access.
+Operation 'r3 += rX' may overflow and become less than original skb->data,
+therefore the verifier has to prevent that.  So when it sees 'r3 += rX'
+instruction and rX is more than 16-bit value, any subsequent bounds-check of r3
+against skb->data_end will not give us 'range' information, so attempts to read
+through the pointer will give "invalid access to packet" error.
+Ex. after insn 'r4 = *(u8 *)(r3 +12)' (insn #7 above) the state of r4 is
+R4=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) which means that upper 56 bits
+of the register are guaranteed to be zero, and nothing is known about the lower
+8 bits. After insn 'r4 *= 14' the state becomes
+R4=inv(id=0,umax_value=3570,var_off=(0x0; 0xfffe)), since multiplying an 8-bit
+value by constant 14 will keep upper 52 bits as zero, also the least significant
+bit will be zero as 14 is even.  Similarly 'r2 >>= 48' will make
+R2=inv(id=0,umax_value=65535,var_off=(0x0; 0xffff)), since the shift is not sign
+extending.  This logic is implemented in adjust_reg_min_max_vals() function,
+which calls adjust_ptr_min_max_vals() for adding pointer to scalar (or vice
+versa) and adjust_scalar_min_max_vals() for operations on two scalars.
+
+The end result is that bpf program author can access packet directly
+using normal C code as:
+  void *data = (void *)(long)skb->data;
+  void *data_end = (void *)(long)skb->data_end;
+  struct eth_hdr *eth = data;
+  struct iphdr *iph = data + sizeof(*eth);
+  struct udphdr *udp = data + sizeof(*eth) + sizeof(*iph);
+
+  if (data + sizeof(*eth) + sizeof(*iph) + sizeof(*udp) > data_end)
+          return 0;
+  if (eth->h_proto != htons(ETH_P_IP))
+          return 0;
+  if (iph->protocol != IPPROTO_UDP || iph->ihl != 5)
+          return 0;
+  if (udp->dest == 53 || udp->source == 9)
+          ...;
+which makes such programs easier to write comparing to LD_ABS insn
+and significantly faster.
+
+eBPF maps
+---------
+'maps' is a generic storage of different types for sharing data between kernel
+and userspace.
+
+The maps are accessed from user space via BPF syscall, which has commands:
+- create a map with given type and attributes
+  map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)
+  using attr->map_type, attr->key_size, attr->value_size, attr->max_entries
+  returns process-local file descriptor or negative error
+
+- lookup key in a given map
+  err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)
+  using attr->map_fd, attr->key, attr->value
+  returns zero and stores found elem into value or negative error
+
+- create or update key/value pair in a given map
+  err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)
+  using attr->map_fd, attr->key, attr->value
+  returns zero or negative error
+
+- find and delete element by key in a given map
+  err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)
+  using attr->map_fd, attr->key
+
+- to delete map: close(fd)
+  Exiting process will delete maps automatically
+
+userspace programs use this syscall to create/access maps that eBPF programs
+are concurrently updating.
+
+maps can have different types: hash, array, bloom filter, radix-tree, etc.
+
+The map is defined by:
+  . type
+  . max number of elements
+  . key size in bytes
+  . value size in bytes
+
+Pruning
+-------
+The verifier does not actually walk all possible paths through the program.  For
+each new branch to analyse, the verifier looks at all the states it's previously
+been in when at this instruction.  If any of them contain the current state as a
+subset, the branch is 'pruned' - that is, the fact that the previous state was
+accepted implies the current state would be as well.  For instance, if in the
+previous state, r1 held a packet-pointer, and in the current state, r1 holds a
+packet-pointer with a range as long or longer and at least as strict an
+alignment, then r1 is safe.  Similarly, if r2 was NOT_INIT before then it can't
+have been used by any path from that point, so any value in r2 (including
+another NOT_INIT) is safe.  The implementation is in the function regsafe().
+Pruning considers not only the registers but also the stack (and any spilled
+registers it may hold).  They must all be safe for the branch to be pruned.
+This is implemented in states_equal().
+
+Understanding eBPF verifier messages
+------------------------------------
+
+The following are few examples of invalid eBPF programs and verifier error
+messages as seen in the log:
+
+Program with unreachable instructions:
+static struct bpf_insn prog[] = {
+  BPF_EXIT_INSN(),
+  BPF_EXIT_INSN(),
+};
+Error:
+  unreachable insn 1
+
+Program that reads uninitialized register:
+  BPF_MOV64_REG(BPF_REG_0, BPF_REG_2),
+  BPF_EXIT_INSN(),
+Error:
+  0: (bf) r0 = r2
+  R2 !read_ok
+
+Program that doesn't initialize R0 before exiting:
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_1),
+  BPF_EXIT_INSN(),
+Error:
+  0: (bf) r2 = r1
+  1: (95) exit
+  R0 !read_ok
+
+Program that accesses stack out of bounds:
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, 8, 0),
+  BPF_EXIT_INSN(),
+Error:
+  0: (7a) *(u64 *)(r10 +8) = 0
+  invalid stack off=8 size=8
+
+Program that doesn't initialize stack before passing its address into function:
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_EXIT_INSN(),
+Error:
+  0: (bf) r2 = r10
+  1: (07) r2 += -8
+  2: (b7) r1 = 0x0
+  3: (85) call 1
+  invalid indirect read from stack off -8+0 size 8
+
+Program that uses invalid map_fd=0 while calling to map_lookup_elem() function:
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_EXIT_INSN(),
+Error:
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 0x0
+  4: (85) call 1
+  fd 0 is not pointing to valid bpf_map
+
+Program that doesn't check return value of map_lookup_elem() before accessing
+map element:
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+  BPF_EXIT_INSN(),
+Error:
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 0x0
+  4: (85) call 1
+  5: (7a) *(u64 *)(r0 +0) = 0
+  R0 invalid mem access 'map_value_or_null'
+
+Program that correctly checks map_lookup_elem() returned value for NULL, but
+accesses the memory with incorrect alignment:
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 4, 0),
+  BPF_EXIT_INSN(),
+Error:
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 1
+  4: (85) call 1
+  5: (15) if r0 == 0x0 goto pc+1
+   R0=map_ptr R10=fp
+  6: (7a) *(u64 *)(r0 +4) = 0
+  misaligned access off 4 size 8
+
+Program that correctly checks map_lookup_elem() returned value for NULL and
+accesses memory with correct alignment in one side of 'if' branch, but fails
+to do so in the other side of 'if' branch:
+  BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_LD_MAP_FD(BPF_REG_1, 0),
+  BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+  BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 0),
+  BPF_EXIT_INSN(),
+  BPF_ST_MEM(BPF_DW, BPF_REG_0, 0, 1),
+  BPF_EXIT_INSN(),
+Error:
+  0: (7a) *(u64 *)(r10 -8) = 0
+  1: (bf) r2 = r10
+  2: (07) r2 += -8
+  3: (b7) r1 = 1
+  4: (85) call 1
+  5: (15) if r0 == 0x0 goto pc+2
+   R0=map_ptr R10=fp
+  6: (7a) *(u64 *)(r0 +0) = 0
+  7: (95) exit
+
+  from 5 to 8: R0=imm0 R10=fp
+  8: (7a) *(u64 *)(r0 +0) = 1
+  R0 invalid mem access 'imm'
+
+Testing
+-------
+
+Next to the BPF toolchain, the kernel also ships a test module that contains
+various test cases for classic and internal BPF that can be executed against
+the BPF interpreter and JIT compiler. It can be found in lib/test_bpf.c and
+enabled via Kconfig:
+
+  CONFIG_TEST_BPF=m
+
+After the module has been built and installed, the test suite can be executed
+via insmod or modprobe against 'test_bpf' module. Results of the test cases
+including timings in nsec can be found in the kernel log (dmesg).
+
+Misc
+----
+
+Also trinity, the Linux syscall fuzzer, has built-in support for BPF and
+SECCOMP-BPF kernel fuzzing.
+
+Written by
+----------
+
+The document was written in the hope that it is found useful and in order
+to give potential BPF hackers or security auditors a better overview of
+the underlying architecture.
+
+Jay Schulist <jschlst@samba.org>
+Daniel Borkmann <daniel@iogearbox.net>
+Alexei Starovoitov <ast@kernel.org>
diff --git a/Documentation/networking/fore200e.txt b/Documentation/networking/fore200e.txt
new file mode 100644
index 000000000..1f98f62b4
--- /dev/null
+++ b/Documentation/networking/fore200e.txt
@@ -0,0 +1,64 @@
+
+FORE Systems PCA-200E/SBA-200E ATM NIC driver
+---------------------------------------------
+
+This driver adds support for the FORE Systems 200E-series ATM adapters
+to the Linux operating system. It is based on the earlier PCA-200E driver
+written by Uwe Dannowski.
+
+The driver simultaneously supports PCA-200E and SBA-200E adapters on
+i386, alpha (untested), powerpc, sparc and sparc64 archs.
+
+The intent is to enable the use of different models of FORE adapters at the
+same time, by hosts that have several bus interfaces (such as PCI+SBUS,
+or PCI+EISA).
+
+Only PCI and SBUS devices are currently supported by the driver, but support
+for other bus interfaces such as EISA should not be too hard to add.
+
+
+Firmware Copyright Notice
+-------------------------
+
+Please read the fore200e_firmware_copyright file present
+in the linux/drivers/atm directory for details and restrictions.
+
+
+Firmware Updates
+----------------
+
+The FORE Systems 200E-series driver is shipped with firmware data being 
+uploaded to the ATM adapters at system boot time or at module loading time. 
+The supplied firmware images should work with all adapters.
+
+However, if you encounter problems (the firmware doesn't start or the driver
+is unable to read the PROM data), you may consider trying another firmware
+version. Alternative binary firmware images can be found somewhere on the
+ForeThought CD-ROM supplied with your adapter by FORE Systems.
+
+You can also get the latest firmware images from FORE Systems at
+https://en.wikipedia.org/wiki/FORE_Systems. Register TACTics Online and go to
+the 'software updates' pages. The firmware binaries are part of
+the various ForeThought software distributions.
+
+Notice that different versions of the PCA-200E firmware exist, depending
+on the endianness of the host architecture. The driver is shipped with
+both little and big endian PCA firmware images.
+
+Name and location of the new firmware images can be set at kernel
+configuration time:
+
+1. Copy the new firmware binary files (with .bin, .bin1 or .bin2 suffix)
+   to some directory, such as linux/drivers/atm.
+
+2. Reconfigure your kernel to set the new firmware name and location.
+   Expected pathnames are absolute or relative to the drivers/atm directory.
+
+3. Rebuild and re-install your kernel or your module.
+
+
+Feedback
+--------
+
+Feedback is welcome. Please send success stories/bug reports/
+patches/improvement/comments/flames to <lizzi@cnam.fr>.
diff --git a/Documentation/networking/framerelay.txt b/Documentation/networking/framerelay.txt
new file mode 100644
index 000000000..1a0b72044
--- /dev/null
+++ b/Documentation/networking/framerelay.txt
@@ -0,0 +1,39 @@
+Frame Relay (FR) support for linux is built into a two tiered system of device 
+drivers.  The upper layer implements RFC1490 FR specification, and uses the
+Data Link Connection Identifier (DLCI) as its hardware address.  Usually these
+are assigned by your network supplier, they give you the number/numbers of
+the Virtual Connections (VC) assigned to you.
+
+Each DLCI is a point-to-point link between your machine and a remote one.
+As such, a separate device is needed to accommodate the routing.  Within the
+net-tools archives is 'dlcicfg'.  This program will communicate with the
+base "DLCI" device, and create new net devices named 'dlci00', 'dlci01'... 
+The configuration script will ask you how many DLCIs you need, as well as
+how many DLCIs you want to assign to each Frame Relay Access Device (FRAD).
+
+The DLCI uses a number of function calls to communicate with the FRAD, all
+of which are stored in the FRAD's private data area.  assoc/deassoc, 
+activate/deactivate and dlci_config.  The DLCI supplies a receive function
+to the FRAD to accept incoming packets.
+
+With this initial offering, only 1 FRAD driver is available.  With many thanks
+to Sangoma Technologies, David Mandelstam & Gene Kozin, the S502A, S502E & 
+S508 are supported.  This driver is currently set up for only FR, but as 
+Sangoma makes more firmware modules available, it can be updated to provide
+them as well.
+
+Configuration of the FRAD makes use of another net-tools program, 'fradcfg'.
+This program makes use of a configuration file (which dlcicfg can also read)
+to specify the types of boards to be configured as FRADs, as well as perform
+any board specific configuration.  The Sangoma module of fradcfg loads the
+FR firmware into the card, sets the irq/port/memory information, and provides
+an initial configuration.
+
+Additional FRAD device drivers can be added as hardware is available.
+
+At this time, the dlcicfg and fradcfg programs have not been incorporated into
+the net-tools distribution.  They can be found at ftp.invlogic.com, in 
+/pub/linux.  Note that with OS/2 FTPD, you end up in /pub by default, so just
+use 'cd linux'.  v0.10 is for use on pre-2.0.3 and earlier, v0.15 is for 
+pre-2.0.4 and later.
+
diff --git a/Documentation/networking/gen_stats.txt b/Documentation/networking/gen_stats.txt
new file mode 100644
index 000000000..179b18ce4
--- /dev/null
+++ b/Documentation/networking/gen_stats.txt
@@ -0,0 +1,119 @@
+Generic networking statistics for netlink users
+======================================================================
+
+Statistic counters are grouped into structs:
+
+Struct               TLV type              Description
+----------------------------------------------------------------------
+gnet_stats_basic     TCA_STATS_BASIC       Basic statistics
+gnet_stats_rate_est  TCA_STATS_RATE_EST    Rate estimator
+gnet_stats_queue     TCA_STATS_QUEUE       Queue statistics
+none                 TCA_STATS_APP         Application specific
+
+
+Collecting:
+-----------
+
+Declare the statistic structs you need:
+struct mystruct {
+	struct gnet_stats_basic	bstats;
+	struct gnet_stats_queue	qstats;
+	...
+};
+
+Update statistics, in dequeue() methods only, (while owning qdisc->running)
+mystruct->tstats.packet++;
+mystruct->qstats.backlog += skb->pkt_len;
+
+
+Export to userspace (Dump):
+---------------------------
+
+my_dumping_routine(struct sk_buff *skb, ...)
+{
+	struct gnet_dump dump;
+
+	if (gnet_stats_start_copy(skb, TCA_STATS2, &mystruct->lock, &dump,
+				  TCA_PAD) < 0)
+		goto rtattr_failure;
+
+	if (gnet_stats_copy_basic(&dump, &mystruct->bstats) < 0 ||
+	    gnet_stats_copy_queue(&dump, &mystruct->qstats) < 0 ||
+		gnet_stats_copy_app(&dump, &xstats, sizeof(xstats)) < 0)
+		goto rtattr_failure;
+
+	if (gnet_stats_finish_copy(&dump) < 0)
+		goto rtattr_failure;
+	...
+}
+
+TCA_STATS/TCA_XSTATS backward compatibility:
+--------------------------------------------
+
+Prior users of struct tc_stats and xstats can maintain backward
+compatibility by calling the compat wrappers to keep providing the
+existing TLV types.
+
+my_dumping_routine(struct sk_buff *skb, ...)
+{
+    if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS,
+				     TCA_XSTATS, &mystruct->lock, &dump,
+				     TCA_PAD) < 0)
+		goto rtattr_failure;
+	...
+}
+
+A struct tc_stats will be filled out during gnet_stats_copy_* calls
+and appended to the skb. TCA_XSTATS is provided if gnet_stats_copy_app
+was called.
+
+
+Locking:
+--------
+
+Locks are taken before writing and released once all statistics have
+been written. Locks are always released in case of an error. You
+are responsible for making sure that the lock is initialized.
+
+
+Rate Estimator:
+--------------
+
+0) Prepare an estimator attribute. Most likely this would be in user
+   space. The value of this TLV should contain a tc_estimator structure.
+   As usual, such a TLV needs to be 32 bit aligned and therefore the
+   length needs to be appropriately set, etc. The estimator interval
+   and ewma log need to be converted to the appropriate values.
+   tc_estimator.c::tc_setup_estimator() is advisable to be used as the
+   conversion routine. It does a few clever things. It takes a time
+   interval in microsecs, a time constant also in microsecs and a struct
+   tc_estimator to  be populated. The returned tc_estimator can be
+   transported to the kernel.  Transfer such a structure in a TLV of type
+   TCA_RATE to your code in the kernel.
+
+In the kernel when setting up:
+1) make sure you have basic stats and rate stats setup first.
+2) make sure you have initialized stats lock that is used to setup such
+   stats.
+3) Now initialize a new estimator:
+
+   int ret = gen_new_estimator(my_basicstats,my_rate_est_stats,
+       mystats_lock, attr_with_tcestimator_struct);
+
+   if ret == 0
+       success
+   else
+       failed
+
+From now on, every time you dump my_rate_est_stats it will contain
+up-to-date info.
+
+Once you are done, call gen_kill_estimator(my_basicstats,
+my_rate_est_stats) Make sure that my_basicstats and my_rate_est_stats
+are still valid (i.e still exist) at the time of making this call.
+
+
+Authors:
+--------
+Thomas Graf <tgraf@suug.ch>
+Jamal Hadi Salim <hadi@cyberus.ca>
diff --git a/Documentation/networking/generic-hdlc.txt b/Documentation/networking/generic-hdlc.txt
new file mode 100644
index 000000000..4eb3cc40b
--- /dev/null
+++ b/Documentation/networking/generic-hdlc.txt
@@ -0,0 +1,132 @@
+Generic HDLC layer
+Krzysztof Halasa <khc@pm.waw.pl>
+
+
+Generic HDLC layer currently supports:
+1. Frame Relay (ANSI, CCITT, Cisco and no LMI)
+   - Normal (routed) and Ethernet-bridged (Ethernet device emulation)
+     interfaces can share a single PVC.
+   - ARP support (no InARP support in the kernel - there is an
+     experimental InARP user-space daemon available on:
+     http://www.kernel.org/pub/linux/utils/net/hdlc/).
+2. raw HDLC - either IP (IPv4) interface or Ethernet device emulation
+3. Cisco HDLC
+4. PPP
+5. X.25 (uses X.25 routines).
+
+Generic HDLC is a protocol driver only - it needs a low-level driver
+for your particular hardware.
+
+Ethernet device emulation (using HDLC or Frame-Relay PVC) is compatible
+with IEEE 802.1Q (VLANs) and 802.1D (Ethernet bridging).
+
+
+Make sure the hdlc.o and the hardware driver are loaded. It should
+create a number of "hdlc" (hdlc0 etc) network devices, one for each
+WAN port. You'll need the "sethdlc" utility, get it from:
+	http://www.kernel.org/pub/linux/utils/net/hdlc/
+
+Compile sethdlc.c utility:
+	gcc -O2 -Wall -o sethdlc sethdlc.c
+Make sure you're using a correct version of sethdlc for your kernel.
+
+Use sethdlc to set physical interface, clock rate, HDLC mode used,
+and add any required PVCs if using Frame Relay.
+Usually you want something like:
+
+	sethdlc hdlc0 clock int rate 128000
+	sethdlc hdlc0 cisco interval 10 timeout 25
+or
+	sethdlc hdlc0 rs232 clock ext
+	sethdlc hdlc0 fr lmi ansi
+	sethdlc hdlc0 create 99
+	ifconfig hdlc0 up
+	ifconfig pvc0 localIP pointopoint remoteIP
+
+In Frame Relay mode, ifconfig master hdlc device up (without assigning
+any IP address to it) before using pvc devices.
+
+
+Setting interface:
+
+* v35 | rs232 | x21 | t1 | e1 - sets physical interface for a given port
+                                if the card has software-selectable interfaces
+  loopback - activate hardware loopback (for testing only)
+* clock ext - both RX clock and TX clock external
+* clock int - both RX clock and TX clock internal
+* clock txint - RX clock external, TX clock internal
+* clock txfromrx - RX clock external, TX clock derived from RX clock
+* rate - sets clock rate in bps (for "int" or "txint" clock only)
+
+
+Setting protocol:
+
+* hdlc - sets raw HDLC (IP-only) mode
+  nrz / nrzi / fm-mark / fm-space / manchester - sets transmission code
+  no-parity / crc16 / crc16-pr0 (CRC16 with preset zeros) / crc32-itu
+  crc16-itu (CRC16 with ITU-T polynomial) / crc16-itu-pr0 - sets parity
+
+* hdlc-eth - Ethernet device emulation using HDLC. Parity and encoding
+  as above.
+
+* cisco - sets Cisco HDLC mode (IP, IPv6 and IPX supported)
+  interval - time in seconds between keepalive packets
+  timeout - time in seconds after last received keepalive packet before
+            we assume the link is down
+
+* ppp - sets synchronous PPP mode
+
+* x25 - sets X.25 mode
+
+* fr - Frame Relay mode
+  lmi ansi / ccitt / cisco / none - LMI (link management) type
+  dce - Frame Relay DCE (network) side LMI instead of default DTE (user).
+  It has nothing to do with clocks!
+  t391 - link integrity verification polling timer (in seconds) - user
+  t392 - polling verification timer (in seconds) - network
+  n391 - full status polling counter - user
+  n392 - error threshold - both user and network
+  n393 - monitored events count - both user and network
+
+Frame-Relay only:
+* create n | delete n - adds / deletes PVC interface with DLCI #n.
+  Newly created interface will be named pvc0, pvc1 etc.
+
+* create ether n | delete ether n - adds a device for Ethernet-bridged
+  frames. The device will be named pvceth0, pvceth1 etc.
+
+
+
+
+Board-specific issues
+---------------------
+
+n2.o and c101.o need parameters to work:
+
+	insmod n2 hw=io,irq,ram,ports[:io,irq,...]
+example:
+	insmod n2 hw=0x300,10,0xD0000,01
+
+or
+	insmod c101 hw=irq,ram[:irq,...]
+example:
+	insmod c101 hw=9,0xdc000
+
+If built into the kernel, these drivers need kernel (command line) parameters:
+	n2.hw=io,irq,ram,ports:...
+or
+	c101.hw=irq,ram:...
+
+
+
+If you have a problem with N2, C101 or PLX200SYN card, you can issue the
+"private" command to see port's packet descriptor rings (in kernel logs):
+
+	sethdlc hdlc0 private
+
+The hardware driver has to be build with #define DEBUG_RINGS.
+Attaching this info to bug reports would be helpful. Anyway, let me know
+if you have problems using this.
+
+For patches and other info look at:
+<http://www.kernel.org/pub/linux/utils/net/hdlc/>.
diff --git a/Documentation/networking/generic_netlink.txt b/Documentation/networking/generic_netlink.txt
new file mode 100644
index 000000000..3e071115c
--- /dev/null
+++ b/Documentation/networking/generic_netlink.txt
@@ -0,0 +1,3 @@
+A wiki document on how to use Generic Netlink can be found here:
+
+ * http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto
diff --git a/Documentation/networking/gianfar.txt b/Documentation/networking/gianfar.txt
new file mode 100644
index 000000000..ba1daea7f
--- /dev/null
+++ b/Documentation/networking/gianfar.txt
@@ -0,0 +1,42 @@
+The Gianfar Ethernet Driver
+
+Author: Andy Fleming <afleming@freescale.com>
+Updated: 2005-07-28
+
+
+CHECKSUM OFFLOADING
+
+The eTSEC controller (first included in parts from late 2005 like
+the 8548) has the ability to perform TCP, UDP, and IP checksums
+in hardware.  The Linux kernel only offloads the TCP and UDP
+checksums (and always performs the pseudo header checksums), so
+the driver only supports checksumming for TCP/IP and UDP/IP
+packets.  Use ethtool to enable or disable this feature for RX
+and TX.
+
+VLAN
+
+In order to use VLAN, please consult Linux documentation on
+configuring VLANs.  The gianfar driver supports hardware insertion and
+extraction of VLAN headers, but not filtering.  Filtering will be
+done by the kernel.
+
+MULTICASTING
+
+The gianfar driver supports using the group hash table on the
+TSEC (and the extended hash table on the eTSEC) for multicast
+filtering.  On the eTSEC, the exact-match MAC registers are used
+before the hash tables.  See Linux documentation on how to join
+multicast groups.
+
+PADDING
+
+The gianfar driver supports padding received frames with 2 bytes
+to align the IP header to a 16-byte boundary, when supported by
+hardware.
+
+ETHTOOL
+
+The gianfar driver supports the use of ethtool for many
+configuration options.  You must run ethtool only on currently
+open interfaces.  See ethtool documentation for details.
diff --git a/Documentation/networking/gtp.txt b/Documentation/networking/gtp.txt
new file mode 100644
index 000000000..6966bbec1
--- /dev/null
+++ b/Documentation/networking/gtp.txt
@@ -0,0 +1,230 @@
+The Linux kernel GTP tunneling module
+======================================================================
+Documentation by Harald Welte <laforge@gnumonks.org> and
+                 Andreas Schultz <aschultz@tpip.net>
+
+In 'drivers/net/gtp.c' you are finding a kernel-level implementation
+of a GTP tunnel endpoint.
+
+== What is GTP ==
+
+GTP is the Generic Tunnel Protocol, which is a 3GPP protocol used for
+tunneling User-IP payload between a mobile station (phone, modem)
+and the interconnection between an external packet data network (such
+as the internet).
+
+So when you start a 'data connection' from your mobile phone, the
+phone will use the control plane to signal for the establishment of
+such a tunnel between that external data network and the phone.  The
+tunnel endpoints thus reside on the phone and in the gateway.  All
+intermediate nodes just transport the encapsulated packet.
+
+The phone itself does not implement GTP but uses some other
+technology-dependent protocol stack for transmitting the user IP
+payload, such as LLC/SNDCP/RLC/MAC.
+
+At some network element inside the cellular operator infrastructure
+(SGSN in case of GPRS/EGPRS or classic UMTS, hNodeB in case of a 3G
+femtocell, eNodeB in case of 4G/LTE), the cellular protocol stacking
+is translated into GTP *without breaking the end-to-end tunnel*.  So
+intermediate nodes just perform some specific relay function.
+
+At some point the GTP packet ends up on the so-called GGSN (GSM/UMTS)
+or P-GW (LTE), which terminates the tunnel, decapsulates the packet
+and forwards it onto an external packet data network.  This can be
+public internet, but can also be any private IP network (or even
+theoretically some non-IP network like X.25).
+
+You can find the protocol specification in 3GPP TS 29.060, available
+publicly via the 3GPP website at http://www.3gpp.org/DynaReport/29060.htm
+
+A direct PDF link to v13.6.0 is provided for convenience below:
+http://www.etsi.org/deliver/etsi_ts/129000_129099/129060/13.06.00_60/ts_129060v130600p.pdf
+
+== The Linux GTP tunnelling module ==
+
+The module implements the function of a tunnel endpoint, i.e. it is
+able to decapsulate tunneled IP packets in the uplink originated by
+the phone, and encapsulate raw IP packets received from the external
+packet network in downlink towards the phone.
+
+It *only* implements the so-called 'user plane', carrying the User-IP
+payload, called GTP-U.  It does not implement the 'control plane',
+which is a signaling protocol used for establishment and teardown of
+GTP tunnels (GTP-C).
+
+So in order to have a working GGSN/P-GW setup, you will need a
+userspace program that implements the GTP-C protocol and which then
+uses the netlink interface provided by the GTP-U module in the kernel
+to configure the kernel module.
+
+This split architecture follows the tunneling modules of other
+protocols, e.g. PPPoE or L2TP, where you also run a userspace daemon
+to handle the tunnel establishment, authentication etc. and only the
+data plane is accelerated inside the kernel.
+
+Don't be confused by terminology:  The GTP User Plane goes through
+kernel accelerated path, while the GTP Control Plane goes to
+Userspace :)
+
+The official homepage of the module is at
+https://osmocom.org/projects/linux-kernel-gtp-u/wiki
+
+== Userspace Programs with Linux Kernel GTP-U support ==
+
+At the time of this writing, there are at least two Free Software
+implementations that implement GTP-C and can use the netlink interface
+to make use of the Linux kernel GTP-U support:
+
+* OpenGGSN (classic 2G/3G GGSN in C):
+  https://osmocom.org/projects/openggsn/wiki/OpenGGSN
+
+* ergw (GGSN + P-GW in Erlang):
+  https://github.com/travelping/ergw
+
+== Userspace Library / Command Line Utilities ==
+
+There is a userspace library called 'libgtpnl' which is based on
+libmnl and which implements a C-language API towards the netlink
+interface provided by the Kernel GTP module:
+
+http://git.osmocom.org/libgtpnl/
+
+== Protocol Versions ==
+
+There are two different versions of GTP-U: v0 [GSM TS 09.60] and v1
+[3GPP TS 29.281].  Both are implemented in the Kernel GTP module.
+Version 0 is a legacy version, and deprecated from recent 3GPP
+specifications.
+
+GTP-U uses UDP for transporting PDUs.  The receiving UDP port is 2151
+for GTPv1-U and 3386 for GTPv0-U.
+
+There are three versions of GTP-C: v0, v1, and v2.  As the kernel
+doesn't implement GTP-C, we don't have to worry about this.  It's the
+responsibility of the control plane implementation in userspace to
+implement that.
+
+== IPv6 ==
+
+The 3GPP specifications indicate either IPv4 or IPv6 can be used both
+on the inner (user) IP layer, or on the outer (transport) layer.
+
+Unfortunately, the Kernel module currently supports IPv6 neither for
+the User IP payload, nor for the outer IP layer.  Patches or other
+Contributions to fix this are most welcome!
+
+== Mailing List ==
+
+If yo have questions regarding how to use the Kernel GTP module from
+your own software, or want to contribute to the code, please use the
+osmocom-net-grps mailing list for related discussion. The list can be
+reached at osmocom-net-gprs@lists.osmocom.org and the mailman
+interface for managing your subscription is at
+https://lists.osmocom.org/mailman/listinfo/osmocom-net-gprs
+
+== Issue Tracker ==
+
+The Osmocom project maintains an issue tracker for the Kernel GTP-U
+module at
+https://osmocom.org/projects/linux-kernel-gtp-u/issues
+
+== History / Acknowledgements ==
+
+The Module was originally created in 2012 by Harald Welte, but never
+completed.  Pablo came in to finish the mess Harald left behind.  But
+doe to a lack of user interest, it never got merged.
+
+In 2015, Andreas Schultz came to the rescue and fixed lots more bugs,
+extended it with new features and finally pushed all of us to get it
+mainline, where it was merged in 4.7.0.
+
+== Architectural Details ==
+
+=== Local GTP-U entity and tunnel identification ===
+
+GTP-U uses UDP for transporting PDU's. The receiving UDP port is 2152
+for GTPv1-U and 3386 for GTPv0-U.
+
+There is only one GTP-U entity (and therefor SGSN/GGSN/S-GW/PDN-GW
+instance) per IP address. Tunnel Endpoint Identifier (TEID) are unique
+per GTP-U entity.
+
+A specific tunnel is only defined by the destination entity. Since the
+destination port is constant, only the destination IP and TEID define
+a tunnel. The source IP and Port have no meaning for the tunnel.
+
+Therefore:
+
+  * when sending, the remote entity is defined by the remote IP and
+    the tunnel endpoint id. The source IP and port have no meaning and
+    can be changed at any time.
+
+  * when receiving the local entity is defined by the local
+    destination IP and the tunnel endpoint id. The source IP and port
+    have no meaning and can change at any time.
+
+[3GPP TS 29.281] Section 4.3.0 defines this so:
+
+> The TEID in the GTP-U header is used to de-multiplex traffic
+> incoming from remote tunnel endpoints so that it is delivered to the
+> User plane entities in a way that allows multiplexing of different
+> users, different packet protocols and different QoS levels.
+> Therefore no two remote GTP-U endpoints shall send traffic to a
+> GTP-U protocol entity using the same TEID value except
+> for data forwarding as part of mobility procedures.
+
+The definition above only defines that two remote GTP-U endpoints
+*should not* send to the same TEID, it *does not* forbid or exclude
+such a scenario. In fact, the mentioned mobility procedures make it
+necessary that the GTP-U entity accepts traffic for TEIDs from
+multiple or unknown peers.
+
+Therefore, the receiving side identifies tunnels exclusively based on
+TEIDs, not based on the source IP!
+
+== APN vs. Network Device ==
+
+The GTP-U driver creates a Linux network device for each Gi/SGi
+interface.
+
+[3GPP TS 29.281] calls the Gi/SGi reference point an interface. This
+may lead to the impression that the GGSN/P-GW can have only one such
+interface.
+
+Correct is that the Gi/SGi reference point defines the interworking
+between +the 3GPP packet domain (PDN) based on GTP-U tunnel and IP
+based networks.
+
+There is no provision in any of the 3GPP documents that limits the
+number of Gi/SGi interfaces implemented by a GGSN/P-GW.
+
+[3GPP TS 29.061] Section 11.3 makes it clear that the selection of a
+specific Gi/SGi interfaces is made through the Access Point Name
+(APN):
+
+> 2. each private network manages its own addressing. In general this
+>    will result in different private networks having overlapping
+>    address ranges. A logically separate connection (e.g. an IP in IP
+>    tunnel or layer 2 virtual circuit) is used between the GGSN/P-GW
+>    and each private network.
+>
+>    In this case the IP address alone is not necessarily unique.  The
+>    pair of values, Access Point Name (APN) and IPv4 address and/or
+>    IPv6 prefixes, is unique.
+
+In order to support the overlapping address range use case, each APN
+is mapped to a separate Gi/SGi interface (network device).
+
+NOTE: The Access Point Name is purely a control plane (GTP-C) concept.
+At the GTP-U level, only Tunnel Endpoint Identifiers are present in
+GTP-U packets and network devices are known
+
+Therefore for a given UE the mapping in IP to PDN network is:
+  * network device + MS IP -> Peer IP + Peer TEID,
+
+and from PDN to IP network:
+  * local GTP-U IP + TEID  -> network device
+
+Furthermore, before a received T-PDU is injected into the network
+device the MS IP is checked against the IP recorded in PDP context.
diff --git a/Documentation/networking/hinic.txt b/Documentation/networking/hinic.txt
new file mode 100644
index 000000000..989366a40
--- /dev/null
+++ b/Documentation/networking/hinic.txt
@@ -0,0 +1,125 @@
+Linux Kernel Driver for Huawei Intelligent NIC(HiNIC) family
+============================================================
+
+Overview:
+=========
+HiNIC is a network interface card for the Data Center Area.
+
+The driver supports a range of link-speed devices (10GbE, 25GbE, 40GbE, etc.).
+The driver supports also a negotiated and extendable feature set.
+
+Some HiNIC devices support SR-IOV. This driver is used for Physical Function
+(PF).
+
+HiNIC devices support MSI-X interrupt vector for each Tx/Rx queue and
+adaptive interrupt moderation.
+
+HiNIC devices support also various offload features such as checksum offload,
+TCP Transmit Segmentation Offload(TSO), Receive-Side Scaling(RSS) and
+LRO(Large Receive Offload).
+
+
+Supported PCI vendor ID/device IDs:
+===================================
+
+19e5:1822 - HiNIC PF
+
+
+Driver Architecture and Source Code:
+====================================
+
+hinic_dev - Implement a Logical Network device that is independent from
+specific HW details about HW data structure formats.
+
+hinic_hwdev - Implement the HW details of the device and include the components
+for accessing the PCI NIC.
+
+hinic_hwdev contains the following components:
+===============================================
+
+HW Interface:
+=============
+
+The interface for accessing the pci device (DMA memory and PCI BARs).
+(hinic_hw_if.c, hinic_hw_if.h)
+
+Configuration Status Registers Area that describes the HW Registers on the
+configuration and status BAR0. (hinic_hw_csr.h)
+
+MGMT components:
+================
+
+Asynchronous Event Queues(AEQs) - The event queues for receiving messages from
+the MGMT modules on the cards. (hinic_hw_eqs.c, hinic_hw_eqs.h)
+
+Application Programmable Interface commands(API CMD) - Interface for sending
+MGMT commands to the card. (hinic_hw_api_cmd.c, hinic_hw_api_cmd.h)
+
+Management (MGMT) - the PF to MGMT channel that uses API CMD for sending MGMT
+commands to the card and receives notifications from the MGMT modules on the
+card by AEQs. Also set the addresses of the IO CMDQs in HW.
+(hinic_hw_mgmt.c, hinic_hw_mgmt.h)
+
+IO components:
+==============
+
+Completion Event Queues(CEQs) - The completion Event Queues that describe IO
+tasks that are finished. (hinic_hw_eqs.c, hinic_hw_eqs.h)
+
+Work Queues(WQ) - Contain the memory and operations for use by CMD queues and
+the Queue Pairs. The WQ is a Memory Block in a Page. The Block contains
+pointers to Memory Areas that are the Memory for the Work Queue Elements(WQEs).
+(hinic_hw_wq.c, hinic_hw_wq.h)
+
+Command Queues(CMDQ) - The queues for sending commands for IO management and is
+used to set the QPs addresses in HW. The commands completion events are
+accumulated on the CEQ that is configured to receive the CMDQ completion events.
+(hinic_hw_cmdq.c, hinic_hw_cmdq.h)
+
+Queue Pairs(QPs) - The HW Receive and Send queues for Receiving and Transmitting
+Data. (hinic_hw_qp.c, hinic_hw_qp.h, hinic_hw_qp_ctxt.h)
+
+IO - de/constructs all the IO components. (hinic_hw_io.c, hinic_hw_io.h)
+
+HW device:
+==========
+
+HW device - de/constructs the HW Interface, the MGMT components on the
+initialization of the driver and the IO components on the case of Interface
+UP/DOWN Events. (hinic_hw_dev.c, hinic_hw_dev.h)
+
+
+hinic_dev contains the following components:
+===============================================
+
+PCI ID table - Contains the supported PCI Vendor/Device IDs.
+(hinic_pci_tbl.h)
+
+Port Commands - Send commands to the HW device for port management
+(MAC, Vlan, MTU, ...). (hinic_port.c, hinic_port.h)
+
+Tx Queues - Logical Tx Queues that use the HW Send Queues for transmit.
+The Logical Tx queue is not dependent on the format of the HW Send Queue.
+(hinic_tx.c, hinic_tx.h)
+
+Rx Queues - Logical Rx Queues that use the HW Receive Queues for receive.
+The Logical Rx queue is not dependent on the format of the HW Receive Queue.
+(hinic_rx.c, hinic_rx.h)
+
+hinic_dev - de/constructs the Logical Tx and Rx Queues.
+(hinic_main.c, hinic_dev.h)
+
+
+Miscellaneous:
+=============
+
+Common functions that are used by HW and Logical Device.
+(hinic_common.c, hinic_common.h)
+
+
+Support
+=======
+
+If an issue is identified with the released source code on the supported kernel
+with a supported adapter, email the specific information related to the issue to
+aviad.krawczyk@huawei.com.
diff --git a/Documentation/networking/i40e.txt b/Documentation/networking/i40e.txt
new file mode 100644
index 000000000..c2d6e1824
--- /dev/null
+++ b/Documentation/networking/i40e.txt
@@ -0,0 +1,190 @@
+Linux Base Driver for the Intel(R) Ethernet Controller XL710 Family
+===================================================================
+
+Intel i40e Linux driver.
+Copyright(c) 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Additional Configurations
+- Performance Tuning
+- Known Issues
+- Support
+
+
+Identifying Your Adapter
+========================
+
+The driver in this release is compatible with the Intel Ethernet
+Controller XL710 Family.
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/network/sb/CS-012904.htm
+
+
+Enabling the driver
+===================
+
+The driver is enabled via the standard kernel configuration system,
+using the make command:
+
+     make config/oldconfig/menuconfig/etc.
+
+The driver is located in the menu structure at:
+
+	-> Device Drivers
+	  -> Network device support (NETDEVICES [=y])
+	    -> Ethernet driver support
+	      -> Intel devices
+	        -> Intel(R) Ethernet Controller XL710 Family
+
+Additional Configurations
+=========================
+
+  Generic Receive Offload (GRO)
+  -----------------------------
+  The driver supports the in-kernel software implementation of GRO.  GRO has
+  shown that by coalescing Rx traffic into larger chunks of data, CPU
+  utilization can be significantly reduced when under large Rx load.  GRO is
+  an evolution of the previously-used LRO interface.  GRO is able to coalesce
+  other protocols besides TCP.  It's also safe to use with configurations that
+  are problematic for LRO, namely bridging and iSCSI.
+
+  Ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information. The latest
+  ethtool version is required for this functionality.
+
+  The latest release of ethtool can be found from
+  https://www.kernel.org/pub/software/network/ethtool
+
+
+  Flow Director n-ntuple traffic filters (FDir)
+  ---------------------------------------------
+  The driver utilizes the ethtool interface for configuring ntuple filters,
+  via "ethtool -N <device> <filter>".
+
+  The sctp4, ip4, udp4, and tcp4 flow types are supported with the standard
+  fields including src-ip, dst-ip, src-port and dst-port. The driver only
+  supports fully enabling or fully masking the fields, so use of the mask
+  fields for partial matches is not supported.
+
+  Additionally, the driver supports using the action to specify filters for a
+  Virtual Function. You can specify the action as a 64bit value, where the
+  lower 32 bits represents the queue number, while the next 8 bits represent
+  which VF. Note that 0 is the PF, so the VF identifier is offset by 1. For
+  example:
+
+    ... action 0x800000002 ...
+
+  Would indicate to direct traffic for Virtual Function 7 (8 minus 1) on queue
+  2 of that VF.
+
+  The driver also supports using the user-defined field to specify 2 bytes of
+  arbitrary data to match within the packet payload in addition to the regular
+  fields. The data is specified in the lower 32bits of the user-def field in
+  the following way:
+
+  +----------------------------+---------------------------+
+  | 31    28    24    20    16 | 15    12     8     4     0|
+  +----------------------------+---------------------------+
+  | offset into packet payload |  2 bytes of flexible data |
+  +----------------------------+---------------------------+
+
+  As an example,
+
+    ... user-def 0x4FFFF ....
+
+  means to match the value 0xFFFF 4 bytes into the packet payload. Note that
+  the offset is based on the beginning of the payload, and not the beginning
+  of the packet. Thus
+
+    flow-type tcp4 ... user-def 0x8BEAF ....
+
+  would match TCP/IPv4 packets which have the value 0xBEAF 8bytes into the
+  TCP/IPv4 payload.
+
+  For ICMP, the hardware parses the ICMP header as 4 bytes of header and 4
+  bytes of payload, so if you want to match an ICMP frames payload you may need
+  to add 4 to the offset in order to match the data.
+
+  Furthermore, the offset can only be up to a value of 64, as the hardware
+  will only read up to 64 bytes of data from the payload. It must also be even
+  as the flexible data is 2 bytes long and must be aligned to byte 0 of the
+  packet payload.
+
+  When programming filters, the hardware is limited to using a single input
+  set for each flow type. This means that it is an error to program two
+  different filters with the same type that don't match on the same fields.
+  Thus the second of the following two commands will fail:
+
+    ethtool -N <device> flow-type tcp4 src-ip 192.168.0.7 action 5
+    ethtool -N <device> flow-type tcp4 dst-ip 192.168.15.18 action 1
+
+  This is because the first filter will be accepted and reprogram the input
+  set for TCPv4 filters, but the second filter will be unable to reprogram the
+  input set until all the conflicting TCPv4 filters are first removed.
+
+  Note that the user-defined flexible offset is also considered part of the
+  input set and cannot be programmed separately for multiple filters of the
+  same type. However, the flexible data is not part of the input set and
+  multiple filters may use the same offset but match against different data.
+
+  Data Center Bridging (DCB)
+  --------------------------
+  DCB configuration is not currently supported.
+
+  FCoE
+  ----
+  The driver supports Fiber Channel over Ethernet (FCoE) and Data Center
+  Bridging (DCB) functionality. Configuring DCB and FCoE is outside the scope
+  of this driver doc. Refer to http://www.open-fcoe.org/ for FCoE project
+  information and http://www.open-lldp.org/ or email list
+  e1000-eedc@lists.sourceforge.net for DCB information.
+
+  MAC and VLAN anti-spoofing feature
+  ----------------------------------
+  When a malicious driver attempts to send a spoofed packet, it is dropped by
+  the hardware and not transmitted.  An interrupt is sent to the PF driver
+  notifying it of the spoof attempt.
+
+  When a spoofed packet is detected the PF driver will send the following
+  message to the system log (displayed by  the "dmesg" command):
+
+  Spoof event(s) detected on VF (n)
+
+  Where n=the VF that attempted to do the spoofing.
+
+
+Performance Tuning
+==================
+
+An excellent article on performance tuning can be found at:
+
+http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
+
+
+Known Issues
+============
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://e1000.sourceforge.net
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sourceforge.net and copy
+netdev@vger.kernel.org.
diff --git a/Documentation/networking/i40evf.txt b/Documentation/networking/i40evf.txt
new file mode 100644
index 000000000..e9b3035b9
--- /dev/null
+++ b/Documentation/networking/i40evf.txt
@@ -0,0 +1,54 @@
+Linux* Base Driver for Intel(R) Network Connection
+==================================================
+
+Intel Ethernet Adaptive Virtual Function Linux driver.
+Copyright(c) 2013-2017 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Known Issues/Troubleshooting
+- Support
+
+This file describes the i40evf Linux* Base Driver.
+
+The i40evf driver supports the below mentioned virtual function
+devices and can only be activated on kernels running the i40e or
+newer Physical Function (PF) driver compiled with CONFIG_PCI_IOV.
+The i40evf driver requires CONFIG_PCI_MSI to be enabled.
+
+The guest OS loading the i40evf driver must support MSI-X interrupts.
+
+Supported Hardware
+==================
+Intel XL710 X710 Virtual Function
+Intel Ethernet Adaptive Virtual Function
+Intel X722 Virtual Function
+
+Identifying Your Adapter
+========================
+
+For more information on how to identify your adapter, go to the
+Adapter & Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+Known Issues/Troubleshooting
+============================
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/ice.txt b/Documentation/networking/ice.txt
new file mode 100644
index 000000000..6261c4637
--- /dev/null
+++ b/Documentation/networking/ice.txt
@@ -0,0 +1,39 @@
+Intel(R) Ethernet Connection E800 Series Linux Driver
+===================================================================
+
+Intel ice Linux driver.
+Copyright(c) 2018 Intel Corporation.
+
+Contents
+========
+- Enabling the driver
+- Support
+
+The driver in this release supports Intel's E800 Series of products. For
+more information, visit Intel's support page at http://support.intel.com.
+
+Enabling the driver
+===================
+
+The driver is enabled via the standard kernel configuration system,
+using the make command:
+
+     Make oldconfig/silentoldconfig/menuconfig/etc.
+
+The driver is located in the menu structure at:
+
+	-> Device Drivers
+	  -> Network device support (NETDEVICES [=y])
+	    -> Ethernet driver support
+	      -> Intel devices
+	        -> Intel(R) Ethernet Connection E800 Series Support
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+If an issue is identified with the released source code, please email
+the maintainer listed in the MAINTAINERS file.
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
new file mode 100644
index 000000000..e74d8e1da
--- /dev/null
+++ b/Documentation/networking/ieee802154.txt
@@ -0,0 +1,177 @@
+
+		Linux IEEE 802.15.4 implementation
+
+
+Introduction
+============
+The IEEE 802.15.4 working group focuses on standardization of the bottom
+two layers: Medium Access Control (MAC) and Physical access (PHY). And there
+are mainly two options available for upper layers:
+ - ZigBee - proprietary protocol from the ZigBee Alliance
+ - 6LoWPAN - IPv6 networking over low rate personal area networks
+
+The goal of the Linux-wpan is to provide a complete implementation
+of the IEEE 802.15.4 and 6LoWPAN protocols. IEEE 802.15.4 is a stack
+of protocols for organizing Low-Rate Wireless Personal Area Networks.
+
+The stack is composed of three main parts:
+ - IEEE 802.15.4 layer;  We have chosen to use plain Berkeley socket API,
+   the generic Linux networking stack to transfer IEEE 802.15.4 data
+   messages and a special protocol over netlink for configuration/management
+ - MAC - provides access to shared channel and reliable data delivery
+ - PHY - represents device drivers
+
+
+Socket API
+==========
+
+int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
+.....
+
+The address family, socket addresses etc. are defined in the
+include/net/af_ieee802154.h header or in the special header
+in the userspace package (see either http://wpan.cakelab.org/ or the
+git tree at https://github.com/linux-wpan/wpan-tools).
+
+
+Kernel side
+=============
+
+Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
+1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
+   exports a management (e.g. MLME) and data API.
+2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
+   possibly with some kinds of acceleration like automatic CRC computation and
+   comparation, automagic ACK handling, address matching, etc.
+
+Those types of devices require different approach to be hooked into Linux kernel.
+
+
+HardMAC
+=======
+
+See the header include/net/ieee802154_netdev.h. You have to implement Linux
+net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
+code via plain sk_buffs. On skb reception skb->cb must contain additional
+info as described in the struct ieee802154_mac_cb. During packet transmission
+the skb->cb is used to provide additional data to device's header_ops->create
+function. Be aware that this data can be overridden later (when socket code
+submits skb to qdisc), so if you need something from that cb later, you should
+store info in the skb->data on your own.
+
+To hook the MLME interface you have to populate the ml_priv field of your
+net_device with a pointer to struct ieee802154_mlme_ops instance. The fields
+assoc_req, assoc_resp, disassoc_req, start_req, and scan_req are optional.
+All other fields are required.
+
+
+SoftMAC
+=======
+
+The MAC is the middle layer in the IEEE 802.15.4 Linux stack. This moment it
+provides interface for drivers registration and management of slave interfaces.
+
+NOTE: Currently the only monitor device type is supported - it's IEEE 802.15.4
+stack interface for network sniffers (e.g. WireShark).
+
+This layer is going to be extended soon.
+
+See header include/net/mac802154.h and several drivers in
+drivers/net/ieee802154/.
+
+
+Device drivers API
+==================
+
+The include/net/mac802154.h defines following functions:
+ - struct ieee802154_hw *
+   ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops):
+   allocation of IEEE 802.15.4 compatible hardware device
+
+ - void ieee802154_free_hw(struct ieee802154_hw *hw):
+   freeing allocated hardware device
+
+ - int ieee802154_register_hw(struct ieee802154_hw *hw):
+   register PHY which is the allocated hardware device, in the system
+
+ - void ieee802154_unregister_hw(struct ieee802154_hw *hw):
+   freeing registered PHY
+
+ - void ieee802154_rx_irqsafe(struct ieee802154_hw *hw, struct sk_buff *skb,
+                              u8 lqi):
+   telling 802.15.4 module there is a new received frame in the skb with
+   the RF Link Quality Indicator (LQI) from the hardware device
+
+ - void ieee802154_xmit_complete(struct ieee802154_hw *hw, struct sk_buff *skb,
+                                 bool ifs_handling):
+   telling 802.15.4 module the frame in the skb is or going to be
+   transmitted through the hardware device
+
+The device driver must implement the following callbacks in the IEEE 802.15.4
+operations structure at least:
+struct ieee802154_ops {
+	...
+	int	(*start)(struct ieee802154_hw *hw);
+	void	(*stop)(struct ieee802154_hw *hw);
+	...
+	int	(*xmit_async)(struct ieee802154_hw *hw, struct sk_buff *skb);
+	int	(*ed)(struct ieee802154_hw *hw, u8 *level);
+	int	(*set_channel)(struct ieee802154_hw *hw, u8 page, u8 channel);
+	...
+};
+
+ - int start(struct ieee802154_hw *hw):
+   handler that 802.15.4 module calls for the hardware device initialization.
+
+ - void stop(struct ieee802154_hw *hw):
+   handler that 802.15.4 module calls for the hardware device cleanup.
+
+ - int xmit_async(struct ieee802154_hw *hw, struct sk_buff *skb):
+   handler that 802.15.4 module calls for each frame in the skb going to be
+   transmitted through the hardware device.
+
+ - int ed(struct ieee802154_hw *hw, u8 *level):
+   handler that 802.15.4 module calls for Energy Detection from the hardware
+   device.
+
+ - int set_channel(struct ieee802154_hw *hw, u8 page, u8 channel):
+   set radio for listening on specific channel of the hardware device.
+
+Moreover IEEE 802.15.4 device operations structure should be filled.
+
+Fake drivers
+============
+
+In addition there is a driver available which simulates a real device with
+SoftMAC (fakelb - IEEE 802.15.4 loopback driver) interface. This option
+provides a possibility to test and debug the stack without usage of real hardware.
+
+See sources in drivers/net/ieee802154 folder for more details.
+
+
+6LoWPAN Linux implementation
+============================
+
+The IEEE 802.15.4 standard specifies an MTU of 127 bytes, yielding about 80
+octets of actual MAC payload once security is turned on, on a wireless link
+with a link throughput of 250 kbps or less.  The 6LoWPAN adaptation format
+[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
+taking into account limited bandwidth, memory, or energy resources that are
+expected in applications such as wireless Sensor Networks.  [RFC4944] defines
+a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
+to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
+compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
+relatively large IPv6 and UDP headers down to (in the best case) several bytes.
+
+In September 2011 the standard update was published - [RFC6282].
+It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
+used in this Linux implementation.
+
+All the code related to 6lowpan you may find in files: net/6lowpan/*
+and net/ieee802154/6lowpan/*
+
+To setup a 6LoWPAN interface you need:
+1. Add IEEE802.15.4 interface and set channel and PAN ID;
+2. Add 6lowpan interface by command like:
+   # ip link add link wpan0 name lowpan0 type lowpan
+3. Bring up 'lowpan0' interface
diff --git a/Documentation/networking/igb.txt b/Documentation/networking/igb.txt
new file mode 100644
index 000000000..f90643ef3
--- /dev/null
+++ b/Documentation/networking/igb.txt
@@ -0,0 +1,129 @@
+Linux* Base Driver for Intel(R) Ethernet Network Connection
+===========================================================
+
+Intel Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Additional Configurations
+- Support
+
+Identifying Your Adapter
+========================
+
+This driver supports all 82575, 82576 and 82580-based Intel (R) gigabit network
+connections.
+
+For specific information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+Command Line Parameters
+=======================
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+max_vfs
+-------
+Valid Range:   0-7
+Default Value: 0
+
+This parameter adds support for SR-IOV.  It causes the driver to spawn up to
+max_vfs worth of virtual function.
+
+Additional Configurations
+=========================
+
+  Jumbo Frames
+  ------------
+  Jumbo Frames support is enabled by changing the MTU to a value larger than
+  the default of 1500.  Use the ip command to increase the MTU size.
+  For example:
+
+       ip link set dev eth<x> mtu 9000
+
+  This setting is not saved across reboots.
+
+  Notes:
+
+  - The maximum MTU setting for Jumbo Frames is 9216.  This value coincides
+    with the maximum Jumbo Frames size of 9234 bytes.
+
+  - Using Jumbo frames at 10 or 100 Mbps is not supported and may result in
+    poor performance or loss of link.
+
+  ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information. The latest
+  version of ethtool can be found at:
+
+  https://www.kernel.org/pub/software/network/ethtool/
+
+  Enabling Wake on LAN* (WoL)
+  ---------------------------
+  WoL is configured through the ethtool* utility.
+
+  For instructions on enabling WoL with ethtool, refer to the ethtool man page.
+
+  WoL will be enabled on the system during the next shut down or reboot.
+  For this driver version, in order to enable WoL, the igb driver must be
+  loaded when shutting down or rebooting the system.
+
+  Wake On LAN is only supported on port A of multi-port adapters.
+
+  Wake On LAN is not supported for the Intel(R) Gigabit VT Quad Port Server
+  Adapter.
+
+  Multiqueue
+  ----------
+  In this mode, a separate MSI-X vector is allocated for each queue and one
+  for "other" interrupts such as link status change and errors.  All
+  interrupts are throttled via interrupt moderation.  Interrupt moderation
+  must be used to avoid interrupt storms while the driver is processing one
+  interrupt.  The moderation value should be at least as large as the expected
+  time for the driver to process an interrupt. Multiqueue is off by default.
+
+  REQUIREMENTS: MSI-X support is required for Multiqueue. If MSI-X is not
+  found, the system will fallback to MSI or to Legacy interrupts.
+
+  MAC and VLAN anti-spoofing feature
+  ----------------------------------
+  When a malicious driver attempts to send a spoofed packet, it is dropped by
+  the hardware and not transmitted.  An interrupt is sent to the PF driver
+  notifying it of the spoof attempt.
+
+  When a spoofed packet is detected the PF driver will send the following
+  message to the system log (displayed by  the "dmesg" command):
+
+  Spoof event(s) detected on VF(n)
+
+  Where n=the VF that attempted to do the spoofing.
+
+  Setting MAC Address, VLAN and Rate Limit Using IProute2 Tool
+  ------------------------------------------------------------
+  You can set a MAC address of a Virtual Function (VF), a default VLAN and the
+  rate limit using the IProute2 tool. Download the latest version of the
+  iproute2 tool from Sourceforge if your version does not have all the
+  features you require.
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    www.intel.com/support/
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/igbvf.txt b/Documentation/networking/igbvf.txt
new file mode 100644
index 000000000..bd404735f
--- /dev/null
+++ b/Documentation/networking/igbvf.txt
@@ -0,0 +1,80 @@
+Linux* Base Driver for Intel(R) Ethernet Network Connection
+===========================================================
+
+Intel Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Additional Configurations
+- Support
+
+This file describes the igbvf Linux* Base Driver for Intel Network Connection.
+
+The igbvf driver supports 82576-based virtual function devices that can only
+be activated on kernels that support SR-IOV. SR-IOV requires the correct
+platform and OS support.
+
+The igbvf driver requires the igb driver, version 2.0 or later. The igbvf
+driver supports virtual functions generated by the igb driver with a max_vfs
+value of 1 or greater. For more information on the max_vfs parameter refer
+to the README included with the igb driver.
+
+The guest OS loading the igbvf driver must support MSI-X interrupts.
+
+This driver is only supported as a loadable module at this time.  Intel is
+not supplying patches against the kernel source to allow for static linking
+of the driver.  For questions related to hardware requirements, refer to the
+documentation supplied with your Intel Gigabit adapter.  All hardware
+requirements listed apply to use with Linux.
+
+Instructions on updating ethtool can be found in the section "Additional
+Configurations" later in this document.
+
+VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
+
+Identifying Your Adapter
+========================
+
+The igbvf driver supports 82576-based virtual function devices that can only
+be activated on kernels that support SR-IOV.
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+For the latest Intel network drivers for Linux, refer to the following
+website.  In the search field, enter your adapter name or type, or use the
+networking link on the left to search for your adapter:
+
+    http://downloadcenter.intel.com/scripts-df-external/Support_Intel.aspx
+
+Additional Configurations
+=========================
+
+  ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information.  The ethtool
+  version 3.0 or later is required for this functionality, although we
+  strongly recommend downloading the latest version at:
+
+  https://www.kernel.org/pub/software/network/ethtool/
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/ila.txt b/Documentation/networking/ila.txt
new file mode 100644
index 000000000..a17dac9dc
--- /dev/null
+++ b/Documentation/networking/ila.txt
@@ -0,0 +1,285 @@
+Identifier Locator Addressing (ILA)
+
+
+Introduction
+============
+
+Identifier-locator addressing (ILA) is a technique used with IPv6 that
+differentiates between location and identity of a network node. Part of an
+address expresses the immutable identity of the node, and another part
+indicates the location of the node which can be dynamic. Identifier-locator
+addressing can be used to efficiently implement overlay networks for
+network virtualization as well as solutions for use cases in mobility.
+
+ILA can be thought of as means to implement an overlay network without
+encapsulation. This is accomplished by performing network address
+translation on destination addresses as a packet traverses a network. To
+the network, an ILA translated packet appears to be no different than any
+other IPv6 packet. For instance, if the transport protocol is TCP then an
+ILA translated packet looks like just another TCP/IPv6 packet. The
+advantage of this is that ILA is transparent to the network so that
+optimizations in the network, such as ECMP, RSS, GRO, GSO, etc., just work.
+
+The ILA protocol is described in Internet-Draft draft-herbert-intarea-ila.
+
+
+ILA terminology
+===============
+
+  - Identifier	A number that identifies an addressable node in the network
+		independent of its location. ILA identifiers are sixty-four
+		bit values.
+
+  - Locator	A network prefix that routes to a physical host. Locators
+		provide the topological location of an addressed node. ILA
+		locators are sixty-four bit prefixes.
+
+  - ILA mapping
+		A mapping of an ILA identifier to a locator (or to a
+		locator and meta data). An ILA domain maintains a database
+		that contains mappings for all destinations in the domain.
+
+  - SIR address
+		An IPv6 address composed of a SIR prefix (upper sixty-
+		four bits) and an identifier (lower sixty-four bits).
+		SIR addresses are visible to applications and provide a
+		means for them to address nodes independent of their
+		location.
+
+  - ILA address
+		An IPv6 address composed of a locator (upper sixty-four
+		bits) and an identifier (low order sixty-four bits). ILA
+		addresses are never visible to an application.
+
+  - ILA host	An end host that is capable of performing ILA translations
+		on transmit or receive.
+
+  - ILA router	A network node that performs ILA translation and forwarding
+		of translated packets.
+
+  - ILA forwarding cache
+		A type of ILA router that only maintains a working set
+		cache of mappings.
+
+  - ILA node	A network node capable of performing ILA translations. This
+		can be an ILA router, ILA forwarding cache, or ILA host.
+
+
+Operation
+=========
+
+There are two fundamental operations with ILA:
+
+  - Translate a SIR address to an ILA address. This is performed on ingress
+    to an ILA overlay.
+
+  - Translate an ILA address to a SIR address. This is performed on egress
+    from the ILA overlay.
+
+ILA can be deployed either on end hosts or intermediate devices in the
+network; these are provided by "ILA hosts" and "ILA routers" respectively.
+Configuration and datapath for these two points of deployment is somewhat
+different.
+
+The diagram below illustrates the flow of packets through ILA as well
+as showing ILA hosts and routers.
+
+    +--------+                                                +--------+
+    | Host A +-+                                         +--->| Host B |
+    |        | |              (2) ILA                   (')   |        |
+    +--------+ |            ...addressed....           (   )  +--------+
+               V  +---+--+  .  packet      .  +---+--+  (_)
+   (1) SIR     |  | ILA  |----->-------->---->| ILA  |   |   (3) SIR
+    addressed  +->|router|  .              .  |router|->-+    addressed
+    packet        +---+--+  .     IPv6     .  +---+--+        packet
+                   /        .    Network   .
+                  /         .              .   +--+-++--------+
+    +--------+   /          .              .   |ILA ||  Host  |
+    |  Host  +--+           .              .- -|host||        |
+    |        |              .              .   +--+-++--------+
+    +--------+              ................
+
+
+Transport checksum handling
+===========================
+
+When an address is translated by ILA, an encapsulated transport checksum
+that includes the translated address in a pseudo header may be rendered
+incorrect on the wire. This is a problem for intermediate devices,
+including checksum offload in NICs, that process the checksum. There are
+three options to deal with this:
+
+- no action	Allow the checksum to be incorrect on the wire. Before
+		a receiver verifies a checksum the ILA to SIR address
+		translation must be done.
+
+- adjust transport checksum
+		When ILA translation is performed the packet is parsed
+		and if a transport layer checksum is found then it is
+		adjusted to reflect the correct checksum per the
+		translated address.
+
+- checksum neutral mapping
+		When an address is translated the difference can be offset
+		elsewhere in a part of the packet that is covered by
+		the checksum. The low order sixteen bits of the identifier
+		are used. This method is preferred since it doesn't require
+		parsing a packet beyond the IP header and in most cases the
+		adjustment can be precomputed and saved with the mapping.
+
+Note that the checksum neutral adjustment affects the low order sixteen
+bits of the identifier. When ILA to SIR address translation is done on
+egress the low order bits are restored to the original value which
+restores the identifier as it was originally sent.
+
+
+Identifier types
+================
+
+ILA defines different types of identifiers for different use cases.
+
+The defined types are:
+
+      0: interface identifier
+
+      1: locally unique identifier
+
+      2: virtual networking identifier for IPv4 address
+
+      3: virtual networking identifier for IPv6 unicast address
+
+      4: virtual networking identifier for IPv6 multicast address
+
+      5: non-local address identifier
+
+In the current implementation of kernel ILA only locally unique identifiers
+(LUID) are supported. LUID allows for a generic, unformatted 64 bit
+identifier.
+
+
+Identifier formats
+==================
+
+Kernel ILA supports two optional fields in an identifier for formatting:
+"C-bit" and "identifier type". The presence of these fields is determined
+by configuration as demonstrated below.
+
+If the identifier type is present it occupies the three highest order
+bits of an identifier. The possible values are given in the above list.
+
+If the C-bit is present,  this is used as an indication that checksum
+neutral mapping has been done. The C-bit can only be set in an
+ILA address, never a SIR address.
+
+In the simplest format the identifier types, C-bit, and checksum
+adjustment value are not present so an identifier is considered an
+unstructured sixty-four bit value.
+
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                            Identifier                         |
+     +                                                               +
+     |                                                               |
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The checksum neutral adjustment may be configured to always be
+present using neutral-map-auto. In this case there is no C-bit, but the
+checksum adjustment is in the low order 16 bits. The identifier is
+still sixty-four bits.
+
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                            Identifier                         |
+     |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                               |  Checksum-neutral adjustment  |
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The C-bit may used to explicitly indicate that checksum neutral
+mapping has been applied to an ILA address. The format is:
+
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |     |C|                    Identifier                         |
+     |     +-+                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                               |  Checksum-neutral adjustment  |
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The identifier type field may be present to indicate the identifier
+type. If it is not present then the type is inferred based on mapping
+configuration. The checksum neutral adjustment may automatically
+used with the identifier type as illustrated below.
+
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     | Type|                      Identifier                         |
+     +-+-+-+                         +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                               |  Checksum-neutral adjustment  |
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+If the identifier type and the C-bit can be present simultaneously so
+the identifier format would be:
+
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     | Type|C|                    Identifier                         |
+     +-+-+-+-+                       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+     |                               |  Checksum-neutral adjustment  |
+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+Configuration
+=============
+
+There are two methods to configure ILA mappings. One is by using LWT routes
+and the other is ila_xlat (called from NFHOOK PREROUTING hook). ila_xlat
+is intended to be used in the receive path for ILA hosts .
+
+An ILA router has also been implemented in XDP. Description of that is
+outside the scope of this document.
+
+The usage of for ILA LWT routes is:
+
+ip route add DEST/128 encap ila LOC csum-mode MODE ident-type TYPE via ADDR
+
+Destination (DEST) can either be a SIR address (for an ILA host or ingress
+ILA router) or an ILA address (egress ILA router). LOC is the sixty-four
+bit locator (with format W:X:Y:Z) that overwrites the upper sixty-four
+bits of the destination address.  Checksum MODE is one of "no-action",
+"adj-transport", "neutral-map", and "neutral-map-auto". If neutral-map is
+set then the C-bit will be present. Identifier TYPE one of "luid" or
+"use-format." In the case of use-format, the identifier type field is
+present and the effective type is taken from that.
+
+The usage of ila_xlat is:
+
+ip ila add loc_match MATCH loc LOC csum-mode MODE ident-type TYPE
+
+MATCH indicates the incoming locator that must be matched to apply
+a the translaiton. LOC is the locator that overwrites the upper
+sixty-four bits of the destination address. MODE and TYPE have the
+same meanings as described above.
+
+
+Some examples
+=============
+
+# Configure an ILA route that uses checksum neutral mapping as well
+# as type field. Note that the type field is set in the SIR address
+# (the 2000 implies type is 1 which is LUID).
+ip route add 3333:0:0:1:2000:0:1:87/128 encap ila 2001:0:87:0 \
+     csum-mode neutral-map ident-type use-format
+
+# Configure an ILA LWT route that uses auto checksum neutral mapping
+# (no C-bit) and configure identifier type to be LUID so that the
+# identifier type field will not be present.
+ip route add 3333:0:0:1:2000:0:2:87/128 encap ila 2001:0:87:1 \
+     csum-mode neutral-map-auto ident-type luid
+
+ila_xlat configuration
+
+# Configure an ILA to SIR mapping that matches a locator and overwrites
+# it with a SIR address (3333:0:0:1 in this example). The C-bit and
+# identifier field are used.
+ip ila add loc_match 2001:0:119:0 loc 3333:0:0:1 \
+    csum-mode neutral-map-auto ident-type use-format
+
+# Configure an ILA to SIR mapping where checksum neutral is automatically
+# set without the C-bit and the identifier type is configured to be LUID
+# so that the identifier type field is not present.
+ip ila add loc_match 2001:0:119:0 loc 3333:0:0:1 \
+    csum-mode neutral-map-auto ident-type use-format
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
new file mode 100644
index 000000000..fcd710f2c
--- /dev/null
+++ b/Documentation/networking/index.rst
@@ -0,0 +1,30 @@
+Linux Networking Documentation
+==============================
+
+Contents:
+
+.. toctree::
+   :maxdepth: 2
+
+   netdev-FAQ
+   af_xdp
+   batman-adv
+   can
+   can_ucan_protocol
+   dpaa2/index
+   e100
+   e1000
+   kapi
+   z8530book
+   msg_zerocopy
+   failover
+   net_failover
+   alias
+   bridge
+
+.. only::  subproject
+
+   Indices
+   =======
+
+   * :ref:`genindex`
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
new file mode 100644
index 000000000..3c617d620
--- /dev/null
+++ b/Documentation/networking/ip-sysctl.txt
@@ -0,0 +1,2195 @@
+/proc/sys/net/ipv4/* Variables:
+
+ip_forward - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	Forward Packets between interfaces.
+
+	This variable is special, its change resets all configuration
+	parameters to their default state (RFC1122 for hosts, RFC1812
+	for routers)
+
+ip_default_ttl - INTEGER
+	Default value of TTL field (Time To Live) for outgoing (but not
+	forwarded) IP packets. Should be between 1 and 255 inclusive.
+	Default: 64 (as recommended by RFC1700)
+
+ip_no_pmtu_disc - INTEGER
+	Disable Path MTU Discovery. If enabled in mode 1 and a
+	fragmentation-required ICMP is received, the PMTU to this
+	destination will be set to min_pmtu (see below). You will need
+	to raise min_pmtu to the smallest interface MTU on your system
+	manually if you want to avoid locally generated fragments.
+
+	In mode 2 incoming Path MTU Discovery messages will be
+	discarded. Outgoing frames are handled the same as in mode 1,
+	implicitly setting IP_PMTUDISC_DONT on every created socket.
+
+	Mode 3 is a hardened pmtu discover mode. The kernel will only
+	accept fragmentation-needed errors if the underlying protocol
+	can verify them besides a plain socket lookup. Current
+	protocols for which pmtu events will be honored are TCP, SCTP
+	and DCCP as they verify e.g. the sequence number or the
+	association. This mode should not be enabled globally but is
+	only intended to secure e.g. name servers in namespaces where
+	TCP path mtu must still work but path MTU information of other
+	protocols should be discarded. If enabled globally this mode
+	could break other protocols.
+
+	Possible values: 0-3
+	Default: FALSE
+
+min_pmtu - INTEGER
+	default 552 - minimum discovered Path MTU
+
+ip_forward_use_pmtu - BOOLEAN
+	By default we don't trust protocol path MTUs while forwarding
+	because they could be easily forged and can lead to unwanted
+	fragmentation by the router.
+	You only need to enable this if you have user-space software
+	which tries to discover path mtus by itself and depends on the
+	kernel honoring this information. This is normally not the
+	case.
+	Default: 0 (disabled)
+	Possible values:
+	0 - disabled
+	1 - enabled
+
+fwmark_reflect - BOOLEAN
+	Controls the fwmark of kernel-generated IPv4 reply packets that are not
+	associated with a socket for example, TCP RSTs or ICMP echo replies).
+	If unset, these packets have a fwmark of zero. If set, they have the
+	fwmark of the packet they are replying to.
+	Default: 0
+
+fib_multipath_use_neigh - BOOLEAN
+	Use status of existing neighbor entry when determining nexthop for
+	multipath routes. If disabled, neighbor information is not used and
+	packets could be directed to a failed nexthop. Only valid for kernels
+	built with CONFIG_IP_ROUTE_MULTIPATH enabled.
+	Default: 0 (disabled)
+	Possible values:
+	0 - disabled
+	1 - enabled
+
+fib_multipath_hash_policy - INTEGER
+	Controls which hash policy to use for multipath routes. Only valid
+	for kernels built with CONFIG_IP_ROUTE_MULTIPATH enabled.
+	Default: 0 (Layer 3)
+	Possible values:
+	0 - Layer 3
+	1 - Layer 4
+
+ip_forward_update_priority - INTEGER
+	Whether to update SKB priority from "TOS" field in IPv4 header after it
+	is forwarded. The new SKB priority is mapped from TOS field value
+	according to an rt_tos2priority table (see e.g. man tc-prio).
+	Default: 1 (Update priority.)
+	Possible values:
+	0 - Do not update priority.
+	1 - Update priority.
+
+route/max_size - INTEGER
+	Maximum number of routes allowed in the kernel.  Increase
+	this when using large numbers of interfaces and/or routes.
+	From linux kernel 3.6 onwards, this is deprecated for ipv4
+	as route cache is no longer used.
+
+neigh/default/gc_thresh1 - INTEGER
+	Minimum number of entries to keep.  Garbage collector will not
+	purge entries if there are fewer than this number.
+	Default: 128
+
+neigh/default/gc_thresh2 - INTEGER
+	Threshold when garbage collector becomes more aggressive about
+	purging entries. Entries older than 5 seconds will be cleared
+	when over this number.
+	Default: 512
+
+neigh/default/gc_thresh3 - INTEGER
+	Maximum number of neighbor entries allowed.  Increase this
+	when using large numbers of interfaces and when communicating
+	with large numbers of directly-connected peers.
+	Default: 1024
+
+neigh/default/unres_qlen_bytes - INTEGER
+	The maximum number of bytes which may be used by packets
+	queued for each	unresolved address by other network layers.
+	(added in linux 3.3)
+	Setting negative value is meaningless and will return error.
+	Default: SK_WMEM_MAX, (same as net.core.wmem_default).
+		Exact value depends on architecture and kernel options,
+		but should be enough to allow queuing 256 packets
+		of medium size.
+
+neigh/default/unres_qlen - INTEGER
+	The maximum number of packets which may be queued for each
+	unresolved address by other network layers.
+	(deprecated in linux 3.3) : use unres_qlen_bytes instead.
+	Prior to linux 3.3, the default value is 3 which may cause
+	unexpected packet loss. The current default value is calculated
+	according to default value of unres_qlen_bytes and true size of
+	packet.
+	Default: 101
+
+mtu_expires - INTEGER
+	Time, in seconds, that cached PMTU information is kept.
+
+min_adv_mss - INTEGER
+	The advertised MSS depends on the first hop route MTU, but will
+	never be lower than this setting.
+
+IP Fragmentation:
+
+ipfrag_high_thresh - LONG INTEGER
+	Maximum memory used to reassemble IP fragments.
+
+ipfrag_low_thresh - LONG INTEGER
+	(Obsolete since linux-4.17)
+	Maximum memory used to reassemble IP fragments before the kernel
+	begins to remove incomplete fragment queues to free up resources.
+	The kernel still accepts new fragments for defragmentation.
+
+ipfrag_time - INTEGER
+	Time in seconds to keep an IP fragment in memory.
+
+ipfrag_max_dist - INTEGER
+	ipfrag_max_dist is a non-negative integer value which defines the
+	maximum "disorder" which is allowed among fragments which share a
+	common IP source address. Note that reordering of packets is
+	not unusual, but if a large number of fragments arrive from a source
+	IP address while a particular fragment queue remains incomplete, it
+	probably indicates that one or more fragments belonging to that queue
+	have been lost. When ipfrag_max_dist is positive, an additional check
+	is done on fragments before they are added to a reassembly queue - if
+	ipfrag_max_dist (or more) fragments have arrived from a particular IP
+	address between additions to any IP fragment queue using that source
+	address, it's presumed that one or more fragments in the queue are
+	lost. The existing fragment queue will be dropped, and a new one
+	started. An ipfrag_max_dist value of zero disables this check.
+
+	Using a very small value, e.g. 1 or 2, for ipfrag_max_dist can
+	result in unnecessarily dropping fragment queues when normal
+	reordering of packets occurs, which could lead to poor application
+	performance. Using a very large value, e.g. 50000, increases the
+	likelihood of incorrectly reassembling IP fragments that originate
+	from different IP datagrams, which could result in data corruption.
+	Default: 64
+
+INET peer storage:
+
+inet_peer_threshold - INTEGER
+	The approximate size of the storage.  Starting from this threshold
+	entries will be thrown aggressively.  This threshold also determines
+	entries' time-to-live and time intervals between garbage collection
+	passes.  More entries, less time-to-live, less GC interval.
+
+inet_peer_minttl - INTEGER
+	Minimum time-to-live of entries.  Should be enough to cover fragment
+	time-to-live on the reassembling side.  This minimum time-to-live  is
+	guaranteed if the pool size is less than inet_peer_threshold.
+	Measured in seconds.
+
+inet_peer_maxttl - INTEGER
+	Maximum time-to-live of entries.  Unused entries will expire after
+	this period of time if there is no memory pressure on the pool (i.e.
+	when the number of entries in the pool is very small).
+	Measured in seconds.
+
+TCP variables:
+
+somaxconn - INTEGER
+	Limit of socket listen() backlog, known in userspace as SOMAXCONN.
+	Defaults to 128.  See also tcp_max_syn_backlog for additional tuning
+	for TCP sockets.
+
+tcp_abort_on_overflow - BOOLEAN
+	If listening service is too slow to accept new connections,
+	reset them. Default state is FALSE. It means that if overflow
+	occurred due to a burst, connection will recover. Enable this
+	option _only_ if you are really sure that listening daemon
+	cannot be tuned to accept connections faster. Enabling this
+	option can harm clients of your server.
+
+tcp_adv_win_scale - INTEGER
+	Count buffering overhead as bytes/2^tcp_adv_win_scale
+	(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
+	if it is <= 0.
+	Possible values are [-31, 31], inclusive.
+	Default: 1
+
+tcp_allowed_congestion_control - STRING
+	Show/set the congestion control choices available to non-privileged
+	processes. The list is a subset of those listed in
+	tcp_available_congestion_control.
+	Default is "reno" and the default setting (tcp_congestion_control).
+
+tcp_app_win - INTEGER
+	Reserve max(window/2^tcp_app_win, mss) of window for application
+	buffer. Value 0 is special, it means that nothing is reserved.
+	Default: 31
+
+tcp_autocorking - BOOLEAN
+	Enable TCP auto corking :
+	When applications do consecutive small write()/sendmsg() system calls,
+	we try to coalesce these small writes as much as possible, to lower
+	total amount of sent packets. This is done if at least one prior
+	packet for the flow is waiting in Qdisc queues or device transmit
+	queue. Applications can still use TCP_CORK for optimal behavior
+	when they know how/when to uncork their sockets.
+	Default : 1
+
+tcp_available_congestion_control - STRING
+	Shows the available congestion control choices that are registered.
+	More congestion control algorithms may be available as modules,
+	but not loaded.
+
+tcp_base_mss - INTEGER
+	The initial value of search_low to be used by the packetization layer
+	Path MTU discovery (MTU probing).  If MTU probing is enabled,
+	this is the initial MSS used by the connection.
+
+tcp_min_snd_mss - INTEGER
+	TCP SYN and SYNACK messages usually advertise an ADVMSS option,
+	as described in RFC 1122 and RFC 6691.
+	If this ADVMSS option is smaller than tcp_min_snd_mss,
+	it is silently capped to tcp_min_snd_mss.
+
+	Default : 48 (at least 8 bytes of payload per segment)
+
+tcp_congestion_control - STRING
+	Set the congestion control algorithm to be used for new
+	connections. The algorithm "reno" is always available, but
+	additional choices may be available based on kernel configuration.
+	Default is set as part of kernel configuration.
+	For passive connections, the listener congestion control choice
+	is inherited.
+	[see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
+
+tcp_dsack - BOOLEAN
+	Allows TCP to send "duplicate" SACKs.
+
+tcp_early_retrans - INTEGER
+	Tail loss probe (TLP) converts RTOs occurring due to tail
+	losses into fast recovery (draft-ietf-tcpm-rack). Note that
+	TLP requires RACK to function properly (see tcp_recovery below)
+	Possible values:
+		0 disables TLP
+		3 or 4 enables TLP
+	Default: 3
+
+tcp_ecn - INTEGER
+	Control use of Explicit Congestion Notification (ECN) by TCP.
+	ECN is used only when both ends of the TCP connection indicate
+	support for it.  This feature is useful in avoiding losses due
+	to congestion by allowing supporting routers to signal
+	congestion before having to drop packets.
+	Possible values are:
+		0 Disable ECN.  Neither initiate nor accept ECN.
+		1 Enable ECN when requested by incoming connections and
+		  also request ECN on outgoing connection attempts.
+		2 Enable ECN when requested by incoming connections
+		  but do not request ECN on outgoing connections.
+	Default: 2
+
+tcp_ecn_fallback - BOOLEAN
+	If the kernel detects that ECN connection misbehaves, enable fall
+	back to non-ECN. Currently, this knob implements the fallback
+	from RFC3168, section 6.1.1.1., but we reserve that in future,
+	additional detection mechanisms could be implemented under this
+	knob. The value	is not used, if tcp_ecn or per route (or congestion
+	control) ECN settings are disabled.
+	Default: 1 (fallback enabled)
+
+tcp_fack - BOOLEAN
+	This is a legacy option, it has no effect anymore.
+
+tcp_fin_timeout - INTEGER
+	The length of time an orphaned (no longer referenced by any
+	application) connection will remain in the FIN_WAIT_2 state
+	before it is aborted at the local end.  While a perfectly
+	valid "receive only" state for an un-orphaned connection, an
+	orphaned connection in FIN_WAIT_2 state could otherwise wait
+	forever for the remote to close its end of the connection.
+	Cf. tcp_max_orphans
+	Default: 60 seconds
+
+tcp_frto - INTEGER
+	Enables Forward RTO-Recovery (F-RTO) defined in RFC5682.
+	F-RTO is an enhanced recovery algorithm for TCP retransmission
+	timeouts.  It is particularly beneficial in networks where the
+	RTT fluctuates (e.g., wireless). F-RTO is sender-side only
+	modification. It does not require any support from the peer.
+
+	By default it's enabled with a non-zero value. 0 disables F-RTO.
+
+tcp_invalid_ratelimit - INTEGER
+	Limit the maximal rate for sending duplicate acknowledgments
+	in response to incoming TCP packets that are for an existing
+	connection but that are invalid due to any of these reasons:
+
+	  (a) out-of-window sequence number,
+	  (b) out-of-window acknowledgment number, or
+	  (c) PAWS (Protection Against Wrapped Sequence numbers) check failure
+
+	This can help mitigate simple "ack loop" DoS attacks, wherein
+	a buggy or malicious middlebox or man-in-the-middle can
+	rewrite TCP header fields in manner that causes each endpoint
+	to think that the other is sending invalid TCP segments, thus
+	causing each side to send an unterminating stream of duplicate
+	acknowledgments for invalid segments.
+
+	Using 0 disables rate-limiting of dupacks in response to
+	invalid segments; otherwise this value specifies the minimal
+	space between sending such dupacks, in milliseconds.
+
+	Default: 500 (milliseconds).
+
+tcp_keepalive_time - INTEGER
+	How often TCP sends out keepalive messages when keepalive is enabled.
+	Default: 2hours.
+
+tcp_keepalive_probes - INTEGER
+	How many keepalive probes TCP sends out, until it decides that the
+	connection is broken. Default value: 9.
+
+tcp_keepalive_intvl - INTEGER
+	How frequently the probes are send out. Multiplied by
+	tcp_keepalive_probes it is time to kill not responding connection,
+	after probes started. Default value: 75sec i.e. connection
+	will be aborted after ~11 minutes of retries.
+
+tcp_l3mdev_accept - BOOLEAN
+	Enables child sockets to inherit the L3 master device index.
+	Enabling this option allows a "global" listen socket to work
+	across L3 master domains (e.g., VRFs) with connected sockets
+	derived from the listen socket to be bound to the L3 domain in
+	which the packets originated. Only valid when the kernel was
+	compiled with CONFIG_NET_L3_MASTER_DEV.
+
+tcp_low_latency - BOOLEAN
+	This is a legacy option, it has no effect anymore.
+
+tcp_max_orphans - INTEGER
+	Maximal number of TCP sockets not attached to any user file handle,
+	held by system.	If this number is exceeded orphaned connections are
+	reset immediately and warning is printed. This limit exists
+	only to prevent simple DoS attacks, you _must_ not rely on this
+	or lower the limit artificially, but rather increase it
+	(probably, after increasing installed memory),
+	if network conditions require more than default value,
+	and tune network services to linger and kill such states
+	more aggressively. Let me to remind again: each orphan eats
+	up to ~64K of unswappable memory.
+
+tcp_max_syn_backlog - INTEGER
+	Maximal number of remembered connection requests, which have not
+	received an acknowledgment from connecting client.
+	The minimal value is 128 for low memory machines, and it will
+	increase in proportion to the memory of machine.
+	If server suffers from overload, try increasing this number.
+
+tcp_max_tw_buckets - INTEGER
+	Maximal number of timewait sockets held by system simultaneously.
+	If this number is exceeded time-wait socket is immediately destroyed
+	and warning is printed. This limit exists only to prevent
+	simple DoS attacks, you _must_ not lower the limit artificially,
+	but rather increase it (probably, after increasing installed memory),
+	if network conditions require more than default value.
+
+tcp_mem - vector of 3 INTEGERs: min, pressure, max
+	min: below this number of pages TCP is not bothered about its
+	memory appetite.
+
+	pressure: when amount of memory allocated by TCP exceeds this number
+	of pages, TCP moderates its memory consumption and enters memory
+	pressure mode, which is exited when memory consumption falls
+	under "min".
+
+	max: number of pages allowed for queueing by all TCP sockets.
+
+	Defaults are calculated at boot time from amount of available
+	memory.
+
+tcp_min_rtt_wlen - INTEGER
+	The window length of the windowed min filter to track the minimum RTT.
+	A shorter window lets a flow more quickly pick up new (higher)
+	minimum RTT when it is moved to a longer path (e.g., due to traffic
+	engineering). A longer window makes the filter more resistant to RTT
+	inflations such as transient congestion. The unit is seconds.
+	Possible values: 0 - 86400 (1 day)
+	Default: 300
+
+tcp_moderate_rcvbuf - BOOLEAN
+	If set, TCP performs receive buffer auto-tuning, attempting to
+	automatically size the buffer (no greater than tcp_rmem[2]) to
+	match the size required by the path for full throughput.  Enabled by
+	default.
+
+tcp_mtu_probing - INTEGER
+	Controls TCP Packetization-Layer Path MTU Discovery.  Takes three
+	values:
+	  0 - Disabled
+	  1 - Disabled by default, enabled when an ICMP black hole detected
+	  2 - Always enabled, use initial MSS of tcp_base_mss.
+
+tcp_probe_interval - UNSIGNED INTEGER
+	Controls how often to start TCP Packetization-Layer Path MTU
+	Discovery reprobe. The default is reprobing every 10 minutes as
+	per RFC4821.
+
+tcp_probe_threshold - INTEGER
+	Controls when TCP Packetization-Layer Path MTU Discovery probing
+	will stop in respect to the width of search range in bytes. Default
+	is 8 bytes.
+
+tcp_no_metrics_save - BOOLEAN
+	By default, TCP saves various connection metrics in the route cache
+	when the connection closes, so that connections established in the
+	near future can use these to set initial conditions.  Usually, this
+	increases overall performance, but may sometimes cause performance
+	degradation.  If set, TCP will not cache metrics on closing
+	connections.
+
+tcp_orphan_retries - INTEGER
+	This value influences the timeout of a locally closed TCP connection,
+	when RTO retransmissions remain unacknowledged.
+	See tcp_retries2 for more details.
+
+	The default value is 8.
+	If your machine is a loaded WEB server,
+	you should think about lowering this value, such sockets
+	may consume significant resources. Cf. tcp_max_orphans.
+
+tcp_recovery - INTEGER
+	This value is a bitmap to enable various experimental loss recovery
+	features.
+
+	RACK: 0x1 enables the RACK loss detection for fast detection of lost
+	      retransmissions and tail drops. It also subsumes and disables
+	      RFC6675 recovery for SACK connections.
+	RACK: 0x2 makes RACK's reordering window static (min_rtt/4).
+	RACK: 0x4 disables RACK's DUPACK threshold heuristic
+
+	Default: 0x1
+
+tcp_reordering - INTEGER
+	Initial reordering level of packets in a TCP stream.
+	TCP stack can then dynamically adjust flow reordering level
+	between this initial value and tcp_max_reordering
+	Default: 3
+
+tcp_max_reordering - INTEGER
+	Maximal reordering level of packets in a TCP stream.
+	300 is a fairly conservative value, but you might increase it
+	if paths are using per packet load balancing (like bonding rr mode)
+	Default: 300
+
+tcp_retrans_collapse - BOOLEAN
+	Bug-to-bug compatibility with some broken printers.
+	On retransmit try to send bigger packets to work around bugs in
+	certain TCP stacks.
+
+tcp_retries1 - INTEGER
+	This value influences the time, after which TCP decides, that
+	something is wrong due to unacknowledged RTO retransmissions,
+	and reports this suspicion to the network layer.
+	See tcp_retries2 for more details.
+
+	RFC 1122 recommends at least 3 retransmissions, which is the
+	default.
+
+tcp_retries2 - INTEGER
+	This value influences the timeout of an alive TCP connection,
+	when RTO retransmissions remain unacknowledged.
+	Given a value of N, a hypothetical TCP connection following
+	exponential backoff with an initial RTO of TCP_RTO_MIN would
+	retransmit N times before killing the connection at the (N+1)th RTO.
+
+	The default value of 15 yields a hypothetical timeout of 924.6
+	seconds and is a lower bound for the effective timeout.
+	TCP will effectively time out at the first RTO which exceeds the
+	hypothetical timeout.
+
+	RFC 1122 recommends at least 100 seconds for the timeout,
+	which corresponds to a value of at least 8.
+
+tcp_rfc1337 - BOOLEAN
+	If set, the TCP stack behaves conforming to RFC1337. If unset,
+	we are not conforming to RFC, but prevent TCP TIME_WAIT
+	assassination.
+	Default: 0
+
+tcp_rmem - vector of 3 INTEGERs: min, default, max
+	min: Minimal size of receive buffer used by TCP sockets.
+	It is guaranteed to each TCP socket, even under moderate memory
+	pressure.
+	Default: 4K
+
+	default: initial size of receive buffer used by TCP sockets.
+	This value overrides net.core.rmem_default used by other protocols.
+	Default: 87380 bytes. This value results in window of 65535 with
+	default setting of tcp_adv_win_scale and tcp_app_win:0 and a bit
+	less for default tcp_app_win. See below about these variables.
+
+	max: maximal size of receive buffer allowed for automatically
+	selected receiver buffers for TCP socket. This value does not override
+	net.core.rmem_max.  Calling setsockopt() with SO_RCVBUF disables
+	automatic tuning of that socket's receive buffer size, in which
+	case this value is ignored.
+	Default: between 87380B and 6MB, depending on RAM size.
+
+tcp_sack - BOOLEAN
+	Enable select acknowledgments (SACKS).
+
+tcp_comp_sack_delay_ns - LONG INTEGER
+	TCP tries to reduce number of SACK sent, using a timer
+	based on 5% of SRTT, capped by this sysctl, in nano seconds.
+	The default is 1ms, based on TSO autosizing period.
+
+	Default : 1,000,000 ns (1 ms)
+
+tcp_comp_sack_nr - INTEGER
+	Max numer of SACK that can be compressed.
+	Using 0 disables SACK compression.
+
+	Detault : 44
+
+tcp_slow_start_after_idle - BOOLEAN
+	If set, provide RFC2861 behavior and time out the congestion
+	window after an idle period.  An idle period is defined at
+	the current RTO.  If unset, the congestion window will not
+	be timed out after an idle period.
+	Default: 1
+
+tcp_stdurg - BOOLEAN
+	Use the Host requirements interpretation of the TCP urgent pointer field.
+	Most hosts use the older BSD interpretation, so if you turn this on
+	Linux might not communicate correctly with them.
+	Default: FALSE
+
+tcp_synack_retries - INTEGER
+	Number of times SYNACKs for a passive TCP connection attempt will
+	be retransmitted. Should not be higher than 255. Default value
+	is 5, which corresponds to 31seconds till the last retransmission
+	with the current initial RTO of 1second. With this the final timeout
+	for a passive TCP connection will happen after 63seconds.
+
+tcp_syncookies - BOOLEAN
+	Only valid when the kernel was compiled with CONFIG_SYN_COOKIES
+	Send out syncookies when the syn backlog queue of a socket
+	overflows. This is to prevent against the common 'SYN flood attack'
+	Default: 1
+
+	Note, that syncookies is fallback facility.
+	It MUST NOT be used to help highly loaded servers to stand
+	against legal connection rate. If you see SYN flood warnings
+	in your logs, but investigation	shows that they occur
+	because of overload with legal connections, you should tune
+	another parameters until this warning disappear.
+	See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.
+
+	syncookies seriously violate TCP protocol, do not allow
+	to use TCP extensions, can result in serious degradation
+	of some services (f.e. SMTP relaying), visible not by you,
+	but your clients and relays, contacting you. While you see
+	SYN flood warnings in logs not being really flooded, your server
+	is seriously misconfigured.
+
+	If you want to test which effects syncookies have to your
+	network connections you can set this knob to 2 to enable
+	unconditionally generation of syncookies.
+
+tcp_fastopen - INTEGER
+	Enable TCP Fast Open (RFC7413) to send and accept data in the opening
+	SYN packet.
+
+	The client support is enabled by flag 0x1 (on by default). The client
+	then must use sendmsg() or sendto() with the MSG_FASTOPEN flag,
+	rather than connect() to send data in SYN.
+
+	The server support is enabled by flag 0x2 (off by default). Then
+	either enable for all listeners with another flag (0x400) or
+	enable individual listeners via TCP_FASTOPEN socket option with
+	the option value being the length of the syn-data backlog.
+
+	The values (bitmap) are
+	  0x1: (client) enables sending data in the opening SYN on the client.
+	  0x2: (server) enables the server support, i.e., allowing data in
+			a SYN packet to be accepted and passed to the
+			application before 3-way handshake finishes.
+	  0x4: (client) send data in the opening SYN regardless of cookie
+			availability and without a cookie option.
+	0x200: (server) accept data-in-SYN w/o any cookie option present.
+	0x400: (server) enable all listeners to support Fast Open by
+			default without explicit TCP_FASTOPEN socket option.
+
+	Default: 0x1
+
+	Note that that additional client or server features are only
+	effective if the basic support (0x1 and 0x2) are enabled respectively.
+
+tcp_fastopen_blackhole_timeout_sec - INTEGER
+	Initial time period in second to disable Fastopen on active TCP sockets
+	when a TFO firewall blackhole issue happens.
+	This time period will grow exponentially when more blackhole issues
+	get detected right after Fastopen is re-enabled and will reset to
+	initial value when the blackhole issue goes away.
+	0 to disable the blackhole detection.
+	By default, it is set to 1hr.
+
+tcp_syn_retries - INTEGER
+	Number of times initial SYNs for an active TCP connection attempt
+	will be retransmitted. Should not be higher than 127. Default value
+	is 6, which corresponds to 63seconds till the last retransmission
+	with the current initial RTO of 1second. With this the final timeout
+	for an active TCP connection attempt will happen after 127seconds.
+
+tcp_timestamps - INTEGER
+Enable timestamps as defined in RFC1323.
+	0: Disabled.
+	1: Enable timestamps as defined in RFC1323 and use random offset for
+	each connection rather than only using the current time.
+	2: Like 1, but without random offsets.
+	Default: 1
+
+tcp_min_tso_segs - INTEGER
+	Minimal number of segments per TSO frame.
+	Since linux-3.12, TCP does an automatic sizing of TSO frames,
+	depending on flow rate, instead of filling 64Kbytes packets.
+	For specific usages, it's possible to force TCP to build big
+	TSO frames. Note that TCP stack might split too big TSO packets
+	if available window is too small.
+	Default: 2
+
+tcp_pacing_ss_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in slow start, tcp_pacing_ss_ratio is applied
+	to let TCP probe for bigger speeds, assuming cwnd can be
+	doubled every other RTT.
+	Default: 200
+
+tcp_pacing_ca_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in congestion avoidance phase, tcp_pacing_ca_ratio
+	is applied to conservatively probe for bigger throughput.
+	Default: 120
+
+tcp_tso_win_divisor - INTEGER
+	This allows control over what percentage of the congestion window
+	can be consumed by a single TSO frame.
+	The setting of this parameter is a choice between burstiness and
+	building larger TSO frames.
+	Default: 3
+
+tcp_tw_reuse - INTEGER
+	Enable reuse of TIME-WAIT sockets for new connections when it is
+	safe from protocol viewpoint.
+	0 - disable
+	1 - global enable
+	2 - enable for loopback traffic only
+	It should not be changed without advice/request of technical
+	experts.
+	Default: 2
+
+tcp_window_scaling - BOOLEAN
+	Enable window scaling as defined in RFC1323.
+
+tcp_wmem - vector of 3 INTEGERs: min, default, max
+	min: Amount of memory reserved for send buffers for TCP sockets.
+	Each TCP socket has rights to use it due to fact of its birth.
+	Default: 4K
+
+	default: initial size of send buffer used by TCP sockets.  This
+	value overrides net.core.wmem_default used by other protocols.
+	It is usually lower than net.core.wmem_default.
+	Default: 16K
+
+	max: Maximal amount of memory allowed for automatically tuned
+	send buffers for TCP sockets. This value does not override
+	net.core.wmem_max.  Calling setsockopt() with SO_SNDBUF disables
+	automatic tuning of that socket's send buffer size, in which case
+	this value is ignored.
+	Default: between 64K and 4MB, depending on RAM size.
+
+tcp_notsent_lowat - UNSIGNED INTEGER
+	A TCP socket can control the amount of unsent bytes in its write queue,
+	thanks to TCP_NOTSENT_LOWAT socket option. poll()/select()/epoll()
+	reports POLLOUT events if the amount of unsent bytes is below a per
+	socket value, and if the write queue is not full. sendmsg() will
+	also not add new buffers if the limit is hit.
+
+	This global variable controls the amount of unsent data for
+	sockets not using TCP_NOTSENT_LOWAT. For these sockets, a change
+	to the global variable has immediate effect.
+
+	Default: UINT_MAX (0xFFFFFFFF)
+
+tcp_workaround_signed_windows - BOOLEAN
+	If set, assume no receipt of a window scaling option means the
+	remote TCP is broken and treats the window as a signed quantity.
+	If unset, assume the remote TCP is not broken even if we do
+	not receive a window scaling option from them.
+	Default: 0
+
+tcp_thin_linear_timeouts - BOOLEAN
+	Enable dynamic triggering of linear timeouts for thin streams.
+	If set, a check is performed upon retransmission by timeout to
+	determine if the stream is thin (less than 4 packets in flight).
+	As long as the stream is found to be thin, up to 6 linear
+	timeouts may be performed before exponential backoff mode is
+	initiated. This improves retransmission latency for
+	non-aggressive thin streams, often found to be time-dependent.
+	For more information on thin streams, see
+	Documentation/networking/tcp-thin.txt
+	Default: 0
+
+tcp_limit_output_bytes - INTEGER
+	Controls TCP Small Queue limit per tcp socket.
+	TCP bulk sender tends to increase packets in flight until it
+	gets losses notifications. With SNDBUF autotuning, this can
+	result in a large amount of packets queued on the local machine
+	(e.g.: qdiscs, CPU backlog, or device) hurting latency of other
+	flows, for typical pfifo_fast qdiscs.  tcp_limit_output_bytes
+	limits the number of bytes on qdisc or device to reduce artificial
+	RTT/cwnd and reduce bufferbloat.
+	Default: 262144
+
+tcp_challenge_ack_limit - INTEGER
+	Limits number of Challenge ACK sent per second, as recommended
+	in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks)
+	Default: 100
+
+UDP variables:
+
+udp_l3mdev_accept - BOOLEAN
+	Enabling this option allows a "global" bound socket to work
+	across L3 master domains (e.g., VRFs) with packets capable of
+	being received regardless of the L3 domain in which they
+	originated. Only valid when the kernel was compiled with
+	CONFIG_NET_L3_MASTER_DEV.
+
+udp_mem - vector of 3 INTEGERs: min, pressure, max
+	Number of pages allowed for queueing by all UDP sockets.
+
+	min: Below this number of pages UDP is not bothered about its
+	memory appetite. When amount of memory allocated by UDP exceeds
+	this number, UDP starts to moderate memory usage.
+
+	pressure: This value was introduced to follow format of tcp_mem.
+
+	max: Number of pages allowed for queueing by all UDP sockets.
+
+	Default is calculated at boot time from amount of available memory.
+
+udp_rmem_min - INTEGER
+	Minimal size of receive buffer used by UDP sockets in moderation.
+	Each UDP socket is able to use the size for receiving data, even if
+	total pages of UDP sockets exceed udp_mem pressure. The unit is byte.
+	Default: 4K
+
+udp_wmem_min - INTEGER
+	Minimal size of send buffer used by UDP sockets in moderation.
+	Each UDP socket is able to use the size for sending data, even if
+	total pages of UDP sockets exceed udp_mem pressure. The unit is byte.
+	Default: 4K
+
+CIPSOv4 Variables:
+
+cipso_cache_enable - BOOLEAN
+	If set, enable additions to and lookups from the CIPSO label mapping
+	cache.  If unset, additions are ignored and lookups always result in a
+	miss.  However, regardless of the setting the cache is still
+	invalidated when required when means you can safely toggle this on and
+	off and the cache will always be "safe".
+	Default: 1
+
+cipso_cache_bucket_size - INTEGER
+	The CIPSO label cache consists of a fixed size hash table with each
+	hash bucket containing a number of cache entries.  This variable limits
+	the number of entries in each hash bucket; the larger the value the
+	more CIPSO label mappings that can be cached.  When the number of
+	entries in a given hash bucket reaches this limit adding new entries
+	causes the oldest entry in the bucket to be removed to make room.
+	Default: 10
+
+cipso_rbm_optfmt - BOOLEAN
+	Enable the "Optimized Tag 1 Format" as defined in section 3.4.2.6 of
+	the CIPSO draft specification (see Documentation/netlabel for details).
+	This means that when set the CIPSO tag will be padded with empty
+	categories in order to make the packet data 32-bit aligned.
+	Default: 0
+
+cipso_rbm_structvalid - BOOLEAN
+	If set, do a very strict check of the CIPSO option when
+	ip_options_compile() is called.  If unset, relax the checks done during
+	ip_options_compile().  Either way is "safe" as errors are caught else
+	where in the CIPSO processing code but setting this to 0 (False) should
+	result in less work (i.e. it should be faster) but could cause problems
+	with other implementations that require strict checking.
+	Default: 0
+
+IP Variables:
+
+ip_local_port_range - 2 INTEGERS
+	Defines the local port range that is used by TCP and UDP to
+	choose the local port. The first number is the first, the
+	second the last local port number.
+	If possible, it is better these numbers have different parity.
+	(one even and one odd values)
+	The default values are 32768 and 60999 respectively.
+
+ip_local_reserved_ports - list of comma separated ranges
+	Specify the ports which are reserved for known third-party
+	applications. These ports will not be used by automatic port
+	assignments (e.g. when calling connect() or bind() with port
+	number 0). Explicit port allocation behavior is unchanged.
+
+	The format used for both input and output is a comma separated
+	list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and
+	10). Writing to the file will clear all previously reserved
+	ports and update the current list with the one given in the
+	input.
+
+	Note that ip_local_port_range and ip_local_reserved_ports
+	settings are independent and both are considered by the kernel
+	when determining which ports are available for automatic port
+	assignments.
+
+	You can reserve ports which are not in the current
+	ip_local_port_range, e.g.:
+
+	$ cat /proc/sys/net/ipv4/ip_local_port_range
+	32000	60999
+	$ cat /proc/sys/net/ipv4/ip_local_reserved_ports
+	8080,9148
+
+	although this is redundant. However such a setting is useful
+	if later the port range is changed to a value that will
+	include the reserved ports.
+
+	Default: Empty
+
+ip_unprivileged_port_start - INTEGER
+	This is a per-namespace sysctl.  It defines the first
+	unprivileged port in the network namespace.  Privileged ports
+	require root or CAP_NET_BIND_SERVICE in order to bind to them.
+	To disable all privileged ports, set this to 0.  It may not
+	overlap with the ip_local_reserved_ports range.
+
+	Default: 1024
+
+ip_nonlocal_bind - BOOLEAN
+	If set, allows processes to bind() to non-local IP addresses,
+	which can be quite useful - but may break some applications.
+	Default: 0
+
+ip_dynaddr - BOOLEAN
+	If set non-zero, enables support for dynamic addresses.
+	If set to a non-zero value larger than 1, a kernel log
+	message will be printed when dynamic address rewriting
+	occurs.
+	Default: 0
+
+ip_early_demux - BOOLEAN
+	Optimize input packet processing down to one demux for
+	certain kinds of local sockets.  Currently we only do this
+	for established TCP and connected UDP sockets.
+
+	It may add an additional cost for pure routing workloads that
+	reduces overall throughput, in such case you should disable it.
+	Default: 1
+
+tcp_early_demux - BOOLEAN
+	Enable early demux for established TCP sockets.
+	Default: 1
+
+udp_early_demux - BOOLEAN
+	Enable early demux for connected UDP sockets. Disable this if
+	your system could experience more unconnected load.
+	Default: 1
+
+icmp_echo_ignore_all - BOOLEAN
+	If set non-zero, then the kernel will ignore all ICMP ECHO
+	requests sent to it.
+	Default: 0
+
+icmp_echo_ignore_broadcasts - BOOLEAN
+	If set non-zero, then the kernel will ignore all ICMP ECHO and
+	TIMESTAMP requests sent to it via broadcast/multicast.
+	Default: 1
+
+icmp_ratelimit - INTEGER
+	Limit the maximal rates for sending ICMP packets whose type matches
+	icmp_ratemask (see below) to specific targets.
+	0 to disable any limiting,
+	otherwise the minimal space between responses in milliseconds.
+	Note that another sysctl, icmp_msgs_per_sec limits the number
+	of ICMP packets	sent on all targets.
+	Default: 1000
+
+icmp_msgs_per_sec - INTEGER
+	Limit maximal number of ICMP packets sent per second from this host.
+	Only messages whose type matches icmp_ratemask (see below) are
+	controlled by this limit. For security reasons, the precise count
+	of messages per second is randomized.
+	Default: 1000
+
+icmp_msgs_burst - INTEGER
+	icmp_msgs_per_sec controls number of ICMP packets sent per second,
+	while icmp_msgs_burst controls the burst size of these packets.
+	For security reasons, the precise burst size is randomized.
+	Default: 50
+
+icmp_ratemask - INTEGER
+	Mask made of ICMP types for which rates are being limited.
+	Significant bits: IHGFEDCBA9876543210
+	Default mask:     0000001100000011000 (6168)
+
+	Bit definitions (see include/linux/icmp.h):
+		0 Echo Reply
+		3 Destination Unreachable *
+		4 Source Quench *
+		5 Redirect
+		8 Echo Request
+		B Time Exceeded *
+		C Parameter Problem *
+		D Timestamp Request
+		E Timestamp Reply
+		F Info Request
+		G Info Reply
+		H Address Mask Request
+		I Address Mask Reply
+
+	* These are rate limited by default (see default mask above)
+
+icmp_ignore_bogus_error_responses - BOOLEAN
+	Some routers violate RFC1122 by sending bogus responses to broadcast
+	frames.  Such violations are normally logged via a kernel warning.
+	If this is set to TRUE, the kernel will not give such warnings, which
+	will avoid log file clutter.
+	Default: 1
+
+icmp_errors_use_inbound_ifaddr - BOOLEAN
+
+	If zero, icmp error messages are sent with the primary address of
+	the exiting interface.
+
+	If non-zero, the message will be sent with the primary address of
+	the interface that received the packet that caused the icmp error.
+	This is the behaviour network many administrators will expect from
+	a router. And it can make debugging complicated network layouts
+	much easier.
+
+	Note that if no primary address exists for the interface selected,
+	then the primary address of the first non-loopback interface that
+	has one will be used regardless of this setting.
+
+	Default: 0
+
+igmp_max_memberships - INTEGER
+	Change the maximum number of multicast groups we can subscribe to.
+	Default: 20
+
+	Theoretical maximum value is bounded by having to send a membership
+	report in a single datagram (i.e. the report can't span multiple
+	datagrams, or risk confusing the switch and leaving groups you don't
+	intend to).
+
+	The number of supported groups 'M' is bounded by the number of group
+	report entries you can fit into a single datagram of 65535 bytes.
+
+	M = 65536-sizeof (ip header)/(sizeof(Group record))
+
+	Group records are variable length, with a minimum of 12 bytes.
+	So net.ipv4.igmp_max_memberships should not be set higher than:
+
+	(65536-24) / 12 = 5459
+
+	The value 5459 assumes no IP header options, so in practice
+	this number may be lower.
+
+igmp_max_msf - INTEGER
+	Maximum number of addresses allowed in the source filter list for a
+	multicast group.
+	Default: 10
+
+igmp_qrv - INTEGER
+	Controls the IGMP query robustness variable (see RFC2236 8.1).
+	Default: 2 (as specified by RFC2236 8.1)
+	Minimum: 1 (as specified by RFC6636 4.5)
+
+force_igmp_version - INTEGER
+	0 - (default) No enforcement of a IGMP version, IGMPv1/v2 fallback
+	    allowed. Will back to IGMPv3 mode again if all IGMPv1/v2 Querier
+	    Present timer expires.
+	1 - Enforce to use IGMP version 1. Will also reply IGMPv1 report if
+	    receive IGMPv2/v3 query.
+	2 - Enforce to use IGMP version 2. Will fallback to IGMPv1 if receive
+	    IGMPv1 query message. Will reply report if receive IGMPv3 query.
+	3 - Enforce to use IGMP version 3. The same react with default 0.
+
+	Note: this is not the same with force_mld_version because IGMPv3 RFC3376
+	Security Considerations does not have clear description that we could
+	ignore other version messages completely as MLDv2 RFC3810. So make
+	this value as default 0 is recommended.
+
+conf/interface/*  changes special settings per interface (where
+"interface" is the name of your network interface)
+
+conf/all/*	  is special, changes the settings for all interfaces
+
+log_martians - BOOLEAN
+	Log packets with impossible addresses to kernel log.
+	log_martians for the interface will be enabled if at least one of
+	conf/{all,interface}/log_martians is set to TRUE,
+	it will be disabled otherwise
+
+accept_redirects - BOOLEAN
+	Accept ICMP redirect messages.
+	accept_redirects for the interface will be enabled if:
+	- both conf/{all,interface}/accept_redirects are TRUE in the case
+	  forwarding for the interface is enabled
+	or
+	- at least one of conf/{all,interface}/accept_redirects is TRUE in the
+	  case forwarding for the interface is disabled
+	accept_redirects for the interface will be disabled otherwise
+	default TRUE (host)
+		FALSE (router)
+
+forwarding - BOOLEAN
+	Enable IP forwarding on this interface.  This controls whether packets
+	received _on_ this interface can be forwarded.
+
+mc_forwarding - BOOLEAN
+	Do multicast routing. The kernel needs to be compiled with CONFIG_MROUTE
+	and a multicast routing daemon is required.
+	conf/all/mc_forwarding must also be set to TRUE to enable multicast
+	routing	for the interface
+
+medium_id - INTEGER
+	Integer value used to differentiate the devices by the medium they
+	are attached to. Two devices can have different id values when
+	the broadcast packets are received only on one of them.
+	The default value 0 means that the device is the only interface
+	to its medium, value of -1 means that medium is not known.
+
+	Currently, it is used to change the proxy_arp behavior:
+	the proxy_arp feature is enabled for packets forwarded between
+	two devices attached to different media.
+
+proxy_arp - BOOLEAN
+	Do proxy arp.
+	proxy_arp for the interface will be enabled if at least one of
+	conf/{all,interface}/proxy_arp is set to TRUE,
+	it will be disabled otherwise
+
+proxy_arp_pvlan - BOOLEAN
+	Private VLAN proxy arp.
+	Basically allow proxy arp replies back to the same interface
+	(from which the ARP request/solicitation was received).
+
+	This is done to support (ethernet) switch features, like RFC
+	3069, where the individual ports are NOT allowed to
+	communicate with each other, but they are allowed to talk to
+	the upstream router.  As described in RFC 3069, it is possible
+	to allow these hosts to communicate through the upstream
+	router by proxy_arp'ing. Don't need to be used together with
+	proxy_arp.
+
+	This technology is known by different names:
+	  In RFC 3069 it is called VLAN Aggregation.
+	  Cisco and Allied Telesyn call it Private VLAN.
+	  Hewlett-Packard call it Source-Port filtering or port-isolation.
+	  Ericsson call it MAC-Forced Forwarding (RFC Draft).
+
+shared_media - BOOLEAN
+	Send(router) or accept(host) RFC1620 shared media redirects.
+	Overrides secure_redirects.
+	shared_media for the interface will be enabled if at least one of
+	conf/{all,interface}/shared_media is set to TRUE,
+	it will be disabled otherwise
+	default TRUE
+
+secure_redirects - BOOLEAN
+	Accept ICMP redirect messages only to gateways listed in the
+	interface's current gateway list. Even if disabled, RFC1122 redirect
+	rules still apply.
+	Overridden by shared_media.
+	secure_redirects for the interface will be enabled if at least one of
+	conf/{all,interface}/secure_redirects is set to TRUE,
+	it will be disabled otherwise
+	default TRUE
+
+send_redirects - BOOLEAN
+	Send redirects, if router.
+	send_redirects for the interface will be enabled if at least one of
+	conf/{all,interface}/send_redirects is set to TRUE,
+	it will be disabled otherwise
+	Default: TRUE
+
+bootp_relay - BOOLEAN
+	Accept packets with source address 0.b.c.d destined
+	not to this host as local ones. It is supposed, that
+	BOOTP relay daemon will catch and forward such packets.
+	conf/all/bootp_relay must also be set to TRUE to enable BOOTP relay
+	for the interface
+	default FALSE
+	Not Implemented Yet.
+
+accept_source_route - BOOLEAN
+	Accept packets with SRR option.
+	conf/all/accept_source_route must also be set to TRUE to accept packets
+	with SRR option on the interface
+	default TRUE (router)
+		FALSE (host)
+
+accept_local - BOOLEAN
+	Accept packets with local source addresses. In combination with
+	suitable routing, this can be used to direct packets between two
+	local interfaces over the wire and have them accepted properly.
+	default FALSE
+
+route_localnet - BOOLEAN
+	Do not consider loopback addresses as martian source or destination
+	while routing. This enables the use of 127/8 for local routing purposes.
+	default FALSE
+
+rp_filter - INTEGER
+	0 - No source validation.
+	1 - Strict mode as defined in RFC3704 Strict Reverse Path
+	    Each incoming packet is tested against the FIB and if the interface
+	    is not the best reverse path the packet check will fail.
+	    By default failed packets are discarded.
+	2 - Loose mode as defined in RFC3704 Loose Reverse Path
+	    Each incoming packet's source address is also tested against the FIB
+	    and if the source address is not reachable via any interface
+	    the packet check will fail.
+
+	Current recommended practice in RFC3704 is to enable strict mode
+	to prevent IP spoofing from DDos attacks. If using asymmetric routing
+	or other complicated routing, then loose mode is recommended.
+
+	The max value from conf/{all,interface}/rp_filter is used
+	when doing source validation on the {interface}.
+
+	Default value is 0. Note that some distributions enable it
+	in startup scripts.
+
+arp_filter - BOOLEAN
+	1 - Allows you to have multiple network interfaces on the same
+	subnet, and have the ARPs for each interface be answered
+	based on whether or not the kernel would route a packet from
+	the ARP'd IP out that interface (therefore you must use source
+	based routing for this to work). In other words it allows control
+	of which cards (usually 1) will respond to an arp request.
+
+	0 - (default) The kernel can respond to arp requests with addresses
+	from other interfaces. This may seem wrong but it usually makes
+	sense, because it increases the chance of successful communication.
+	IP addresses are owned by the complete host on Linux, not by
+	particular interfaces. Only for more complex setups like load-
+	balancing, does this behaviour cause problems.
+
+	arp_filter for the interface will be enabled if at least one of
+	conf/{all,interface}/arp_filter is set to TRUE,
+	it will be disabled otherwise
+
+arp_announce - INTEGER
+	Define different restriction levels for announcing the local
+	source IP address from IP packets in ARP requests sent on
+	interface:
+	0 - (default) Use any local address, configured on any interface
+	1 - Try to avoid local addresses that are not in the target's
+	subnet for this interface. This mode is useful when target
+	hosts reachable via this interface require the source IP
+	address in ARP requests to be part of their logical network
+	configured on the receiving interface. When we generate the
+	request we will check all our subnets that include the
+	target IP and will preserve the source address if it is from
+	such subnet. If there is no such subnet we select source
+	address according to the rules for level 2.
+	2 - Always use the best local address for this target.
+	In this mode we ignore the source address in the IP packet
+	and try to select local address that we prefer for talks with
+	the target host. Such local address is selected by looking
+	for primary IP addresses on all our subnets on the outgoing
+	interface that include the target IP address. If no suitable
+	local address is found we select the first local address
+	we have on the outgoing interface or on all other interfaces,
+	with the hope we will receive reply for our request and
+	even sometimes no matter the source IP address we announce.
+
+	The max value from conf/{all,interface}/arp_announce is used.
+
+	Increasing the restriction level gives more chance for
+	receiving answer from the resolved target while decreasing
+	the level announces more valid sender's information.
+
+arp_ignore - INTEGER
+	Define different modes for sending replies in response to
+	received ARP requests that resolve local target IP addresses:
+	0 - (default): reply for any local target IP address, configured
+	on any interface
+	1 - reply only if the target IP address is local address
+	configured on the incoming interface
+	2 - reply only if the target IP address is local address
+	configured on the incoming interface and both with the
+	sender's IP address are part from same subnet on this interface
+	3 - do not reply for local addresses configured with scope host,
+	only resolutions for global and link addresses are replied
+	4-7 - reserved
+	8 - do not reply for all local addresses
+
+	The max value from conf/{all,interface}/arp_ignore is used
+	when ARP request is received on the {interface}
+
+arp_notify - BOOLEAN
+	Define mode for notification of address and device changes.
+	0 - (default): do nothing
+	1 - Generate gratuitous arp requests when device is brought up
+	    or hardware address changes.
+
+arp_accept - BOOLEAN
+	Define behavior for gratuitous ARP frames who's IP is not
+	already present in the ARP table:
+	0 - don't create new entries in the ARP table
+	1 - create new entries in the ARP table
+
+	Both replies and requests type gratuitous arp will trigger the
+	ARP table to be updated, if this setting is on.
+
+	If the ARP table already contains the IP address of the
+	gratuitous arp frame, the arp table will be updated regardless
+	if this setting is on or off.
+
+mcast_solicit - INTEGER
+	The maximum number of multicast probes in INCOMPLETE state,
+	when the associated hardware address is unknown.  Defaults
+	to 3.
+
+ucast_solicit - INTEGER
+	The maximum number of unicast probes in PROBE state, when
+	the hardware address is being reconfirmed.  Defaults to 3.
+
+app_solicit - INTEGER
+	The maximum number of probes to send to the user space ARP daemon
+	via netlink before dropping back to multicast probes (see
+	mcast_resolicit).  Defaults to 0.
+
+mcast_resolicit - INTEGER
+	The maximum number of multicast probes after unicast and
+	app probes in PROBE state.  Defaults to 0.
+
+disable_policy - BOOLEAN
+	Disable IPSEC policy (SPD) for this interface
+
+disable_xfrm - BOOLEAN
+	Disable IPSEC encryption on this interface, whatever the policy
+
+igmpv2_unsolicited_report_interval - INTEGER
+	The interval in milliseconds in which the next unsolicited
+	IGMPv1 or IGMPv2 report retransmit will take place.
+	Default: 10000 (10 seconds)
+
+igmpv3_unsolicited_report_interval - INTEGER
+	The interval in milliseconds in which the next unsolicited
+	IGMPv3 report retransmit will take place.
+	Default: 1000 (1 seconds)
+
+promote_secondaries - BOOLEAN
+	When a primary IP address is removed from this interface
+	promote a corresponding secondary IP address instead of
+	removing all the corresponding secondary IP addresses.
+
+drop_unicast_in_l2_multicast - BOOLEAN
+	Drop any unicast IP packets that are received in link-layer
+	multicast (or broadcast) frames.
+	This behavior (for multicast) is actually a SHOULD in RFC
+	1122, but is disabled by default for compatibility reasons.
+	Default: off (0)
+
+drop_gratuitous_arp - BOOLEAN
+	Drop all gratuitous ARP frames, for example if there's a known
+	good ARP proxy on the network and such frames need not be used
+	(or in the case of 802.11, must not be used to prevent attacks.)
+	Default: off (0)
+
+
+tag - INTEGER
+	Allows you to write a number, which can be used as required.
+	Default value is 0.
+
+xfrm4_gc_thresh - INTEGER
+	The threshold at which we will start garbage collecting for IPv4
+	destination cache entries.  At twice this value the system will
+	refuse new allocations.
+
+igmp_link_local_mcast_reports - BOOLEAN
+	Enable IGMP reports for link local multicast groups in the
+	224.0.0.X range.
+	Default TRUE
+
+Alexey Kuznetsov.
+kuznet@ms2.inr.ac.ru
+
+Updated by:
+Andi Kleen
+ak@muc.de
+Nicolas Delon
+delon.nicolas@wanadoo.fr
+
+
+
+
+/proc/sys/net/ipv6/* Variables:
+
+IPv6 has no global variables such as tcp_*.  tcp_* settings under ipv4/ also
+apply to IPv6 [XXX?].
+
+bindv6only - BOOLEAN
+	Default value for IPV6_V6ONLY socket option,
+	which restricts use of the IPv6 socket to IPv6 communication
+	only.
+		TRUE: disable IPv4-mapped address feature
+		FALSE: enable IPv4-mapped address feature
+
+	Default: FALSE (as specified in RFC3493)
+
+flowlabel_consistency - BOOLEAN
+	Protect the consistency (and unicity) of flow label.
+	You have to disable it to use IPV6_FL_F_REFLECT flag on the
+	flow label manager.
+	TRUE: enabled
+	FALSE: disabled
+	Default: TRUE
+
+auto_flowlabels - INTEGER
+	Automatically generate flow labels based on a flow hash of the
+	packet. This allows intermediate devices, such as routers, to
+	identify packet flows for mechanisms like Equal Cost Multipath
+	Routing (see RFC 6438).
+	0: automatic flow labels are completely disabled
+	1: automatic flow labels are enabled by default, they can be
+	   disabled on a per socket basis using the IPV6_AUTOFLOWLABEL
+	   socket option
+	2: automatic flow labels are allowed, they may be enabled on a
+	   per socket basis using the IPV6_AUTOFLOWLABEL socket option
+	3: automatic flow labels are enabled and enforced, they cannot
+	   be disabled by the socket option
+	Default: 1
+
+flowlabel_state_ranges - BOOLEAN
+	Split the flow label number space into two ranges. 0-0x7FFFF is
+	reserved for the IPv6 flow manager facility, 0x80000-0xFFFFF
+	is reserved for stateless flow labels as described in RFC6437.
+	TRUE: enabled
+	FALSE: disabled
+	Default: true
+
+flowlabel_reflect - BOOLEAN
+	Automatically reflect the flow label. Needed for Path MTU
+	Discovery to work with Equal Cost Multipath Routing in anycast
+	environments. See RFC 7690 and:
+	https://tools.ietf.org/html/draft-wang-6man-flow-label-reflection-01
+	TRUE: enabled
+	FALSE: disabled
+	Default: FALSE
+
+fib_multipath_hash_policy - INTEGER
+	Controls which hash policy to use for multipath routes.
+	Default: 0 (Layer 3)
+	Possible values:
+	0 - Layer 3 (source and destination addresses plus flow label)
+	1 - Layer 4 (standard 5-tuple)
+
+anycast_src_echo_reply - BOOLEAN
+	Controls the use of anycast addresses as source addresses for ICMPv6
+	echo reply
+	TRUE:  enabled
+	FALSE: disabled
+	Default: FALSE
+
+idgen_delay - INTEGER
+	Controls the delay in seconds after which time to retry
+	privacy stable address generation if a DAD conflict is
+	detected.
+	Default: 1 (as specified in RFC7217)
+
+idgen_retries - INTEGER
+	Controls the number of retries to generate a stable privacy
+	address if a DAD conflict is detected.
+	Default: 3 (as specified in RFC7217)
+
+mld_qrv - INTEGER
+	Controls the MLD query robustness variable (see RFC3810 9.1).
+	Default: 2 (as specified by RFC3810 9.1)
+	Minimum: 1 (as specified by RFC6636 4.5)
+
+max_dst_opts_number - INTEGER
+	Maximum number of non-padding TLVs allowed in a Destination
+	options extension header. If this value is less than zero
+	then unknown options are disallowed and the number of known
+	TLVs allowed is the absolute value of this number.
+	Default: 8
+
+max_hbh_opts_number - INTEGER
+	Maximum number of non-padding TLVs allowed in a Hop-by-Hop
+	options extension header. If this value is less than zero
+	then unknown options are disallowed and the number of known
+	TLVs allowed is the absolute value of this number.
+	Default: 8
+
+max_dst_opts_length - INTEGER
+	Maximum length allowed for a Destination options extension
+	header.
+	Default: INT_MAX (unlimited)
+
+max_hbh_length - INTEGER
+	Maximum length allowed for a Hop-by-Hop options extension
+	header.
+	Default: INT_MAX (unlimited)
+
+IPv6 Fragmentation:
+
+ip6frag_high_thresh - INTEGER
+	Maximum memory used to reassemble IPv6 fragments. When
+	ip6frag_high_thresh bytes of memory is allocated for this purpose,
+	the fragment handler will toss packets until ip6frag_low_thresh
+	is reached.
+
+ip6frag_low_thresh - INTEGER
+	See ip6frag_high_thresh
+
+ip6frag_time - INTEGER
+	Time in seconds to keep an IPv6 fragment in memory.
+
+IPv6 Segment Routing:
+
+seg6_flowlabel - INTEGER
+	Controls the behaviour of computing the flowlabel of outer
+	IPv6 header in case of SR T.encaps
+
+	-1 set flowlabel to zero.
+	0 copy flowlabel from Inner packet in case of Inner IPv6
+		(Set flowlabel to 0 in case IPv4/L2)
+	1 Compute the flowlabel using seg6_make_flowlabel()
+
+	Default is 0.
+
+conf/default/*:
+	Change the interface-specific default settings.
+
+
+conf/all/*:
+	Change all the interface-specific settings.
+
+	[XXX:  Other special features than forwarding?]
+
+conf/all/forwarding - BOOLEAN
+	Enable global IPv6 forwarding between all interfaces.
+
+	IPv4 and IPv6 work differently here; e.g. netfilter must be used
+	to control which interfaces may forward packets and which not.
+
+	This also sets all interfaces' Host/Router setting
+	'forwarding' to the specified value.  See below for details.
+
+	This referred to as global forwarding.
+
+proxy_ndp - BOOLEAN
+	Do proxy ndp.
+
+fwmark_reflect - BOOLEAN
+	Controls the fwmark of kernel-generated IPv6 reply packets that are not
+	associated with a socket for example, TCP RSTs or ICMPv6 echo replies).
+	If unset, these packets have a fwmark of zero. If set, they have the
+	fwmark of the packet they are replying to.
+	Default: 0
+
+conf/interface/*:
+	Change special settings per interface.
+
+	The functional behaviour for certain settings is different
+	depending on whether local forwarding is enabled or not.
+
+accept_ra - INTEGER
+	Accept Router Advertisements; autoconfigure using them.
+
+	It also determines whether or not to transmit Router
+	Solicitations. If and only if the functional setting is to
+	accept Router Advertisements, Router Solicitations will be
+	transmitted.
+
+	Possible values are:
+		0 Do not accept Router Advertisements.
+		1 Accept Router Advertisements if forwarding is disabled.
+		2 Overrule forwarding behaviour. Accept Router Advertisements
+		  even if forwarding is enabled.
+
+	Functional default: enabled if local forwarding is disabled.
+			    disabled if local forwarding is enabled.
+
+accept_ra_defrtr - BOOLEAN
+	Learn default router in Router Advertisement.
+
+	Functional default: enabled if accept_ra is enabled.
+			    disabled if accept_ra is disabled.
+
+accept_ra_from_local - BOOLEAN
+	Accept RA with source-address that is found on local machine
+        if the RA is otherwise proper and able to be accepted.
+        Default is to NOT accept these as it may be an un-intended
+        network loop.
+
+	Functional default:
+           enabled if accept_ra_from_local is enabled
+               on a specific interface.
+	   disabled if accept_ra_from_local is disabled
+               on a specific interface.
+
+accept_ra_min_hop_limit - INTEGER
+	Minimum hop limit Information in Router Advertisement.
+
+	Hop limit Information in Router Advertisement less than this
+	variable shall be ignored.
+
+	Default: 1
+
+accept_ra_pinfo - BOOLEAN
+	Learn Prefix Information in Router Advertisement.
+
+	Functional default: enabled if accept_ra is enabled.
+			    disabled if accept_ra is disabled.
+
+accept_ra_rt_info_min_plen - INTEGER
+	Minimum prefix length of Route Information in RA.
+
+	Route Information w/ prefix smaller than this variable shall
+	be ignored.
+
+	Functional default: 0 if accept_ra_rtr_pref is enabled.
+			    -1 if accept_ra_rtr_pref is disabled.
+
+accept_ra_rt_info_max_plen - INTEGER
+	Maximum prefix length of Route Information in RA.
+
+	Route Information w/ prefix larger than this variable shall
+	be ignored.
+
+	Functional default: 0 if accept_ra_rtr_pref is enabled.
+			    -1 if accept_ra_rtr_pref is disabled.
+
+accept_ra_rtr_pref - BOOLEAN
+	Accept Router Preference in RA.
+
+	Functional default: enabled if accept_ra is enabled.
+			    disabled if accept_ra is disabled.
+
+accept_ra_mtu - BOOLEAN
+	Apply the MTU value specified in RA option 5 (RFC4861). If
+	disabled, the MTU specified in the RA will be ignored.
+
+	Functional default: enabled if accept_ra is enabled.
+			    disabled if accept_ra is disabled.
+
+accept_redirects - BOOLEAN
+	Accept Redirects.
+
+	Functional default: enabled if local forwarding is disabled.
+			    disabled if local forwarding is enabled.
+
+accept_source_route - INTEGER
+	Accept source routing (routing extension header).
+
+	>= 0: Accept only routing header type 2.
+	< 0: Do not accept routing header.
+
+	Default: 0
+
+autoconf - BOOLEAN
+	Autoconfigure addresses using Prefix Information in Router
+	Advertisements.
+
+	Functional default: enabled if accept_ra_pinfo is enabled.
+			    disabled if accept_ra_pinfo is disabled.
+
+dad_transmits - INTEGER
+	The amount of Duplicate Address Detection probes to send.
+	Default: 1
+
+forwarding - INTEGER
+	Configure interface-specific Host/Router behaviour.
+
+	Note: It is recommended to have the same setting on all
+	interfaces; mixed router/host scenarios are rather uncommon.
+
+	Possible values are:
+		0 Forwarding disabled
+		1 Forwarding enabled
+
+	FALSE (0):
+
+	By default, Host behaviour is assumed.  This means:
+
+	1. IsRouter flag is not set in Neighbour Advertisements.
+	2. If accept_ra is TRUE (default), transmit Router
+	   Solicitations.
+	3. If accept_ra is TRUE (default), accept Router
+	   Advertisements (and do autoconfiguration).
+	4. If accept_redirects is TRUE (default), accept Redirects.
+
+	TRUE (1):
+
+	If local forwarding is enabled, Router behaviour is assumed.
+	This means exactly the reverse from the above:
+
+	1. IsRouter flag is set in Neighbour Advertisements.
+	2. Router Solicitations are not sent unless accept_ra is 2.
+	3. Router Advertisements are ignored unless accept_ra is 2.
+	4. Redirects are ignored.
+
+	Default: 0 (disabled) if global forwarding is disabled (default),
+		 otherwise 1 (enabled).
+
+hop_limit - INTEGER
+	Default Hop Limit to set.
+	Default: 64
+
+mtu - INTEGER
+	Default Maximum Transfer Unit
+	Default: 1280 (IPv6 required minimum)
+
+ip_nonlocal_bind - BOOLEAN
+	If set, allows processes to bind() to non-local IPv6 addresses,
+	which can be quite useful - but may break some applications.
+	Default: 0
+
+router_probe_interval - INTEGER
+	Minimum interval (in seconds) between Router Probing described
+	in RFC4191.
+
+	Default: 60
+
+router_solicitation_delay - INTEGER
+	Number of seconds to wait after interface is brought up
+	before sending Router Solicitations.
+	Default: 1
+
+router_solicitation_interval - INTEGER
+	Number of seconds to wait between Router Solicitations.
+	Default: 4
+
+router_solicitations - INTEGER
+	Number of Router Solicitations to send until assuming no
+	routers are present.
+	Default: 3
+
+use_oif_addrs_only - BOOLEAN
+	When enabled, the candidate source addresses for destinations
+	routed via this interface are restricted to the set of addresses
+	configured on this interface (vis. RFC 6724, section 4).
+
+	Default: false
+
+use_tempaddr - INTEGER
+	Preference for Privacy Extensions (RFC3041).
+	  <= 0 : disable Privacy Extensions
+	  == 1 : enable Privacy Extensions, but prefer public
+	         addresses over temporary addresses.
+	  >  1 : enable Privacy Extensions and prefer temporary
+	         addresses over public addresses.
+	Default:  0 (for most devices)
+		 -1 (for point-to-point devices and loopback devices)
+
+temp_valid_lft - INTEGER
+	valid lifetime (in seconds) for temporary addresses.
+	Default: 604800 (7 days)
+
+temp_prefered_lft - INTEGER
+	Preferred lifetime (in seconds) for temporary addresses.
+	Default: 86400 (1 day)
+
+keep_addr_on_down - INTEGER
+	Keep all IPv6 addresses on an interface down event. If set static
+	global addresses with no expiration time are not flushed.
+	  >0 : enabled
+	   0 : system default
+	  <0 : disabled
+
+	Default: 0 (addresses are removed)
+
+max_desync_factor - INTEGER
+	Maximum value for DESYNC_FACTOR, which is a random value
+	that ensures that clients don't synchronize with each
+	other and generate new addresses at exactly the same time.
+	value is in seconds.
+	Default: 600
+
+regen_max_retry - INTEGER
+	Number of attempts before give up attempting to generate
+	valid temporary addresses.
+	Default: 5
+
+max_addresses - INTEGER
+	Maximum number of autoconfigured addresses per interface.  Setting
+	to zero disables the limitation.  It is not recommended to set this
+	value too large (or to zero) because it would be an easy way to
+	crash the kernel by allowing too many addresses to be created.
+	Default: 16
+
+disable_ipv6 - BOOLEAN
+	Disable IPv6 operation.  If accept_dad is set to 2, this value
+	will be dynamically set to TRUE if DAD fails for the link-local
+	address.
+	Default: FALSE (enable IPv6 operation)
+
+	When this value is changed from 1 to 0 (IPv6 is being enabled),
+	it will dynamically create a link-local address on the given
+	interface and start Duplicate Address Detection, if necessary.
+
+	When this value is changed from 0 to 1 (IPv6 is being disabled),
+	it will dynamically delete all addresses and routes on the given
+	interface. From now on it will not possible to add addresses/routes
+	to the selected interface.
+
+accept_dad - INTEGER
+	Whether to accept DAD (Duplicate Address Detection).
+	0: Disable DAD
+	1: Enable DAD (default)
+	2: Enable DAD, and disable IPv6 operation if MAC-based duplicate
+	   link-local address has been found.
+
+	DAD operation and mode on a given interface will be selected according
+	to the maximum value of conf/{all,interface}/accept_dad.
+
+force_tllao - BOOLEAN
+	Enable sending the target link-layer address option even when
+	responding to a unicast neighbor solicitation.
+	Default: FALSE
+
+	Quoting from RFC 2461, section 4.4, Target link-layer address:
+
+	"The option MUST be included for multicast solicitations in order to
+	avoid infinite Neighbor Solicitation "recursion" when the peer node
+	does not have a cache entry to return a Neighbor Advertisements
+	message.  When responding to unicast solicitations, the option can be
+	omitted since the sender of the solicitation has the correct link-
+	layer address; otherwise it would not have be able to send the unicast
+	solicitation in the first place. However, including the link-layer
+	address in this case adds little overhead and eliminates a potential
+	race condition where the sender deletes the cached link-layer address
+	prior to receiving a response to a previous solicitation."
+
+ndisc_notify - BOOLEAN
+	Define mode for notification of address and device changes.
+	0 - (default): do nothing
+	1 - Generate unsolicited neighbour advertisements when device is brought
+	    up or hardware address changes.
+
+ndisc_tclass - INTEGER
+	The IPv6 Traffic Class to use by default when sending IPv6 Neighbor
+	Discovery (Router Solicitation, Router Advertisement, Neighbor
+	Solicitation, Neighbor Advertisement, Redirect) messages.
+	These 8 bits can be interpreted as 6 high order bits holding the DSCP
+	value and 2 low order bits representing ECN (which you probably want
+	to leave cleared).
+	0 - (default)
+
+mldv1_unsolicited_report_interval - INTEGER
+	The interval in milliseconds in which the next unsolicited
+	MLDv1 report retransmit will take place.
+	Default: 10000 (10 seconds)
+
+mldv2_unsolicited_report_interval - INTEGER
+	The interval in milliseconds in which the next unsolicited
+	MLDv2 report retransmit will take place.
+	Default: 1000 (1 second)
+
+force_mld_version - INTEGER
+	0 - (default) No enforcement of a MLD version, MLDv1 fallback allowed
+	1 - Enforce to use MLD version 1
+	2 - Enforce to use MLD version 2
+
+suppress_frag_ndisc - INTEGER
+	Control RFC 6980 (Security Implications of IPv6 Fragmentation
+	with IPv6 Neighbor Discovery) behavior:
+	1 - (default) discard fragmented neighbor discovery packets
+	0 - allow fragmented neighbor discovery packets
+
+optimistic_dad - BOOLEAN
+	Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
+	0: disabled (default)
+	1: enabled
+
+	Optimistic Duplicate Address Detection for the interface will be enabled
+	if at least one of conf/{all,interface}/optimistic_dad is set to 1,
+	it will be disabled otherwise.
+
+use_optimistic - BOOLEAN
+	If enabled, do not classify optimistic addresses as deprecated during
+	source address selection.  Preferred addresses will still be chosen
+	before optimistic addresses, subject to other ranking in the source
+	address selection algorithm.
+	0: disabled (default)
+	1: enabled
+
+	This will be enabled if at least one of
+	conf/{all,interface}/use_optimistic is set to 1, disabled otherwise.
+
+stable_secret - IPv6 address
+	This IPv6 address will be used as a secret to generate IPv6
+	addresses for link-local addresses and autoconfigured
+	ones. All addresses generated after setting this secret will
+	be stable privacy ones by default. This can be changed via the
+	addrgenmode ip-link. conf/default/stable_secret is used as the
+	secret for the namespace, the interface specific ones can
+	overwrite that. Writes to conf/all/stable_secret are refused.
+
+	It is recommended to generate this secret during installation
+	of a system and keep it stable after that.
+
+	By default the stable secret is unset.
+
+addr_gen_mode - INTEGER
+	Defines how link-local and autoconf addresses are generated.
+
+	0: generate address based on EUI64 (default)
+	1: do no generate a link-local address, use EUI64 for addresses generated
+	   from autoconf
+	2: generate stable privacy addresses, using the secret from
+	   stable_secret (RFC7217)
+	3: generate stable privacy addresses, using a random secret if unset
+
+drop_unicast_in_l2_multicast - BOOLEAN
+	Drop any unicast IPv6 packets that are received in link-layer
+	multicast (or broadcast) frames.
+
+	By default this is turned off.
+
+drop_unsolicited_na - BOOLEAN
+	Drop all unsolicited neighbor advertisements, for example if there's
+	a known good NA proxy on the network and such frames need not be used
+	(or in the case of 802.11, must not be used to prevent attacks.)
+
+	By default this is turned off.
+
+enhanced_dad - BOOLEAN
+	Include a nonce option in the IPv6 neighbor solicitation messages used for
+	duplicate address detection per RFC7527. A received DAD NS will only signal
+	a duplicate address if the nonce is different. This avoids any false
+	detection of duplicates due to loopback of the NS messages that we send.
+	The nonce option will be sent on an interface unless both of
+	conf/{all,interface}/enhanced_dad are set to FALSE.
+	Default: TRUE
+
+icmp/*:
+ratelimit - INTEGER
+	Limit the maximal rates for sending ICMPv6 packets.
+	0 to disable any limiting,
+	otherwise the minimal space between responses in milliseconds.
+	Default: 1000
+
+echo_ignore_all - BOOLEAN
+	If set non-zero, then the kernel will ignore all ICMP ECHO
+	requests sent to it over the IPv6 protocol.
+	Default: 0
+
+xfrm6_gc_thresh - INTEGER
+	The threshold at which we will start garbage collecting for IPv6
+	destination cache entries.  At twice this value the system will
+	refuse new allocations.
+
+
+IPv6 Update by:
+Pekka Savola <pekkas@netcore.fi>
+YOSHIFUJI Hideaki / USAGI Project <yoshfuji@linux-ipv6.org>
+
+
+/proc/sys/net/bridge/* Variables:
+
+bridge-nf-call-arptables - BOOLEAN
+	1 : pass bridged ARP traffic to arptables' FORWARD chain.
+	0 : disable this.
+	Default: 1
+
+bridge-nf-call-iptables - BOOLEAN
+	1 : pass bridged IPv4 traffic to iptables' chains.
+	0 : disable this.
+	Default: 1
+
+bridge-nf-call-ip6tables - BOOLEAN
+	1 : pass bridged IPv6 traffic to ip6tables' chains.
+	0 : disable this.
+	Default: 1
+
+bridge-nf-filter-vlan-tagged - BOOLEAN
+	1 : pass bridged vlan-tagged ARP/IP/IPv6 traffic to {arp,ip,ip6}tables.
+	0 : disable this.
+	Default: 0
+
+bridge-nf-filter-pppoe-tagged - BOOLEAN
+	1 : pass bridged pppoe-tagged IP/IPv6 traffic to {ip,ip6}tables.
+	0 : disable this.
+	Default: 0
+
+bridge-nf-pass-vlan-input-dev - BOOLEAN
+	1: if bridge-nf-filter-vlan-tagged is enabled, try to find a vlan
+	interface on the bridge and set the netfilter input device to the vlan.
+	This allows use of e.g. "iptables -i br0.1" and makes the REDIRECT
+	target work with vlan-on-top-of-bridge interfaces.  When no matching
+	vlan interface is found, or this switch is off, the input device is
+	set to the bridge interface.
+	0: disable bridge netfilter vlan interface lookup.
+	Default: 0
+
+proc/sys/net/sctp/* Variables:
+
+addip_enable - BOOLEAN
+	Enable or disable extension of  Dynamic Address Reconfiguration
+	(ADD-IP) functionality specified in RFC5061.  This extension provides
+	the ability to dynamically add and remove new addresses for the SCTP
+	associations.
+
+	1: Enable extension.
+
+	0: Disable extension.
+
+	Default: 0
+
+pf_enable - INTEGER
+	Enable or disable pf (pf is short for potentially failed) state. A value
+	of pf_retrans > path_max_retrans also disables pf state. That is, one of
+	both pf_enable and pf_retrans > path_max_retrans can disable pf state.
+	Since pf_retrans and path_max_retrans can be changed by userspace
+	application, sometimes user expects to disable pf state by the value of
+	pf_retrans > path_max_retrans, but occasionally the value of pf_retrans
+	or path_max_retrans is changed by the user application, this pf state is
+	enabled. As such, it is necessary to add this to dynamically enable
+	and disable pf state. See:
+	https://datatracker.ietf.org/doc/draft-ietf-tsvwg-sctp-failover for
+	details.
+
+	1: Enable pf.
+
+	0: Disable pf.
+
+	Default: 1
+
+addip_noauth_enable - BOOLEAN
+	Dynamic Address Reconfiguration (ADD-IP) requires the use of
+	authentication to protect the operations of adding or removing new
+	addresses.  This requirement is mandated so that unauthorized hosts
+	would not be able to hijack associations.  However, older
+	implementations may not have implemented this requirement while
+	allowing the ADD-IP extension.  For reasons of interoperability,
+	we provide this variable to control the enforcement of the
+	authentication requirement.
+
+	1: Allow ADD-IP extension to be used without authentication.  This
+	   should only be set in a closed environment for interoperability
+	   with older implementations.
+
+	0: Enforce the authentication requirement
+
+	Default: 0
+
+auth_enable - BOOLEAN
+	Enable or disable Authenticated Chunks extension.  This extension
+	provides the ability to send and receive authenticated chunks and is
+	required for secure operation of Dynamic Address Reconfiguration
+	(ADD-IP) extension.
+
+	1: Enable this extension.
+	0: Disable this extension.
+
+	Default: 0
+
+prsctp_enable - BOOLEAN
+	Enable or disable the Partial Reliability extension (RFC3758) which
+	is used to notify peers that a given DATA should no longer be expected.
+
+	1: Enable extension
+	0: Disable
+
+	Default: 1
+
+max_burst - INTEGER
+	The limit of the number of new packets that can be initially sent.  It
+	controls how bursty the generated traffic can be.
+
+	Default: 4
+
+association_max_retrans - INTEGER
+	Set the maximum number for retransmissions that an association can
+	attempt deciding that the remote end is unreachable.  If this value
+	is exceeded, the association is terminated.
+
+	Default: 10
+
+max_init_retransmits - INTEGER
+	The maximum number of retransmissions of INIT and COOKIE-ECHO chunks
+	that an association will attempt before declaring the destination
+	unreachable and terminating.
+
+	Default: 8
+
+path_max_retrans - INTEGER
+	The maximum number of retransmissions that will be attempted on a given
+	path.  Once this threshold is exceeded, the path is considered
+	unreachable, and new traffic will use a different path when the
+	association is multihomed.
+
+	Default: 5
+
+pf_retrans - INTEGER
+	The number of retransmissions that will be attempted on a given path
+	before traffic is redirected to an alternate transport (should one
+	exist).  Note this is distinct from path_max_retrans, as a path that
+	passes the pf_retrans threshold can still be used.  Its only
+	deprioritized when a transmission path is selected by the stack.  This
+	setting is primarily used to enable fast failover mechanisms without
+	having to reduce path_max_retrans to a very low value.  See:
+	http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt
+	for details.  Note also that a value of pf_retrans > path_max_retrans
+	disables this feature. Since both pf_retrans and path_max_retrans can
+	be changed by userspace application, a variable pf_enable is used to
+	disable pf state.
+
+	Default: 0
+
+rto_initial - INTEGER
+	The initial round trip timeout value in milliseconds that will be used
+	in calculating round trip times.  This is the initial time interval
+	for retransmissions.
+
+	Default: 3000
+
+rto_max - INTEGER
+	The maximum value (in milliseconds) of the round trip timeout.  This
+	is the largest time interval that can elapse between retransmissions.
+
+	Default: 60000
+
+rto_min - INTEGER
+	The minimum value (in milliseconds) of the round trip timeout.  This
+	is the smallest time interval the can elapse between retransmissions.
+
+	Default: 1000
+
+hb_interval - INTEGER
+	The interval (in milliseconds) between HEARTBEAT chunks.  These chunks
+	are sent at the specified interval on idle paths to probe the state of
+	a given path between 2 associations.
+
+	Default: 30000
+
+sack_timeout - INTEGER
+	The amount of time (in milliseconds) that the implementation will wait
+	to send a SACK.
+
+	Default: 200
+
+valid_cookie_life - INTEGER
+	The default lifetime of the SCTP cookie (in milliseconds).  The cookie
+	is used during association establishment.
+
+	Default: 60000
+
+cookie_preserve_enable - BOOLEAN
+	Enable or disable the ability to extend the lifetime of the SCTP cookie
+	that is used during the establishment phase of SCTP association
+
+	1: Enable cookie lifetime extension.
+	0: Disable
+
+	Default: 1
+
+cookie_hmac_alg - STRING
+	Select the hmac algorithm used when generating the cookie value sent by
+	a listening sctp socket to a connecting client in the INIT-ACK chunk.
+	Valid values are:
+	* md5
+	* sha1
+	* none
+	Ability to assign md5 or sha1 as the selected alg is predicated on the
+	configuration of those algorithms at build time (CONFIG_CRYPTO_MD5 and
+	CONFIG_CRYPTO_SHA1).
+
+	Default: Dependent on configuration.  MD5 if available, else SHA1 if
+	available, else none.
+
+rcvbuf_policy - INTEGER
+	Determines if the receive buffer is attributed to the socket or to
+	association.   SCTP supports the capability to create multiple
+	associations on a single socket.  When using this capability, it is
+	possible that a single stalled association that's buffering a lot
+	of data may block other associations from delivering their data by
+	consuming all of the receive buffer space.  To work around this,
+	the rcvbuf_policy could be set to attribute the receiver buffer space
+	to each association instead of the socket.  This prevents the described
+	blocking.
+
+	1: rcvbuf space is per association
+	0: rcvbuf space is per socket
+
+	Default: 0
+
+sndbuf_policy - INTEGER
+	Similar to rcvbuf_policy above, this applies to send buffer space.
+
+	1: Send buffer is tracked per association
+	0: Send buffer is tracked per socket.
+
+	Default: 0
+
+sctp_mem - vector of 3 INTEGERs: min, pressure, max
+	Number of pages allowed for queueing by all SCTP sockets.
+
+	min: Below this number of pages SCTP is not bothered about its
+	memory appetite. When amount of memory allocated by SCTP exceeds
+	this number, SCTP starts to moderate memory usage.
+
+	pressure: This value was introduced to follow format of tcp_mem.
+
+	max: Number of pages allowed for queueing by all SCTP sockets.
+
+	Default is calculated at boot time from amount of available memory.
+
+sctp_rmem - vector of 3 INTEGERs: min, default, max
+	Only the first value ("min") is used, "default" and "max" are
+	ignored.
+
+	min: Minimal size of receive buffer used by SCTP socket.
+	It is guaranteed to each SCTP socket (but not association) even
+	under moderate memory pressure.
+
+	Default: 4K
+
+sctp_wmem  - vector of 3 INTEGERs: min, default, max
+	Currently this tunable has no effect.
+
+addr_scope_policy - INTEGER
+	Control IPv4 address scoping - draft-stewart-tsvwg-sctp-ipv4-00
+
+	0   - Disable IPv4 address scoping
+	1   - Enable IPv4 address scoping
+	2   - Follow draft but allow IPv4 private addresses
+	3   - Follow draft but allow IPv4 link local addresses
+
+	Default: 1
+
+
+/proc/sys/net/core/*
+	Please see: Documentation/sysctl/net.txt for descriptions of these entries.
+
+
+/proc/sys/net/unix/*
+max_dgram_qlen - INTEGER
+	The maximum length of dgram socket receive queue
+
+	Default: 10
+
diff --git a/Documentation/networking/ip_dynaddr.txt b/Documentation/networking/ip_dynaddr.txt
new file mode 100644
index 000000000..45f3c1268
--- /dev/null
+++ b/Documentation/networking/ip_dynaddr.txt
@@ -0,0 +1,29 @@
+IP dynamic address hack-port v0.03
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This stuff allows diald ONESHOT connections to get established by
+dynamically changing packet source address (and socket's if local procs).
+It is implemented for TCP diald-box connections(1) and IP_MASQuerading(2).
+
+If enabled[*] and forwarding interface has changed:
+  1)  Socket (and packet) source address is rewritten ON RETRANSMISSIONS
+      while in SYN_SENT state (diald-box processes).
+  2)  Out-bounded MASQueraded source address changes ON OUTPUT (when
+      internal host does retransmission) until a packet from outside is
+      received by the tunnel.
+
+This is specially helpful for auto dialup links (diald), where the
+``actual'' outgoing address is unknown at the moment the link is
+going up. So, the *same* (local AND masqueraded) connections requests that
+bring the link up will be able to get established.
+
+[*] At boot, by default no address rewriting is attempted. 
+  To enable:
+     # echo 1 > /proc/sys/net/ipv4/ip_dynaddr
+  To enable verbose mode:
+     # echo 2 > /proc/sys/net/ipv4/ip_dynaddr
+  To disable (default)
+     # echo 0 > /proc/sys/net/ipv4/ip_dynaddr
+
+Enjoy!
+
+-- Juanjo  <jjciarla@raiz.uncu.edu.ar>
diff --git a/Documentation/networking/ipddp.txt b/Documentation/networking/ipddp.txt
new file mode 100644
index 000000000..ba5c217ff
--- /dev/null
+++ b/Documentation/networking/ipddp.txt
@@ -0,0 +1,73 @@
+Text file for ipddp.c:
+	AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
+
+This text file is written by Jay Schulist <jschlst@samba.org>
+
+Introduction
+------------
+
+AppleTalk-IP (IPDDP) is the method computers connected to AppleTalk
+networks can use to communicate via IP. AppleTalk-IP is simply IP datagrams
+inside AppleTalk packets.
+
+Through this driver you can either allow your Linux box to communicate
+IP over an AppleTalk network or you can provide IP gatewaying functions
+for your AppleTalk users.
+
+You can currently encapsulate or decapsulate AppleTalk-IP on LocalTalk,
+EtherTalk and PPPTalk. The only limit on the protocol is that of what
+kernel AppleTalk layer and drivers are available.
+
+Each mode requires its own user space software.
+
+Compiling AppleTalk-IP Decapsulation/Encapsulation
+=================================================
+
+AppleTalk-IP decapsulation needs to be compiled into your kernel. You
+will need to turn on AppleTalk-IP driver support. Then you will need to
+select ONE of the two options; IP to AppleTalk-IP encapsulation support or
+AppleTalk-IP to IP decapsulation support. If you compile the driver
+statically you will only be able to use the driver for the function you have
+enabled in the kernel. If you compile the driver as a module you can
+select what mode you want it to run in via a module loading param.
+ipddp_mode=1 for AppleTalk-IP encapsulation and ipddp_mode=2 for
+AppleTalk-IP to IP decapsulation.
+
+Basic instructions for user space tools
+=======================================
+
+I will briefly describe the operation of the tools, but you will
+need to consult the supporting documentation for each set of tools.
+
+Decapsulation - You will need to download a software package called
+MacGate. In this distribution there will be a tool called MacRoute
+which enables you to add routes to the kernel for your Macs by hand.
+Also the tool MacRegGateWay is included to register the
+proper IP Gateway and IP addresses for your machine. Included in this
+distribution is a patch to netatalk-1.4b2+asun2.0a17.2 (available from
+ftp.u.washington.edu/pub/user-supported/asun/) this patch is optional
+but it allows automatic adding and deleting of routes for Macs. (Handy
+for locations with large Mac installations)
+
+Encapsulation - You will need to download a software daemon called ipddpd.
+This software expects there to be an AppleTalk-IP gateway on the network.
+You will also need to add the proper routes to route your Linux box's IP
+traffic out the ipddp interface.
+
+Common Uses of ipddp.c
+----------------------
+Of course AppleTalk-IP decapsulation and encapsulation, but specifically
+decapsulation is being used most for connecting LocalTalk networks to
+IP networks. Although it has been used on EtherTalk networks to allow
+Macs that are only able to tunnel IP over EtherTalk.
+
+Encapsulation has been used to allow a Linux box stuck on a LocalTalk
+network to use IP. It should work equally well if you are stuck on an
+EtherTalk only network.
+
+Further Assistance
+-------------------
+You can contact me (Jay Schulist <jschlst@samba.org>) with any
+questions regarding decapsulation or encapsulation. Bradford W. Johnson
+<johns393@maroon.tc.umn.edu> originally wrote the ipddp.c driver for IP
+encapsulation in AppleTalk.
diff --git a/Documentation/networking/iphase.txt b/Documentation/networking/iphase.txt
new file mode 100644
index 000000000..670b72f16
--- /dev/null
+++ b/Documentation/networking/iphase.txt
@@ -0,0 +1,158 @@
+
+                              READ ME FISRT
+		  ATM (i)Chip IA Linux Driver Source
+--------------------------------------------------------------------------------
+                     Read This Before You Begin!
+--------------------------------------------------------------------------------
+
+Description
+-----------
+
+This is the README file for the Interphase PCI ATM (i)Chip IA Linux driver 
+source release.
+
+The features and limitations of this driver are as follows:
+    - A single VPI (VPI value of 0) is supported.
+    - Supports 4K VCs for the server board (with 512K control memory) and 1K 
+      VCs for the client board (with 128K control memory).
+    - UBR, ABR and CBR service categories are supported.
+    - Only AAL5 is supported. 
+    - Supports setting of PCR on the VCs. 
+    - Multiple adapters in a system are supported.
+    - All variants of Interphase ATM PCI (i)Chip adapter cards are supported, 
+      including x575 (OC3, control memory 128K , 512K and packet memory 128K, 
+      512K and 1M), x525 (UTP25) and x531 (DS3 and E3). See 
+      http://www.iphase.com/
+      for details.
+    - Only x86 platforms are supported.
+    - SMP is supported.
+
+
+Before You Start
+---------------- 
+
+
+Installation
+------------
+
+1. Installing the adapters in the system
+   To install the ATM adapters in the system, follow the steps below.
+       a. Login as root.
+       b. Shut down the system and power off the system.
+       c. Install one or more ATM adapters in the system.
+       d. Connect each adapter to a port on an ATM switch. The green 'Link' 
+          LED on the front panel of the adapter will be on if the adapter is 
+          connected to the switch properly when the system is powered up.
+       e. Power on and boot the system.
+
+2. [ Removed ]
+
+3. Rebuild kernel with ABR support
+   [ a. and b. removed ]
+    c. Reconfigure the kernel, choose the Interphase ia driver through "make 
+       menuconfig" or "make xconfig".
+    d. Rebuild the kernel, loadable modules and the atm tools. 
+    e. Install the new built kernel and modules and reboot.
+
+4. Load the adapter hardware driver (ia driver) if it is built as a module
+       a. Login as root.
+       b. Change directory to /lib/modules/<kernel-version>/atm.
+       c. Run "insmod suni.o;insmod iphase.o"
+	  The yellow 'status' LED on the front panel of the adapter will blink 
+          while the driver is loaded in the system.
+       d. To verify that the 'ia' driver is loaded successfully, run the 
+          following command:
+
+              cat /proc/atm/devices
+
+          If the driver is loaded successfully, the output of the command will 
+          be similar to the following lines:
+
+              Itf Type    ESI/"MAC"addr AAL(TX,err,RX,err,drop) ...
+              0   ia      xxxxxxxxx  0 ( 0 0 0 0 0 )  5 ( 0 0 0 0 0 )
+
+          You can also check the system log file /var/log/messages for messages
+          related to the ATM driver.
+
+5. Ia Driver Configuration 
+
+5.1 Configuration of adapter buffers
+    The (i)Chip boards have 3 different packet RAM size variants: 128K, 512K and
+    1M. The RAM size decides the number of buffers and buffer size. The default 
+    size and number of buffers are set as following: 
+
+          Total    Rx RAM   Tx RAM   Rx Buf   Tx Buf   Rx buf   Tx buf
+         RAM size   size     size     size     size      cnt      cnt
+         --------  ------   ------   ------   ------   ------   ------
+           128K      64K      64K      10K      10K       6        6
+           512K     256K     256K      10K      10K      25       25
+             1M     512K     512K      10K      10K      51       51
+
+       These setting should work well in most environments, but can be
+       changed by typing the following command: 
+ 
+           insmod <IA_DIR>/ia.o IA_RX_BUF=<RX_CNT> IA_RX_BUF_SZ=<RX_SIZE> \
+                   IA_TX_BUF=<TX_CNT> IA_TX_BUF_SZ=<TX_SIZE> 
+       Where:
+            RX_CNT = number of receive buffers in the range (1-128)
+            RX_SIZE = size of receive buffers in the range (48-64K)
+            TX_CNT = number of transmit buffers in the range (1-128)
+            TX_SIZE = size of transmit buffers in the range (48-64K)
+
+            1. Transmit and receive buffer size must be a multiple of 4.
+            2. Care should be taken so that the memory required for the
+               transmit and receive buffers is less than or equal to the
+               total adapter packet memory.   
+
+5.2 Turn on ia debug trace
+
+    When the ia driver is built with the CONFIG_ATM_IA_DEBUG flag, the driver 
+    can provide more debug trace if needed. There is a bit mask variable, 
+    IADebugFlag, which controls the output of the traces. You can find the bit 
+    map of the IADebugFlag in iphase.h. 
+    The debug trace can be turn on through the insmod command line option, for 
+    example, "insmod iphase.o IADebugFlag=0xffffffff" can turn on all the debug 
+    traces together with loading the driver.
+
+6. Ia Driver Test Using ttcp_atm and PVC
+
+   For the PVC setup, the test machines can either be connected back-to-back or 
+   through a switch. If connected through the switch, the switch must be 
+   configured for the PVC(s).
+
+   a. For UBR test:
+      At the test machine intended to receive data, type:
+         ttcp_atm -r -a -s 0.100 
+      At the other test machine, type:
+         ttcp_atm -t -a -s 0.100 -n 10000
+      Run "ttcp_atm -h" to display more options of the ttcp_atm tool.
+   b. For ABR test:
+      It is the same as the UBR testing, but with an extra command option:
+         -Pabr:max_pcr=<xxx>
+         where:
+             xxx = the maximum peak cell rate, from 170 - 353207.
+         This option must be set on both the machines.
+   c. For CBR test:
+      It is the same as the UBR testing, but with an extra command option:
+         -Pcbr:max_pcr=<xxx>
+         where:
+             xxx = the maximum peak cell rate, from 170 - 353207.
+         This option may only be set on the transmit machine.
+
+
+OUTSTANDING ISSUES
+------------------
+
+
+
+Contact Information
+-------------------
+
+     Customer Support:
+         United States:	Telephone:	(214) 654-5555
+     			Fax:		(214) 654-5500
+			E-Mail:		intouch@iphase.com
+	 Europe:	Telephone:	33 (0)1 41 15 44 00
+			Fax:		33 (0)1 41 15 12 13
+     World Wide Web:	http://www.iphase.com
+     Anonymous FTP:	ftp.iphase.com
diff --git a/Documentation/networking/ipsec.txt b/Documentation/networking/ipsec.txt
new file mode 100644
index 000000000..ba794b7e5
--- /dev/null
+++ b/Documentation/networking/ipsec.txt
@@ -0,0 +1,38 @@
+
+Here documents known IPsec corner cases which need to be keep in mind when
+deploy various IPsec configuration in real world production environment.
+
+1. IPcomp: Small IP packet won't get compressed at sender, and failed on
+	   policy check on receiver.
+
+Quote from RFC3173:
+2.2. Non-Expansion Policy
+
+   If the total size of a compressed payload and the IPComp header, as
+   defined in section 3, is not smaller than the size of the original
+   payload, the IP datagram MUST be sent in the original non-compressed
+   form.  To clarify: If an IP datagram is sent non-compressed, no
+
+   IPComp header is added to the datagram.  This policy ensures saving
+   the decompression processing cycles and avoiding incurring IP
+   datagram fragmentation when the expanded datagram is larger than the
+   MTU.
+
+   Small IP datagrams are likely to expand as a result of compression.
+   Therefore, a numeric threshold should be applied before compression,
+   where IP datagrams of size smaller than the threshold are sent in the
+   original form without attempting compression.  The numeric threshold
+   is implementation dependent.
+
+Current IPComp implementation is indeed by the book, while as in practice
+when sending non-compressed packet to the peer (whether or not packet len
+is smaller than the threshold or the compressed len is larger than original
+packet len), the packet is dropped when checking the policy as this packet
+matches the selector but not coming from any XFRM layer, i.e., with no
+security path. Such naked packet will not eventually make it to upper layer.
+The result is much more wired to the user when ping peer with different
+payload length.
+
+One workaround is try to set "level use" for each policy if user observed
+above scenario. The consequence of doing so is small packet(uncompressed)
+will skip policy checking on receiver side.
diff --git a/Documentation/networking/ipv6.txt b/Documentation/networking/ipv6.txt
new file mode 100644
index 000000000..6cd74fa55
--- /dev/null
+++ b/Documentation/networking/ipv6.txt
@@ -0,0 +1,72 @@
+
+Options for the ipv6 module are supplied as parameters at load time.
+
+Module options may be given as command line arguments to the insmod
+or modprobe command, but are usually specified in either
+/etc/modules.d/*.conf configuration files, or in a distro-specific
+configuration file.
+
+The available ipv6 module parameters are listed below.  If a parameter
+is not specified the default value is used.
+
+The parameters are as follows:
+
+disable
+
+	Specifies whether to load the IPv6 module, but disable all
+	its functionality.  This might be used when another module
+	has a dependency on the IPv6 module being loaded, but no
+	IPv6 addresses or operations are desired.
+
+	The possible values and their effects are:
+
+	0
+		IPv6 is enabled.
+
+		This is the default value.
+
+	1
+		IPv6 is disabled.
+
+		No IPv6 addresses will be added to interfaces, and
+		it will not be possible to open an IPv6 socket.
+
+		A reboot is required to enable IPv6.
+
+autoconf
+
+	Specifies whether to enable IPv6 address autoconfiguration
+	on all interfaces.  This might be used when one does not wish
+	for addresses to be automatically generated from prefixes
+	received in Router Advertisements.
+
+	The possible values and their effects are:
+
+	0
+		IPv6 address autoconfiguration is disabled on all interfaces.
+
+		Only the IPv6 loopback address (::1) and link-local addresses
+		will be added to interfaces.
+
+	1
+		IPv6 address autoconfiguration is enabled on all interfaces.
+
+		This is the default value.
+
+disable_ipv6
+
+	Specifies whether to disable IPv6 on all interfaces.
+	This might be used when no IPv6 addresses are desired.
+
+	The possible values and their effects are:
+
+	0
+		IPv6 is enabled on all interfaces.
+
+		This is the default value.
+
+	1
+		IPv6 is disabled on all interfaces.
+
+		No IPv6 addresses will be added to interfaces.
+
diff --git a/Documentation/networking/ipvlan.txt b/Documentation/networking/ipvlan.txt
new file mode 100644
index 000000000..27a38e50c
--- /dev/null
+++ b/Documentation/networking/ipvlan.txt
@@ -0,0 +1,146 @@
+
+                            IPVLAN Driver HOWTO
+
+Initial Release:
+	Mahesh Bandewar <maheshb AT google.com>
+
+1. Introduction:
+	This is conceptually very similar to the macvlan driver with one major
+exception of using L3 for mux-ing /demux-ing among slaves. This property makes
+the master device share the L2 with it's slave devices. I have developed this
+driver in conjunction with network namespaces and not sure if there is use case
+outside of it.
+
+
+2. Building and Installation:
+	In order to build the driver, please select the config item CONFIG_IPVLAN.
+The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
+(CONFIG_IPVLAN=m).
+
+
+3. Configuration:
+	There are no module parameters for this driver and it can be configured
+using IProute2/ip utility.
+
+    ip link add link <master> name <slave> type ipvlan [ mode MODE ] [ FLAGS ]
+       where
+         MODE: l3 (default) | l3s | l2
+         FLAGS: bridge (default) | private | vepa
+
+    e.g.
+    (a) Following will create IPvlan link with eth0 as master in
+        L3 bridge mode
+          bash# ip link add link eth0 name ipvl0 type ipvlan
+    (b) This command will create IPvlan link in L2 bridge mode.
+          bash# ip link add link eth0 name ipvl0 type ipvlan mode l2 bridge
+    (c) This command will create an IPvlan device in L2 private mode.
+          bash# ip link add link eth0 name ipvlan type ipvlan mode l2 private
+    (d) This command will create an IPvlan device in L2 vepa mode.
+          bash# ip link add link eth0 name ipvlan type ipvlan mode l2 vepa
+
+
+4. Operating modes:
+	IPvlan has two modes of operation - L2 and L3. For a given master device,
+you can select one of these two modes and all slaves on that master will
+operate in the same (selected) mode. The RX mode is almost identical except
+that in L3 mode the slaves wont receive any multicast / broadcast traffic.
+L3 mode is more restrictive since routing is controlled from the other (mostly)
+default namespace.
+
+4.1 L2 mode:
+	In this mode TX processing happens on the stack instance attached to the
+slave device and packets are switched and queued to the master device to send
+out. In this mode the slaves will RX/TX multicast and broadcast (if applicable)
+as well.
+
+4.2 L3 mode:
+	In this mode TX processing up to L3 happens on the stack instance attached
+to the slave device and packets are switched to the stack instance of the
+master device for the L2 processing and routing from that instance will be
+used before packets are queued on the outbound device. In this mode the slaves
+will not receive nor can send multicast / broadcast traffic.
+
+4.3 L3S mode:
+	This is very similar to the L3 mode except that iptables (conn-tracking)
+works in this mode and hence it is L3-symmetric (L3s). This will have slightly less
+performance but that shouldn't matter since you are choosing this mode over plain-L3
+mode to make conn-tracking work.
+
+5. Mode flags:
+	At this time following mode flags are available
+
+5.1 bridge:
+	This is the default option. To configure the IPvlan port in this mode,
+user can choose to either add this option on the command-line or don't specify
+anything. This is the traditional mode where slaves can cross-talk among
+themselves apart from talking through the master device.
+
+5.2 private:
+	If this option is added to the command-line, the port is set in private
+mode. i.e. port won't allow cross communication between slaves.
+
+5.3 vepa:
+	If this is added to the command-line, the port is set in VEPA mode.
+i.e. port will offload switching functionality to the external entity as
+described in 802.1Qbg
+Note: VEPA mode in IPvlan has limitations. IPvlan uses the mac-address of the
+master-device, so the packets which are emitted in this mode for the adjacent
+neighbor will have source and destination mac same. This will make the switch /
+router send the redirect message.
+
+6. What to choose (macvlan vs. ipvlan)?
+	These two devices are very similar in many regards and the specific use
+case could very well define which device to choose. if one of the following
+situations defines your use case then you can choose to use ipvlan -
+	(a) The Linux host that is connected to the external switch / router has
+policy configured that allows only one mac per port.
+	(b) No of virtual devices created on a master exceed the mac capacity and
+puts the NIC in promiscuous mode and degraded performance is a concern.
+	(c) If the slave device is to be put into the hostile / untrusted network
+namespace where L2 on the slave could be changed / misused.
+
+
+6. Example configuration:
+
+  +=============================================================+
+  |  Host: host1                                                |
+  |                                                             |
+  |   +----------------------+      +----------------------+    |
+  |   |   NS:ns0             |      |  NS:ns1              |    |
+  |   |                      |      |                      |    |
+  |   |                      |      |                      |    |
+  |   |        ipvl0         |      |         ipvl1        |    |
+  |   +----------#-----------+      +-----------#----------+    |
+  |              #                              #               |
+  |              ################################               |
+  |                              # eth0                         |
+  +==============================#==============================+
+
+
+	(a) Create two network namespaces - ns0, ns1
+		ip netns add ns0
+		ip netns add ns1
+
+	(b) Create two ipvlan slaves on eth0 (master device)
+		ip link add link eth0 ipvl0 type ipvlan mode l2
+		ip link add link eth0 ipvl1 type ipvlan mode l2
+
+	(c) Assign slaves to the respective network namespaces
+		ip link set dev ipvl0 netns ns0
+		ip link set dev ipvl1 netns ns1
+
+	(d) Now switch to the namespace (ns0 or ns1) to configure the slave devices
+		- For ns0
+			(1) ip netns exec ns0 bash
+			(2) ip link set dev ipvl0 up
+			(3) ip link set dev lo up
+			(4) ip -4 addr add 127.0.0.1 dev lo
+			(5) ip -4 addr add $IPADDR dev ipvl0
+			(6) ip -4 route add default via $ROUTER dev ipvl0
+		- For ns1
+			(1) ip netns exec ns1 bash
+			(2) ip link set dev ipvl1 up
+			(3) ip link set dev lo up
+			(4) ip -4 addr add 127.0.0.1 dev lo
+			(5) ip -4 addr add $IPADDR dev ipvl1
+			(6) ip -4 route add default via $ROUTER dev ipvl1
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
new file mode 100644
index 000000000..fc531c29a
--- /dev/null
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -0,0 +1,293 @@
+/proc/sys/net/ipv4/vs/* Variables:
+
+am_droprate - INTEGER
+        default 10
+
+        It sets the always mode drop rate, which is used in the mode 3
+        of the drop_rate defense.
+
+amemthresh - INTEGER
+        default 1024
+
+        It sets the available memory threshold (in pages), which is
+        used in the automatic modes of defense. When there is no
+        enough available memory, the respective strategy will be
+        enabled and the variable is automatically set to 2, otherwise
+        the strategy is disabled and the variable is  set  to 1.
+
+backup_only - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	If set, disable the director function while the server is
+	in backup mode to avoid packet loops for DR/TUN methods.
+
+conn_reuse_mode - INTEGER
+	1 - default
+
+	Controls how ipvs will deal with connections that are detected
+	port reuse. It is a bitmap, with the values being:
+
+	0: disable any special handling on port reuse. The new
+	connection will be delivered to the same real server that was
+	servicing the previous connection.
+
+	bit 1: enable rescheduling of new connections when it is safe.
+	That is, whenever expire_nodest_conn and for TCP sockets, when
+	the connection is in TIME_WAIT state (which is only possible if
+	you use NAT mode).
+
+	bit 2: it is bit 1 plus, for TCP connections, when connections
+	are in FIN_WAIT state, as this is the last state seen by load
+	balancer in Direct Routing mode. This bit helps on adding new
+	real servers to a very busy cluster.
+
+conntrack - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	If set, maintain connection tracking entries for
+	connections handled by IPVS.
+
+	This should be enabled if connections handled by IPVS are to be
+	also handled by stateful firewall rules. That is, iptables rules
+	that make use of connection tracking.  It is a performance
+	optimisation to disable this setting otherwise.
+
+	Connections handled by the IPVS FTP application module
+	will have connection tracking entries regardless of this setting.
+
+	Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
+
+cache_bypass - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        If it is enabled, forward packets to the original destination
+        directly when no cache server is available and destination
+        address is not local (iph->daddr is RTN_UNICAST). It is mostly
+        used in transparent web cache cluster.
+
+debug_level - INTEGER
+	0          - transmission error messages (default)
+	1          - non-fatal error messages
+	2          - configuration
+	3          - destination trash
+	4          - drop entry
+	5          - service lookup
+	6          - scheduling
+	7          - connection new/expire, lookup and synchronization
+	8          - state transition
+	9          - binding destination, template checks and applications
+	10         - IPVS packet transmission
+	11         - IPVS packet handling (ip_vs_in/ip_vs_out)
+	12 or more - packet traversal
+
+	Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
+
+	Higher debugging levels include the messages for lower debugging
+	levels, so setting debug level 2, includes level 0, 1 and 2
+	messages. Thus, logging becomes more and more verbose the higher
+	the level.
+
+drop_entry - INTEGER
+        0  - disabled (default)
+
+        The drop_entry defense is to randomly drop entries in the
+        connection hash table, just in order to collect back some
+        memory for new connections. In the current code, the
+        drop_entry procedure can be activated every second, then it
+        randomly scans 1/32 of the whole and drops entries that are in
+        the SYN-RECV/SYNACK state, which should be effective against
+        syn-flooding attack.
+
+        The valid values of drop_entry are from 0 to 3, where 0 means
+        that this strategy is always disabled, 1 and 2 mean automatic
+        modes (when there is no enough available memory, the strategy
+        is enabled and the variable is automatically set to 2,
+        otherwise the strategy is disabled and the variable is set to
+        1), and 3 means that that the strategy is always enabled.
+
+drop_packet - INTEGER
+        0  - disabled (default)
+
+        The drop_packet defense is designed to drop 1/rate packets
+        before forwarding them to real servers. If the rate is 1, then
+        drop all the incoming packets.
+
+        The value definition is the same as that of the drop_entry. In
+        the automatic mode, the rate is determined by the follow
+        formula: rate = amemthresh / (amemthresh - available_memory)
+        when available memory is less than the available memory
+        threshold. When the mode 3 is set, the always mode drop rate
+        is controlled by the /proc/sys/net/ipv4/vs/am_droprate.
+
+expire_nodest_conn - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        The default value is 0, the load balancer will silently drop
+        packets when its destination server is not available. It may
+        be useful, when user-space monitoring program deletes the
+        destination server (because of server overload or wrong
+        detection) and add back the server later, and the connections
+        to the server can continue.
+
+        If this feature is enabled, the load balancer will expire the
+        connection immediately when a packet arrives and its
+        destination server is not available, then the client program
+        will be notified that the connection is closed. This is
+        equivalent to the feature some people requires to flush
+        connections when its destination is not available.
+
+expire_quiescent_template - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	When set to a non-zero value, the load balancer will expire
+	persistent templates when the destination server is quiescent.
+	This may be useful, when a user makes a destination server
+	quiescent by setting its weight to 0 and it is desired that
+	subsequent otherwise persistent connections are sent to a
+	different destination server.  By default new persistent
+	connections are allowed to quiescent destination servers.
+
+	If this feature is enabled, the load balancer will expire the
+	persistence template if it is to be used to schedule a new
+	connection and the destination server is quiescent.
+
+ignore_tunneled - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	If set, ipvs will set the ipvs_property on all packets which are of
+	unrecognized protocols.  This prevents us from routing tunneled
+	protocols like ipip, which is useful to prevent rescheduling
+	packets that have been tunneled to the ipvs host (i.e. to prevent
+	ipvs routing loops when ipvs is also acting as a real server).
+
+nat_icmp_send - BOOLEAN
+        0 - disabled (default)
+        not 0 - enabled
+
+        It controls sending icmp error messages (ICMP_DEST_UNREACH)
+        for VS/NAT when the load balancer receives packets from real
+        servers but the connection entries don't exist.
+
+pmtu_disc - BOOLEAN
+	0 - disabled
+	not 0 - enabled (default)
+
+	By default, reject with FRAG_NEEDED all DF packets that exceed
+	the PMTU, irrespective of the forwarding method. For TUN method
+	the flag can be disabled to fragment such packets.
+
+secure_tcp - INTEGER
+        0  - disabled (default)
+
+	The secure_tcp defense is to use a more complicated TCP state
+	transition table. For VS/NAT, it also delays entering the
+	TCP ESTABLISHED state until the three way handshake is completed.
+
+        The value definition is the same as that of drop_entry and
+        drop_packet.
+
+sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
+	default 3 50
+
+	It sets synchronization threshold, which is the minimum number
+	of incoming packets that a connection needs to receive before
+	the connection will be synchronized. A connection will be
+	synchronized, every time the number of its incoming packets
+	modulus sync_period equals the threshold. The range of the
+	threshold is from 0 to sync_period.
+
+	When sync_period and sync_refresh_period are 0, send sync only
+	for state changes or only once when pkts matches sync_threshold
+
+sync_refresh_period - UNSIGNED INTEGER
+	default 0
+
+	In seconds, difference in reported connection timer that triggers
+	new sync message. It can be used to avoid sync messages for the
+	specified period (or half of the connection timeout if it is lower)
+	if connection state is not changed since last sync.
+
+	This is useful for normal connections with high traffic to reduce
+	sync rate. Additionally, retry sync_retries times with period of
+	sync_refresh_period/8.
+
+sync_retries - INTEGER
+	default 0
+
+	Defines sync retries with period of sync_refresh_period/8. Useful
+	to protect against loss of sync messages. The range of the
+	sync_retries is from 0 to 3.
+
+sync_qlen_max - UNSIGNED LONG
+
+	Hard limit for queued sync messages that are not sent yet. It
+	defaults to 1/32 of the memory pages but actually represents
+	number of messages. It will protect us from allocating large
+	parts of memory when the sending rate is lower than the queuing
+	rate.
+
+sync_sock_size - INTEGER
+	default 0
+
+	Configuration of SNDBUF (master) or RCVBUF (slave) socket limit.
+	Default value is 0 (preserve system defaults).
+
+sync_ports - INTEGER
+	default 1
+
+	The number of threads that master and backup servers can use for
+	sync traffic. Every thread will use single UDP port, thread 0 will
+	use the default port 8848 while last thread will use port
+	8848+sync_ports-1.
+
+snat_reroute - BOOLEAN
+	0 - disabled
+	not 0 - enabled (default)
+
+	If enabled, recalculate the route of SNATed packets from
+	realservers so that they are routed as if they originate from the
+	director. Otherwise they are routed as if they are forwarded by the
+	director.
+
+	If policy routing is in effect then it is possible that the route
+	of a packet originating from a director is routed differently to a
+	packet being forwarded by the director.
+
+	If policy routing is not in effect then the recalculated route will
+	always be the same as the original route so it is an optimisation
+	to disable snat_reroute and avoid the recalculation.
+
+sync_persist_mode - INTEGER
+	default 0
+
+	Controls the synchronisation of connections when using persistence
+
+	0: All types of connections are synchronised
+	1: Attempt to reduce the synchronisation traffic depending on
+	the connection type. For persistent services avoid synchronisation
+	for normal connections, do it only for persistence templates.
+	In such case, for TCP and SCTP it may need enabling sloppy_tcp and
+	sloppy_sctp flags on backup servers. For non-persistent services
+	such optimization is not applied, mode 0 is assumed.
+
+sync_version - INTEGER
+	default 1
+
+	The version of the synchronisation protocol used when sending
+	synchronisation messages.
+
+	0 selects the original synchronisation protocol (version 0). This
+	should be used when sending synchronisation messages to a legacy
+	system that only understands the original synchronisation protocol.
+
+	1 selects the current synchronisation protocol (version 1). This
+	should be used where possible.
+
+	Kernels with this sync_version entry are able to receive messages
+	of both version 1 and version 2 of the synchronisation protocol.
diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt
new file mode 100644
index 000000000..09f71d719
--- /dev/null
+++ b/Documentation/networking/ixgb.txt
@@ -0,0 +1,433 @@
+Linux Base Driver for 10 Gigabit Intel(R) Ethernet Network Connection
+=====================================================================
+
+March 14, 2011
+
+
+Contents
+========
+
+- In This Release
+- Identifying Your Adapter
+- Building and Installation
+- Command Line Parameters
+- Improving Performance
+- Additional Configurations
+- Known Issues/Troubleshooting
+- Support
+
+
+
+In This Release
+===============
+
+This file describes the ixgb Linux Base Driver for the 10 Gigabit Intel(R)
+Network Connection.  This driver includes support for Itanium(R)2-based
+systems.
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your 10 Gigabit adapter.  All hardware requirements listed apply
+to use with Linux.
+
+The following features are available in this kernel:
+ - Native VLANs
+ - Channel Bonding (teaming)
+ - SNMP
+
+Channel Bonding documentation can be found in the Linux kernel source:
+/Documentation/networking/bonding.txt
+
+The driver information previously displayed in the /proc filesystem is not
+supported in this release.  Alternatively, you can use ethtool (version 1.6
+or later), lspci, and iproute2 to obtain the same information.
+
+Instructions on updating ethtool can be found in the section "Additional
+Configurations" later in this document.
+
+
+Identifying Your Adapter
+========================
+
+The following Intel network adapters are compatible with the drivers in this
+release:
+
+Controller  Adapter Name                 Physical Layer
+----------  ------------                 --------------
+82597EX     Intel(R) PRO/10GbE LR/SR/CX4 10G Base-LR (1310 nm optical fiber)
+            Server Adapters              10G Base-SR (850 nm optical fiber)
+                                         10G Base-CX4(twin-axial copper cabling)
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/network/sb/CS-012904.htm
+
+
+Building and Installation
+=========================
+
+select m for "Intel(R) PRO/10GbE support" located at:
+      Location:
+        -> Device Drivers
+          -> Network device support (NETDEVICES [=y])
+            -> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
+1. make modules && make modules_install
+
+2. Load the module:
+
+    modprobe ixgb <parameter>=<value>
+
+   The insmod command can be used if the full
+   path to the driver module is specified.  For example:
+
+     insmod /lib/modules/<KERNEL VERSION>/kernel/drivers/net/ixgb/ixgb.ko
+
+   With 2.6 based kernels also make sure that older ixgb drivers are
+   removed from the kernel, before loading the new module:
+
+     rmmod ixgb; modprobe ixgb
+
+3. Assign an IP address to the interface by entering the following, where
+   x is the interface number:
+
+     ip addr add ethx <IP_address>
+
+4. Verify that the interface works. Enter the following, where <IP_address>
+   is the IP address for another machine on the same subnet as the interface
+   that is being tested:
+
+     ping  <IP_address>
+
+
+Command Line Parameters
+=======================
+
+If the driver is built as a module, the  following optional parameters are
+used by entering them on the command line with the modprobe command using
+this syntax:
+
+     modprobe ixgb [<option>=<VAL1>,<VAL2>,...]
+
+For example, with two 10GbE PCI adapters, entering:
+
+     modprobe ixgb TxDescriptors=80,128
+
+loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
+resources for the second adapter.
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+FlowControl
+Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
+Default: Read from the EEPROM
+         If EEPROM is not detected, default is 1
+    This parameter controls the automatic generation(Tx) and response(Rx) to
+    Ethernet PAUSE frames.  There are hardware bugs associated with enabling
+    Tx flow control so beware.
+
+RxDescriptors
+Valid Range: 64-512
+Default Value: 512
+    This value is the number of receive descriptors allocated by the driver.
+    Increasing this value allows the driver to buffer more incoming packets.
+    Each descriptor is 16 bytes.  A receive buffer is also allocated for
+    each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
+    depending on the MTU setting.  When the MTU size is 1500 or less, the
+    receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
+    receive buffer size will be either 4056, 8192, or 16384 bytes.  The
+    maximum MTU size is 16114.
+
+RxIntDelay
+Valid Range: 0-65535 (0=off)
+Default Value: 72
+    This value delays the generation of receive interrupts in units of
+    0.8192 microseconds.  Receive interrupt reduction can improve CPU
+    efficiency if properly tuned for specific network traffic.  Increasing
+    this value adds extra latency to frame reception and can end up
+    decreasing the throughput of TCP traffic.  If the system is reporting
+    dropped receives, this value may be set too high, causing the driver to
+    run out of available receive descriptors.
+
+TxDescriptors
+Valid Range: 64-4096
+Default Value: 256
+    This value is the number of transmit descriptors allocated by the driver.
+    Increasing this value allows the driver to queue more transmits.  Each
+    descriptor is 16 bytes.
+
+XsumRX
+Valid Range: 0-1
+Default Value: 1
+    A value of '1' indicates that the driver should enable IP checksum
+    offload for received packets (both UDP and TCP) to the adapter hardware.
+
+
+Improving Performance
+=====================
+
+With the 10 Gigabit server adapters, the default Linux configuration will
+very likely limit the total available throughput artificially.  There is a set
+of configuration changes that, when applied together, will increase the ability
+of Linux to transmit and receive data.  The following enhancements were
+originally acquired from settings published at http://www.spec.org/web99/ for
+various submitted results using Linux.
+
+NOTE: These changes are only suggestions, and serve as a starting point for
+      tuning your network performance.
+
+The changes are made in three major ways, listed in order of greatest effect:
+- Use ip link to modify the mtu (maximum transmission unit) and the txqueuelen
+  parameter.
+- Use sysctl to modify /proc parameters (essentially kernel tuning)
+- Use setpci to modify the MMRBC field in PCI-X configuration space to increase
+  transmit burst lengths on the bus.
+
+NOTE: setpci modifies the adapter's configuration registers to allow it to read
+up to 4k bytes at a time (for transmits).  However, for some systems the
+behavior after modifying this register may be undefined (possibly errors of
+some kind).  A power-cycle, hard reset or explicitly setting the e6 register
+back to 22 (setpci -d 8086:1a48 e6.b=22) may be required to get back to a
+stable configuration.
+
+- COPY these lines and paste them into ixgb_perf.sh:
+#!/bin/bash
+echo "configuring network performance , edit this file to change the interface
+or device ID of 10GbE card"
+# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
+# replace 1a48 with appropriate 10GbE device's ID installed on the system,
+# if needed.
+setpci -d 8086:1a48 e6.b=2e
+# set the MTU (max transmission unit) - it requires your switch and clients
+# to change as well.
+# set the txqueuelen
+# your ixgb adapter should be loaded as eth1 for this to work, change if needed
+ip li set dev eth1 mtu 9000 txqueuelen 1000 up
+# call the sysctl utility to modify /proc/sys entries
+sysctl -p ./sysctl_ixgb.conf
+- END ixgb_perf.sh
+
+- COPY these lines and paste them into sysctl_ixgb.conf:
+# some of the defaults may be different for your kernel
+# call this file with sysctl -p <this file>
+# these are just suggested values that worked well to increase throughput in
+# several network benchmark tests, your mileage may vary
+
+### IPV4 specific settings
+# turn TCP timestamp support off, default 1, reduces CPU use
+net.ipv4.tcp_timestamps = 0
+# turn SACK support off, default on
+# on systems with a VERY fast bus -> memory interface this is the big gainer
+net.ipv4.tcp_sack = 0
+# set min/default/max TCP read buffer, default 4096 87380 174760
+net.ipv4.tcp_rmem = 10000000 10000000 10000000
+# set min/pressure/max TCP write buffer, default 4096 16384 131072
+net.ipv4.tcp_wmem = 10000000 10000000 10000000
+# set min/pressure/max TCP buffer space, default 31744 32256 32768
+net.ipv4.tcp_mem = 10000000 10000000 10000000
+
+### CORE settings (mostly for socket and UDP effect)
+# set maximum receive socket buffer size, default 131071
+net.core.rmem_max = 524287
+# set maximum send socket buffer size, default 131071
+net.core.wmem_max = 524287
+# set default receive socket buffer size, default 65535
+net.core.rmem_default = 524287
+# set default send socket buffer size, default 65535
+net.core.wmem_default = 524287
+# set maximum amount of option memory buffers, default 10240
+net.core.optmem_max = 524287
+# set number of unprocessed input packets before kernel starts dropping them; default 300
+net.core.netdev_max_backlog = 300000
+- END sysctl_ixgb.conf
+
+Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
+your ixgb driver is using and/or replace '1a48' with appropriate 10GbE device's
+ID installed on the system.
+
+NOTE: Unless these scripts are added to the boot process, these changes will
+      only last only until the next system reboot.
+
+
+Resolving Slow UDP Traffic
+--------------------------
+If your server does not seem to be able to receive UDP traffic as fast as it
+can receive TCP traffic, it could be because Linux, by default, does not set
+the network stack buffers as large as they need to be to support high UDP
+transfer rates.  One way to alleviate this problem is to allow more memory to
+be used by the IP stack to store incoming data.
+
+For instance, use the commands:
+    sysctl -w net.core.rmem_max=262143
+and
+    sysctl -w net.core.rmem_default=262143
+to increase the read buffer memory max and default to 262143 (256k - 1) from
+defaults of max=131071 (128k - 1) and default=65535 (64k - 1).  These variables
+will increase the amount of memory used by the network stack for receives, and
+can be increased significantly more if necessary for your application.
+
+
+Additional Configurations
+=========================
+
+  Configuring the Driver on Different Distributions
+  -------------------------------------------------
+  Configuring a network driver to load properly when the system is started is
+  distribution dependent. Typically, the configuration process involves adding
+  an alias line to /etc/modprobe.conf as well as editing other system startup
+  scripts and/or configuration files.  Many popular Linux distributions ship
+  with tools to make these changes for you.  To learn the proper way to
+  configure a network device for your system, refer to your distribution
+  documentation.  If during this process you are asked for the driver or module
+  name, the name for the Linux Base Driver for the Intel 10GbE Family of
+  Adapters is ixgb.
+
+  Viewing Link Messages
+  ---------------------
+  Link messages will not be displayed to the console if the distribution is
+  restricting system messages. In order to see network driver link messages on
+  your console, set dmesg to eight by entering the following:
+
+       dmesg -n 8
+
+  NOTE: This setting is not saved across reboots.
+
+
+  Jumbo Frames
+  ------------
+  The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
+  enabled by changing the MTU to a value larger than the default of 1500.
+  The maximum value for the MTU is 16114.  Use the ip command to
+  increase the MTU size.  For example:
+
+        ip li set dev ethx mtu 9000
+
+  The maximum MTU setting for Jumbo Frames is 16114.  This value coincides
+  with the maximum Jumbo Frames size of 16128.
+
+
+  ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information.  The ethtool
+  version 1.6 or later is required for this functionality.
+
+  The latest release of ethtool can be found from
+  https://www.kernel.org/pub/software/network/ethtool/
+
+  NOTE: The ethtool version 1.6 only supports a limited set of ethtool options.
+        Support for a more complete ethtool feature set can be enabled by
+        upgrading to the latest version.
+
+
+  NAPI
+  ----
+
+  NAPI (Rx polling mode) is supported in the ixgb driver.  NAPI is enabled
+  or disabled based on the configuration of the kernel.  see CONFIG_IXGB_NAPI
+
+  See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.
+
+
+Known Issues/Troubleshooting
+============================
+
+  NOTE: After installing the driver, if your Intel Network Connection is not
+  working, verify in the "In This Release" section of the readme that you have
+  installed the correct driver.
+
+  Intel(R) PRO/10GbE CX4 Server Adapter Cable Interoperability Issue with
+  Fujitsu XENPAK Module in SmartBits Chassis
+  ---------------------------------------------------------------------
+  Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4
+  Server adapter is connected to a Fujitsu XENPAK CX4 module in a SmartBits
+  chassis using 15 m/24AWG cable assemblies manufactured by Fujitsu or Leoni.
+  The CRC errors may be received either by the Intel(R) PRO/10GbE CX4
+  Server adapter or the SmartBits. If this situation occurs using a different
+  cable assembly may resolve the issue.
+
+  CX4 Server Adapter Cable Interoperability Issues with HP Procurve 3400cl
+  Switch Port
+  ------------------------------------------------------------------------
+  Excessive CRC errors may be observed if the Intel(R) PRO/10GbE CX4 Server
+  adapter is connected to an HP Procurve 3400cl switch port using short cables
+  (1 m or shorter). If this situation occurs, using a longer cable may resolve
+  the issue.
+
+  Excessive CRC errors may be observed using Fujitsu 24AWG cable assemblies that
+  Are 10 m or longer or where using a Leoni 15 m/24AWG cable assembly. The CRC
+  errors may be received either by the CX4 Server adapter or at the switch. If
+  this situation occurs, using a different cable assembly may resolve the issue.
+
+
+  Jumbo Frames System Requirement
+  -------------------------------
+  Memory allocation failures have been observed on Linux systems with 64 MB
+  of RAM or less that are running Jumbo Frames.  If you are using Jumbo
+  Frames, your system may require more than the advertised minimum
+  requirement of 64 MB of system memory.
+
+
+  Performance Degradation with Jumbo Frames
+  -----------------------------------------
+  Degradation in throughput performance may be observed in some Jumbo frames
+  environments.  If this is observed, increasing the application's socket buffer
+  size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values may help.
+  See the specific application manual and /usr/src/linux*/Documentation/
+  networking/ip-sysctl.txt for more details.
+
+
+  Allocating Rx Buffers when Using Jumbo Frames
+  ---------------------------------------------
+  Allocating Rx buffers when using Jumbo Frames on 2.6.x kernels may fail if
+  the available memory is heavily fragmented. This issue may be seen with PCI-X
+  adapters or with packet split disabled. This can be reduced or eliminated
+  by changing the amount of available memory for receive buffer allocation, by
+  increasing /proc/sys/vm/min_free_kbytes.
+
+
+  Multiple Interfaces on Same Ethernet Broadcast Network
+  ------------------------------------------------------
+  Due to the default ARP behavior on Linux, it is not possible to have
+  one system on two IP networks in the same Ethernet broadcast domain
+  (non-partitioned switch) behave as expected.  All Ethernet interfaces
+  will respond to IP traffic for any IP address assigned to the system.
+  This results in unbalanced receive traffic.
+
+  If you have multiple interfaces in a server, do either of the following:
+
+  - Turn on ARP filtering by entering:
+      echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
+
+  - Install the interfaces in separate broadcast domains - either in
+    different switches or in a switch partitioned to VLANs.
+
+
+  UDP Stress Test Dropped Packet Issue
+  --------------------------------------
+  Under small packets UDP stress test with 10GbE driver, the Linux system
+  may drop UDP packets due to the fullness of socket buffers. You may want
+  to change the driver's Flow Control variables to the minimum value for
+  controlling packet reception.
+
+
+  Tx Hangs Possible Under Stress
+  ------------------------------
+  Under stress conditions, if TX hangs occur, turning off TSO
+  "ethtool -K eth0 tso off" may resolve the problem.
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/ixgbe.txt b/Documentation/networking/ixgbe.txt
new file mode 100644
index 000000000..687835415
--- /dev/null
+++ b/Documentation/networking/ixgbe.txt
@@ -0,0 +1,349 @@
+Linux* Base Driver for the Intel(R) Ethernet 10 Gigabit PCI Express Family of
+Adapters
+=============================================================================
+
+Intel 10 Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Additional Configurations
+- Performance Tuning
+- Known Issues
+- Support
+
+Identifying Your Adapter
+========================
+
+The driver in this release is compatible with 82598, 82599 and X540-based
+Intel Network Connections.
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/network/sb/CS-012904.htm
+
+SFP+ Devices with Pluggable Optics
+----------------------------------
+
+82599-BASED ADAPTERS
+
+NOTES: If your 82599-based Intel(R) Network Adapter came with Intel optics, or
+is an Intel(R) Ethernet Server Adapter X520-2, then it only supports Intel
+optics and/or the direct attach cables listed below.
+
+When 82599-based SFP+ devices are connected back to back, they should be set to
+the same Speed setting via ethtool. Results may vary if you mix speed settings.
+82598-based adapters support all passive direct attach cables that comply
+with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
+cables are not supported.
+
+Supplier    Type                                             Part Numbers
+
+SR Modules
+Intel       DUAL RATE 1G/10G SFP+ SR (bailed)                FTLX8571D3BCV-IT
+Intel       DUAL RATE 1G/10G SFP+ SR (bailed)                AFBR-703SDDZ-IN1
+Intel       DUAL RATE 1G/10G SFP+ SR (bailed)                AFBR-703SDZ-IN2
+LR Modules
+Intel       DUAL RATE 1G/10G SFP+ LR (bailed)                FTLX1471D3BCV-IT
+Intel       DUAL RATE 1G/10G SFP+ LR (bailed)                AFCT-701SDDZ-IN1
+Intel       DUAL RATE 1G/10G SFP+ LR (bailed)                AFCT-701SDZ-IN2
+
+The following is a list of 3rd party SFP+ modules and direct attach cables that
+have received some testing. Not all modules are applicable to all devices.
+
+Supplier   Type                                              Part Numbers
+
+Finisar    SFP+ SR bailed, 10g single rate                   FTLX8571D3BCL
+Avago      SFP+ SR bailed, 10g single rate                   AFBR-700SDZ
+Finisar    SFP+ LR bailed, 10g single rate                   FTLX1471D3BCL
+
+Finisar    DUAL RATE 1G/10G SFP+ SR (No Bail)                FTLX8571D3QCV-IT
+Avago      DUAL RATE 1G/10G SFP+ SR (No Bail)                AFBR-703SDZ-IN1
+Finisar    DUAL RATE 1G/10G SFP+ LR (No Bail)                FTLX1471D3QCV-IT
+Avago      DUAL RATE 1G/10G SFP+ LR (No Bail)                AFCT-701SDZ-IN1
+Finistar   1000BASE-T SFP                                    FCLF8522P2BTL
+Avago      1000BASE-T SFP                                    ABCU-5710RZ
+
+82599-based adapters support all passive and active limiting direct attach
+cables that comply with SFF-8431 v4.1 and SFF-8472 v10.4 specifications.
+
+Laser turns off for SFP+ when device is down
+-------------------------------------------
+"ip link set down" turns off the laser for 82599-based SFP+ fiber adapters.
+"ip link set up" turns on the laser.
+
+
+82598-BASED ADAPTERS
+
+NOTES for 82598-Based Adapters:
+- Intel(R) Network Adapters that support removable optical modules only support
+  their original module type (i.e., the Intel(R) 10 Gigabit SR Dual Port
+  Express Module only supports SR optical modules). If you plug in a different
+  type of module, the driver will not load.
+- Hot Swapping/hot plugging optical modules is not supported.
+- Only single speed, 10 gigabit modules are supported.
+- LAN on Motherboard (LOMs) may support DA, SR, or LR modules. Other module
+  types are not supported. Please see your system documentation for details.
+
+The following is a list of 3rd party SFP+ modules and direct attach cables that
+have received some testing. Not all modules are applicable to all devices.
+
+Supplier   Type                                              Part Numbers
+
+Finisar    SFP+ SR bailed, 10g single rate                   FTLX8571D3BCL
+Avago      SFP+ SR bailed, 10g single rate                   AFBR-700SDZ
+Finisar    SFP+ LR bailed, 10g single rate                   FTLX1471D3BCL
+
+82598-based adapters support all passive direct attach cables that comply
+with SFF-8431 v4.1 and SFF-8472 v10.4 specifications. Active direct attach
+cables are not supported.
+
+
+Flow Control
+------------
+Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
+receiving and transmitting pause frames for ixgbe. When TX is enabled, PAUSE
+frames are generated when the receive packet buffer crosses a predefined
+threshold.  When rx is enabled, the transmit unit will halt for the time delay
+specified when a PAUSE frame is received.
+
+Flow Control is enabled by default. If you want to disable a flow control
+capable link partner, use ethtool:
+
+     ethtool -A eth? autoneg off RX off TX off
+
+NOTE: For 82598 backplane cards entering 1 gig mode, flow control default
+behavior is changed to off.  Flow control in 1 gig mode on these devices can
+lead to Tx hangs.
+
+Intel(R) Ethernet Flow Director
+-------------------------------
+Supports advanced filters that direct receive packets by their flows to
+different queues. Enables tight control on routing a flow in the platform.
+Matches flows and CPU cores for flow affinity. Supports multiple parameters
+for flexible flow classification and load balancing.
+
+Flow director is enabled only if the kernel is multiple TX queue capable.
+
+An included script (set_irq_affinity.sh) automates setting the IRQ to CPU
+affinity.
+
+You can verify that the driver is using Flow Director by looking at the counter
+in ethtool: fdir_miss and fdir_match.
+
+Other ethtool Commands:
+To enable Flow Director
+	ethtool -K ethX ntuple on
+To add a filter
+	Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 10.0.128.23
+        action 1
+To see the list of filters currently present:
+	ethtool -u ethX
+
+Perfect Filter: Perfect filter is an interface to load the filter table that
+funnels all flow into queue_0 unless an alternative queue is specified using
+"action". In that case, any flow that matches the filter criteria will be
+directed to the appropriate queue.
+
+If the queue is defined as -1, filter will drop matching packets.
+
+To account for filter matches and misses, there are two stats in ethtool:
+fdir_match and fdir_miss. In addition, rx_queue_N_packets shows the number of
+packets processed by the Nth queue.
+
+NOTE: Receive Packet Steering (RPS) and Receive Flow Steering (RFS) are not
+compatible with Flow Director. IF Flow Director is enabled, these will be
+disabled.
+
+The following three parameters impact Flow Director.
+
+FdirMode
+--------
+Valid Range: 0-2 (0=off, 1=ATR, 2=Perfect filter mode)
+Default Value: 1
+
+  Flow Director filtering modes.
+
+FdirPballoc
+-----------
+Valid Range: 0-2 (0=64k, 1=128k, 2=256k)
+Default Value: 0
+
+  Flow Director allocated packet buffer size.
+
+AtrSampleRate
+--------------
+Valid Range: 1-100
+Default Value: 20
+
+  Software ATR Tx packet sample rate. For example, when set to 20, every 20th
+  packet, looks to see if the packet will create a new flow.
+
+Node
+----
+Valid Range:   0-n
+Default Value: 1 (off)
+
+  0 - n: where n is the number of NUMA nodes (i.e. 0 - 3) currently online in
+  your system
+  1: turns this option off
+
+  The Node parameter will allow you to pick which NUMA node you want to have
+  the adapter allocate memory on.
+
+max_vfs
+-------
+Valid Range:   1-63
+Default Value: 0
+
+  If the value is greater than 0 it will also force the VMDq parameter to be 1
+  or more.
+
+  This parameter adds support for SR-IOV.  It causes the driver to spawn up to
+  max_vfs worth of virtual function.
+
+
+Additional Configurations
+=========================
+
+  Jumbo Frames
+  ------------
+  The driver supports Jumbo Frames for all adapters. Jumbo Frames support is
+  enabled by changing the MTU to a value larger than the default of 1500.
+  The maximum value for the MTU is 16110.  Use the ip command to
+  increase the MTU size.  For example:
+
+        ip link set dev ethx mtu 9000
+
+  The maximum MTU setting for Jumbo Frames is 9710.  This value coincides
+  with the maximum Jumbo Frames size of 9728.
+
+  Generic Receive Offload, aka GRO
+  --------------------------------
+  The driver supports the in-kernel software implementation of GRO.  GRO has
+  shown that by coalescing Rx traffic into larger chunks of data, CPU
+  utilization can be significantly reduced when under large Rx load.  GRO is an
+  evolution of the previously-used LRO interface.  GRO is able to coalesce
+  other protocols besides TCP.  It's also safe to use with configurations that
+  are problematic for LRO, namely bridging and iSCSI.
+
+  Data Center Bridging, aka DCB
+  -----------------------------
+  DCB is a configuration Quality of Service implementation in hardware.
+  It uses the VLAN priority tag (802.1p) to filter traffic.  That means
+  that there are 8 different priorities that traffic can be filtered into.
+  It also enables priority flow control which can limit or eliminate the
+  number of dropped packets during network stress.  Bandwidth can be
+  allocated to each of these priorities, which is enforced at the hardware
+  level.
+
+  To enable DCB support in ixgbe, you must enable the DCB netlink layer to
+  allow the userspace tools (see below) to communicate with the driver.
+  This can be found in the kernel configuration here:
+
+        -> Networking support
+          -> Networking options
+            -> Data Center Bridging support
+
+  Once this is selected, DCB support must be selected for ixgbe.  This can
+  be found here:
+
+        -> Device Drivers
+          -> Network device support (NETDEVICES [=y])
+            -> Ethernet (10000 Mbit) (NETDEV_10000 [=y])
+              -> Intel(R) 10GbE PCI Express adapters support
+                -> Data Center Bridging (DCB) Support
+
+  After these options are selected, you must rebuild your kernel and your
+  modules.
+
+  In order to use DCB, userspace tools must be downloaded and installed.
+  The dcbd tools can be found at:
+
+        http://e1000.sf.net
+
+  Ethtool
+  -------
+  The driver utilizes the ethtool interface for driver configuration and
+  diagnostics, as well as displaying statistical information. The latest
+  ethtool version is required for this functionality.
+
+  The latest release of ethtool can be found from
+  https://www.kernel.org/pub/software/network/ethtool/
+
+  FCoE
+  ----
+  This release of the ixgbe driver contains new code to enable users to use
+  Fiber Channel over Ethernet (FCoE) and Data Center Bridging (DCB)
+  functionality that is supported by the 82598-based hardware.  This code has
+  no default effect on the regular driver operation, and configuring DCB and
+  FCoE is outside the scope of this driver README. Refer to
+  http://www.open-fcoe.org/ for FCoE project information and contact
+  e1000-eedc@lists.sourceforge.net for DCB information.
+
+  MAC and VLAN anti-spoofing feature
+  ----------------------------------
+  When a malicious driver attempts to send a spoofed packet, it is dropped by
+  the hardware and not transmitted.  An interrupt is sent to the PF driver
+  notifying it of the spoof attempt.
+
+  When a spoofed packet is detected the PF driver will send the following
+  message to the system log (displayed by  the "dmesg" command):
+
+  Spoof event(s) detected on VF (n)
+
+  Where n=the VF that attempted to do the spoofing.
+
+
+Performance Tuning
+==================
+
+An excellent article on performance tuning can be found at:
+
+http://www.redhat.com/promo/summit/2008/downloads/pdf/Thursday/Mark_Wagner.pdf
+
+
+Known Issues
+============
+
+  Enabling SR-IOV in a 32-bit or 64-bit Microsoft* Windows* Server 2008/R2
+  Guest OS using Intel (R) 82576-based GbE or Intel (R) 82599-based 10GbE
+  controller under KVM
+  ------------------------------------------------------------------------
+  KVM Hypervisor/VMM supports direct assignment of a PCIe device to a VM.  This
+  includes traditional PCIe devices, as well as SR-IOV-capable devices using
+  Intel 82576-based and 82599-based controllers.
+
+  While direct assignment of a PCIe device or an SR-IOV Virtual Function (VF)
+  to a Linux-based VM running 2.6.32 or later kernel works fine, there is a
+  known issue with Microsoft Windows Server 2008 VM that results in a "yellow
+  bang" error. This problem is within the KVM VMM itself, not the Intel driver,
+  or the SR-IOV logic of the VMM, but rather that KVM emulates an older CPU
+  model for the guests, and this older CPU model does not support MSI-X
+  interrupts, which is a requirement for Intel SR-IOV.
+
+  If you wish to use the Intel 82576 or 82599-based controllers in SR-IOV mode
+  with KVM and a Microsoft Windows Server 2008 guest try the following
+  workaround. The workaround is to tell KVM to emulate a different model of CPU
+  when using qemu to create the KVM guest:
+
+       "-cpu qemu64,model=13"
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://e1000.sourceforge.net
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/ixgbevf.txt b/Documentation/networking/ixgbevf.txt
new file mode 100644
index 000000000..53d8d2a5a
--- /dev/null
+++ b/Documentation/networking/ixgbevf.txt
@@ -0,0 +1,52 @@
+Linux* Base Driver for Intel(R) Ethernet Network Connection
+===========================================================
+
+Intel Gigabit Linux driver.
+Copyright(c) 1999 - 2013 Intel Corporation.
+
+Contents
+========
+
+- Identifying Your Adapter
+- Known Issues/Troubleshooting
+- Support
+
+This file describes the ixgbevf Linux* Base Driver for Intel Network
+Connection.
+
+The ixgbevf driver supports 82599-based virtual function devices that can only
+be activated on kernels with CONFIG_PCI_IOV enabled.
+
+The ixgbevf driver supports virtual functions generated by the ixgbe driver
+with a max_vfs value of 1 or greater.
+
+The guest OS loading the ixgbevf driver must support MSI-X interrupts.
+
+VLANs: There is a limit of a total of 32 shared VLANs to 1 or more VFs.
+
+Identifying Your Adapter
+========================
+
+For more information on how to identify your adapter, go to the Adapter &
+Driver ID Guide at:
+
+    http://support.intel.com/support/go/network/adapter/idguide.htm
+
+Known Issues/Troubleshooting
+============================
+
+
+Support
+=======
+
+For general information, go to the Intel support website at:
+
+    http://support.intel.com
+
+or the Intel Wired Networking project hosted by Sourceforge at:
+
+    http://sourceforge.net/projects/e1000
+
+If an issue is identified with the released source code on the supported
+kernel with a supported adapter, email the specific information related
+to the issue to e1000-devel@lists.sf.net
diff --git a/Documentation/networking/kapi.rst b/Documentation/networking/kapi.rst
new file mode 100644
index 000000000..f03ae64be
--- /dev/null
+++ b/Documentation/networking/kapi.rst
@@ -0,0 +1,171 @@
+=========================================
+Linux Networking and Network Devices APIs
+=========================================
+
+Linux Networking
+================
+
+Networking Base Types
+---------------------
+
+.. kernel-doc:: include/linux/net.h
+   :internal:
+
+Socket Buffer Functions
+-----------------------
+
+.. kernel-doc:: include/linux/skbuff.h
+   :internal:
+
+.. kernel-doc:: include/net/sock.h
+   :internal:
+
+.. kernel-doc:: net/socket.c
+   :export:
+
+.. kernel-doc:: net/core/skbuff.c
+   :export:
+
+.. kernel-doc:: net/core/sock.c
+   :export:
+
+.. kernel-doc:: net/core/datagram.c
+   :export:
+
+.. kernel-doc:: net/core/stream.c
+   :export:
+
+Socket Filter
+-------------
+
+.. kernel-doc:: net/core/filter.c
+   :export:
+
+Generic Network Statistics
+--------------------------
+
+.. kernel-doc:: include/uapi/linux/gen_stats.h
+   :internal:
+
+.. kernel-doc:: net/core/gen_stats.c
+   :export:
+
+.. kernel-doc:: net/core/gen_estimator.c
+   :export:
+
+SUN RPC subsystem
+-----------------
+
+.. kernel-doc:: net/sunrpc/xdr.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/svc_xprt.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/xprt.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/sched.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/socklib.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/stats.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/rpc_pipe.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/rpcb_clnt.c
+   :export:
+
+.. kernel-doc:: net/sunrpc/clnt.c
+   :export:
+
+WiMAX
+-----
+
+.. kernel-doc:: net/wimax/op-msg.c
+   :export:
+
+.. kernel-doc:: net/wimax/op-reset.c
+   :export:
+
+.. kernel-doc:: net/wimax/op-rfkill.c
+   :export:
+
+.. kernel-doc:: net/wimax/stack.c
+   :export:
+
+.. kernel-doc:: include/net/wimax.h
+   :internal:
+
+.. kernel-doc:: include/uapi/linux/wimax.h
+   :internal:
+
+Network device support
+======================
+
+Driver Support
+--------------
+
+.. kernel-doc:: net/core/dev.c
+   :export:
+
+.. kernel-doc:: net/ethernet/eth.c
+   :export:
+
+.. kernel-doc:: net/sched/sch_generic.c
+   :export:
+
+.. kernel-doc:: include/linux/etherdevice.h
+   :internal:
+
+.. kernel-doc:: include/linux/netdevice.h
+   :internal:
+
+PHY Support
+-----------
+
+.. kernel-doc:: drivers/net/phy/phy.c
+   :export:
+
+.. kernel-doc:: drivers/net/phy/phy.c
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/phy_device.c
+   :export:
+
+.. kernel-doc:: drivers/net/phy/phy_device.c
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/mdio_bus.c
+   :export:
+
+.. kernel-doc:: drivers/net/phy/mdio_bus.c
+   :internal:
+
+PHYLINK
+-------
+
+  PHYLINK interfaces traditional network drivers with PHYLIB, fixed-links,
+  and SFF modules (eg, hot-pluggable SFP) that may contain PHYs.  PHYLINK
+  provides management of the link state and link modes.
+
+.. kernel-doc:: include/linux/phylink.h
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/phylink.c
+
+SFP support
+-----------
+
+.. kernel-doc:: drivers/net/phy/sfp-bus.c
+   :internal:
+
+.. kernel-doc:: include/linux/sfp.h
+   :internal:
+
+.. kernel-doc:: drivers/net/phy/sfp-bus.c
+   :export:
diff --git a/Documentation/networking/kcm.txt b/Documentation/networking/kcm.txt
new file mode 100644
index 000000000..b773a5278
--- /dev/null
+++ b/Documentation/networking/kcm.txt
@@ -0,0 +1,285 @@
+Kernel Connection Multiplexor
+-----------------------------
+
+Kernel Connection Multiplexor (KCM) is a mechanism that provides a message based
+interface over TCP for generic application protocols. With KCM an application
+can efficiently send and receive application protocol messages over TCP using
+datagram sockets.
+
+KCM implements an NxM multiplexor in the kernel as diagrammed below:
+
++------------+   +------------+   +------------+   +------------+
+| KCM socket |   | KCM socket |   | KCM socket |   | KCM socket |
++------------+   +------------+   +------------+   +------------+
+      |                 |               |                |
+      +-----------+     |               |     +----------+
+                  |     |               |     |
+               +----------------------------------+
+               |           Multiplexor            |
+               +----------------------------------+
+                 |   |           |           |  |
+       +---------+   |           |           |  ------------+
+       |             |           |           |              |
++----------+  +----------+  +----------+  +----------+ +----------+
+|  Psock   |  |  Psock   |  |  Psock   |  |  Psock   | |  Psock   |
++----------+  +----------+  +----------+  +----------+ +----------+
+      |              |           |            |             |
++----------+  +----------+  +----------+  +----------+ +----------+
+| TCP sock |  | TCP sock |  | TCP sock |  | TCP sock | | TCP sock |
++----------+  +----------+  +----------+  +----------+ +----------+
+
+KCM sockets
+-----------
+
+The KCM sockets provide the user interface to the multiplexor. All the KCM sockets
+bound to a multiplexor are considered to have equivalent function, and I/O
+operations in different sockets may be done in parallel without the need for
+synchronization between threads in userspace.
+
+Multiplexor
+-----------
+
+The multiplexor provides the message steering. In the transmit path, messages
+written on a KCM socket are sent atomically on an appropriate TCP socket.
+Similarly, in the receive path, messages are constructed on each TCP socket
+(Psock) and complete messages are steered to a KCM socket.
+
+TCP sockets & Psocks
+--------------------
+
+TCP sockets may be bound to a KCM multiplexor. A Psock structure is allocated
+for each bound TCP socket, this structure holds the state for constructing
+messages on receive as well as other connection specific information for KCM.
+
+Connected mode semantics
+------------------------
+
+Each multiplexor assumes that all attached TCP connections are to the same
+destination and can use the different connections for load balancing when
+transmitting. The normal send and recv calls (include sendmmsg and recvmmsg)
+can be used to send and receive messages from the KCM socket.
+
+Socket types
+------------
+
+KCM supports SOCK_DGRAM and SOCK_SEQPACKET socket types.
+
+Message delineation
+-------------------
+
+Messages are sent over a TCP stream with some application protocol message
+format that typically includes a header which frames the messages. The length
+of a received message can be deduced from the application protocol header
+(often just a simple length field).
+
+A TCP stream must be parsed to determine message boundaries. Berkeley Packet
+Filter (BPF) is used for this. When attaching a TCP socket to a multiplexor a
+BPF program must be specified. The program is called at the start of receiving
+a new message and is given an skbuff that contains the bytes received so far.
+It parses the message header and returns the length of the message. Given this
+information, KCM will construct the message of the stated length and deliver it
+to a KCM socket.
+
+TCP socket management
+---------------------
+
+When a TCP socket is attached to a KCM multiplexor data ready (POLLIN) and
+write space available (POLLOUT) events are handled by the multiplexor. If there
+is a state change (disconnection) or other error on a TCP socket, an error is
+posted on the TCP socket so that a POLLERR event happens and KCM discontinues
+using the socket. When the application gets the error notification for a
+TCP socket, it should unattach the socket from KCM and then handle the error
+condition (the typical response is to close the socket and create a new
+connection if necessary).
+
+KCM limits the maximum receive message size to be the size of the receive
+socket buffer on the attached TCP socket (the socket buffer size can be set by
+SO_RCVBUF). If the length of a new message reported by the BPF program is
+greater than this limit a corresponding error (EMSGSIZE) is posted on the TCP
+socket. The BPF program may also enforce a maximum messages size and report an
+error when it is exceeded.
+
+A timeout may be set for assembling messages on a receive socket. The timeout
+value is taken from the receive timeout of the attached TCP socket (this is set
+by SO_RCVTIMEO). If the timer expires before assembly is complete an error
+(ETIMEDOUT) is posted on the socket.
+
+User interface
+==============
+
+Creating a multiplexor
+----------------------
+
+A new multiplexor and initial KCM socket is created by a socket call:
+
+  socket(AF_KCM, type, protocol)
+
+  - type is either SOCK_DGRAM or SOCK_SEQPACKET
+  - protocol is KCMPROTO_CONNECTED
+
+Cloning KCM sockets
+-------------------
+
+After the first KCM socket is created using the socket call as described
+above, additional sockets for the multiplexor can be created by cloning
+a KCM socket. This is accomplished by an ioctl on a KCM socket:
+
+  /* From linux/kcm.h */
+  struct kcm_clone {
+        int fd;
+  };
+
+  struct kcm_clone info;
+
+  memset(&info, 0, sizeof(info));
+
+  err = ioctl(kcmfd, SIOCKCMCLONE, &info);
+
+  if (!err)
+    newkcmfd = info.fd;
+
+Attach transport sockets
+------------------------
+
+Attaching of transport sockets to a multiplexor is performed by calling an
+ioctl on a KCM socket for the multiplexor. e.g.:
+
+  /* From linux/kcm.h */
+  struct kcm_attach {
+        int fd;
+	int bpf_fd;
+  };
+
+  struct kcm_attach info;
+
+  memset(&info, 0, sizeof(info));
+
+  info.fd = tcpfd;
+  info.bpf_fd = bpf_prog_fd;
+
+  ioctl(kcmfd, SIOCKCMATTACH, &info);
+
+The kcm_attach structure contains:
+  fd: file descriptor for TCP socket being attached
+  bpf_prog_fd: file descriptor for compiled BPF program downloaded
+
+Unattach transport sockets
+--------------------------
+
+Unattaching a transport socket from a multiplexor is straightforward. An
+"unattach" ioctl is done with the kcm_unattach structure as the argument:
+
+  /* From linux/kcm.h */
+  struct kcm_unattach {
+        int fd;
+  };
+
+  struct kcm_unattach info;
+
+  memset(&info, 0, sizeof(info));
+
+  info.fd = cfd;
+
+  ioctl(fd, SIOCKCMUNATTACH, &info);
+
+Disabling receive on KCM socket
+-------------------------------
+
+A setsockopt is used to disable or enable receiving on a KCM socket.
+When receive is disabled, any pending messages in the socket's
+receive buffer are moved to other sockets. This feature is useful
+if an application thread knows that it will be doing a lot of
+work on a request and won't be able to service new messages for a
+while. Example use:
+
+  int val = 1;
+
+  setsockopt(kcmfd, SOL_KCM, KCM_RECV_DISABLE, &val, sizeof(val))
+
+BFP programs for message delineation
+------------------------------------
+
+BPF programs can be compiled using the BPF LLVM backend. For example,
+the BPF program for parsing Thrift is:
+
+  #include "bpf.h" /* for __sk_buff */
+  #include "bpf_helpers.h" /* for load_word intrinsic */
+
+  SEC("socket_kcm")
+  int bpf_prog1(struct __sk_buff *skb)
+  {
+       return load_word(skb, 0) + 4;
+  }
+
+  char _license[] SEC("license") = "GPL";
+
+Use in applications
+===================
+
+KCM accelerates application layer protocols. Specifically, it allows
+applications to use a message based interface for sending and receiving
+messages. The kernel provides necessary assurances that messages are sent
+and received atomically. This relieves much of the burden applications have
+in mapping a message based protocol onto the TCP stream. KCM also make
+application layer messages a unit of work in the kernel for the purposes of
+steering and scheduling, which in turn allows a simpler networking model in
+multithreaded applications.
+
+Configurations
+--------------
+
+In an Nx1 configuration, KCM logically provides multiple socket handles
+to the same TCP connection. This allows parallelism between in I/O
+operations on the TCP socket (for instance copyin and copyout of data is
+parallelized). In an application, a KCM socket can be opened for each
+processing thread and inserted into the epoll (similar to how SO_REUSEPORT
+is used to allow multiple listener sockets on the same port).
+
+In a MxN configuration, multiple connections are established to the
+same destination. These are used for simple load balancing.
+
+Message batching
+----------------
+
+The primary purpose of KCM is load balancing between KCM sockets and hence
+threads in a nominal use case. Perfect load balancing, that is steering
+each received message to a different KCM socket or steering each sent
+message to a different TCP socket, can negatively impact performance
+since this doesn't allow for affinities to be established. Balancing
+based on groups, or batches of messages, can be beneficial for performance.
+
+On transmit, there are three ways an application can batch (pipeline)
+messages on a KCM socket.
+  1) Send multiple messages in a single sendmmsg.
+  2) Send a group of messages each with a sendmsg call, where all messages
+     except the last have MSG_BATCH in the flags of sendmsg call.
+  3) Create "super message" composed of multiple messages and send this
+     with a single sendmsg.
+
+On receive, the KCM module attempts to queue messages received on the
+same KCM socket during each TCP ready callback. The targeted KCM socket
+changes at each receive ready callback on the KCM socket. The application
+does not need to configure this.
+
+Error handling
+--------------
+
+An application should include a thread to monitor errors raised on
+the TCP connection. Normally, this will be done by placing each
+TCP socket attached to a KCM multiplexor in epoll set for POLLERR
+event. If an error occurs on an attached TCP socket, KCM sets an EPIPE
+on the socket thus waking up the application thread. When the application
+sees the error (which may just be a disconnect) it should unattach the
+socket from KCM and then close it. It is assumed that once an error is
+posted on the TCP socket the data stream is unrecoverable (i.e. an error
+may have occurred in the middle of receiving a message).
+
+TCP connection monitoring
+-------------------------
+
+In KCM there is no means to correlate a message to the TCP socket that
+was used to send or receive the message (except in the case there is
+only one attached TCP socket). However, the application does retain
+an open file descriptor to the socket so it will be able to get statistics
+from the socket which can be used in detecting issues (such as high
+retransmissions on the socket).
diff --git a/Documentation/networking/l2tp.txt b/Documentation/networking/l2tp.txt
new file mode 100644
index 000000000..9bc271cdc
--- /dev/null
+++ b/Documentation/networking/l2tp.txt
@@ -0,0 +1,345 @@
+This document describes how to use the kernel's L2TP drivers to
+provide L2TP functionality. L2TP is a protocol that tunnels one or
+more sessions over an IP tunnel. It is commonly used for VPNs
+(L2TP/IPSec) and by ISPs to tunnel subscriber PPP sessions over an IP
+network infrastructure. With L2TPv3, it is also useful as a Layer-2
+tunneling infrastructure.
+
+Features
+========
+
+L2TPv2 (PPP over L2TP (UDP tunnels)).
+L2TPv3 ethernet pseudowires.
+L2TPv3 PPP pseudowires.
+L2TPv3 IP encapsulation.
+Netlink sockets for L2TPv3 configuration management.
+
+History
+=======
+
+The original pppol2tp driver was introduced in 2.6.23 and provided
+L2TPv2 functionality (rfc2661). L2TPv2 is used to tunnel one or more PPP
+sessions over a UDP tunnel.
+
+L2TPv3 (rfc3931) changes the protocol to allow different frame types
+to be passed over an L2TP tunnel by moving the PPP-specific parts of
+the protocol out of the core L2TP packet headers. Each frame type is
+known as a pseudowire type. Ethernet, PPP, HDLC, Frame Relay and ATM
+pseudowires for L2TP are defined in separate RFC standards. Another
+change for L2TPv3 is that it can be carried directly over IP with no
+UDP header (UDP is optional). It is also possible to create static
+unmanaged L2TPv3 tunnels manually without a control protocol
+(userspace daemon) to manage them.
+
+To support L2TPv3, the original pppol2tp driver was split up to
+separate the L2TP and PPP functionality. Existing L2TPv2 userspace
+apps should be unaffected as the original pppol2tp sockets API is
+retained. L2TPv3, however, uses netlink to manage L2TPv3 tunnels and
+sessions.
+
+Design
+======
+
+The L2TP protocol separates control and data frames.  The L2TP kernel
+drivers handle only L2TP data frames; control frames are always
+handled by userspace. L2TP control frames carry messages between L2TP
+clients/servers and are used to setup / teardown tunnels and
+sessions. An L2TP client or server is implemented in userspace.
+
+Each L2TP tunnel is implemented using a UDP or L2TPIP socket; L2TPIP
+provides L2TPv3 IP encapsulation (no UDP) and is implemented using a
+new l2tpip socket family. The tunnel socket is typically created by
+userspace, though for unmanaged L2TPv3 tunnels, the socket can also be
+created by the kernel. Each L2TP session (pseudowire) gets a network
+interface instance. In the case of PPP, these interfaces are created
+indirectly by pppd using a pppol2tp socket. In the case of ethernet,
+the netdevice is created upon a netlink request to create an L2TPv3
+ethernet pseudowire.
+
+For PPP, the PPPoL2TP driver, net/l2tp/l2tp_ppp.c, provides a
+mechanism by which PPP frames carried through an L2TP session are
+passed through the kernel's PPP subsystem. The standard PPP daemon,
+pppd, handles all PPP interaction with the peer. PPP network
+interfaces are created for each local PPP endpoint. The kernel's PPP
+subsystem arranges for PPP control frames to be delivered to pppd,
+while data frames are forwarded as usual.
+
+For ethernet, the L2TPETH driver, net/l2tp/l2tp_eth.c, implements a
+netdevice driver, managing virtual ethernet devices, one per
+pseudowire. These interfaces can be managed using standard Linux tools
+such as "ip" and "ifconfig". If only IP frames are passed over the
+tunnel, the interface can be given an IP addresses of itself and its
+peer. If non-IP frames are to be passed over the tunnel, the interface
+can be added to a bridge using brctl. All L2TP datapath protocol
+functions are handled by the L2TP core driver.
+
+Each tunnel and session within a tunnel is assigned a unique tunnel_id
+and session_id. These ids are carried in the L2TP header of every
+control and data packet. (Actually, in L2TPv3, the tunnel_id isn't
+present in data frames - it is inferred from the IP connection on
+which the packet was received.) The L2TP driver uses the ids to lookup
+internal tunnel and/or session contexts to determine how to handle the
+packet. Zero tunnel / session ids are treated specially - zero ids are
+never assigned to tunnels or sessions in the network. In the driver,
+the tunnel context keeps a reference to the tunnel UDP or L2TPIP
+socket. The session context holds data that lets the driver interface
+to the kernel's network frame type subsystems, i.e. PPP, ethernet.
+
+Userspace Programming
+=====================
+
+For L2TPv2, there are a number of requirements on the userspace L2TP
+daemon in order to use the pppol2tp driver.
+
+1. Use a UDP socket per tunnel.
+
+2. Create a single PPPoL2TP socket per tunnel bound to a special null
+   session id. This is used only for communicating with the driver but
+   must remain open while the tunnel is active. Opening this tunnel
+   management socket causes the driver to mark the tunnel socket as an
+   L2TP UDP encapsulation socket and flags it for use by the
+   referenced tunnel id. This hooks up the UDP receive path via
+   udp_encap_rcv() in net/ipv4/udp.c. PPP data frames are never passed
+   in this special PPPoX socket.
+
+3. Create a PPPoL2TP socket per L2TP session. This is typically done
+   by starting pppd with the pppol2tp plugin and appropriate
+   arguments. A PPPoL2TP tunnel management socket (Step 2) must be
+   created before the first PPPoL2TP session socket is created.
+
+When creating PPPoL2TP sockets, the application provides information
+to the driver about the socket in a socket connect() call. Source and
+destination tunnel and session ids are provided, as well as the file
+descriptor of a UDP socket. See struct pppol2tp_addr in
+include/linux/if_pppol2tp.h. Note that zero tunnel / session ids are
+treated specially. When creating the per-tunnel PPPoL2TP management
+socket in Step 2 above, zero source and destination session ids are
+specified, which tells the driver to prepare the supplied UDP file
+descriptor for use as an L2TP tunnel socket.
+
+Userspace may control behavior of the tunnel or session using
+setsockopt and ioctl on the PPPoX socket. The following socket
+options are supported:-
+
+DEBUG     - bitmask of debug message categories. See below.
+SENDSEQ   - 0 => don't send packets with sequence numbers
+            1 => send packets with sequence numbers
+RECVSEQ   - 0 => receive packet sequence numbers are optional
+            1 => drop receive packets without sequence numbers
+LNSMODE   - 0 => act as LAC.
+            1 => act as LNS.
+REORDERTO - reorder timeout (in millisecs). If 0, don't try to reorder.
+
+Only the DEBUG option is supported by the special tunnel management
+PPPoX socket.
+
+In addition to the standard PPP ioctls, a PPPIOCGL2TPSTATS is provided
+to retrieve tunnel and session statistics from the kernel using the
+PPPoX socket of the appropriate tunnel or session.
+
+For L2TPv3, userspace must use the netlink API defined in
+include/linux/l2tp.h to manage tunnel and session contexts. The
+general procedure to create a new L2TP tunnel with one session is:-
+
+1. Open a GENL socket using L2TP_GENL_NAME for configuring the kernel
+   using netlink.
+
+2. Create a UDP or L2TPIP socket for the tunnel.
+
+3. Create a new L2TP tunnel using a L2TP_CMD_TUNNEL_CREATE
+   request. Set attributes according to desired tunnel parameters,
+   referencing the UDP or L2TPIP socket created in the previous step.
+
+4. Create a new L2TP session in the tunnel using a
+   L2TP_CMD_SESSION_CREATE request.
+
+The tunnel and all of its sessions are closed when the tunnel socket
+is closed. The netlink API may also be used to delete sessions and
+tunnels. Configuration and status info may be set or read using netlink.
+
+The L2TP driver also supports static (unmanaged) L2TPv3 tunnels. These
+are where there is no L2TP control message exchange with the peer to
+setup the tunnel; the tunnel is configured manually at each end of the
+tunnel. There is no need for an L2TP userspace application in this
+case -- the tunnel socket is created by the kernel and configured
+using parameters sent in the L2TP_CMD_TUNNEL_CREATE netlink
+request. The "ip" utility of iproute2 has commands for managing static
+L2TPv3 tunnels; do "ip l2tp help" for more information.
+
+Debugging
+=========
+
+The driver supports a flexible debug scheme where kernel trace
+messages may be optionally enabled per tunnel and per session. Care is
+needed when debugging a live system since the messages are not
+rate-limited and a busy system could be swamped. Userspace uses
+setsockopt on the PPPoX socket to set a debug mask.
+
+The following debug mask bits are available:
+
+L2TP_MSG_DEBUG    verbose debug (if compiled in)
+L2TP_MSG_CONTROL  userspace - kernel interface
+L2TP_MSG_SEQ      sequence numbers handling
+L2TP_MSG_DATA     data packets
+
+If enabled, files under a l2tp debugfs directory can be used to dump
+kernel state about L2TP tunnels and sessions. To access it, the
+debugfs filesystem must first be mounted.
+
+# mount -t debugfs debugfs /debug
+
+Files under the l2tp directory can then be accessed.
+
+# cat /debug/l2tp/tunnels
+
+The debugfs files should not be used by applications to obtain L2TP
+state information because the file format is subject to change. It is
+implemented to provide extra debug information to help diagnose
+problems.) Users should use the netlink API.
+
+/proc/net/pppol2tp is also provided for backwards compatibility with
+the original pppol2tp driver. It lists information about L2TPv2
+tunnels and sessions only. Its use is discouraged.
+
+Unmanaged L2TPv3 Tunnels
+========================
+
+Some commercial L2TP products support unmanaged L2TPv3 ethernet
+tunnels, where there is no L2TP control protocol; tunnels are
+configured at each side manually. New commands are available in
+iproute2's ip utility to support this.
+
+To create an L2TPv3 ethernet pseudowire between local host 192.168.1.1
+and peer 192.168.1.2, using IP addresses 10.5.1.1 and 10.5.1.2 for the
+tunnel endpoints:-
+
+# ip l2tp add tunnel tunnel_id 1 peer_tunnel_id 1 udp_sport 5000 \
+  udp_dport 5000 encap udp local 192.168.1.1 remote 192.168.1.2
+# ip l2tp add session tunnel_id 1 session_id 1 peer_session_id 1
+# ip -s -d show dev l2tpeth0
+# ip addr add 10.5.1.2/32 peer 10.5.1.1/32 dev l2tpeth0
+# ip li set dev l2tpeth0 up
+
+Choose IP addresses to be the address of a local IP interface and that
+of the remote system. The IP addresses of the l2tpeth0 interface can be
+anything suitable.
+
+Repeat the above at the peer, with ports, tunnel/session ids and IP
+addresses reversed.  The tunnel and session IDs can be any non-zero
+32-bit number, but the values must be reversed at the peer.
+
+Host 1                         Host2
+udp_sport=5000                 udp_sport=5001
+udp_dport=5001                 udp_dport=5000
+tunnel_id=42                   tunnel_id=45
+peer_tunnel_id=45              peer_tunnel_id=42
+session_id=128                 session_id=5196755
+peer_session_id=5196755        peer_session_id=128
+
+When done at both ends of the tunnel, it should be possible to send
+data over the network. e.g.
+
+# ping 10.5.1.1
+
+
+Sample Userspace Code
+=====================
+
+1. Create tunnel management PPPoX socket
+
+        kernel_fd = socket(AF_PPPOX, SOCK_DGRAM, PX_PROTO_OL2TP);
+        if (kernel_fd >= 0) {
+                struct sockaddr_pppol2tp sax;
+                struct sockaddr_in const *peer_addr;
+
+                peer_addr = l2tp_tunnel_get_peer_addr(tunnel);
+                memset(&sax, 0, sizeof(sax));
+                sax.sa_family = AF_PPPOX;
+                sax.sa_protocol = PX_PROTO_OL2TP;
+                sax.pppol2tp.fd = udp_fd;       /* fd of tunnel UDP socket */
+                sax.pppol2tp.addr.sin_addr.s_addr = peer_addr->sin_addr.s_addr;
+                sax.pppol2tp.addr.sin_port = peer_addr->sin_port;
+                sax.pppol2tp.addr.sin_family = AF_INET;
+                sax.pppol2tp.s_tunnel = tunnel_id;
+                sax.pppol2tp.s_session = 0;     /* special case: mgmt socket */
+                sax.pppol2tp.d_tunnel = 0;
+                sax.pppol2tp.d_session = 0;     /* special case: mgmt socket */
+
+                if(connect(kernel_fd, (struct sockaddr *)&sax, sizeof(sax) ) < 0 ) {
+                        perror("connect failed");
+                        result = -errno;
+                        goto err;
+                }
+        }
+
+2. Create session PPPoX data socket
+
+        struct sockaddr_pppol2tp sax;
+        int fd;
+
+        /* Note, the target socket must be bound already, else it will not be ready */
+        sax.sa_family = AF_PPPOX;
+        sax.sa_protocol = PX_PROTO_OL2TP;
+        sax.pppol2tp.fd = tunnel_fd;
+        sax.pppol2tp.addr.sin_addr.s_addr = addr->sin_addr.s_addr;
+        sax.pppol2tp.addr.sin_port = addr->sin_port;
+        sax.pppol2tp.addr.sin_family = AF_INET;
+        sax.pppol2tp.s_tunnel  = tunnel_id;
+        sax.pppol2tp.s_session = session_id;
+        sax.pppol2tp.d_tunnel  = peer_tunnel_id;
+        sax.pppol2tp.d_session = peer_session_id;
+
+        /* session_fd is the fd of the session's PPPoL2TP socket.
+         * tunnel_fd is the fd of the tunnel UDP socket.
+         */
+        fd = connect(session_fd, (struct sockaddr *)&sax, sizeof(sax));
+        if (fd < 0 )    {
+                return -errno;
+        }
+        return 0;
+
+Internal Implementation
+=======================
+
+The driver keeps a struct l2tp_tunnel context per L2TP tunnel and a
+struct l2tp_session context for each session. The l2tp_tunnel is
+always associated with a UDP or L2TP/IP socket and keeps a list of
+sessions in the tunnel. The l2tp_session context keeps kernel state
+about the session. It has private data which is used for data specific
+to the session type. With L2TPv2, the session always carried PPP
+traffic. With L2TPv3, the session can also carry ethernet frames
+(ethernet pseudowire) or other data types such as ATM, HDLC or Frame
+Relay.
+
+When a tunnel is first opened, the reference count on the socket is
+increased using sock_hold(). This ensures that the kernel socket
+cannot be removed while L2TP's data structures reference it.
+
+Some L2TP sessions also have a socket (PPP pseudowires) while others
+do not (ethernet pseudowires). We can't use the socket reference count
+as the reference count for session contexts. The L2TP implementation
+therefore has its own internal reference counts on the session
+contexts.
+
+To Do
+=====
+
+Add L2TP tunnel switching support. This would route tunneled traffic
+from one L2TP tunnel into another. Specified in
+http://tools.ietf.org/html/draft-ietf-l2tpext-tunnel-switching-08
+
+Add L2TPv3 VLAN pseudowire support.
+
+Add L2TPv3 IP pseudowire support.
+
+Add L2TPv3 ATM pseudowire support.
+
+Miscellaneous
+=============
+
+The L2TP drivers were developed as part of the OpenL2TP project by
+Katalix Systems Ltd. OpenL2TP is a full-featured L2TP client / server,
+designed from the ground up to have the L2TP datapath in the
+kernel. The project also implemented the pppol2tp plugin for pppd
+which allows pppd to use the kernel driver. Details can be found at
+http://www.openl2tp.org.
diff --git a/Documentation/networking/lapb-module.txt b/Documentation/networking/lapb-module.txt
new file mode 100644
index 000000000..d4fc8f221
--- /dev/null
+++ b/Documentation/networking/lapb-module.txt
@@ -0,0 +1,263 @@
+		The Linux LAPB Module Interface 1.3
+
+		      Jonathan Naylor 29.12.96
+
+Changed (Henner Eisen, 2000-10-29): int return value for data_indication() 
+
+The LAPB module will be a separately compiled module for use by any parts of
+the Linux operating system that require a LAPB service. This document
+defines the interfaces to, and the services provided by this module. The
+term module in this context does not imply that the LAPB module is a
+separately loadable module, although it may be. The term module is used in
+its more standard meaning.
+
+The interface to the LAPB module consists of functions to the module,
+callbacks from the module to indicate important state changes, and
+structures for getting and setting information about the module.
+
+Structures
+----------
+
+Probably the most important structure is the skbuff structure for holding
+received and transmitted data, however it is beyond the scope of this
+document.
+
+The two LAPB specific structures are the LAPB initialisation structure and
+the LAPB parameter structure. These will be defined in a standard header
+file, <linux/lapb.h>. The header file <net/lapb.h> is internal to the LAPB
+module and is not for use.
+
+LAPB Initialisation Structure
+-----------------------------
+
+This structure is used only once, in the call to lapb_register (see below).
+It contains information about the device driver that requires the services
+of the LAPB module.
+
+struct lapb_register_struct {
+	void (*connect_confirmation)(int token, int reason);
+	void (*connect_indication)(int token, int reason);
+	void (*disconnect_confirmation)(int token, int reason);
+	void (*disconnect_indication)(int token, int reason);
+	int  (*data_indication)(int token, struct sk_buff *skb);
+	void (*data_transmit)(int token, struct sk_buff *skb);
+};
+
+Each member of this structure corresponds to a function in the device driver
+that is called when a particular event in the LAPB module occurs. These will
+be described in detail below. If a callback is not required (!!) then a NULL
+may be substituted.
+
+
+LAPB Parameter Structure
+------------------------
+
+This structure is used with the lapb_getparms and lapb_setparms functions
+(see below). They are used to allow the device driver to get and set the
+operational parameters of the LAPB implementation for a given connection.
+
+struct lapb_parms_struct {
+	unsigned int t1;
+	unsigned int t1timer;
+	unsigned int t2;
+	unsigned int t2timer;
+	unsigned int n2;
+	unsigned int n2count;
+	unsigned int window;
+	unsigned int state;
+	unsigned int mode;
+};
+
+T1 and T2 are protocol timing parameters and are given in units of 100ms. N2
+is the maximum number of tries on the link before it is declared a failure.
+The window size is the maximum number of outstanding data packets allowed to
+be unacknowledged by the remote end, the value of the window is between 1
+and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB
+link.
+
+The mode variable is a bit field used for setting (at present) three values.
+The bit fields have the following meanings:
+
+Bit	Meaning
+0	LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED).
+1	[SM]LP operation (0=LAPB_SLP 1=LAPB=MLP).
+2	DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE)
+3-31	Reserved, must be 0.
+
+Extended LAPB operation indicates the use of extended sequence numbers and
+consequently larger window sizes, the default is standard LAPB operation.
+MLP operation is the same as SLP operation except that the addresses used by
+LAPB are different to indicate the mode of operation, the default is Single
+Link Procedure. The difference between DCE and DTE operation is (i) the
+addresses used for commands and responses, and (ii) when the DCE is not
+connected, it sends DM without polls set, every T1. The upper case constant
+names will be defined in the public LAPB header file.
+
+
+Functions
+---------
+
+The LAPB module provides a number of function entry points.
+
+
+int lapb_register(void *token, struct lapb_register_struct);
+
+This must be called before the LAPB module may be used. If the call is
+successful then LAPB_OK is returned. The token must be a unique identifier
+generated by the device driver to allow for the unique identification of the
+instance of the LAPB link. It is returned by the LAPB module in all of the
+callbacks, and is used by the device driver in all calls to the LAPB module.
+For multiple LAPB links in a single device driver, multiple calls to
+lapb_register must be made. The format of the lapb_register_struct is given
+above. The return values are:
+
+LAPB_OK			LAPB registered successfully.
+LAPB_BADTOKEN		Token is already registered.
+LAPB_NOMEM		Out of memory
+
+
+int lapb_unregister(void *token);
+
+This releases all the resources associated with a LAPB link. Any current
+LAPB link will be abandoned without further messages being passed. After
+this call, the value of token is no longer valid for any calls to the LAPB
+function. The valid return values are:
+
+LAPB_OK			LAPB unregistered successfully.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+
+
+int lapb_getparms(void *token, struct lapb_parms_struct *parms);
+
+This allows the device driver to get the values of the current LAPB
+variables, the lapb_parms_struct is described above. The valid return values
+are:
+
+LAPB_OK			LAPB getparms was successful.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+
+
+int lapb_setparms(void *token, struct lapb_parms_struct *parms);
+
+This allows the device driver to set the values of the current LAPB
+variables, the lapb_parms_struct is described above. The values of t1timer,
+t2timer and n2count are ignored, likewise changing the mode bits when
+connected will be ignored. An error implies that none of the values have
+been changed. The valid return values are:
+
+LAPB_OK			LAPB getparms was successful.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_INVALUE		One of the values was out of its allowable range.
+
+
+int lapb_connect_request(void *token);
+
+Initiate a connect using the current parameter settings. The valid return
+values are:
+
+LAPB_OK			LAPB is starting to connect.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_CONNECTED		LAPB module is already connected.
+
+
+int lapb_disconnect_request(void *token);
+
+Initiate a disconnect. The valid return values are:
+
+LAPB_OK			LAPB is starting to disconnect.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_NOTCONNECTED	LAPB module is not connected.
+
+
+int lapb_data_request(void *token, struct sk_buff *skb);
+
+Queue data with the LAPB module for transmitting over the link. If the call
+is successful then the skbuff is owned by the LAPB module and may not be
+used by the device driver again. The valid return values are:
+
+LAPB_OK			LAPB has accepted the data.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+LAPB_NOTCONNECTED	LAPB module is not connected.
+
+
+int lapb_data_received(void *token, struct sk_buff *skb);
+
+Queue data with the LAPB module which has been received from the device. It
+is expected that the data passed to the LAPB module has skb->data pointing
+to the beginning of the LAPB data. If the call is successful then the skbuff
+is owned by the LAPB module and may not be used by the device driver again.
+The valid return values are:
+
+LAPB_OK			LAPB has accepted the data.
+LAPB_BADTOKEN		Invalid/unknown LAPB token.
+
+
+Callbacks
+---------
+
+These callbacks are functions provided by the device driver for the LAPB
+module to call when an event occurs. They are registered with the LAPB
+module with lapb_register (see above) in the structure lapb_register_struct
+(see above).
+
+
+void (*connect_confirmation)(void *token, int reason);
+
+This is called by the LAPB module when a connection is established after
+being requested by a call to lapb_connect_request (see above). The reason is
+always LAPB_OK.
+
+
+void (*connect_indication)(void *token, int reason);
+
+This is called by the LAPB module when the link is established by the remote
+system. The value of reason is always LAPB_OK.
+
+
+void (*disconnect_confirmation)(void *token, int reason);
+
+This is called by the LAPB module when an event occurs after the device
+driver has called lapb_disconnect_request (see above). The reason indicates
+what has happened. In all cases the LAPB link can be regarded as being
+terminated. The values for reason are:
+
+LAPB_OK			The LAPB link was terminated normally.
+LAPB_NOTCONNECTED	The remote system was not connected.
+LAPB_TIMEDOUT		No response was received in N2 tries from the remote
+			system.
+
+
+void (*disconnect_indication)(void *token, int reason);
+
+This is called by the LAPB module when the link is terminated by the remote
+system or another event has occurred to terminate the link. This may be
+returned in response to a lapb_connect_request (see above) if the remote
+system refused the request. The values for reason are:
+
+LAPB_OK			The LAPB link was terminated normally by the remote
+			system.
+LAPB_REFUSED		The remote system refused the connect request.
+LAPB_NOTCONNECTED	The remote system was not connected.
+LAPB_TIMEDOUT		No response was received in N2 tries from the remote
+			system.
+
+
+int (*data_indication)(void *token, struct sk_buff *skb);
+
+This is called by the LAPB module when data has been received from the
+remote system that should be passed onto the next layer in the protocol
+stack. The skbuff becomes the property of the device driver and the LAPB
+module will not perform any more actions on it. The skb->data pointer will
+be pointing to the first byte of data after the LAPB header.
+
+This method should return NET_RX_DROP (as defined in the header
+file include/linux/netdevice.h) if and only if the frame was dropped
+before it could be delivered to the upper layer.
+
+
+void (*data_transmit)(void *token, struct sk_buff *skb);
+
+This is called by the LAPB module when data is to be transmitted to the
+remote system by the device driver. The skbuff becomes the property of the
+device driver and the LAPB module will not perform any more actions on it.
+The skb->data pointer will be pointing to the first byte of the LAPB header.
diff --git a/Documentation/networking/ltpc.txt b/Documentation/networking/ltpc.txt
new file mode 100644
index 000000000..0bf3220c7
--- /dev/null
+++ b/Documentation/networking/ltpc.txt
@@ -0,0 +1,131 @@
+This is the ALPHA version of the ltpc driver.
+
+In order to use it, you will need at least version 1.3.3 of the
+netatalk package, and the Apple or Farallon LocalTalk PC card.
+There are a number of different LocalTalk cards for the PC; this
+driver applies only to the one with the 65c02 processor chip on it.
+
+To include it in the kernel, select the CONFIG_LTPC switch in the
+configuration dialog.  You can also compile it as a module.
+
+While the driver will attempt to autoprobe the I/O port address, IRQ
+line, and DMA channel of the card, this does not always work.  For
+this reason, you should be prepared to supply these parameters
+yourself.  (see "Card Configuration" below for how to determine or
+change the settings on your card)
+
+When the driver is compiled into the kernel, you can add a line such
+as the following to your /etc/lilo.conf:
+
+ append="ltpc=0x240,9,1"
+
+where the parameters (in order) are the port address, IRQ, and DMA
+channel.  The second and third values can be omitted, in which case
+the driver will try to determine them itself.
+
+If you load the driver as a module, you can pass the parameters "io=",
+"irq=", and "dma=" on the command line with insmod or modprobe, or add
+them as options in a configuration file in /etc/modprobe.d/ directory:
+
+ alias lt0 ltpc # autoload the module when the interface is configured
+ options ltpc io=0x240 irq=9 dma=1
+
+Before starting up the netatalk demons (perhaps in rc.local), you
+need to add a line such as:
+
+ /sbin/ifconfig lt0 127.0.0.42
+
+The address is unimportant - however, the card needs to be configured
+with ifconfig so that Netatalk can find it.
+
+The appropriate netatalk configuration depends on whether you are
+attached to a network that includes AppleTalk routers or not.  If,
+like me, you are simply connecting to your home Macintoshes and
+printers, you need to set up netatalk to "seed".  The way I do this
+is to have the lines
+
+ dummy -seed -phase 2 -net 2000 -addr 2000.26 -zone "1033"
+ lt0 -seed -phase 1 -net 1033 -addr 1033.27 -zone "1033"
+
+in my atalkd.conf.  What is going on here is that I need to fool
+netatalk into thinking that there are two AppleTalk interfaces
+present; otherwise, it refuses to seed.  This is a hack, and a more
+permanent solution would be to alter the netatalk code.  Also, make
+sure you have the correct name for the dummy interface - If it's
+compiled as a module, you will need to refer to it as "dummy0" or some
+such.
+
+If you are attached to an extended AppleTalk network, with routers on
+it, then you don't need to fool around with this -- the appropriate
+line in atalkd.conf is
+
+ lt0 -phase 1
+
+--------------------------------------
+
+Card Configuration:
+
+The interrupts and so forth are configured via the dipswitch on the
+board.  Set the switches so as not to conflict with other hardware.
+
+       Interrupts -- set at most one.  If none are set, the driver uses
+       polled mode.  Because the card was developed in the XT era, the
+       original documentation refers to IRQ2.  Since you'll be running
+       this on an AT (or later) class machine, that really means IRQ9.
+
+       SW1     IRQ 4
+       SW2     IRQ 3
+       SW3     IRQ 9 (2 in original card documentation only applies to XT)
+
+
+       DMA -- choose DMA 1 or 3, and set both corresponding switches.
+
+       SW4     DMA 3
+       SW5     DMA 1
+       SW6     DMA 3
+       SW7     DMA 1
+
+
+       I/O address -- choose one.
+
+       SW8     220 / 240
+
+--------------------------------------
+
+IP:
+
+Yes, it is possible to do IP over LocalTalk.  However, you can't just
+treat the LocalTalk device like an ordinary Ethernet device, even if
+that's what it looks like to Netatalk.
+
+Instead, you follow the same procedure as for doing IP in EtherTalk.
+See Documentation/networking/ipddp.txt for more information about the
+kernel driver and userspace tools needed.
+
+--------------------------------------
+
+BUGS:
+
+IRQ autoprobing often doesn't work on a cold boot.  To get around
+this, either compile the driver as a module, or pass the parameters
+for the card to the kernel as described above.
+
+Also, as usual, autoprobing is not recommended when you use the driver
+as a module. (though it usually works at boot time, at least)
+
+Polled mode is *really* slow sometimes, but this seems to depend on
+the configuration of the network.
+
+It may theoretically be possible to use two LTPC cards in the same
+machine, but this is unsupported, so if you really want to do this,
+you'll probably have to hack the initialization code a bit.
+
+______________________________________
+
+THANKS:
+	Thanks to Alan Cox for helpful discussions early on in this
+work, and to Denis Hainsworth for doing the bleeding-edge testing.
+
+-- Bradford Johnson <bradford@math.umn.edu>
+
+-- Updated 11/09/1998 by David Huggins-Daines <dhd@debian.org>
diff --git a/Documentation/networking/mac80211-auth-assoc-deauth.txt b/Documentation/networking/mac80211-auth-assoc-deauth.txt
new file mode 100644
index 000000000..d7a15fe91
--- /dev/null
+++ b/Documentation/networking/mac80211-auth-assoc-deauth.txt
@@ -0,0 +1,95 @@
+#
+# This outlines the Linux authentication/association and
+# deauthentication/disassociation flows.
+#
+# This can be converted into a diagram using the service
+# at http://www.websequencediagrams.com/
+#
+
+participant userspace
+participant mac80211
+participant driver
+
+alt authentication needed (not FT)
+userspace->mac80211: authenticate
+
+alt authenticated/authenticating already
+mac80211->driver: sta_state(AP, not-exists)
+mac80211->driver: bss_info_changed(clear BSSID)
+else associated
+note over mac80211,driver
+like deauth/disassoc, without sending the
+BA session stop & deauth/disassoc frames
+end note
+end
+
+mac80211->driver: config(channel, channel type)
+mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
+mac80211->driver: sta_state(AP, exists)
+
+alt no probe request data known
+mac80211->driver: TX directed probe request
+driver->mac80211: RX probe response
+end
+
+mac80211->driver: TX auth frame
+driver->mac80211: RX auth frame
+
+alt WEP shared key auth
+mac80211->driver: TX auth frame
+driver->mac80211: RX auth frame
+end
+
+mac80211->driver: sta_state(AP, authenticated)
+mac80211->userspace: RX auth frame
+
+end
+
+userspace->mac80211: associate
+alt authenticated or associated
+note over mac80211,driver: cleanup like for authenticate
+end
+
+alt not previously authenticated (FT)
+mac80211->driver: config(channel, channel type)
+mac80211->driver: bss_info_changed(set BSSID, basic rate bitmap)
+mac80211->driver: sta_state(AP, exists)
+mac80211->driver: sta_state(AP, authenticated)
+end
+mac80211->driver: TX assoc
+driver->mac80211: RX assoc response
+note over mac80211: init rate control
+mac80211->driver: sta_state(AP, associated)
+
+alt not using WPA
+mac80211->driver: sta_state(AP, authorized)
+end
+
+mac80211->driver: set up QoS parameters
+
+mac80211->driver: bss_info_changed(QoS, HT, associated with AID)
+mac80211->userspace: associated
+
+note left of userspace: associated now
+
+alt using WPA
+note over userspace
+do 4-way-handshake
+(data frames)
+end note
+userspace->mac80211: authorized
+mac80211->driver: sta_state(AP, authorized)
+end
+
+userspace->mac80211: deauthenticate/disassociate
+mac80211->driver: stop BA sessions
+mac80211->driver: TX deauth/disassoc
+mac80211->driver: flush frames
+mac80211->driver: sta_state(AP,associated)
+mac80211->driver: sta_state(AP,authenticated)
+mac80211->driver: sta_state(AP,exists)
+mac80211->driver: sta_state(AP,not-exists)
+mac80211->driver: turn off powersave
+mac80211->driver: bss_info_changed(clear BSSID, not associated, no QoS, ...)
+mac80211->driver: config(channel type to non-HT)
+mac80211->userspace: disconnected
diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt
new file mode 100644
index 000000000..d58d78df9
--- /dev/null
+++ b/Documentation/networking/mac80211-injection.txt
@@ -0,0 +1,97 @@
+How to use packet injection with mac80211
+=========================================
+
+mac80211 now allows arbitrary packets to be injected down any Monitor Mode
+interface from userland.  The packet you inject needs to be composed in the
+following format:
+
+ [ radiotap header  ]
+ [ ieee80211 header ]
+ [ payload ]
+
+The radiotap format is discussed in
+./Documentation/networking/radiotap-headers.txt.
+
+Despite many radiotap parameters being currently defined, most only make sense
+to appear on received packets.  The following information is parsed from the
+radiotap headers and used to control injection:
+
+ * IEEE80211_RADIOTAP_FLAGS
+
+   IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
+   IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
+   IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
+			      current fragmentation threshold.
+
+ * IEEE80211_RADIOTAP_TX_FLAGS
+
+   IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
+				  an ACK even if it is a unicast frame
+
+ * IEEE80211_RADIOTAP_RATE
+
+   legacy rate for the transmission (only for devices without own rate control)
+
+ * IEEE80211_RADIOTAP_MCS
+
+   HT rate for the transmission (only for devices without own rate control).
+   Also some flags are parsed
+
+   IEEE80211_RADIOTAP_MCS_SGI: use short guard interval
+   IEEE80211_RADIOTAP_MCS_BW_40: send in HT40 mode
+
+ * IEEE80211_RADIOTAP_DATA_RETRIES
+
+   number of retries when either IEEE80211_RADIOTAP_RATE or
+   IEEE80211_RADIOTAP_MCS was used
+
+ * IEEE80211_RADIOTAP_VHT
+
+   VHT mcs and number of streams used in the transmission (only for devices
+   without own rate control). Also other fields are parsed
+
+   flags field
+   IEEE80211_RADIOTAP_VHT_FLAG_SGI: use short guard interval
+
+   bandwidth field
+   1: send using 40MHz channel width
+   4: send using 80MHz channel width
+   11: send using 160MHz channel width
+
+The injection code can also skip all other currently defined radiotap fields
+facilitating replay of captured radiotap headers directly.
+
+Here is an example valid radiotap header defining some parameters
+
+	0x00, 0x00, // <-- radiotap version
+	0x0b, 0x00, // <- radiotap header length
+	0x04, 0x0c, 0x00, 0x00, // <-- bitmap
+	0x6c, // <-- rate
+	0x0c, //<-- tx power
+	0x01 //<-- antenna
+
+The ieee80211 header follows immediately afterwards, looking for example like
+this:
+
+	0x08, 0x01, 0x00, 0x00,
+	0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
+	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
+	0x13, 0x22, 0x33, 0x44, 0x55, 0x66,
+	0x10, 0x86
+
+Then lastly there is the payload.
+
+After composing the packet contents, it is sent by send()-ing it to a logical
+mac80211 interface that is in Monitor mode.  Libpcap can also be used,
+(which is easier than doing the work to bind the socket to the right
+interface), along the following lines:
+
+	ppcap = pcap_open_live(szInterfaceName, 800, 1, 20, szErrbuf);
+...
+	r = pcap_inject(ppcap, u8aSendBuffer, nLength);
+
+You can also find a link to a complete inject application here:
+
+http://wireless.kernel.org/en/users/Documentation/packetspammer
+
+Andy Green <andy@warmcat.com>
diff --git a/Documentation/networking/mac80211_hwsim/README b/Documentation/networking/mac80211_hwsim/README
new file mode 100644
index 000000000..3566a725d
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/README
@@ -0,0 +1,68 @@
+mac80211_hwsim - software simulator of 802.11 radio(s) for mac80211
+Copyright (c) 2008, Jouni Malinen <j@w1.fi>
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License version 2 as
+published by the Free Software Foundation.
+
+
+Introduction
+
+mac80211_hwsim is a Linux kernel module that can be used to simulate
+arbitrary number of IEEE 802.11 radios for mac80211. It can be used to
+test most of the mac80211 functionality and user space tools (e.g.,
+hostapd and wpa_supplicant) in a way that matches very closely with
+the normal case of using real WLAN hardware. From the mac80211 view
+point, mac80211_hwsim is yet another hardware driver, i.e., no changes
+to mac80211 are needed to use this testing tool.
+
+The main goal for mac80211_hwsim is to make it easier for developers
+to test their code and work with new features to mac80211, hostapd,
+and wpa_supplicant. The simulated radios do not have the limitations
+of real hardware, so it is easy to generate an arbitrary test setup
+and always reproduce the same setup for future tests. In addition,
+since all radio operation is simulated, any channel can be used in
+tests regardless of regulatory rules.
+
+mac80211_hwsim kernel module has a parameter 'radios' that can be used
+to select how many radios are simulated (default 2). This allows
+configuration of both very simply setups (e.g., just a single access
+point and a station) or large scale tests (multiple access points with
+hundreds of stations).
+
+mac80211_hwsim works by tracking the current channel of each virtual
+radio and copying all transmitted frames to all other radios that are
+currently enabled and on the same channel as the transmitting
+radio. Software encryption in mac80211 is used so that the frames are
+actually encrypted over the virtual air interface to allow more
+complete testing of encryption.
+
+A global monitoring netdev, hwsim#, is created independent of
+mac80211. This interface can be used to monitor all transmitted frames
+regardless of channel.
+
+
+Simple example
+
+This example shows how to use mac80211_hwsim to simulate two radios:
+one to act as an access point and the other as a station that
+associates with the AP. hostapd and wpa_supplicant are used to take
+care of WPA2-PSK authentication. In addition, hostapd is also
+processing access point side of association.
+
+
+# Build mac80211_hwsim as part of kernel configuration
+
+# Load the module
+modprobe mac80211_hwsim
+
+# Run hostapd (AP) for wlan0
+hostapd hostapd.conf
+
+# Run wpa_supplicant (station) for wlan1
+wpa_supplicant -Dnl80211 -iwlan1 -c wpa_supplicant.conf
+
+
+More test cases are available in hostap.git:
+git://w1.fi/srv/git/hostap.git and mac80211_hwsim/tests subdirectory
+(http://w1.fi/gitweb/gitweb.cgi?p=hostap.git;a=tree;f=mac80211_hwsim/tests)
diff --git a/Documentation/networking/mac80211_hwsim/hostapd.conf b/Documentation/networking/mac80211_hwsim/hostapd.conf
new file mode 100644
index 000000000..08cde7e35
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/hostapd.conf
@@ -0,0 +1,11 @@
+interface=wlan0
+driver=nl80211
+
+hw_mode=g
+channel=1
+ssid=mac80211 test
+
+wpa=2
+wpa_key_mgmt=WPA-PSK
+wpa_pairwise=CCMP
+wpa_passphrase=12345678
diff --git a/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf b/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
new file mode 100644
index 000000000..299128cff
--- /dev/null
+++ b/Documentation/networking/mac80211_hwsim/wpa_supplicant.conf
@@ -0,0 +1,10 @@
+ctrl_interface=/var/run/wpa_supplicant
+
+network={
+	ssid="mac80211 test"
+	psk="12345678"
+	key_mgmt=WPA-PSK
+	proto=WPA2
+	pairwise=CCMP
+	group=CCMP
+}
diff --git a/Documentation/networking/mpls-sysctl.txt b/Documentation/networking/mpls-sysctl.txt
new file mode 100644
index 000000000..2f24a1912
--- /dev/null
+++ b/Documentation/networking/mpls-sysctl.txt
@@ -0,0 +1,48 @@
+/proc/sys/net/mpls/* Variables:
+
+platform_labels - INTEGER
+	Number of entries in the platform label table.  It is not
+	possible to configure forwarding for label values equal to or
+	greater than the number of platform labels.
+
+	A dense utilization of the entries in the platform label table
+	is possible and expected as the platform labels are locally
+	allocated.
+
+	If the number of platform label table entries is set to 0 no
+	label will be recognized by the kernel and mpls forwarding
+	will be disabled.
+
+	Reducing this value will remove all label routing entries that
+	no longer fit in the table.
+
+	Possible values: 0 - 1048575
+	Default: 0
+
+ip_ttl_propagate - BOOL
+	Control whether TTL is propagated from the IPv4/IPv6 header to
+	the MPLS header on imposing labels and propagated from the
+	MPLS header to the IPv4/IPv6 header on popping the last label.
+
+	If disabled, the MPLS transport network will appear as a
+	single hop to transit traffic.
+
+	0 - disabled / RFC 3443 [Short] Pipe Model
+	1 - enabled / RFC 3443 Uniform Model (default)
+
+default_ttl - BOOL
+	Default TTL value to use for MPLS packets where it cannot be
+	propagated from an IP header, either because one isn't present
+	or ip_ttl_propagate has been disabled.
+
+	Possible values: 1 - 255
+	Default: 255
+
+conf/<interface>/input - BOOL
+	Control whether packets can be input on this interface.
+
+	If disabled, packets will be discarded without further
+	processing.
+
+	0 - disabled (default)
+	not 0 - enabled
diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst
new file mode 100644
index 000000000..fe46d4867
--- /dev/null
+++ b/Documentation/networking/msg_zerocopy.rst
@@ -0,0 +1,256 @@
+
+============
+MSG_ZEROCOPY
+============
+
+Intro
+=====
+
+The MSG_ZEROCOPY flag enables copy avoidance for socket send calls.
+The feature is currently implemented for TCP sockets.
+
+
+Opportunity and Caveats
+-----------------------
+
+Copying large buffers between user process and kernel can be
+expensive. Linux supports various interfaces that eschew copying,
+such as sendpage and splice. The MSG_ZEROCOPY flag extends the
+underlying copy avoidance mechanism to common socket send calls.
+
+Copy avoidance is not a free lunch. As implemented, with page pinning,
+it replaces per byte copy cost with page accounting and completion
+notification overhead. As a result, MSG_ZEROCOPY is generally only
+effective at writes over around 10 KB.
+
+Page pinning also changes system call semantics. It temporarily shares
+the buffer between process and network stack. Unlike with copying, the
+process cannot immediately overwrite the buffer after system call
+return without possibly modifying the data in flight. Kernel integrity
+is not affected, but a buggy program can possibly corrupt its own data
+stream.
+
+The kernel returns a notification when it is safe to modify data.
+Converting an existing application to MSG_ZEROCOPY is not always as
+trivial as just passing the flag, then.
+
+
+More Info
+---------
+
+Much of this document was derived from a longer paper presented at
+netdev 2.1. For more in-depth information see that paper and talk,
+the excellent reporting over at LWN.net or read the original code.
+
+  paper, slides, video
+    https://netdevconf.org/2.1/session.html?debruijn
+
+  LWN article
+    https://lwn.net/Articles/726917/
+
+  patchset
+    [PATCH net-next v4 0/9] socket sendmsg MSG_ZEROCOPY
+    http://lkml.kernel.org/r/20170803202945.70750-1-willemdebruijn.kernel@gmail.com
+
+
+Interface
+=========
+
+Passing the MSG_ZEROCOPY flag is the most obvious step to enable copy
+avoidance, but not the only one.
+
+Socket Setup
+------------
+
+The kernel is permissive when applications pass undefined flags to the
+send system call. By default it simply ignores these. To avoid enabling
+copy avoidance mode for legacy processes that accidentally already pass
+this flag, a process must first signal intent by setting a socket option:
+
+::
+
+	if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)))
+		error(1, errno, "setsockopt zerocopy");
+
+Transmission
+------------
+
+The change to send (or sendto, sendmsg, sendmmsg) itself is trivial.
+Pass the new flag.
+
+::
+
+	ret = send(fd, buf, sizeof(buf), MSG_ZEROCOPY);
+
+A zerocopy failure will return -1 with errno ENOBUFS. This happens if
+the socket option was not set, the socket exceeds its optmem limit or
+the user exceeds its ulimit on locked pages.
+
+
+Mixing copy avoidance and copying
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Many workloads have a mixture of large and small buffers. Because copy
+avoidance is more expensive than copying for small packets, the
+feature is implemented as a flag. It is safe to mix calls with the flag
+with those without.
+
+
+Notifications
+-------------
+
+The kernel has to notify the process when it is safe to reuse a
+previously passed buffer. It queues completion notifications on the
+socket error queue, akin to the transmit timestamping interface.
+
+The notification itself is a simple scalar value. Each socket
+maintains an internal unsigned 32-bit counter. Each send call with
+MSG_ZEROCOPY that successfully sends data increments the counter. The
+counter is not incremented on failure or if called with length zero.
+The counter counts system call invocations, not bytes. It wraps after
+UINT_MAX calls.
+
+
+Notification Reception
+~~~~~~~~~~~~~~~~~~~~~~
+
+The below snippet demonstrates the API. In the simplest case, each
+send syscall is followed by a poll and recvmsg on the error queue.
+
+Reading from the error queue is always a non-blocking operation. The
+poll call is there to block until an error is outstanding. It will set
+POLLERR in its output flags. That flag does not have to be set in the
+events field. Errors are signaled unconditionally.
+
+::
+
+	pfd.fd = fd;
+	pfd.events = 0;
+	if (poll(&pfd, 1, -1) != 1 || pfd.revents & POLLERR == 0)
+		error(1, errno, "poll");
+
+	ret = recvmsg(fd, &msg, MSG_ERRQUEUE);
+	if (ret == -1)
+		error(1, errno, "recvmsg");
+
+	read_notification(msg);
+
+The example is for demonstration purpose only. In practice, it is more
+efficient to not wait for notifications, but read without blocking
+every couple of send calls.
+
+Notifications can be processed out of order with other operations on
+the socket. A socket that has an error queued would normally block
+other operations until the error is read. Zerocopy notifications have
+a zero error code, however, to not block send and recv calls.
+
+
+Notification Batching
+~~~~~~~~~~~~~~~~~~~~~
+
+Multiple outstanding packets can be read at once using the recvmmsg
+call. This is often not needed. In each message the kernel returns not
+a single value, but a range. It coalesces consecutive notifications
+while one is outstanding for reception on the error queue.
+
+When a new notification is about to be queued, it checks whether the
+new value extends the range of the notification at the tail of the
+queue. If so, it drops the new notification packet and instead increases
+the range upper value of the outstanding notification.
+
+For protocols that acknowledge data in-order, like TCP, each
+notification can be squashed into the previous one, so that no more
+than one notification is outstanding at any one point.
+
+Ordered delivery is the common case, but not guaranteed. Notifications
+may arrive out of order on retransmission and socket teardown.
+
+
+Notification Parsing
+~~~~~~~~~~~~~~~~~~~~
+
+The below snippet demonstrates how to parse the control message: the
+read_notification() call in the previous snippet. A notification
+is encoded in the standard error format, sock_extended_err.
+
+The level and type fields in the control data are protocol family
+specific, IP_RECVERR or IPV6_RECVERR.
+
+Error origin is the new type SO_EE_ORIGIN_ZEROCOPY. ee_errno is zero,
+as explained before, to avoid blocking read and write system calls on
+the socket.
+
+The 32-bit notification range is encoded as [ee_info, ee_data]. This
+range is inclusive. Other fields in the struct must be treated as
+undefined, bar for ee_code, as discussed below.
+
+::
+
+	struct sock_extended_err *serr;
+	struct cmsghdr *cm;
+
+	cm = CMSG_FIRSTHDR(msg);
+	if (cm->cmsg_level != SOL_IP &&
+	    cm->cmsg_type != IP_RECVERR)
+		error(1, 0, "cmsg");
+
+	serr = (void *) CMSG_DATA(cm);
+	if (serr->ee_errno != 0 ||
+	    serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY)
+		error(1, 0, "serr");
+
+	printf("completed: %u..%u\n", serr->ee_info, serr->ee_data);
+
+
+Deferred copies
+~~~~~~~~~~~~~~~
+
+Passing flag MSG_ZEROCOPY is a hint to the kernel to apply copy
+avoidance, and a contract that the kernel will queue a completion
+notification. It is not a guarantee that the copy is elided.
+
+Copy avoidance is not always feasible. Devices that do not support
+scatter-gather I/O cannot send packets made up of kernel generated
+protocol headers plus zerocopy user data. A packet may need to be
+converted to a private copy of data deep in the stack, say to compute
+a checksum.
+
+In all these cases, the kernel returns a completion notification when
+it releases its hold on the shared pages. That notification may arrive
+before the (copied) data is fully transmitted. A zerocopy completion
+notification is not a transmit completion notification, therefore.
+
+Deferred copies can be more expensive than a copy immediately in the
+system call, if the data is no longer warm in the cache. The process
+also incurs notification processing cost for no benefit. For this
+reason, the kernel signals if data was completed with a copy, by
+setting flag SO_EE_CODE_ZEROCOPY_COPIED in field ee_code on return.
+A process may use this signal to stop passing flag MSG_ZEROCOPY on
+subsequent requests on the same socket.
+
+
+Implementation
+==============
+
+Loopback
+--------
+
+Data sent to local sockets can be queued indefinitely if the receive
+process does not read its socket. Unbound notification latency is not
+acceptable. For this reason all packets generated with MSG_ZEROCOPY
+that are looped to a local socket will incur a deferred copy. This
+includes looping onto packet sockets (e.g., tcpdump) and tun devices.
+
+
+Testing
+=======
+
+More realistic example code can be found in the kernel source under
+tools/testing/selftests/net/msg_zerocopy.c.
+
+Be cognizant of the loopback constraint. The test can be run between
+a pair of hosts. But if run between a local pair of processes, for
+instance when run with msg_zerocopy.sh between a veth pair across
+namespaces, the test will not show any improvement. For testing, the
+loopback restriction can be temporarily relaxed by making
+skb_orphan_frags_rx identical to skb_orphan_frags.
diff --git a/Documentation/networking/multiqueue.txt b/Documentation/networking/multiqueue.txt
new file mode 100644
index 000000000..4caa0e314
--- /dev/null
+++ b/Documentation/networking/multiqueue.txt
@@ -0,0 +1,79 @@
+
+		HOWTO for multiqueue network device support
+		===========================================
+
+Section 1: Base driver requirements for implementing multiqueue support
+
+Intro: Kernel support for multiqueue devices
+---------------------------------------------------------
+
+Kernel support for multiqueue devices is always present.
+
+Section 1: Base driver requirements for implementing multiqueue support
+-----------------------------------------------------------------------
+
+Base drivers are required to use the new alloc_etherdev_mq() or
+alloc_netdev_mq() functions to allocate the subqueues for the device.  The
+underlying kernel API will take care of the allocation and deallocation of
+the subqueue memory, as well as netdev configuration of where the queues
+exist in memory.
+
+The base driver will also need to manage the queues as it does the global
+netdev->queue_lock today.  Therefore base drivers should use the
+netif_{start|stop|wake}_subqueue() functions to manage each queue while the
+device is still operational.  netdev->queue_lock is still used when the device
+comes online or when it's completely shut down (unregister_netdev(), etc.).
+
+
+Section 2: Qdisc support for multiqueue devices
+
+-----------------------------------------------
+
+Currently two qdiscs are optimized for multiqueue devices.  The first is the
+default pfifo_fast qdisc.  This qdisc supports one qdisc per hardware queue.
+A new round-robin qdisc, sch_multiq also supports multiple hardware queues. The
+qdisc is responsible for classifying the skb's and then directing the skb's to
+bands and queues based on the value in skb->queue_mapping.  Use this field in
+the base driver to determine which queue to send the skb to.
+
+sch_multiq has been added for hardware that wishes to avoid head-of-line
+blocking.  It will cycle though the bands and verify that the hardware queue
+associated with the band is not stopped prior to dequeuing a packet.
+
+On qdisc load, the number of bands is based on the number of queues on the
+hardware.  Once the association is made, any skb with skb->queue_mapping set,
+will be queued to the band associated with the hardware queue.
+
+
+Section 3: Brief howto using MULTIQ for multiqueue devices
+---------------------------------------------------------------
+
+The userspace command 'tc,' part of the iproute2 package, is used to configure
+qdiscs.  To add the MULTIQ qdisc to your network device, assuming the device
+is called eth0, run the following command:
+
+# tc qdisc add dev eth0 root handle 1: multiq
+
+The qdisc will allocate the number of bands to equal the number of queues that
+the device reports, and bring the qdisc online.  Assuming eth0 has 4 Tx
+queues, the band mapping would look like:
+
+band 0 => queue 0
+band 1 => queue 1
+band 2 => queue 2
+band 3 => queue 3
+
+Traffic will begin flowing through each queue based on either the simple_tx_hash
+function or based on netdev->select_queue() if you have it defined.
+
+The behavior of tc filters remains the same.  However a new tc action,
+skbedit, has been added.  Assuming you wanted to route all traffic to a
+specific host, for example 192.168.0.3, through a specific queue you could use
+this action and establish a filter such as:
+
+tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
+	match ip dst 192.168.0.3 \
+	action skbedit queue_mapping 3
+
+Author: Alexander Duyck <alexander.h.duyck@intel.com>
+Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>
diff --git a/Documentation/networking/net_dim.txt b/Documentation/networking/net_dim.txt
new file mode 100644
index 000000000..9cb31c5e2
--- /dev/null
+++ b/Documentation/networking/net_dim.txt
@@ -0,0 +1,174 @@
+Net DIM - Generic Network Dynamic Interrupt Moderation
+======================================================
+
+Author:
+	Tal Gilboa <talgi@mellanox.com>
+
+
+Contents
+=========
+
+- Assumptions
+- Introduction
+- The Net DIM Algorithm
+- Registering a Network Device to DIM
+- Example
+
+Part 0: Assumptions
+======================
+
+This document assumes the reader has basic knowledge in network drivers
+and in general interrupt moderation.
+
+
+Part I: Introduction
+======================
+
+Dynamic Interrupt Moderation (DIM) (in networking) refers to changing the
+interrupt moderation configuration of a channel in order to optimize packet
+processing. The mechanism includes an algorithm which decides if and how to
+change moderation parameters for a channel, usually by performing an analysis on
+runtime data sampled from the system. Net DIM is such a mechanism. In each
+iteration of the algorithm, it analyses a given sample of the data, compares it
+to the previous sample and if required, it can decide to change some of the
+interrupt moderation configuration fields. The data sample is composed of data
+bandwidth, the number of packets and the number of events. The time between
+samples is also measured. Net DIM compares the current and the previous data and
+returns an adjusted interrupt moderation configuration object. In some cases,
+the algorithm might decide not to change anything. The configuration fields are
+the minimum duration (microseconds) allowed between events and the maximum
+number of wanted packets per event. The Net DIM algorithm ascribes importance to
+increase bandwidth over reducing interrupt rate.
+
+
+Part II: The Net DIM Algorithm
+===============================
+
+Each iteration of the Net DIM algorithm follows these steps:
+1. Calculates new data sample.
+2. Compares it to previous sample.
+3. Makes a decision - suggests interrupt moderation configuration fields.
+4. Applies a schedule work function, which applies suggested configuration.
+
+The first two steps are straightforward, both the new and the previous data are
+supplied by the driver registered to Net DIM. The previous data is the new data
+supplied to the previous iteration. The comparison step checks the difference
+between the new and previous data and decides on the result of the last step.
+A step would result as "better" if bandwidth increases and as "worse" if
+bandwidth reduces. If there is no change in bandwidth, the packet rate is
+compared in a similar fashion - increase == "better" and decrease == "worse".
+In case there is no change in the packet rate as well, the interrupt rate is
+compared. Here the algorithm tries to optimize for lower interrupt rate so an
+increase in the interrupt rate is considered "worse" and a decrease is
+considered "better". Step #2 has an optimization for avoiding false results: it
+only considers a difference between samples as valid if it is greater than a
+certain percentage. Also, since Net DIM does not measure anything by itself, it
+assumes the data provided by the driver is valid.
+
+Step #3 decides on the suggested configuration based on the result from step #2
+and the internal state of the algorithm. The states reflect the "direction" of
+the algorithm: is it going left (reducing moderation), right (increasing
+moderation) or standing still. Another optimization is that if a decision
+to stay still is made multiple times, the interval between iterations of the
+algorithm would increase in order to reduce calculation overhead. Also, after
+"parking" on one of the most left or most right decisions, the algorithm may
+decide to verify this decision by taking a step in the other direction. This is
+done in order to avoid getting stuck in a "deep sleep" scenario. Once a
+decision is made, an interrupt moderation configuration is selected from
+the predefined profiles.
+
+The last step is to notify the registered driver that it should apply the
+suggested configuration. This is done by scheduling a work function, defined by
+the Net DIM API and provided by the registered driver.
+
+As you can see, Net DIM itself does not actively interact with the system. It
+would have trouble making the correct decisions if the wrong data is supplied to
+it and it would be useless if the work function would not apply the suggested
+configuration. This does, however, allow the registered driver some room for
+manoeuvre as it may provide partial data or ignore the algorithm suggestion
+under some conditions.
+
+
+Part III: Registering a Network Device to DIM
+==============================================
+
+Net DIM API exposes the main function net_dim(struct net_dim *dim,
+struct net_dim_sample end_sample). This function is the entry point to the Net
+DIM algorithm and has to be called every time the driver would like to check if
+it should change interrupt moderation parameters. The driver should provide two
+data structures: struct net_dim and struct net_dim_sample. Struct net_dim
+describes the state of DIM for a specific object (RX queue, TX queue,
+other queues, etc.). This includes the current selected profile, previous data
+samples, the callback function provided by the driver and more.
+Struct net_dim_sample describes a data sample, which will be compared to the
+data sample stored in struct net_dim in order to decide on the algorithm's next
+step. The sample should include bytes, packets and interrupts, measured by
+the driver.
+
+In order to use Net DIM from a networking driver, the driver needs to call the
+main net_dim() function. The recommended method is to call net_dim() on each
+interrupt. Since Net DIM has a built-in moderation and it might decide to skip
+iterations under certain conditions, there is no need to moderate the net_dim()
+calls as well. As mentioned above, the driver needs to provide an object of type
+struct net_dim to the net_dim() function call. It is advised for each entity
+using Net DIM to hold a struct net_dim as part of its data structure and use it
+as the main Net DIM API object. The struct net_dim_sample should hold the latest
+bytes, packets and interrupts count. No need to perform any calculations, just
+include the raw data.
+
+The net_dim() call itself does not return anything. Instead Net DIM relies on
+the driver to provide a callback function, which is called when the algorithm
+decides to make a change in the interrupt moderation parameters. This callback
+will be scheduled and run in a separate thread in order not to add overhead to
+the data flow. After the work is done, Net DIM algorithm needs to be set to
+the proper state in order to move to the next iteration.
+
+
+Part IV: Example
+=================
+
+The following code demonstrates how to register a driver to Net DIM. The actual
+usage is not complete but it should make the outline of the usage clear.
+
+my_driver.c:
+
+#include <linux/net_dim.h>
+
+/* Callback for net DIM to schedule on a decision to change moderation */
+void my_driver_do_dim_work(struct work_struct *work)
+{
+	/* Get struct net_dim from struct work_struct */
+	struct net_dim *dim = container_of(work, struct net_dim,
+					   work);
+	/* Do interrupt moderation related stuff */
+	...
+
+	/* Signal net DIM work is done and it should move to next iteration */
+	dim->state = NET_DIM_START_MEASURE;
+}
+
+/* My driver's interrupt handler */
+int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
+{
+	...
+	/* A struct to hold current measured data */
+	struct net_dim_sample dim_sample;
+	...
+	/* Initiate data sample struct with current data */
+	net_dim_sample(my_entity->events,
+		       my_entity->packets,
+		       my_entity->bytes,
+		       &dim_sample);
+	/* Call net DIM */
+	net_dim(&my_entity->dim, dim_sample);
+	...
+}
+
+/* My entity's initialization function (my_entity was already allocated) */
+int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
+{
+	...
+	/* Initiate struct work_struct with my driver's callback function */
+	INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
+	...
+}
diff --git a/Documentation/networking/net_failover.rst b/Documentation/networking/net_failover.rst
new file mode 100644
index 000000000..06c97dcb5
--- /dev/null
+++ b/Documentation/networking/net_failover.rst
@@ -0,0 +1,119 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============
+NET_FAILOVER
+============
+
+Overview
+========
+
+The net_failover driver provides an automated failover mechanism via APIs
+to create and destroy a failover master netdev and mananges a primary and
+standby slave netdevs that get registered via the generic failover
+infrastructrure.
+
+The failover netdev acts a master device and controls 2 slave devices. The
+original paravirtual interface is registered as 'standby' slave netdev and
+a passthru/vf device with the same MAC gets registered as 'primary' slave
+netdev. Both 'standby' and 'failover' netdevs are associated with the same
+'pci' device. The user accesses the network interface via 'failover' netdev.
+The 'failover' netdev chooses 'primary' netdev as default for transmits when
+it is available with link up and running.
+
+This can be used by paravirtual drivers to enable an alternate low latency
+datapath. It also enables hypervisor controlled live migration of a VM with
+direct attached VF by failing over to the paravirtual datapath when the VF
+is unplugged.
+
+virtio-net accelerated datapath: STANDBY mode
+=============================================
+
+net_failover enables hypervisor controlled accelerated datapath to virtio-net
+enabled VMs in a transparent manner with no/minimal guest userspace chanages.
+
+To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
+feature on the virtio-net interface and assign the same MAC address to both
+virtio-net and VF interfaces.
+
+Here is an example XML snippet that shows such configuration.
+::
+
+  <interface type='network'>
+    <mac address='52:54:00:00:12:53'/>
+    <source network='enp66s0f0_br'/>
+    <target dev='tap01'/>
+    <model type='virtio'/>
+    <driver name='vhost' queues='4'/>
+    <link state='down'/>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
+  </interface>
+  <interface type='hostdev' managed='yes'>
+    <mac address='52:54:00:00:12:53'/>
+    <source>
+      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
+    </source>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+  </interface>
+
+Booting a VM with the above configuration will result in the following 3
+netdevs created in the VM.
+::
+
+  4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
+      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
+      inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
+         valid_lft 42482sec preferred_lft 42482sec
+      inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
+         valid_lft forever preferred_lft forever
+  5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
+      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
+  7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
+      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
+
+ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
+'standby' and 'primary' netdevs respectively.
+
+Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
+==================================================================
+
+net_failover also enables hypervisor controlled live migration to be supported
+with VMs that have direct attached SR-IOV VF devices by automatic failover to
+the paravirtual datapath when the VF is unplugged.
+
+Here is a sample script that shows the steps to initiate live migration on
+the source hypervisor.
+::
+
+  # cat vf_xml
+  <interface type='hostdev' managed='yes'>
+    <mac address='52:54:00:00:12:53'/>
+    <source>
+      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
+    </source>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+  </interface>
+
+  # Source Hypervisor
+  #!/bin/bash
+
+  DOMAIN=fedora27-tap01
+  PF=enp66s0f0
+  VF_NUM=5
+  TAP_IF=tap01
+  VF_XML=
+
+  MAC=52:54:00:00:12:53
+  ZERO_MAC=00:00:00:00:00:00
+
+  virsh domif-setlink $DOMAIN $TAP_IF up
+  bridge fdb del $MAC dev $PF master
+  virsh detach-device $DOMAIN $VF_XML
+  ip link set $PF vf $VF_NUM mac $ZERO_MAC
+
+  virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
+
+  # Destination Hypervisor
+  #!/bin/bash
+
+  virsh attach-device $DOMAIN $VF_XML
+  virsh domif-setlink $DOMAIN $TAP_IF down
diff --git a/Documentation/networking/netconsole.txt b/Documentation/networking/netconsole.txt
new file mode 100644
index 000000000..296ea00fd
--- /dev/null
+++ b/Documentation/networking/netconsole.txt
@@ -0,0 +1,210 @@
+
+started by Ingo Molnar <mingo@redhat.com>, 2001.09.17
+2.6 port and netpoll api by Matt Mackall <mpm@selenic.com>, Sep 9 2003
+IPv6 support by Cong Wang <xiyou.wangcong@gmail.com>, Jan 1 2013
+Extended console support by Tejun Heo <tj@kernel.org>, May 1 2015
+
+Please send bug reports to Matt Mackall <mpm@selenic.com>
+Satyam Sharma <satyam.sharma@gmail.com>, and Cong Wang <xiyou.wangcong@gmail.com>
+
+Introduction:
+=============
+
+This module logs kernel printk messages over UDP allowing debugging of
+problem where disk logging fails and serial consoles are impractical.
+
+It can be used either built-in or as a module. As a built-in,
+netconsole initializes immediately after NIC cards and will bring up
+the specified interface as soon as possible. While this doesn't allow
+capture of early kernel panics, it does capture most of the boot
+process.
+
+Sender and receiver configuration:
+==================================
+
+It takes a string configuration parameter "netconsole" in the
+following format:
+
+ netconsole=[+][src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr]
+
+   where
+        +             if present, enable extended console support
+        src-port      source for UDP packets (defaults to 6665)
+        src-ip        source IP to use (interface address)
+        dev           network interface (eth0)
+        tgt-port      port for logging agent (6666)
+        tgt-ip        IP address for logging agent
+        tgt-macaddr   ethernet MAC address for logging agent (broadcast)
+
+Examples:
+
+ linux netconsole=4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
+
+  or
+
+ insmod netconsole netconsole=@/,@10.0.0.2/
+
+  or using IPv6
+
+ insmod netconsole netconsole=@/,@fd00:1:2:3::1/
+
+It also supports logging to multiple remote agents by specifying
+parameters for the multiple agents separated by semicolons and the
+complete string enclosed in "quotes", thusly:
+
+ modprobe netconsole netconsole="@/,@10.0.0.2/;@/eth1,6892@10.0.0.3/"
+
+Built-in netconsole starts immediately after the TCP stack is
+initialized and attempts to bring up the supplied dev at the supplied
+address.
+
+The remote host has several options to receive the kernel messages,
+for example:
+
+1) syslogd
+
+2) netcat
+
+   On distributions using a BSD-based netcat version (e.g. Fedora,
+   openSUSE and Ubuntu) the listening port must be specified without
+   the -p switch:
+
+   'nc -u -l -p <port>' / 'nc -u -l <port>' or
+   'netcat -u -l -p <port>' / 'netcat -u -l <port>'
+
+3) socat
+
+   'socat udp-recv:<port> -'
+
+Dynamic reconfiguration:
+========================
+
+Dynamic reconfigurability is a useful addition to netconsole that enables
+remote logging targets to be dynamically added, removed, or have their
+parameters reconfigured at runtime from a configfs-based userspace interface.
+[ Note that the parameters of netconsole targets that were specified/created
+from the boot/module option are not exposed via this interface, and hence
+cannot be modified dynamically. ]
+
+To include this feature, select CONFIG_NETCONSOLE_DYNAMIC when building the
+netconsole module (or kernel, if netconsole is built-in).
+
+Some examples follow (where configfs is mounted at the /sys/kernel/config
+mountpoint).
+
+To add a remote logging target (target names can be arbitrary):
+
+ cd /sys/kernel/config/netconsole/
+ mkdir target1
+
+Note that newly created targets have default parameter values (as mentioned
+above) and are disabled by default -- they must first be enabled by writing
+"1" to the "enabled" attribute (usually after setting parameters accordingly)
+as described below.
+
+To remove a target:
+
+ rmdir /sys/kernel/config/netconsole/othertarget/
+
+The interface exposes these parameters of a netconsole target to userspace:
+
+	enabled		Is this target currently enabled?	(read-write)
+	extended	Extended mode enabled			(read-write)
+	dev_name	Local network interface name		(read-write)
+	local_port	Source UDP port to use			(read-write)
+	remote_port	Remote agent's UDP port			(read-write)
+	local_ip	Source IP address to use		(read-write)
+	remote_ip	Remote agent's IP address		(read-write)
+	local_mac	Local interface's MAC address		(read-only)
+	remote_mac	Remote agent's MAC address		(read-write)
+
+The "enabled" attribute is also used to control whether the parameters of
+a target can be updated or not -- you can modify the parameters of only
+disabled targets (i.e. if "enabled" is 0).
+
+To update a target's parameters:
+
+ cat enabled				# check if enabled is 1
+ echo 0 > enabled			# disable the target (if required)
+ echo eth2 > dev_name			# set local interface
+ echo 10.0.0.4 > remote_ip		# update some parameter
+ echo cb:a9:87:65:43:21 > remote_mac	# update more parameters
+ echo 1 > enabled			# enable target again
+
+You can also update the local interface dynamically. This is especially
+useful if you want to use interfaces that have newly come up (and may not
+have existed when netconsole was loaded / initialized).
+
+Extended console:
+=================
+
+If '+' is prefixed to the configuration line or "extended" config file
+is set to 1, extended console support is enabled. An example boot
+param follows.
+
+ linux netconsole=+4444@10.0.0.1/eth1,9353@10.0.0.2/12:34:56:78:9a:bc
+
+Log messages are transmitted with extended metadata header in the
+following format which is the same as /dev/kmsg.
+
+ <level>,<sequnum>,<timestamp>,<contflag>;<message text>
+
+Non printable characters in <message text> are escaped using "\xff"
+notation. If the message contains optional dictionary, verbatim
+newline is used as the delimeter.
+
+If a message doesn't fit in certain number of bytes (currently 1000),
+the message is split into multiple fragments by netconsole. These
+fragments are transmitted with "ncfrag" header field added.
+
+ ncfrag=<byte-offset>/<total-bytes>
+
+For example, assuming a lot smaller chunk size, a message "the first
+chunk, the 2nd chunk." may be split as follows.
+
+ 6,416,1758426,-,ncfrag=0/31;the first chunk,
+ 6,416,1758426,-,ncfrag=16/31; the 2nd chunk.
+
+Miscellaneous notes:
+====================
+
+WARNING: the default target ethernet setting uses the broadcast
+ethernet address to send packets, which can cause increased load on
+other systems on the same ethernet segment.
+
+TIP: some LAN switches may be configured to suppress ethernet broadcasts
+so it is advised to explicitly specify the remote agents' MAC addresses
+from the config parameters passed to netconsole.
+
+TIP: to find out the MAC address of, say, 10.0.0.2, you may try using:
+
+ ping -c 1 10.0.0.2 ; /sbin/arp -n | grep 10.0.0.2
+
+TIP: in case the remote logging agent is on a separate LAN subnet than
+the sender, it is suggested to try specifying the MAC address of the
+default gateway (you may use /sbin/route -n to find it out) as the
+remote MAC address instead.
+
+NOTE: the network device (eth1 in the above case) can run any kind
+of other network traffic, netconsole is not intrusive. Netconsole
+might cause slight delays in other traffic if the volume of kernel
+messages is high, but should have no other impact.
+
+NOTE: if you find that the remote logging agent is not receiving or
+printing all messages from the sender, it is likely that you have set
+the "console_loglevel" parameter (on the sender) to only send high
+priority messages to the console. You can change this at runtime using:
+
+ dmesg -n 8
+
+or by specifying "debug" on the kernel command line at boot, to send
+all kernel messages to the console. A specific value for this parameter
+can also be set using the "loglevel" kernel boot option. See the
+dmesg(8) man page and Documentation/admin-guide/kernel-parameters.rst for details.
+
+Netconsole was designed to be as instantaneous as possible, to
+enable the logging of even the most critical kernel bugs. It works
+from IRQ contexts as well, and does not enable interrupts while
+sending packets. Due to these unique needs, configuration cannot
+be more automatic, and some fundamental limitations will remain:
+only IP networks, UDP packets and ethernet devices are supported.
diff --git a/Documentation/networking/netdev-FAQ.rst b/Documentation/networking/netdev-FAQ.rst
new file mode 100644
index 000000000..0ac5fa77f
--- /dev/null
+++ b/Documentation/networking/netdev-FAQ.rst
@@ -0,0 +1,259 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _netdev-FAQ:
+
+==========
+netdev FAQ
+==========
+
+Q: What is netdev?
+------------------
+A: It is a mailing list for all network-related Linux stuff.  This
+includes anything found under net/ (i.e. core code like IPv6) and
+drivers/net (i.e. hardware specific drivers) in the Linux source tree.
+
+Note that some subsystems (e.g. wireless drivers) which have a high
+volume of traffic have their own specific mailing lists.
+
+The netdev list is managed (like many other Linux mailing lists) through
+VGER (http://vger.kernel.org/) and archives can be found below:
+
+-  http://marc.info/?l=linux-netdev
+-  http://www.spinics.net/lists/netdev/
+
+Aside from subsystems like that mentioned above, all network-related
+Linux development (i.e. RFC, review, comments, etc.) takes place on
+netdev.
+
+Q: How do the changes posted to netdev make their way into Linux?
+-----------------------------------------------------------------
+A: There are always two trees (git repositories) in play.  Both are
+driven by David Miller, the main network maintainer.  There is the
+``net`` tree, and the ``net-next`` tree.  As you can probably guess from
+the names, the ``net`` tree is for fixes to existing code already in the
+mainline tree from Linus, and ``net-next`` is where the new code goes
+for the future release.  You can find the trees here:
+
+- https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
+- https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
+
+Q: How often do changes from these trees make it to the mainline Linus tree?
+----------------------------------------------------------------------------
+A: To understand this, you need to know a bit of background information on
+the cadence of Linux development.  Each new release starts off with a
+two week "merge window" where the main maintainers feed their new stuff
+to Linus for merging into the mainline tree.  After the two weeks, the
+merge window is closed, and it is called/tagged ``-rc1``.  No new
+features get mainlined after this -- only fixes to the rc1 content are
+expected.  After roughly a week of collecting fixes to the rc1 content,
+rc2 is released.  This repeats on a roughly weekly basis until rc7
+(typically; sometimes rc6 if things are quiet, or rc8 if things are in a
+state of churn), and a week after the last vX.Y-rcN was done, the
+official vX.Y is released.
+
+Relating that to netdev: At the beginning of the 2-week merge window,
+the ``net-next`` tree will be closed - no new changes/features.  The
+accumulated new content of the past ~10 weeks will be passed onto
+mainline/Linus via a pull request for vX.Y -- at the same time, the
+``net`` tree will start accumulating fixes for this pulled content
+relating to vX.Y
+
+An announcement indicating when ``net-next`` has been closed is usually
+sent to netdev, but knowing the above, you can predict that in advance.
+
+IMPORTANT: Do not send new ``net-next`` content to netdev during the
+period during which ``net-next`` tree is closed.
+
+Shortly after the two weeks have passed (and vX.Y-rc1 is released), the
+tree for ``net-next`` reopens to collect content for the next (vX.Y+1)
+release.
+
+If you aren't subscribed to netdev and/or are simply unsure if
+``net-next`` has re-opened yet, simply check the ``net-next`` git
+repository link above for any new networking-related commits.  You may
+also check the following website for the current status:
+
+  http://vger.kernel.org/~davem/net-next.html
+
+The ``net`` tree continues to collect fixes for the vX.Y content, and is
+fed back to Linus at regular (~weekly) intervals.  Meaning that the
+focus for ``net`` is on stabilization and bug fixes.
+
+Finally, the vX.Y gets released, and the whole cycle starts over.
+
+Q: So where are we now in this cycle?
+
+Load the mainline (Linus) page here:
+
+  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+
+and note the top of the "tags" section.  If it is rc1, it is early in
+the dev cycle.  If it was tagged rc7 a week ago, then a release is
+probably imminent.
+
+Q: How do I indicate which tree (net vs. net-next) my patch should be in?
+-------------------------------------------------------------------------
+A: Firstly, think whether you have a bug fix or new "next-like" content.
+Then once decided, assuming that you use git, use the prefix flag, i.e.
+::
+
+  git format-patch --subject-prefix='PATCH net-next' start..finish
+
+Use ``net`` instead of ``net-next`` (always lower case) in the above for
+bug-fix ``net`` content.  If you don't use git, then note the only magic
+in the above is just the subject text of the outgoing e-mail, and you
+can manually change it yourself with whatever MUA you are comfortable
+with.
+
+Q: I sent a patch and I'm wondering what happened to it?
+--------------------------------------------------------
+Q: How can I tell whether it got merged?
+A: Start by looking at the main patchworks queue for netdev:
+
+  http://patchwork.ozlabs.org/project/netdev/list/
+
+The "State" field will tell you exactly where things are at with your
+patch.
+
+Q: The above only says "Under Review".  How can I find out more?
+----------------------------------------------------------------
+A: Generally speaking, the patches get triaged quickly (in less than
+48h).  So be patient.  Asking the maintainer for status updates on your
+patch is a good way to ensure your patch is ignored or pushed to the
+bottom of the priority list.
+
+Q: I submitted multiple versions of the patch series
+----------------------------------------------------
+Q: should I directly update patchwork for the previous versions of these
+patch series?
+A: No, please don't interfere with the patch status on patchwork, leave
+it to the maintainer to figure out what is the most recent and current
+version that should be applied. If there is any doubt, the maintainer
+will reply and ask what should be done.
+
+Q: How can I tell what patches are queued up for backporting to the various stable releases?
+--------------------------------------------------------------------------------------------
+A: Normally Greg Kroah-Hartman collects stable commits himself, but for
+networking, Dave collects up patches he deems critical for the
+networking subsystem, and then hands them off to Greg.
+
+There is a patchworks queue that you can see here:
+
+  http://patchwork.ozlabs.org/bundle/davem/stable/?state=*
+
+It contains the patches which Dave has selected, but not yet handed off
+to Greg.  If Greg already has the patch, then it will be here:
+
+  https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
+
+A quick way to find whether the patch is in this stable-queue is to
+simply clone the repo, and then git grep the mainline commit ID, e.g.
+::
+
+  stable-queue$ git grep -l 284041ef21fdf2e
+  releases/3.0.84/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
+  releases/3.4.51/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
+  releases/3.9.8/ipv6-fix-possible-crashes-in-ip6_cork_release.patch
+  stable/stable-queue$
+
+Q: I see a network patch and I think it should be backported to stable.
+-----------------------------------------------------------------------
+Q: Should I request it via stable@vger.kernel.org like the references in
+the kernel's Documentation/process/stable-kernel-rules.rst file say?
+A: No, not for networking.  Check the stable queues as per above first
+to see if it is already queued.  If not, then send a mail to netdev,
+listing the upstream commit ID and why you think it should be a stable
+candidate.
+
+Before you jump to go do the above, do note that the normal stable rules
+in :ref:`Documentation/process/stable-kernel-rules.rst <stable_kernel_rules>`
+still apply.  So you need to explicitly indicate why it is a critical
+fix and exactly what users are impacted.  In addition, you need to
+convince yourself that you *really* think it has been overlooked,
+vs. having been considered and rejected.
+
+Generally speaking, the longer it has had a chance to "soak" in
+mainline, the better the odds that it is an OK candidate for stable.  So
+scrambling to request a commit be added the day after it appears should
+be avoided.
+
+Q: I have created a network patch and I think it should be backported to stable.
+--------------------------------------------------------------------------------
+Q: Should I add a Cc: stable@vger.kernel.org like the references in the
+kernel's Documentation/ directory say?
+A: No.  See above answer.  In short, if you think it really belongs in
+stable, then ensure you write a decent commit log that describes who
+gets impacted by the bug fix and how it manifests itself, and when the
+bug was introduced.  If you do that properly, then the commit will get
+handled appropriately and most likely get put in the patchworks stable
+queue if it really warrants it.
+
+If you think there is some valid information relating to it being in
+stable that does *not* belong in the commit log, then use the three dash
+marker line as described in
+:ref:`Documentation/process/submitting-patches.rst <the_canonical_patch_format>`
+to temporarily embed that information into the patch that you send.
+
+Q: Are all networking bug fixes backported to all stable releases?
+------------------------------------------------------------------
+A: Due to capacity, Dave could only take care of the backports for the
+last two stable releases. For earlier stable releases, each stable
+branch maintainer is supposed to take care of them. If you find any
+patch is missing from an earlier stable branch, please notify
+stable@vger.kernel.org with either a commit ID or a formal patch
+backported, and CC Dave and other relevant networking developers.
+
+Q: Is the comment style convention different for the networking content?
+------------------------------------------------------------------------
+A: Yes, in a largely trivial way.  Instead of this::
+
+  /*
+   * foobar blah blah blah
+   * another line of text
+   */
+
+it is requested that you make it look like this::
+
+  /* foobar blah blah blah
+   * another line of text
+   */
+
+Q: I am working in existing code that has the former comment style and not the latter.
+--------------------------------------------------------------------------------------
+Q: Should I submit new code in the former style or the latter?
+A: Make it the latter style, so that eventually all code in the domain
+of netdev is of this format.
+
+Q: I found a bug that might have possible security implications or similar.
+---------------------------------------------------------------------------
+Q: Should I mail the main netdev maintainer off-list?**
+A: No. The current netdev maintainer has consistently requested that
+people use the mailing lists and not reach out directly.  If you aren't
+OK with that, then perhaps consider mailing security@kernel.org or
+reading about http://oss-security.openwall.org/wiki/mailing-lists/distros
+as possible alternative mechanisms.
+
+Q: What level of testing is expected before I submit my change?
+---------------------------------------------------------------
+A: If your changes are against ``net-next``, the expectation is that you
+have tested by layering your changes on top of ``net-next``.  Ideally
+you will have done run-time testing specific to your change, but at a
+minimum, your changes should survive an ``allyesconfig`` and an
+``allmodconfig`` build without new warnings or failures.
+
+Q: Any other tips to help ensure my net/net-next patch gets OK'd?
+-----------------------------------------------------------------
+A: Attention to detail.  Re-read your own work as if you were the
+reviewer.  You can start with using ``checkpatch.pl``, perhaps even with
+the ``--strict`` flag.  But do not be mindlessly robotic in doing so.
+If your change is a bug fix, make sure your commit log indicates the
+end-user visible symptom, the underlying reason as to why it happens,
+and then if necessary, explain why the fix proposed is the best way to
+get things done.  Don't mangle whitespace, and as is common, don't
+mis-indent function arguments that span multiple lines.  If it is your
+first patch, mail it to yourself so you can test apply it to an
+unpatched tree to confirm infrastructure didn't mangle it.
+
+Finally, go back and read
+:ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
+to be sure you are not repeating some common mistake documented there.
diff --git a/Documentation/networking/netdev-features.txt b/Documentation/networking/netdev-features.txt
new file mode 100644
index 000000000..c4a54c162
--- /dev/null
+++ b/Documentation/networking/netdev-features.txt
@@ -0,0 +1,181 @@
+Netdev features mess and how to get out from it alive
+=====================================================
+
+Author:
+	Michał Mirosław <mirq-linux@rere.qmqm.pl>
+
+
+
+ Part I: Feature sets
+======================
+
+Long gone are the days when a network card would just take and give packets
+verbatim.  Today's devices add multiple features and bugs (read: offloads)
+that relieve an OS of various tasks like generating and checking checksums,
+splitting packets, classifying them.  Those capabilities and their state
+are commonly referred to as netdev features in Linux kernel world.
+
+There are currently three sets of features relevant to the driver, and
+one used internally by network core:
+
+ 1. netdev->hw_features set contains features whose state may possibly
+    be changed (enabled or disabled) for a particular device by user's
+    request.  This set should be initialized in ndo_init callback and not
+    changed later.
+
+ 2. netdev->features set contains features which are currently enabled
+    for a device.  This should be changed only by network core or in
+    error paths of ndo_set_features callback.
+
+ 3. netdev->vlan_features set contains features whose state is inherited
+    by child VLAN devices (limits netdev->features set).  This is currently
+    used for all VLAN devices whether tags are stripped or inserted in
+    hardware or software.
+
+ 4. netdev->wanted_features set contains feature set requested by user.
+    This set is filtered by ndo_fix_features callback whenever it or
+    some device-specific conditions change. This set is internal to
+    networking core and should not be referenced in drivers.
+
+
+
+ Part II: Controlling enabled features
+=======================================
+
+When current feature set (netdev->features) is to be changed, new set
+is calculated and filtered by calling ndo_fix_features callback
+and netdev_fix_features(). If the resulting set differs from current
+set, it is passed to ndo_set_features callback and (if the callback
+returns success) replaces value stored in netdev->features.
+NETDEV_FEAT_CHANGE notification is issued after that whenever current
+set might have changed.
+
+The following events trigger recalculation:
+ 1. device's registration, after ndo_init returned success
+ 2. user requested changes in features state
+ 3. netdev_update_features() is called
+
+ndo_*_features callbacks are called with rtnl_lock held. Missing callbacks
+are treated as always returning success.
+
+A driver that wants to trigger recalculation must do so by calling
+netdev_update_features() while holding rtnl_lock. This should not be done
+from ndo_*_features callbacks. netdev->features should not be modified by
+driver except by means of ndo_fix_features callback.
+
+
+
+ Part III: Implementation hints
+================================
+
+ * ndo_fix_features:
+
+All dependencies between features should be resolved here. The resulting
+set can be reduced further by networking core imposed limitations (as coded
+in netdev_fix_features()). For this reason it is safer to disable a feature
+when its dependencies are not met instead of forcing the dependency on.
+
+This callback should not modify hardware nor driver state (should be
+stateless).  It can be called multiple times between successive
+ndo_set_features calls.
+
+Callback must not alter features contained in NETIF_F_SOFT_FEATURES or
+NETIF_F_NEVER_CHANGE sets. The exception is NETIF_F_VLAN_CHALLENGED but
+care must be taken as the change won't affect already configured VLANs.
+
+ * ndo_set_features:
+
+Hardware should be reconfigured to match passed feature set. The set
+should not be altered unless some error condition happens that can't
+be reliably detected in ndo_fix_features. In this case, the callback
+should update netdev->features to match resulting hardware state.
+Errors returned are not (and cannot be) propagated anywhere except dmesg.
+(Note: successful return is zero, >0 means silent error.)
+
+
+
+ Part IV: Features
+===================
+
+For current list of features, see include/linux/netdev_features.h.
+This section describes semantics of some of them.
+
+ * Transmit checksumming
+
+For complete description, see comments near the top of include/linux/skbuff.h.
+
+Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
+It means that device can fill TCP/UDP-like checksum anywhere in the packets
+whatever headers there might be.
+
+ * Transmit TCP segmentation offload
+
+NETIF_F_TSO_ECN means that hardware can properly split packets with CWR bit
+set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).
+
+ * Transmit UDP segmentation offload
+
+NETIF_F_GSO_UDP_GSO_L4 accepts a single UDP header with a payload that exceeds
+gso_size. On segmentation, it segments the payload on gso_size boundaries and
+replicates the network and UDP headers (fixing up the last one if less than
+gso_size).
+
+ * Transmit DMA from high memory
+
+On platforms where this is relevant, NETIF_F_HIGHDMA signals that
+ndo_start_xmit can handle skbs with frags in high memory.
+
+ * Transmit scatter-gather
+
+Those features say that ndo_start_xmit can handle fragmented skbs:
+NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
+chained skbs (skb->next/prev list).
+
+ * Software features
+
+Features contained in NETIF_F_SOFT_FEATURES are features of networking
+stack. Driver should not change behaviour based on them.
+
+ * LLTX driver (deprecated for hardware drivers)
+
+NETIF_F_LLTX is meant to be used by drivers that don't need locking at all,
+e.g. software tunnels.
+
+This is also used in a few legacy drivers that implement their
+own locking, don't use it for new (hardware) drivers.
+
+ * netns-local device
+
+NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
+network namespaces (e.g. loopback).
+
+Don't use it in drivers.
+
+ * VLAN challenged
+
+NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
+headers. Some drivers set this because the cards can't handle the bigger MTU.
+[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
+VLANs. This may be not useful, though.]
+
+*  rx-fcs
+
+This requests that the NIC append the Ethernet Frame Checksum (FCS)
+to the end of the skb data.  This allows sniffers and other tools to
+read the CRC recorded by the NIC on receipt of the packet.
+
+*  rx-all
+
+This requests that the NIC receive all possible frames, including errored
+frames (such as bad FCS, etc).  This can be helpful when sniffing a link with
+bad packets on it.  Some NICs may receive more packets if also put into normal
+PROMISC mode.
+
+*  rx-gro-hw
+
+This requests that the NIC enables Hardware GRO (generic receive offload).
+Hardware GRO is basically the exact reverse of TSO, and is generally
+stricter than Hardware LRO.  A packet stream merged by Hardware GRO must
+be re-segmentable by GSO or TSO back to the exact original packet stream.
+Hardware GRO is dependent on RXCSUM since every packet successfully merged
+by hardware must also have the checksum verified by hardware.
diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt
new file mode 100644
index 000000000..7fec2061a
--- /dev/null
+++ b/Documentation/networking/netdevices.txt
@@ -0,0 +1,104 @@
+
+Network Devices, the Kernel, and You!
+
+
+Introduction
+============
+The following is a random collection of documentation regarding
+network devices.
+
+struct net_device allocation rules
+==================================
+Network device structures need to persist even after module is unloaded and
+must be allocated with alloc_netdev_mqs() and friends.
+If device has registered successfully, it will be freed on last use
+by free_netdev(). This is required to handle the pathologic case cleanly
+(example: rmmod mydriver </sys/class/net/myeth/mtu )
+
+alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver
+private data which gets freed when the network device is freed. If
+separately allocated data is attached to the network device
+(netdev_priv(dev)) then it is up to the module exit handler to free that.
+
+MTU
+===
+Each network device has a Maximum Transfer Unit. The MTU does not
+include any link layer protocol overhead. Upper layer protocols must
+not pass a socket buffer (skb) to a device to transmit with more data
+than the mtu. The MTU does not include link layer header overhead, so
+for example on Ethernet if the standard MTU is 1500 bytes used, the
+actual skb will contain up to 1514 bytes because of the Ethernet
+header. Devices should allow for the 4 byte VLAN header as well.
+
+Segmentation Offload (GSO, TSO) is an exception to this rule.  The
+upper layer protocol may pass a large socket buffer to the device
+transmit routine, and the device will break that up into separate
+packets based on the current MTU.
+
+MTU is symmetrical and applies both to receive and transmit. A device
+must be able to receive at least the maximum size packet allowed by
+the MTU. A network device may use the MTU as mechanism to size receive
+buffers, but the device should allow packets with VLAN header. With
+standard Ethernet mtu of 1500 bytes, the device should allow up to
+1518 byte packets (1500 + 14 header + 4 tag).  The device may either:
+drop, truncate, or pass up oversize packets, but dropping oversize
+packets is preferred.
+
+
+struct net_device synchronization rules
+=======================================
+ndo_open:
+	Synchronization: rtnl_lock() semaphore.
+	Context: process
+
+ndo_stop:
+	Synchronization: rtnl_lock() semaphore.
+	Context: process
+	Note: netif_running() is guaranteed false
+
+ndo_do_ioctl:
+	Synchronization: rtnl_lock() semaphore.
+	Context: process
+
+ndo_get_stats:
+	Synchronization: dev_base_lock rwlock.
+	Context: nominally process, but don't sleep inside an rwlock
+
+ndo_start_xmit:
+	Synchronization: __netif_tx_lock spinlock.
+
+	When the driver sets NETIF_F_LLTX in dev->features this will be
+	called without holding netif_tx_lock. In this case the driver
+	has to lock by itself when needed.
+	The locking there should also properly protect against
+	set_rx_mode. WARNING: use of NETIF_F_LLTX is deprecated.
+	Don't use it for new drivers.
+
+	Context: Process with BHs disabled or BH (timer),
+	         will be called with interrupts disabled by netconsole.
+
+	Return codes: 
+	o NETDEV_TX_OK everything ok. 
+	o NETDEV_TX_BUSY Cannot transmit packet, try later 
+	  Usually a bug, means queue start/stop flow control is broken in
+	  the driver. Note: the driver must NOT put the skb in its DMA ring.
+
+ndo_tx_timeout:
+	Synchronization: netif_tx_lock spinlock; all TX queues frozen.
+	Context: BHs disabled
+	Notes: netif_queue_stopped() is guaranteed true
+
+ndo_set_rx_mode:
+	Synchronization: netif_addr_lock spinlock.
+	Context: BHs disabled
+
+struct napi_struct synchronization rules
+========================================
+napi->poll:
+	Synchronization: NAPI_STATE_SCHED bit in napi->state.  Device
+		driver's ndo_stop method will invoke napi_disable() on
+		all NAPI instances which will do a sleeping poll on the
+		NAPI_STATE_SCHED napi->state bit, waiting for all pending
+		NAPI activity to cease.
+	Context: softirq
+	         will be called with interrupts disabled by netconsole.
diff --git a/Documentation/networking/netfilter-sysctl.txt b/Documentation/networking/netfilter-sysctl.txt
new file mode 100644
index 000000000..55791e50e
--- /dev/null
+++ b/Documentation/networking/netfilter-sysctl.txt
@@ -0,0 +1,10 @@
+/proc/sys/net/netfilter/* Variables:
+
+nf_log_all_netns - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	By default, only init_net namespace can log packets into kernel log
+	with LOG target; this aims to prevent containers from flooding host
+	kernel log. If enabled, this target also works in other network
+	namespaces. This variable is only accessible from init_net.
diff --git a/Documentation/networking/netif-msg.txt b/Documentation/networking/netif-msg.txt
new file mode 100644
index 000000000..c967ddb90
--- /dev/null
+++ b/Documentation/networking/netif-msg.txt
@@ -0,0 +1,79 @@
+
+________________
+NETIF Msg Level
+
+The design of the network interface message level setting.
+
+History
+
+ The design of the debugging message interface was guided and
+ constrained by backwards compatibility previous practice.  It is useful
+ to understand the history and evolution in order to understand current
+ practice and relate it to older driver source code.
+
+ From the beginning of Linux, each network device driver has had a local
+ integer variable that controls the debug message level.  The message
+ level ranged from 0 to 7, and monotonically increased in verbosity.
+
+ The message level was not precisely defined past level 3, but were
+ always implemented within +-1 of the specified level.  Drivers tended
+ to shed the more verbose level messages as they matured.
+    0  Minimal messages, only essential information on fatal errors.
+    1  Standard messages, initialization status.  No run-time messages
+    2  Special media selection messages, generally timer-driver.
+    3  Interface starts and stops, including normal status messages
+    4  Tx and Rx frame error messages, and abnormal driver operation
+    5  Tx packet queue information, interrupt events.
+    6  Status on each completed Tx packet and received Rx packets
+    7  Initial contents of Tx and Rx packets
+
+ Initially this message level variable was uniquely named in each driver
+ e.g. "lance_debug", so that a kernel symbolic debugger could locate and
+ modify the setting.  When kernel modules became common, the variables
+ were consistently renamed to "debug" and allowed to be set as a module
+ parameter.
+
+ This approach worked well.  However there is always a demand for
+ additional features.  Over the years the following emerged as
+ reasonable and easily implemented enhancements
+   Using an ioctl() call to modify the level.
+   Per-interface rather than per-driver message level setting.
+   More selective control over the type of messages emitted.
+
+ The netif_msg recommendation adds these features with only a minor
+ complexity and code size increase.
+
+ The recommendation is the following points
+    Retaining the per-driver integer variable "debug" as a module
+    parameter with a default level of '1'.
+
+    Adding a per-interface private variable named "msg_enable".  The
+    variable is a bit map rather than a level, and is initialized as
+       1 << debug
+    Or more precisely
+        debug < 0 ? 0 : 1 << min(sizeof(int)-1, debug)
+
+    Messages should changes from
+      if (debug > 1)
+           printk(MSG_DEBUG "%s: ...
+    to
+      if (np->msg_enable & NETIF_MSG_LINK)
+           printk(MSG_DEBUG "%s: ...
+
+
+The set of message levels is named
+  Old level   Name   Bit position
+    0    NETIF_MSG_DRV		0x0001
+    1    NETIF_MSG_PROBE	0x0002
+    2    NETIF_MSG_LINK		0x0004
+    2    NETIF_MSG_TIMER	0x0004
+    3    NETIF_MSG_IFDOWN	0x0008
+    3    NETIF_MSG_IFUP		0x0008
+    4    NETIF_MSG_RX_ERR	0x0010
+    4    NETIF_MSG_TX_ERR	0x0010
+    5    NETIF_MSG_TX_QUEUED	0x0020
+    5    NETIF_MSG_INTR		0x0020
+    6    NETIF_MSG_TX_DONE	0x0040
+    6    NETIF_MSG_RX_STATUS	0x0040
+    7    NETIF_MSG_PKTDATA	0x0080
+
diff --git a/Documentation/networking/netvsc.txt b/Documentation/networking/netvsc.txt
new file mode 100644
index 000000000..92f5b3139
--- /dev/null
+++ b/Documentation/networking/netvsc.txt
@@ -0,0 +1,75 @@
+Hyper-V network driver
+======================
+
+Compatibility
+=============
+
+This driver is compatible with Windows Server 2012 R2, 2016 and
+Windows 10.
+
+Features
+========
+
+  Checksum offload
+  ----------------
+  The netvsc driver supports checksum offload as long as the
+  Hyper-V host version does. Windows Server 2016 and Azure
+  support checksum offload for TCP and UDP for both IPv4 and
+  IPv6. Windows Server 2012 only supports checksum offload for TCP.
+
+  Receive Side Scaling
+  --------------------
+  Hyper-V supports receive side scaling. For TCP & UDP, packets can
+  be distributed among available queues based on IP address and port
+  number.
+
+  For TCP & UDP, we can switch hash level between L3 and L4 by ethtool
+  command. TCP/UDP over IPv4 and v6 can be set differently. The default
+  hash level is L4. We currently only allow switching TX hash level
+  from within the guests.
+
+  On Azure, fragmented UDP packets have high loss rate with L4
+  hashing. Using L3 hashing is recommended in this case.
+
+  For example, for UDP over IPv4 on eth0:
+  To include UDP port numbers in hashing:
+        ethtool -N eth0 rx-flow-hash udp4 sdfn
+  To exclude UDP port numbers in hashing:
+        ethtool -N eth0 rx-flow-hash udp4 sd
+  To show UDP hash level:
+        ethtool -n eth0 rx-flow-hash udp4
+
+  Generic Receive Offload, aka GRO
+  --------------------------------
+  The driver supports GRO and it is enabled by default. GRO coalesces
+  like packets and significantly reduces CPU usage under heavy Rx
+  load.
+
+  SR-IOV support
+  --------------
+  Hyper-V supports SR-IOV as a hardware acceleration option. If SR-IOV
+  is enabled in both the vSwitch and the guest configuration, then the
+  Virtual Function (VF) device is passed to the guest as a PCI
+  device. In this case, both a synthetic (netvsc) and VF device are
+  visible in the guest OS and both NIC's have the same MAC address.
+
+  The VF is enslaved by netvsc device.  The netvsc driver will transparently
+  switch the data path to the VF when it is available and up.
+  Network state (addresses, firewall, etc) should be applied only to the
+  netvsc device; the slave device should not be accessed directly in
+  most cases.  The exceptions are if some special queue discipline or
+  flow direction is desired, these should be applied directly to the
+  VF slave device.
+
+  Receive Buffer
+  --------------
+  Packets are received into a receive area which is created when device
+  is probed. The receive area is broken into MTU sized chunks and each may
+  contain one or more packets. The number of receive sections may be changed
+  via ethtool Rx ring parameters.
+
+  There is a similar send buffer which is used to aggregate packets for sending.
+  The send area is broken into chunks of 6144 bytes, each of section may
+  contain one or more packets. The send buffer is an optimization, the driver
+  will use slower method to handle very large packets or if the send buffer
+  area is exhausted.
diff --git a/Documentation/networking/nf_conntrack-sysctl.txt b/Documentation/networking/nf_conntrack-sysctl.txt
new file mode 100644
index 000000000..1669dc241
--- /dev/null
+++ b/Documentation/networking/nf_conntrack-sysctl.txt
@@ -0,0 +1,163 @@
+/proc/sys/net/netfilter/nf_conntrack_* Variables:
+
+nf_conntrack_acct - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	Enable connection tracking flow accounting. 64-bit byte and packet
+	counters per flow are added.
+
+nf_conntrack_buckets - INTEGER
+	Size of hash table. If not specified as parameter during module
+	loading, the default size is calculated by dividing total memory
+	by 16384 to determine the number of buckets but the hash table will
+	never have fewer than 32 and limited to 16384 buckets. For systems
+	with more than 4GB of memory it will be 65536 buckets.
+	This sysctl is only writeable in the initial net namespace.
+
+nf_conntrack_checksum - BOOLEAN
+	0 - disabled
+	not 0 - enabled (default)
+
+	Verify checksum of incoming packets. Packets with bad checksums are
+	in INVALID state. If this is enabled, such packets will not be
+	considered for connection tracking.
+
+nf_conntrack_count - INTEGER (read-only)
+	Number of currently allocated flow entries.
+
+nf_conntrack_events - BOOLEAN
+	0 - disabled
+	not 0 - enabled (default)
+
+	If this option is enabled, the connection tracking code will
+	provide userspace with connection tracking events via ctnetlink.
+
+nf_conntrack_expect_max - INTEGER
+	Maximum size of expectation table.  Default value is
+	nf_conntrack_buckets / 256. Minimum is 1.
+
+nf_conntrack_frag6_high_thresh - INTEGER
+	default 262144
+
+	Maximum memory used to reassemble IPv6 fragments.  When
+	nf_conntrack_frag6_high_thresh bytes of memory is allocated for this
+	purpose, the fragment handler will toss packets until
+	nf_conntrack_frag6_low_thresh is reached.
+
+nf_conntrack_frag6_low_thresh - INTEGER
+	default 196608
+
+	See nf_conntrack_frag6_low_thresh
+
+nf_conntrack_frag6_timeout - INTEGER (seconds)
+	default 60
+
+	Time to keep an IPv6 fragment in memory.
+
+nf_conntrack_generic_timeout - INTEGER (seconds)
+	default 600
+
+	Default for generic timeout.  This refers to layer 4 unknown/unsupported
+	protocols.
+
+nf_conntrack_helper - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	Enable automatic conntrack helper assignment.
+	If disabled it is required to set up iptables rules to assign
+	helpers to connections.  See the CT target description in the
+	iptables-extensions(8) man page for further information.
+
+nf_conntrack_icmp_timeout - INTEGER (seconds)
+	default 30
+
+	Default for ICMP timeout.
+
+nf_conntrack_icmpv6_timeout - INTEGER (seconds)
+	default 30
+
+	Default for ICMP6 timeout.
+
+nf_conntrack_log_invalid - INTEGER
+	0   - disable (default)
+	1   - log ICMP packets
+	6   - log TCP packets
+	17  - log UDP packets
+	33  - log DCCP packets
+	41  - log ICMPv6 packets
+	136 - log UDPLITE packets
+	255 - log packets of any protocol
+
+	Log invalid packets of a type specified by value.
+
+nf_conntrack_max - INTEGER
+	Size of connection tracking table.  Default value is
+	nf_conntrack_buckets value * 4.
+
+nf_conntrack_tcp_be_liberal - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	Be conservative in what you do, be liberal in what you accept from others.
+	If it's non-zero, we mark only out of window RST segments as INVALID.
+
+nf_conntrack_tcp_loose - BOOLEAN
+	0 - disabled
+	not 0 - enabled (default)
+
+	If it is set to zero, we disable picking up already established
+	connections.
+
+nf_conntrack_tcp_max_retrans - INTEGER
+	default 3
+
+	Maximum number of packets that can be retransmitted without
+	received an (acceptable) ACK from the destination. If this number
+	is reached, a shorter timer will be started.
+
+nf_conntrack_tcp_timeout_close - INTEGER (seconds)
+	default 10
+
+nf_conntrack_tcp_timeout_close_wait - INTEGER (seconds)
+	default 60
+
+nf_conntrack_tcp_timeout_established - INTEGER (seconds)
+	default 432000 (5 days)
+
+nf_conntrack_tcp_timeout_fin_wait - INTEGER (seconds)
+	default 120
+
+nf_conntrack_tcp_timeout_last_ack - INTEGER (seconds)
+	default 30
+
+nf_conntrack_tcp_timeout_max_retrans - INTEGER (seconds)
+	default 300
+
+nf_conntrack_tcp_timeout_syn_recv - INTEGER (seconds)
+	default 60
+
+nf_conntrack_tcp_timeout_syn_sent - INTEGER (seconds)
+	default 120
+
+nf_conntrack_tcp_timeout_time_wait - INTEGER (seconds)
+	default 120
+
+nf_conntrack_tcp_timeout_unacknowledged - INTEGER (seconds)
+	default 300
+
+nf_conntrack_timestamp - BOOLEAN
+	0 - disabled (default)
+	not 0 - enabled
+
+	Enable connection tracking flow timestamping.
+
+nf_conntrack_udp_timeout - INTEGER (seconds)
+	default 30
+
+nf_conntrack_udp_timeout_stream - INTEGER (seconds)
+	default 180
+
+	This extended timeout will be used in case there is an UDP stream
+	detected.
diff --git a/Documentation/networking/nf_flowtable.txt b/Documentation/networking/nf_flowtable.txt
new file mode 100644
index 000000000..b01c91893
--- /dev/null
+++ b/Documentation/networking/nf_flowtable.txt
@@ -0,0 +1,112 @@
+Netfilter's flowtable infrastructure
+====================================
+
+This documentation describes the software flowtable infrastructure available in
+Netfilter since Linux kernel 4.16.
+
+Overview
+--------
+
+Initial packets follow the classic forwarding path, once the flow enters the
+established state according to the conntrack semantics (ie. we have seen traffic
+in both directions), then you can decide to offload the flow to the flowtable
+from the forward chain via the 'flow offload' action available in nftables.
+
+Packets that find an entry in the flowtable (ie. flowtable hit) are sent to the
+output netdevice via neigh_xmit(), hence, they bypass the classic forwarding
+path (the visible effect is that you do not see these packets from any of the
+netfilter hooks coming after the ingress). In case of flowtable miss, the packet
+follows the classic forward path.
+
+The flowtable uses a resizable hashtable, lookups are based on the following
+7-tuple selectors: source, destination, layer 3 and layer 4 protocols, source
+and destination ports and the input interface (useful in case there are several
+conntrack zones in place).
+
+Flowtables are populated via the 'flow offload' nftables action, so the user can
+selectively specify what flows are placed into the flow table. Hence, packets
+follow the classic forwarding path unless the user explicitly instruct packets
+to use this new alternative forwarding path via nftables policy.
+
+This is represented in Fig.1, which describes the classic forwarding path
+including the Netfilter hooks and the flowtable fastpath bypass.
+
+                                         userspace process
+                                          ^              |
+                                          |              |
+                                     _____|____     ____\/___
+                                    /          \   /         \
+                                    |   input   |  |  output  |
+                                    \__________/   \_________/
+                                         ^               |
+                                         |               |
+      _________      __________      ---------     _____\/_____
+     /         \    /          \     |Routing |   /            \
+  -->  ingress  ---> prerouting ---> |decision|   | postrouting |--> neigh_xmit
+     \_________/    \__________/     ----------   \____________/          ^
+       |      ^          |               |               ^                |
+   flowtable  |          |          ____\/___            |                |
+       |      |          |         /         \           |                |
+    __\/___   |          --------->| forward |------------                |
+    |-----|   |                    \_________/                            |
+    |-----|   |                 'flow offload' rule                       |
+    |-----|   |                   adds entry to                           |
+    |_____|   |                     flowtable                             |
+       |      |                                                           |
+      / \     |                                                           |
+     /hit\_no_|                                                           |
+     \ ? /                                                                |
+      \ /                                                                 |
+       |__yes_________________fastpath bypass ____________________________|
+
+               Fig.1 Netfilter hooks and flowtable interactions
+
+The flowtable entry also stores the NAT configuration, so all packets are
+mangled according to the NAT policy that matches the initial packets that went
+through the classic forwarding path. The TTL is decremented before calling
+neigh_xmit(). Fragmented traffic is passed up to follow the classic forwarding
+path given that the transport selectors are missing, therefore flowtable lookup
+is not possible.
+
+Example configuration
+---------------------
+
+Enabling the flowtable bypass is relatively easy, you only need to create a
+flowtable and add one rule to your forward chain.
+
+        table inet x {
+		flowtable f {
+			hook ingress priority 0; devices = { eth0, eth1 };
+		}
+                chain y {
+                        type filter hook forward priority 0; policy accept;
+                        ip protocol tcp flow offload @f
+                        counter packets 0 bytes 0
+                }
+        }
+
+This example adds the flowtable 'f' to the ingress hook of the eth0 and eth1
+netdevices. You can create as many flowtables as you want in case you need to
+perform resource partitioning. The flowtable priority defines the order in which
+hooks are run in the pipeline, this is convenient in case you already have a
+nftables ingress chain (make sure the flowtable priority is smaller than the
+nftables ingress chain hence the flowtable runs before in the pipeline).
+
+The 'flow offload' action from the forward chain 'y' adds an entry to the
+flowtable for the TCP syn-ack packet coming in the reply direction. Once the
+flow is offloaded, you will observe that the counter rule in the example above
+does not get updated for the packets that are being forwarded through the
+forwarding bypass.
+
+More reading
+------------
+
+This documentation is based on the LWN.net articles [1][2]. Rafal Milecki also
+made a very complete and comprehensive summary called "A state of network
+acceleration" that describes how things were before this infrastructure was
+mailined [3] and it also makes a rough summary of this work [4].
+
+[1] https://lwn.net/Articles/738214/
+[2] https://lwn.net/Articles/742164/
+[3] http://lists.infradead.org/pipermail/lede-dev/2018-January/010830.html
+[4] http://lists.infradead.org/pipermail/lede-dev/2018-January/010829.html
diff --git a/Documentation/networking/nfc.txt b/Documentation/networking/nfc.txt
new file mode 100644
index 000000000..b24c29bda
--- /dev/null
+++ b/Documentation/networking/nfc.txt
@@ -0,0 +1,128 @@
+Linux NFC subsystem
+===================
+
+The Near Field Communication (NFC) subsystem is required to standardize the
+NFC device drivers development and to create an unified userspace interface.
+
+This document covers the architecture overview, the device driver interface
+description and the userspace interface description.
+
+Architecture overview
+---------------------
+
+The NFC subsystem is responsible for:
+      - NFC adapters management;
+      - Polling for targets;
+      - Low-level data exchange;
+
+The subsystem is divided in some parts. The 'core' is responsible for
+providing the device driver interface. On the other side, it is also
+responsible for providing an interface to control operations and low-level
+data exchange.
+
+The control operations are available to userspace via generic netlink.
+
+The low-level data exchange interface is provided by the new socket family
+PF_NFC. The NFC_SOCKPROTO_RAW performs raw communication with NFC targets.
+
+
+             +--------------------------------------+
+             |              USER SPACE              |
+             +--------------------------------------+
+                 ^                       ^
+                 | low-level             | control
+                 | data exchange         | operations
+                 |                       |
+                 |                       v
+                 |                  +-----------+
+                 | AF_NFC           |  netlink  |
+                 | socket           +-----------+
+                 | raw                   ^
+                 |                       |
+                 v                       v
+             +---------+            +-----------+
+             | rawsock | <--------> |   core    |
+             +---------+            +-----------+
+                                         ^
+                                         |
+                                         v
+                                    +-----------+
+                                    |  driver   |
+                                    +-----------+
+
+Device Driver Interface
+-----------------------
+
+When registering on the NFC subsystem, the device driver must inform the core
+of the set of supported NFC protocols and the set of ops callbacks. The ops
+callbacks that must be implemented are the following:
+
+* start_poll - setup the device to poll for targets
+* stop_poll - stop on progress polling operation
+* activate_target - select and initialize one of the targets found
+* deactivate_target - deselect and deinitialize the selected target
+* data_exchange - send data and receive the response (transceive operation)
+
+Userspace interface
+--------------------
+
+The userspace interface is divided in control operations and low-level data
+exchange operation.
+
+CONTROL OPERATIONS:
+
+Generic netlink is used to implement the interface to the control operations.
+The operations are composed by commands and events, all listed below:
+
+* NFC_CMD_GET_DEVICE - get specific device info or dump the device list
+* NFC_CMD_START_POLL - setup a specific device to polling for targets
+* NFC_CMD_STOP_POLL - stop the polling operation in a specific device
+* NFC_CMD_GET_TARGET - dump the list of targets found by a specific device
+
+* NFC_EVENT_DEVICE_ADDED - reports an NFC device addition
+* NFC_EVENT_DEVICE_REMOVED - reports an NFC device removal
+* NFC_EVENT_TARGETS_FOUND - reports START_POLL results when 1 or more targets
+are found
+
+The user must call START_POLL to poll for NFC targets, passing the desired NFC
+protocols through NFC_ATTR_PROTOCOLS attribute. The device remains in polling
+state until it finds any target. However, the user can stop the polling
+operation by calling STOP_POLL command. In this case, it will be checked if
+the requester of STOP_POLL is the same of START_POLL.
+
+If the polling operation finds one or more targets, the event TARGETS_FOUND is
+sent (including the device id). The user must call GET_TARGET to get the list of
+all targets found by such device. Each reply message has target attributes with
+relevant information such as the supported NFC protocols.
+
+All polling operations requested through one netlink socket are stopped when
+it's closed.
+
+LOW-LEVEL DATA EXCHANGE:
+
+The userspace must use PF_NFC sockets to perform any data communication with
+targets. All NFC sockets use AF_NFC:
+
+struct sockaddr_nfc {
+       sa_family_t sa_family;
+       __u32 dev_idx;
+       __u32 target_idx;
+       __u32 nfc_protocol;
+};
+
+To establish a connection with one target, the user must create an
+NFC_SOCKPROTO_RAW socket and call the 'connect' syscall with the sockaddr_nfc
+struct correctly filled. All information comes from NFC_EVENT_TARGETS_FOUND
+netlink event. As a target can support more than one NFC protocol, the user
+must inform which protocol it wants to use.
+
+Internally, 'connect' will result in an activate_target call to the driver.
+When the socket is closed, the target is deactivated.
+
+The data format exchanged through the sockets is NFC protocol dependent. For
+instance, when communicating with MIFARE tags, the data exchanged are MIFARE
+commands and their responses.
+
+The first received package is the response to the first sent package and so
+on. In order to allow valid "empty" responses, every data received has a NULL
+header of 1 byte.
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
new file mode 100644
index 000000000..b3b9ac61d
--- /dev/null
+++ b/Documentation/networking/openvswitch.txt
@@ -0,0 +1,248 @@
+Open vSwitch datapath developer documentation
+=============================================
+
+The Open vSwitch kernel module allows flexible userspace control over
+flow-level packet processing on selected network devices.  It can be
+used to implement a plain Ethernet switch, network device bonding,
+VLAN processing, network access control, flow-based network control,
+and so on.
+
+The kernel module implements multiple "datapaths" (analogous to
+bridges), each of which can have multiple "vports" (analogous to ports
+within a bridge).  Each datapath also has associated with it a "flow
+table" that userspace populates with "flows" that map from keys based
+on packet headers and metadata to sets of actions.  The most common
+action forwards the packet to another vport; other actions are also
+implemented.
+
+When a packet arrives on a vport, the kernel module processes it by
+extracting its flow key and looking it up in the flow table.  If there
+is a matching flow, it executes the associated actions.  If there is
+no match, it queues the packet to userspace for processing (as part of
+its processing, userspace will likely set up a flow to handle further
+packets of the same type entirely in-kernel).
+
+
+Flow key compatibility
+----------------------
+
+Network protocols evolve over time.  New protocols become important
+and existing protocols lose their prominence.  For the Open vSwitch
+kernel module to remain relevant, it must be possible for newer
+versions to parse additional protocols as part of the flow key.  It
+might even be desirable, someday, to drop support for parsing
+protocols that have become obsolete.  Therefore, the Netlink interface
+to Open vSwitch is designed to allow carefully written userspace
+applications to work with any version of the flow key, past or future.
+
+To support this forward and backward compatibility, whenever the
+kernel module passes a packet to userspace, it also passes along the
+flow key that it parsed from the packet.  Userspace then extracts its
+own notion of a flow key from the packet and compares it against the
+kernel-provided version:
+
+    - If userspace's notion of the flow key for the packet matches the
+      kernel's, then nothing special is necessary.
+
+    - If the kernel's flow key includes more fields than the userspace
+      version of the flow key, for example if the kernel decoded IPv6
+      headers but userspace stopped at the Ethernet type (because it
+      does not understand IPv6), then again nothing special is
+      necessary.  Userspace can still set up a flow in the usual way,
+      as long as it uses the kernel-provided flow key to do it.
+
+    - If the userspace flow key includes more fields than the
+      kernel's, for example if userspace decoded an IPv6 header but
+      the kernel stopped at the Ethernet type, then userspace can
+      forward the packet manually, without setting up a flow in the
+      kernel.  This case is bad for performance because every packet
+      that the kernel considers part of the flow must go to userspace,
+      but the forwarding behavior is correct.  (If userspace can
+      determine that the values of the extra fields would not affect
+      forwarding behavior, then it could set up a flow anyway.)
+
+How flow keys evolve over time is important to making this work, so
+the following sections go into detail.
+
+
+Flow key format
+---------------
+
+A flow key is passed over a Netlink socket as a sequence of Netlink
+attributes.  Some attributes represent packet metadata, defined as any
+information about a packet that cannot be extracted from the packet
+itself, e.g. the vport on which the packet was received.  Most
+attributes, however, are extracted from headers within the packet,
+e.g. source and destination addresses from Ethernet, IP, or TCP
+headers.
+
+The <linux/openvswitch.h> header file defines the exact format of the
+flow key attributes.  For informal explanatory purposes here, we write
+them as comma-separated strings, with parentheses indicating arguments
+and nesting.  For example, the following could represent a flow key
+corresponding to a TCP packet that arrived on vport 1:
+
+    in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
+    eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
+    frag=no), tcp(src=49163, dst=80)
+
+Often we ellipsize arguments not important to the discussion, e.g.:
+
+    in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
+
+
+Wildcarded flow key format
+--------------------------
+
+A wildcarded flow is described with two sequences of Netlink attributes
+passed over the Netlink socket. A flow key, exactly as described above, and an
+optional corresponding flow mask.
+
+A wildcarded flow can represent a group of exact match flows. Each '1' bit
+in the mask specifies a exact match with the corresponding bit in the flow key.
+A '0' bit specifies a don't care bit, which will match either a '1' or '0' bit
+of a incoming packet. Using wildcarded flow can improve the flow set up rate
+by reduce the number of new flows need to be processed by the user space program.
+
+Support for the mask Netlink attribute is optional for both the kernel and user
+space program. The kernel can ignore the mask attribute, installing an exact
+match flow, or reduce the number of don't care bits in the kernel to less than
+what was specified by the user space program. In this case, variations in bits
+that the kernel does not implement will simply result in additional flow setups.
+The kernel module will also work with user space programs that neither support
+nor supply flow mask attributes.
+
+Since the kernel may ignore or modify wildcard bits, it can be difficult for
+the userspace program to know exactly what matches are installed. There are
+two possible approaches: reactively install flows as they miss the kernel
+flow table (and therefore not attempt to determine wildcard changes at all)
+or use the kernel's response messages to determine the installed wildcards.
+
+When interacting with userspace, the kernel should maintain the match portion
+of the key exactly as originally installed. This will provides a handle to
+identify the flow for all future operations. However, when reporting the
+mask of an installed flow, the mask should include any restrictions imposed
+by the kernel.
+
+The behavior when using overlapping wildcarded flows is undefined. It is the
+responsibility of the user space program to ensure that any incoming packet
+can match at most one flow, wildcarded or not. The current implementation
+performs best-effort detection of overlapping wildcarded flows and may reject
+some but not all of them. However, this behavior may change in future versions.
+
+
+Unique flow identifiers
+-----------------------
+
+An alternative to using the original match portion of a key as the handle for
+flow identification is a unique flow identifier, or "UFID". UFIDs are optional
+for both the kernel and user space program.
+
+User space programs that support UFID are expected to provide it during flow
+setup in addition to the flow, then refer to the flow using the UFID for all
+future operations. The kernel is not required to index flows by the original
+flow key if a UFID is specified.
+
+
+Basic rule for evolving flow keys
+---------------------------------
+
+Some care is needed to really maintain forward and backward
+compatibility for applications that follow the rules listed under
+"Flow key compatibility" above.
+
+The basic rule is obvious:
+
+    ------------------------------------------------------------------
+    New network protocol support must only supplement existing flow
+    key attributes.  It must not change the meaning of already defined
+    flow key attributes.
+    ------------------------------------------------------------------
+
+This rule does have less-obvious consequences so it is worth working
+through a few examples.  Suppose, for example, that the kernel module
+did not already implement VLAN parsing.  Instead, it just interpreted
+the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
+packet.  The flow key for any packet with an 802.1Q header would look
+essentially like this, ignoring metadata:
+
+    eth(...), eth_type(0x8100)
+
+Naively, to add VLAN support, it makes sense to add a new "vlan" flow
+key attribute to contain the VLAN tag, then continue to decode the
+encapsulated headers beyond the VLAN tag using the existing field
+definitions.  With this change, a TCP packet in VLAN 10 would have a
+flow key much like this:
+
+    eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
+
+But this change would negatively affect a userspace application that
+has not been updated to understand the new "vlan" flow key attribute.
+The application could, following the flow compatibility rules above,
+ignore the "vlan" attribute that it does not understand and therefore
+assume that the flow contained IP packets.  This is a bad assumption
+(the flow only contains IP packets if one parses and skips over the
+802.1Q header) and it could cause the application's behavior to change
+across kernel versions even though it follows the compatibility rules.
+
+The solution is to use a set of nested attributes.  This is, for
+example, why 802.1Q support uses nested attributes.  A TCP packet in
+VLAN 10 is actually expressed as:
+
+    eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
+    ip(proto=6, ...), tcp(...)))
+
+Notice how the "eth_type", "ip", and "tcp" flow key attributes are
+nested inside the "encap" attribute.  Thus, an application that does
+not understand the "vlan" key will not see either of those attributes
+and therefore will not misinterpret them.  (Also, the outer eth_type
+is still 0x8100, not changed to 0x0800.)
+
+Handling malformed packets
+--------------------------
+
+Don't drop packets in the kernel for malformed protocol headers, bad
+checksums, etc.  This would prevent userspace from implementing a
+simple Ethernet switch that forwards every packet.
+
+Instead, in such a case, include an attribute with "empty" content.
+It doesn't matter if the empty content could be valid protocol values,
+as long as those values are rarely seen in practice, because userspace
+can always forward all packets with those values to userspace and
+handle them individually.
+
+For example, consider a packet that contains an IP header that
+indicates protocol 6 for TCP, but which is truncated just after the IP
+header, so that the TCP header is missing.  The flow key for this
+packet would include a tcp attribute with all-zero src and dst, like
+this:
+
+    eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
+
+As another example, consider a packet with an Ethernet type of 0x8100,
+indicating that a VLAN TCI should follow, but which is truncated just
+after the Ethernet type.  The flow key for this packet would include
+an all-zero-bits vlan and an empty encap attribute, like this:
+
+    eth(...), eth_type(0x8100), vlan(0), encap()
+
+Unlike a TCP packet with source and destination ports 0, an
+all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
+VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
+attribute expressly to allow this situation to be distinguished.
+Thus, the flow key in this second example unambiguously indicates a
+missing or malformed VLAN TCI.
+
+Other rules
+-----------
+
+The other rules for flow keys are much less subtle:
+
+    - Duplicate attributes are not allowed at a given nesting level.
+
+    - Ordering of attributes is not significant.
+
+    - When the kernel sends a given flow key to userspace, it always
+      composes it the same way.  This allows userspace to hash and
+      compare entire flow keys that it may not be able to fully
+      interpret.
diff --git a/Documentation/networking/operstates.txt b/Documentation/networking/operstates.txt
new file mode 100644
index 000000000..355c6d8ef
--- /dev/null
+++ b/Documentation/networking/operstates.txt
@@ -0,0 +1,162 @@
+
+1. Introduction
+
+Linux distinguishes between administrative and operational state of an
+interface. Administrative state is the result of "ip link set dev
+<dev> up or down" and reflects whether the administrator wants to use
+the device for traffic.
+
+However, an interface is not usable just because the admin enabled it
+- ethernet requires to be plugged into the switch and, depending on
+a site's networking policy and configuration, an 802.1X authentication
+to be performed before user data can be transferred. Operational state
+shows the ability of an interface to transmit this user data.
+
+Thanks to 802.1X, userspace must be granted the possibility to
+influence operational state. To accommodate this, operational state is
+split into two parts: Two flags that can be set by the driver only, and
+a RFC2863 compatible state that is derived from these flags, a policy,
+and changeable from userspace under certain rules.
+
+
+2. Querying from userspace
+
+Both admin and operational state can be queried via the netlink
+operation RTM_GETLINK. It is also possible to subscribe to RTMGRP_LINK
+to be notified of updates. This is important for setting from userspace.
+
+These values contain interface state:
+
+ifinfomsg::if_flags & IFF_UP:
+ Interface is admin up
+ifinfomsg::if_flags & IFF_RUNNING:
+ Interface is in RFC2863 operational state UP or UNKNOWN. This is for
+ backward compatibility, routing daemons, dhcp clients can use this
+ flag to determine whether they should use the interface.
+ifinfomsg::if_flags & IFF_LOWER_UP:
+ Driver has signaled netif_carrier_on()
+ifinfomsg::if_flags & IFF_DORMANT:
+ Driver has signaled netif_dormant_on()
+
+TLV IFLA_OPERSTATE
+
+contains RFC2863 state of the interface in numeric representation:
+
+IF_OPER_UNKNOWN (0):
+ Interface is in unknown state, neither driver nor userspace has set
+ operational state. Interface must be considered for user data as
+ setting operational state has not been implemented in every driver.
+IF_OPER_NOTPRESENT (1):
+ Unused in current kernel (notpresent interfaces normally disappear),
+ just a numerical placeholder.
+IF_OPER_DOWN (2):
+ Interface is unable to transfer data on L1, f.e. ethernet is not
+ plugged or interface is ADMIN down.
+IF_OPER_LOWERLAYERDOWN (3):
+ Interfaces stacked on an interface that is IF_OPER_DOWN show this
+ state (f.e. VLAN).
+IF_OPER_TESTING (4):
+ Unused in current kernel.
+IF_OPER_DORMANT (5):
+ Interface is L1 up, but waiting for an external event, f.e. for a
+ protocol to establish. (802.1X)
+IF_OPER_UP (6):
+ Interface is operational up and can be used.
+
+This TLV can also be queried via sysfs.
+
+TLV IFLA_LINKMODE
+
+contains link policy. This is needed for userspace interaction
+described below.
+
+This TLV can also be queried via sysfs.
+
+
+3. Kernel driver API
+
+Kernel drivers have access to two flags that map to IFF_LOWER_UP and
+IFF_DORMANT. These flags can be set from everywhere, even from
+interrupts. It is guaranteed that only the driver has write access,
+however, if different layers of the driver manipulate the same flag,
+the driver has to provide the synchronisation needed.
+
+__LINK_STATE_NOCARRIER, maps to !IFF_LOWER_UP:
+
+The driver uses netif_carrier_on() to clear and netif_carrier_off() to
+set this flag. On netif_carrier_off(), the scheduler stops sending
+packets. The name 'carrier' and the inversion are historical, think of
+it as lower layer.
+
+Note that for certain kind of soft-devices, which are not managing any
+real hardware, it is possible to set this bit from userspace.  One
+should use TVL IFLA_CARRIER to do so.
+
+netif_carrier_ok() can be used to query that bit.
+
+__LINK_STATE_DORMANT, maps to IFF_DORMANT:
+
+Set by the driver to express that the device cannot yet be used
+because some driver controlled protocol establishment has to
+complete. Corresponding functions are netif_dormant_on() to set the
+flag, netif_dormant_off() to clear it and netif_dormant() to query.
+
+On device allocation, networking core sets the flags equivalent to
+netif_carrier_ok() and !netif_dormant().
+
+
+Whenever the driver CHANGES one of these flags, a workqueue event is
+scheduled to translate the flag combination to IFLA_OPERSTATE as
+follows:
+
+!netif_carrier_ok():
+ IF_OPER_LOWERLAYERDOWN if the interface is stacked, IF_OPER_DOWN
+ otherwise. Kernel can recognise stacked interfaces because their
+ ifindex != iflink.
+
+netif_carrier_ok() && netif_dormant():
+ IF_OPER_DORMANT
+
+netif_carrier_ok() && !netif_dormant():
+ IF_OPER_UP if userspace interaction is disabled. Otherwise
+ IF_OPER_DORMANT with the possibility for userspace to initiate the
+ IF_OPER_UP transition afterwards.
+
+
+4. Setting from userspace
+
+Applications have to use the netlink interface to influence the
+RFC2863 operational state of an interface. Setting IFLA_LINKMODE to 1
+via RTM_SETLINK instructs the kernel that an interface should go to
+IF_OPER_DORMANT instead of IF_OPER_UP when the combination
+netif_carrier_ok() && !netif_dormant() is set by the
+driver. Afterwards, the userspace application can set IFLA_OPERSTATE
+to IF_OPER_DORMANT or IF_OPER_UP as long as the driver does not set
+netif_carrier_off() or netif_dormant_on(). Changes made by userspace
+are multicasted on the netlink group RTMGRP_LINK.
+
+So basically a 802.1X supplicant interacts with the kernel like this:
+
+-subscribe to RTMGRP_LINK
+-set IFLA_LINKMODE to 1 via RTM_SETLINK
+-query RTM_GETLINK once to get initial state
+-if initial flags are not (IFF_LOWER_UP && !IFF_DORMANT), wait until
+ netlink multicast signals this state
+-do 802.1X, eventually abort if flags go down again
+-send RTM_SETLINK to set operstate to IF_OPER_UP if authentication
+ succeeds, IF_OPER_DORMANT otherwise
+-see how operstate and IFF_RUNNING is echoed via netlink multicast
+-set interface back to IF_OPER_DORMANT if 802.1X reauthentication
+ fails
+-restart if kernel changes IFF_LOWER_UP or IFF_DORMANT flag
+
+if supplicant goes down, bring back IFLA_LINKMODE to 0 and
+IFLA_OPERSTATE to a sane value.
+
+A routing daemon or dhcp client just needs to care for IFF_RUNNING or
+waiting for operstate to go IF_OPER_UP/IF_OPER_UNKNOWN before
+considering the interface / querying a DHCP address.
+
+
+For technical questions and/or comments please e-mail to Stefan Rompf
+(stefan at loplof.de).
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
new file mode 100644
index 000000000..999eb41da
--- /dev/null
+++ b/Documentation/networking/packet_mmap.txt
@@ -0,0 +1,1061 @@
+--------------------------------------------------------------------------------
++ ABSTRACT
+--------------------------------------------------------------------------------
+
+This file documents the mmap() facility available with the PACKET
+socket interface on 2.4/2.6/3.x kernels. This type of sockets is used for
+i) capture network traffic with utilities like tcpdump, ii) transmit network
+traffic, or any other that needs raw access to network interface.
+
+Howto can be found at:
+    https://sites.google.com/site/packetmmap/
+
+Please send your comments to
+    Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
+    Johann Baudy
+
+-------------------------------------------------------------------------------
++ Why use PACKET_MMAP
+--------------------------------------------------------------------------------
+
+In Linux 2.4/2.6/3.x if PACKET_MMAP is not enabled, the capture process is very
+inefficient. It uses very limited buffers and requires one system call to
+capture each packet, it requires two if you want to get packet's timestamp
+(like libpcap always does).
+
+In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size 
+configurable circular buffer mapped in user space that can be used to either
+send or receive packets. This way reading packets just needs to wait for them,
+most of the time there is no need to issue a single system call. Concerning
+transmission, multiple packets can be sent through one system call to get the
+highest bandwidth. By using a shared buffer between the kernel and the user
+also has the benefit of minimizing packet copies.
+
+It's fine to use PACKET_MMAP to improve the performance of the capture and
+transmission process, but it isn't everything. At least, if you are capturing
+at high speeds (this is relative to the cpu speed), you should check if the
+device driver of your network interface card supports some sort of interrupt
+load mitigation or (even better) if it supports NAPI, also make sure it is
+enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
+supported by devices of your network. CPU IRQ pinning of your network interface
+card can also be an advantage.
+
+--------------------------------------------------------------------------------
++ How to use mmap() to improve capture process
+--------------------------------------------------------------------------------
+
+From the user standpoint, you should use the higher level libpcap library, which
+is a de facto standard, portable across nearly all operating systems
+including Win32. 
+
+Packet MMAP support was integrated into libpcap around the time of version 1.3.0;
+TPACKET_V3 support was added in version 1.5.0
+
+--------------------------------------------------------------------------------
++ How to use mmap() directly to improve capture process
+--------------------------------------------------------------------------------
+
+From the system calls stand point, the use of PACKET_MMAP involves
+the following process:
+
+
+[setup]     socket() -------> creation of the capture socket
+            setsockopt() ---> allocation of the circular buffer (ring)
+                              option: PACKET_RX_RING
+            mmap() ---------> mapping of the allocated buffer to the
+                              user process
+
+[capture]   poll() ---------> to wait for incoming packets
+
+[shutdown]  close() --------> destruction of the capture socket and
+                              deallocation of all associated 
+                              resources.
+
+
+socket creation and destruction is straight forward, and is done 
+the same way with or without PACKET_MMAP:
+
+ int fd = socket(PF_PACKET, mode, htons(ETH_P_ALL));
+
+where mode is SOCK_RAW for the raw interface were link level
+information can be captured or SOCK_DGRAM for the cooked
+interface where link level information capture is not 
+supported and a link level pseudo-header is provided 
+by the kernel.
+
+The destruction of the socket and all associated resources
+is done by a simple call to close(fd).
+
+Similarly as without PACKET_MMAP, it is possible to use one socket
+for capture and transmission. This can be done by mapping the
+allocated RX and TX buffer ring with a single mmap() call.
+See "Mapping and use of the circular buffer (ring)".
+
+Next I will describe PACKET_MMAP settings and its constraints,
+also the mapping of the circular buffer in the user process and 
+the use of this buffer.
+
+--------------------------------------------------------------------------------
++ How to use mmap() directly to improve transmission process
+--------------------------------------------------------------------------------
+Transmission process is similar to capture as shown below.
+
+[setup]          socket() -------> creation of the transmission socket
+                 setsockopt() ---> allocation of the circular buffer (ring)
+                                   option: PACKET_TX_RING
+                 bind() ---------> bind transmission socket with a network interface
+                 mmap() ---------> mapping of the allocated buffer to the
+                                   user process
+
+[transmission]   poll() ---------> wait for free packets (optional)
+                 send() ---------> send all packets that are set as ready in
+                                   the ring
+                                   The flag MSG_DONTWAIT can be used to return
+                                   before end of transfer.
+
+[shutdown]  close() --------> destruction of the transmission socket and
+                              deallocation of all associated resources.
+
+Socket creation and destruction is also straight forward, and is done
+the same way as in capturing described in the previous paragraph:
+
+ int fd = socket(PF_PACKET, mode, 0);
+
+The protocol can optionally be 0 in case we only want to transmit
+via this socket, which avoids an expensive call to packet_rcv().
+In this case, you also need to bind(2) the TX_RING with sll_protocol = 0
+set. Otherwise, htons(ETH_P_ALL) or any other protocol, for example.
+
+Binding the socket to your network interface is mandatory (with zero copy) to
+know the header size of frames used in the circular buffer.
+
+As capture, each frame contains two parts:
+
+ --------------------
+| struct tpacket_hdr | Header. It contains the status of
+|                    | of this frame
+|--------------------|
+| data buffer        |
+.                    .  Data that will be sent over the network interface.
+.                    .
+ --------------------
+
+ bind() associates the socket to your network interface thanks to
+ sll_ifindex parameter of struct sockaddr_ll.
+
+ Initialization example:
+
+ struct sockaddr_ll my_addr;
+ struct ifreq s_ifr;
+ ...
+
+ strncpy (s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name));
+
+ /* get interface index of eth0 */
+ ioctl(this->socket, SIOCGIFINDEX, &s_ifr);
+
+ /* fill sockaddr_ll struct to prepare binding */
+ my_addr.sll_family = AF_PACKET;
+ my_addr.sll_protocol = htons(ETH_P_ALL);
+ my_addr.sll_ifindex =  s_ifr.ifr_ifindex;
+
+ /* bind socket to eth0 */
+ bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));
+
+ A complete tutorial is available at: https://sites.google.com/site/packetmmap/
+
+By default, the user should put data at :
+ frame base + TPACKET_HDRLEN - sizeof(struct sockaddr_ll)
+
+So, whatever you choose for the socket mode (SOCK_DGRAM or SOCK_RAW),
+the beginning of the user data will be at :
+ frame base + TPACKET_ALIGN(sizeof(struct tpacket_hdr))
+
+If you wish to put user data at a custom offset from the beginning of
+the frame (for payload alignment with SOCK_RAW mode for instance) you
+can set tp_net (with SOCK_DGRAM) or tp_mac (with SOCK_RAW). In order
+to make this work it must be enabled previously with setsockopt()
+and the PACKET_TX_HAS_OFF option.
+
+--------------------------------------------------------------------------------
++ PACKET_MMAP settings
+--------------------------------------------------------------------------------
+
+To setup PACKET_MMAP from user level code is done with a call like
+
+ - Capture process
+     setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req))
+ - Transmission process
+     setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req))
+
+The most significant argument in the previous call is the req parameter, 
+this parameter must to have the following structure:
+
+    struct tpacket_req
+    {
+        unsigned int    tp_block_size;  /* Minimal size of contiguous block */
+        unsigned int    tp_block_nr;    /* Number of blocks */
+        unsigned int    tp_frame_size;  /* Size of frame */
+        unsigned int    tp_frame_nr;    /* Total number of frames */
+    };
+
+This structure is defined in /usr/include/linux/if_packet.h and establishes a 
+circular buffer (ring) of unswappable memory.
+Being mapped in the capture process allows reading the captured frames and 
+related meta-information like timestamps without requiring a system call.
+
+Frames are grouped in blocks. Each block is a physically contiguous
+region of memory and holds tp_block_size/tp_frame_size frames. The total number 
+of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because
+
+    frames_per_block = tp_block_size/tp_frame_size
+
+indeed, packet_set_ring checks that the following condition is true
+
+    frames_per_block * tp_block_nr == tp_frame_nr
+
+Lets see an example, with the following values:
+
+     tp_block_size= 4096
+     tp_frame_size= 2048
+     tp_block_nr  = 4
+     tp_frame_nr  = 8
+
+we will get the following buffer structure:
+
+        block #1                 block #2         
++---------+---------+    +---------+---------+    
+| frame 1 | frame 2 |    | frame 3 | frame 4 |    
++---------+---------+    +---------+---------+    
+
+        block #3                 block #4
++---------+---------+    +---------+---------+
+| frame 5 | frame 6 |    | frame 7 | frame 8 |
++---------+---------+    +---------+---------+
+
+A frame can be of any size with the only condition it can fit in a block. A block
+can only hold an integer number of frames, or in other words, a frame cannot 
+be spawned across two blocks, so there are some details you have to take into 
+account when choosing the frame_size. See "Mapping and use of the circular 
+buffer (ring)".
+
+--------------------------------------------------------------------------------
++ PACKET_MMAP setting constraints
+--------------------------------------------------------------------------------
+
+In kernel versions prior to 2.4.26 (for the 2.4 branch) and 2.6.5 (2.6 branch),
+the PACKET_MMAP buffer could hold only 32768 frames in a 32 bit architecture or
+16384 in a 64 bit architecture. For information on these kernel versions
+see http://pusa.uv.es/~ulisses/packet_mmap/packet_mmap.pre-2.4.26_2.6.5.txt
+
+ Block size limit
+------------------
+
+As stated earlier, each block is a contiguous physical region of memory. These 
+memory regions are allocated with calls to the __get_free_pages() function. As 
+the name indicates, this function allocates pages of memory, and the second
+argument is "order" or a power of two number of pages, that is 
+(for PAGE_SIZE == 4096) order=0 ==> 4096 bytes, order=1 ==> 8192 bytes, 
+order=2 ==> 16384 bytes, etc. The maximum size of a 
+region allocated by __get_free_pages is determined by the MAX_ORDER macro. More 
+precisely the limit can be calculated as:
+
+   PAGE_SIZE << MAX_ORDER
+
+   In a i386 architecture PAGE_SIZE is 4096 bytes 
+   In a 2.4/i386 kernel MAX_ORDER is 10
+   In a 2.6/i386 kernel MAX_ORDER is 11
+
+So get_free_pages can allocate as much as 4MB or 8MB in a 2.4/2.6 kernel 
+respectively, with an i386 architecture.
+
+User space programs can include /usr/include/sys/user.h and 
+/usr/include/linux/mmzone.h to get PAGE_SIZE MAX_ORDER declarations.
+
+The pagesize can also be determined dynamically with the getpagesize (2) 
+system call. 
+
+ Block number limit
+--------------------
+
+To understand the constraints of PACKET_MMAP, we have to see the structure 
+used to hold the pointers to each block.
+
+Currently, this structure is a dynamically allocated vector with kmalloc 
+called pg_vec, its size limits the number of blocks that can be allocated.
+
+    +---+---+---+---+
+    | x | x | x | x |
+    +---+---+---+---+
+      |   |   |   |
+      |   |   |   v
+      |   |   v  block #4
+      |   v  block #3
+      v  block #2
+     block #1
+
+kmalloc allocates any number of bytes of physically contiguous memory from 
+a pool of pre-determined sizes. This pool of memory is maintained by the slab 
+allocator which is at the end the responsible for doing the allocation and 
+hence which imposes the maximum memory that kmalloc can allocate. 
+
+In a 2.4/2.6 kernel and the i386 architecture, the limit is 131072 bytes. The 
+predetermined sizes that kmalloc uses can be checked in the "size-<bytes>" 
+entries of /proc/slabinfo
+
+In a 32 bit architecture, pointers are 4 bytes long, so the total number of 
+pointers to blocks is
+
+     131072/4 = 32768 blocks
+
+ PACKET_MMAP buffer size calculator
+------------------------------------
+
+Definitions:
+
+<size-max>    : is the maximum size of allocable with kmalloc (see /proc/slabinfo)
+<pointer size>: depends on the architecture -- sizeof(void *)
+<page size>   : depends on the architecture -- PAGE_SIZE or getpagesize (2)
+<max-order>   : is the value defined with MAX_ORDER
+<frame size>  : it's an upper bound of frame's capture size (more on this later)
+
+from these definitions we will derive 
+
+	<block number> = <size-max>/<pointer size>
+	<block size> = <pagesize> << <max-order>
+
+so, the max buffer size is
+
+	<block number> * <block size>
+
+and, the number of frames be
+
+	<block number> * <block size> / <frame size>
+
+Suppose the following parameters, which apply for 2.6 kernel and an
+i386 architecture:
+
+	<size-max> = 131072 bytes
+	<pointer size> = 4 bytes
+	<pagesize> = 4096 bytes
+	<max-order> = 11
+
+and a value for <frame size> of 2048 bytes. These parameters will yield
+
+	<block number> = 131072/4 = 32768 blocks
+	<block size> = 4096 << 11 = 8 MiB.
+
+and hence the buffer will have a 262144 MiB size. So it can hold 
+262144 MiB / 2048 bytes = 134217728 frames
+
+Actually, this buffer size is not possible with an i386 architecture. 
+Remember that the memory is allocated in kernel space, in the case of 
+an i386 kernel's memory size is limited to 1GiB.
+
+All memory allocations are not freed until the socket is closed. The memory 
+allocations are done with GFP_KERNEL priority, this basically means that 
+the allocation can wait and swap other process' memory in order to allocate 
+the necessary memory, so normally limits can be reached.
+
+ Other constraints
+-------------------
+
+If you check the source code you will see that what I draw here as a frame
+is not only the link level frame. At the beginning of each frame there is a 
+header called struct tpacket_hdr used in PACKET_MMAP to hold link level's frame
+meta information like timestamp. So what we draw here a frame it's really 
+the following (from include/linux/if_packet.h):
+
+/*
+   Frame structure:
+
+   - Start. Frame must be aligned to TPACKET_ALIGNMENT=16
+   - struct tpacket_hdr
+   - pad to TPACKET_ALIGNMENT=16
+   - struct sockaddr_ll
+   - Gap, chosen so that packet data (Start+tp_net) aligns to 
+     TPACKET_ALIGNMENT=16
+   - Start+tp_mac: [ Optional MAC header ]
+   - Start+tp_net: Packet data, aligned to TPACKET_ALIGNMENT=16.
+   - Pad to align to TPACKET_ALIGNMENT=16
+ */
+ 
+ The following are conditions that are checked in packet_set_ring
+
+   tp_block_size must be a multiple of PAGE_SIZE (1)
+   tp_frame_size must be greater than TPACKET_HDRLEN (obvious)
+   tp_frame_size must be a multiple of TPACKET_ALIGNMENT
+   tp_frame_nr   must be exactly frames_per_block*tp_block_nr
+
+Note that tp_block_size should be chosen to be a power of two or there will
+be a waste of memory.
+
+--------------------------------------------------------------------------------
++ Mapping and use of the circular buffer (ring)
+--------------------------------------------------------------------------------
+
+The mapping of the buffer in the user process is done with the conventional 
+mmap function. Even the circular buffer is compound of several physically
+discontiguous blocks of memory, they are contiguous to the user space, hence
+just one call to mmap is needed:
+
+    mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+
+If tp_frame_size is a divisor of tp_block_size frames will be 
+contiguously spaced by tp_frame_size bytes. If not, each
+tp_block_size/tp_frame_size frames there will be a gap between 
+the frames. This is because a frame cannot be spawn across two
+blocks. 
+
+To use one socket for capture and transmission, the mapping of both the
+RX and TX buffer ring has to be done with one call to mmap:
+
+    ...
+    setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &foo, sizeof(foo));
+    setsockopt(fd, SOL_PACKET, PACKET_TX_RING, &bar, sizeof(bar));
+    ...
+    rx_ring = mmap(0, size * 2, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+    tx_ring = rx_ring + size;
+
+RX must be the first as the kernel maps the TX ring memory right
+after the RX one.
+
+At the beginning of each frame there is an status field (see 
+struct tpacket_hdr). If this field is 0 means that the frame is ready
+to be used for the kernel, If not, there is a frame the user can read 
+and the following flags apply:
+
++++ Capture process:
+     from include/linux/if_packet.h
+
+     #define TP_STATUS_COPY          (1 << 1)
+     #define TP_STATUS_LOSING        (1 << 2)
+     #define TP_STATUS_CSUMNOTREADY  (1 << 3)
+     #define TP_STATUS_CSUM_VALID    (1 << 7)
+
+TP_STATUS_COPY        : This flag indicates that the frame (and associated
+                        meta information) has been truncated because it's 
+                        larger than tp_frame_size. This packet can be 
+                        read entirely with recvfrom().
+                        
+                        In order to make this work it must to be
+                        enabled previously with setsockopt() and 
+                        the PACKET_COPY_THRESH option. 
+
+                        The number of frames that can be buffered to
+                        be read with recvfrom is limited like a normal socket.
+                        See the SO_RCVBUF option in the socket (7) man page.
+
+TP_STATUS_LOSING      : indicates there were packet drops from last time 
+                        statistics where checked with getsockopt() and
+                        the PACKET_STATISTICS option.
+
+TP_STATUS_CSUMNOTREADY: currently it's used for outgoing IP packets which 
+                        its checksum will be done in hardware. So while
+                        reading the packet we should not try to check the 
+                        checksum. 
+
+TP_STATUS_CSUM_VALID  : This flag indicates that at least the transport
+                        header checksum of the packet has been already
+                        validated on the kernel side. If the flag is not set
+                        then we are free to check the checksum by ourselves
+                        provided that TP_STATUS_CSUMNOTREADY is also not set.
+
+for convenience there are also the following defines:
+
+     #define TP_STATUS_KERNEL        0
+     #define TP_STATUS_USER          1
+
+The kernel initializes all frames to TP_STATUS_KERNEL, when the kernel
+receives a packet it puts in the buffer and updates the status with
+at least the TP_STATUS_USER flag. Then the user can read the packet,
+once the packet is read the user must zero the status field, so the kernel 
+can use again that frame buffer.
+
+The user can use poll (any other variant should apply too) to check if new
+packets are in the ring:
+
+    struct pollfd pfd;
+
+    pfd.fd = fd;
+    pfd.revents = 0;
+    pfd.events = POLLIN|POLLRDNORM|POLLERR;
+
+    if (status == TP_STATUS_KERNEL)
+        retval = poll(&pfd, 1, timeout);
+
+It doesn't incur in a race condition to first check the status value and 
+then poll for frames.
+
+++ Transmission process
+Those defines are also used for transmission:
+
+     #define TP_STATUS_AVAILABLE        0 // Frame is available
+     #define TP_STATUS_SEND_REQUEST     1 // Frame will be sent on next send()
+     #define TP_STATUS_SENDING          2 // Frame is currently in transmission
+     #define TP_STATUS_WRONG_FORMAT     4 // Frame format is not correct
+
+First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a
+packet, the user fills a data buffer of an available frame, sets tp_len to
+current data buffer size and sets its status field to TP_STATUS_SEND_REQUEST.
+This can be done on multiple frames. Once the user is ready to transmit, it
+calls send(). Then all buffers with status equal to TP_STATUS_SEND_REQUEST are
+forwarded to the network device. The kernel updates each status of sent
+frames with TP_STATUS_SENDING until the end of transfer.
+At the end of each transfer, buffer status returns to TP_STATUS_AVAILABLE.
+
+    header->tp_len = in_i_size;
+    header->tp_status = TP_STATUS_SEND_REQUEST;
+    retval = send(this->socket, NULL, 0, 0);
+
+The user can also use poll() to check if a buffer is available:
+(status == TP_STATUS_SENDING)
+
+    struct pollfd pfd;
+    pfd.fd = fd;
+    pfd.revents = 0;
+    pfd.events = POLLOUT;
+    retval = poll(&pfd, 1, timeout);
+
+-------------------------------------------------------------------------------
++ What TPACKET versions are available and when to use them?
+-------------------------------------------------------------------------------
+
+ int val = tpacket_version;
+ setsockopt(fd, SOL_PACKET, PACKET_VERSION, &val, sizeof(val));
+ getsockopt(fd, SOL_PACKET, PACKET_VERSION, &val, sizeof(val));
+
+where 'tpacket_version' can be TPACKET_V1 (default), TPACKET_V2, TPACKET_V3.
+
+TPACKET_V1:
+	- Default if not otherwise specified by setsockopt(2)
+	- RX_RING, TX_RING available
+
+TPACKET_V1 --> TPACKET_V2:
+	- Made 64 bit clean due to unsigned long usage in TPACKET_V1
+	  structures, thus this also works on 64 bit kernel with 32 bit
+	  userspace and the like
+	- Timestamp resolution in nanoseconds instead of microseconds
+	- RX_RING, TX_RING available
+	- VLAN metadata information available for packets
+	  (TP_STATUS_VLAN_VALID, TP_STATUS_VLAN_TPID_VALID),
+	  in the tpacket2_hdr structure:
+		- TP_STATUS_VLAN_VALID bit being set into the tp_status field indicates
+		  that the tp_vlan_tci field has valid VLAN TCI value
+		- TP_STATUS_VLAN_TPID_VALID bit being set into the tp_status field
+		  indicates that the tp_vlan_tpid field has valid VLAN TPID value
+	- How to switch to TPACKET_V2:
+		1. Replace struct tpacket_hdr by struct tpacket2_hdr
+		2. Query header len and save
+		3. Set protocol version to 2, set up ring as usual
+		4. For getting the sockaddr_ll,
+		   use (void *)hdr + TPACKET_ALIGN(hdrlen) instead of
+		   (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))
+
+TPACKET_V2 --> TPACKET_V3:
+	- Flexible buffer implementation for RX_RING:
+		1. Blocks can be configured with non-static frame-size
+		2. Read/poll is at a block-level (as opposed to packet-level)
+		3. Added poll timeout to avoid indefinite user-space wait
+		   on idle links
+		4. Added user-configurable knobs:
+			4.1 block::timeout
+			4.2 tpkt_hdr::sk_rxhash
+	- RX Hash data available in user space
+	- TX_RING semantics are conceptually similar to TPACKET_V2;
+	  use tpacket3_hdr instead of tpacket2_hdr, and TPACKET3_HDRLEN
+	  instead of TPACKET2_HDRLEN. In the current implementation,
+	  the tp_next_offset field in the tpacket3_hdr MUST be set to
+	  zero, indicating that the ring does not hold variable sized frames.
+	  Packets with non-zero values of tp_next_offset will be dropped.
+
+-------------------------------------------------------------------------------
++ AF_PACKET fanout mode
+-------------------------------------------------------------------------------
+
+In the AF_PACKET fanout mode, packet reception can be load balanced among
+processes. This also works in combination with mmap(2) on packet sockets.
+
+Currently implemented fanout policies are:
+
+  - PACKET_FANOUT_HASH: schedule to socket by skb's packet hash
+  - PACKET_FANOUT_LB: schedule to socket by round-robin
+  - PACKET_FANOUT_CPU: schedule to socket by CPU packet arrives on
+  - PACKET_FANOUT_RND: schedule to socket by random selection
+  - PACKET_FANOUT_ROLLOVER: if one socket is full, rollover to another
+  - PACKET_FANOUT_QM: schedule to socket by skbs recorded queue_mapping
+
+Minimal example code by David S. Miller (try things like "./test eth0 hash",
+"./test eth0 lb", etc.):
+
+#include <stddef.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <sys/socket.h>
+#include <sys/ioctl.h>
+
+#include <unistd.h>
+
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+
+#include <net/if.h>
+
+static const char *device_name;
+static int fanout_type;
+static int fanout_id;
+
+#ifndef PACKET_FANOUT
+# define PACKET_FANOUT			18
+# define PACKET_FANOUT_HASH		0
+# define PACKET_FANOUT_LB		1
+#endif
+
+static int setup_socket(void)
+{
+	int err, fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP));
+	struct sockaddr_ll ll;
+	struct ifreq ifr;
+	int fanout_arg;
+
+	if (fd < 0) {
+		perror("socket");
+		return EXIT_FAILURE;
+	}
+
+	memset(&ifr, 0, sizeof(ifr));
+	strcpy(ifr.ifr_name, device_name);
+	err = ioctl(fd, SIOCGIFINDEX, &ifr);
+	if (err < 0) {
+		perror("SIOCGIFINDEX");
+		return EXIT_FAILURE;
+	}
+
+	memset(&ll, 0, sizeof(ll));
+	ll.sll_family = AF_PACKET;
+	ll.sll_ifindex = ifr.ifr_ifindex;
+	err = bind(fd, (struct sockaddr *) &ll, sizeof(ll));
+	if (err < 0) {
+		perror("bind");
+		return EXIT_FAILURE;
+	}
+
+	fanout_arg = (fanout_id | (fanout_type << 16));
+	err = setsockopt(fd, SOL_PACKET, PACKET_FANOUT,
+			 &fanout_arg, sizeof(fanout_arg));
+	if (err) {
+		perror("setsockopt");
+		return EXIT_FAILURE;
+	}
+
+	return fd;
+}
+
+static void fanout_thread(void)
+{
+	int fd = setup_socket();
+	int limit = 10000;
+
+	if (fd < 0)
+		exit(fd);
+
+	while (limit-- > 0) {
+		char buf[1600];
+		int err;
+
+		err = read(fd, buf, sizeof(buf));
+		if (err < 0) {
+			perror("read");
+			exit(EXIT_FAILURE);
+		}
+		if ((limit % 10) == 0)
+			fprintf(stdout, "(%d) \n", getpid());
+	}
+
+	fprintf(stdout, "%d: Received 10000 packets\n", getpid());
+
+	close(fd);
+	exit(0);
+}
+
+int main(int argc, char **argp)
+{
+	int fd, err;
+	int i;
+
+	if (argc != 3) {
+		fprintf(stderr, "Usage: %s INTERFACE {hash|lb}\n", argp[0]);
+		return EXIT_FAILURE;
+	}
+
+	if (!strcmp(argp[2], "hash"))
+		fanout_type = PACKET_FANOUT_HASH;
+	else if (!strcmp(argp[2], "lb"))
+		fanout_type = PACKET_FANOUT_LB;
+	else {
+		fprintf(stderr, "Unknown fanout type [%s]\n", argp[2]);
+		exit(EXIT_FAILURE);
+	}
+
+	device_name = argp[1];
+	fanout_id = getpid() & 0xffff;
+
+	for (i = 0; i < 4; i++) {
+		pid_t pid = fork();
+
+		switch (pid) {
+		case 0:
+			fanout_thread();
+
+		case -1:
+			perror("fork");
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	for (i = 0; i < 4; i++) {
+		int status;
+
+		wait(&status);
+	}
+
+	return 0;
+}
+
+-------------------------------------------------------------------------------
++ AF_PACKET TPACKET_V3 example
+-------------------------------------------------------------------------------
+
+AF_PACKET's TPACKET_V3 ring buffer can be configured to use non-static frame
+sizes by doing it's own memory management. It is based on blocks where polling
+works on a per block basis instead of per ring as in TPACKET_V2 and predecessor.
+
+It is said that TPACKET_V3 brings the following benefits:
+ *) ~15 - 20% reduction in CPU-usage
+ *) ~20% increase in packet capture rate
+ *) ~2x increase in packet density
+ *) Port aggregation analysis
+ *) Non static frame size to capture entire packet payload
+
+So it seems to be a good candidate to be used with packet fanout.
+
+Minimal example code by Daniel Borkmann based on Chetan Loke's lolpcap (compile
+it with gcc -Wall -O2 blob.c, and try things like "./a.out eth0", etc.):
+
+/* Written from scratch, but kernel-to-user space API usage
+ * dissected from lolpcap:
+ *  Copyright 2011, Chetan Loke <loke.chetan@gmail.com>
+ *  License: GPL, version 2.0
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <assert.h>
+#include <net/if.h>
+#include <arpa/inet.h>
+#include <netdb.h>
+#include <poll.h>
+#include <unistd.h>
+#include <signal.h>
+#include <inttypes.h>
+#include <sys/socket.h>
+#include <sys/mman.h>
+#include <linux/if_packet.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+
+#ifndef likely
+# define likely(x)		__builtin_expect(!!(x), 1)
+#endif
+#ifndef unlikely
+# define unlikely(x)		__builtin_expect(!!(x), 0)
+#endif
+
+struct block_desc {
+	uint32_t version;
+	uint32_t offset_to_priv;
+	struct tpacket_hdr_v1 h1;
+};
+
+struct ring {
+	struct iovec *rd;
+	uint8_t *map;
+	struct tpacket_req3 req;
+};
+
+static unsigned long packets_total = 0, bytes_total = 0;
+static sig_atomic_t sigint = 0;
+
+static void sighandler(int num)
+{
+	sigint = 1;
+}
+
+static int setup_socket(struct ring *ring, char *netdev)
+{
+	int err, i, fd, v = TPACKET_V3;
+	struct sockaddr_ll ll;
+	unsigned int blocksiz = 1 << 22, framesiz = 1 << 11;
+	unsigned int blocknum = 64;
+
+	fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
+	if (fd < 0) {
+		perror("socket");
+		exit(1);
+	}
+
+	err = setsockopt(fd, SOL_PACKET, PACKET_VERSION, &v, sizeof(v));
+	if (err < 0) {
+		perror("setsockopt");
+		exit(1);
+	}
+
+	memset(&ring->req, 0, sizeof(ring->req));
+	ring->req.tp_block_size = blocksiz;
+	ring->req.tp_frame_size = framesiz;
+	ring->req.tp_block_nr = blocknum;
+	ring->req.tp_frame_nr = (blocksiz * blocknum) / framesiz;
+	ring->req.tp_retire_blk_tov = 60;
+	ring->req.tp_feature_req_word = TP_FT_REQ_FILL_RXHASH;
+
+	err = setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &ring->req,
+			 sizeof(ring->req));
+	if (err < 0) {
+		perror("setsockopt");
+		exit(1);
+	}
+
+	ring->map = mmap(NULL, ring->req.tp_block_size * ring->req.tp_block_nr,
+			 PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED, fd, 0);
+	if (ring->map == MAP_FAILED) {
+		perror("mmap");
+		exit(1);
+	}
+
+	ring->rd = malloc(ring->req.tp_block_nr * sizeof(*ring->rd));
+	assert(ring->rd);
+	for (i = 0; i < ring->req.tp_block_nr; ++i) {
+		ring->rd[i].iov_base = ring->map + (i * ring->req.tp_block_size);
+		ring->rd[i].iov_len = ring->req.tp_block_size;
+	}
+
+	memset(&ll, 0, sizeof(ll));
+	ll.sll_family = PF_PACKET;
+	ll.sll_protocol = htons(ETH_P_ALL);
+	ll.sll_ifindex = if_nametoindex(netdev);
+	ll.sll_hatype = 0;
+	ll.sll_pkttype = 0;
+	ll.sll_halen = 0;
+
+	err = bind(fd, (struct sockaddr *) &ll, sizeof(ll));
+	if (err < 0) {
+		perror("bind");
+		exit(1);
+	}
+
+	return fd;
+}
+
+static void display(struct tpacket3_hdr *ppd)
+{
+	struct ethhdr *eth = (struct ethhdr *) ((uint8_t *) ppd + ppd->tp_mac);
+	struct iphdr *ip = (struct iphdr *) ((uint8_t *) eth + ETH_HLEN);
+
+	if (eth->h_proto == htons(ETH_P_IP)) {
+		struct sockaddr_in ss, sd;
+		char sbuff[NI_MAXHOST], dbuff[NI_MAXHOST];
+
+		memset(&ss, 0, sizeof(ss));
+		ss.sin_family = PF_INET;
+		ss.sin_addr.s_addr = ip->saddr;
+		getnameinfo((struct sockaddr *) &ss, sizeof(ss),
+			    sbuff, sizeof(sbuff), NULL, 0, NI_NUMERICHOST);
+
+		memset(&sd, 0, sizeof(sd));
+		sd.sin_family = PF_INET;
+		sd.sin_addr.s_addr = ip->daddr;
+		getnameinfo((struct sockaddr *) &sd, sizeof(sd),
+			    dbuff, sizeof(dbuff), NULL, 0, NI_NUMERICHOST);
+
+		printf("%s -> %s, ", sbuff, dbuff);
+	}
+
+	printf("rxhash: 0x%x\n", ppd->hv1.tp_rxhash);
+}
+
+static void walk_block(struct block_desc *pbd, const int block_num)
+{
+	int num_pkts = pbd->h1.num_pkts, i;
+	unsigned long bytes = 0;
+	struct tpacket3_hdr *ppd;
+
+	ppd = (struct tpacket3_hdr *) ((uint8_t *) pbd +
+				       pbd->h1.offset_to_first_pkt);
+	for (i = 0; i < num_pkts; ++i) {
+		bytes += ppd->tp_snaplen;
+		display(ppd);
+
+		ppd = (struct tpacket3_hdr *) ((uint8_t *) ppd +
+					       ppd->tp_next_offset);
+	}
+
+	packets_total += num_pkts;
+	bytes_total += bytes;
+}
+
+static void flush_block(struct block_desc *pbd)
+{
+	pbd->h1.block_status = TP_STATUS_KERNEL;
+}
+
+static void teardown_socket(struct ring *ring, int fd)
+{
+	munmap(ring->map, ring->req.tp_block_size * ring->req.tp_block_nr);
+	free(ring->rd);
+	close(fd);
+}
+
+int main(int argc, char **argp)
+{
+	int fd, err;
+	socklen_t len;
+	struct ring ring;
+	struct pollfd pfd;
+	unsigned int block_num = 0, blocks = 64;
+	struct block_desc *pbd;
+	struct tpacket_stats_v3 stats;
+
+	if (argc != 2) {
+		fprintf(stderr, "Usage: %s INTERFACE\n", argp[0]);
+		return EXIT_FAILURE;
+	}
+
+	signal(SIGINT, sighandler);
+
+	memset(&ring, 0, sizeof(ring));
+	fd = setup_socket(&ring, argp[argc - 1]);
+	assert(fd > 0);
+
+	memset(&pfd, 0, sizeof(pfd));
+	pfd.fd = fd;
+	pfd.events = POLLIN | POLLERR;
+	pfd.revents = 0;
+
+	while (likely(!sigint)) {
+		pbd = (struct block_desc *) ring.rd[block_num].iov_base;
+
+		if ((pbd->h1.block_status & TP_STATUS_USER) == 0) {
+			poll(&pfd, 1, -1);
+			continue;
+		}
+
+		walk_block(pbd, block_num);
+		flush_block(pbd);
+		block_num = (block_num + 1) % blocks;
+	}
+
+	len = sizeof(stats);
+	err = getsockopt(fd, SOL_PACKET, PACKET_STATISTICS, &stats, &len);
+	if (err < 0) {
+		perror("getsockopt");
+		exit(1);
+	}
+
+	fflush(stdout);
+	printf("\nReceived %u packets, %lu bytes, %u dropped, freeze_q_cnt: %u\n",
+	       stats.tp_packets, bytes_total, stats.tp_drops,
+	       stats.tp_freeze_q_cnt);
+
+	teardown_socket(&ring, fd);
+	return 0;
+}
+
+-------------------------------------------------------------------------------
++ PACKET_QDISC_BYPASS
+-------------------------------------------------------------------------------
+
+If there is a requirement to load the network with many packets in a similar
+fashion as pktgen does, you might set the following option after socket
+creation:
+
+    int one = 1;
+    setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS, &one, sizeof(one));
+
+This has the side-effect, that packets sent through PF_PACKET will bypass the
+kernel's qdisc layer and are forcedly pushed to the driver directly. Meaning,
+packet are not buffered, tc disciplines are ignored, increased loss can occur
+and such packets are also not visible to other PF_PACKET sockets anymore. So,
+you have been warned; generally, this can be useful for stress testing various
+components of a system.
+
+On default, PACKET_QDISC_BYPASS is disabled and needs to be explicitly enabled
+on PF_PACKET sockets.
+
+-------------------------------------------------------------------------------
++ PACKET_TIMESTAMP
+-------------------------------------------------------------------------------
+
+The PACKET_TIMESTAMP setting determines the source of the timestamp in
+the packet meta information for mmap(2)ed RX_RING and TX_RINGs.  If your
+NIC is capable of timestamping packets in hardware, you can request those
+hardware timestamps to be used. Note: you may need to enable the generation
+of hardware timestamps with SIOCSHWTSTAMP (see related information from
+Documentation/networking/timestamping.txt).
+
+PACKET_TIMESTAMP accepts the same integer bit field as SO_TIMESTAMPING:
+
+    int req = SOF_TIMESTAMPING_RAW_HARDWARE;
+    setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, (void *) &req, sizeof(req))
+
+For the mmap(2)ed ring buffers, such timestamps are stored in the
+tpacket{,2,3}_hdr structure's tp_sec and tp_{n,u}sec members. To determine
+what kind of timestamp has been reported, the tp_status field is binary |'ed
+with the following possible bits ...
+
+    TP_STATUS_TS_RAW_HARDWARE
+    TP_STATUS_TS_SOFTWARE
+
+... that are equivalent to its SOF_TIMESTAMPING_* counterparts. For the
+RX_RING, if neither is set (i.e. PACKET_TIMESTAMP is not set), then a
+software fallback was invoked *within* PF_PACKET's processing code (less
+precise).
+
+Getting timestamps for the TX_RING works as follows: i) fill the ring frames,
+ii) call sendto() e.g. in blocking mode, iii) wait for status of relevant
+frames to be updated resp. the frame handed over to the application, iv) walk
+through the frames to pick up the individual hw/sw timestamps.
+
+Only (!) if transmit timestamping is enabled, then these bits are combined
+with binary | with TP_STATUS_AVAILABLE, so you must check for that in your
+application (e.g. !(tp_status & (TP_STATUS_SEND_REQUEST | TP_STATUS_SENDING))
+in a first step to see if the frame belongs to the application, and then
+one can extract the type of timestamp in a second step from tp_status)!
+
+If you don't care about them, thus having it disabled, checking for
+TP_STATUS_AVAILABLE resp. TP_STATUS_WRONG_FORMAT is sufficient. If in the
+TX_RING part only TP_STATUS_AVAILABLE is set, then the tp_sec and tp_{n,u}sec
+members do not contain a valid value. For TX_RINGs, by default no timestamp
+is generated!
+
+See include/linux/net_tstamp.h and Documentation/networking/timestamping.txt
+for more information on hardware timestamps.
+
+-------------------------------------------------------------------------------
++ Miscellaneous bits
+-------------------------------------------------------------------------------
+
+- Packet sockets work well together with Linux socket filters, thus you also
+  might want to have a look at Documentation/networking/filter.txt
+
+--------------------------------------------------------------------------------
++ THANKS
+--------------------------------------------------------------------------------
+   
+   Jesse Brandeburg, for fixing my grammathical/spelling errors
+
diff --git a/Documentation/networking/phonet.txt b/Documentation/networking/phonet.txt
new file mode 100644
index 000000000..81003581f
--- /dev/null
+++ b/Documentation/networking/phonet.txt
@@ -0,0 +1,214 @@
+Linux Phonet protocol family
+============================
+
+Introduction
+------------
+
+Phonet is a packet protocol used by Nokia cellular modems for both IPC
+and RPC. With the Linux Phonet socket family, Linux host processes can
+receive and send messages from/to the modem, or any other external
+device attached to the modem. The modem takes care of routing.
+
+Phonet packets can be exchanged through various hardware connections
+depending on the device, such as:
+  - USB with the CDC Phonet interface,
+  - infrared,
+  - Bluetooth,
+  - an RS232 serial port (with a dedicated "FBUS" line discipline),
+  - the SSI bus with some TI OMAP processors.
+
+
+Packets format
+--------------
+
+Phonet packets have a common header as follows:
+
+  struct phonethdr {
+    uint8_t  pn_media;  /* Media type (link-layer identifier) */
+    uint8_t  pn_rdev;   /* Receiver device ID */
+    uint8_t  pn_sdev;   /* Sender device ID */
+    uint8_t  pn_res;    /* Resource ID or function */
+    uint16_t pn_length; /* Big-endian message byte length (minus 6) */
+    uint8_t  pn_robj;   /* Receiver object ID */
+    uint8_t  pn_sobj;   /* Sender object ID */
+  };
+
+On Linux, the link-layer header includes the pn_media byte (see below).
+The next 7 bytes are part of the network-layer header.
+
+The device ID is split: the 6 higher-order bits constitute the device
+address, while the 2 lower-order bits are used for multiplexing, as are
+the 8-bit object identifiers. As such, Phonet can be considered as a
+network layer with 6 bits of address space and 10 bits for transport
+protocol (much like port numbers in IP world).
+
+The modem always has address number zero. All other device have a their
+own 6-bit address.
+
+
+Link layer
+----------
+
+Phonet links are always point-to-point links. The link layer header
+consists of a single Phonet media type byte. It uniquely identifies the
+link through which the packet is transmitted, from the modem's
+perspective. Each Phonet network device shall prepend and set the media
+type byte as appropriate. For convenience, a common phonet_header_ops
+link-layer header operations structure is provided. It sets the
+media type according to the network device hardware address.
+
+Linux Phonet network interfaces support a dedicated link layer packets
+type (ETH_P_PHONET) which is out of the Ethernet type range. They can
+only send and receive Phonet packets.
+
+The virtual TUN tunnel device driver can also be used for Phonet. This
+requires IFF_TUN mode, _without_ the IFF_NO_PI flag. In this case,
+there is no link-layer header, so there is no Phonet media type byte.
+
+Note that Phonet interfaces are not allowed to re-order packets, so
+only the (default) Linux FIFO qdisc should be used with them.
+
+
+Network layer
+-------------
+
+The Phonet socket address family maps the Phonet packet header:
+
+  struct sockaddr_pn {
+    sa_family_t spn_family;    /* AF_PHONET */
+    uint8_t     spn_obj;       /* Object ID */
+    uint8_t     spn_dev;       /* Device ID */
+    uint8_t     spn_resource;  /* Resource or function */
+    uint8_t     spn_zero[...]; /* Padding */
+  };
+
+The resource field is only used when sending and receiving;
+It is ignored by bind() and getsockname().
+
+
+Low-level datagram protocol
+---------------------------
+
+Applications can send Phonet messages using the Phonet datagram socket
+protocol from the PF_PHONET family. Each socket is bound to one of the
+2^10 object IDs available, and can send and receive packets with any
+other peer.
+
+  struct sockaddr_pn addr = { .spn_family = AF_PHONET, };
+  ssize_t len;
+  socklen_t addrlen = sizeof(addr);
+  int fd;
+
+  fd = socket(PF_PHONET, SOCK_DGRAM, 0);
+  bind(fd, (struct sockaddr *)&addr, sizeof(addr));
+  /* ... */
+
+  sendto(fd, msg, msglen, 0, (struct sockaddr *)&addr, sizeof(addr));
+  len = recvfrom(fd, buf, sizeof(buf), 0,
+                 (struct sockaddr *)&addr, &addrlen);
+
+This protocol follows the SOCK_DGRAM connection-less semantics.
+However, connect() and getpeername() are not supported, as they did
+not seem useful with Phonet usages (could be added easily).
+
+
+Resource subscription
+---------------------
+
+A Phonet datagram socket can be subscribed to any number of 8-bits
+Phonet resources, as follow:
+
+  uint32_t res = 0xXX;
+  ioctl(fd, SIOCPNADDRESOURCE, &res);
+
+Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
+control request, or when the socket is closed.
+
+Note that no more than one socket can be subcribed to any given
+resource at a time. If not, ioctl() will return EBUSY.
+
+
+Phonet Pipe protocol
+--------------------
+
+The Phonet Pipe protocol is a simple sequenced packets protocol
+with end-to-end congestion control. It uses the passive listening
+socket paradigm. The listening socket is bound to an unique free object
+ID. Each listening socket can handle up to 255 simultaneous
+connections, one per accept()'d socket.
+
+  int lfd, cfd;
+
+  lfd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
+  listen (lfd, INT_MAX);
+
+  /* ... */
+  cfd = accept(lfd, NULL, NULL);
+  for (;;)
+  {
+    char buf[...];
+    ssize_t len = read(cfd, buf, sizeof(buf));
+
+    /* ... */
+
+    write(cfd, msg, msglen);
+  }
+
+Connections are traditionally established between two endpoints by a
+"third party" application. This means that both endpoints are passive.
+
+
+As of Linux kernel version 2.6.39, it is also possible to connect
+two endpoints directly, using connect() on the active side. This is
+intended to support the newer Nokia Wireless Modem API, as found in
+e.g. the Nokia Slim Modem in the ST-Ericsson U8500 platform:
+
+  struct sockaddr_spn spn;
+  int fd;
+
+  fd = socket(PF_PHONET, SOCK_SEQPACKET, PN_PROTO_PIPE);
+  memset(&spn, 0, sizeof(spn));
+  spn.spn_family = AF_PHONET;
+  spn.spn_obj = ...;
+  spn.spn_dev = ...;
+  spn.spn_resource = 0xD9;
+  connect(fd, (struct sockaddr *)&spn, sizeof(spn));
+  /* normal I/O here ... */
+  close(fd);
+
+
+WARNING:
+When polling a connected pipe socket for writability, there is an
+intrinsic race condition whereby writability might be lost between the
+polling and the writing system calls. In this case, the socket will
+block until write becomes possible again, unless non-blocking mode
+is enabled.
+
+
+The pipe protocol provides two socket options at the SOL_PNPIPE level:
+
+  PNPIPE_ENCAP accepts one integer value (int) of:
+
+    PNPIPE_ENCAP_NONE: The socket operates normally (default).
+
+    PNPIPE_ENCAP_IP: The socket is used as a backend for a virtual IP
+      interface. This requires CAP_NET_ADMIN capability. GPRS data
+      support on Nokia modems can use this. Note that the socket cannot
+      be reliably poll()'d or read() from while in this mode.
+
+  PNPIPE_IFINDEX is a read-only integer value. It contains the
+    interface index of the network interface created by PNPIPE_ENCAP,
+    or zero if encapsulation is off.
+
+  PNPIPE_HANDLE is a read-only integer value. It contains the underlying
+    identifier ("pipe handle") of the pipe. This is only defined for
+    socket descriptors that are already connected or being connected.
+
+
+Authors
+-------
+
+Linux Phonet was initially written by Sakari Ailus.
+Other contributors include Mikä Liljeberg, Andras Domokos,
+Carlos Chinea and Rémi Denis-Courmont.
+Copyright (C) 2008 Nokia Corporation.
diff --git a/Documentation/networking/phy.txt b/Documentation/networking/phy.txt
new file mode 100644
index 000000000..bdec0f700
--- /dev/null
+++ b/Documentation/networking/phy.txt
@@ -0,0 +1,427 @@
+
+-------
+PHY Abstraction Layer
+(Updated 2008-04-08)
+
+Purpose
+
+ Most network devices consist of set of registers which provide an interface
+ to a MAC layer, which communicates with the physical connection through a
+ PHY.  The PHY concerns itself with negotiating link parameters with the link
+ partner on the other side of the network connection (typically, an ethernet
+ cable), and provides a register interface to allow drivers to determine what
+ settings were chosen, and to configure what settings are allowed.
+
+ While these devices are distinct from the network devices, and conform to a
+ standard layout for the registers, it has been common practice to integrate
+ the PHY management code with the network driver.  This has resulted in large
+ amounts of redundant code.  Also, on embedded systems with multiple (and
+ sometimes quite different) ethernet controllers connected to the same 
+ management bus, it is difficult to ensure safe use of the bus.
+
+ Since the PHYs are devices, and the management busses through which they are
+ accessed are, in fact, busses, the PHY Abstraction Layer treats them as such.
+ In doing so, it has these goals:
+
+   1) Increase code-reuse
+   2) Increase overall code-maintainability
+   3) Speed development time for new network drivers, and for new systems
+ 
+ Basically, this layer is meant to provide an interface to PHY devices which
+ allows network driver writers to write as little code as possible, while
+ still providing a full feature set.
+
+The MDIO bus
+
+ Most network devices are connected to a PHY by means of a management bus.
+ Different devices use different busses (though some share common interfaces).
+ In order to take advantage of the PAL, each bus interface needs to be
+ registered as a distinct device.
+
+ 1) read and write functions must be implemented.  Their prototypes are:
+
+     int write(struct mii_bus *bus, int mii_id, int regnum, u16 value);
+     int read(struct mii_bus *bus, int mii_id, int regnum);
+
+   mii_id is the address on the bus for the PHY, and regnum is the register
+   number.  These functions are guaranteed not to be called from interrupt
+   time, so it is safe for them to block, waiting for an interrupt to signal
+   the operation is complete
+ 
+ 2) A reset function is optional.  This is used to return the bus to an
+   initialized state.
+
+ 3) A probe function is needed.  This function should set up anything the bus
+   driver needs, setup the mii_bus structure, and register with the PAL using
+   mdiobus_register.  Similarly, there's a remove function to undo all of
+   that (use mdiobus_unregister).
+ 
+ 4) Like any driver, the device_driver structure must be configured, and init
+   exit functions are used to register the driver.
+
+ 5) The bus must also be declared somewhere as a device, and registered.
+
+ As an example for how one driver implemented an mdio bus driver, see
+ drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file
+ for one of the users. (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/")
+
+(RG)MII/electrical interface considerations
+
+ The Reduced Gigabit Medium Independent Interface (RGMII) is a 12-pin
+ electrical signal interface using a synchronous 125Mhz clock signal and several
+ data lines. Due to this design decision, a 1.5ns to 2ns delay must be added
+ between the clock line (RXC or TXC) and the data lines to let the PHY (clock
+ sink) have enough setup and hold times to sample the data lines correctly. The
+ PHY library offers different types of PHY_INTERFACE_MODE_RGMII* values to let
+ the PHY driver and optionally the MAC driver, implement the required delay. The
+ values of phy_interface_t must be understood from the perspective of the PHY
+ device itself, leading to the following:
+
+ * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any
+   internal delay by itself, it assumes that either the Ethernet MAC (if capable
+   or the PCB traces) insert the correct 1.5-2ns delay
+
+ * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay
+   for the transmit data lines (TXD[3:0]) processed by the PHY device
+
+ * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay
+   for the receive data lines (RXD[3:0]) processed by the PHY device
+
+ * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for
+   both transmit AND receive data lines from/to the PHY device
+
+ Whenever possible, use the PHY side RGMII delay for these reasons:
+
+ * PHY devices may offer sub-nanosecond granularity in how they allow a
+   receiver/transmitter side delay (e.g: 0.5, 1.0, 1.5ns) to be specified. Such
+   precision may be required to account for differences in PCB trace lengths
+
+ * PHY devices are typically qualified for a large range of applications
+   (industrial, medical, automotive...), and they provide a constant and
+   reliable delay across temperature/pressure/voltage ranges
+
+ * PHY device drivers in PHYLIB being reusable by nature, being able to
+   configure correctly a specified delay enables more designs with similar delay
+   requirements to be operate correctly
+
+ For cases where the PHY is not capable of providing this delay, but the
+ Ethernet MAC driver is capable of doing so, the correct phy_interface_t value
+ should be PHY_INTERFACE_MODE_RGMII, and the Ethernet MAC driver should be
+ configured correctly in order to provide the required transmit and/or receive
+ side delay from the perspective of the PHY device. Conversely, if the Ethernet
+ MAC driver looks at the phy_interface_t value, for any other mode but
+ PHY_INTERFACE_MODE_RGMII, it should make sure that the MAC-level delays are
+ disabled.
+
+ In case neither the Ethernet MAC, nor the PHY are capable of providing the
+ required delays, as defined per the RGMII standard, several options may be
+ available:
+
+ * Some SoCs may offer a pin pad/mux/controller capable of configuring a given
+   set of pins'strength, delays, and voltage; and it may be a suitable
+   option to insert the expected 2ns RGMII delay.
+
+ * Modifying the PCB design to include a fixed delay (e.g: using a specifically
+   designed serpentine), which may not require software configuration at all.
+
+Common problems with RGMII delay mismatch
+
+ When there is a RGMII delay mismatch between the Ethernet MAC and the PHY, this
+ will most likely result in the clock and data line signals to be unstable when
+ the PHY or MAC take a snapshot of these signals to translate them into logical
+ 1 or 0 states and reconstruct the data being transmitted/received. Typical
+ symptoms include:
+
+ * Transmission/reception partially works, and there is frequent or occasional
+   packet loss observed
+
+ * Ethernet MAC may report some or all packets ingressing with a FCS/CRC error,
+   or just discard them all
+
+ * Switching to lower speeds such as 10/100Mbits/sec makes the problem go away
+   (since there is enough setup/hold time in that case)
+
+
+Connecting to a PHY
+
+ Sometime during startup, the network driver needs to establish a connection
+ between the PHY device, and the network device.  At this time, the PHY's bus
+ and drivers need to all have been loaded, so it is ready for the connection.
+ At this point, there are several ways to connect to the PHY:
+
+ 1) The PAL handles everything, and only calls the network driver when
+   the link state changes, so it can react.
+
+ 2) The PAL handles everything except interrupts (usually because the
+   controller has the interrupt registers).
+
+ 3) The PAL handles everything, but checks in with the driver every second,
+   allowing the network driver to react first to any changes before the PAL
+   does.
+ 
+ 4) The PAL serves only as a library of functions, with the network device
+   manually calling functions to update status, and configure the PHY
+
+
+Letting the PHY Abstraction Layer do Everything
+
+ If you choose option 1 (The hope is that every driver can, but to still be
+ useful to drivers that can't), connecting to the PHY is simple:
+
+ First, you need a function to react to changes in the link state.  This
+ function follows this protocol:
+
+   static void adjust_link(struct net_device *dev);
+ 
+ Next, you need to know the device name of the PHY connected to this device. 
+ The name will look something like, "0:00", where the first number is the
+ bus id, and the second is the PHY's address on that bus.  Typically,
+ the bus is responsible for making its ID unique.
+ 
+ Now, to connect, just call this function:
+ 
+   phydev = phy_connect(dev, phy_name, &adjust_link, interface);
+
+ phydev is a pointer to the phy_device structure which represents the PHY.  If
+ phy_connect is successful, it will return the pointer.  dev, here, is the
+ pointer to your net_device.  Once done, this function will have started the
+ PHY's software state machine, and registered for the PHY's interrupt, if it
+ has one.  The phydev structure will be populated with information about the
+ current state, though the PHY will not yet be truly operational at this
+ point.
+
+ PHY-specific flags should be set in phydev->dev_flags prior to the call
+ to phy_connect() such that the underlying PHY driver can check for flags
+ and perform specific operations based on them.
+ This is useful if the system has put hardware restrictions on
+ the PHY/controller, of which the PHY needs to be aware.
+
+ interface is a u32 which specifies the connection type used
+ between the controller and the PHY.  Examples are GMII, MII,
+ RGMII, and SGMII.  For a full list, see include/linux/phy.h
+
+ Now just make sure that phydev->supported and phydev->advertising have any
+ values pruned from them which don't make sense for your controller (a 10/100
+ controller may be connected to a gigabit capable PHY, so you would need to
+ mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for definitions
+ for these bitfields. Note that you should not SET any bits, except the
+ SUPPORTED_Pause and SUPPORTED_AsymPause bits (see below), or the PHY may get
+ put into an unsupported state.
+
+ Lastly, once the controller is ready to handle network traffic, you call
+ phy_start(phydev).  This tells the PAL that you are ready, and configures the
+ PHY to connect to the network.  If you want to handle your own interrupts,
+ just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start.
+ Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL.
+
+ When you want to disconnect from the network (even if just briefly), you call
+ phy_stop(phydev).
+
+Pause frames / flow control
+
+ The PHY does not participate directly in flow control/pause frames except by
+ making sure that the SUPPORTED_Pause and SUPPORTED_AsymPause bits are set in
+ MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC
+ controller supports such a thing. Since flow control/pause frames generation
+ involves the Ethernet MAC driver, it is recommended that this driver takes care
+ of properly indicating advertisement and support for such features by setting
+ the SUPPORTED_Pause and SUPPORTED_AsymPause bits accordingly. This can be done
+ either before or after phy_connect() and/or as a result of implementing the
+ ethtool::set_pauseparam feature.
+
+
+Keeping Close Tabs on the PAL
+
+ It is possible that the PAL's built-in state machine needs a little help to
+ keep your network device and the PHY properly in sync.  If so, you can
+ register a helper function when connecting to the PHY, which will be called
+ every second before the state machine reacts to any changes.  To do this, you
+ need to manually call phy_attach() and phy_prepare_link(), and then call
+ phy_start_machine() with the second argument set to point to your special
+ handler.
+
+ Currently there are no examples of how to use this functionality, and testing
+ on it has been limited because the author does not have any drivers which use
+ it (they all use option 1).  So Caveat Emptor.
+
+Doing it all yourself
+
+ There's a remote chance that the PAL's built-in state machine cannot track
+ the complex interactions between the PHY and your network device.  If this is
+ so, you can simply call phy_attach(), and not call phy_start_machine or
+ phy_prepare_link().  This will mean that phydev->state is entirely yours to
+ handle (phy_start and phy_stop toggle between some of the states, so you
+ might need to avoid them).
+
+ An effort has been made to make sure that useful functionality can be
+ accessed without the state-machine running, and most of these functions are
+ descended from functions which did not interact with a complex state-machine.
+ However, again, no effort has been made so far to test running without the
+ state machine, so tryer beware.
+
+ Here is a brief rundown of the functions:
+
+ int phy_read(struct phy_device *phydev, u16 regnum);
+ int phy_write(struct phy_device *phydev, u16 regnum, u16 val);
+
+   Simple read/write primitives.  They invoke the bus's read/write function
+   pointers.
+
+ void phy_print_status(struct phy_device *phydev);
+ 
+   A convenience function to print out the PHY status neatly.
+
+ int phy_start_interrupts(struct phy_device *phydev);
+ int phy_stop_interrupts(struct phy_device *phydev);
+
+   Requests the IRQ for the PHY interrupts, then enables them for
+   start, or disables then frees them for stop.
+
+ struct phy_device * phy_attach(struct net_device *dev, const char *phy_id,
+		 phy_interface_t interface);
+
+   Attaches a network device to a particular PHY, binding the PHY to a generic
+   driver if none was found during bus initialization.
+
+ int phy_start_aneg(struct phy_device *phydev);
+   
+   Using variables inside the phydev structure, either configures advertising
+   and resets autonegotiation, or disables autonegotiation, and configures
+   forced settings.
+
+ static inline int phy_read_status(struct phy_device *phydev);
+
+   Fills the phydev structure with up-to-date information about the current
+   settings in the PHY.
+
+ int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
+
+   Ethtool convenience functions.
+
+ int phy_mii_ioctl(struct phy_device *phydev,
+                 struct mii_ioctl_data *mii_data, int cmd);
+
+   The MII ioctl.  Note that this function will completely screw up the state
+   machine if you write registers like BMCR, BMSR, ADVERTISE, etc.  Best to
+   use this only to write registers which are not standard, and don't set off
+   a renegotiation.
+
+
+PHY Device Drivers
+
+ With the PHY Abstraction Layer, adding support for new PHYs is
+ quite easy.  In some cases, no work is required at all!  However,
+ many PHYs require a little hand-holding to get up-and-running.
+
+Generic PHY driver
+
+ If the desired PHY doesn't have any errata, quirks, or special
+ features you want to support, then it may be best to not add
+ support, and let the PHY Abstraction Layer's Generic PHY Driver
+ do all of the work.  
+
+Writing a PHY driver
+
+ If you do need to write a PHY driver, the first thing to do is
+ make sure it can be matched with an appropriate PHY device.
+ This is done during bus initialization by reading the device's
+ UID (stored in registers 2 and 3), then comparing it to each
+ driver's phy_id field by ANDing it with each driver's
+ phy_id_mask field.  Also, it needs a name.  Here's an example:
+
+   static struct phy_driver dm9161_driver = {
+         .phy_id         = 0x0181b880,
+	 .name           = "Davicom DM9161E",
+	 .phy_id_mask    = 0x0ffffff0,
+	 ...
+   }
+
+ Next, you need to specify what features (speed, duplex, autoneg,
+ etc) your PHY device and driver support.  Most PHYs support
+ PHY_BASIC_FEATURES, but you can look in include/mii.h for other
+ features.
+
+ Each driver consists of a number of function pointers, documented
+ in include/linux/phy.h under the phy_driver structure.
+
+ Of these, only config_aneg and read_status are required to be
+ assigned by the driver code.  The rest are optional.  Also, it is
+ preferred to use the generic phy driver's versions of these two
+ functions if at all possible: genphy_read_status and
+ genphy_config_aneg.  If this is not possible, it is likely that
+ you only need to perform some actions before and after invoking
+ these functions, and so your functions will wrap the generic
+ ones.
+
+ Feel free to look at the Marvell, Cicada, and Davicom drivers in
+ drivers/net/phy/ for examples (the lxt and qsemi drivers have
+ not been tested as of this writing).
+
+ The PHY's MMD register accesses are handled by the PAL framework
+ by default, but can be overridden by a specific PHY driver if
+ required. This could be the case if a PHY was released for
+ manufacturing before the MMD PHY register definitions were
+ standardized by the IEEE. Most modern PHYs will be able to use
+ the generic PAL framework for accessing the PHY's MMD registers.
+ An example of such usage is for Energy Efficient Ethernet support,
+ implemented in the PAL. This support uses the PAL to access MMD
+ registers for EEE query and configuration if the PHY supports
+ the IEEE standard access mechanisms, or can use the PHY's specific
+ access interfaces if overridden by the specific PHY driver. See
+ the Micrel driver in drivers/net/phy/ for an example of how this
+ can be implemented.
+
+Board Fixups
+
+ Sometimes the specific interaction between the platform and the PHY requires
+ special handling.  For instance, to change where the PHY's clock input is,
+ or to add a delay to account for latency issues in the data path.  In order
+ to support such contingencies, the PHY Layer allows platform code to register
+ fixups to be run when the PHY is brought up (or subsequently reset).
+
+ When the PHY Layer brings up a PHY it checks to see if there are any fixups
+ registered for it, matching based on UID (contained in the PHY device's phy_id
+ field) and the bus identifier (contained in phydev->dev.bus_id).  Both must
+ match, however two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as
+ wildcards for the bus ID and UID, respectively.
+
+ When a match is found, the PHY layer will invoke the run function associated
+ with the fixup.  This function is passed a pointer to the phy_device of
+ interest.  It should therefore only operate on that PHY.
+
+ The platform code can either register the fixup using phy_register_fixup():
+
+	int phy_register_fixup(const char *phy_id,
+		u32 phy_uid, u32 phy_uid_mask,
+		int (*run)(struct phy_device *));
+
+ Or using one of the two stubs, phy_register_fixup_for_uid() and
+ phy_register_fixup_for_id():
+
+ int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask,
+		int (*run)(struct phy_device *));
+ int phy_register_fixup_for_id(const char *phy_id,
+		int (*run)(struct phy_device *));
+
+ The stubs set one of the two matching criteria, and set the other one to
+ match anything.
+
+ When phy_register_fixup() or *_for_uid()/*_for_id() is called at module,
+ unregister fixup and free allocate memory are required.
+
+ Call one of following function before unloading module.
+
+ int phy_unregister_fixup(const char *phy_id, u32 phy_uid, u32 phy_uid_mask);
+ int phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask);
+ int phy_register_fixup_for_id(const char *phy_id);
+
+Standards
+
+ IEEE Standard 802.3: CSMA/CD Access Method and Physical Layer Specifications, Section Two:
+ http://standards.ieee.org/getieee802/download/802.3-2008_section2.pdf
+
+ RGMII v1.3:
+ http://web.archive.org/web/20160303212629/http://www.hp.com/rnd/pdfs/RGMIIv1_3.pdf
+
+ RGMII v2.0:
+ http://web.archive.org/web/20160303171328/http://www.hp.com/rnd/pdfs/RGMIIv2_0_final_hp.pdf
diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt
new file mode 100644
index 000000000..d2fd78f85
--- /dev/null
+++ b/Documentation/networking/pktgen.txt
@@ -0,0 +1,400 @@
+
+
+                  HOWTO for the linux packet generator
+                  ------------------------------------
+
+Enable CONFIG_NET_PKTGEN to compile and build pktgen either in-kernel
+or as a module.  A module is preferred; modprobe pktgen if needed.  Once
+running, pktgen creates a thread for each CPU with affinity to that CPU.
+Monitoring and controlling is done via /proc.  It is easiest to select a
+suitable sample script and configure that.
+
+On a dual CPU:
+
+ps aux | grep pkt
+root       129  0.3  0.0     0    0 ?        SW    2003 523:20 [kpktgend_0]
+root       130  0.3  0.0     0    0 ?        SW    2003 509:50 [kpktgend_1]
+
+
+For monitoring and control pktgen creates:
+	/proc/net/pktgen/pgctrl
+	/proc/net/pktgen/kpktgend_X
+        /proc/net/pktgen/ethX
+
+
+Tuning NIC for max performance
+==============================
+
+The default NIC settings are (likely) not tuned for pktgen's artificial
+overload type of benchmarking, as this could hurt the normal use-case.
+
+Specifically increasing the TX ring buffer in the NIC:
+ # ethtool -G ethX tx 1024
+
+A larger TX ring can improve pktgen's performance, while it can hurt
+in the general case, 1) because the TX ring buffer might get larger
+than the CPU's L1/L2 cache, 2) because it allows more queueing in the
+NIC HW layer (which is bad for bufferbloat).
+
+One should hesitate to conclude that packets/descriptors in the HW
+TX ring cause delay.  Drivers usually delay cleaning up the
+ring-buffers for various performance reasons, and packets stalling
+the TX ring might just be waiting for cleanup.
+
+This cleanup issue is specifically the case for the driver ixgbe
+(Intel 82599 chip).  This driver (ixgbe) combines TX+RX ring cleanups,
+and the cleanup interval is affected by the ethtool --coalesce setting
+of parameter "rx-usecs".
+
+For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6):
+ # ethtool -C ethX rx-usecs 30
+
+
+Kernel threads
+==============
+Pktgen creates a thread for each CPU with affinity to that CPU.
+Which is controlled through procfile /proc/net/pktgen/kpktgend_X.
+
+Example: /proc/net/pktgen/kpktgend_0
+
+ Running:
+ Stopped: eth4@0
+ Result: OK: add_device=eth4@0
+
+Most important are the devices assigned to the thread.
+
+The two basic thread commands are:
+ * add_device DEVICE@NAME -- adds a single device
+ * rem_device_all         -- remove all associated devices
+
+When adding a device to a thread, a corresponding procfile is created
+which is used for configuring this device. Thus, device names need to
+be unique.
+
+To support adding the same device to multiple threads, which is useful
+with multi queue NICs, the device naming scheme is extended with "@":
+ device@something
+
+The part after "@" can be anything, but it is custom to use the thread
+number.
+
+Viewing devices
+===============
+
+The Params section holds configured information.  The Current section
+holds running statistics.  The Result is printed after a run or after
+interruption.  Example:
+
+/proc/net/pktgen/eth4@0
+
+ Params: count 100000  min_pkt_size: 60  max_pkt_size: 60
+     frags: 0  delay: 0  clone_skb: 64  ifname: eth4@0
+     flows: 0 flowlen: 0
+     queue_map_min: 0  queue_map_max: 0
+     dst_min: 192.168.81.2  dst_max:
+     src_min:   src_max:
+     src_mac: 90:e2:ba:0a:56:b4 dst_mac: 00:1b:21:3c:9d:f8
+     udp_src_min: 9  udp_src_max: 109  udp_dst_min: 9  udp_dst_max: 9
+     src_mac_count: 0  dst_mac_count: 0
+     Flags: UDPSRC_RND  NO_TIMESTAMP  QUEUE_MAP_CPU
+ Current:
+     pkts-sofar: 100000  errors: 0
+     started: 623913381008us  stopped: 623913396439us idle: 25us
+     seq_num: 100001  cur_dst_mac_offset: 0  cur_src_mac_offset: 0
+     cur_saddr: 192.168.8.3  cur_daddr: 192.168.81.2
+     cur_udp_dst: 9  cur_udp_src: 42
+     cur_queue_map: 0
+     flows: 0
+ Result: OK: 15430(c15405+d25) usec, 100000 (60byte,0frags)
+  6480562pps 3110Mb/sec (3110669760bps) errors: 0
+
+
+Configuring devices
+===================
+This is done via the /proc interface, and most easily done via pgset
+as defined in the sample scripts.
+You need to specify PGDEV environment variable to use functions from sample
+scripts, i.e.:
+export PGDEV=/proc/net/pktgen/eth4@0
+source samples/pktgen/functions.sh
+
+Examples:
+
+ pg_ctrl start           starts injection.
+ pg_ctrl stop            aborts injection. Also, ^C aborts generator.
+
+ pgset "clone_skb 1"     sets the number of copies of the same packet
+ pgset "clone_skb 0"     use single SKB for all transmits
+ pgset "burst 8"         uses xmit_more API to queue 8 copies of the same
+                         packet and update HW tx queue tail pointer once.
+                         "burst 1" is the default
+ pgset "pkt_size 9014"   sets packet size to 9014
+ pgset "frags 5"         packet will consist of 5 fragments
+ pgset "count 200000"    sets number of packets to send, set to zero
+                         for continuous sends until explicitly stopped.
+
+ pgset "delay 5000"      adds delay to hard_start_xmit(). nanoseconds
+
+ pgset "dst 10.0.0.1"    sets IP destination address
+                         (BEWARE! This generator is very aggressive!)
+
+ pgset "dst_min 10.0.0.1"            Same as dst
+ pgset "dst_max 10.0.0.254"          Set the maximum destination IP.
+ pgset "src_min 10.0.0.1"            Set the minimum (or only) source IP.
+ pgset "src_max 10.0.0.254"          Set the maximum source IP.
+ pgset "dst6 fec0::1"     IPV6 destination address
+ pgset "src6 fec0::2"     IPV6 source address
+ pgset "dstmac 00:00:00:00:00:00"    sets MAC destination address
+ pgset "srcmac 00:00:00:00:00:00"    sets MAC source address
+
+ pgset "queue_map_min 0" Sets the min value of tx queue interval
+ pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices
+                         To select queue 1 of a given device,
+                         use queue_map_min=1 and queue_map_max=1
+
+ pgset "src_mac_count 1" Sets the number of MACs we'll range through.
+                         The 'minimum' MAC is what you set with srcmac.
+
+ pgset "dst_mac_count 1" Sets the number of MACs we'll range through.
+                         The 'minimum' MAC is what you set with dstmac.
+
+ pgset "flag [name]"     Set a flag to determine behaviour.  Current flags
+                         are: IPSRC_RND # IP source is random (between min/max)
+                              IPDST_RND # IP destination is random
+                              UDPSRC_RND, UDPDST_RND,
+                              MACSRC_RND, MACDST_RND
+                              TXSIZE_RND, IPV6,
+                              MPLS_RND, VID_RND, SVID_RND
+                              FLOW_SEQ,
+                              QUEUE_MAP_RND # queue map random
+                              QUEUE_MAP_CPU # queue map mirrors smp_processor_id()
+                              UDPCSUM,
+                              IPSEC # IPsec encapsulation (needs CONFIG_XFRM)
+                              NODE_ALLOC # node specific memory allocation
+                              NO_TIMESTAMP # disable timestamping
+ pgset 'flag ![name]'    Clear a flag to determine behaviour.
+                         Note that you might need to use single quote in
+                         interactive mode, so that your shell wouldn't expand
+                         the specified flag as a history command.
+
+ pgset "spi [SPI_VALUE]" Set specific SA used to transform packet.
+
+ pgset "udp_src_min 9"   set UDP source port min, If < udp_src_max, then
+                         cycle through the port range.
+
+ pgset "udp_src_max 9"   set UDP source port max.
+ pgset "udp_dst_min 9"   set UDP destination port min, If < udp_dst_max, then
+                         cycle through the port range.
+ pgset "udp_dst_max 9"   set UDP destination port max.
+
+ pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example
+                                         outer label=16,middle label=32,
+					 inner label=0 (IPv4 NULL)) Note that
+					 there must be no spaces between the
+					 arguments. Leading zeros are required.
+					 Do not set the bottom of stack bit,
+					 that's done automatically. If you do
+					 set the bottom of stack bit, that
+					 indicates that you want to randomly
+					 generate that address and the flag
+					 MPLS_RND will be turned on. You
+					 can have any mix of random and fixed
+					 labels in the label stack.
+
+ pgset "mpls 0"		  turn off mpls (or any invalid argument works too!)
+
+ pgset "vlan_id 77"       set VLAN ID 0-4095
+ pgset "vlan_p 3"         set priority bit 0-7 (default 0)
+ pgset "vlan_cfi 0"       set canonical format identifier 0-1 (default 0)
+
+ pgset "svlan_id 22"      set SVLAN ID 0-4095
+ pgset "svlan_p 3"        set priority bit 0-7 (default 0)
+ pgset "svlan_cfi 0"      set canonical format identifier 0-1 (default 0)
+
+ pgset "vlan_id 9999"     > 4095 remove vlan and svlan tags
+ pgset "svlan 9999"       > 4095 remove svlan tag
+
+
+ pgset "tos XX"           set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00)
+ pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00)
+
+ pgset "rate 300M"        set rate to 300 Mb/s
+ pgset "ratep 1000000"    set rate to 1Mpps
+
+ pgset "xmit_mode netif_receive"  RX inject into stack netif_receive_skb()
+				  Works with "burst" but not with "clone_skb".
+				  Default xmit_mode is "start_xmit".
+
+Sample scripts
+==============
+
+A collection of tutorial scripts and helpers for pktgen is in the
+samples/pktgen directory. The helper parameters.sh file support easy
+and consistent parameter parsing across the sample scripts.
+
+Usage example and help:
+ ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2
+
+Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
+  -i : ($DEV)       output interface/device (required)
+  -s : ($PKT_SIZE)  packet size
+  -d : ($DEST_IP)   destination IP
+  -m : ($DST_MAC)   destination MAC-addr
+  -t : ($THREADS)   threads to start
+  -c : ($SKB_CLONE) SKB clones send before alloc new SKB
+  -b : ($BURST)     HW level bursting of SKBs
+  -v : ($VERBOSE)   verbose
+  -x : ($DEBUG)     debug
+
+The global variables being set are also listed.  E.g. the required
+interface/device parameter "-i" sets variable $DEV.  Copy the
+pktgen_sampleXX scripts and modify them to fit your own needs.
+
+The old scripts:
+
+pktgen.conf-1-2                  # 1 CPU 2 dev
+pktgen.conf-1-1-rdos             # 1 CPU 1 dev w. route DoS 
+pktgen.conf-1-1-ip6              # 1 CPU 1 dev ipv6
+pktgen.conf-1-1-ip6-rdos         # 1 CPU 1 dev ipv6  w. route DoS
+pktgen.conf-1-1-flows            # 1 CPU 1 dev multiple flows.
+
+
+Interrupt affinity
+===================
+Note that when adding devices to a specific CPU it is a good idea to
+also assign /proc/irq/XX/smp_affinity so that the TX interrupts are bound
+to the same CPU.  This reduces cache bouncing when freeing skbs.
+
+Plus using the device flag QUEUE_MAP_CPU, which maps the SKBs TX queue
+to the running threads CPU (directly from smp_processor_id()).
+
+Enable IPsec
+============
+Default IPsec transformation with ESP encapsulation plus transport mode
+can be enabled by simply setting:
+
+pgset "flag IPSEC"
+pgset "flows 1"
+
+To avoid breaking existing testbed scripts for using AH type and tunnel mode,
+you can use "pgset spi SPI_VALUE" to specify which transformation mode
+to employ.
+
+
+Current commands and configuration options
+==========================================
+
+** Pgcontrol commands:
+
+start
+stop
+reset
+
+** Thread commands:
+
+add_device
+rem_device_all
+
+
+** Device commands:
+
+count
+clone_skb
+burst
+debug
+
+frags
+delay
+
+src_mac_count
+dst_mac_count
+
+pkt_size
+min_pkt_size
+max_pkt_size
+
+queue_map_min
+queue_map_max
+skb_priority
+
+tos           (ipv4)
+traffic_class (ipv6)
+
+mpls
+
+udp_src_min
+udp_src_max
+
+udp_dst_min
+udp_dst_max
+
+node
+
+flag
+  IPSRC_RND
+  IPDST_RND
+  UDPSRC_RND
+  UDPDST_RND
+  MACSRC_RND
+  MACDST_RND
+  TXSIZE_RND
+  IPV6
+  MPLS_RND
+  VID_RND
+  SVID_RND
+  FLOW_SEQ
+  QUEUE_MAP_RND
+  QUEUE_MAP_CPU
+  UDPCSUM
+  IPSEC
+  NODE_ALLOC
+  NO_TIMESTAMP
+
+spi (ipsec)
+
+dst_min
+dst_max
+
+src_min
+src_max
+
+dst_mac
+src_mac
+
+clear_counters
+
+src6
+dst6
+dst6_max
+dst6_min
+
+flows
+flowlen
+
+rate
+ratep
+
+xmit_mode <start_xmit|netif_receive>
+
+vlan_cfi
+vlan_id
+vlan_p
+
+svlan_cfi
+svlan_id
+svlan_p
+
+
+References:
+ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/
+ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/
+
+Paper from Linux-Kongress in Erlangen 2004.
+ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf
+
+Thanks to:
+Grant Grundler for testing on IA-64 and parisc, Harald Welte,  Lennert Buytenhek
+Stephen Hemminger, Andi Kleen, Dave Miller and many others.
+
+
+Good luck with the linux net-development.
diff --git a/Documentation/networking/ppp_generic.txt b/Documentation/networking/ppp_generic.txt
new file mode 100644
index 000000000..61daf4b39
--- /dev/null
+++ b/Documentation/networking/ppp_generic.txt
@@ -0,0 +1,426 @@
+		PPP Generic Driver and Channel Interface
+		----------------------------------------
+
+			    Paul Mackerras
+			   paulus@samba.org
+			      7 Feb 2002
+
+The generic PPP driver in linux-2.4 provides an implementation of the
+functionality which is of use in any PPP implementation, including:
+
+* the network interface unit (ppp0 etc.)
+* the interface to the networking code
+* PPP multilink: splitting datagrams between multiple links, and
+  ordering and combining received fragments
+* the interface to pppd, via a /dev/ppp character device
+* packet compression and decompression
+* TCP/IP header compression and decompression
+* detecting network traffic for demand dialling and for idle timeouts
+* simple packet filtering
+
+For sending and receiving PPP frames, the generic PPP driver calls on
+the services of PPP `channels'.  A PPP channel encapsulates a
+mechanism for transporting PPP frames from one machine to another.  A
+PPP channel implementation can be arbitrarily complex internally but
+has a very simple interface with the generic PPP code: it merely has
+to be able to send PPP frames, receive PPP frames, and optionally
+handle ioctl requests.  Currently there are PPP channel
+implementations for asynchronous serial ports, synchronous serial
+ports, and for PPP over ethernet.
+
+This architecture makes it possible to implement PPP multilink in a
+natural and straightforward way, by allowing more than one channel to
+be linked to each ppp network interface unit.  The generic layer is
+responsible for splitting datagrams on transmit and recombining them
+on receive.
+
+
+PPP channel API
+---------------
+
+See include/linux/ppp_channel.h for the declaration of the types and
+functions used to communicate between the generic PPP layer and PPP
+channels.
+
+Each channel has to provide two functions to the generic PPP layer,
+via the ppp_channel.ops pointer:
+
+* start_xmit() is called by the generic layer when it has a frame to
+  send.  The channel has the option of rejecting the frame for
+  flow-control reasons.  In this case, start_xmit() should return 0
+  and the channel should call the ppp_output_wakeup() function at a
+  later time when it can accept frames again, and the generic layer
+  will then attempt to retransmit the rejected frame(s).  If the frame
+  is accepted, the start_xmit() function should return 1.
+
+* ioctl() provides an interface which can be used by a user-space
+  program to control aspects of the channel's behaviour.  This
+  procedure will be called when a user-space program does an ioctl
+  system call on an instance of /dev/ppp which is bound to the
+  channel.  (Usually it would only be pppd which would do this.)
+
+The generic PPP layer provides seven functions to channels:
+
+* ppp_register_channel() is called when a channel has been created, to
+  notify the PPP generic layer of its presence.  For example, setting
+  a serial port to the PPPDISC line discipline causes the ppp_async
+  channel code to call this function.
+
+* ppp_unregister_channel() is called when a channel is to be
+  destroyed.  For example, the ppp_async channel code calls this when
+  a hangup is detected on the serial port.
+
+* ppp_output_wakeup() is called by a channel when it has previously
+  rejected a call to its start_xmit function, and can now accept more
+  packets.
+
+* ppp_input() is called by a channel when it has received a complete
+  PPP frame.
+
+* ppp_input_error() is called by a channel when it has detected that a
+  frame has been lost or dropped (for example, because of a FCS (frame
+  check sequence) error).
+
+* ppp_channel_index() returns the channel index assigned by the PPP
+  generic layer to this channel.  The channel should provide some way
+  (e.g. an ioctl) to transmit this back to user-space, as user-space
+  will need it to attach an instance of /dev/ppp to this channel.
+
+* ppp_unit_number() returns the unit number of the ppp network
+  interface to which this channel is connected, or -1 if the channel
+  is not connected.
+
+Connecting a channel to the ppp generic layer is initiated from the
+channel code, rather than from the generic layer.  The channel is
+expected to have some way for a user-level process to control it
+independently of the ppp generic layer.  For example, with the
+ppp_async channel, this is provided by the file descriptor to the
+serial port.
+
+Generally a user-level process will initialize the underlying
+communications medium and prepare it to do PPP.  For example, with an
+async tty, this can involve setting the tty speed and modes, issuing
+modem commands, and then going through some sort of dialog with the
+remote system to invoke PPP service there.  We refer to this process
+as `discovery'.  Then the user-level process tells the medium to
+become a PPP channel and register itself with the generic PPP layer.
+The channel then has to report the channel number assigned to it back
+to the user-level process.  From that point, the PPP negotiation code
+in the PPP daemon (pppd) can take over and perform the PPP
+negotiation, accessing the channel through the /dev/ppp interface.
+
+At the interface to the PPP generic layer, PPP frames are stored in
+skbuff structures and start with the two-byte PPP protocol number.
+The frame does *not* include the 0xff `address' byte or the 0x03
+`control' byte that are optionally used in async PPP.  Nor is there
+any escaping of control characters, nor are there any FCS or framing
+characters included.  That is all the responsibility of the channel
+code, if it is needed for the particular medium.  That is, the skbuffs
+presented to the start_xmit() function contain only the 2-byte
+protocol number and the data, and the skbuffs presented to ppp_input()
+must be in the same format.
+
+The channel must provide an instance of a ppp_channel struct to
+represent the channel.  The channel is free to use the `private' field
+however it wishes.  The channel should initialize the `mtu' and
+`hdrlen' fields before calling ppp_register_channel() and not change
+them until after ppp_unregister_channel() returns.  The `mtu' field
+represents the maximum size of the data part of the PPP frames, that
+is, it does not include the 2-byte protocol number.
+
+If the channel needs some headroom in the skbuffs presented to it for
+transmission (i.e., some space free in the skbuff data area before the
+start of the PPP frame), it should set the `hdrlen' field of the
+ppp_channel struct to the amount of headroom required.  The generic
+PPP layer will attempt to provide that much headroom but the channel
+should still check if there is sufficient headroom and copy the skbuff
+if there isn't.
+
+On the input side, channels should ideally provide at least 2 bytes of
+headroom in the skbuffs presented to ppp_input().  The generic PPP
+code does not require this but will be more efficient if this is done.
+
+
+Buffering and flow control
+--------------------------
+
+The generic PPP layer has been designed to minimize the amount of data
+that it buffers in the transmit direction.  It maintains a queue of
+transmit packets for the PPP unit (network interface device) plus a
+queue of transmit packets for each attached channel.  Normally the
+transmit queue for the unit will contain at most one packet; the
+exceptions are when pppd sends packets by writing to /dev/ppp, and
+when the core networking code calls the generic layer's start_xmit()
+function with the queue stopped, i.e. when the generic layer has
+called netif_stop_queue(), which only happens on a transmit timeout.
+The start_xmit function always accepts and queues the packet which it
+is asked to transmit.
+
+Transmit packets are dequeued from the PPP unit transmit queue and
+then subjected to TCP/IP header compression and packet compression
+(Deflate or BSD-Compress compression), as appropriate.  After this
+point the packets can no longer be reordered, as the decompression
+algorithms rely on receiving compressed packets in the same order that
+they were generated.
+
+If multilink is not in use, this packet is then passed to the attached
+channel's start_xmit() function.  If the channel refuses to take
+the packet, the generic layer saves it for later transmission.  The
+generic layer will call the channel's start_xmit() function again
+when the channel calls  ppp_output_wakeup() or when the core
+networking code calls the generic layer's start_xmit() function
+again.  The generic layer contains no timeout and retransmission
+logic; it relies on the core networking code for that.
+
+If multilink is in use, the generic layer divides the packet into one
+or more fragments and puts a multilink header on each fragment.  It
+decides how many fragments to use based on the length of the packet
+and the number of channels which are potentially able to accept a
+fragment at the moment.  A channel is potentially able to accept a
+fragment if it doesn't have any fragments currently queued up for it
+to transmit.  The channel may still refuse a fragment; in this case
+the fragment is queued up for the channel to transmit later.  This
+scheme has the effect that more fragments are given to higher-
+bandwidth channels.  It also means that under light load, the generic
+layer will tend to fragment large packets across all the channels,
+thus reducing latency, while under heavy load, packets will tend to be
+transmitted as single fragments, thus reducing the overhead of
+fragmentation.
+
+
+SMP safety
+----------
+
+The PPP generic layer has been designed to be SMP-safe.  Locks are
+used around accesses to the internal data structures where necessary
+to ensure their integrity.  As part of this, the generic layer
+requires that the channels adhere to certain requirements and in turn
+provides certain guarantees to the channels.  Essentially the channels
+are required to provide the appropriate locking on the ppp_channel
+structures that form the basis of the communication between the
+channel and the generic layer.  This is because the channel provides
+the storage for the ppp_channel structure, and so the channel is
+required to provide the guarantee that this storage exists and is
+valid at the appropriate times.
+
+The generic layer requires these guarantees from the channel:
+
+* The ppp_channel object must exist from the time that
+  ppp_register_channel() is called until after the call to
+  ppp_unregister_channel() returns.
+
+* No thread may be in a call to any of ppp_input(), ppp_input_error(),
+  ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a
+  channel at the time that ppp_unregister_channel() is called for that
+  channel.
+
+* ppp_register_channel() and ppp_unregister_channel() must be called
+  from process context, not interrupt or softirq/BH context.
+
+* The remaining generic layer functions may be called at softirq/BH
+  level but must not be called from a hardware interrupt handler.
+
+* The generic layer may call the channel start_xmit() function at
+  softirq/BH level but will not call it at interrupt level.  Thus the
+  start_xmit() function may not block.
+
+* The generic layer will only call the channel ioctl() function in
+  process context.
+
+The generic layer provides these guarantees to the channels:
+
+* The generic layer will not call the start_xmit() function for a
+  channel while any thread is already executing in that function for
+  that channel.
+
+* The generic layer will not call the ioctl() function for a channel
+  while any thread is already executing in that function for that
+  channel.
+
+* By the time a call to ppp_unregister_channel() returns, no thread
+  will be executing in a call from the generic layer to that channel's
+  start_xmit() or ioctl() function, and the generic layer will not
+  call either of those functions subsequently.
+
+
+Interface to pppd
+-----------------
+
+The PPP generic layer exports a character device interface called
+/dev/ppp.  This is used by pppd to control PPP interface units and
+channels.  Although there is only one /dev/ppp, each open instance of
+/dev/ppp acts independently and can be attached either to a PPP unit
+or a PPP channel.  This is achieved using the file->private_data field
+to point to a separate object for each open instance of /dev/ppp.  In
+this way an effect similar to Solaris' clone open is obtained,
+allowing us to control an arbitrary number of PPP interfaces and
+channels without having to fill up /dev with hundreds of device names.
+
+When /dev/ppp is opened, a new instance is created which is initially
+unattached.  Using an ioctl call, it can then be attached to an
+existing unit, attached to a newly-created unit, or attached to an
+existing channel.  An instance attached to a unit can be used to send
+and receive PPP control frames, using the read() and write() system
+calls, along with poll() if necessary.  Similarly, an instance
+attached to a channel can be used to send and receive PPP frames on
+that channel.
+
+In multilink terms, the unit represents the bundle, while the channels
+represent the individual physical links.  Thus, a PPP frame sent by a
+write to the unit (i.e., to an instance of /dev/ppp attached to the
+unit) will be subject to bundle-level compression and to fragmentation
+across the individual links (if multilink is in use).  In contrast, a
+PPP frame sent by a write to the channel will be sent as-is on that
+channel, without any multilink header.
+
+A channel is not initially attached to any unit.  In this state it can
+be used for PPP negotiation but not for the transfer of data packets.
+It can then be connected to a PPP unit with an ioctl call, which
+makes it available to send and receive data packets for that unit.
+
+The ioctl calls which are available on an instance of /dev/ppp depend
+on whether it is unattached, attached to a PPP interface, or attached
+to a PPP channel.  The ioctl calls which are available on an
+unattached instance are:
+
+* PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp
+  instance the "owner" of the interface.  The argument should point to
+  an int which is the desired unit number if >= 0, or -1 to assign the
+  lowest unused unit number.  Being the owner of the interface means
+  that the interface will be shut down if this instance of /dev/ppp is
+  closed.
+
+* PPPIOCATTACH attaches this instance to an existing PPP interface.
+  The argument should point to an int containing the unit number.
+  This does not make this instance the owner of the PPP interface.
+
+* PPPIOCATTCHAN attaches this instance to an existing PPP channel.
+  The argument should point to an int containing the channel number.
+
+The ioctl calls available on an instance of /dev/ppp attached to a
+channel are:
+
+* PPPIOCCONNECT connects this channel to a PPP interface.  The
+  argument should point to an int containing the interface unit
+  number.  It will return an EINVAL error if the channel is already
+  connected to an interface, or ENXIO if the requested interface does
+  not exist.
+
+* PPPIOCDISCONN disconnects this channel from the PPP interface that
+  it is connected to.  It will return an EINVAL error if the channel
+  is not connected to an interface.
+
+* All other ioctl commands are passed to the channel ioctl() function.
+
+The ioctl calls that are available on an instance that is attached to
+an interface unit are:
+
+* PPPIOCSMRU sets the MRU (maximum receive unit) for the interface.
+  The argument should point to an int containing the new MRU value.
+
+* PPPIOCSFLAGS sets flags which control the operation of the
+  interface.  The argument should be a pointer to an int containing
+  the new flags value.  The bits in the flags value that can be set
+  are:
+	SC_COMP_TCP		enable transmit TCP header compression
+	SC_NO_TCP_CCID		disable connection-id compression for
+				TCP header compression
+	SC_REJ_COMP_TCP		disable receive TCP header decompression
+	SC_CCP_OPEN		Compression Control Protocol (CCP) is
+				open, so inspect CCP packets
+	SC_CCP_UP		CCP is up, may (de)compress packets
+	SC_LOOP_TRAFFIC		send IP traffic to pppd
+	SC_MULTILINK		enable PPP multilink fragmentation on
+				transmitted packets
+	SC_MP_SHORTSEQ		expect short multilink sequence
+				numbers on received multilink fragments
+	SC_MP_XSHORTSEQ		transmit short multilink sequence nos.
+
+  The values of these flags are defined in <linux/ppp-ioctl.h>.  Note
+  that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and
+  SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option
+  is not selected.
+
+* PPPIOCGFLAGS returns the value of the status/control flags for the
+  interface unit.  The argument should point to an int where the ioctl
+  will store the flags value.  As well as the values listed above for
+  PPPIOCSFLAGS, the following bits may be set in the returned value:
+	SC_COMP_RUN		CCP compressor is running
+	SC_DECOMP_RUN		CCP decompressor is running
+	SC_DC_ERROR		CCP decompressor detected non-fatal error
+	SC_DC_FERROR		CCP decompressor detected fatal error
+
+* PPPIOCSCOMPRESS sets the parameters for packet compression or
+  decompression.  The argument should point to a ppp_option_data
+  structure (defined in <linux/ppp-ioctl.h>), which contains a
+  pointer/length pair which should describe a block of memory
+  containing a CCP option specifying a compression method and its
+  parameters.  The ppp_option_data struct also contains a `transmit'
+  field.  If this is 0, the ioctl will affect the receive path,
+  otherwise the transmit path.
+
+* PPPIOCGUNIT returns, in the int pointed to by the argument, the unit
+  number of this interface unit.
+
+* PPPIOCSDEBUG sets the debug flags for the interface to the value in
+  the int pointed to by the argument.  Only the least significant bit
+  is used; if this is 1 the generic layer will print some debug
+  messages during its operation.  This is only intended for debugging
+  the generic PPP layer code; it is generally not helpful for working
+  out why a PPP connection is failing.
+
+* PPPIOCGDEBUG returns the debug flags for the interface in the int
+  pointed to by the argument.
+
+* PPPIOCGIDLE returns the time, in seconds, since the last data
+  packets were sent and received.  The argument should point to a
+  ppp_idle structure (defined in <linux/ppp_defs.h>).  If the
+  CONFIG_PPP_FILTER option is enabled, the set of packets which reset
+  the transmit and receive idle timers is restricted to those which
+  pass the `active' packet filter.
+
+* PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the
+  number of connection slots) for the TCP header compressor and
+  decompressor.  The lower 16 bits of the int pointed to by the
+  argument specify the maximum connection-ID for the compressor.  If
+  the upper 16 bits of that int are non-zero, they specify the maximum
+  connection-ID for the decompressor, otherwise the decompressor's
+  maximum connection-ID is set to 15.
+
+* PPPIOCSNPMODE sets the network-protocol mode for a given network
+  protocol.  The argument should point to an npioctl struct (defined
+  in <linux/ppp-ioctl.h>).  The `protocol' field gives the PPP protocol
+  number for the protocol to be affected, and the `mode' field
+  specifies what to do with packets for that protocol:
+
+	NPMODE_PASS	normal operation, transmit and receive packets
+	NPMODE_DROP	silently drop packets for this protocol
+	NPMODE_ERROR	drop packets and return an error on transmit
+	NPMODE_QUEUE	queue up packets for transmit, drop received
+			packets
+
+  At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as
+  NPMODE_DROP.
+
+* PPPIOCGNPMODE returns the network-protocol mode for a given
+  protocol.  The argument should point to an npioctl struct with the
+  `protocol' field set to the PPP protocol number for the protocol of
+  interest.  On return the `mode' field will be set to the network-
+  protocol mode for that protocol.
+
+* PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet
+  filters.  These ioctls are only available if the CONFIG_PPP_FILTER
+  option is selected.  The argument should point to a sock_fprog
+  structure (defined in <linux/filter.h>) containing the compiled BPF
+  instructions for the filter.  Packets are dropped if they fail the
+  `pass' filter; otherwise, if they fail the `active' filter they are
+  passed but they do not reset the transmit or receive idle timer.
+
+* PPPIOCSMRRU enables or disables multilink processing for received
+  packets and sets the multilink MRRU (maximum reconstructed receive
+  unit).  The argument should point to an int containing the new MRRU
+  value.  If the MRRU value is 0, processing of received multilink
+  fragments is disabled.  This ioctl is only available if the
+  CONFIG_PPP_MULTILINK option is selected.
+
+Last modified: 7-feb-2002
diff --git a/Documentation/networking/proc_net_tcp.txt b/Documentation/networking/proc_net_tcp.txt
new file mode 100644
index 000000000..4a79209e7
--- /dev/null
+++ b/Documentation/networking/proc_net_tcp.txt
@@ -0,0 +1,48 @@
+This document describes the interfaces /proc/net/tcp and /proc/net/tcp6.
+Note that these interfaces are deprecated in favor of tcp_diag.
+
+These /proc interfaces provide information about currently active TCP 
+connections, and are implemented by tcp4_seq_show() in net/ipv4/tcp_ipv4.c
+and tcp6_seq_show() in net/ipv6/tcp_ipv6.c, respectively.
+
+It will first list all listening TCP sockets, and next list all established
+TCP connections. A typical entry of /proc/net/tcp would look like this (split 
+up into 3 parts because of the length of the line):
+
+   46: 010310AC:9C4C 030310AC:1770 01 
+   |      |      |      |      |   |--> connection state
+   |      |      |      |      |------> remote TCP port number
+   |      |      |      |-------------> remote IPv4 address
+   |      |      |--------------------> local TCP port number
+   |      |---------------------------> local IPv4 address
+   |----------------------------------> number of entry
+
+   00000150:00000000 01:00000019 00000000  
+      |        |     |     |       |--> number of unrecovered RTO timeouts
+      |        |     |     |----------> number of jiffies until timer expires
+      |        |     |----------------> timer_active (see below)
+      |        |----------------------> receive-queue
+      |-------------------------------> transmit-queue
+
+   1000        0 54165785 4 cd1e6040 25 4 27 3 -1
+    |          |    |     |    |     |  | |  | |--> slow start size threshold, 
+    |          |    |     |    |     |  | |  |      or -1 if the threshold
+    |          |    |     |    |     |  | |  |      is >= 0xFFFF
+    |          |    |     |    |     |  | |  |----> sending congestion window
+    |          |    |     |    |     |  | |-------> (ack.quick<<1)|ack.pingpong
+    |          |    |     |    |     |  |---------> Predicted tick of soft clock
+    |          |    |     |    |     |              (delayed ACK control data)
+    |          |    |     |    |     |------------> retransmit timeout
+    |          |    |     |    |------------------> location of socket in memory
+    |          |    |     |-----------------------> socket reference count
+    |          |    |-----------------------------> inode
+    |          |----------------------------------> unanswered 0-window probes
+    |---------------------------------------------> uid
+
+timer_active:
+  0  no timer is pending
+  1  retransmit-timer is pending
+  2  another timer (e.g. delayed ack or keepalive) is pending
+  3  this is a socket in TIME_WAIT state. Not all fields will contain 
+     data (or even exist)
+  4  zero window probe timer is pending
diff --git a/Documentation/networking/radiotap-headers.txt b/Documentation/networking/radiotap-headers.txt
new file mode 100644
index 000000000..953331c79
--- /dev/null
+++ b/Documentation/networking/radiotap-headers.txt
@@ -0,0 +1,152 @@
+How to use radiotap headers
+===========================
+
+Pointer to the radiotap include file
+------------------------------------
+
+Radiotap headers are variable-length and extensible, you can get most of the
+information you need to know on them from:
+
+./include/net/ieee80211_radiotap.h
+
+This document gives an overview and warns on some corner cases.
+
+
+Structure of the header
+-----------------------
+
+There is a fixed portion at the start which contains a u32 bitmap that defines
+if the possible argument associated with that bit is present or not.  So if b0
+of the it_present member of ieee80211_radiotap_header is set, it means that
+the header for argument index 0 (IEEE80211_RADIOTAP_TSFT) is present in the
+argument area.
+
+   < 8-byte ieee80211_radiotap_header >
+   [ <possible argument bitmap extensions ... > ]
+   [ <argument> ... ]
+
+At the moment there are only 13 possible argument indexes defined, but in case
+we run out of space in the u32 it_present member, it is defined that b31 set
+indicates that there is another u32 bitmap following (shown as "possible
+argument bitmap extensions..." above), and the start of the arguments is moved
+forward 4 bytes each time.
+
+Note also that the it_len member __le16 is set to the total number of bytes
+covered by the ieee80211_radiotap_header and any arguments following.
+
+
+Requirements for arguments
+--------------------------
+
+After the fixed part of the header, the arguments follow for each argument
+index whose matching bit is set in the it_present member of
+ieee80211_radiotap_header.
+
+ - the arguments are all stored little-endian!
+
+ - the argument payload for a given argument index has a fixed size.  So
+   IEEE80211_RADIOTAP_TSFT being present always indicates an 8-byte argument is
+   present.  See the comments in ./include/net/ieee80211_radiotap.h for a nice
+   breakdown of all the argument sizes
+
+ - the arguments must be aligned to a boundary of the argument size using
+   padding.  So a u16 argument must start on the next u16 boundary if it isn't
+   already on one, a u32 must start on the next u32 boundary and so on.
+
+ - "alignment" is relative to the start of the ieee80211_radiotap_header, ie,
+   the first byte of the radiotap header.  The absolute alignment of that first
+   byte isn't defined.  So even if the whole radiotap header is starting at, eg,
+   address 0x00000003, still the first byte of the radiotap header is treated as
+   0 for alignment purposes.
+
+ - the above point that there may be no absolute alignment for multibyte
+   entities in the fixed radiotap header or the argument region means that you
+   have to take special evasive action when trying to access these multibyte
+   entities.  Some arches like Blackfin cannot deal with an attempt to
+   dereference, eg, a u16 pointer that is pointing to an odd address.  Instead
+   you have to use a kernel API get_unaligned() to dereference the pointer,
+   which will do it bytewise on the arches that require that.
+
+ - The arguments for a given argument index can be a compound of multiple types
+   together.  For example IEEE80211_RADIOTAP_CHANNEL has an argument payload
+   consisting of two u16s of total length 4.  When this happens, the padding
+   rule is applied dealing with a u16, NOT dealing with a 4-byte single entity.
+
+
+Example valid radiotap header
+-----------------------------
+
+	0x00, 0x00, // <-- radiotap version + pad byte
+	0x0b, 0x00, // <- radiotap header length
+	0x04, 0x0c, 0x00, 0x00, // <-- bitmap
+	0x6c, // <-- rate (in 500kHz units)
+	0x0c, //<-- tx power
+	0x01 //<-- antenna
+
+
+Using the Radiotap Parser
+-------------------------
+
+If you are having to parse a radiotap struct, you can radically simplify the
+job by using the radiotap parser that lives in net/wireless/radiotap.c and has
+its prototypes available in include/net/cfg80211.h.  You use it like this:
+
+#include <net/cfg80211.h>
+
+/* buf points to the start of the radiotap header part */
+
+int MyFunction(u8 * buf, int buflen)
+{
+	int pkt_rate_100kHz = 0, antenna = 0, pwr = 0;
+	struct ieee80211_radiotap_iterator iterator;
+	int ret = ieee80211_radiotap_iterator_init(&iterator, buf, buflen);
+
+	while (!ret) {
+
+		ret = ieee80211_radiotap_iterator_next(&iterator);
+
+		if (ret)
+			continue;
+
+		/* see if this argument is something we can use */
+
+		switch (iterator.this_arg_index) {
+		/*
+		 * You must take care when dereferencing iterator.this_arg
+		 * for multibyte types... the pointer is not aligned.  Use
+		 * get_unaligned((type *)iterator.this_arg) to dereference
+		 * iterator.this_arg for type "type" safely on all arches.
+		 */
+		case IEEE80211_RADIOTAP_RATE:
+			/* radiotap "rate" u8 is in
+			 * 500kbps units, eg, 0x02=1Mbps
+			 */
+			pkt_rate_100kHz = (*iterator.this_arg) * 5;
+			break;
+
+		case IEEE80211_RADIOTAP_ANTENNA:
+			/* radiotap uses 0 for 1st ant */
+			antenna = *iterator.this_arg);
+			break;
+
+		case IEEE80211_RADIOTAP_DBM_TX_POWER:
+			pwr = *iterator.this_arg;
+			break;
+
+		default:
+			break;
+		}
+	}  /* while more rt headers */
+
+	if (ret != -ENOENT)
+		return TXRX_DROP;
+
+	/* discard the radiotap header part */
+	buf += iterator.max_length;
+	buflen -= iterator.max_length;
+
+	...
+
+}
+
+Andy Green <andy@warmcat.com>
diff --git a/Documentation/networking/ray_cs.txt b/Documentation/networking/ray_cs.txt
new file mode 100644
index 000000000..c0c12307e
--- /dev/null
+++ b/Documentation/networking/ray_cs.txt
@@ -0,0 +1,150 @@
+September 21, 1999
+
+Copyright (c) 1998  Corey Thomas (corey@world.std.com)
+
+This file is the documentation for the Raylink Wireless LAN card driver for
+Linux.  The Raylink wireless LAN card is a PCMCIA card which provides IEEE
+802.11 compatible wireless network connectivity at 1 and 2 megabits/second.
+See http://www.raytheon.com/micro/raylink/ for more information on the Raylink
+card.  This driver is in early development and does have bugs.  See the known
+bugs and limitations at the end of this document for more information.
+This driver also works with WebGear's Aviator 2.4 and Aviator Pro
+wireless LAN cards.
+
+As of kernel 2.3.18, the ray_cs driver is part of the Linux kernel
+source.  My web page for the development of ray_cs is at
+http://web.ralinktech.com/ralink/Home/Support/Linux.html 
+and I can be emailed at corey@world.std.com
+
+The kernel driver is based on ray_cs-1.62.tgz
+
+The driver at my web page is intended to be used as an add on to
+David Hinds pcmcia package.  All the command line parameters are
+available when compiled as a module.  When built into the kernel, only
+the essid= string parameter is available via the kernel command line.
+This will change after the method of sorting out parameters for all
+the PCMCIA drivers is agreed upon.  If you must have a built in driver
+with nondefault parameters, they can be edited in
+/usr/src/linux/drivers/net/pcmcia/ray_cs.c.  Searching for module_param
+will find them all.
+
+Information on card services is available at:
+	http://pcmcia-cs.sourceforge.net/
+
+
+Card services user programs are still required for PCMCIA devices.
+pcmcia-cs-3.1.1 or greater is required for the kernel version of
+the driver.
+
+Currently, ray_cs is not part of David Hinds card services package,
+so the following magic is required.
+
+At the end of the /etc/pcmcia/config.opts file, add the line: 
+source ./ray_cs.opts 
+This will make card services read the ray_cs.opts file
+when starting.  Create the file /etc/pcmcia/ray_cs.opts containing the
+following:
+
+#### start of /etc/pcmcia/ray_cs.opts ###################
+# Configuration options for Raylink Wireless LAN PCMCIA card
+device "ray_cs"
+  class "network" module "misc/ray_cs"
+
+card "RayLink PC Card WLAN Adapter"
+  manfid 0x01a6, 0x0000
+  bind "ray_cs"
+
+module "misc/ray_cs" opts ""
+#### end of /etc/pcmcia/ray_cs.opts #####################
+
+
+To join an existing network with
+different parameters, contact the network administrator for the 
+configuration information, and edit /etc/pcmcia/ray_cs.opts.
+Add the parameters below between the empty quotes.
+
+Parameters for ray_cs driver which may be specified in ray_cs.opts:
+
+bc              integer         0 = normal mode (802.11 timing)
+                                1 = slow down inter frame timing to allow
+                                    operation with older breezecom access
+                                    points.
+
+beacon_period	integer         beacon period in Kilo-microseconds
+				legal values = must be integer multiple 
+                                               of hop dwell
+                                default = 256
+
+country         integer         1 = USA (default)
+                                2 = Europe
+                                3 = Japan
+                                4 = Korea
+                                5 = Spain
+                                6 = France
+                                7 = Israel
+                                8 = Australia
+
+essid		string		ESS ID - network name to join
+				string with maximum length of 32 chars
+				default value = "ADHOC_ESSID"
+
+hop_dwell	integer         hop dwell time in Kilo-microseconds 
+				legal values = 16,32,64,128(default),256
+
+irq_mask	integer         linux standard 16 bit value 1bit/IRQ
+				lsb is IRQ 0, bit 1 is IRQ 1 etc.
+				Used to restrict choice of IRQ's to use.
+                                Recommended method for controlling
+                                interrupts is in /etc/pcmcia/config.opts
+
+net_type	integer		0 (default) = adhoc network, 
+				1 = infrastructure
+
+phy_addr	string          string containing new MAC address in
+				hex, must start with x eg
+				x00008f123456
+
+psm		integer         0 = continuously active
+				1 = power save mode (not useful yet)
+
+pc_debug	integer		(0-5) larger values for more verbose
+				logging.  Replaces ray_debug.
+
+ray_debug	integer		Replaced with pc_debug
+
+ray_mem_speed   integer         defaults to 500
+
+sniffer         integer         0 = not sniffer (default)
+                                1 = sniffer which can be used to record all
+                                    network traffic using tcpdump or similar, 
+                                    but no normal network use is allowed.
+
+translate	integer		0 = no translation (encapsulate frames)
+				1 = translation    (RFC1042/802.1)
+
+
+More on sniffer mode:
+
+tcpdump does not understand 802.11 headers, so it can't
+interpret the contents, but it can record to a file.  This is only
+useful for debugging 802.11 lowlevel protocols that are not visible to
+linux.  If you want to watch ftp xfers, or do similar things, you
+don't need to use sniffer mode.  Also, some packet types are never
+sent up by the card, so you will never see them (ack, rts, cts, probe
+etc.)  There is a simple program (showcap) included in the ray_cs
+package which parses the 802.11 headers.
+
+Known Problems and missing features
+
+        Does not work with non x86
+
+	Does not work with SMP
+
+	Support for defragmenting frames is not yet debugged, and in
+	fact is known to not work.  I have never encountered a net set
+	up to fragment, but still, it should be fixed.
+
+	The ioctl support is incomplete.  The hardware address cannot be set
+	using ifconfig yet.  If a different hardware address is needed, it may
+	be set using the phy_addr parameter in ray_cs.opts.  This requires
+	a card insertion to take effect.
diff --git a/Documentation/networking/rds.txt b/Documentation/networking/rds.txt
new file mode 100644
index 000000000..0235ae69a
--- /dev/null
+++ b/Documentation/networking/rds.txt
@@ -0,0 +1,423 @@
+
+Overview
+========
+
+This readme tries to provide some background on the hows and whys of RDS,
+and will hopefully help you find your way around the code.
+
+In addition, please see this email about RDS origins:
+http://oss.oracle.com/pipermail/rds-devel/2007-November/000228.html
+
+RDS Architecture
+================
+
+RDS provides reliable, ordered datagram delivery by using a single
+reliable connection between any two nodes in the cluster. This allows
+applications to use a single socket to talk to any other process in the
+cluster - so in a cluster with N processes you need N sockets, in contrast
+to N*N if you use a connection-oriented socket transport like TCP.
+
+RDS is not Infiniband-specific; it was designed to support different
+transports.  The current implementation used to support RDS over TCP as well
+as IB.
+
+The high-level semantics of RDS from the application's point of view are
+
+ *	Addressing
+        RDS uses IPv4 addresses and 16bit port numbers to identify
+        the end point of a connection. All socket operations that involve
+        passing addresses between kernel and user space generally
+        use a struct sockaddr_in.
+
+        The fact that IPv4 addresses are used does not mean the underlying
+        transport has to be IP-based. In fact, RDS over IB uses a
+        reliable IB connection; the IP address is used exclusively to
+        locate the remote node's GID (by ARPing for the given IP).
+
+        The port space is entirely independent of UDP, TCP or any other
+        protocol.
+
+ *	Socket interface
+        RDS sockets work *mostly* as you would expect from a BSD
+        socket. The next section will cover the details. At any rate,
+        all I/O is performed through the standard BSD socket API.
+        Some additions like zerocopy support are implemented through
+        control messages, while other extensions use the getsockopt/
+        setsockopt calls.
+
+        Sockets must be bound before you can send or receive data.
+        This is needed because binding also selects a transport and
+        attaches it to the socket. Once bound, the transport assignment
+        does not change. RDS will tolerate IPs moving around (eg in
+        a active-active HA scenario), but only as long as the address
+        doesn't move to a different transport.
+
+ *	sysctls
+        RDS supports a number of sysctls in /proc/sys/net/rds
+
+
+Socket Interface
+================
+
+  AF_RDS, PF_RDS, SOL_RDS
+	AF_RDS and PF_RDS are the domain type to be used with socket(2)
+	to create RDS sockets. SOL_RDS is the socket-level to be used
+	with setsockopt(2) and getsockopt(2) for RDS specific socket
+	options.
+
+  fd = socket(PF_RDS, SOCK_SEQPACKET, 0);
+        This creates a new, unbound RDS socket.
+
+  setsockopt(SOL_SOCKET): send and receive buffer size
+        RDS honors the send and receive buffer size socket options.
+        You are not allowed to queue more than SO_SNDSIZE bytes to
+        a socket. A message is queued when sendmsg is called, and
+        it leaves the queue when the remote system acknowledges
+        its arrival.
+
+        The SO_RCVSIZE option controls the maximum receive queue length.
+        This is a soft limit rather than a hard limit - RDS will
+        continue to accept and queue incoming messages, even if that
+        takes the queue length over the limit. However, it will also
+        mark the port as "congested" and send a congestion update to
+        the source node. The source node is supposed to throttle any
+        processes sending to this congested port.
+
+  bind(fd, &sockaddr_in, ...)
+        This binds the socket to a local IP address and port, and a
+        transport, if one has not already been selected via the
+	SO_RDS_TRANSPORT socket option
+
+  sendmsg(fd, ...)
+        Sends a message to the indicated recipient. The kernel will
+        transparently establish the underlying reliable connection
+        if it isn't up yet.
+
+        An attempt to send a message that exceeds SO_SNDSIZE will
+        return with -EMSGSIZE
+
+        An attempt to send a message that would take the total number
+        of queued bytes over the SO_SNDSIZE threshold will return
+        EAGAIN.
+
+        An attempt to send a message to a destination that is marked
+        as "congested" will return ENOBUFS.
+
+  recvmsg(fd, ...)
+        Receives a message that was queued to this socket. The sockets
+        recv queue accounting is adjusted, and if the queue length
+        drops below SO_SNDSIZE, the port is marked uncongested, and
+        a congestion update is sent to all peers.
+
+        Applications can ask the RDS kernel module to receive
+        notifications via control messages (for instance, there is a
+        notification when a congestion update arrived, or when a RDMA
+        operation completes). These notifications are received through
+        the msg.msg_control buffer of struct msghdr. The format of the
+        messages is described in manpages.
+
+  poll(fd)
+        RDS supports the poll interface to allow the application
+        to implement async I/O.
+
+        POLLIN handling is pretty straightforward. When there's an
+        incoming message queued to the socket, or a pending notification,
+        we signal POLLIN.
+
+        POLLOUT is a little harder. Since you can essentially send
+        to any destination, RDS will always signal POLLOUT as long as
+        there's room on the send queue (ie the number of bytes queued
+        is less than the sendbuf size).
+
+        However, the kernel will refuse to accept messages to
+        a destination marked congested - in this case you will loop
+        forever if you rely on poll to tell you what to do.
+        This isn't a trivial problem, but applications can deal with
+        this - by using congestion notifications, and by checking for
+        ENOBUFS errors returned by sendmsg.
+
+  setsockopt(SOL_RDS, RDS_CANCEL_SENT_TO, &sockaddr_in)
+        This allows the application to discard all messages queued to a
+        specific destination on this particular socket.
+
+        This allows the application to cancel outstanding messages if
+        it detects a timeout. For instance, if it tried to send a message,
+        and the remote host is unreachable, RDS will keep trying forever.
+        The application may decide it's not worth it, and cancel the
+        operation. In this case, it would use RDS_CANCEL_SENT_TO to
+        nuke any pending messages.
+
+  setsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
+  getsockopt(fd, SOL_RDS, SO_RDS_TRANSPORT, (int *)&transport ..)
+	Set or read an integer defining  the underlying
+	encapsulating transport to be used for RDS packets on the
+	socket. When setting the option, integer argument may be
+	one of RDS_TRANS_TCP or RDS_TRANS_IB. When retrieving the
+	value, RDS_TRANS_NONE will be returned on an unbound socket.
+	This socket option may only be set exactly once on the socket,
+	prior to binding it via the bind(2) system call. Attempts to
+	set SO_RDS_TRANSPORT on a socket for which the transport has
+	been previously attached explicitly (by SO_RDS_TRANSPORT) or
+	implicitly (via bind(2)) will return an error of EOPNOTSUPP.
+	An attempt to set SO_RDS_TRANSPPORT to RDS_TRANS_NONE will
+	always return EINVAL.
+
+RDMA for RDS
+============
+
+  see rds-rdma(7) manpage (available in rds-tools)
+
+
+Congestion Notifications
+========================
+
+  see rds(7) manpage
+
+
+RDS Protocol
+============
+
+  Message header
+
+    The message header is a 'struct rds_header' (see rds.h):
+    Fields:
+      h_sequence:
+          per-packet sequence number
+      h_ack:
+          piggybacked acknowledgment of last packet received
+      h_len:
+          length of data, not including header
+      h_sport:
+          source port
+      h_dport:
+          destination port
+      h_flags:
+          CONG_BITMAP - this is a congestion update bitmap
+          ACK_REQUIRED - receiver must ack this packet
+          RETRANSMITTED - packet has previously been sent
+      h_credit:
+          indicate to other end of connection that
+          it has more credits available (i.e. there is
+          more send room)
+      h_padding[4]:
+          unused, for future use
+      h_csum:
+          header checksum
+      h_exthdr:
+          optional data can be passed here. This is currently used for
+          passing RDMA-related information.
+
+  ACK and retransmit handling
+
+      One might think that with reliable IB connections you wouldn't need
+      to ack messages that have been received.  The problem is that IB
+      hardware generates an ack message before it has DMAed the message
+      into memory.  This creates a potential message loss if the HCA is
+      disabled for any reason between when it sends the ack and before
+      the message is DMAed and processed.  This is only a potential issue
+      if another HCA is available for fail-over.
+
+      Sending an ack immediately would allow the sender to free the sent
+      message from their send queue quickly, but could cause excessive
+      traffic to be used for acks. RDS piggybacks acks on sent data
+      packets.  Ack-only packets are reduced by only allowing one to be
+      in flight at a time, and by the sender only asking for acks when
+      its send buffers start to fill up. All retransmissions are also
+      acked.
+
+  Flow Control
+
+      RDS's IB transport uses a credit-based mechanism to verify that
+      there is space in the peer's receive buffers for more data. This
+      eliminates the need for hardware retries on the connection.
+
+  Congestion
+
+      Messages waiting in the receive queue on the receiving socket
+      are accounted against the sockets SO_RCVBUF option value.  Only
+      the payload bytes in the message are accounted for.  If the
+      number of bytes queued equals or exceeds rcvbuf then the socket
+      is congested.  All sends attempted to this socket's address
+      should return block or return -EWOULDBLOCK.
+
+      Applications are expected to be reasonably tuned such that this
+      situation very rarely occurs.  An application encountering this
+      "back-pressure" is considered a bug.
+
+      This is implemented by having each node maintain bitmaps which
+      indicate which ports on bound addresses are congested.  As the
+      bitmap changes it is sent through all the connections which
+      terminate in the local address of the bitmap which changed.
+
+      The bitmaps are allocated as connections are brought up.  This
+      avoids allocation in the interrupt handling path which queues
+      sages on sockets.  The dense bitmaps let transports send the
+      entire bitmap on any bitmap change reasonably efficiently.  This
+      is much easier to implement than some finer-grained
+      communication of per-port congestion.  The sender does a very
+      inexpensive bit test to test if the port it's about to send to
+      is congested or not.
+
+
+RDS Transport Layer
+==================
+
+  As mentioned above, RDS is not IB-specific. Its code is divided
+  into a general RDS layer and a transport layer.
+
+  The general layer handles the socket API, congestion handling,
+  loopback, stats, usermem pinning, and the connection state machine.
+
+  The transport layer handles the details of the transport. The IB
+  transport, for example, handles all the queue pairs, work requests,
+  CM event handlers, and other Infiniband details.
+
+
+RDS Kernel Structures
+=====================
+
+  struct rds_message
+    aka possibly "rds_outgoing", the generic RDS layer copies data to
+    be sent and sets header fields as needed, based on the socket API.
+    This is then queued for the individual connection and sent by the
+    connection's transport.
+  struct rds_incoming
+    a generic struct referring to incoming data that can be handed from
+    the transport to the general code and queued by the general code
+    while the socket is awoken. It is then passed back to the transport
+    code to handle the actual copy-to-user.
+  struct rds_socket
+    per-socket information
+  struct rds_connection
+    per-connection information
+  struct rds_transport
+    pointers to transport-specific functions
+  struct rds_statistics
+    non-transport-specific statistics
+  struct rds_cong_map
+    wraps the raw congestion bitmap, contains rbnode, waitq, etc.
+
+Connection management
+=====================
+
+  Connections may be in UP, DOWN, CONNECTING, DISCONNECTING, and
+  ERROR states.
+
+  The first time an attempt is made by an RDS socket to send data to
+  a node, a connection is allocated and connected. That connection is
+  then maintained forever -- if there are transport errors, the
+  connection will be dropped and re-established.
+
+  Dropping a connection while packets are queued will cause queued or
+  partially-sent datagrams to be retransmitted when the connection is
+  re-established.
+
+
+The send path
+=============
+
+  rds_sendmsg()
+    struct rds_message built from incoming data
+    CMSGs parsed (e.g. RDMA ops)
+    transport connection alloced and connected if not already
+    rds_message placed on send queue
+    send worker awoken
+  rds_send_worker()
+    calls rds_send_xmit() until queue is empty
+  rds_send_xmit()
+    transmits congestion map if one is pending
+    may set ACK_REQUIRED
+    calls transport to send either non-RDMA or RDMA message
+    (RDMA ops never retransmitted)
+  rds_ib_xmit()
+    allocs work requests from send ring
+    adds any new send credits available to peer (h_credits)
+    maps the rds_message's sg list
+    piggybacks ack
+    populates work requests
+    post send to connection's queue pair
+
+The recv path
+=============
+
+  rds_ib_recv_cq_comp_handler()
+    looks at write completions
+    unmaps recv buffer from device
+    no errors, call rds_ib_process_recv()
+    refill recv ring
+  rds_ib_process_recv()
+    validate header checksum
+    copy header to rds_ib_incoming struct if start of a new datagram
+    add to ibinc's fraglist
+    if competed datagram:
+      update cong map if datagram was cong update
+      call rds_recv_incoming() otherwise
+      note if ack is required
+  rds_recv_incoming()
+    drop duplicate packets
+    respond to pings
+    find the sock associated with this datagram
+    add to sock queue
+    wake up sock
+    do some congestion calculations
+  rds_recvmsg
+    copy data into user iovec
+    handle CMSGs
+    return to application
+
+Multipath RDS (mprds)
+=====================
+  Mprds is multipathed-RDS, primarily intended for RDS-over-TCP
+  (though the concept can be extended to other transports). The classical
+  implementation of RDS-over-TCP is implemented by demultiplexing multiple
+  PF_RDS sockets between any 2 endpoints (where endpoint == [IP address,
+  port]) over a single TCP socket between the 2 IP addresses involved. This
+  has the limitation that it ends up funneling multiple RDS flows over a
+  single TCP flow, thus it is
+  (a) upper-bounded to the single-flow bandwidth,
+  (b) suffers from head-of-line blocking for all the RDS sockets.
+
+  Better throughput (for a fixed small packet size, MTU) can be achieved
+  by having multiple TCP/IP flows per rds/tcp connection, i.e., multipathed
+  RDS (mprds).  Each such TCP/IP flow constitutes a path for the rds/tcp
+  connection. RDS sockets will be attached to a path based on some hash
+  (e.g., of local address and RDS port number) and packets for that RDS
+  socket will be sent over the attached path using TCP to segment/reassemble
+  RDS datagrams on that path.
+
+  Multipathed RDS is implemented by splitting the struct rds_connection into
+  a common (to all paths) part, and a per-path struct rds_conn_path. All
+  I/O workqs and reconnect threads are driven from the rds_conn_path.
+  Transports such as TCP that are multipath capable may then set up a
+  TPC socket per rds_conn_path, and this is managed by the transport via
+  the transport privatee cp_transport_data pointer.
+
+  Transports announce themselves as multipath capable by setting the
+  t_mp_capable bit during registration with the rds core module. When the
+  transport is multipath-capable, rds_sendmsg() hashes outgoing traffic
+  across multiple paths. The outgoing hash is computed based on the
+  local address and port that the PF_RDS socket is bound to.
+
+  Additionally, even if the transport is MP capable, we may be
+  peering with some node that does not support mprds, or supports
+  a different number of paths. As a result, the peering nodes need
+  to agree on the number of paths to be used for the connection.
+  This is done by sending out a control packet exchange before the
+  first data packet. The control packet exchange must have completed
+  prior to outgoing hash completion in rds_sendmsg() when the transport
+  is mutlipath capable.
+
+  The control packet is an RDS ping packet (i.e., packet to rds dest
+  port 0) with the ping packet having a rds extension header option  of
+  type RDS_EXTHDR_NPATHS, length 2 bytes, and the value is the
+  number of paths supported by the sender. The "probe" ping packet will
+  get sent from some reserved port, RDS_FLAG_PROBE_PORT (in <linux/rds.h>)
+  The receiver of a ping from RDS_FLAG_PROBE_PORT will thus immediately
+  be able to compute the min(sender_paths, rcvr_paths). The pong
+  sent in response to a probe-ping should contain the rcvr's npaths
+  when the rcvr is mprds-capable.
+
+  If the rcvr is not mprds-capable, the exthdr in the ping will be
+  ignored.  In this case the pong will not have any exthdrs, so the sender
+  of the probe-ping can default to single-path mprds.
+
diff --git a/Documentation/networking/regulatory.txt b/Documentation/networking/regulatory.txt
new file mode 100644
index 000000000..381e5b23d
--- /dev/null
+++ b/Documentation/networking/regulatory.txt
@@ -0,0 +1,204 @@
+Linux wireless regulatory documentation
+---------------------------------------
+
+This document gives a brief review over how the Linux wireless
+regulatory infrastructure works.
+
+More up to date information can be obtained at the project's web page:
+
+http://wireless.kernel.org/en/developers/Regulatory
+
+Keeping regulatory domains in userspace
+---------------------------------------
+
+Due to the dynamic nature of regulatory domains we keep them
+in userspace and provide a framework for userspace to upload
+to the kernel one regulatory domain to be used as the central
+core regulatory domain all wireless devices should adhere to.
+
+How to get regulatory domains to the kernel
+-------------------------------------------
+
+When the regulatory domain is first set up, the kernel will request a
+database file (regulatory.db) containing all the regulatory rules. It
+will then use that database when it needs to look up the rules for a
+given country.
+
+How to get regulatory domains to the kernel (old CRDA solution)
+---------------------------------------------------------------
+
+Userspace gets a regulatory domain in the kernel by having
+a userspace agent build it and send it via nl80211. Only
+expected regulatory domains will be respected by the kernel.
+
+A currently available userspace agent which can accomplish this
+is CRDA - central regulatory domain agent. Its documented here:
+
+http://wireless.kernel.org/en/developers/Regulatory/CRDA
+
+Essentially the kernel will send a udev event when it knows
+it needs a new regulatory domain. A udev rule can be put in place
+to trigger crda to send the respective regulatory domain for a
+specific ISO/IEC 3166 alpha2.
+
+Below is an example udev rule which can be used:
+
+# Example file, should be put in /etc/udev/rules.d/regulatory.rules
+KERNEL=="regulatory*", ACTION=="change", SUBSYSTEM=="platform", RUN+="/sbin/crda"
+
+The alpha2 is passed as an environment variable under the variable COUNTRY.
+
+Who asks for regulatory domains?
+--------------------------------
+
+* Users
+
+Users can use iw:
+
+http://wireless.kernel.org/en/users/Documentation/iw
+
+An example:
+
+  # set regulatory domain to "Costa Rica"
+  iw reg set CR
+
+This will request the kernel to set the regulatory domain to
+the specificied alpha2. The kernel in turn will then ask userspace
+to provide a regulatory domain for the alpha2 specified by the user
+by sending a uevent.
+
+* Wireless subsystems for Country Information elements
+
+The kernel will send a uevent to inform userspace a new
+regulatory domain is required. More on this to be added
+as its integration is added.
+
+* Drivers
+
+If drivers determine they need a specific regulatory domain
+set they can inform the wireless core using regulatory_hint().
+They have two options -- they either provide an alpha2 so that
+crda can provide back a regulatory domain for that country or
+they can build their own regulatory domain based on internal
+custom knowledge so the wireless core can respect it.
+
+*Most* drivers will rely on the first mechanism of providing a
+regulatory hint with an alpha2. For these drivers there is an additional
+check that can be used to ensure compliance based on custom EEPROM
+regulatory data. This additional check can be used by drivers by
+registering on its struct wiphy a reg_notifier() callback. This notifier
+is called when the core's regulatory domain has been changed. The driver
+can use this to review the changes made and also review who made them
+(driver, user, country IE) and determine what to allow based on its
+internal EEPROM data. Devices drivers wishing to be capable of world
+roaming should use this callback. More on world roaming will be
+added to this document when its support is enabled.
+
+Device drivers who provide their own built regulatory domain
+do not need a callback as the channels registered by them are
+the only ones that will be allowed and therefore *additional*
+channels cannot be enabled.
+
+Example code - drivers hinting an alpha2:
+------------------------------------------
+
+This example comes from the zd1211rw device driver. You can start
+by having a mapping of your device's EEPROM country/regulatory
+domain value to a specific alpha2 as follows:
+
+static struct zd_reg_alpha2_map reg_alpha2_map[] = {
+	{ ZD_REGDOMAIN_FCC, "US" },
+	{ ZD_REGDOMAIN_IC, "CA" },
+	{ ZD_REGDOMAIN_ETSI, "DE" }, /* Generic ETSI, use most restrictive */
+	{ ZD_REGDOMAIN_JAPAN, "JP" },
+	{ ZD_REGDOMAIN_JAPAN_ADD, "JP" },
+	{ ZD_REGDOMAIN_SPAIN, "ES" },
+	{ ZD_REGDOMAIN_FRANCE, "FR" },
+
+Then you can define a routine to map your read EEPROM value to an alpha2,
+as follows:
+
+static int zd_reg2alpha2(u8 regdomain, char *alpha2)
+{
+	unsigned int i;
+	struct zd_reg_alpha2_map *reg_map;
+		for (i = 0; i < ARRAY_SIZE(reg_alpha2_map); i++) {
+			reg_map = &reg_alpha2_map[i];
+			if (regdomain == reg_map->reg) {
+			alpha2[0] = reg_map->alpha2[0];
+			alpha2[1] = reg_map->alpha2[1];
+			return 0;
+		}
+	}
+	return 1;
+}
+
+Lastly, you can then hint to the core of your discovered alpha2, if a match
+was found. You need to do this after you have registered your wiphy. You
+are expected to do this during initialization.
+
+	r = zd_reg2alpha2(mac->regdomain, alpha2);
+	if (!r)
+		regulatory_hint(hw->wiphy, alpha2);
+
+Example code - drivers providing a built in regulatory domain:
+--------------------------------------------------------------
+
+[NOTE: This API is not currently available, it can be added when required]
+
+If you have regulatory information you can obtain from your
+driver and you *need* to use this we let you build a regulatory domain
+structure and pass it to the wireless core. To do this you should
+kmalloc() a structure big enough to hold your regulatory domain
+structure and you should then fill it with your data. Finally you simply
+call regulatory_hint() with the regulatory domain structure in it.
+
+Bellow is a simple example, with a regulatory domain cached using the stack.
+Your implementation may vary (read EEPROM cache instead, for example).
+
+Example cache of some regulatory domain
+
+struct ieee80211_regdomain mydriver_jp_regdom = {
+	.n_reg_rules = 3,
+	.alpha2 =  "JP",
+	//.alpha2 =  "99", /* If I have no alpha2 to map it to */
+	.reg_rules = {
+		/* IEEE 802.11b/g, channels 1..14 */
+		REG_RULE(2412-10, 2484+10, 40, 6, 20, 0),
+		/* IEEE 802.11a, channels 34..48 */
+		REG_RULE(5170-10, 5240+10, 40, 6, 20,
+			NL80211_RRF_NO_IR),
+		/* IEEE 802.11a, channels 52..64 */
+		REG_RULE(5260-10, 5320+10, 40, 6, 20,
+			NL80211_RRF_NO_IR|
+			NL80211_RRF_DFS),
+	}
+};
+
+Then in some part of your code after your wiphy has been registered:
+
+	struct ieee80211_regdomain *rd;
+	int size_of_regd;
+	int num_rules = mydriver_jp_regdom.n_reg_rules;
+	unsigned int i;
+
+	size_of_regd = sizeof(struct ieee80211_regdomain) +
+		(num_rules * sizeof(struct ieee80211_reg_rule));
+
+	rd = kzalloc(size_of_regd, GFP_KERNEL);
+	if (!rd)
+		return -ENOMEM;
+
+	memcpy(rd, &mydriver_jp_regdom, sizeof(struct ieee80211_regdomain));
+
+	for (i=0; i < num_rules; i++)
+		memcpy(&rd->reg_rules[i],
+		       &mydriver_jp_regdom.reg_rules[i],
+		       sizeof(struct ieee80211_reg_rule));
+	regulatory_struct_hint(rd);
+
+Statically compiled regulatory database
+---------------------------------------
+
+When a database should be fixed into the kernel, it can be provided as a
+firmware file at build time that is then linked into the kernel.
diff --git a/Documentation/networking/rmnet.txt b/Documentation/networking/rmnet.txt
new file mode 100644
index 000000000..6b341eaf2
--- /dev/null
+++ b/Documentation/networking/rmnet.txt
@@ -0,0 +1,82 @@
+1. Introduction
+
+rmnet driver is used for supporting the Multiplexing and aggregation
+Protocol (MAP). This protocol is used by all recent chipsets using Qualcomm
+Technologies, Inc. modems.
+
+This driver can be used to register onto any physical network device in
+IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator.
+
+Multiplexing allows for creation of logical netdevices (rmnet devices) to
+handle multiple private data networks (PDN) like a default internet, tethering,
+multimedia messaging service (MMS) or IP media subsystem (IMS). Hardware sends
+packets with MAP headers to rmnet. Based on the multiplexer id, rmnet
+routes to the appropriate PDN after removing the MAP header.
+
+Aggregation is required to achieve high data rates. This involves hardware
+sending aggregated bunch of MAP frames. rmnet driver will de-aggregate
+these MAP frames and send them to appropriate PDN's.
+
+2. Packet format
+
+a. MAP packet (data / control)
+
+MAP header has the same endianness of the IP packet.
+
+Packet format -
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command / Data   Reserved     Pad   Multiplexer ID    Payload length
+Bit            32 - x
+Function     Raw  Bytes
+
+Command (1)/ Data (0) bit value is to indicate if the packet is a MAP command
+or data packet. Control packet is used for transport level flow control. Data
+packets are standard IP packets.
+
+Reserved bits are usually zeroed out and to be ignored by receiver.
+
+Padding is number of bytes to be added for 4 byte alignment if required by
+hardware.
+
+Multiplexer ID is to indicate the PDN on which data has to be sent.
+
+Payload length includes the padding length but does not include MAP header
+length.
+
+b. MAP packet (command specific)
+
+Bit             0             1           2-7      8 - 15           16 - 31
+Function   Command         Reserved     Pad   Multiplexer ID    Payload length
+Bit          32 - 39        40 - 45    46 - 47       48 - 63
+Function   Command name    Reserved   Command Type   Reserved
+Bit          64 - 95
+Function   Transaction ID
+Bit          96 - 127
+Function   Command data
+
+Command 1 indicates disabling flow while 2 is enabling flow
+
+Command types -
+0 for MAP command request
+1 is to acknowledge the receipt of a command
+2 is for unsupported commands
+3 is for error during processing of commands
+
+c. Aggregation
+
+Aggregation is multiple MAP packets (can be data or command) delivered to
+rmnet in a single linear skb. rmnet will process the individual
+packets and either ACK the MAP command or deliver the IP packet to the
+network stack as needed
+
+MAP header|IP Packet|Optional padding|MAP header|IP Packet|Optional padding....
+MAP header|IP Packet|Optional padding|MAP header|Command Packet|Optional pad...
+
+3. Userspace configuration
+
+rmnet userspace configuration is done through netlink library librmnetctl
+and command line utility rmnetcli. Utility is hosted in codeaurora forum git.
+The driver uses rtnl_link_ops for communication.
+
+https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/dataservices/tree/rmnetctl
diff --git a/Documentation/networking/rxrpc.txt b/Documentation/networking/rxrpc.txt
new file mode 100644
index 000000000..b5407163d
--- /dev/null
+++ b/Documentation/networking/rxrpc.txt
@@ -0,0 +1,1149 @@
+			    ======================
+			    RxRPC NETWORK PROTOCOL
+			    ======================
+
+The RxRPC protocol driver provides a reliable two-phase transport on top of UDP
+that can be used to perform RxRPC remote operations.  This is done over sockets
+of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and
+receive data, aborts and errors.
+
+Contents of this document:
+
+ (*) Overview.
+
+ (*) RxRPC protocol summary.
+
+ (*) AF_RXRPC driver model.
+
+ (*) Control messages.
+
+ (*) Socket options.
+
+ (*) Security.
+
+ (*) Example client usage.
+
+ (*) Example server usage.
+
+ (*) AF_RXRPC kernel interface.
+
+ (*) Configurable parameters.
+
+
+========
+OVERVIEW
+========
+
+RxRPC is a two-layer protocol.  There is a session layer which provides
+reliable virtual connections using UDP over IPv4 (or IPv6) as the transport
+layer, but implements a real network protocol; and there's the presentation
+layer which renders structured data to binary blobs and back again using XDR
+(as does SunRPC):
+
+		+-------------+
+		| Application |
+		+-------------+
+		|     XDR     |		Presentation
+		+-------------+
+		|    RxRPC    |		Session
+		+-------------+
+		|     UDP     |		Transport
+		+-------------+
+
+
+AF_RXRPC provides:
+
+ (1) Part of an RxRPC facility for both kernel and userspace applications by
+     making the session part of it a Linux network protocol (AF_RXRPC).
+
+ (2) A two-phase protocol.  The client transmits a blob (the request) and then
+     receives a blob (the reply), and the server receives the request and then
+     transmits the reply.
+
+ (3) Retention of the reusable bits of the transport system set up for one call
+     to speed up subsequent calls.
+
+ (4) A secure protocol, using the Linux kernel's key retention facility to
+     manage security on the client end.  The server end must of necessity be
+     more active in security negotiations.
+
+AF_RXRPC does not provide XDR marshalling/presentation facilities.  That is
+left to the application.  AF_RXRPC only deals in blobs.  Even the operation ID
+is just the first four bytes of the request blob, and as such is beyond the
+kernel's interest.
+
+
+Sockets of AF_RXRPC family are:
+
+ (1) created as type SOCK_DGRAM;
+
+ (2) provided with a protocol of the type of underlying transport they're going
+     to use - currently only PF_INET is supported.
+
+
+The Andrew File System (AFS) is an example of an application that uses this and
+that has both kernel (filesystem) and userspace (utility) components.
+
+
+======================
+RXRPC PROTOCOL SUMMARY
+======================
+
+An overview of the RxRPC protocol:
+
+ (*) RxRPC sits on top of another networking protocol (UDP is the only option
+     currently), and uses this to provide network transport.  UDP ports, for
+     example, provide transport endpoints.
+
+ (*) RxRPC supports multiple virtual "connections" from any given transport
+     endpoint, thus allowing the endpoints to be shared, even to the same
+     remote endpoint.
+
+ (*) Each connection goes to a particular "service".  A connection may not go
+     to multiple services.  A service may be considered the RxRPC equivalent of
+     a port number.  AF_RXRPC permits multiple services to share an endpoint.
+
+ (*) Client-originating packets are marked, thus a transport endpoint can be
+     shared between client and server connections (connections have a
+     direction).
+
+ (*) Up to a billion connections may be supported concurrently between one
+     local transport endpoint and one service on one remote endpoint.  An RxRPC
+     connection is described by seven numbers:
+
+	Local address	}
+	Local port	} Transport (UDP) address
+	Remote address	}
+	Remote port	}
+	Direction
+	Connection ID
+	Service ID
+
+ (*) Each RxRPC operation is a "call".  A connection may make up to four
+     billion calls, but only up to four calls may be in progress on a
+     connection at any one time.
+
+ (*) Calls are two-phase and asymmetric: the client sends its request data,
+     which the service receives; then the service transmits the reply data
+     which the client receives.
+
+ (*) The data blobs are of indefinite size, the end of a phase is marked with a
+     flag in the packet.  The number of packets of data making up one blob may
+     not exceed 4 billion, however, as this would cause the sequence number to
+     wrap.
+
+ (*) The first four bytes of the request data are the service operation ID.
+
+ (*) Security is negotiated on a per-connection basis.  The connection is
+     initiated by the first data packet on it arriving.  If security is
+     requested, the server then issues a "challenge" and then the client
+     replies with a "response".  If the response is successful, the security is
+     set for the lifetime of that connection, and all subsequent calls made
+     upon it use that same security.  In the event that the server lets a
+     connection lapse before the client, the security will be renegotiated if
+     the client uses the connection again.
+
+ (*) Calls use ACK packets to handle reliability.  Data packets are also
+     explicitly sequenced per call.
+
+ (*) There are two types of positive acknowledgment: hard-ACKs and soft-ACKs.
+     A hard-ACK indicates to the far side that all the data received to a point
+     has been received and processed; a soft-ACK indicates that the data has
+     been received but may yet be discarded and re-requested.  The sender may
+     not discard any transmittable packets until they've been hard-ACK'd.
+
+ (*) Reception of a reply data packet implicitly hard-ACK's all the data
+     packets that make up the request.
+
+ (*) An call is complete when the request has been sent, the reply has been
+     received and the final hard-ACK on the last packet of the reply has
+     reached the server.
+
+ (*) An call may be aborted by either end at any time up to its completion.
+
+
+=====================
+AF_RXRPC DRIVER MODEL
+=====================
+
+About the AF_RXRPC driver:
+
+ (*) The AF_RXRPC protocol transparently uses internal sockets of the transport
+     protocol to represent transport endpoints.
+
+ (*) AF_RXRPC sockets map onto RxRPC connection bundles.  Actual RxRPC
+     connections are handled transparently.  One client socket may be used to
+     make multiple simultaneous calls to the same service.  One server socket
+     may handle calls from many clients.
+
+ (*) Additional parallel client connections will be initiated to support extra
+     concurrent calls, up to a tunable limit.
+
+ (*) Each connection is retained for a certain amount of time [tunable] after
+     the last call currently using it has completed in case a new call is made
+     that could reuse it.
+
+ (*) Each internal UDP socket is retained [tunable] for a certain amount of
+     time [tunable] after the last connection using it discarded, in case a new
+     connection is made that could use it.
+
+ (*) A client-side connection is only shared between calls if they have have
+     the same key struct describing their security (and assuming the calls
+     would otherwise share the connection).  Non-secured calls would also be
+     able to share connections with each other.
+
+ (*) A server-side connection is shared if the client says it is.
+
+ (*) ACK'ing is handled by the protocol driver automatically, including ping
+     replying.
+
+ (*) SO_KEEPALIVE automatically pings the other side to keep the connection
+     alive [TODO].
+
+ (*) If an ICMP error is received, all calls affected by that error will be
+     aborted with an appropriate network error passed through recvmsg().
+
+
+Interaction with the user of the RxRPC socket:
+
+ (*) A socket is made into a server socket by binding an address with a
+     non-zero service ID.
+
+ (*) In the client, sending a request is achieved with one or more sendmsgs,
+     followed by the reply being received with one or more recvmsgs.
+
+ (*) The first sendmsg for a request to be sent from a client contains a tag to
+     be used in all other sendmsgs or recvmsgs associated with that call.  The
+     tag is carried in the control data.
+
+ (*) connect() is used to supply a default destination address for a client
+     socket.  This may be overridden by supplying an alternate address to the
+     first sendmsg() of a call (struct msghdr::msg_name).
+
+ (*) If connect() is called on an unbound client, a random local port will
+     bound before the operation takes place.
+
+ (*) A server socket may also be used to make client calls.  To do this, the
+     first sendmsg() of the call must specify the target address.  The server's
+     transport endpoint is used to send the packets.
+
+ (*) Once the application has received the last message associated with a call,
+     the tag is guaranteed not to be seen again, and so it can be used to pin
+     client resources.  A new call can then be initiated with the same tag
+     without fear of interference.
+
+ (*) In the server, a request is received with one or more recvmsgs, then the
+     the reply is transmitted with one or more sendmsgs, and then the final ACK
+     is received with a last recvmsg.
+
+ (*) When sending data for a call, sendmsg is given MSG_MORE if there's more
+     data to come on that call.
+
+ (*) When receiving data for a call, recvmsg flags MSG_MORE if there's more
+     data to come for that call.
+
+ (*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg
+     to indicate the terminal message for that call.
+
+ (*) A call may be aborted by adding an abort control message to the control
+     data.  Issuing an abort terminates the kernel's use of that call's tag.
+     Any messages waiting in the receive queue for that call will be discarded.
+
+ (*) Aborts, busy notifications and challenge packets are delivered by recvmsg,
+     and control data messages will be set to indicate the context.  Receiving
+     an abort or a busy message terminates the kernel's use of that call's tag.
+
+ (*) The control data part of the msghdr struct is used for a number of things:
+
+     (*) The tag of the intended or affected call.
+
+     (*) Sending or receiving errors, aborts and busy notifications.
+
+     (*) Notifications of incoming calls.
+
+     (*) Sending debug requests and receiving debug replies [TODO].
+
+ (*) When the kernel has received and set up an incoming call, it sends a
+     message to server application to let it know there's a new call awaiting
+     its acceptance [recvmsg reports a special control message].  The server
+     application then uses sendmsg to assign a tag to the new call.  Once that
+     is done, the first part of the request data will be delivered by recvmsg.
+
+ (*) The server application has to provide the server socket with a keyring of
+     secret keys corresponding to the security types it permits.  When a secure
+     connection is being set up, the kernel looks up the appropriate secret key
+     in the keyring and then sends a challenge packet to the client and
+     receives a response packet.  The kernel then checks the authorisation of
+     the packet and either aborts the connection or sets up the security.
+
+ (*) The name of the key a client will use to secure its communications is
+     nominated by a socket option.
+
+
+Notes on sendmsg:
+
+ (*) MSG_WAITALL can be set to tell sendmsg to ignore signals if the peer is
+     making progress at accepting packets within a reasonable time such that we
+     manage to queue up all the data for transmission.  This requires the
+     client to accept at least one packet per 2*RTT time period.
+
+     If this isn't set, sendmsg() will return immediately, either returning
+     EINTR/ERESTARTSYS if nothing was consumed or returning the amount of data
+     consumed.
+
+
+Notes on recvmsg:
+
+ (*) If there's a sequence of data messages belonging to a particular call on
+     the receive queue, then recvmsg will keep working through them until:
+
+     (a) it meets the end of that call's received data,
+
+     (b) it meets a non-data message,
+
+     (c) it meets a message belonging to a different call, or
+
+     (d) it fills the user buffer.
+
+     If recvmsg is called in blocking mode, it will keep sleeping, awaiting the
+     reception of further data, until one of the above four conditions is met.
+
+ (2) MSG_PEEK operates similarly, but will return immediately if it has put any
+     data in the buffer rather than sleeping until it can fill the buffer.
+
+ (3) If a data message is only partially consumed in filling a user buffer,
+     then the remainder of that message will be left on the front of the queue
+     for the next taker.  MSG_TRUNC will never be flagged.
+
+ (4) If there is more data to be had on a call (it hasn't copied the last byte
+     of the last data message in that phase yet), then MSG_MORE will be
+     flagged.
+
+
+================
+CONTROL MESSAGES
+================
+
+AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex
+calls, to invoke certain actions and to report certain conditions.  These are:
+
+	MESSAGE ID		SRT DATA	MEANING
+	=======================	=== ===========	===============================
+	RXRPC_USER_CALL_ID	sr- User ID	App's call specifier
+	RXRPC_ABORT		srt Abort code	Abort code to issue/received
+	RXRPC_ACK		-rt n/a		Final ACK received
+	RXRPC_NET_ERROR		-rt error num	Network error on call
+	RXRPC_BUSY		-rt n/a		Call rejected (server busy)
+	RXRPC_LOCAL_ERROR	-rt error num	Local error encountered
+	RXRPC_NEW_CALL		-r- n/a		New call received
+	RXRPC_ACCEPT		s-- n/a		Accept new call
+	RXRPC_EXCLUSIVE_CALL	s-- n/a		Make an exclusive client call
+	RXRPC_UPGRADE_SERVICE	s-- n/a		Client call can be upgraded
+	RXRPC_TX_LENGTH		s-- data len	Total length of Tx data
+
+	(SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message)
+
+ (*) RXRPC_USER_CALL_ID
+
+     This is used to indicate the application's call ID.  It's an unsigned long
+     that the app specifies in the client by attaching it to the first data
+     message or in the server by passing it in association with an RXRPC_ACCEPT
+     message.  recvmsg() passes it in conjunction with all messages except
+     those of the RXRPC_NEW_CALL message.
+
+ (*) RXRPC_ABORT
+
+     This is can be used by an application to abort a call by passing it to
+     sendmsg, or it can be delivered by recvmsg to indicate a remote abort was
+     received.  Either way, it must be associated with an RXRPC_USER_CALL_ID to
+     specify the call affected.  If an abort is being sent, then error EBADSLT
+     will be returned if there is no call with that user ID.
+
+ (*) RXRPC_ACK
+
+     This is delivered to a server application to indicate that the final ACK
+     of a call was received from the client.  It will be associated with an
+     RXRPC_USER_CALL_ID to indicate the call that's now complete.
+
+ (*) RXRPC_NET_ERROR
+
+     This is delivered to an application to indicate that an ICMP error message
+     was encountered in the process of trying to talk to the peer.  An
+     errno-class integer value will be included in the control message data
+     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
+     affected.
+
+ (*) RXRPC_BUSY
+
+     This is delivered to a client application to indicate that a call was
+     rejected by the server due to the server being busy.  It will be
+     associated with an RXRPC_USER_CALL_ID to indicate the rejected call.
+
+ (*) RXRPC_LOCAL_ERROR
+
+     This is delivered to an application to indicate that a local error was
+     encountered and that a call has been aborted because of it.  An
+     errno-class integer value will be included in the control message data
+     indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call
+     affected.
+
+ (*) RXRPC_NEW_CALL
+
+     This is delivered to indicate to a server application that a new call has
+     arrived and is awaiting acceptance.  No user ID is associated with this,
+     as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT.
+
+ (*) RXRPC_ACCEPT
+
+     This is used by a server application to attempt to accept a call and
+     assign it a user ID.  It should be associated with an RXRPC_USER_CALL_ID
+     to indicate the user ID to be assigned.  If there is no call to be
+     accepted (it may have timed out, been aborted, etc.), then sendmsg will
+     return error ENODATA.  If the user ID is already in use by another call,
+     then error EBADSLT will be returned.
+
+ (*) RXRPC_EXCLUSIVE_CALL
+
+     This is used to indicate that a client call should be made on a one-off
+     connection.  The connection is discarded once the call has terminated.
+
+ (*) RXRPC_UPGRADE_SERVICE
+
+     This is used to make a client call to probe if the specified service ID
+     may be upgraded by the server.  The caller must check msg_name returned to
+     recvmsg() for the service ID actually in use.  The operation probed must
+     be one that takes the same arguments in both services.
+
+     Once this has been used to establish the upgrade capability (or lack
+     thereof) of the server, the service ID returned should be used for all
+     future communication to that server and RXRPC_UPGRADE_SERVICE should no
+     longer be set.
+
+ (*) RXRPC_TX_LENGTH
+
+     This is used to inform the kernel of the total amount of data that is
+     going to be transmitted by a call (whether in a client request or a
+     service response).  If given, it allows the kernel to encrypt from the
+     userspace buffer directly to the packet buffers, rather than copying into
+     the buffer and then encrypting in place.  This may only be given with the
+     first sendmsg() providing data for a call.  EMSGSIZE will be generated if
+     the amount of data actually given is different.
+
+     This takes a parameter of __s64 type that indicates how much will be
+     transmitted.  This may not be less than zero.
+
+The symbol RXRPC__SUPPORTED is defined as one more than the highest control
+message type supported.  At run time this can be queried by means of the
+RXRPC_SUPPORTED_CMSG socket option (see below).
+
+
+==============
+SOCKET OPTIONS
+==============
+
+AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
+
+ (*) RXRPC_SECURITY_KEY
+
+     This is used to specify the description of the key to be used.  The key is
+     extracted from the calling process's keyrings with request_key() and
+     should be of "rxrpc" type.
+
+     The optval pointer points to the description string, and optlen indicates
+     how long the string is, without the NUL terminator.
+
+ (*) RXRPC_SECURITY_KEYRING
+
+     Similar to above but specifies a keyring of server secret keys to use (key
+     type "keyring").  See the "Security" section.
+
+ (*) RXRPC_EXCLUSIVE_CONNECTION
+
+     This is used to request that new connections should be used for each call
+     made subsequently on this socket.  optval should be NULL and optlen 0.
+
+ (*) RXRPC_MIN_SECURITY_LEVEL
+
+     This is used to specify the minimum security level required for calls on
+     this socket.  optval must point to an int containing one of the following
+     values:
+
+     (a) RXRPC_SECURITY_PLAIN
+
+	 Encrypted checksum only.
+
+     (b) RXRPC_SECURITY_AUTH
+
+	 Encrypted checksum plus packet padded and first eight bytes of packet
+	 encrypted - which includes the actual packet length.
+
+     (c) RXRPC_SECURITY_ENCRYPTED
+
+	 Encrypted checksum plus entire packet padded and encrypted, including
+	 actual packet length.
+
+ (*) RXRPC_UPGRADEABLE_SERVICE
+
+     This is used to indicate that a service socket with two bindings may
+     upgrade one bound service to the other if requested by the client.  optval
+     must point to an array of two unsigned short ints.  The first is the
+     service ID to upgrade from and the second the service ID to upgrade to.
+
+ (*) RXRPC_SUPPORTED_CMSG
+
+     This is a read-only option that writes an int into the buffer indicating
+     the highest control message type supported.
+
+
+========
+SECURITY
+========
+
+Currently, only the kerberos 4 equivalent protocol has been implemented
+(security index 2 - rxkad).  This requires the rxkad module to be loaded and,
+on the client, tickets of the appropriate type to be obtained from the AFS
+kaserver or the kerberos server and installed as "rxrpc" type keys.  This is
+normally done using the klog program.  An example simple klog program can be
+found at:
+
+	http://people.redhat.com/~dhowells/rxrpc/klog.c
+
+The payload provided to add_key() on the client should be of the following
+form:
+
+	struct rxrpc_key_sec2_v1 {
+		uint16_t	security_index;	/* 2 */
+		uint16_t	ticket_length;	/* length of ticket[] */
+		uint32_t	expiry;		/* time at which expires */
+		uint8_t		kvno;		/* key version number */
+		uint8_t		__pad[3];
+		uint8_t		session_key[8];	/* DES session key */
+		uint8_t		ticket[0];	/* the encrypted ticket */
+	};
+
+Where the ticket blob is just appended to the above structure.
+
+
+For the server, keys of type "rxrpc_s" must be made available to the server.
+They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an
+rxkad key for the AFS VL service).  When such a key is created, it should be
+given the server's secret key as the instantiation data (see the example
+below).
+
+	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
+
+A keyring is passed to the server socket by naming it in a sockopt.  The server
+socket then looks the server secret keys up in this keyring when secure
+incoming connections are made.  This can be seen in an example program that can
+be found at:
+
+	http://people.redhat.com/~dhowells/rxrpc/listen.c
+
+
+====================
+EXAMPLE CLIENT USAGE
+====================
+
+A client would issue an operation by:
+
+ (1) An RxRPC socket is set up by:
+
+	client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
+
+     Where the third parameter indicates the protocol family of the transport
+     socket used - usually IPv4 but it can also be IPv6 [TODO].
+
+ (2) A local address can optionally be bound:
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= 0,  /* we're a client */
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport.sin_family	= AF_INET,
+		.transport.sin_port	= htons(7000), /* AFS callback */
+		.transport.sin_address	= 0,  /* all local interfaces */
+	};
+	bind(client, &srx, sizeof(srx));
+
+     This specifies the local UDP port to be used.  If not given, a random
+     non-privileged port will be used.  A UDP port may be shared between
+     several unrelated RxRPC sockets.  Security is handled on a basis of
+     per-RxRPC virtual connection.
+
+ (3) The security is set:
+
+	const char *key = "AFS:cambridge.redhat.com";
+	setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key));
+
+     This issues a request_key() to get the key representing the security
+     context.  The minimum security level can be set:
+
+	unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
+	setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
+		   &sec, sizeof(sec));
+
+ (4) The server to be contacted can then be specified (alternatively this can
+     be done through sendmsg):
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= VL_SERVICE_ID,
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport.sin_family	= AF_INET,
+		.transport.sin_port	= htons(7005), /* AFS volume manager */
+		.transport.sin_address	= ...,
+	};
+	connect(client, &srx, sizeof(srx));
+
+ (5) The request data should then be posted to the server socket using a series
+     of sendmsg() calls, each with the following control message attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+
+     MSG_MORE should be set in msghdr::msg_flags on all but the last part of
+     the request.  Multiple requests may be made simultaneously.
+
+     An RXRPC_TX_LENGTH control message can also be specified on the first
+     sendmsg() call.
+
+     If a call is intended to go to a destination other than the default
+     specified through connect(), then msghdr::msg_name should be set on the
+     first request message of that call.
+
+ (6) The reply data will then be posted to the server socket for recvmsg() to
+     pick up.  MSG_MORE will be flagged by recvmsg() if there's more reply data
+     for a particular call to be read.  MSG_EOR will be set on the terminal
+     read for a call.
+
+     All data will be delivered with the following control message attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+
+     If an abort or error occurred, this will be returned in the control data
+     buffer instead, and MSG_EOR will be flagged to indicate the end of that
+     call.
+
+A client may ask for a service ID it knows and ask that this be upgraded to a
+better service if one is available by supplying RXRPC_UPGRADE_SERVICE on the
+first sendmsg() of a call.  The client should then check srx_service in the
+msg_name filled in by recvmsg() when collecting the result.  srx_service will
+hold the same value as given to sendmsg() if the upgrade request was ignored by
+the service - otherwise it will be altered to indicate the service ID the
+server upgraded to.  Note that the upgraded service ID is chosen by the server.
+The caller has to wait until it sees the service ID in the reply before sending
+any more calls (further calls to the same destination will be blocked until the
+probe is concluded).
+
+
+====================
+EXAMPLE SERVER USAGE
+====================
+
+A server would be set up to accept operations in the following manner:
+
+ (1) An RxRPC socket is created by:
+
+	server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);
+
+     Where the third parameter indicates the address type of the transport
+     socket used - usually IPv4.
+
+ (2) Security is set up if desired by giving the socket a keyring with server
+     secret keys in it:
+
+	keyring = add_key("keyring", "AFSkeys", NULL, 0,
+			  KEY_SPEC_PROCESS_KEYRING);
+
+	const char secret_key[8] = {
+		0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 };
+	add_key("rxrpc_s", "52:2", secret_key, 8, keyring);
+
+	setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7);
+
+     The keyring can be manipulated after it has been given to the socket. This
+     permits the server to add more keys, replace keys, etc. whilst it is live.
+
+ (3) A local address must then be bound:
+
+	struct sockaddr_rxrpc srx = {
+		.srx_family	= AF_RXRPC,
+		.srx_service	= VL_SERVICE_ID, /* RxRPC service ID */
+		.transport_type	= SOCK_DGRAM,	/* type of transport socket */
+		.transport.sin_family	= AF_INET,
+		.transport.sin_port	= htons(7000), /* AFS callback */
+		.transport.sin_address	= 0,  /* all local interfaces */
+	};
+	bind(server, &srx, sizeof(srx));
+
+     More than one service ID may be bound to a socket, provided the transport
+     parameters are the same.  The limit is currently two.  To do this, bind()
+     should be called twice.
+
+ (4) If service upgrading is required, first two service IDs must have been
+     bound and then the following option must be set:
+
+	unsigned short service_ids[2] = { from_ID, to_ID };
+	setsockopt(server, SOL_RXRPC, RXRPC_UPGRADEABLE_SERVICE,
+		   service_ids, sizeof(service_ids));
+
+     This will automatically upgrade connections on service from_ID to service
+     to_ID if they request it.  This will be reflected in msg_name obtained
+     through recvmsg() when the request data is delivered to userspace.
+
+ (5) The server is then set to listen out for incoming calls:
+
+	listen(server, 100);
+
+ (6) The kernel notifies the server of pending incoming connections by sending
+     it a message for each.  This is received with recvmsg() on the server
+     socket.  It has no data, and has a single dataless control message
+     attached:
+
+	RXRPC_NEW_CALL
+
+     The address that can be passed back by recvmsg() at this point should be
+     ignored since the call for which the message was posted may have gone by
+     the time it is accepted - in which case the first call still on the queue
+     will be accepted.
+
+ (7) The server then accepts the new call by issuing a sendmsg() with two
+     pieces of control data and no actual data:
+
+	RXRPC_ACCEPT		- indicate connection acceptance
+	RXRPC_USER_CALL_ID	- specify user ID for this call
+
+ (8) The first request data packet will then be posted to the server socket for
+     recvmsg() to pick up.  At that point, the RxRPC address for the call can
+     be read from the address fields in the msghdr struct.
+
+     Subsequent request data will be posted to the server socket for recvmsg()
+     to collect as it arrives.  All but the last piece of the request data will
+     be delivered with MSG_MORE flagged.
+
+     All data will be delivered with the following control message attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+
+ (9) The reply data should then be posted to the server socket using a series
+     of sendmsg() calls, each with the following control messages attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+
+     MSG_MORE should be set in msghdr::msg_flags on all but the last message
+     for a particular call.
+
+(10) The final ACK from the client will be posted for retrieval by recvmsg()
+     when it is received.  It will take the form of a dataless message with two
+     control messages attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+	RXRPC_ACK		- indicates final ACK (no data)
+
+     MSG_EOR will be flagged to indicate that this is the final message for
+     this call.
+
+(11) Up to the point the final packet of reply data is sent, the call can be
+     aborted by calling sendmsg() with a dataless message with the following
+     control messages attached:
+
+	RXRPC_USER_CALL_ID	- specifies the user ID for this call
+	RXRPC_ABORT		- indicates abort code (4 byte data)
+
+     Any packets waiting in the socket's receive queue will be discarded if
+     this is issued.
+
+Note that all the communications for a particular service take place through
+the one server socket, using control messages on sendmsg() and recvmsg() to
+determine the call affected.
+
+
+=========================
+AF_RXRPC KERNEL INTERFACE
+=========================
+
+The AF_RXRPC module also provides an interface for use by in-kernel utilities
+such as the AFS filesystem.  This permits such a utility to:
+
+ (1) Use different keys directly on individual client calls on one socket
+     rather than having to open a whole slew of sockets, one for each key it
+     might want to use.
+
+ (2) Avoid having RxRPC call request_key() at the point of issue of a call or
+     opening of a socket.  Instead the utility is responsible for requesting a
+     key at the appropriate point.  AFS, for instance, would do this during VFS
+     operations such as open() or unlink().  The key is then handed through
+     when the call is initiated.
+
+ (3) Request the use of something other than GFP_KERNEL to allocate memory.
+
+ (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be
+     intercepted before they get put into the socket Rx queue and the socket
+     buffers manipulated directly.
+
+To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
+bind an address as appropriate and listen if it's to be a server socket, but
+then it passes this to the kernel interface functions.
+
+The kernel interface functions are as follows:
+
+ (*) Begin a new client call.
+
+	struct rxrpc_call *
+	rxrpc_kernel_begin_call(struct socket *sock,
+				struct sockaddr_rxrpc *srx,
+				struct key *key,
+				unsigned long user_call_ID,
+				s64 tx_total_len,
+				gfp_t gfp,
+				rxrpc_notify_rx_t notify_rx,
+				bool upgrade);
+
+     This allocates the infrastructure to make a new RxRPC call and assigns
+     call and connection numbers.  The call will be made on the UDP port that
+     the socket is bound to.  The call will go to the destination address of a
+     connected client socket unless an alternative is supplied (srx is
+     non-NULL).
+
+     If a key is supplied then this will be used to secure the call instead of
+     the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls
+     secured in this way will still share connections if at all possible.
+
+     The user_call_ID is equivalent to that supplied to sendmsg() in the
+     control data buffer.  It is entirely feasible to use this to point to a
+     kernel data structure.
+
+     tx_total_len is the amount of data the caller is intending to transmit
+     with this call (or -1 if unknown at this point).  Setting the data size
+     allows the kernel to encrypt directly to the packet buffers, thereby
+     saving a copy.  The value may not be less than -1.
+
+     notify_rx is a pointer to a function to be called when events such as
+     incoming data packets or remote aborts happen.
+
+     upgrade should be set to true if a client operation should request that
+     the server upgrade the service to a better one.  The resultant service ID
+     is returned by rxrpc_kernel_recv_data().
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (*) End a client call.
+
+	void rxrpc_kernel_end_call(struct socket *sock,
+				   struct rxrpc_call *call);
+
+     This is used to end a previously begun call.  The user_call_ID is expunged
+     from AF_RXRPC's knowledge and will not be seen again in association with
+     the specified call.
+
+ (*) Send data through a call.
+
+	typedef void (*rxrpc_notify_end_tx_t)(struct sock *sk,
+					      unsigned long user_call_ID,
+					      struct sk_buff *skb);
+
+	int rxrpc_kernel_send_data(struct socket *sock,
+				   struct rxrpc_call *call,
+				   struct msghdr *msg,
+				   size_t len,
+				   rxrpc_notify_end_tx_t notify_end_rx);
+
+     This is used to supply either the request part of a client call or the
+     reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the
+     data buffers to be used.  msg_iov may not be NULL and must point
+     exclusively to in-kernel virtual addresses.  msg.msg_flags may be given
+     MSG_MORE if there will be subsequent data sends for this call.
+
+     The msg must not specify a destination address, control data or any flags
+     other than MSG_MORE.  len is the total amount of data to transmit.
+
+     notify_end_rx can be NULL or it can be used to specify a function to be
+     called when the call changes state to end the Tx phase.  This function is
+     called with the call-state spinlock held to prevent any reply or final ACK
+     from being delivered first.
+
+ (*) Receive data from a call.
+
+	int rxrpc_kernel_recv_data(struct socket *sock,
+				   struct rxrpc_call *call,
+				   void *buf,
+				   size_t size,
+				   size_t *_offset,
+				   bool want_more,
+				   u32 *_abort,
+				   u16 *_service)
+
+      This is used to receive data from either the reply part of a client call
+      or the request part of a service call.  buf and size specify how much
+      data is desired and where to store it.  *_offset is added on to buf and
+      subtracted from size internally; the amount copied into the buffer is
+      added to *_offset before returning.
+
+      want_more should be true if further data will be required after this is
+      satisfied and false if this is the last item of the receive phase.
+
+      There are three normal returns: 0 if the buffer was filled and want_more
+      was true; 1 if the buffer was filled, the last DATA packet has been
+      emptied and want_more was false; and -EAGAIN if the function needs to be
+      called again.
+
+      If the last DATA packet is processed but the buffer contains less than
+      the amount requested, EBADMSG is returned.  If want_more wasn't set, but
+      more data was available, EMSGSIZE is returned.
+
+      If a remote ABORT is detected, the abort code received will be stored in
+      *_abort and ECONNABORTED will be returned.
+
+      The service ID that the call ended up with is returned into *_service.
+      This can be used to see if a call got a service upgrade.
+
+ (*) Abort a call.
+
+	void rxrpc_kernel_abort_call(struct socket *sock,
+				     struct rxrpc_call *call,
+				     u32 abort_code);
+
+     This is used to abort a call if it's still in an abortable state.  The
+     abort code specified will be placed in the ABORT message sent.
+
+ (*) Intercept received RxRPC messages.
+
+	typedef void (*rxrpc_interceptor_t)(struct sock *sk,
+					    unsigned long user_call_ID,
+					    struct sk_buff *skb);
+
+	void
+	rxrpc_kernel_intercept_rx_messages(struct socket *sock,
+					   rxrpc_interceptor_t interceptor);
+
+     This installs an interceptor function on the specified AF_RXRPC socket.
+     All messages that would otherwise wind up in the socket's Rx queue are
+     then diverted to this function.  Note that care must be taken to process
+     the messages in the right order to maintain DATA message sequentiality.
+
+     The interceptor function itself is provided with the address of the socket
+     and handling the incoming message, the ID assigned by the kernel utility
+     to the call and the socket buffer containing the message.
+
+     The skb->mark field indicates the type of message:
+
+	MARK				MEANING
+	===============================	=======================================
+	RXRPC_SKB_MARK_DATA		Data message
+	RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call
+	RXRPC_SKB_MARK_BUSY		Client call rejected as server busy
+	RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer
+	RXRPC_SKB_MARK_NET_ERROR	Network error detected
+	RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered
+	RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance
+
+     The remote abort message can be probed with rxrpc_kernel_get_abort_code().
+     The two error messages can be probed with rxrpc_kernel_get_error_number().
+     A new call can be accepted with rxrpc_kernel_accept_call().
+
+     Data messages can have their contents extracted with the usual bunch of
+     socket buffer manipulation functions.  A data message can be determined to
+     be the last one in a sequence with rxrpc_kernel_is_data_last().  When a
+     data message has been used up, rxrpc_kernel_data_consumed() should be
+     called on it.
+
+     Messages should be handled to rxrpc_kernel_free_skb() to dispose of.  It
+     is possible to get extra refs on all types of message for later freeing,
+     but this may pin the state of a call until the message is finally freed.
+
+ (*) Accept an incoming call.
+
+	struct rxrpc_call *
+	rxrpc_kernel_accept_call(struct socket *sock,
+				 unsigned long user_call_ID);
+
+     This is used to accept an incoming call and to assign it a call ID.  This
+     function is similar to rxrpc_kernel_begin_call() and calls accepted must
+     be ended in the same way.
+
+     If this function is successful, an opaque reference to the RxRPC call is
+     returned.  The caller now holds a reference on this and it must be
+     properly ended.
+
+ (*) Reject an incoming call.
+
+	int rxrpc_kernel_reject_call(struct socket *sock);
+
+     This is used to reject the first incoming call on the socket's queue with
+     a BUSY message.  -ENODATA is returned if there were no incoming calls.
+     Other errors may be returned if the call had been aborted (-ECONNABORTED)
+     or had timed out (-ETIME).
+
+ (*) Allocate a null key for doing anonymous security.
+
+	struct key *rxrpc_get_null_key(const char *keyname);
+
+     This is used to allocate a null RxRPC key that can be used to indicate
+     anonymous security for a particular domain.
+
+ (*) Get the peer address of a call.
+
+	void rxrpc_kernel_get_peer(struct socket *sock, struct rxrpc_call *call,
+				   struct sockaddr_rxrpc *_srx);
+
+     This is used to find the remote peer address of a call.
+
+ (*) Set the total transmit data size on a call.
+
+	void rxrpc_kernel_set_tx_length(struct socket *sock,
+					struct rxrpc_call *call,
+					s64 tx_total_len);
+
+     This sets the amount of data that the caller is intending to transmit on a
+     call.  It's intended to be used for setting the reply size as the request
+     size should be set when the call is begun.  tx_total_len may not be less
+     than zero.
+
+ (*) Check to see the completion state of a call so that the caller can assess
+     whether it needs to be retried.
+
+	enum rxrpc_call_completion {
+		RXRPC_CALL_SUCCEEDED,
+		RXRPC_CALL_REMOTELY_ABORTED,
+		RXRPC_CALL_LOCALLY_ABORTED,
+		RXRPC_CALL_LOCAL_ERROR,
+		RXRPC_CALL_NETWORK_ERROR,
+	};
+
+	int rxrpc_kernel_check_call(struct socket *sock, struct rxrpc_call *call,
+				    enum rxrpc_call_completion *_compl,
+				    u32 *_abort_code);
+
+     On return, -EINPROGRESS will be returned if the call is still ongoing; if
+     it is finished, *_compl will be set to indicate the manner of completion,
+     *_abort_code will be set to any abort code that occurred.  0 will be
+     returned on a successful completion, -ECONNABORTED will be returned if the
+     client failed due to a remote abort and anything else will return an
+     appropriate error code.
+
+     The caller should look at this information to decide if it's worth
+     retrying the call.
+
+ (*) Retry a client call.
+
+	int rxrpc_kernel_retry_call(struct socket *sock,
+				    struct rxrpc_call *call,
+				    struct sockaddr_rxrpc *srx,
+				    struct key *key);
+
+     This attempts to partially reinitialise a call and submit it again whilst
+     reusing the original call's Tx queue to avoid the need to repackage and
+     re-encrypt the data to be sent.  call indicates the call to retry, srx the
+     new address to send it to and key the encryption key to use for signing or
+     encrypting the packets.
+
+     For this to work, the first Tx data packet must still be in the transmit
+     queue, and currently this is only permitted for local and network errors
+     and the call must not have been aborted.  Any partially constructed Tx
+     packet is left as is and can continue being filled afterwards.
+
+     It returns 0 if the call was requeued and an error otherwise.
+
+ (*) Get call RTT.
+
+	u64 rxrpc_kernel_get_rtt(struct socket *sock, struct rxrpc_call *call);
+
+     Get the RTT time to the peer in use by a call.  The value returned is in
+     nanoseconds.
+
+ (*) Check call still alive.
+
+	u32 rxrpc_kernel_check_life(struct socket *sock,
+				    struct rxrpc_call *call);
+
+     This returns a number that is updated when ACKs are received from the peer
+     (notably including PING RESPONSE ACKs which we can elicit by sending PING
+     ACKs to see if the call still exists on the server).  The caller should
+     compare the numbers of two calls to see if the call is still alive after
+     waiting for a suitable interval.
+
+     This allows the caller to work out if the server is still contactable and
+     if the call is still alive on the server whilst waiting for the server to
+     process a client operation.
+
+     This function may transmit a PING ACK.
+
+
+=======================
+CONFIGURABLE PARAMETERS
+=======================
+
+The RxRPC protocol driver has a number of configurable parameters that can be
+adjusted through sysctls in /proc/net/rxrpc/:
+
+ (*) req_ack_delay
+
+     The amount of time in milliseconds after receiving a packet with the
+     request-ack flag set before we honour the flag and actually send the
+     requested ack.
+
+     Usually the other side won't stop sending packets until the advertised
+     reception window is full (to a maximum of 255 packets), so delaying the
+     ACK permits several packets to be ACK'd in one go.
+
+ (*) soft_ack_delay
+
+     The amount of time in milliseconds after receiving a new packet before we
+     generate a soft-ACK to tell the sender that it doesn't need to resend.
+
+ (*) idle_ack_delay
+
+     The amount of time in milliseconds after all the packets currently in the
+     received queue have been consumed before we generate a hard-ACK to tell
+     the sender it can free its buffers, assuming no other reason occurs that
+     we would send an ACK.
+
+ (*) resend_timeout
+
+     The amount of time in milliseconds after transmitting a packet before we
+     transmit it again, assuming no ACK is received from the receiver telling
+     us they got it.
+
+ (*) max_call_lifetime
+
+     The maximum amount of time in seconds that a call may be in progress
+     before we preemptively kill it.
+
+ (*) dead_call_expiry
+
+     The amount of time in seconds before we remove a dead call from the call
+     list.  Dead calls are kept around for a little while for the purpose of
+     repeating ACK and ABORT packets.
+
+ (*) connection_expiry
+
+     The amount of time in seconds after a connection was last used before we
+     remove it from the connection list.  Whilst a connection is in existence,
+     it serves as a placeholder for negotiated security; when it is deleted,
+     the security must be renegotiated.
+
+ (*) transport_expiry
+
+     The amount of time in seconds after a transport was last used before we
+     remove it from the transport list.  Whilst a transport is in existence, it
+     serves to anchor the peer data and keeps the connection ID counter.
+
+ (*) rxrpc_rx_window_size
+
+     The size of the receive window in packets.  This is the maximum number of
+     unconsumed received packets we're willing to hold in memory for any
+     particular call.
+
+ (*) rxrpc_rx_mtu
+
+     The maximum packet MTU size that we're willing to receive in bytes.  This
+     indicates to the peer whether we're willing to accept jumbo packets.
+
+ (*) rxrpc_rx_jumbo_max
+
+     The maximum number of packets that we're willing to accept in a jumbo
+     packet.  Non-terminal packets in a jumbo packet must contain a four byte
+     header plus exactly 1412 bytes of data.  The terminal packet must contain
+     a four byte header plus any amount of data.  In any event, a jumbo packet
+     may not exceed rxrpc_rx_mtu in size.
diff --git a/Documentation/networking/s2io.txt b/Documentation/networking/s2io.txt
new file mode 100644
index 000000000..0362a42f7
--- /dev/null
+++ b/Documentation/networking/s2io.txt
@@ -0,0 +1,141 @@
+Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.
+
+Contents
+=======
+- 1.  Introduction
+- 2.  Identifying the adapter/interface
+- 3.  Features supported
+- 4.  Command line parameters
+- 5.  Performance suggestions
+- 6.  Available Downloads 
+
+
+1.	Introduction:
+This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
+Xframe II PCI-X 2.0 adapters. It supports several features 
+such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
+See below for complete list of features.
+All features are supported for both IPv4 and IPv6.
+
+2.	Identifying the adapter/interface:
+a. Insert the adapter(s) in your system.
+b. Build and load driver 
+# insmod s2io.ko
+c. View log messages
+# dmesg | tail -40
+You will see messages similar to:
+eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
+eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
+eth4: Device is on 64 bit 133MHz PCIX(M1) bus
+
+The above messages identify the adapter type(Xframe I/II), adapter revision,
+driver version, interface name(eth3, eth4), Interrupt type(INTA, MSI, MSI-X).
+In case of Xframe II, the PCI/PCI-X bus width and frequency are displayed
+as well.
+
+To associate an interface with a physical adapter use "ethtool -p <ethX>".
+The corresponding adapter's LED will blink multiple times.
+
+3.	Features supported:
+a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
+modifiable using ip command.
+
+b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit
+and receive, TSO.
+
+c. Multi-buffer receive mode. Scattering of packet across multiple
+buffers. Currently driver supports 2-buffer mode which yields
+significant performance improvement on certain platforms(SGI Altix,
+IBM xSeries).
+
+d. MSI/MSI-X. Can be enabled on platforms which support this feature
+(IA64, Xeon) resulting in noticeable performance improvement(up to 7%
+on certain platforms).
+
+e. Statistics. Comprehensive MAC-level and software statistics displayed
+using "ethtool -S" option.
+
+f. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings,
+with multiple steering options.
+
+4.  Command line parameters
+a. tx_fifo_num
+Number of transmit queues
+Valid range: 1-8
+Default: 1
+
+b. rx_ring_num
+Number of receive rings
+Valid range: 1-8
+Default: 1
+
+c. tx_fifo_len
+Size of each transmit queue
+Valid range: Total length of all queues should not exceed 8192
+Default: 4096
+
+d. rx_ring_sz 
+Size of each receive ring(in 4K blocks)
+Valid range: Limited by memory on system
+Default: 30 
+
+e. intr_type
+Specifies interrupt type. Possible values 0(INTA), 2(MSI-X)
+Valid values: 0, 2
+Default: 2
+
+5.  Performance suggestions
+General:
+a. Set MTU to maximum(9000 for switch setup, 9600 in back-to-back configuration)
+b. Set TCP windows size to optimal value. 
+For instance, for MTU=1500 a value of 210K has been observed to result in 
+good performance.
+# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
+# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
+For MTU=9000, TCP window size of 10 MB is recommended.
+# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
+# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"
+
+Transmit performance:
+a. By default, the driver respects BIOS settings for PCI bus parameters. 
+However, you may want to experiment with PCI bus parameters 
+max-split-transactions(MOST) and MMRBC (use setpci command). 
+A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.  
+It could be different for your hardware.  
+Set MMRBC to 4K**.
+
+For example you can set 
+For opteron
+#setpci -d 17d5:* 62=1d 
+For Itanium
+#setpci -d 17d5:* 62=3d 
+
+For detailed description of the PCI registers, please see Xframe User Guide.
+
+b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this 
+parameter.
+c. Turn on TSO(using "ethtool -K")
+# ethtool -K <ethX> tso on
+
+Receive performance:
+a. By default, the driver respects BIOS settings for PCI bus parameters. 
+However, you may want to set PCI latency timer to 248.
+#setpci -d 17d5:* LATENCY_TIMER=f8
+For detailed description of the PCI registers, please see Xframe User Guide.
+b. Use 2-buffer mode. This results in large performance boost on
+certain platforms(eg. SGI Altix, IBM xSeries).
+c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to 
+set/verify this option.
+d. Enable NAPI feature(in kernel configuration Device Drivers ---> Network 
+device support --->  Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to 
+bring down CPU utilization.
+
+** For AMD opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are 
+recommended as safe parameters.
+For more information, please review the AMD8131 errata at
+http://vip.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/
+26310_AMD-8131_HyperTransport_PCI-X_Tunnel_Revision_Guide_rev_3_18.pdf
+
+6. Support
+For further support please contact either your 10GbE Xframe NIC vendor (IBM, 
+HP, SGI etc.)
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
new file mode 100644
index 000000000..b7056a8a0
--- /dev/null
+++ b/Documentation/networking/scaling.txt
@@ -0,0 +1,484 @@
+Scaling in the Linux Networking Stack
+
+
+Introduction
+============
+
+This document describes a set of complementary techniques in the Linux
+networking stack to increase parallelism and improve performance for
+multi-processor systems.
+
+The following technologies are described:
+
+  RSS: Receive Side Scaling
+  RPS: Receive Packet Steering
+  RFS: Receive Flow Steering
+  Accelerated Receive Flow Steering
+  XPS: Transmit Packet Steering
+
+
+RSS: Receive Side Scaling
+=========================
+
+Contemporary NICs support multiple receive and transmit descriptor queues
+(multi-queue). On reception, a NIC can send different packets to different
+queues to distribute processing among CPUs. The NIC distributes packets by
+applying a filter to each packet that assigns it to one of a small number
+of logical flows. Packets for each flow are steered to a separate receive
+queue, which in turn can be processed by separate CPUs. This mechanism is
+generally known as “Receive-side Scaling” (RSS). The goal of RSS and
+the other scaling techniques is to increase performance uniformly.
+Multi-queue distribution can also be used for traffic prioritization, but
+that is not the focus of these techniques.
+
+The filter used in RSS is typically a hash function over the network
+and/or transport layer headers-- for example, a 4-tuple hash over
+IP addresses and TCP ports of a packet. The most common hardware
+implementation of RSS uses a 128-entry indirection table where each entry
+stores a queue number. The receive queue for a packet is determined
+by masking out the low order seven bits of the computed hash for the
+packet (usually a Toeplitz hash), taking this number as a key into the
+indirection table and reading the corresponding value.
+
+Some advanced NICs allow steering packets to queues based on
+programmable filters. For example, webserver bound TCP port 80 packets
+can be directed to their own receive queue. Such “n-tuple” filters can
+be configured from ethtool (--config-ntuple).
+
+==== RSS Configuration
+
+The driver for a multi-queue capable NIC typically provides a kernel
+module parameter for specifying the number of hardware queues to
+configure. In the bnx2x driver, for instance, this parameter is called
+num_queues. A typical RSS configuration would be to have one receive queue
+for each CPU if the device supports enough queues, or otherwise at least
+one for each memory domain, where a memory domain is a set of CPUs that
+share a particular memory level (L1, L2, NUMA node, etc.).
+
+The indirection table of an RSS device, which resolves a queue by masked
+hash, is usually programmed by the driver at initialization. The
+default mapping is to distribute the queues evenly in the table, but the
+indirection table can be retrieved and modified at runtime using ethtool
+commands (--show-rxfh-indir and --set-rxfh-indir). Modifying the
+indirection table could be done to give different queues different
+relative weights.
+
+== RSS IRQ Configuration
+
+Each receive queue has a separate IRQ associated with it. The NIC triggers
+this to notify a CPU when new packets arrive on the given queue. The
+signaling path for PCIe devices uses message signaled interrupts (MSI-X),
+that can route each interrupt to a particular CPU. The active mapping
+of queues to IRQs can be determined from /proc/interrupts. By default,
+an IRQ may be handled on any CPU. Because a non-negligible part of packet
+processing takes place in receive interrupt handling, it is advantageous
+to spread receive interrupts between CPUs. To manually adjust the IRQ
+affinity of each interrupt see Documentation/IRQ-affinity.txt. Some systems
+will be running irqbalance, a daemon that dynamically optimizes IRQ
+assignments and as a result may override any manual settings.
+
+== Suggested Configuration
+
+RSS should be enabled when latency is a concern or whenever receive
+interrupt processing forms a bottleneck. Spreading load between CPUs
+decreases queue length. For low latency networking, the optimal setting
+is to allocate as many queues as there are CPUs in the system (or the
+NIC maximum, if lower). The most efficient high-rate configuration
+is likely the one with the smallest number of receive queues where no
+receive queue overflows due to a saturated CPU, because in default
+mode with interrupt coalescing enabled, the aggregate number of
+interrupts (and thus work) grows with each additional queue.
+
+Per-cpu load can be observed using the mpstat utility, but note that on
+processors with hyperthreading (HT), each hyperthread is represented as
+a separate CPU. For interrupt handling, HT has shown no benefit in
+initial tests, so limit the number of queues to the number of CPU cores
+in the system.
+
+
+RPS: Receive Packet Steering
+============================
+
+Receive Packet Steering (RPS) is logically a software implementation of
+RSS. Being in software, it is necessarily called later in the datapath.
+Whereas RSS selects the queue and hence CPU that will run the hardware
+interrupt handler, RPS selects the CPU to perform protocol processing
+above the interrupt handler. This is accomplished by placing the packet
+on the desired CPU’s backlog queue and waking up the CPU for processing.
+RPS has some advantages over RSS: 1) it can be used with any NIC,
+2) software filters can easily be added to hash over new protocols,
+3) it does not increase hardware device interrupt rate (although it does
+introduce inter-processor interrupts (IPIs)).
+
+RPS is called during bottom half of the receive interrupt handler, when
+a driver sends a packet up the network stack with netif_rx() or
+netif_receive_skb(). These call the get_rps_cpu() function, which
+selects the queue that should process a packet.
+
+The first step in determining the target CPU for RPS is to calculate a
+flow hash over the packet’s addresses or ports (2-tuple or 4-tuple hash
+depending on the protocol). This serves as a consistent hash of the
+associated flow of the packet. The hash is either provided by hardware
+or will be computed in the stack. Capable hardware can pass the hash in
+the receive descriptor for the packet; this would usually be the same
+hash used for RSS (e.g. computed Toeplitz hash). The hash is saved in
+skb->hash and can be used elsewhere in the stack as a hash of the
+packet’s flow.
+
+Each receive hardware queue has an associated list of CPUs to which
+RPS may enqueue packets for processing. For each received packet,
+an index into the list is computed from the flow hash modulo the size
+of the list. The indexed CPU is the target for processing the packet,
+and the packet is queued to the tail of that CPU’s backlog queue. At
+the end of the bottom half routine, IPIs are sent to any CPUs for which
+packets have been queued to their backlog queue. The IPI wakes backlog
+processing on the remote CPU, and any queued packets are then processed
+up the networking stack.
+
+==== RPS Configuration
+
+RPS requires a kernel compiled with the CONFIG_RPS kconfig symbol (on
+by default for SMP). Even when compiled in, RPS remains disabled until
+explicitly configured. The list of CPUs to which RPS may forward traffic
+can be configured for each receive queue using a sysfs file entry:
+
+ /sys/class/net/<dev>/queues/rx-<n>/rps_cpus
+
+This file implements a bitmap of CPUs. RPS is disabled when it is zero
+(the default), in which case packets are processed on the interrupting
+CPU. Documentation/IRQ-affinity.txt explains how CPUs are assigned to
+the bitmap.
+
+== Suggested Configuration
+
+For a single queue device, a typical RPS configuration would be to set
+the rps_cpus to the CPUs in the same memory domain of the interrupting
+CPU. If NUMA locality is not an issue, this could also be all CPUs in
+the system. At high interrupt rate, it might be wise to exclude the
+interrupting CPU from the map since that already performs much work.
+
+For a multi-queue system, if RSS is configured so that a hardware
+receive queue is mapped to each CPU, then RPS is probably redundant
+and unnecessary. If there are fewer hardware queues than CPUs, then
+RPS might be beneficial if the rps_cpus for each queue are the ones that
+share the same memory domain as the interrupting CPU for that queue.
+
+==== RPS Flow Limit
+
+RPS scales kernel receive processing across CPUs without introducing
+reordering. The trade-off to sending all packets from the same flow
+to the same CPU is CPU load imbalance if flows vary in packet rate.
+In the extreme case a single flow dominates traffic. Especially on
+common server workloads with many concurrent connections, such
+behavior indicates a problem such as a misconfiguration or spoofed
+source Denial of Service attack.
+
+Flow Limit is an optional RPS feature that prioritizes small flows
+during CPU contention by dropping packets from large flows slightly
+ahead of those from small flows. It is active only when an RPS or RFS
+destination CPU approaches saturation.  Once a CPU's input packet
+queue exceeds half the maximum queue length (as set by sysctl
+net.core.netdev_max_backlog), the kernel starts a per-flow packet
+count over the last 256 packets. If a flow exceeds a set ratio (by
+default, half) of these packets when a new packet arrives, then the
+new packet is dropped. Packets from other flows are still only
+dropped once the input packet queue reaches netdev_max_backlog.
+No packets are dropped when the input packet queue length is below
+the threshold, so flow limit does not sever connections outright:
+even large flows maintain connectivity.
+
+== Interface
+
+Flow limit is compiled in by default (CONFIG_NET_FLOW_LIMIT), but not
+turned on. It is implemented for each CPU independently (to avoid lock
+and cache contention) and toggled per CPU by setting the relevant bit
+in sysctl net.core.flow_limit_cpu_bitmap. It exposes the same CPU
+bitmap interface as rps_cpus (see above) when called from procfs:
+
+ /proc/sys/net/core/flow_limit_cpu_bitmap
+
+Per-flow rate is calculated by hashing each packet into a hashtable
+bucket and incrementing a per-bucket counter. The hash function is
+the same that selects a CPU in RPS, but as the number of buckets can
+be much larger than the number of CPUs, flow limit has finer-grained
+identification of large flows and fewer false positives. The default
+table has 4096 buckets. This value can be modified through sysctl
+
+ net.core.flow_limit_table_len
+
+The value is only consulted when a new table is allocated. Modifying
+it does not update active tables.
+
+== Suggested Configuration
+
+Flow limit is useful on systems with many concurrent connections,
+where a single connection taking up 50% of a CPU indicates a problem.
+In such environments, enable the feature on all CPUs that handle
+network rx interrupts (as set in /proc/irq/N/smp_affinity).
+
+The feature depends on the input packet queue length to exceed
+the flow limit threshold (50%) + the flow history length (256).
+Setting net.core.netdev_max_backlog to either 1000 or 10000
+performed well in experiments.
+
+
+RFS: Receive Flow Steering
+==========================
+
+While RPS steers packets solely based on hash, and thus generally
+provides good load distribution, it does not take into account
+application locality. This is accomplished by Receive Flow Steering
+(RFS). The goal of RFS is to increase datacache hitrate by steering
+kernel processing of packets to the CPU where the application thread
+consuming the packet is running. RFS relies on the same RPS mechanisms
+to enqueue packets onto the backlog of another CPU and to wake up that
+CPU.
+
+In RFS, packets are not forwarded directly by the value of their hash,
+but the hash is used as index into a flow lookup table. This table maps
+flows to the CPUs where those flows are being processed. The flow hash
+(see RPS section above) is used to calculate the index into this table.
+The CPU recorded in each entry is the one which last processed the flow.
+If an entry does not hold a valid CPU, then packets mapped to that entry
+are steered using plain RPS. Multiple table entries may point to the
+same CPU. Indeed, with many flows and few CPUs, it is very likely that
+a single application thread handles flows with many different flow hashes.
+
+rps_sock_flow_table is a global flow table that contains the *desired* CPU
+for flows: the CPU that is currently processing the flow in userspace.
+Each table value is a CPU index that is updated during calls to recvmsg
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
+and tcp_splice_read()).
+
+When the scheduler moves a thread to a new CPU while it has outstanding
+receive packets on the old CPU, packets may arrive out of order. To
+avoid this, RFS uses a second flow table to track outstanding packets
+for each flow: rps_dev_flow_table is a table specific to each hardware
+receive queue of each device. Each table value stores a CPU index and a
+counter. The CPU index represents the *current* CPU onto which packets
+for this flow are enqueued for further kernel processing. Ideally, kernel
+and userspace processing occur on the same CPU, and hence the CPU index
+in both tables is identical. This is likely false if the scheduler has
+recently migrated a userspace thread while the kernel still has packets
+enqueued for kernel processing on the old CPU.
+
+The counter in rps_dev_flow_table values records the length of the current
+CPU's backlog when a packet in this flow was last enqueued. Each backlog
+queue has a head counter that is incremented on dequeue. A tail counter
+is computed as head counter + queue length. In other words, the counter
+in rps_dev_flow[i] records the last element in flow i that has
+been enqueued onto the currently designated CPU for flow i (of course,
+entry i is actually selected by hash and multiple flows may hash to the
+same entry i).
+
+And now the trick for avoiding out of order packets: when selecting the
+CPU for packet processing (from get_rps_cpu()) the rps_sock_flow table
+and the rps_dev_flow table of the queue that the packet was received on
+are compared. If the desired CPU for the flow (found in the
+rps_sock_flow table) matches the current CPU (found in the rps_dev_flow
+table), the packet is enqueued onto that CPU’s backlog. If they differ,
+the current CPU is updated to match the desired CPU if one of the
+following is true:
+
+- The current CPU's queue head counter >= the recorded tail counter
+  value in rps_dev_flow[i]
+- The current CPU is unset (>= nr_cpu_ids)
+- The current CPU is offline
+
+After this check, the packet is sent to the (possibly updated) current
+CPU. These rules aim to ensure that a flow only moves to a new CPU when
+there are no packets outstanding on the old CPU, as the outstanding
+packets could arrive later than those about to be processed on the new
+CPU.
+
+==== RFS Configuration
+
+RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
+by default for SMP). The functionality remains disabled until explicitly
+configured. The number of entries in the global flow table is set through:
+
+ /proc/sys/net/core/rps_sock_flow_entries
+
+The number of entries in the per-queue flow table are set through:
+
+ /sys/class/net/<dev>/queues/rx-<n>/rps_flow_cnt
+
+== Suggested Configuration
+
+Both of these need to be set before RFS is enabled for a receive queue.
+Values for both are rounded up to the nearest power of two. The
+suggested flow count depends on the expected number of active connections
+at any given time, which may be significantly less than the number of open
+connections. We have found that a value of 32768 for rps_sock_flow_entries
+works fairly well on a moderately loaded server.
+
+For a single queue device, the rps_flow_cnt value for the single queue
+would normally be configured to the same value as rps_sock_flow_entries.
+For a multi-queue device, the rps_flow_cnt for each queue might be
+configured as rps_sock_flow_entries / N, where N is the number of
+queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
+are 16 configured receive queues, rps_flow_cnt for each queue might be
+configured as 2048.
+
+
+Accelerated RFS
+===============
+
+Accelerated RFS is to RFS what RSS is to RPS: a hardware-accelerated load
+balancing mechanism that uses soft state to steer flows based on where
+the application thread consuming the packets of each flow is running.
+Accelerated RFS should perform better than RFS since packets are sent
+directly to a CPU local to the thread consuming the data. The target CPU
+will either be the same CPU where the application runs, or at least a CPU
+which is local to the application thread’s CPU in the cache hierarchy.
+
+To enable accelerated RFS, the networking stack calls the
+ndo_rx_flow_steer driver function to communicate the desired hardware
+queue for packets matching a particular flow. The network stack
+automatically calls this function every time a flow entry in
+rps_dev_flow_table is updated. The driver in turn uses a device specific
+method to program the NIC to steer the packets.
+
+The hardware queue for a flow is derived from the CPU recorded in
+rps_dev_flow_table. The stack consults a CPU to hardware queue map which
+is maintained by the NIC driver. This is an auto-generated reverse map of
+the IRQ affinity table shown by /proc/interrupts. Drivers can use
+functions in the cpu_rmap (“CPU affinity reverse map”) kernel library
+to populate the map. For each CPU, the corresponding queue in the map is
+set to be one whose processing CPU is closest in cache locality.
+
+==== Accelerated RFS Configuration
+
+Accelerated RFS is only available if the kernel is compiled with
+CONFIG_RFS_ACCEL and support is provided by the NIC device and driver.
+It also requires that ntuple filtering is enabled via ethtool. The map
+of CPU to queues is automatically deduced from the IRQ affinities
+configured for each receive queue by the driver, so no additional
+configuration should be necessary.
+
+== Suggested Configuration
+
+This technique should be enabled whenever one wants to use RFS and the
+NIC supports hardware acceleration.
+
+XPS: Transmit Packet Steering
+=============================
+
+Transmit Packet Steering is a mechanism for intelligently selecting
+which transmit queue to use when transmitting a packet on a multi-queue
+device. This can be accomplished by recording two kinds of maps, either
+a mapping of CPU to hardware queue(s) or a mapping of receive queue(s)
+to hardware transmit queue(s).
+
+1. XPS using CPUs map
+
+The goal of this mapping is usually to assign queues
+exclusively to a subset of CPUs, where the transmit completions for
+these queues are processed on a CPU within this set. This choice
+provides two benefits. First, contention on the device queue lock is
+significantly reduced since fewer CPUs contend for the same queue
+(contention can be eliminated completely if each CPU has its own
+transmit queue). Secondly, cache miss rate on transmit completion is
+reduced, in particular for data cache lines that hold the sk_buff
+structures.
+
+2. XPS using receive queues map
+
+This mapping is used to pick transmit queue based on the receive
+queue(s) map configuration set by the administrator. A set of receive
+queues can be mapped to a set of transmit queues (many:many), although
+the common use case is a 1:1 mapping. This will enable sending packets
+on the same queue associations for transmit and receive. This is useful for
+busy polling multi-threaded workloads where there are challenges in
+associating a given CPU to a given application thread. The application
+threads are not pinned to CPUs and each thread handles packets
+received on a single queue. The receive queue number is cached in the
+socket for the connection. In this model, sending the packets on the same
+transmit queue corresponding to the associated receive queue has benefits
+in keeping the CPU overhead low. Transmit completion work is locked into
+the same queue-association that a given application is polling on. This
+avoids the overhead of triggering an interrupt on another CPU. When the
+application cleans up the packets during the busy poll, transmit completion
+may be processed along with it in the same thread context and so result in
+reduced latency.
+
+XPS is configured per transmit queue by setting a bitmap of
+CPUs/receive-queues that may use that queue to transmit. The reverse
+mapping, from CPUs to transmit queues or from receive-queues to transmit
+queues, is computed and maintained for each network device. When
+transmitting the first packet in a flow, the function get_xps_queue() is
+called to select a queue. This function uses the ID of the receive queue
+for the socket connection for a match in the receive queue-to-transmit queue
+lookup table. Alternatively, this function can also use the ID of the
+running CPU as a key into the CPU-to-queue lookup table. If the
+ID matches a single queue, that is used for transmission. If multiple
+queues match, one is selected by using the flow hash to compute an index
+into the set. When selecting the transmit queue based on receive queue(s)
+map, the transmit device is not validated against the receive device as it
+requires expensive lookup operation in the datapath.
+
+The queue chosen for transmitting a particular flow is saved in the
+corresponding socket structure for the flow (e.g. a TCP connection).
+This transmit queue is used for subsequent packets sent on the flow to
+prevent out of order (ooo) packets. The choice also amortizes the cost
+of calling get_xps_queues() over all packets in the flow. To avoid
+ooo packets, the queue for a flow can subsequently only be changed if
+skb->ooo_okay is set for a packet in the flow. This flag indicates that
+there are no outstanding packets in the flow, so the transmit queue can
+change without the risk of generating out of order packets. The
+transport layer is responsible for setting ooo_okay appropriately. TCP,
+for instance, sets the flag when all data for a connection has been
+acknowledged.
+
+==== XPS Configuration
+
+XPS is only available if the kconfig symbol CONFIG_XPS is enabled (on by
+default for SMP). The functionality remains disabled until explicitly
+configured. To enable XPS, the bitmap of CPUs/receive-queues that may
+use a transmit queue is configured using the sysfs file entry:
+
+For selection based on CPUs map:
+/sys/class/net/<dev>/queues/tx-<n>/xps_cpus
+
+For selection based on receive-queues map:
+/sys/class/net/<dev>/queues/tx-<n>/xps_rxqs
+
+== Suggested Configuration
+
+For a network device with a single transmission queue, XPS configuration
+has no effect, since there is no choice in this case. In a multi-queue
+system, XPS is preferably configured so that each CPU maps onto one queue.
+If there are as many queues as there are CPUs in the system, then each
+queue can also map onto one CPU, resulting in exclusive pairings that
+experience no contention. If there are fewer queues than CPUs, then the
+best CPUs to share a given queue are probably those that share the cache
+with the CPU that processes transmit completions for that queue
+(transmit interrupts).
+
+For transmit queue selection based on receive queue(s), XPS has to be
+explicitly configured mapping receive-queue(s) to transmit queue(s). If the
+user configuration for receive-queue map does not apply, then the transmit
+queue is selected based on the CPUs map.
+
+Per TX Queue rate limitation:
+=============================
+
+These are rate-limitation mechanisms implemented by HW, where currently
+a max-rate attribute is supported, by setting a Mbps value to
+
+/sys/class/net/<dev>/queues/tx-<n>/tx_maxrate
+
+A value of zero means disabled, and this is the default.
+
+Further Information
+===================
+RPS and RFS were introduced in kernel 2.6.35. XPS was incorporated into
+2.6.38. Original patches were submitted by Tom Herbert
+(therbert@google.com)
+
+Accelerated RFS was introduced in 2.6.35. Original patches were
+submitted by Ben Hutchings (bwh@kernel.org)
+
+Authors:
+Tom Herbert (therbert@google.com)
+Willem de Bruijn (willemb@google.com)
diff --git a/Documentation/networking/sctp.txt b/Documentation/networking/sctp.txt
new file mode 100644
index 000000000..97b810ca9
--- /dev/null
+++ b/Documentation/networking/sctp.txt
@@ -0,0 +1,35 @@
+Linux Kernel SCTP 
+
+This is the current BETA release of the Linux Kernel SCTP reference
+implementation.  
+
+SCTP (Stream Control Transmission Protocol) is a IP based, message oriented,
+reliable transport protocol, with congestion control, support for
+transparent multi-homing, and multiple ordered streams of messages.
+RFC2960 defines the core protocol.  The IETF SIGTRAN working group originally
+developed the SCTP protocol and later handed the protocol over to the 
+Transport Area (TSVWG) working group for the continued evolvement of SCTP as a 
+general purpose transport.  
+
+See the IETF website (http://www.ietf.org) for further documents on SCTP. 
+See http://www.ietf.org/rfc/rfc2960.txt 
+
+The initial project goal is to create an Linux kernel reference implementation
+of SCTP that is RFC 2960 compliant and provides an programming interface 
+referred to as the  UDP-style API of the Sockets Extensions for SCTP, as 
+proposed in IETF Internet-Drafts.    
+
+Caveats:  
+
+-lksctp can be built as statically or as a module.  However, be aware that 
+module removal of lksctp is not yet a safe activity.   
+
+-There is tentative support for IPv6, but most work has gone towards 
+implementation and testing lksctp on IPv4.   
+
+
+For more information, please visit the lksctp project website:
+   http://www.sf.net/projects/lksctp
+
+Or contact the lksctp developers through the mailing list:
+   <linux-sctp@vger.kernel.org>
diff --git a/Documentation/networking/secid.txt b/Documentation/networking/secid.txt
new file mode 100644
index 000000000..95ea06784
--- /dev/null
+++ b/Documentation/networking/secid.txt
@@ -0,0 +1,14 @@
+flowi structure:
+
+The secid member in the flow structure is used in LSMs (e.g. SELinux) to indicate
+the label of the flow. This label of the flow is currently used in selecting
+matching labeled xfrm(s).
+
+If this is an outbound flow, the label is derived from the socket, if any, or
+the incoming packet this flow is being generated as a response to (e.g. tcp
+resets, timewait ack, etc.). It is also conceivable that the label could be
+derived from other sources such as process context, device, etc., in special
+cases, as may be appropriate.
+
+If this is an inbound flow, the label is derived from the IPSec security
+associations, if any, used by the packet.
diff --git a/Documentation/networking/seg6-sysctl.txt b/Documentation/networking/seg6-sysctl.txt
new file mode 100644
index 000000000..bdbde23b1
--- /dev/null
+++ b/Documentation/networking/seg6-sysctl.txt
@@ -0,0 +1,18 @@
+/proc/sys/net/conf/<iface>/seg6_* variables:
+
+seg6_enabled - BOOL
+	Accept or drop SR-enabled IPv6 packets on this interface.
+
+	Relevant packets are those with SRH present and DA = local.
+
+	0 - disabled (default)
+	not 0 - enabled
+
+seg6_require_hmac - INTEGER
+	Define HMAC policy for ingress SR-enabled packets on this interface.
+
+	-1 - Ignore HMAC field
+	0 - Accept SR packets without HMAC, validate SR packets with HMAC
+	1 - Drop SR packets without HMAC, validate SR packets with HMAC
+
+	Default is 0.
diff --git a/Documentation/networking/segmentation-offloads.txt b/Documentation/networking/segmentation-offloads.txt
new file mode 100644
index 000000000..aca542ec1
--- /dev/null
+++ b/Documentation/networking/segmentation-offloads.txt
@@ -0,0 +1,170 @@
+Segmentation Offloads in the Linux Networking Stack
+
+Introduction
+============
+
+This document describes a set of techniques in the Linux networking stack
+to take advantage of segmentation offload capabilities of various NICs.
+
+The following technologies are described:
+ * TCP Segmentation Offload - TSO
+ * UDP Fragmentation Offload - UFO
+ * IPIP, SIT, GRE, and UDP Tunnel Offloads
+ * Generic Segmentation Offload - GSO
+ * Generic Receive Offload - GRO
+ * Partial Generic Segmentation Offload - GSO_PARTIAL
+ * SCTP accelleration with GSO - GSO_BY_FRAGS
+
+TCP Segmentation Offload
+========================
+
+TCP segmentation allows a device to segment a single frame into multiple
+frames with a data payload size specified in skb_shinfo()->gso_size.
+When TCP segmentation requested the bit for either SKB_GSO_TCPV4 or
+SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
+skb_shinfo()->gso_size should be set to a non-zero value.
+
+TCP segmentation is dependent on support for the use of partial checksum
+offload.  For this reason TSO is normally disabled if the Tx checksum
+offload for a given device is disabled.
+
+In order to support TCP segmentation offload it is necessary to populate
+the network and transport header offsets of the skbuff so that the device
+drivers will be able determine the offsets of the IP or IPv6 header and the
+TCP header.  In addition as CHECKSUM_PARTIAL is required csum_start should
+also point to the TCP header of the packet.
+
+For IPv4 segmentation we support one of two types in terms of the IP ID.
+The default behavior is to increment the IP ID with every segment.  If the
+GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
+ID and all segments will use the same IP ID.  If a device has
+NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
+and we will either increment the IP ID for all frames, or leave it at a
+static value based on driver preference.
+
+UDP Fragmentation Offload
+=========================
+
+UDP fragmentation offload allows a device to fragment an oversized UDP
+datagram into multiple IPv4 fragments.  Many of the requirements for UDP
+fragmentation offload are the same as TSO.  However the IPv4 ID for
+fragments should not increment as a single IPv4 datagram is fragmented.
+
+UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
+still receive them from tuntap and similar devices. Offload of UDP-based
+tunnel protocols is still supported.
+
+IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
+========================================================
+
+In addition to the offloads described above it is possible for a frame to
+contain additional headers such as an outer tunnel.  In order to account
+for such instances an additional set of segmentation offload types were
+introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
+SKB_GSO_UDP_TUNNEL.  These extra segmentation types are used to identify
+cases where there are more than just 1 set of headers.  For example in the
+case of IPIP and SIT we should have the network and transport headers moved
+from the standard list of headers to "inner" header offsets.
+
+Currently only two levels of headers are supported.  The convention is to
+refer to the tunnel headers as the outer headers, while the encapsulated
+data is normally referred to as the inner headers.  Below is the list of
+calls to access the given headers:
+
+IPIP/SIT Tunnel:
+		Outer			Inner
+MAC		skb_mac_header
+Network		skb_network_header	skb_inner_network_header
+Transport	skb_transport_header
+
+UDP/GRE Tunnel:
+		Outer			Inner
+MAC		skb_mac_header		skb_inner_mac_header
+Network		skb_network_header	skb_inner_network_header
+Transport	skb_transport_header	skb_inner_transport_header
+
+In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
+SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types reflect the
+fact that the outer header also requests to have a non-zero checksum
+included in the outer header.
+
+Finally there is SKB_GSO_TUNNEL_REMCSUM which indicates that a given tunnel
+header has requested a remote checksum offload.  In this case the inner
+headers will be left with a partial checksum and only the outer header
+checksum will be computed.
+
+Generic Segmentation Offload
+============================
+
+Generic segmentation offload is a pure software offload that is meant to
+deal with cases where device drivers cannot perform the offloads described
+above.  What occurs in GSO is that a given skbuff will have its data broken
+out over multiple skbuffs that have been resized to match the MSS provided
+via skb_shinfo()->gso_size.
+
+Before enabling any hardware segmentation offload a corresponding software
+offload is required in GSO.  Otherwise it becomes possible for a frame to
+be re-routed between devices and end up being unable to be transmitted.
+
+Generic Receive Offload
+=======================
+
+Generic receive offload is the complement to GSO.  Ideally any frame
+assembled by GRO should be segmented to create an identical sequence of
+frames using GSO, and any sequence of frames segmented by GSO should be
+able to be reassembled back to the original by GRO.  The only exception to
+this is IPv4 ID in the case that the DF bit is set for a given IP header.
+If the value of the IPv4 ID is not sequentially incrementing it will be
+altered so that it is when a frame assembled via GRO is segmented via GSO.
+
+Partial Generic Segmentation Offload
+====================================
+
+Partial generic segmentation offload is a hybrid between TSO and GSO.  What
+it effectively does is take advantage of certain traits of TCP and tunnels
+so that instead of having to rewrite the packet headers for each segment
+only the inner-most transport header and possibly the outer-most network
+header need to be updated.  This allows devices that do not support tunnel
+offloads or tunnel offloads with checksum to still make use of segmentation.
+
+With the partial offload what occurs is that all headers excluding the
+inner transport header are updated such that they will contain the correct
+values for if the header was simply duplicated.  The one exception to this
+is the outer IPv4 ID field.  It is up to the device drivers to guarantee
+that the IPv4 ID field is incremented in the case that a given header does
+not have the DF bit set.
+
+SCTP accelleration with GSO
+===========================
+
+SCTP - despite the lack of hardware support - can still take advantage of
+GSO to pass one large packet through the network stack, rather than
+multiple small packets.
+
+This requires a different approach to other offloads, as SCTP packets
+cannot be just segmented to (P)MTU. Rather, the chunks must be contained in
+IP segments, padding respected. So unlike regular GSO, SCTP can't just
+generate a big skb, set gso_size to the fragmentation point and deliver it
+to IP layer.
+
+Instead, the SCTP protocol layer builds an skb with the segments correctly
+padded and stored as chained skbs, and skb_segment() splits based on those.
+To signal this, gso_size is set to the special value GSO_BY_FRAGS.
+
+Therefore, any code in the core networking stack must be aware of the
+possibility that gso_size will be GSO_BY_FRAGS and handle that case
+appropriately.
+
+There are some helpers to make this easier:
+
+ - skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
+   an skb is an SCTP GSO skb.
+
+ - For size checks, the skb_gso_validate_*_len family of helpers correctly
+   considers GSO_BY_FRAGS.
+
+ - For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
+   will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.
+
+This also affects drivers with the NETIF_F_FRAGLIST & NETIF_F_GSO_SCTP bits
+set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.
diff --git a/Documentation/networking/skfp.txt b/Documentation/networking/skfp.txt
new file mode 100644
index 000000000..203ec66c9
--- /dev/null
+++ b/Documentation/networking/skfp.txt
@@ -0,0 +1,220 @@
+(C)Copyright 1998-2000 SysKonnect,
+===========================================================================
+
+skfp.txt created 11-May-2000
+
+Readme File for skfp.o v2.06
+
+
+This file contains
+(1) OVERVIEW
+(2) SUPPORTED ADAPTERS
+(3) GENERAL INFORMATION
+(4) INSTALLATION
+(5) INCLUSION OF THE ADAPTER IN SYSTEM START
+(6) TROUBLESHOOTING
+(7) FUNCTION OF THE ADAPTER LEDS
+(8) HISTORY
+
+===========================================================================
+
+
+
+(1) OVERVIEW
+============
+
+This README explains how to use the driver 'skfp' for Linux with your
+network adapter.
+
+Chapter 2: Contains a list of all network adapters that are supported by
+	   this driver.
+
+Chapter 3: Gives some general information.
+
+Chapter 4: Describes common problems and solutions.
+
+Chapter 5: Shows the changed functionality of the adapter LEDs.
+
+Chapter 6: History of development.
+
+***
+
+
+(2) SUPPORTED ADAPTERS
+======================
+
+The network driver 'skfp' supports the following network adapters:
+SysKonnect adapters:
+  - SK-5521 (SK-NET FDDI-UP)
+  - SK-5522 (SK-NET FDDI-UP DAS)
+  - SK-5541 (SK-NET FDDI-FP)
+  - SK-5543 (SK-NET FDDI-LP)
+  - SK-5544 (SK-NET FDDI-LP DAS)
+  - SK-5821 (SK-NET FDDI-UP64)
+  - SK-5822 (SK-NET FDDI-UP64 DAS)
+  - SK-5841 (SK-NET FDDI-FP64)
+  - SK-5843 (SK-NET FDDI-LP64)
+  - SK-5844 (SK-NET FDDI-LP64 DAS)
+Compaq adapters (not tested):
+  - Netelligent 100 FDDI DAS Fibre SC
+  - Netelligent 100 FDDI SAS Fibre SC
+  - Netelligent 100 FDDI DAS UTP
+  - Netelligent 100 FDDI SAS UTP
+  - Netelligent 100 FDDI SAS Fibre MIC
+***
+
+
+(3) GENERAL INFORMATION
+=======================
+
+From v2.01 on, the driver is integrated in the linux kernel sources.
+Therefore, the installation is the same as for any other adapter
+supported by the kernel.
+Refer to the manual of your distribution about the installation
+of network adapters.
+Makes my life much easier :-)
+***
+
+
+(4) TROUBLESHOOTING
+===================
+
+If you run into problems during installation, check those items:
+
+Problem:  The FDDI adapter cannot be found by the driver.
+Reason:   Look in /proc/pci for the following entry:
+             'FDDI network controller: SysKonnect SK-FDDI-PCI ...'
+	  If this entry exists, then the FDDI adapter has been
+	  found by the system and should be able to be used.
+	  If this entry does not exist or if the file '/proc/pci'
+	  is not there, then you may have a hardware problem or PCI
+	  support may not be enabled in your kernel.
+	  The adapter can be checked using the diagnostic program
+	  which is available from the SysKonnect web site:
+	      www.syskonnect.de
+	  Some COMPAQ machines have a problem with PCI under
+	  Linux. This is described in the 'PCI howto' document
+	  (included in some distributions or available from the
+	  www, e.g. at 'www.linux.org') and no workaround is available.
+
+Problem:  You want to use your computer as a router between
+          multiple IP subnetworks (using multiple adapters), but
+	  you cannot reach computers in other subnetworks.
+Reason:   Either the router's kernel is not configured for IP
+	  forwarding or there is a problem with the routing table
+	  and gateway configuration in at least one of the
+	  computers.
+
+If your problem is not listed here, please contact our
+technical support for help. 
+You can send email to:
+  linux@syskonnect.de
+When contacting our technical support,
+please ensure that the following information is available:
+- System Manufacturer and Model
+- Boards in your system
+- Distribution
+- Kernel version
+
+***
+
+
+(5) FUNCTION OF THE ADAPTER LEDS
+================================
+
+        The functionality of the LED's on the FDDI network adapters was
+        changed in SMT version v2.82. With this new SMT version, the yellow
+        LED works as a ring operational indicator. An active yellow LED
+        indicates that the ring is down. The green LED on the adapter now
+        works as a link indicator where an active GREEN LED indicates that
+        the respective port has a physical connection.
+
+        With versions of SMT prior to v2.82 a ring up was indicated if the
+        yellow LED was off while the green LED(s) showed the connection
+        status of the adapter. During a ring down the green LED was off and
+        the yellow LED was on.
+
+        All implementations indicate that a driver is not loaded if
+        all LEDs are off.
+
+***
+
+
+(6) HISTORY
+===========
+
+v2.06 (20000511) (In-Kernel version)
+    New features:
+	- 64 bit support
+	- new pci dma interface
+	- in kernel 2.3.99
+
+v2.05 (20000217) (In-Kernel version)
+    New features:
+	- Changes for 2.3.45 kernel
+
+v2.04 (20000207) (Standalone version)
+    New features:
+	- Added rx/tx byte counter
+
+v2.03 (20000111) (Standalone version)
+    Problems fixed:
+	- Fixed printk statements from v2.02
+
+v2.02 (991215) (Standalone version)
+    Problems fixed:
+	- Removed unnecessary output
+	- Fixed path for "printver.sh" in makefile
+
+v2.01 (991122) (In-Kernel version)
+    New features:
+	- Integration in Linux kernel sources
+	- Support for memory mapped I/O.
+
+v2.00 (991112)
+    New features:
+	- Full source released under GPL
+
+v1.05 (991023)
+    Problems fixed:
+	- Compilation with kernel version 2.2.13 failed
+
+v1.04 (990427)
+    Changes:
+	- New SMT module included, changing LED functionality
+    Problems fixed:
+	- Synchronization on SMP machines was buggy
+
+v1.03 (990325)
+    Problems fixed:
+	- Interrupt routing on SMP machines could be incorrect
+
+v1.02 (990310)
+    New features:
+	- Support for kernel versions 2.2.x added
+	- Kernel patch instead of private duplicate of kernel functions
+
+v1.01 (980812)
+    Problems fixed:
+	Connection hangup with telnet
+	Slow telnet connection
+
+v1.00 beta 01 (980507)
+    New features:
+	None.
+    Problems fixed:
+	None.
+    Known limitations:
+        - tar archive instead of standard package format (rpm).
+	- FDDI statistic is empty.
+	- not tested with 2.1.xx kernels
+	- integration in kernel not tested
+	- not tested simultaneously with FDDI adapters from other vendors.
+	- only X86 processors supported.
+	- SBA (Synchronous Bandwidth Allocator) parameters can
+	  not be configured.
+	- does not work on some COMPAQ machines. See the PCI howto
+	  document for details about this problem.
+	- data corruption with kernel versions below 2.0.33.
+
+*** End of information file ***
diff --git a/Documentation/networking/smc9.txt b/Documentation/networking/smc9.txt
new file mode 100644
index 000000000..d1e15074e
--- /dev/null
+++ b/Documentation/networking/smc9.txt
@@ -0,0 +1,42 @@
+
+SMC 9xxxx Driver 
+Revision 0.12 
+3/5/96
+Copyright 1996  Erik Stahlman 
+Released under terms of the GNU General Public License. 
+
+This file contains the instructions and caveats for my SMC9xxx driver.  You
+should not be using the driver without reading this file.  
+
+Things to note about installation:
+
+  1. The driver should work on all kernels from 1.2.13 until 1.3.71.
+	(A kernel patch is supplied for 1.3.71 )
+
+  2. If you include this into the kernel, you might need to change some
+	options, such as for forcing IRQ.   
+
+ 
+  3.  To compile as a module, run 'make' .   
+	Make will give you the appropriate options for various kernel support.
+ 
+  4.  Loading the driver as a module :
+
+	use:   insmod smc9194.o 
+	optional parameters:
+		io=xxxx    : your base address
+		irq=xx	   : your irq 
+		ifport=x   :	0 for whatever is default
+				1 for twisted pair
+				2 for AUI  ( or BNC on some cards )
+
+How to obtain the latest version? 
+	
+FTP:  
+	ftp://fenris.campus.vt.edu/smc9/smc9-12.tar.gz
+	ftp://sfbox.vt.edu/filebox/F/fenris/smc9/smc9-12.tar.gz 
+   
+
+Contacting me:
+    erik@mail.vt.edu
+ 
diff --git a/Documentation/networking/spider_net.txt b/Documentation/networking/spider_net.txt
new file mode 100644
index 000000000..b0b75f846
--- /dev/null
+++ b/Documentation/networking/spider_net.txt
@@ -0,0 +1,204 @@
+
+            The Spidernet Device Driver
+            ===========================
+
+Written by Linas Vepstas <linas@austin.ibm.com>
+
+Version of 7 June 2007
+
+Abstract
+========
+This document sketches the structure of portions of the spidernet
+device driver in the Linux kernel tree. The spidernet is a gigabit
+ethernet device built into the Toshiba southbridge commonly used
+in the SONY Playstation 3 and the IBM QS20 Cell blade.
+
+The Structure of the RX Ring.
+=============================
+The receive (RX) ring is a circular linked list of RX descriptors,
+together with three pointers into the ring that are used to manage its
+contents.
+
+The elements of the ring are called "descriptors" or "descrs"; they
+describe the received data. This includes a pointer to a buffer
+containing the received data, the buffer size, and various status bits.
+
+There are three primary states that a descriptor can be in: "empty",
+"full" and "not-in-use".  An "empty" or "ready" descriptor is ready
+to receive data from the hardware. A "full" descriptor has data in it,
+and is waiting to be emptied and processed by the OS. A "not-in-use"
+descriptor is neither empty or full; it is simply not ready. It may
+not even have a data buffer in it, or is otherwise unusable.
+
+During normal operation, on device startup, the OS (specifically, the
+spidernet device driver) allocates a set of RX descriptors and RX
+buffers. These are all marked "empty", ready to receive data. This
+ring is handed off to the hardware, which sequentially fills in the
+buffers, and marks them "full". The OS follows up, taking the full
+buffers, processing them, and re-marking them empty.
+
+This filling and emptying is managed by three pointers, the "head"
+and "tail" pointers, managed by the OS, and a hardware current
+descriptor pointer (GDACTDPA). The GDACTDPA points at the descr
+currently being filled. When this descr is filled, the hardware
+marks it full, and advances the GDACTDPA by one.  Thus, when there is
+flowing RX traffic, every descr behind it should be marked "full",
+and everything in front of it should be "empty".  If the hardware
+discovers that the current descr is not empty, it will signal an
+interrupt, and halt processing.
+
+The tail pointer tails or trails the hardware pointer. When the
+hardware is ahead, the tail pointer will be pointing at a "full"
+descr. The OS will process this descr, and then mark it "not-in-use",
+and advance the tail pointer.  Thus, when there is flowing RX traffic,
+all of the descrs in front of the tail pointer should be "full", and
+all of those behind it should be "not-in-use". When RX traffic is not
+flowing, then the tail pointer can catch up to the hardware pointer.
+The OS will then note that the current tail is "empty", and halt
+processing.
+
+The head pointer (somewhat mis-named) follows after the tail pointer.
+When traffic is flowing, then the head pointer will be pointing at
+a "not-in-use" descr. The OS will perform various housekeeping duties
+on this descr. This includes allocating a new data buffer and
+dma-mapping it so as to make it visible to the hardware. The OS will
+then mark the descr as "empty", ready to receive data. Thus, when there
+is flowing RX traffic, everything in front of the head pointer should
+be "not-in-use", and everything behind it should be "empty". If no
+RX traffic is flowing, then the head pointer can catch up to the tail
+pointer, at which point the OS will notice that the head descr is
+"empty", and it will halt processing.
+
+Thus, in an idle system, the GDACTDPA, tail and head pointers will
+all be pointing at the same descr, which should be "empty". All of the
+other descrs in the ring should be "empty" as well.
+
+The show_rx_chain() routine will print out the locations of the
+GDACTDPA, tail and head pointers. It will also summarize the contents
+of the ring, starting at the tail pointer, and listing the status
+of the descrs that follow.
+
+A typical example of the output, for a nearly idle system, might be
+
+net eth1: Total number of descrs=256
+net eth1: Chain tail located at descr=20
+net eth1: Chain head is at 20
+net eth1: HW curr desc (GDACTDPA) is at 21
+net eth1: Have 1 descrs with stat=x40800101
+net eth1: HW next desc (GDACNEXTDA) is at 22
+net eth1: Last 255 descrs with stat=xa0800000
+
+In the above, the hardware has filled in one descr, number 20. Both
+head and tail are pointing at 20, because it has not yet been emptied.
+Meanwhile, hw is pointing at 21, which is free.
+
+The "Have nnn decrs" refers to the descr starting at the tail: in this
+case, nnn=1 descr, starting at descr 20. The "Last nnn descrs" refers
+to all of the rest of the descrs, from the last status change. The "nnn"
+is a count of how many descrs have exactly the same status.
+
+The status x4... corresponds to "full" and status xa... corresponds
+to "empty". The actual value printed is RXCOMST_A.
+
+In the device driver source code, a different set of names are
+used for these same concepts, so that
+
+"empty" == SPIDER_NET_DESCR_CARDOWNED == 0xa
+"full"  == SPIDER_NET_DESCR_FRAME_END == 0x4
+"not in use" == SPIDER_NET_DESCR_NOT_IN_USE == 0xf
+
+
+The RX RAM full bug/feature
+===========================
+
+As long as the OS can empty out the RX buffers at a rate faster than
+the hardware can fill them, there is no problem. If, for some reason,
+the OS fails to empty the RX ring fast enough, the hardware GDACTDPA
+pointer will catch up to the head, notice the not-empty condition,
+ad stop. However, RX packets may still continue arriving on the wire.
+The spidernet chip can save some limited number of these in local RAM.
+When this local ram fills up, the spider chip will issue an interrupt
+indicating this (GHIINT0STS will show ERRINT, and the GRMFLLINT bit
+will be set in GHIINT1STS).  When the RX ram full condition occurs,
+a certain bug/feature is triggered that has to be specially handled.
+This section describes the special handling for this condition.
+
+When the OS finally has a chance to run, it will empty out the RX ring.
+In particular, it will clear the descriptor on which the hardware had
+stopped. However, once the hardware has decided that a certain
+descriptor is invalid, it will not restart at that descriptor; instead
+it will restart at the next descr. This potentially will lead to a
+deadlock condition, as the tail pointer will be pointing at this descr,
+which, from the OS point of view, is empty; the OS will be waiting for
+this descr to be filled. However, the hardware has skipped this descr,
+and is filling the next descrs. Since the OS doesn't see this, there
+is a potential deadlock, with the OS waiting for one descr to fill,
+while the hardware is waiting for a different set of descrs to become
+empty.
+
+A call to show_rx_chain() at this point indicates the nature of the
+problem. A typical print when the network is hung shows the following:
+
+net eth1: Spider RX RAM full, incoming packets might be discarded!
+net eth1: Total number of descrs=256
+net eth1: Chain tail located at descr=255
+net eth1: Chain head is at 255
+net eth1: HW curr desc (GDACTDPA) is at 0
+net eth1: Have 1 descrs with stat=xa0800000
+net eth1: HW next desc (GDACNEXTDA) is at 1
+net eth1: Have 127 descrs with stat=x40800101
+net eth1: Have 1 descrs with stat=x40800001
+net eth1: Have 126 descrs with stat=x40800101
+net eth1: Last 1 descrs with stat=xa0800000
+
+Both the tail and head pointers are pointing at descr 255, which is
+marked xa... which is "empty". Thus, from the OS point of view, there
+is nothing to be done. In particular, there is the implicit assumption
+that everything in front of the "empty" descr must surely also be empty,
+as explained in the last section. The OS is waiting for descr 255 to
+become non-empty, which, in this case, will never happen.
+
+The HW pointer is at descr 0. This descr is marked 0x4.. or "full".
+Since its already full, the hardware can do nothing more, and thus has
+halted processing. Notice that descrs 0 through 254 are all marked
+"full", while descr 254 and 255 are empty. (The "Last 1 descrs" is
+descr 254, since tail was at 255.) Thus, the system is deadlocked,
+and there can be no forward progress; the OS thinks there's nothing
+to do, and the hardware has nowhere to put incoming data.
+
+This bug/feature is worked around with the spider_net_resync_head_ptr()
+routine. When the driver receives RX interrupts, but an examination
+of the RX chain seems to show it is empty, then it is probable that
+the hardware has skipped a descr or two (sometimes dozens under heavy
+network conditions). The spider_net_resync_head_ptr() subroutine will
+search the ring for the next full descr, and the driver will resume
+operations there.  Since this will leave "holes" in the ring, there
+is also a spider_net_resync_tail_ptr() that will skip over such holes.
+
+As of this writing, the spider_net_resync() strategy seems to work very
+well, even under heavy network loads.
+
+
+The TX ring
+===========
+The TX ring uses a low-watermark interrupt scheme to make sure that
+the TX queue is appropriately serviced for large packet sizes.
+
+For packet sizes greater than about 1KBytes, the kernel can fill
+the TX ring quicker than the device can drain it. Once the ring
+is full, the netdev is stopped. When there is room in the ring,
+the netdev needs to be reawakened, so that more TX packets are placed
+in the ring. The hardware can empty the ring about four times per jiffy,
+so its not appropriate to wait for the poll routine to refill, since
+the poll routine runs only once per jiffy.  The low-watermark mechanism
+marks a descr about 1/4th of the way from the bottom of the queue, so
+that an interrupt is generated when the descr is processed. This
+interrupt wakes up the netdev, which can then refill the queue.
+For large packets, this mechanism generates a relatively small number
+of interrupts, about 1K/sec. For smaller packets, this will drop to zero
+interrupts, as the hardware can empty the queue faster than the kernel
+can fill it.
+
+
+ ======= END OF DOCUMENT ========
+
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
new file mode 100644
index 000000000..2bb07078f
--- /dev/null
+++ b/Documentation/networking/stmmac.txt
@@ -0,0 +1,401 @@
+       STMicroelectronics 10/100/1000 Synopsys Ethernet driver
+
+Copyright (C) 2007-2015  STMicroelectronics Ltd
+Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
+
+This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
+(Synopsys IP blocks).
+
+Currently this network device driver is for all STi embedded MAC/GMAC
+(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XLINX XC2V3000
+FF1152AMT0221 D1215994A VIRTEX FPGA board.
+
+DWC Ether MAC 10/100/1000 Universal version 3.70a (and older) and DWC Ether
+MAC 10/100 Universal version 4.0 have been used for developing this driver.
+
+This driver supports both the platform bus and PCI.
+
+Please, for more information also visit: www.stlinux.com
+
+1) Kernel Configuration
+The kernel configuration option is STMMAC_ETH:
+ Device Drivers ---> Network device support ---> Ethernet (1000 Mbit) --->
+ STMicroelectronics 10/100/1000 Ethernet driver (STMMAC_ETH)
+
+CONFIG_STMMAC_PLATFORM: is to enable the platform driver.
+CONFIG_STMMAC_PCI: is to enable the pci driver.
+
+2) Driver parameters list:
+	debug: message level (0: no output, 16: all);
+	phyaddr: to manually provide the physical address to the PHY device;
+	buf_sz: DMA buffer size;
+	tc: control the HW FIFO threshold;
+	watchdog: transmit timeout (in milliseconds);
+	flow_ctrl: Flow control ability [on/off];
+	pause: Flow Control Pause Time;
+	eee_timer: tx EEE timer;
+	chain_mode: select chain mode instead of ring.
+
+3) Command line options
+Driver parameters can be also passed in command line by using:
+	stmmaceth=watchdog:100,chain_mode=1
+
+4) Driver information and notes
+
+4.1) Transmit process
+The xmit method is invoked when the kernel needs to transmit a packet; it sets
+the descriptors in the ring and informs the DMA engine, that there is a packet
+ready to be transmitted.
+By default, the driver sets the NETIF_F_SG bit in the features field of the
+net_device structure, enabling the scatter-gather feature. This is true on
+chips and configurations where the checksum can be done in hardware.
+Once the controller has finished transmitting the packet, timer will be
+scheduled to release the transmit resources.
+
+4.2) Receive process
+When one or more packets are received, an interrupt happens. The interrupts
+are not queued, so the driver has to scan all the descriptors in the ring during
+the receive process.
+This is based on NAPI, so the interrupt handler signals only if there is work
+to be done, and it exits.
+Then the poll method will be scheduled at some future point.
+The incoming packets are stored, by the DMA, in a list of pre-allocated socket
+buffers in order to avoid the memcpy (zero-copy).
+
+4.3) Interrupt mitigation
+The driver is able to mitigate the number of its DMA interrupts
+using NAPI for the reception on chips older than the 3.50.
+New chips have an HW RX-Watchdog used for this mitigation.
+Mitigation parameters can be tuned by ethtool.
+
+4.4) WOL
+Wake up on Lan feature through Magic and Unicast frames are supported for the
+GMAC core.
+
+4.5) DMA descriptors
+Driver handles both normal and alternate descriptors. The latter has been only
+tested on DWC Ether MAC 10/100/1000 Universal version 3.41a and later.
+
+STMMAC supports DMA descriptor to operate both in dual buffer (RING)
+and linked-list(CHAINED) mode. In RING each descriptor points to two
+data buffer pointers whereas in CHAINED mode they point to only one data
+buffer pointer. RING mode is the default.
+
+In CHAINED mode each descriptor will have pointer to next descriptor in
+the list, hence creating the explicit chaining in the descriptor itself,
+whereas such explicit chaining is not possible in RING mode.
+
+4.5.1) Extended descriptors
+The extended descriptors give us information about the Ethernet payload
+when it is carrying PTP packets or TCP/UDP/ICMP over IP.
+These are not available on GMAC Synopsys chips older than the 3.50.
+At probe time the driver will decide if these can be actually used.
+This support also is mandatory for PTPv2 because the extra descriptors
+are used for saving the hardware timestamps and Extended Status.
+
+4.6) Ethtool support
+Ethtool is supported.
+
+For example, driver statistics (including RMON), internal errors can be taken
+using:
+  # ethtool -S ethX
+command
+
+4.7) Jumbo and Segmentation Offloading
+Jumbo frames are supported and tested for the GMAC.
+The GSO has been also added but it's performed in software.
+LRO is not supported.
+
+4.8) Physical
+The driver is compatible with Physical Abstraction Layer to be connected with
+PHY and GPHY devices.
+
+4.9) Platform information
+Several information can be passed through the platform and device-tree.
+
+struct plat_stmmacenet_data {
+	char *phy_bus_name;
+	int bus_id;
+	int phy_addr;
+	int interface;
+	struct stmmac_mdio_bus_data *mdio_bus_data;
+	struct stmmac_dma_cfg *dma_cfg;
+	int clk_csr;
+	int has_gmac;
+	int enh_desc;
+	int tx_coe;
+	int rx_coe;
+	int bugged_jumbo;
+	int pmt;
+	int force_sf_dma_mode;
+	int force_thresh_dma_mode;
+	int riwt_off;
+	int max_speed;
+	int maxmtu;
+	void (*fix_mac_speed)(void *priv, unsigned int speed);
+	void (*bus_setup)(void __iomem *ioaddr);
+	int (*init)(struct platform_device *pdev, void *priv);
+	void (*exit)(struct platform_device *pdev, void *priv);
+	void *bsp_priv;
+	int has_gmac4;
+	bool tso_en;
+};
+
+Where:
+ o phy_bus_name: phy bus name to attach to the stmmac.
+ o bus_id: bus identifier.
+ o phy_addr: the physical address can be passed from the platform.
+	    If it is set to -1 the driver will automatically
+	    detect it at run-time by probing all the 32 addresses.
+ o interface: PHY device's interface.
+ o mdio_bus_data: specific platform fields for the MDIO bus.
+ o dma_cfg: internal DMA parameters
+   o pbl: the Programmable Burst Length is maximum number of beats to
+       be transferred in one DMA transaction.
+       GMAC also enables the 4xPBL by default. (8xPBL for GMAC 3.50 and newer)
+   o txpbl/rxpbl: GMAC and newer supports independent DMA pbl for tx/rx.
+   o pblx8: Enable 8xPBL (4xPBL for core rev < 3.50). Enabled by default.
+   o fixed_burst/mixed_burst/aal
+ o clk_csr: fixed CSR Clock range selection.
+ o has_gmac: uses the GMAC core.
+ o enh_desc: if sets the MAC will use the enhanced descriptor structure.
+ o tx_coe: core is able to perform the tx csum in HW.
+ o rx_coe: the supports three check sum offloading engine types:
+	   type_1, type_2 (full csum) and no RX coe.
+ o bugged_jumbo: some HWs are not able to perform the csum in HW for
+		over-sized frames due to limited buffer sizes.
+		Setting this flag the csum will be done in SW on
+		JUMBO frames.
+ o pmt: core has the embedded power module (optional).
+ o force_sf_dma_mode: force DMA to use the Store and Forward mode
+		     instead of the Threshold.
+ o force_thresh_dma_mode: force DMA to use the Threshold mode other than
+		     the Store and Forward mode.
+ o riwt_off: force to disable the RX watchdog feature and switch to NAPI mode.
+ o fix_mac_speed: this callback is used for modifying some syscfg registers
+		 (on ST SoCs) according to the link speed negotiated by the
+		 physical layer .
+ o bus_setup: perform HW setup of the bus. For example, on some ST platforms
+	     this field is used to configure the AMBA  bridge to generate more
+	     efficient STBus traffic.
+ o init/exit: callbacks used for calling a custom initialization;
+	     this is sometime necessary on some platforms (e.g. ST boxes)
+	     where the HW needs to have set some PIO lines or system cfg
+	     registers.  init/exit callbacks should not use or modify
+	     platform data.
+ o bsp_priv: another private pointer.
+ o has_gmac4: uses GMAC4 core.
+ o tso_en: Enables TSO (TCP Segmentation Offload) feature.
+
+For MDIO bus The we have:
+
+ struct stmmac_mdio_bus_data {
+	int (*phy_reset)(void *priv);
+	unsigned int phy_mask;
+	int *irqs;
+	int probed_phy_irq;
+ };
+
+Where:
+ o phy_reset: hook to reset the phy device attached to the bus.
+ o phy_mask: phy mask passed when register the MDIO bus within the driver.
+ o irqs: list of IRQs, one per PHY.
+ o probed_phy_irq: if irqs is NULL, use this for probed PHY.
+
+For DMA engine we have the following internal fields that should be
+tuned according to the HW capabilities.
+
+struct stmmac_dma_cfg {
+	int pbl;
+	int txpbl;
+	int rxpbl;
+	bool pblx8;
+	int fixed_burst;
+	int mixed_burst;
+	bool aal;
+};
+
+Where:
+ o pbl: Programmable Burst Length (tx and rx)
+ o txpbl: Transmit Programmable Burst Length. Only for GMAC and newer.
+	 If set, DMA tx will use this value rather than pbl.
+ o rxpbl: Receive Programmable Burst Length. Only for GMAC and newer.
+	 If set, DMA rx will use this value rather than pbl.
+ o pblx8: Enable 8xPBL (4xPBL for core rev < 3.50). Enabled by default.
+ o fixed_burst: program the DMA to use the fixed burst mode
+ o mixed_burst: program the DMA to use the mixed burst mode
+ o aal: Address-Aligned Beats
+
+---
+
+Below an example how the structures above are using on ST platforms.
+
+ static struct plat_stmmacenet_data stxYYY_ethernet_platform_data = {
+	.has_gmac = 0,
+	.enh_desc = 0,
+	.fix_mac_speed = stxYYY_ethernet_fix_mac_speed,
+				|
+				|-> to write an internal syscfg
+				|   on this platform when the
+				|   link speed changes from 10 to
+				|   100 and viceversa
+	.init = &stmmac_claim_resource,
+				|
+				|-> On ST SoC this calls own "PAD"
+				|   manager framework to claim
+				|   all the resources necessary
+				|   (GPIO ...). The .custom_cfg field
+				|   is used to pass a custom config.
+};
+
+Below the usage of the stmmac_mdio_bus_data: on this SoC, in fact,
+there are two MAC cores: one MAC is for MDIO Bus/PHY emulation
+with fixed_link support.
+
+static struct stmmac_mdio_bus_data stmmac1_mdio_bus = {
+	.phy_reset = phy_reset;
+		|
+		|-> function to provide the phy_reset on this board
+	.phy_mask = 0,
+};
+
+static struct fixed_phy_status stmmac0_fixed_phy_status = {
+	.link = 1,
+	.speed = 100,
+	.duplex = 1,
+};
+
+During the board's device_init we can configure the first
+MAC for fixed_link by calling:
+  fixed_phy_add(PHY_POLL, 1, &stmmac0_fixed_phy_status, -1);
+and the second one, with a real PHY device attached to the bus,
+by using the stmmac_mdio_bus_data structure (to provide the id, the
+reset procedure etc).
+
+Note that, starting from new chips, where it is available the HW capability
+register, many configurations are discovered at run-time for example to
+understand if EEE, HW csum, PTP, enhanced descriptor etc are actually
+available. As strategy adopted in this driver, the information from the HW
+capability register can replace what has been passed from the platform.
+
+4.10) Device-tree support.
+
+Please see the following document:
+	Documentation/devicetree/bindings/net/stmmac.txt
+
+4.11) This is a summary of the content of some relevant files:
+ o stmmac_main.c: implements the main network device driver;
+ o stmmac_mdio.c: provides MDIO functions;
+ o stmmac_pci: this is the PCI driver;
+ o stmmac_platform.c: this the platform driver (OF supported);
+ o stmmac_ethtool.c: implements the ethtool support;
+ o stmmac.h: private driver structure;
+ o common.h: common definitions and VFTs;
+ o mmc_core.c/mmc.h: Management MAC Counters;
+ o stmmac_hwtstamp.c: HW timestamp support for PTP;
+ o stmmac_ptp.c: PTP 1588 clock;
+ o stmmac_pcs.h: Physical Coding Sublayer common implementation;
+ o dwmac-<XXX>.c: these are for the platform glue-logic file; e.g. dwmac-sti.c
+   for STMicroelectronics SoCs.
+
+- GMAC 3.x
+ o descs.h: descriptor structure definitions;
+ o dwmac1000_core.c: dwmac GiGa core functions;
+ o dwmac1000_dma.c: dma functions for the GMAC chip;
+ o dwmac1000.h: specific header file for the dwmac GiGa;
+ o dwmac100_core: dwmac 100 core code;
+ o dwmac100_dma.c: dma functions for the dwmac 100 chip;
+ o dwmac1000.h: specific header file for the MAC;
+ o dwmac_lib.c: generic DMA functions;
+ o enh_desc.c: functions for handling enhanced descriptors;
+ o norm_desc.c: functions for handling normal descriptors;
+ o chain_mode.c/ring_mode.c:: functions to manage RING/CHAINED modes;
+
+- GMAC4.x generation
+ o dwmac4_core.c: dwmac GMAC4.x core functions;
+ o dwmac4_desc.c: functions for handling GMAC4.x descriptors;
+ o dwmac4_descs.h: descriptor definitions;
+ o dwmac4_dma.c: dma functions for the GMAC4.x chip;
+ o dwmac4_dma.h: dma definitions for the GMAC4.x chip;
+ o dwmac4.h: core definitions for the GMAC4.x chip;
+ o dwmac4_lib.c: generic GMAC4.x functions;
+
+4.12) TSO support (GMAC4.x)
+
+TSO (Tcp Segmentation Offload) feature is supported by GMAC 4.x chip family.
+When a packet is sent through TCP protocol, the TCP stack ensures that
+the SKB provided to the low level driver (stmmac in our case) matches with
+the maximum frame len (IP header + TCP header + payload <= 1500 bytes (for
+MTU set to 1500)). It means that if an application using TCP want to send a
+packet which will have a length (after adding headers) > 1514 the packet
+will be split in several TCP packets: The data payload is split and headers
+(TCP/IP ..) are added. It is done by software.
+
+When TSO is enabled, the TCP stack doesn't care about the maximum frame
+length and provide SKB packet to stmmac as it is. The GMAC IP will have to
+perform the segmentation by it self to match with maximum frame length.
+
+This feature can be enabled in device tree through "snps,tso" entry.
+
+5) Debug Information
+
+The driver exports many information i.e. internal statistics,
+debug information, MAC and DMA registers etc.
+
+These can be read in several ways depending on the
+type of the information actually needed.
+
+For example a user can be use the ethtool support
+to get statistics: e.g. using: ethtool -S ethX
+(that shows the Management counters (MMC) if supported)
+or sees the MAC/DMA registers: e.g. using: ethtool -d ethX
+
+Compiling the Kernel with CONFIG_DEBUG_FS the driver will export the following
+debugfs entries:
+
+/sys/kernel/debug/stmmaceth/descriptors_status
+  To show the DMA TX/RX descriptor rings
+
+Developer can also use the "debug" module parameter to get further debug
+information (please see: NETIF Msg Level).
+
+6) Energy Efficient Ethernet
+
+Energy Efficient Ethernet(EEE) enables IEEE 802.3 MAC sublayer along
+with a family of Physical layer to operate in the Low power Idle(LPI)
+mode. The EEE mode supports the IEEE 802.3 MAC operation at 100Mbps,
+1000Mbps & 10Gbps.
+
+The LPI mode allows power saving by switching off parts of the
+communication device functionality when there is no data to be
+transmitted & received. The system on both the side of the link can
+disable some functionalities & save power during the period of low-link
+utilization. The MAC controls whether the system should enter or exit
+the LPI mode & communicate this to PHY.
+
+As soon as the interface is opened, the driver verifies if the EEE can
+be supported. This is done by looking at both the DMA HW capability
+register and the PHY devices MCD registers.
+To enter in Tx LPI mode the driver needs to have a software timer
+that enable and disable the LPI mode when there is nothing to be
+transmitted.
+
+7) Precision Time Protocol (PTP)
+The driver supports the IEEE 1588-2002, Precision Time Protocol (PTP),
+which enables precise synchronization of clocks in measurement and
+control systems implemented with technologies such as network
+communication.
+
+In addition to the basic timestamp features mentioned in IEEE 1588-2002
+Timestamps, new GMAC cores support the advanced timestamp features.
+IEEE 1588-2008 that can be enabled when configure the Kernel.
+
+8) SGMII/RGMII support
+New GMAC devices provide own way to manage RGMII/SGMII.
+This information is available at run-time by looking at the
+HW capability register. This means that the stmmac can manage
+auto-negotiation and link status w/o using the PHYLIB stuff.
+In fact, the HW provides a subset of extended registers to
+restart the ANE, verify Full/Half duplex mode and Speed.
+Thanks to these registers, it is possible to look at the
+Auto-negotiated Link Parter Ability.
diff --git a/Documentation/networking/strparser.txt b/Documentation/networking/strparser.txt
new file mode 100644
index 000000000..a7d354ddd
--- /dev/null
+++ b/Documentation/networking/strparser.txt
@@ -0,0 +1,207 @@
+Stream Parser (strparser)
+
+Introduction
+============
+
+The stream parser (strparser) is a utility that parses messages of an
+application layer protocol running over a data stream. The stream
+parser works in conjunction with an upper layer in the kernel to provide
+kernel support for application layer messages. For instance, Kernel
+Connection Multiplexor (KCM) uses the Stream Parser to parse messages
+using a BPF program.
+
+The strparser works in one of two modes: receive callback or general
+mode.
+
+In receive callback mode, the strparser is called from the data_ready
+callback of a TCP socket. Messages are parsed and delivered as they are
+received on the socket.
+
+In general mode, a sequence of skbs are fed to strparser from an
+outside source. Message are parsed and delivered as the sequence is
+processed. This modes allows strparser to be applied to arbitrary
+streams of data.
+
+Interface
+=========
+
+The API includes a context structure, a set of callbacks, utility
+functions, and a data_ready function for receive callback mode. The
+callbacks include a parse_msg function that is called to perform
+parsing (e.g.  BPF parsing in case of KCM), and a rcv_msg function
+that is called when a full message has been completed.
+
+Functions
+=========
+
+strp_init(struct strparser *strp, struct sock *sk,
+	  const struct strp_callbacks *cb)
+
+     Called to initialize a stream parser. strp is a struct of type
+     strparser that is allocated by the upper layer. sk is the TCP
+     socket associated with the stream parser for use with receive
+     callback mode; in general mode this is set to NULL. Callbacks
+     are called by the stream parser (the callbacks are listed below).
+
+void strp_pause(struct strparser *strp)
+
+     Temporarily pause a stream parser. Message parsing is suspended
+     and no new messages are delivered to the upper layer.
+
+void strp_unpause(struct strparser *strp)
+
+     Unpause a paused stream parser.
+
+void strp_stop(struct strparser *strp);
+
+     strp_stop is called to completely stop stream parser operations.
+     This is called internally when the stream parser encounters an
+     error, and it is called from the upper layer to stop parsing
+     operations.
+
+void strp_done(struct strparser *strp);
+
+     strp_done is called to release any resources held by the stream
+     parser instance. This must be called after the stream processor
+     has been stopped.
+
+int strp_process(struct strparser *strp, struct sk_buff *orig_skb,
+		 unsigned int orig_offset, size_t orig_len,
+		 size_t max_msg_size, long timeo)
+
+    strp_process is called in general mode for a stream parser to
+    parse an sk_buff. The number of bytes processed or a negative
+    error number is returned. Note that strp_process does not
+    consume the sk_buff. max_msg_size is maximum size the stream
+    parser will parse. timeo is timeout for completing a message.
+
+void strp_data_ready(struct strparser *strp);
+
+    The upper layer calls strp_tcp_data_ready when data is ready on
+    the lower socket for strparser to process. This should be called
+    from a data_ready callback that is set on the socket. Note that
+    maximum messages size is the limit of the receive socket
+    buffer and message timeout is the receive timeout for the socket.
+
+void strp_check_rcv(struct strparser *strp);
+
+    strp_check_rcv is called to check for new messages on the socket.
+    This is normally called at initialization of a stream parser
+    instance or after strp_unpause.
+
+Callbacks
+=========
+
+There are six callbacks:
+
+int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);
+
+    parse_msg is called to determine the length of the next message
+    in the stream. The upper layer must implement this function. It
+    should parse the sk_buff as containing the headers for the
+    next application layer message in the stream.
+
+    The skb->cb in the input skb is a struct strp_msg. Only
+    the offset field is relevant in parse_msg and gives the offset
+    where the message starts in the skb.
+
+    The return values of this function are:
+
+    >0 : indicates length of successfully parsed message
+    0  : indicates more data must be received to parse the message
+    -ESTRPIPE : current message should not be processed by the
+          kernel, return control of the socket to userspace which
+          can proceed to read the messages itself
+    other < 0 : Error in parsing, give control back to userspace
+          assuming that synchronization is lost and the stream
+          is unrecoverable (application expected to close TCP socket)
+
+    In the case that an error is returned (return value is less than
+    zero) and the parser is in receive callback mode, then it will set
+    the error on TCP socket and wake it up. If parse_msg returned
+    -ESTRPIPE and the stream parser had previously read some bytes for
+    the current message, then the error set on the attached socket is
+    ENODATA since the stream is unrecoverable in that case.
+
+void (*lock)(struct strparser *strp)
+
+    The lock callback is called to lock the strp structure when
+    the strparser is performing an asynchronous operation (such as
+    processing a timeout). In receive callback mode the default
+    function is to lock_sock for the associated socket. In general
+    mode the callback must be set appropriately.
+
+void (*unlock)(struct strparser *strp)
+
+    The unlock callback is called to release the lock obtained
+    by the lock callback. In receive callback mode the default
+    function is release_sock for the associated socket. In general
+    mode the callback must be set appropriately.
+
+void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);
+
+    rcv_msg is called when a full message has been received and
+    is queued. The callee must consume the sk_buff; it can
+    call strp_pause to prevent any further messages from being
+    received in rcv_msg (see strp_pause above). This callback
+    must be set.
+
+    The skb->cb in the input skb is a struct strp_msg. This
+    struct contains two fields: offset and full_len. Offset is
+    where the message starts in the skb, and full_len is the
+    the length of the message. skb->len - offset may be greater
+    then full_len since strparser does not trim the skb.
+
+int (*read_sock_done)(struct strparser *strp, int err);
+
+     read_sock_done is called when the stream parser is done reading
+     the TCP socket in receive callback mode. The stream parser may
+     read multiple messages in a loop and this function allows cleanup
+     to occur when exiting the loop. If the callback is not set (NULL
+     in strp_init) a default function is used.
+
+void (*abort_parser)(struct strparser *strp, int err);
+
+     This function is called when stream parser encounters an error
+     in parsing. The default function stops the stream parser and
+     sets the error in the socket if the parser is in receive callback
+     mode. The default function can be changed by setting the callback
+     to non-NULL in strp_init.
+
+Statistics
+==========
+
+Various counters are kept for each stream parser instance. These are in
+the strp_stats structure. strp_aggr_stats is a convenience structure for
+accumulating statistics for multiple stream parser instances.
+save_strp_stats and aggregate_strp_stats are helper functions to save
+and aggregate statistics.
+
+Message assembly limits
+=======================
+
+The stream parser provide mechanisms to limit the resources consumed by
+message assembly.
+
+A timer is set when assembly starts for a new message. In receive
+callback mode the message timeout is taken from rcvtime for the
+associated TCP socket. In general mode, the timeout is passed as an
+argument in strp_process. If the timer fires before assembly completes
+the stream parser is aborted and the ETIMEDOUT error is set on the TCP
+socket if in receive callback mode.
+
+In receive callback mode, message length is limited to the receive
+buffer size of the associated TCP socket. If the length returned by
+parse_msg is greater than the socket buffer size then the stream parser
+is aborted with EMSGSIZE error set on the TCP socket. Note that this
+makes the maximum size of receive skbuffs for a socket with a stream
+parser to be 2*sk_rcvbuf of the TCP socket.
+
+In general mode the message length limit is passed in as an argument
+to strp_process.
+
+Author
+======
+
+Tom Herbert (tom@quantonium.net)
+
diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
new file mode 100644
index 000000000..82236a17b
--- /dev/null
+++ b/Documentation/networking/switchdev.txt
@@ -0,0 +1,394 @@
+Ethernet switch device driver model (switchdev)
+===============================================
+Copyright (c) 2014 Jiri Pirko <jiri@resnulli.us>
+Copyright (c) 2014-2015 Scott Feldman <sfeldma@gmail.com>
+
+
+The Ethernet switch device driver model (switchdev) is an in-kernel driver
+model for switch devices which offload the forwarding (data) plane from the
+kernel.
+
+Figure 1 is a block diagram showing the components of the switchdev model for
+an example setup using a data-center-class switch ASIC chip.  Other setups
+with SR-IOV or soft switches, such as OVS, are possible.
+
+
+                             User-space tools
+
+       user space                   |
+      +-------------------------------------------------------------------+
+       kernel                       | Netlink
+                                    |
+                     +--------------+-------------------------------+
+                     |         Network stack                        |
+                     |           (Linux)                            |
+                     |                                              |
+                     +----------------------------------------------+
+
+                           sw1p2     sw1p4     sw1p6
+                      sw1p1  +  sw1p3  +  sw1p5  +          eth1
+                        +    |    +    |    +    |            +
+                        |    |    |    |    |    |            |
+                     +--+----+----+----+----+----+---+  +-----+-----+
+                     |         Switch driver         |  |    mgmt   |
+                     |        (this document)        |  |   driver  |
+                     |                               |  |           |
+                     +--------------+----------------+  +-----------+
+                                    |
+       kernel                       | HW bus (eg PCI)
+      +-------------------------------------------------------------------+
+       hardware                     |
+                     +--------------+----------------+
+                     |         Switch device (sw1)   |
+                     |  +----+                       +--------+
+                     |  |    v offloaded data path   | mgmt port
+                     |  |    |                       |
+                     +--|----|----+----+----+----+---+
+                        |    |    |    |    |    |
+                        +    +    +    +    +    +
+                       p1   p2   p3   p4   p5   p6
+
+                             front-panel ports
+
+
+                                    Fig 1.
+
+
+Include Files
+-------------
+
+#include <linux/netdevice.h>
+#include <net/switchdev.h>
+
+
+Configuration
+-------------
+
+Use "depends NET_SWITCHDEV" in driver's Kconfig to ensure switchdev model
+support is built for driver.
+
+
+Switch Ports
+------------
+
+On switchdev driver initialization, the driver will allocate and register a
+struct net_device (using register_netdev()) for each enumerated physical switch
+port, called the port netdev.  A port netdev is the software representation of
+the physical port and provides a conduit for control traffic to/from the
+controller (the kernel) and the network, as well as an anchor point for higher
+level constructs such as bridges, bonds, VLANs, tunnels, and L3 routers.  Using
+standard netdev tools (iproute2, ethtool, etc), the port netdev can also
+provide to the user access to the physical properties of the switch port such
+as PHY link state and I/O statistics.
+
+There is (currently) no higher-level kernel object for the switch beyond the
+port netdevs.  All of the switchdev driver ops are netdev ops or switchdev ops.
+
+A switch management port is outside the scope of the switchdev driver model.
+Typically, the management port is not participating in offloaded data plane and
+is loaded with a different driver, such as a NIC driver, on the management port
+device.
+
+Switch ID
+^^^^^^^^^
+
+The switchdev driver must implement the switchdev op switchdev_port_attr_get
+for SWITCHDEV_ATTR_ID_PORT_PARENT_ID for each port netdev, returning the same
+physical ID for each port of a switch.  The ID must be unique between switches
+on the same system.  The ID does not need to be unique between switches on
+different systems.
+
+The switch ID is used to locate ports on a switch and to know if aggregated
+ports belong to the same switch.
+
+Port Netdev Naming
+^^^^^^^^^^^^^^^^^^
+
+Udev rules should be used for port netdev naming, using some unique attribute
+of the port as a key, for example the port MAC address or the port PHYS name.
+Hard-coding of kernel netdev names within the driver is discouraged; let the
+kernel pick the default netdev name, and let udev set the final name based on a
+port attribute.
+
+Using port PHYS name (ndo_get_phys_port_name) for the key is particularly
+useful for dynamically-named ports where the device names its ports based on
+external configuration.  For example, if a physical 40G port is split logically
+into 4 10G ports, resulting in 4 port netdevs, the device can give a unique
+name for each port using port PHYS name.  The udev rule would be:
+
+SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
+	ATTR{phys_port_name}!="", NAME="swX$attr{phys_port_name}"
+
+Suggested naming convention is "swXpYsZ", where X is the switch name or ID, Y
+is the port name or ID, and Z is the sub-port name or ID.  For example, sw1p1s0
+would be sub-port 0 on port 1 on switch 1.
+
+Port Features
+^^^^^^^^^^^^^
+
+NETIF_F_NETNS_LOCAL
+
+If the switchdev driver (and device) only supports offloading of the default
+network namespace (netns), the driver should set this feature flag to prevent
+the port netdev from being moved out of the default netns.  A netns-aware
+driver/device would not set this flag and be responsible for partitioning
+hardware to preserve netns containment.  This means hardware cannot forward
+traffic from a port in one namespace to another port in another namespace.
+
+Port Topology
+^^^^^^^^^^^^^
+
+The port netdevs representing the physical switch ports can be organized into
+higher-level switching constructs.  The default construct is a standalone
+router port, used to offload L3 forwarding.  Two or more ports can be bonded
+together to form a LAG.  Two or more ports (or LAGs) can be bridged to bridge
+L2 networks.  VLANs can be applied to sub-divide L2 networks.  L2-over-L3
+tunnels can be built on ports.  These constructs are built using standard Linux
+tools such as the bridge driver, the bonding/team drivers, and netlink-based
+tools such as iproute2.
+
+The switchdev driver can know a particular port's position in the topology by
+monitoring NETDEV_CHANGEUPPER notifications.  For example, a port moved into a
+bond will see it's upper master change.  If that bond is moved into a bridge,
+the bond's upper master will change.  And so on.  The driver will track such
+movements to know what position a port is in in the overall topology by
+registering for netdevice events and acting on NETDEV_CHANGEUPPER.
+
+L2 Forwarding Offload
+---------------------
+
+The idea is to offload the L2 data forwarding (switching) path from the kernel
+to the switchdev device by mirroring bridge FDB entries down to the device.  An
+FDB entry is the {port, MAC, VLAN} tuple forwarding destination.
+
+To offloading L2 bridging, the switchdev driver/device should support:
+
+	- Static FDB entries installed on a bridge port
+	- Notification of learned/forgotten src mac/vlans from device
+	- STP state changes on the port
+	- VLAN flooding of multicast/broadcast and unknown unicast packets
+
+Static FDB Entries
+^^^^^^^^^^^^^^^^^^
+
+The switchdev driver should implement ndo_fdb_add, ndo_fdb_del and ndo_fdb_dump
+to support static FDB entries installed to the device.  Static bridge FDB
+entries are installed, for example, using iproute2 bridge cmd:
+
+	bridge fdb add ADDR dev DEV [vlan VID] [self]
+
+The driver should use the helper switchdev_port_fdb_xxx ops for ndo_fdb_xxx
+ops, and handle add/delete/dump of SWITCHDEV_OBJ_ID_PORT_FDB object using
+switchdev_port_obj_xxx ops.
+
+XXX: what should be done if offloading this rule to hardware fails (for
+example, due to full capacity in hardware tables) ?
+
+Note: by default, the bridge does not filter on VLAN and only bridges untagged
+traffic.  To enable VLAN support, turn on VLAN filtering:
+
+	echo 1 >/sys/class/net/<bridge>/bridge/vlan_filtering
+
+Notification of Learned/Forgotten Source MAC/VLANs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The switch device will learn/forget source MAC address/VLAN on ingress packets
+and notify the switch driver of the mac/vlan/port tuples.  The switch driver,
+in turn, will notify the bridge driver using the switchdev notifier call:
+
+	err = call_switchdev_notifiers(val, dev, info);
+
+Where val is SWITCHDEV_FDB_ADD when learning and SWITCHDEV_FDB_DEL when
+forgetting, and info points to a struct switchdev_notifier_fdb_info.  On
+SWITCHDEV_FDB_ADD, the bridge driver will install the FDB entry into the
+bridge's FDB and mark the entry as NTF_EXT_LEARNED.  The iproute2 bridge
+command will label these entries "offload":
+
+	$ bridge fdb
+	52:54:00:12:35:01 dev sw1p1 master br0 permanent
+	00:02:00:00:02:00 dev sw1p1 master br0 offload
+	00:02:00:00:02:00 dev sw1p1 self
+	52:54:00:12:35:02 dev sw1p2 master br0 permanent
+	00:02:00:00:03:00 dev sw1p2 master br0 offload
+	00:02:00:00:03:00 dev sw1p2 self
+	33:33:00:00:00:01 dev eth0 self permanent
+	01:00:5e:00:00:01 dev eth0 self permanent
+	33:33:ff:00:00:00 dev eth0 self permanent
+	01:80:c2:00:00:0e dev eth0 self permanent
+	33:33:00:00:00:01 dev br0 self permanent
+	01:00:5e:00:00:01 dev br0 self permanent
+	33:33:ff:12:35:01 dev br0 self permanent
+
+Learning on the port should be disabled on the bridge using the bridge command:
+
+	bridge link set dev DEV learning off
+
+Learning on the device port should be enabled, as well as learning_sync:
+
+	bridge link set dev DEV learning on self
+	bridge link set dev DEV learning_sync on self
+
+Learning_sync attribute enables syncing of the learned/forgotten FDB entry to
+the bridge's FDB.  It's possible, but not optimal, to enable learning on the
+device port and on the bridge port, and disable learning_sync.
+
+To support learning and learning_sync port attributes, the driver implements
+switchdev op switchdev_port_attr_get/set for
+SWITCHDEV_ATTR_PORT_ID_BRIDGE_FLAGS. The driver should initialize the attributes
+to the hardware defaults.
+
+FDB Ageing
+^^^^^^^^^^
+
+The bridge will skip ageing FDB entries marked with NTF_EXT_LEARNED and it is
+the responsibility of the port driver/device to age out these entries.  If the
+port device supports ageing, when the FDB entry expires, it will notify the
+driver which in turn will notify the bridge with SWITCHDEV_FDB_DEL.  If the
+device does not support ageing, the driver can simulate ageing using a
+garbage collection timer to monitor FDB entries.  Expired entries will be
+notified to the bridge using SWITCHDEV_FDB_DEL.  See rocker driver for
+example of driver running ageing timer.
+
+To keep an NTF_EXT_LEARNED entry "alive", the driver should refresh the FDB
+entry by calling call_switchdev_notifiers(SWITCHDEV_FDB_ADD, ...).  The
+notification will reset the FDB entry's last-used time to now.  The driver
+should rate limit refresh notifications, for example, no more than once a
+second.  (The last-used time is visible using the bridge -s fdb option).
+
+STP State Change on Port
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Internally or with a third-party STP protocol implementation (e.g. mstpd), the
+bridge driver maintains the STP state for ports, and will notify the switch
+driver of STP state change on a port using the switchdev op
+switchdev_attr_port_set for SWITCHDEV_ATTR_PORT_ID_STP_UPDATE.
+
+State is one of BR_STATE_*.  The switch driver can use STP state updates to
+update ingress packet filter list for the port.  For example, if port is
+DISABLED, no packets should pass, but if port moves to BLOCKED, then STP BPDUs
+and other IEEE 01:80:c2:xx:xx:xx link-local multicast packets can pass.
+
+Note that STP BDPUs are untagged and STP state applies to all VLANs on the port
+so packet filters should be applied consistently across untagged and tagged
+VLANs on the port.
+
+Flooding L2 domain
+^^^^^^^^^^^^^^^^^^
+
+For a given L2 VLAN domain, the switch device should flood multicast/broadcast
+and unknown unicast packets to all ports in domain, if allowed by port's
+current STP state.  The switch driver, knowing which ports are within which
+vlan L2 domain, can program the switch device for flooding.  The packet may
+be sent to the port netdev for processing by the bridge driver.  The
+bridge should not reflood the packet to the same ports the device flooded,
+otherwise there will be duplicate packets on the wire.
+
+To avoid duplicate packets, the switch driver should mark a packet as already
+forwarded by setting the skb->offload_fwd_mark bit. The bridge driver will mark
+the skb using the ingress bridge port's mark and prevent it from being forwarded
+through any bridge port with the same mark.
+
+It is possible for the switch device to not handle flooding and push the
+packets up to the bridge driver for flooding.  This is not ideal as the number
+of ports scale in the L2 domain as the device is much more efficient at
+flooding packets that software.
+
+If supported by the device, flood control can be offloaded to it, preventing
+certain netdevs from flooding unicast traffic for which there is no FDB entry.
+
+IGMP Snooping
+^^^^^^^^^^^^^
+
+In order to support IGMP snooping, the port netdevs should trap to the bridge
+driver all IGMP join and leave messages.
+The bridge multicast module will notify port netdevs on every multicast group
+changed whether it is static configured or dynamically joined/leave.
+The hardware implementation should be forwarding all registered multicast
+traffic groups only to the configured ports.
+
+L3 Routing Offload
+------------------
+
+Offloading L3 routing requires that device be programmed with FIB entries from
+the kernel, with the device doing the FIB lookup and forwarding.  The device
+does a longest prefix match (LPM) on FIB entries matching route prefix and
+forwards the packet to the matching FIB entry's nexthop(s) egress ports.
+
+To program the device, the driver has to register a FIB notifier handler
+using register_fib_notifier. The following events are available:
+FIB_EVENT_ENTRY_ADD: used for both adding a new FIB entry to the device,
+                     or modifying an existing entry on the device.
+FIB_EVENT_ENTRY_DEL: used for removing a FIB entry
+FIB_EVENT_RULE_ADD, FIB_EVENT_RULE_DEL: used to propagate FIB rule changes
+
+FIB_EVENT_ENTRY_ADD and FIB_EVENT_ENTRY_DEL events pass:
+
+	struct fib_entry_notifier_info {
+		struct fib_notifier_info info; /* must be first */
+		u32 dst;
+		int dst_len;
+		struct fib_info *fi;
+		u8 tos;
+		u8 type;
+		u32 tb_id;
+		u32 nlflags;
+	};
+
+to add/modify/delete IPv4 dst/dest_len prefix on table tb_id.  The *fi
+structure holds details on the route and route's nexthops.  *dev is one of the
+port netdevs mentioned in the route's next hop list.
+
+Routes offloaded to the device are labeled with "offload" in the ip route
+listing:
+
+	$ ip route show
+	default via 192.168.0.2 dev eth0
+	11.0.0.0/30 dev sw1p1  proto kernel  scope link  src 11.0.0.2 offload
+	11.0.0.4/30 via 11.0.0.1 dev sw1p1  proto zebra  metric 20 offload
+	11.0.0.8/30 dev sw1p2  proto kernel  scope link  src 11.0.0.10 offload
+	11.0.0.12/30 via 11.0.0.9 dev sw1p2  proto zebra  metric 20 offload
+	12.0.0.2  proto zebra  metric 30 offload
+		nexthop via 11.0.0.1  dev sw1p1 weight 1
+		nexthop via 11.0.0.9  dev sw1p2 weight 1
+	12.0.0.3 via 11.0.0.1 dev sw1p1  proto zebra  metric 20 offload
+	12.0.0.4 via 11.0.0.9 dev sw1p2  proto zebra  metric 20 offload
+	192.168.0.0/24 dev eth0  proto kernel  scope link  src 192.168.0.15
+
+The "offload" flag is set in case at least one device offloads the FIB entry.
+
+XXX: add/mod/del IPv6 FIB API
+
+Nexthop Resolution
+^^^^^^^^^^^^^^^^^^
+
+The FIB entry's nexthop list contains the nexthop tuple (gateway, dev), but for
+the switch device to forward the packet with the correct dst mac address, the
+nexthop gateways must be resolved to the neighbor's mac address.  Neighbor mac
+address discovery comes via the ARP (or ND) process and is available via the
+arp_tbl neighbor table.  To resolve the routes nexthop gateways, the driver
+should trigger the kernel's neighbor resolution process.  See the rocker
+driver's rocker_port_ipv4_resolve() for an example.
+
+The driver can monitor for updates to arp_tbl using the netevent notifier
+NETEVENT_NEIGH_UPDATE.  The device can be programmed with resolved nexthops
+for the routes as arp_tbl updates.  The driver implements ndo_neigh_destroy
+to know when arp_tbl neighbor entries are purged from the port.
+
+Transaction item queue
+^^^^^^^^^^^^^^^^^^^^^^
+
+For switchdev ops attr_set and obj_add, there is a 2 phase transaction model
+used. First phase is to "prepare" anything needed, including various checks,
+memory allocation, etc. The goal is to handle the stuff that is not unlikely
+to fail here. The second phase is to "commit" the actual changes.
+
+Switchdev provides an infrastructure for sharing items (for example memory
+allocations) between the two phases.
+
+The object created by a driver in "prepare" phase and it is queued up by:
+switchdev_trans_item_enqueue()
+During the "commit" phase, the driver gets the object by:
+switchdev_trans_item_dequeue()
+
+If a transaction is aborted during "prepare" phase, switchdev code will handle
+cleanup of the queued-up objects.
diff --git a/Documentation/networking/tc-actions-env-rules.txt b/Documentation/networking/tc-actions-env-rules.txt
new file mode 100644
index 000000000..f37814693
--- /dev/null
+++ b/Documentation/networking/tc-actions-env-rules.txt
@@ -0,0 +1,24 @@
+
+The "environmental" rules for authors of any new tc actions are:
+
+1) If you stealeth or borroweth any packet thou shalt be branching
+from the righteous path and thou shalt cloneth.
+
+For example if your action queues a packet to be processed later,
+or intentionally branches by redirecting a packet, then you need to
+clone the packet.
+
+2) If you munge any packet thou shalt call pskb_expand_head in the case
+someone else is referencing the skb. After that you "own" the skb.
+
+3) Dropping packets you don't own is a no-no. You simply return
+TC_ACT_SHOT to the caller and they will drop it.
+
+The "environmental" rules for callers of actions (qdiscs etc) are:
+
+*) Thou art responsible for freeing anything returned as being
+TC_ACT_SHOT/STOLEN/QUEUED. If none of TC_ACT_SHOT/STOLEN/QUEUED is
+returned, then all is great and you don't need to do anything.
+
+Post on netdev if something is unclear.
+
diff --git a/Documentation/networking/tcp-thin.txt b/Documentation/networking/tcp-thin.txt
new file mode 100644
index 000000000..151e22998
--- /dev/null
+++ b/Documentation/networking/tcp-thin.txt
@@ -0,0 +1,47 @@
+Thin-streams and TCP
+====================
+A wide range of Internet-based services that use reliable transport
+protocols display what we call thin-stream properties. This means
+that the application sends data with such a low rate that the
+retransmission mechanisms of the transport protocol are not fully
+effective. In time-dependent scenarios (like online games, control
+systems, stock trading etc.) where the user experience depends
+on the data delivery latency, packet loss can be devastating for
+the service quality. Extreme latencies are caused by TCP's
+dependency on the arrival of new data from the application to trigger
+retransmissions effectively through fast retransmit instead of
+waiting for long timeouts.
+
+After analysing a large number of time-dependent interactive
+applications, we have seen that they often produce thin streams
+and also stay with this traffic pattern throughout its entire
+lifespan. The combination of time-dependency and the fact that the
+streams provoke high latencies when using TCP is unfortunate.
+
+In order to reduce application-layer latency when packets are lost,
+a set of mechanisms has been made, which address these latency issues
+for thin streams. In short, if the kernel detects a thin stream,
+the retransmission mechanisms are modified in the following manner:
+
+1) If the stream is thin, fast retransmit on the first dupACK.
+2) If the stream is thin, do not apply exponential backoff.
+
+These enhancements are applied only if the stream is detected as
+thin. This is accomplished by defining a threshold for the number
+of packets in flight. If there are less than 4 packets in flight,
+fast retransmissions can not be triggered, and the stream is prone
+to experience high retransmission latencies.
+
+Since these mechanisms are targeted at time-dependent applications,
+they must be specifically activated by the application using the
+TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK IOCTLS or the
+tcp_thin_linear_timeouts and tcp_thin_dupack sysctls. Both
+modifications are turned off by default.
+
+References
+==========
+More information on the modifications, as well as a wide range of
+experimental data can be found here:
+"Improving latency for interactive, thin-stream applications over
+reliable transport"
+http://simula.no/research/nd/publications/Simula.nd.477/simula_pdf_file
diff --git a/Documentation/networking/tcp.txt b/Documentation/networking/tcp.txt
new file mode 100644
index 000000000..9c7139d57
--- /dev/null
+++ b/Documentation/networking/tcp.txt
@@ -0,0 +1,101 @@
+TCP protocol
+============
+
+Last updated: 3 June 2017
+
+Contents
+========
+
+- Congestion control
+- How the new TCP output machine [nyi] works
+
+Congestion control
+==================
+
+The following variables are used in the tcp_sock for congestion control:
+snd_cwnd		The size of the congestion window
+snd_ssthresh		Slow start threshold. We are in slow start if
+			snd_cwnd is less than this.
+snd_cwnd_cnt		A counter used to slow down the rate of increase
+			once we exceed slow start threshold.
+snd_cwnd_clamp		This is the maximum size that snd_cwnd can grow to.
+snd_cwnd_stamp		Timestamp for when congestion window last validated.
+snd_cwnd_used		Used as a highwater mark for how much of the
+			congestion window is in use. It is used to adjust
+			snd_cwnd down when the link is limited by the
+			application rather than the network.
+
+As of 2.6.13, Linux supports pluggable congestion control algorithms.
+A congestion control mechanism can be registered through functions in
+tcp_cong.c. The functions used by the congestion control mechanism are
+registered via passing a tcp_congestion_ops struct to
+tcp_register_congestion_control. As a minimum, the congestion control
+mechanism must provide a valid name and must implement either ssthresh,
+cong_avoid and undo_cwnd hooks or the "omnipotent" cong_control hook.
+
+Private data for a congestion control mechanism is stored in tp->ca_priv.
+tcp_ca(tp) returns a pointer to this space.  This is preallocated space - it
+is important to check the size of your private data will fit this space, or
+alternatively, space could be allocated elsewhere and a pointer to it could
+be stored here.
+
+There are three kinds of congestion control algorithms currently: The
+simplest ones are derived from TCP reno (highspeed, scalable) and just
+provide an alternative congestion window calculation. More complex
+ones like BIC try to look at other events to provide better
+heuristics.  There are also round trip time based algorithms like
+Vegas and Westwood+.
+
+Good TCP congestion control is a complex problem because the algorithm
+needs to maintain fairness and performance. Please review current
+research and RFC's before developing new modules.
+
+The default congestion control mechanism is chosen based on the
+DEFAULT_TCP_CONG Kconfig parameter. If you really want a particular default
+value then you can set it using sysctl net.ipv4.tcp_congestion_control. The
+module will be autoloaded if needed and you will get the expected protocol. If
+you ask for an unknown congestion method, then the sysctl attempt will fail.
+
+If you remove a TCP congestion control module, then you will get the next
+available one. Since reno cannot be built as a module, and cannot be
+removed, it will always be available.
+
+How the new TCP output machine [nyi] works.
+===========================================
+
+Data is kept on a single queue. The skb->users flag tells us if the frame is
+one that has been queued already. To add a frame we throw it on the end. Ack
+walks down the list from the start.
+
+We keep a set of control flags
+
+
+	sk->tcp_pend_event
+
+		TCP_PEND_ACK			Ack needed
+		TCP_ACK_NOW			Needed now
+		TCP_WINDOW			Window update check
+		TCP_WINZERO			Zero probing
+
+
+	sk->transmit_queue		The transmission frame begin
+	sk->transmit_new		First new frame pointer
+	sk->transmit_end		Where to add frames
+
+	sk->tcp_last_tx_ack		Last ack seen
+	sk->tcp_dup_ack			Dup ack count for fast retransmit
+
+
+Frames are queued for output by tcp_write. We do our best to send the frames
+off immediately if possible, but otherwise queue and compute the body
+checksum in the copy. 
+
+When a write is done we try to clear any pending events and piggy back them.
+If the window is full we queue full sized frames. On the first timeout in
+zero window we split this.
+
+On a timer we walk the retransmit list to send any retransmits, update the
+backoff timers etc. A change of route table stamp causes a change of header
+and recompute. We add any new tcp level headers and refinish the checksum
+before sending. 
+
diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 000000000..5a013686b
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via libteam library which is here:
+	https://github.com/jpirko/libteam
diff --git a/Documentation/networking/ti-cpsw.txt b/Documentation/networking/ti-cpsw.txt
new file mode 100644
index 000000000..d4d4c0751
--- /dev/null
+++ b/Documentation/networking/ti-cpsw.txt
@@ -0,0 +1,541 @@
+* Texas Instruments CPSW ethernet driver
+
+Multiqueue & CBS & MQPRIO
+=====================================================================
+=====================================================================
+
+The cpsw has 3 CBS shapers for each external ports. This document
+describes MQPRIO and CBS Qdisc offload configuration for cpsw driver
+based on examples. It potentially can be used in audio video bridging
+(AVB) and time sensitive networking (TSN).
+
+The following examples were tested on AM572x EVM and BBB boards.
+
+Test setup
+==========
+
+Under consideration two examples with AM572x EVM running cpsw driver
+in dual_emac mode.
+
+Several prerequisites:
+- TX queues must be rated starting from txq0 that has highest priority
+- Traffic classes are used starting from 0, that has highest priority
+- CBS shapers should be used with rated queues
+- The bandwidth for CBS shapers has to be set a little bit more then
+  potential incoming rate, thus, rate of all incoming tx queues has
+  to be a little less
+- Real rates can differ, due to discreetness
+- Map skb-priority to txq is not enough, also skb-priority to l2 prio
+  map has to be created with ip or vconfig tool
+- Any l2/socket prio (0 - 7) for classes can be used, but for
+  simplicity default values are used: 3 and 2
+- only 2 classes tested: A and B, but checked and can work with more,
+  maximum allowed 4, but only for 3 rate can be set.
+
+Test setup for examples
+=======================
+                                    +-------------------------------+
+                                    |--+                            |
+                                    |  |      Workstation0          |
+                                    |E |  MAC 18:03:73:66:87:42     |
++-----------------------------+  +--|t |                            |
+|                    | 1  | E |  |  |h |./tsn_listener -d \         |
+|  Target board:     | 0  | t |--+  |0 | 18:03:73:66:87:42 -i eth0 \|
+|  AM572x EVM        | 0  | h |     |  | -s 1500                    |
+|                    | 0  | 0 |     |--+                            |
+|  Only 2 classes:   |Mb  +---|     +-------------------------------+
+|  class A, class B  |        |
+|                    |    +---|     +-------------------------------+
+|                    | 1  | E |     |--+                            |
+|                    | 0  | t |     |  |      Workstation1          |
+|                    | 0  | h |--+  |E |  MAC 20:cf:30:85:7d:fd     |
+|                    |Mb  | 1 |  +--|t |                            |
++-----------------------------+     |h |./tsn_listener -d \         |
+                                    |0 | 20:cf:30:85:7d:fd -i eth0 \|
+                                    |  | -s 1500                    |
+                                    |--+                            |
+                                    +-------------------------------+
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 1: One port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM572x evm, applicable for single port boards)
+
+tc - traffic class
+txq - transmit queue
+p - priority
+f - fifo (cpsw fifo)
+S - shaper configured
+
++------------------------------------------------------------------+ u
+| +---------------+  +---------------+  +------+ +------+          | s
+| |               |  |               |  |      | |      |          | e
+| | App 1         |  | App 2         |  | Apps | | Apps |          | r
+| | Class A       |  | Class B       |  | Rest | | Rest |          |
+| | Eth0          |  | Eth0          |  | Eth0 | | Eth1 |          | s
+| | VLAN100       |  | VLAN100       |  |   |  | |   |  |          | p
+| | 40 Mb/s       |  | 20 Mb/s       |  |   |  | |   |  |          | a
+| | SO_PRIORITY=3 |  | SO_PRIORITY=2 |  |   |  | |   |  |          | c
+| |   |           |  |   |           |  |   |  | |   |  |          | e
+| +---|-----------+  +---|-----------+  +---|--+ +---|--+          |
++-----|------------------|------------------|--------|-------------+
+    +-+     +------------+                  |        |
+    |       |             +-----------------+     +--+
+    |       |             |                       |
++---|-------|-------------|-----------------------|----------------+
+| +----+ +----+ +----+ +----+                   +----+             |
+| | p3 | | p2 | | p1 | | p0 |                   | p0 |             | k
+| \    / \    / \    / \    /                   \    /             | e
+|  \  /   \  /   \  /   \  /                     \  /              | r
+|   \/     \/     \/     \/                       \/               | n
+|    |     |             |                        |                | e
+|    |     |       +-----+                        |                | l
+|    |     |       |                              |                |
+| +----+ +----+ +----+                          +----+             | s
+| |tc0 | |tc1 | |tc2 |                          |tc0 |             | p
+| \    / \    / \    /                          \    /             | a
+|  \  /   \  /   \  /                            \  /              | c
+|   \/     \/     \/                              \/               | e
+|   |      |       +-----+                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+|   |      |       |     |                        |                |
+| +----+ +----+ +----+ +----+                   +----+             |
+| |txq0| |txq1| |txq2| |txq3|                   |txq4|             |
+| \    / \    / \    / \    /                   \    /             |
+|  \  /   \  /   \  /   \  /                     \  /              |
+|   \/     \/     \/     \/                       \/               |
+| +-|------|------|------|--+                  +--|--------------+ |
+| | |      |      |      |  | Eth0.100         |  |     Eth1     | |
++---|------|------|------|------------------------|----------------+
+    |      |      |      |                        |
+    p      p      p      p                        |
+    3      2      0-1, 4-7  <- L2 priority        |
+    |      |      |      |                        |
+    |      |      |      |                        |
++---|------|------|------|------------------------|----------------+
+|   |      |      |      |             |----------+                |
+| +----+ +----+ +----+ +----+       +----+                         |
+| |dma7| |dma6| |dma5| |dma4|       |dma3|                         |
+| \    / \    / \    / \    /       \    /                         | c
+|  \S /   \S /   \  /   \  /         \  /                          | p
+|   \/     \/     \/     \/           \/                           | s
+|   |      |      | +-----            |                            | w
+|   |      |      | |                 |                            |
+|   |      |      | |                 |                            | d
+| +----+ +----+ +----+p            p+----+                         | r
+| |    | |    | |    |o            o|    |                         | i
+| | f3 | | f2 | | f0 |r            r| f0 |                         | v
+| |tc0 | |tc1 | |tc2 |t            t|tc0 |                         | e
+| \CBS / \CBS / \CBS /1            2\CBS /                         | r
+|  \S /   \S /   \  /                \  /                          |
+|   \/     \/     \/                  \/                           |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 4 tx queues, for interface Eth0, and 1 tx queue for Eth1
+$ ethtool -L eth0 rx 1 tx 5
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             5
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bws for tx0 and tx1
+// Set rates 40 and 20 Mb/s appropriately.
+// Pay attention, real speed can differ a bit due to discreetness.
+// Leave last 2 tx queues not rated.
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq2, txq3)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1
+
+5a)
+// As two interface sharing same set of tx queues, assign all traffic
+// coming to interface Eth1 to separate queue in order to not mix it
+// with traffic from interface Eth0, so use separate txq to send
+// packets to Eth1, so all prio -> tc0 and tc0 -> txq4
+// Here hw 0, so here still default configuration for eth1 in hw
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 1 \
+map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@4 hw 0
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:3) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+$ tc -g class show dev eth1
++---(100:ffe0) mqprio
+     +---(100:5) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc
+// Set it +1 Mb for reserve (important!)
+// here only idle slope is important, others arg are ignored
+// Pay attention, real speed can differ a bit due to discreetness
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1438 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc:
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1468 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos
+$ ip link add link eth0 name eth0.100 type vlan id 100
+8021q: 802.1Q VLAN Support v1.8
+8021q: adding VLAN 0 to HW filter on device eth0
+8021q: adding VLAN 0 to HW filter on device eth1
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio, 1 to 1
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Run your appropriate tools with socket option "SO_PRIORITY"
+// to 3 for class A and/or to 2 for class B
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+
+13)
+// run your listener on workstation (should be in same vlan)
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+14)
+// Restore default configuration if needed
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
+
+*********************************************************************
+*********************************************************************
+*********************************************************************
+Example 2: Two port tx AVB configuration scheme for target board
+----------------------------------------------------------------------
+(prints and scheme for AM572x evm, for dual emac boards only)
+
++------------------------------------------------------------------+ u
+| +----------+  +----------+  +------+  +----------+  +----------+ | s
+| |          |  |          |  |      |  |          |  |          | | e
+| | App 1    |  | App 2    |  | Apps |  | App 3    |  | App 4    | | r
+| | Class A  |  | Class B  |  | Rest |  | Class B  |  | Class A  | |
+| | Eth0     |  | Eth0     |  |   |  |  | Eth1     |  | Eth1     | | s
+| | VLAN100  |  | VLAN100  |  |   |  |  | VLAN100  |  | VLAN100  | | p
+| | 40 Mb/s  |  | 20 Mb/s  |  |   |  |  | 10 Mb/s  |  | 30 Mb/s  | | a
+| | SO_PRI=3 |  | SO_PRI=2 |  |   |  |  | SO_PRI=3 |  | SO_PRI=2 | | c
+| |   |      |  |   |      |  |   |  |  |   |      |  |   |      | | e
+| +---|------+  +---|------+  +---|--+  +---|------+  +---|------+ |
++-----|-------------|-------------|---------|-------------|--------+
+    +-+     +-------+             |         +----------+  +----+
+    |       |             +-------+------+             |       |
+    |       |             |              |             |       |
++---|-------|-------------|--------------|-------------|-------|---+
+| +----+ +----+ +----+ +----+          +----+ +----+ +----+ +----+ |
+| | p3 | | p2 | | p1 | | p0 |          | p0 | | p1 | | p2 | | p3 | | k
+| \    / \    / \    / \    /          \    / \    / \    / \    / | e
+|  \  /   \  /   \  /   \  /            \  /   \  /   \  /   \  /  | r
+|   \/     \/     \/     \/              \/     \/     \/     \/   | n
+|   |      |             |                |             |      |   | e
+|   |      |        +----+                +----+        |      |   | l
+|   |      |        |                          |        |      |   |
+| +----+ +----+ +----+                        +----+ +----+ +----+ | s
+| |tc0 | |tc1 | |tc2 |                        |tc2 | |tc1 | |tc0 | | p
+| \    / \    / \    /                        \    / \    / \    / | a
+|  \  /   \  /   \  /                          \  /   \  /   \  /  | c
+|   \/     \/     \/                            \/     \/     \/   | e
+|   |      |       +-----+                +-----+      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |                |     |      |       |   |
+|   |      |       |     |    E      E    |     |      |       |   |
+| +----+ +----+ +----+ +----+ t      t +----+ +----+ +----+ +----+ |
+| |txq0| |txq1| |txq4| |txq5| h      h |txq6| |txq7| |txq3| |txq2| |
+| \    / \    / \    / \    / 0      1 \    / \    / \    / \    / |
+|  \  /   \  /   \  /   \  /  .      .  \  /   \  /   \  /   \  /  |
+|   \/     \/     \/     \/   1      1   \/     \/     \/     \/   |
+| +-|------|------|------|--+ 0      0 +-|------|------|------|--+ |
+| | |      |      |      |  | 0      0 | |      |      |      |  | |
++---|------|------|------|---------------|------|------|------|----+
+    |      |      |      |               |      |      |      |
+    p      p      p      p               p      p      p      p
+    3      2      0-1, 4-7   <-L2 pri->  0-1, 4-7      2      3
+    |      |      |      |               |      |      |      |
+    |      |      |      |               |      |      |      |
++---|------|------|------|---------------|------|------|------|----+
+|   |      |      |      |               |      |      |      |    |
+| +----+ +----+ +----+ +----+          +----+ +----+ +----+ +----+ |
+| |dma7| |dma6| |dma3| |dma2|          |dma1| |dma0| |dma4| |dma5| |
+| \    / \    / \    / \    /          \    / \    / \    / \    / | c
+|  \S /   \S /   \  /   \  /            \  /   \  /   \S /   \S /  | p
+|   \/     \/     \/     \/              \/     \/     \/     \/   | s
+|   |      |      | +-----                |      |      |      |   | w
+|   |      |      | |                     +----+ |      |      |   |
+|   |      |      | |                          | |      |      |   | d
+| +----+ +----+ +----+p                      p+----+ +----+ +----+ | r
+| |    | |    | |    |o                      o|    | |    | |    | | i
+| | f3 | | f2 | | f0 |r        CPSW          r| f3 | | f2 | | f0 | | v
+| |tc0 | |tc1 | |tc2 |t                      t|tc0 | |tc1 | |tc2 | | e
+| \CBS / \CBS / \CBS /1                      2\CBS / \CBS / \CBS / | r
+|  \S /   \S /   \  /                          \S /   \S /   \  /  |
+|   \/     \/     \/                            \/     \/     \/   |
++------------------------------------------------------------------+
+========================================Eth==========================>
+
+1)
+// Add 8 tx queues, for interface Eth0, but they are common, so are accessed
+// by two interfaces Eth0 and Eth1.
+$ ethtool -L eth1 rx 1 tx 8
+rx unmodified, ignoring
+
+2)
+// Check if num of queues is set correctly:
+$ ethtool -l eth0
+Channel parameters for eth0:
+Pre-set maximums:
+RX:             8
+TX:             8
+Other:          0
+Combined:       0
+Current hardware settings:
+RX:             1
+TX:             8
+Other:          0
+Combined:       0
+
+3)
+// TX queues must be rated starting from 0, so set bws for tx0 and tx1 for Eth0
+// and for tx2 and tx3 for Eth1. That is, rates 40 and 20 Mb/s appropriately
+// for Eth0 and 30 and 10 Mb/s for Eth1.
+// Real speed can differ a bit due to discreetness
+// Leave last 4 tx queues as not rated
+$ echo 40 > /sys/class/net/eth0/queues/tx-0/tx_maxrate
+$ echo 20 > /sys/class/net/eth0/queues/tx-1/tx_maxrate
+$ echo 30 > /sys/class/net/eth1/queues/tx-2/tx_maxrate
+$ echo 10 > /sys/class/net/eth1/queues/tx-3/tx_maxrate
+
+4)
+// Check maximum rate of tx (cpdma) queues:
+$ cat /sys/class/net/eth0/queues/tx-*/tx_maxrate
+40
+20
+30
+10
+0
+0
+0
+0
+
+5)
+// Map skb->priority to traffic class for Eth0:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq0, tc1 -> txq1, tc2 -> (txq4, txq5)
+$ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@4 hw 1
+
+6)
+// Check classes settings
+$ tc -g class show dev eth0
++---(100:ffe2) mqprio
+|    +---(100:5) mqprio
+|    +---(100:6) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:2) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:1) mqprio
+
+7)
+// Set rate for class A - 41 Mbit (tc0, txq0) using CBS Qdisc for Eth0
+// here only idle slope is important, others ignored
+// Real speed can differ a bit due to discreetness
+$ tc qdisc add dev eth0 parent 100:1 cbs locredit -1470 \
+hicredit 62 sendslope -959000 idleslope 41000 offload 1
+net eth0: set FIFO3 bw = 50
+
+8)
+// Set rate for class B - 21 Mbit (tc1, txq1) using CBS Qdisc for Eth0
+$ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \
+hicredit 65 sendslope -979000 idleslope 21000 offload 1
+net eth0: set FIFO2 bw = 30
+
+9)
+// Create vlan 100 to map sk->priority to vlan qos for Eth0
+$ ip link add link eth0 name eth0.100 type vlan id 100
+net eth0: Adding vlanid 100 to vlan filter
+
+10)
+// Map skb->priority to L2 prio for Eth0.100, one to one
+$ ip link set eth0.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+11)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth0.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+12)
+// Map skb->priority to traffic class for Eth1:
+// 3pri -> tc0, 2pri -> tc1, (0,1,4-7)pri -> tc2
+// Map traffic class to transmit queue:
+// tc0 -> txq2, tc1 -> txq3, tc2 -> (txq6, txq7)
+$ tc qdisc replace dev eth1 handle 100: parent root mqprio num_tc 3 \
+map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@2 1@3 2@6 hw 1
+
+13)
+// Check classes settings
+$ tc -g class show dev eth1
++---(100:ffe2) mqprio
+|    +---(100:7) mqprio
+|    +---(100:8) mqprio
+|
++---(100:ffe1) mqprio
+|    +---(100:4) mqprio
+|
++---(100:ffe0) mqprio
+     +---(100:3) mqprio
+
+14)
+// Set rate for class A - 31 Mbit (tc0, txq2) using CBS Qdisc for Eth1
+// here only idle slope is important, others ignored, but calculated
+// for interface speed - 100Mb for eth1 port.
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:3 cbs locredit -1035 \
+hicredit 465 sendslope -69000 idleslope 31000 offload 1
+net eth1: set FIFO3 bw = 31
+
+15)
+// Set rate for class B - 11 Mbit (tc1, txq3) using CBS Qdisc for Eth1
+// Set it +1 Mb for reserve (important!)
+$ tc qdisc add dev eth1 parent 100:4 cbs locredit -1335 \
+hicredit 405 sendslope -89000 idleslope 11000 offload 1
+net eth1: set FIFO2 bw = 11
+
+16)
+// Create vlan 100 to map sk->priority to vlan qos for Eth1
+$ ip link add link eth1 name eth1.100 type vlan id 100
+net eth1: Adding vlanid 100 to vlan filter
+
+17)
+// Map skb->priority to L2 prio for Eth1.100, one to one
+$ ip link set eth1.100 type vlan \
+egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+18)
+// Check egress map for vlan 100
+$ cat /proc/net/vlan/eth1.100
+[...]
+INGRESS priority mappings: 0:0  1:0  2:0  3:0  4:0  5:0  6:0 7:0
+EGRESS priority mappings: 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+19)
+// Run appropriate tools with socket option "SO_PRIORITY" to 3
+// for class A and to 2 for class B. For both interfaces
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p2 -s 1500&
+./tsn_talker -d 18:03:73:66:87:42 -i eth0.100 -p3 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p2 -s 1500&
+./tsn_talker -d 20:cf:30:85:7d:fd -i eth1.100 -p3 -s 1500&
+
+20)
+// run your listener on workstation (should be in same vlan)
+// (I took at https://www.spinics.net/lists/netdev/msg460869.html)
+./tsn_listener -d 18:03:73:66:87:42 -i enp5s0 -s 1500
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39012 kbps
+Receiving data rate: 39000 kbps
+
+21)
+// Restore default configuration if needed
+$ ip link del eth1.100
+$ ip link del eth0.100
+$ tc qdisc del dev eth1 root
+net eth1: Prev FIFO2 is shaped
+net eth1: set FIFO3 bw = 0
+net eth1: set FIFO2 bw = 0
+$ tc qdisc del dev eth0 root
+net eth0: Prev FIFO2 is shaped
+net eth0: set FIFO3 bw = 0
+net eth0: set FIFO2 bw = 0
+$ ethtool -L eth0 rx 1 tx 1
diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
new file mode 100644
index 000000000..1be0b6f9e
--- /dev/null
+++ b/Documentation/networking/timestamping.txt
@@ -0,0 +1,538 @@
+
+1. Control Interfaces
+
+The interfaces for receiving network packages timestamps are:
+
+* SO_TIMESTAMP
+  Generates a timestamp for each incoming packet in (not necessarily
+  monotonic) system time. Reports the timestamp via recvmsg() in a
+  control message as struct timeval (usec resolution).
+
+* SO_TIMESTAMPNS
+  Same timestamping mechanism as SO_TIMESTAMP, but reports the
+  timestamp as struct timespec (nsec resolution).
+
+* IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
+  Only for multicast:approximate transmit timestamp obtained by
+  reading the looped packet receive timestamp.
+
+* SO_TIMESTAMPING
+  Generates timestamps on reception, transmission or both. Supports
+  multiple timestamp sources, including hardware. Supports generating
+  timestamps for stream sockets.
+
+
+1.1 SO_TIMESTAMP:
+
+This socket option enables timestamping of datagrams on the reception
+path. Because the destination socket, if any, is not known early in
+the network stack, the feature has to be enabled for all packets. The
+same is true for all early receive timestamp options.
+
+For interface details, see `man 7 socket`.
+
+
+1.2 SO_TIMESTAMPNS:
+
+This option is identical to SO_TIMESTAMP except for the returned data type.
+Its struct timespec allows for higher resolution (ns) timestamps than the
+timeval of SO_TIMESTAMP (ms).
+
+
+1.3 SO_TIMESTAMPING:
+
+Supports multiple types of timestamp requests. As a result, this
+socket option takes a bitmap of flags, not a boolean. In
+
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
+
+val is an integer with any of the following bits set. Setting other
+bit returns EINVAL and does not change the current state.
+
+The socket option configures timestamp generation for individual
+sk_buffs (1.3.1), timestamp reporting to the socket's error
+queue (1.3.2) and options (1.3.3). Timestamp generation can also
+be enabled for individual sendmsg calls using cmsg (1.3.4).
+
+
+1.3.1 Timestamp Generation
+
+Some bits are requests to the stack to try to generate timestamps. Any
+combination of them is valid. Changes to these bits apply to newly
+created packets, not to packets already in the stack. As a result, it
+is possible to selectively request timestamps for a subset of packets
+(e.g., for sampling) by embedding an send() call within two setsockopt
+calls, one to enable timestamp generation and one to disable it.
+Timestamps may also be generated for reasons other than being
+requested by a particular socket, such as when receive timestamping is
+enabled system wide, as explained earlier.
+
+SOF_TIMESTAMPING_RX_HARDWARE:
+  Request rx timestamps generated by the network adapter.
+
+SOF_TIMESTAMPING_RX_SOFTWARE:
+  Request rx timestamps when data enters the kernel. These timestamps
+  are generated just after a device driver hands a packet to the
+  kernel receive stack.
+
+SOF_TIMESTAMPING_TX_HARDWARE:
+  Request tx timestamps generated by the network adapter. This flag
+  can be enabled via both socket options and control messages.
+
+SOF_TIMESTAMPING_TX_SOFTWARE:
+  Request tx timestamps when data leaves the kernel. These timestamps
+  are generated in the device driver as close as possible, but always
+  prior to, passing the packet to the network interface. Hence, they
+  require driver support and may not be available for all devices.
+  This flag can be enabled via both socket options and control messages.
+
+
+SOF_TIMESTAMPING_TX_SCHED:
+  Request tx timestamps prior to entering the packet scheduler. Kernel
+  transmit latency is, if long, often dominated by queuing delay. The
+  difference between this timestamp and one taken at
+  SOF_TIMESTAMPING_TX_SOFTWARE will expose this latency independent
+  of protocol processing. The latency incurred in protocol
+  processing, if any, can be computed by subtracting a userspace
+  timestamp taken immediately before send() from this timestamp. On
+  machines with virtual devices where a transmitted packet travels
+  through multiple devices and, hence, multiple packet schedulers,
+  a timestamp is generated at each layer. This allows for fine
+  grained measurement of queuing delay. This flag can be enabled
+  via both socket options and control messages.
+
+SOF_TIMESTAMPING_TX_ACK:
+  Request tx timestamps when all data in the send buffer has been
+  acknowledged. This only makes sense for reliable protocols. It is
+  currently only implemented for TCP. For that protocol, it may
+  over-report measurement, because the timestamp is generated when all
+  data up to and including the buffer at send() was acknowledged: the
+  cumulative acknowledgment. The mechanism ignores SACK and FACK.
+  This flag can be enabled via both socket options and control messages.
+
+
+1.3.2 Timestamp Reporting
+
+The other three bits control which timestamps will be reported in a
+generated control message. Changes to the bits take immediate
+effect at the timestamp reporting locations in the stack. Timestamps
+are only reported for packets that also have the relevant timestamp
+generation request set.
+
+SOF_TIMESTAMPING_SOFTWARE:
+  Report any software timestamps when available.
+
+SOF_TIMESTAMPING_SYS_HARDWARE:
+  This option is deprecated and ignored.
+
+SOF_TIMESTAMPING_RAW_HARDWARE:
+  Report hardware timestamps as generated by
+  SOF_TIMESTAMPING_TX_HARDWARE when available.
+
+
+1.3.3 Timestamp Options
+
+The interface supports the options
+
+SOF_TIMESTAMPING_OPT_ID:
+
+  Generate a unique identifier along with each packet. A process can
+  have multiple concurrent timestamping requests outstanding. Packets
+  can be reordered in the transmit path, for instance in the packet
+  scheduler. In that case timestamps will be queued onto the error
+  queue out of order from the original send() calls. It is not always
+  possible to uniquely match timestamps to the original send() calls
+  based on timestamp order or payload inspection alone, then.
+
+  This option associates each packet at send() with a unique
+  identifier and returns that along with the timestamp. The identifier
+  is derived from a per-socket u32 counter (that wraps). For datagram
+  sockets, the counter increments with each sent packet. For stream
+  sockets, it increments with every byte.
+
+  The counter starts at zero. It is initialized the first time that
+  the socket option is enabled. It is reset each time the option is
+  enabled after having been disabled. Resetting the counter does not
+  change the identifiers of existing packets in the system.
+
+  This option is implemented only for transmit timestamps. There, the
+  timestamp is always looped along with a struct sock_extended_err.
+  The option modifies field ee_data to pass an id that is unique
+  among all possibly concurrently outstanding timestamp requests for
+  that socket.
+
+
+SOF_TIMESTAMPING_OPT_CMSG:
+
+  Support recv() cmsg for all timestamped packets. Control messages
+  are already supported unconditionally on all packets with receive
+  timestamps and on IPv6 packets with transmit timestamp. This option
+  extends them to IPv4 packets with transmit timestamp. One use case
+  is to correlate packets with their egress device, by enabling socket
+  option IP_PKTINFO simultaneously.
+
+
+SOF_TIMESTAMPING_OPT_TSONLY:
+
+  Applies to transmit timestamps only. Makes the kernel return the
+  timestamp as a cmsg alongside an empty packet, as opposed to
+  alongside the original packet. This reduces the amount of memory
+  charged to the socket's receive budget (SO_RCVBUF) and delivers
+  the timestamp even if sysctl net.core.tstamp_allow_data is 0.
+  This option disables SOF_TIMESTAMPING_OPT_CMSG.
+
+SOF_TIMESTAMPING_OPT_STATS:
+
+  Optional stats that are obtained along with the transmit timestamps.
+  It must be used together with SOF_TIMESTAMPING_OPT_TSONLY. When the
+  transmit timestamp is available, the stats are available in a
+  separate control message of type SCM_TIMESTAMPING_OPT_STATS, as a
+  list of TLVs (struct nlattr) of types. These stats allow the
+  application to associate various transport layer stats with
+  the transmit timestamps, such as how long a certain block of
+  data was limited by peer's receiver window.
+
+SOF_TIMESTAMPING_OPT_PKTINFO:
+
+  Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
+  packets with hardware timestamps. The message contains struct
+  scm_ts_pktinfo, which supplies the index of the real interface which
+  received the packet and its length at layer 2. A valid (non-zero)
+  interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
+  enabled and the driver is using NAPI. The struct contains also two
+  other fields, but they are reserved and undefined.
+
+SOF_TIMESTAMPING_OPT_TX_SWHW:
+
+  Request both hardware and software timestamps for outgoing packets
+  when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
+  are enabled at the same time. If both timestamps are generated,
+  two separate messages will be looped to the socket's error queue,
+  each containing just one timestamp.
+
+New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
+disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
+regardless of the setting of sysctl net.core.tstamp_allow_data.
+
+An exception is when a process needs additional cmsg data, for
+instance SOL_IP/IP_PKTINFO to detect the egress network interface.
+Then pass option SOF_TIMESTAMPING_OPT_CMSG. This option depends on
+having access to the contents of the original packet, so cannot be
+combined with SOF_TIMESTAMPING_OPT_TSONLY.
+
+
+1.3.4. Enabling timestamps via control messages
+
+In addition to socket options, timestamp generation can be requested
+per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1).
+Using this feature, applications can sample timestamps per sendmsg()
+without paying the overhead of enabling and disabling timestamps via
+setsockopt:
+
+  struct msghdr *msg;
+  ...
+  cmsg			       = CMSG_FIRSTHDR(msg);
+  cmsg->cmsg_level	       = SOL_SOCKET;
+  cmsg->cmsg_type	       = SO_TIMESTAMPING;
+  cmsg->cmsg_len	       = CMSG_LEN(sizeof(__u32));
+  *((__u32 *) CMSG_DATA(cmsg)) = SOF_TIMESTAMPING_TX_SCHED |
+				 SOF_TIMESTAMPING_TX_SOFTWARE |
+				 SOF_TIMESTAMPING_TX_ACK;
+  err = sendmsg(fd, msg, 0);
+
+The SOF_TIMESTAMPING_TX_* flags set via cmsg will override
+the SOF_TIMESTAMPING_TX_* flags set via setsockopt.
+
+Moreover, applications must still enable timestamp reporting via
+setsockopt to receive timestamps:
+
+  __u32 val = SOF_TIMESTAMPING_SOFTWARE |
+	      SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
+
+
+1.4 Bytestream Timestamps
+
+The SO_TIMESTAMPING interface supports timestamping of bytes in a
+bytestream. Each request is interpreted as a request for when the
+entire contents of the buffer has passed a timestamping point. That
+is, for streams option SOF_TIMESTAMPING_TX_SOFTWARE will record
+when all bytes have reached the device driver, regardless of how
+many packets the data has been converted into.
+
+In general, bytestreams have no natural delimiters and therefore
+correlating a timestamp with data is non-trivial. A range of bytes
+may be split across segments, any segments may be merged (possibly
+coalescing sections of previously segmented buffers associated with
+independent send() calls). Segments can be reordered and the same
+byte range can coexist in multiple segments for protocols that
+implement retransmissions.
+
+It is essential that all timestamps implement the same semantics,
+regardless of these possible transformations, as otherwise they are
+incomparable. Handling "rare" corner cases differently from the
+simple case (a 1:1 mapping from buffer to skb) is insufficient
+because performance debugging often needs to focus on such outliers.
+
+In practice, timestamps can be correlated with segments of a
+bytestream consistently, if both semantics of the timestamp and the
+timing of measurement are chosen correctly. This challenge is no
+different from deciding on a strategy for IP fragmentation. There, the
+definition is that only the first fragment is timestamped. For
+bytestreams, we chose that a timestamp is generated only when all
+bytes have passed a point. SOF_TIMESTAMPING_TX_ACK as defined is easy to
+implement and reason about. An implementation that has to take into
+account SACK would be more complex due to possible transmission holes
+and out of order arrival.
+
+On the host, TCP can also break the simple 1:1 mapping from buffer to
+skbuff as a result of Nagle, cork, autocork, segmentation and GSO. The
+implementation ensures correctness in all cases by tracking the
+individual last byte passed to send(), even if it is no longer the
+last byte after an skbuff extend or merge operation. It stores the
+relevant sequence number in skb_shinfo(skb)->tskey. Because an skbuff
+has only one such field, only one timestamp can be generated.
+
+In rare cases, a timestamp request can be missed if two requests are
+collapsed onto the same skb. A process can detect this situation by
+enabling SOF_TIMESTAMPING_OPT_ID and comparing the byte offset at
+send time with the value returned for each timestamp. It can prevent
+the situation by always flushing the TCP stack in between requests,
+for instance by enabling TCP_NODELAY and disabling TCP_CORK and
+autocork.
+
+These precautions ensure that the timestamp is generated only when all
+bytes have passed a timestamp point, assuming that the network stack
+itself does not reorder the segments. The stack indeed tries to avoid
+reordering. The one exception is under administrator control: it is
+possible to construct a packet scheduler configuration that delays
+segments from the same stream differently. Such a setup would be
+unusual.
+
+
+2 Data Interfaces
+
+Timestamps are read using the ancillary data feature of recvmsg().
+See `man 3 cmsg` for details of this interface. The socket manual
+page (`man 7 socket`) describes how timestamps generated with
+SO_TIMESTAMP and SO_TIMESTAMPNS records can be retrieved.
+
+
+2.1 SCM_TIMESTAMPING records
+
+These timestamps are returned in a control message with cmsg_level
+SOL_SOCKET, cmsg_type SCM_TIMESTAMPING, and payload of type
+
+struct scm_timestamping {
+	struct timespec ts[3];
+};
+
+The structure can return up to three timestamps. This is a legacy
+feature. At least one field is non-zero at any time. Most timestamps
+are passed in ts[0]. Hardware timestamps are passed in ts[2].
+
+ts[1] used to hold hardware timestamps converted to system time.
+Instead, expose the hardware clock device on the NIC directly as
+a HW PTP clock source, to allow time conversion in userspace and
+optionally synchronize system time with a userspace PTP stack such
+as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt.
+
+Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled
+together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false
+software timestamp will be generated in the recvmsg() call and passed
+in ts[0] when a real software timestamp is missing. This happens also
+on hardware transmit timestamps.
+
+2.1.1 Transmit timestamps with MSG_ERRQUEUE
+
+For transmit timestamps the outgoing packet is looped back to the
+socket's error queue with the send timestamp(s) attached. A process
+receives the timestamps by calling recvmsg() with flag MSG_ERRQUEUE
+set and with a msg_control buffer sufficiently large to receive the
+relevant metadata structures. The recvmsg call returns the original
+outgoing data packet with two ancillary messages attached.
+
+A message of cm_level SOL_IP(V6) and cm_type IP(V6)_RECVERR
+embeds a struct sock_extended_err. This defines the error type. For
+timestamps, the ee_errno field is ENOMSG. The other ancillary message
+will have cm_level SOL_SOCKET and cm_type SCM_TIMESTAMPING. This
+embeds the struct scm_timestamping.
+
+
+2.1.1.2 Timestamp types
+
+The semantics of the three struct timespec are defined by field
+ee_info in the extended error structure. It contains a value of
+type SCM_TSTAMP_* to define the actual timestamp passed in
+scm_timestamping.
+
+The SCM_TSTAMP_* types are 1:1 matches to the SOF_TIMESTAMPING_*
+control fields discussed previously, with one exception. For legacy
+reasons, SCM_TSTAMP_SND is equal to zero and can be set for both
+SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE. It
+is the first if ts[2] is non-zero, the second otherwise, in which
+case the timestamp is stored in ts[0].
+
+
+2.1.1.3 Fragmentation
+
+Fragmentation of outgoing datagrams is rare, but is possible, e.g., by
+explicitly disabling PMTU discovery. If an outgoing packet is fragmented,
+then only the first fragment is timestamped and returned to the sending
+socket.
+
+
+2.1.1.4 Packet Payload
+
+The calling application is often not interested in receiving the whole
+packet payload that it passed to the stack originally: the socket
+error queue mechanism is just a method to piggyback the timestamp on.
+In this case, the application can choose to read datagrams with a
+smaller buffer, possibly even of length 0. The payload is truncated
+accordingly. Until the process calls recvmsg() on the error queue,
+however, the full packet is queued, taking up budget from SO_RCVBUF.
+
+
+2.1.1.5 Blocking Read
+
+Reading from the error queue is always a non-blocking operation. To
+block waiting on a timestamp, use poll or select. poll() will return
+POLLERR in pollfd.revents if any data is ready on the error queue.
+There is no need to pass this flag in pollfd.events. This flag is
+ignored on request. See also `man 2 poll`.
+
+
+2.1.2 Receive timestamps
+
+On reception, there is no reason to read from the socket error queue.
+The SCM_TIMESTAMPING ancillary data is sent along with the packet data
+on a normal recvmsg(). Since this is not a socket error, it is not
+accompanied by a message SOL_IP(V6)/IP(V6)_RECVERROR. In this case,
+the meaning of the three fields in struct scm_timestamping is
+implicitly defined. ts[0] holds a software timestamp if set, ts[1]
+is again deprecated and ts[2] holds a hardware timestamp if set.
+
+
+3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP
+
+Hardware time stamping must also be initialized for each device driver
+that is expected to do hardware time stamping. The parameter is defined in
+/include/linux/net_tstamp.h as:
+
+struct hwtstamp_config {
+	int flags;	/* no flags defined right now, must be zero */
+	int tx_type;	/* HWTSTAMP_TX_* */
+	int rx_filter;	/* HWTSTAMP_FILTER_* */
+};
+
+Desired behavior is passed into the kernel and to a specific device by
+calling ioctl(SIOCSHWTSTAMP) with a pointer to a struct ifreq whose
+ifr_data points to a struct hwtstamp_config. The tx_type and
+rx_filter are hints to the driver what it is expected to do. If
+the requested fine-grained filtering for incoming packets is not
+supported, the driver may time stamp more than just the requested types
+of packets.
+
+Drivers are free to use a more permissive configuration than the requested
+configuration. It is expected that drivers should only implement directly the
+most generic mode that can be supported. For example if the hardware can
+support HWTSTAMP_FILTER_V2_EVENT, then it should generally always upscale
+HWTSTAMP_FILTER_V2_L2_SYNC_MESSAGE, and so forth, as HWTSTAMP_FILTER_V2_EVENT
+is more generic (and more useful to applications).
+
+A driver which supports hardware time stamping shall update the struct
+with the actual, possibly more permissive configuration. If the
+requested packets cannot be time stamped, then nothing should be
+changed and ERANGE shall be returned (in contrast to EINVAL, which
+indicates that SIOCSHWTSTAMP is not supported at all).
+
+Only a processes with admin rights may change the configuration. User
+space is responsible to ensure that multiple processes don't interfere
+with each other and that the settings are reset.
+
+Any process can read the actual configuration by passing this
+structure to ioctl(SIOCGHWTSTAMP) in the same way.  However, this has
+not been implemented in all drivers.
+
+/* possible values for hwtstamp_config->tx_type */
+enum {
+	/*
+	 * no outgoing packet will need hardware time stamping;
+	 * should a packet arrive which asks for it, no hardware
+	 * time stamping will be done
+	 */
+	HWTSTAMP_TX_OFF,
+
+	/*
+	 * enables hardware time stamping for outgoing packets;
+	 * the sender of the packet decides which are to be
+	 * time stamped by setting SOF_TIMESTAMPING_TX_SOFTWARE
+	 * before sending the packet
+	 */
+	HWTSTAMP_TX_ON,
+};
+
+/* possible values for hwtstamp_config->rx_filter */
+enum {
+	/* time stamp no incoming packet at all */
+	HWTSTAMP_FILTER_NONE,
+
+	/* time stamp any incoming packet */
+	HWTSTAMP_FILTER_ALL,
+
+	/* return value: time stamp all packets requested plus some others */
+	HWTSTAMP_FILTER_SOME,
+
+	/* PTP v1, UDP, any kind of event packet */
+	HWTSTAMP_FILTER_PTP_V1_L4_EVENT,
+
+	/* for the complete list of values, please check
+	 * the include file /include/linux/net_tstamp.h
+	 */
+};
+
+3.1 Hardware Timestamping Implementation: Device Drivers
+
+A driver which supports hardware time stamping must support the
+SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
+the actual values as described in the section on SIOCSHWTSTAMP.  It
+should also support SIOCGHWTSTAMP.
+
+Time stamps for received packets must be stored in the skb. To get a pointer
+to the shared time stamp structure of the skb call skb_hwtstamps(). Then
+set the time stamps in the structure:
+
+struct skb_shared_hwtstamps {
+	/* hardware time stamp transformed into duration
+	 * since arbitrary point in time
+	 */
+	ktime_t	hwtstamp;
+};
+
+Time stamps for outgoing packets are to be generated as follows:
+- In hard_start_xmit(), check if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
+  is set no-zero. If yes, then the driver is expected to do hardware time
+  stamping.
+- If this is possible for the skb and requested, then declare
+  that the driver is doing the time stamping by setting the flag
+  SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags , e.g. with
+
+      skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+
+  You might want to keep a pointer to the associated skb for the next step
+  and not free the skb. A driver not supporting hardware time stamping doesn't
+  do that. A driver must never touch sk_buff::tstamp! It is used to store
+  software generated time stamps by the network subsystem.
+- Driver should call skb_tx_timestamp() as close to passing sk_buff to hardware
+  as possible. skb_tx_timestamp() provides a software time stamp if requested
+  and hardware timestamping is not possible (SKBTX_IN_PROGRESS not set).
+- As soon as the driver has sent the packet and/or obtained a
+  hardware time stamp for it, it passes the time stamp back by
+  calling skb_hwtstamp_tx() with the original skb, the raw
+  hardware time stamp. skb_hwtstamp_tx() clones the original skb and
+  adds the timestamps, therefore the original skb has to be freed now.
+  If obtaining the hardware time stamp somehow fails, then the driver
+  should not fall back to software time stamping. The rationale is that
+  this would occur at a later time in the processing pipeline than other
+  software time stamping and therefore could lead to unexpected deltas
+  between time stamps.
diff --git a/Documentation/networking/tlan.txt b/Documentation/networking/tlan.txt
new file mode 100644
index 000000000..34550dfce
--- /dev/null
+++ b/Documentation/networking/tlan.txt
@@ -0,0 +1,117 @@
+(C) 1997-1998 Caldera, Inc.
+(C) 1998 James Banks
+(C) 1999-2001 Torben Mathiasen <tmm@image.dk, torben.mathiasen@compaq.com>
+
+For driver information/updates visit http://www.compaq.com
+
+
+TLAN driver for Linux, version 1.14a
+README
+
+
+I.  Supported Devices.
+
+    Only PCI devices will work with this driver.
+
+    Supported:
+    Vendor ID	Device ID	Name
+    0e11	ae32		Compaq Netelligent 10/100 TX PCI UTP
+    0e11	ae34		Compaq Netelligent 10 T PCI UTP
+    0e11	ae35		Compaq Integrated NetFlex 3/P
+    0e11	ae40		Compaq Netelligent Dual 10/100 TX PCI UTP
+    0e11	ae43		Compaq Netelligent Integrated 10/100 TX UTP
+    0e11	b011		Compaq Netelligent 10/100 TX Embedded UTP
+    0e11	b012		Compaq Netelligent 10 T/2 PCI UTP/Coax
+    0e11	b030		Compaq Netelligent 10/100 TX UTP
+    0e11	f130		Compaq NetFlex 3/P
+    0e11	f150		Compaq NetFlex 3/P
+    108d	0012		Olicom OC-2325	
+    108d	0013		Olicom OC-2183
+    108d	0014		Olicom OC-2326	
+
+
+    Caveats:
+    
+    I am not sure if 100BaseTX daughterboards (for those cards which
+    support such things) will work.  I haven't had any solid evidence
+    either way.
+
+    However, if a card supports 100BaseTx without requiring an add
+    on daughterboard, it should work with 100BaseTx.
+
+    The "Netelligent 10 T/2 PCI UTP/Coax" (b012) device is untested,
+    but I do not expect any problems.
+    
+
+II.   Driver Options
+	1. You can append debug=x to the end of the insmod line to get
+           debug messages, where x is a bit field where the bits mean
+	   the following:
+	   
+	   0x01		Turn on general debugging messages.
+	   0x02		Turn on receive debugging messages.
+	   0x04		Turn on transmit debugging messages.
+	   0x08		Turn on list debugging messages.
+
+	2. You can append aui=1 to the end of the insmod line to cause
+           the adapter to use the AUI interface instead of the 10 Base T
+           interface.  This is also what to do if you want to use the BNC
+	   connector on a TLAN based device.  (Setting this option on a
+	   device that does not have an AUI/BNC connector will probably
+	   cause it to not function correctly.)
+
+	3. You can set duplex=1 to force half duplex, and duplex=2 to
+	   force full duplex.
+
+	4. You can set speed=10 to force 10Mbs operation, and speed=100
+	   to force 100Mbs operation. (I'm not sure what will happen
+	   if a card which only supports 10Mbs is forced into 100Mbs
+	   mode.)
+
+	5. You have to use speed=X duplex=Y together now. If you just
+	   do "insmod tlan.o speed=100" the driver will do Auto-Neg.
+	   To force a 10Mbps Half-Duplex link do "insmod tlan.o speed=10 
+	   duplex=1".
+
+	6. If the driver is built into the kernel, you can use the 3rd
+	   and 4th parameters to set aui and debug respectively.  For
+	   example:
+
+	   ether=0,0,0x1,0x7,eth0
+
+	   This sets aui to 0x1 and debug to 0x7, assuming eth0 is a
+	   supported TLAN device.
+
+	   The bits in the third byte are assigned as follows:
+
+		0x01 = aui
+		0x02 = use half duplex
+		0x04 = use full duplex
+		0x08 = use 10BaseT
+		0x10 = use 100BaseTx
+
+	   You also need to set both speed and duplex settings when forcing
+	   speeds with kernel-parameters. 
+	   ether=0,0,0x12,0,eth0 will force link to 100Mbps Half-Duplex.
+
+	7. If you have more than one tlan adapter in your system, you can
+	   use the above options on a per adapter basis. To force a 100Mbit/HD
+	   link with your eth1 adapter use:
+	   
+	   insmod tlan speed=0,100 duplex=0,1
+
+	   Now eth0 will use auto-neg and eth1 will be forced to 100Mbit/HD.
+	   Note that the tlan driver supports a maximum of 8 adapters.
+
+
+III.  Things to try if you have problems.
+	1. Make sure your card's PCI id is among those listed in
+	   section I, above.
+	2. Make sure routing is correct.
+	3. Try forcing different speed/duplex settings
+
+
+There is also a tlan mailing list which you can join by sending "subscribe tlan"
+in the body of an email to majordomo@vuser.vu.union.edu.
+There is also a tlan website at http://www.compaq.com
+
diff --git a/Documentation/networking/tls.txt b/Documentation/networking/tls.txt
new file mode 100644
index 000000000..58b5ef75f
--- /dev/null
+++ b/Documentation/networking/tls.txt
@@ -0,0 +1,197 @@
+Overview
+========
+
+Transport Layer Security (TLS) is a Upper Layer Protocol (ULP) that runs over
+TCP. TLS provides end-to-end data integrity and confidentiality.
+
+User interface
+==============
+
+Creating a TLS connection
+-------------------------
+
+First create a new TCP socket and set the TLS ULP.
+
+  sock = socket(AF_INET, SOCK_STREAM, 0);
+  setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
+
+Setting the TLS ULP allows us to set/get TLS socket options. Currently
+only the symmetric encryption is handled in the kernel.  After the TLS
+handshake is complete, we have all the parameters required to move the
+data-path to the kernel. There is a separate socket option for moving
+the transmit and the receive into the kernel.
+
+  /* From linux/tls.h */
+  struct tls_crypto_info {
+          unsigned short version;
+          unsigned short cipher_type;
+  };
+
+  struct tls12_crypto_info_aes_gcm_128 {
+          struct tls_crypto_info info;
+          unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
+          unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
+          unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE];
+          unsigned char rec_seq[TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE];
+  };
+
+
+  struct tls12_crypto_info_aes_gcm_128 crypto_info;
+
+  crypto_info.info.version = TLS_1_2_VERSION;
+  crypto_info.info.cipher_type = TLS_CIPHER_AES_GCM_128;
+  memcpy(crypto_info.iv, iv_write, TLS_CIPHER_AES_GCM_128_IV_SIZE);
+  memcpy(crypto_info.rec_seq, seq_number_write,
+					TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
+  memcpy(crypto_info.key, cipher_key_write, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+  memcpy(crypto_info.salt, implicit_iv_write, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+
+  setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
+
+Transmit and receive are set separately, but the setup is the same, using either
+TLS_TX or TLS_RX.
+
+Sending TLS application data
+----------------------------
+
+After setting the TLS_TX socket option all application data sent over this
+socket is encrypted using TLS and the parameters provided in the socket option.
+For example, we can send an encrypted hello world record as follows:
+
+  const char *msg = "hello world\n";
+  send(sock, msg, strlen(msg));
+
+send() data is directly encrypted from the userspace buffer provided
+to the encrypted kernel send buffer if possible.
+
+The sendfile system call will send the file's data over TLS records of maximum
+length (2^14).
+
+  file = open(filename, O_RDONLY);
+  fstat(file, &stat);
+  sendfile(sock, file, &offset, stat.st_size);
+
+TLS records are created and sent after each send() call, unless
+MSG_MORE is passed.  MSG_MORE will delay creation of a record until
+MSG_MORE is not passed, or the maximum record size is reached.
+
+The kernel will need to allocate a buffer for the encrypted data.
+This buffer is allocated at the time send() is called, such that
+either the entire send() call will return -ENOMEM (or block waiting
+for memory), or the encryption will always succeed.  If send() returns
+-ENOMEM and some data was left on the socket buffer from a previous
+call using MSG_MORE, the MSG_MORE data is left on the socket buffer.
+
+Receiving TLS application data
+------------------------------
+
+After setting the TLS_RX socket option, all recv family socket calls
+are decrypted using TLS parameters provided.  A full TLS record must
+be received before decryption can happen.
+
+  char buffer[16384];
+  recv(sock, buffer, 16384);
+
+Received data is decrypted directly in to the user buffer if it is
+large enough, and no additional allocations occur.  If the userspace
+buffer is too small, data is decrypted in the kernel and copied to
+userspace.
+
+EINVAL is returned if the TLS version in the received message does not
+match the version passed in setsockopt.
+
+EMSGSIZE is returned if the received message is too big.
+
+EBADMSG is returned if decryption failed for any other reason.
+
+Send TLS control messages
+-------------------------
+
+Other than application data, TLS has control messages such as alert
+messages (record type 21) and handshake messages (record type 22), etc.
+These messages can be sent over the socket by providing the TLS record type
+via a CMSG. For example the following function sends @data of @length bytes
+using a record of type @record_type.
+
+/* send TLS control message using record_type */
+  static int klts_send_ctrl_message(int sock, unsigned char record_type,
+                                  void *data, size_t length)
+  {
+        struct msghdr msg = {0};
+        int cmsg_len = sizeof(record_type);
+        struct cmsghdr *cmsg;
+        char buf[CMSG_SPACE(cmsg_len)];
+        struct iovec msg_iov;   /* Vector of data to send/receive into.  */
+
+        msg.msg_control = buf;
+        msg.msg_controllen = sizeof(buf);
+        cmsg = CMSG_FIRSTHDR(&msg);
+        cmsg->cmsg_level = SOL_TLS;
+        cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
+        cmsg->cmsg_len = CMSG_LEN(cmsg_len);
+        *CMSG_DATA(cmsg) = record_type;
+        msg.msg_controllen = cmsg->cmsg_len;
+
+        msg_iov.iov_base = data;
+        msg_iov.iov_len = length;
+        msg.msg_iov = &msg_iov;
+        msg.msg_iovlen = 1;
+
+        return sendmsg(sock, &msg, 0);
+  }
+
+Control message data should be provided unencrypted, and will be
+encrypted by the kernel.
+
+Receiving TLS control messages
+------------------------------
+
+TLS control messages are passed in the userspace buffer, with message
+type passed via cmsg.  If no cmsg buffer is provided, an error is
+returned if a control message is received.  Data messages may be
+received without a cmsg buffer set.
+
+  char buffer[16384];
+  char cmsg[CMSG_SPACE(sizeof(unsigned char))];
+  struct msghdr msg = {0};
+  msg.msg_control = cmsg;
+  msg.msg_controllen = sizeof(cmsg);
+
+  struct iovec msg_iov;
+  msg_iov.iov_base = buffer;
+  msg_iov.iov_len = 16384;
+
+  msg.msg_iov = &msg_iov;
+  msg.msg_iovlen = 1;
+
+  int ret = recvmsg(sock, &msg, 0 /* flags */);
+
+  struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
+  if (cmsg->cmsg_level == SOL_TLS &&
+      cmsg->cmsg_type == TLS_GET_RECORD_TYPE) {
+      int record_type = *((unsigned char *)CMSG_DATA(cmsg));
+      // Do something with record_type, and control message data in
+      // buffer.
+      //
+      // Note that record_type may be == to application data (23).
+  } else {
+      // Buffer contains application data.
+  }
+
+recv will never return data from mixed types of TLS records.
+
+Integrating in to userspace TLS library
+---------------------------------------
+
+At a high level, the kernel TLS ULP is a replacement for the record
+layer of a userspace TLS library.
+
+A patchset to OpenSSL to use ktls as the record layer is here:
+
+https://github.com/Mellanox/openssl/commits/tls_rx2
+
+An example of calling send directly after a handshake using
+gnutls.  Since it doesn't implement a full record layer, control
+messages are not supported:
+
+https://github.com/ktls/af_ktls-tool/commits/RX
diff --git a/Documentation/networking/tproxy.txt b/Documentation/networking/tproxy.txt
new file mode 100644
index 000000000..b9a188823
--- /dev/null
+++ b/Documentation/networking/tproxy.txt
@@ -0,0 +1,104 @@
+Transparent proxy support
+=========================
+
+This feature adds Linux 2.2-like transparent proxy support to current kernels.
+To use it, enable the socket match and the TPROXY target in your kernel config.
+You will need policy routing too, so be sure to enable that as well.
+
+From Linux 4.18 transparent proxy support is also available in nf_tables.
+
+1. Making non-local sockets work
+================================
+
+The idea is that you identify packets with destination address matching a local
+socket on your box, set the packet mark to a certain value:
+
+# iptables -t mangle -N DIVERT
+# iptables -t mangle -A PREROUTING -p tcp -m socket -j DIVERT
+# iptables -t mangle -A DIVERT -j MARK --set-mark 1
+# iptables -t mangle -A DIVERT -j ACCEPT
+
+Alternatively you can do this in nft with the following commands:
+
+# nft add table filter
+# nft add chain filter divert "{ type filter hook prerouting priority -150; }"
+# nft add rule filter divert meta l4proto tcp socket transparent 1 meta mark set 1 accept
+
+And then match on that value using policy routing to have those packets
+delivered locally:
+
+# ip rule add fwmark 1 lookup 100
+# ip route add local 0.0.0.0/0 dev lo table 100
+
+Because of certain restrictions in the IPv4 routing output code you'll have to
+modify your application to allow it to send datagrams _from_ non-local IP
+addresses. All you have to do is enable the (SOL_IP, IP_TRANSPARENT) socket
+option before calling bind:
+
+fd = socket(AF_INET, SOCK_STREAM, 0);
+/* - 8< -*/
+int value = 1;
+setsockopt(fd, SOL_IP, IP_TRANSPARENT, &value, sizeof(value));
+/* - 8< -*/
+name.sin_family = AF_INET;
+name.sin_port = htons(0xCAFE);
+name.sin_addr.s_addr = htonl(0xDEADBEEF);
+bind(fd, &name, sizeof(name));
+
+A trivial patch for netcat is available here:
+http://people.netfilter.org/hidden/tproxy/netcat-ip_transparent-support.patch
+
+
+2. Redirecting traffic
+======================
+
+Transparent proxying often involves "intercepting" traffic on a router. This is
+usually done with the iptables REDIRECT target; however, there are serious
+limitations of that method. One of the major issues is that it actually
+modifies the packets to change the destination address -- which might not be
+acceptable in certain situations. (Think of proxying UDP for example: you won't
+be able to find out the original destination address. Even in case of TCP
+getting the original destination address is racy.)
+
+The 'TPROXY' target provides similar functionality without relying on NAT. Simply
+add rules like this to the iptables ruleset above:
+
+# iptables -t mangle -A PREROUTING -p tcp --dport 80 -j TPROXY \
+  --tproxy-mark 0x1/0x1 --on-port 50080
+
+Or the following rule to nft:
+
+# nft add rule filter divert tcp dport 80 tproxy to :50080 meta mark set 1 accept
+
+Note that for this to work you'll have to modify the proxy to enable (SOL_IP,
+IP_TRANSPARENT) for the listening socket.
+
+As an example implementation, tcprdr is available here:
+https://git.breakpoint.cc/cgit/fw/tcprdr.git/
+This tool is written by Florian Westphal and it was used for testing during the
+nf_tables implementation.
+
+3. Iptables and nf_tables extensions
+====================================
+
+To use tproxy you'll need to have the following modules compiled for iptables:
+ - NETFILTER_XT_MATCH_SOCKET
+ - NETFILTER_XT_TARGET_TPROXY
+
+Or the floowing modules for nf_tables:
+ - NFT_SOCKET
+ - NFT_TPROXY
+
+4. Application support
+======================
+
+4.1. Squid
+----------
+
+Squid 3.HEAD has support built-in. To use it, pass
+'--enable-linux-netfilter' to configure and set the 'tproxy' option on
+the HTTP listener you redirect traffic to with the TPROXY iptables
+target.
+
+For more information please consult the following page on the Squid
+wiki: http://wiki.squid-cache.org/Features/Tproxy4
diff --git a/Documentation/networking/tuntap.txt b/Documentation/networking/tuntap.txt
new file mode 100644
index 000000000..949d5dcdd
--- /dev/null
+++ b/Documentation/networking/tuntap.txt
@@ -0,0 +1,227 @@
+Universal TUN/TAP device driver.
+Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
+
+  Linux, Solaris drivers 
+  Copyright (C) 1999-2000 Maxim Krasnyansky <max_mk@yahoo.com>
+
+  FreeBSD TAP driver 
+  Copyright (c) 1999-2000 Maksim Yevmenkin <m_evmenkin@yahoo.com>
+
+  Revision of this document 2002 by Florian Thiel <florian.thiel@gmx.net>
+
+1. Description
+  TUN/TAP provides packet reception and transmission for user space programs. 
+  It can be seen as a simple Point-to-Point or Ethernet device, which,
+  instead of receiving packets from physical media, receives them from 
+  user space program and instead of sending packets via physical media 
+  writes them to the user space program. 
+
+  In order to use the driver a program has to open /dev/net/tun and issue a
+  corresponding ioctl() to register a network device with the kernel. A network
+  device will appear as tunXX or tapXX, depending on the options chosen. When
+  the program closes the file descriptor, the network device and all
+  corresponding routes will disappear.
+
+  Depending on the type of device chosen the userspace program has to read/write
+  IP packets (with tun) or ethernet frames (with tap). Which one is being used
+  depends on the flags given with the ioctl().
+
+  The package from http://vtun.sourceforge.net/tun contains two simple examples
+  for how to use tun and tap devices. Both programs work like a bridge between
+  two network interfaces.
+  br_select.c - bridge based on select system call.
+  br_sigio.c  - bridge based on async io and SIGIO signal.
+  However, the best example is VTun http://vtun.sourceforge.net :))
+
+2. Configuration 
+  Create device node:
+     mkdir /dev/net (if it doesn't exist already)
+     mknod /dev/net/tun c 10 200
+  
+  Set permissions:
+     e.g. chmod 0666 /dev/net/tun
+     There's no harm in allowing the device to be accessible by non-root users,
+     since CAP_NET_ADMIN is required for creating network devices or for 
+     connecting to network devices which aren't owned by the user in question.
+     If you want to create persistent devices and give ownership of them to 
+     unprivileged users, then you need the /dev/net/tun device to be usable by
+     those users.
+
+  Driver module autoloading
+
+     Make sure that "Kernel module loader" - module auto-loading
+     support is enabled in your kernel.  The kernel should load it on
+     first access.
+  
+  Manual loading 
+     insert the module by hand:
+        modprobe tun
+
+  If you do it the latter way, you have to load the module every time you
+  need it, if you do it the other way it will be automatically loaded when
+  /dev/net/tun is being opened.
+
+3. Program interface 
+  3.1 Network device allocation:
+
+  char *dev should be the name of the device with a format string (e.g.
+  "tun%d"), but (as far as I can see) this can be any valid network device name.
+  Note that the character pointer becomes overwritten with the real device name
+  (e.g. "tun0")
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_alloc(char *dev)
+  {
+      struct ifreq ifr;
+      int fd, err;
+
+      if( (fd = open("/dev/net/tun", O_RDWR)) < 0 )
+         return tun_alloc_old(dev);
+
+      memset(&ifr, 0, sizeof(ifr));
+
+      /* Flags: IFF_TUN   - TUN device (no Ethernet headers) 
+       *        IFF_TAP   - TAP device  
+       *
+       *        IFF_NO_PI - Do not provide packet information  
+       */ 
+      ifr.ifr_flags = IFF_TUN; 
+      if( *dev )
+         strncpy(ifr.ifr_name, dev, IFNAMSIZ);
+
+      if( (err = ioctl(fd, TUNSETIFF, (void *) &ifr)) < 0 ){
+         close(fd);
+         return err;
+      }
+      strcpy(dev, ifr.ifr_name);
+      return fd;
+  }              
+ 
+  3.2 Frame format:
+  If flag IFF_NO_PI is not set each frame format is: 
+     Flags [2 bytes]
+     Proto [2 bytes]
+     Raw protocol(IP, IPv6, etc) frame.
+
+  3.3 Multiqueue tuntap interface:
+
+  From version 3.8, Linux supports multiqueue tuntap which can uses multiple
+  file descriptors (queues) to parallelize packets sending or receiving. The
+  device allocation is the same as before, and if user wants to create multiple
+  queues, TUNSETIFF with the same device name must be called many times with
+  IFF_MULTI_QUEUE flag.
+
+  char *dev should be the name of the device, queues is the number of queues to
+  be created, fds is used to store and return the file descriptors (queues)
+  created to the caller. Each file descriptor were served as the interface of a
+  queue which could be accessed by userspace.
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_alloc_mq(char *dev, int queues, int *fds)
+  {
+      struct ifreq ifr;
+      int fd, err, i;
+
+      if (!dev)
+          return -1;
+
+      memset(&ifr, 0, sizeof(ifr));
+      /* Flags: IFF_TUN   - TUN device (no Ethernet headers)
+       *        IFF_TAP   - TAP device
+       *
+       *        IFF_NO_PI - Do not provide packet information
+       *        IFF_MULTI_QUEUE - Create a queue of multiqueue device
+       */
+      ifr.ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
+      strcpy(ifr.ifr_name, dev);
+
+      for (i = 0; i < queues; i++) {
+          if ((fd = open("/dev/net/tun", O_RDWR)) < 0)
+             goto err;
+          err = ioctl(fd, TUNSETIFF, (void *)&ifr);
+          if (err) {
+             close(fd);
+             goto err;
+          }
+          fds[i] = fd;
+      }
+
+      return 0;
+  err:
+      for (--i; i >= 0; i--)
+          close(fds[i]);
+      return err;
+  }
+
+  A new ioctl(TUNSETQUEUE) were introduced to enable or disable a queue. When
+  calling it with IFF_DETACH_QUEUE flag, the queue were disabled. And when
+  calling it with IFF_ATTACH_QUEUE flag, the queue were enabled. The queue were
+  enabled by default after it was created through TUNSETIFF.
+
+  fd is the file descriptor (queue) that we want to enable or disable, when
+  enable is true we enable it, otherwise we disable it
+
+  #include <linux/if.h>
+  #include <linux/if_tun.h>
+
+  int tun_set_queue(int fd, int enable)
+  {
+      struct ifreq ifr;
+
+      memset(&ifr, 0, sizeof(ifr));
+
+      if (enable)
+         ifr.ifr_flags = IFF_ATTACH_QUEUE;
+      else
+         ifr.ifr_flags = IFF_DETACH_QUEUE;
+
+      return ioctl(fd, TUNSETQUEUE, (void *)&ifr);
+  }
+
+Universal TUN/TAP device driver Frequently Asked Question.
+   
+1. What platforms are supported by TUN/TAP driver ?
+Currently driver has been written for 3 Unices:
+   Linux kernels 2.2.x, 2.4.x 
+   FreeBSD 3.x, 4.x, 5.x
+   Solaris 2.6, 7.0, 8.0
+
+2. What is TUN/TAP driver used for?
+As mentioned above, main purpose of TUN/TAP driver is tunneling. 
+It is used by VTun (http://vtun.sourceforge.net).
+
+Another interesting application using TUN/TAP is pipsecd
+(http://perso.enst.fr/~beyssac/pipsec/), a userspace IPSec
+implementation that can use complete kernel routing (unlike FreeS/WAN).
+
+3. How does Virtual network device actually work ? 
+Virtual network device can be viewed as a simple Point-to-Point or
+Ethernet device, which instead of receiving packets from a physical 
+media, receives them from user space program and instead of sending 
+packets via physical media sends them to the user space program. 
+
+Let's say that you configured IPX on the tap0, then whenever 
+the kernel sends an IPX packet to tap0, it is passed to the application
+(VTun for example). The application encrypts, compresses and sends it to 
+the other side over TCP or UDP. The application on the other side decompresses
+and decrypts the data received and writes the packet to the TAP device, 
+the kernel handles the packet like it came from real physical device.
+
+4. What is the difference between TUN driver and TAP driver?
+TUN works with IP frames. TAP works with Ethernet frames.
+
+This means that you have to read/write IP packets when you are using tun and
+ethernet frames when using tap.
+
+5. What is the difference between BPF and TUN/TAP driver?
+BPF is an advanced packet filter. It can be attached to existing
+network interface. It does not provide a virtual network interface.
+A TUN/TAP driver does provide a virtual network interface and it is possible
+to attach BPF to this interface.
+
+6. Does TAP driver support kernel Ethernet bridging?
+Yes. Linux and FreeBSD drivers support Ethernet bridging. 
diff --git a/Documentation/networking/udplite.txt b/Documentation/networking/udplite.txt
new file mode 100644
index 000000000..53a726855
--- /dev/null
+++ b/Documentation/networking/udplite.txt
@@ -0,0 +1,278 @@
+  ===========================================================================
+                      The UDP-Lite protocol (RFC 3828)
+  ===========================================================================
+
+
+  UDP-Lite is a Standards-Track IETF transport protocol whose characteristic
+  is a variable-length checksum. This has advantages for transport of multimedia
+  (video, VoIP) over wireless networks, as partly damaged packets can still be
+  fed into the codec instead of being discarded due to a failed checksum test.
+
+  This file briefly describes the existing kernel support and the socket API.
+  For in-depth information, you can consult:
+
+   o The UDP-Lite Homepage:
+	http://web.archive.org/web/*/http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/ 
+       From here you can also download some example application source code.
+
+   o The UDP-Lite HOWTO on
+       http://web.archive.org/web/*/http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/
+	files/UDP-Lite-HOWTO.txt
+
+   o The Wireshark UDP-Lite WiKi (with capture files):
+       https://wiki.wireshark.org/Lightweight_User_Datagram_Protocol
+
+   o The Protocol Spec, RFC 3828, http://www.ietf.org/rfc/rfc3828.txt
+
+
+  I) APPLICATIONS
+
+  Several applications have been ported successfully to UDP-Lite. Ethereal
+  (now called wireshark) has UDP-Litev4/v6 support by default. 
+  Porting applications to UDP-Lite is straightforward: only socket level and
+  IPPROTO need to be changed; senders additionally set the checksum coverage
+  length (default = header length = 8). Details are in the next section.
+
+
+  II) PROGRAMMING API
+
+  UDP-Lite provides a connectionless, unreliable datagram service and hence
+  uses the same socket type as UDP. In fact, porting from UDP to UDP-Lite is
+  very easy: simply add `IPPROTO_UDPLITE' as the last argument of the socket(2)
+  call so that the statement looks like:
+
+      s = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE);
+
+                      or, respectively,
+
+      s = socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDPLITE);
+
+  With just the above change you are able to run UDP-Lite services or connect
+  to UDP-Lite servers. The kernel will assume that you are not interested in
+  using partial checksum coverage and so emulate UDP mode (full coverage).
+
+  To make use of the partial checksum coverage facilities requires setting a
+  single socket option, which takes an integer specifying the coverage length:
+
+    * Sender checksum coverage: UDPLITE_SEND_CSCOV
+
+      For example,
+
+        int val = 20;
+        setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(int));
+
+      sets the checksum coverage length to 20 bytes (12b data + 8b header).
+      Of each packet only the first 20 bytes (plus the pseudo-header) will be
+      checksummed. This is useful for RTP applications which have a 12-byte
+      base header.
+
+
+    * Receiver checksum coverage: UDPLITE_RECV_CSCOV
+
+      This option is the receiver-side analogue. It is truly optional, i.e. not
+      required to enable traffic with partial checksum coverage. Its function is
+      that of a traffic filter: when enabled, it instructs the kernel to drop
+      all packets which have a coverage _less_ than this value. For example, if
+      RTP and UDP headers are to be protected, a receiver can enforce that only
+      packets with a minimum coverage of 20 are admitted:
+
+        int min = 20;
+        setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &min, sizeof(int));
+
+  The calls to getsockopt(2) are analogous. Being an extension and not a stand-
+  alone protocol, all socket options known from UDP can be used in exactly the
+  same manner as before, e.g. UDP_CORK or UDP_ENCAP.
+
+  A detailed discussion of UDP-Lite checksum coverage options is in section IV.
+
+
+  III) HEADER FILES
+
+  The socket API requires support through header files in /usr/include:
+
+    * /usr/include/netinet/in.h
+        to define IPPROTO_UDPLITE
+
+    * /usr/include/netinet/udplite.h
+        for UDP-Lite header fields and protocol constants
+
+  For testing purposes, the following can serve as a `mini' header file:
+
+    #define IPPROTO_UDPLITE       136
+    #define SOL_UDPLITE           136
+    #define UDPLITE_SEND_CSCOV     10
+    #define UDPLITE_RECV_CSCOV     11
+
+  Ready-made header files for various distros are in the UDP-Lite tarball.
+
+
+  IV) KERNEL BEHAVIOUR WITH REGARD TO THE VARIOUS SOCKET OPTIONS
+
+  To enable debugging messages, the log level need to be set to 8, as most
+  messages use the KERN_DEBUG level (7).
+
+  1) Sender Socket Options
+
+  If the sender specifies a value of 0 as coverage length, the module
+  assumes full coverage, transmits a packet with coverage length of 0
+  and according checksum.  If the sender specifies a coverage < 8 and
+  different from 0, the kernel assumes 8 as default value.  Finally,
+  if the specified coverage length exceeds the packet length, the packet
+  length is used instead as coverage length.
+
+  2) Receiver Socket Options
+
+  The receiver specifies the minimum value of the coverage length it
+  is willing to accept.  A value of 0 here indicates that the receiver
+  always wants the whole of the packet covered. In this case, all
+  partially covered packets are dropped and an error is logged.
+
+  It is not possible to specify illegal values (<0 and <8); in these
+  cases the default of 8 is assumed.
+
+  All packets arriving with a coverage value less than the specified
+  threshold are discarded, these events are also logged.
+
+  3) Disabling the Checksum Computation
+
+  On both sender and receiver, checksumming will always be performed
+  and cannot be disabled using SO_NO_CHECK. Thus
+
+        setsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK,  ... );
+
+  will always will be ignored, while the value of
+
+        getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...);
+
+  is meaningless (as in TCP). Packets with a zero checksum field are
+  illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded.
+
+  4) Fragmentation
+
+  The checksum computation respects both buffersize and MTU. The size
+  of UDP-Lite packets is determined by the size of the send buffer. The
+  minimum size of the send buffer is 2048 (defined as SOCK_MIN_SNDBUF
+  in include/net/sock.h), the default value is configurable as
+  net.core.wmem_default or via setting the SO_SNDBUF socket(7)
+  option. The maximum upper bound for the send buffer is determined
+  by net.core.wmem_max.
+
+  Given a payload size larger than the send buffer size, UDP-Lite will
+  split the payload into several individual packets, filling up the
+  send buffer size in each case.
+
+  The precise value also depends on the interface MTU. The interface MTU,
+  in turn, may trigger IP fragmentation. In this case, the generated
+  UDP-Lite packet is split into several IP packets, of which only the
+  first one contains the L4 header.
+
+  The send buffer size has implications on the checksum coverage length.
+  Consider the following example:
+
+  Payload: 1536 bytes          Send Buffer:     1024 bytes
+  MTU:     1500 bytes          Coverage Length:  856 bytes
+
+  UDP-Lite will ship the 1536 bytes in two separate packets:
+
+  Packet 1: 1024 payload + 8 byte header + 20 byte IP header = 1052 bytes
+  Packet 2:  512 payload + 8 byte header + 20 byte IP header =  540 bytes
+
+  The coverage packet covers the UDP-Lite header and 848 bytes of the
+  payload in the first packet, the second packet is fully covered. Note
+  that for the second packet, the coverage length exceeds the packet
+  length. The kernel always re-adjusts the coverage length to the packet
+  length in such cases.
+
+  As an example of what happens when one UDP-Lite packet is split into
+  several tiny fragments, consider the following example.
+
+  Payload: 1024 bytes            Send buffer size: 1024 bytes
+  MTU:      300 bytes            Coverage length:   575 bytes
+
+  +-+-----------+--------------+--------------+--------------+
+  |8|    272    |      280     |     280      |     280      |
+  +-+-----------+--------------+--------------+--------------+
+               280            560            840           1032
+                                    ^
+  *****checksum coverage*************
+
+  The UDP-Lite module generates one 1032 byte packet (1024 + 8 byte
+  header). According to the interface MTU, these are split into 4 IP
+  packets (280 byte IP payload + 20 byte IP header). The kernel module
+  sums the contents of the entire first two packets, plus 15 bytes of
+  the last packet before releasing the fragments to the IP module.
+
+  To see the analogous case for IPv6 fragmentation, consider a link
+  MTU of 1280 bytes and a write buffer of 3356 bytes. If the checksum
+  coverage is less than 1232 bytes (MTU minus IPv6/fragment header
+  lengths), only the first fragment needs to be considered. When using
+  larger checksum coverage lengths, each eligible fragment needs to be
+  checksummed. Suppose we have a checksum coverage of 3062. The buffer
+  of 3356 bytes will be split into the following fragments:
+
+    Fragment 1: 1280 bytes carrying  1232 bytes of UDP-Lite data
+    Fragment 2: 1280 bytes carrying  1232 bytes of UDP-Lite data
+    Fragment 3:  948 bytes carrying   900 bytes of UDP-Lite data
+
+  The first two fragments have to be checksummed in full, of the last
+  fragment only 598 (= 3062 - 2*1232) bytes are checksummed.
+
+  While it is important that such cases are dealt with correctly, they
+  are (annoyingly) rare: UDP-Lite is designed for optimising multimedia
+  performance over wireless (or generally noisy) links and thus smaller
+  coverage lengths are likely to be expected.
+
+
+  V) UDP-LITE RUNTIME STATISTICS AND THEIR MEANING
+
+  Exceptional and error conditions are logged to syslog at the KERN_DEBUG
+  level.  Live statistics about UDP-Lite are available in /proc/net/snmp
+  and can (with newer versions of netstat) be viewed using
+
+                            netstat -svu
+
+  This displays UDP-Lite statistics variables, whose meaning is as follows.
+
+   InDatagrams:     The total number of datagrams delivered to users.
+
+   NoPorts:         Number of packets received to an unknown port.
+                    These cases are counted separately (not as InErrors).
+
+   InErrors:        Number of erroneous UDP-Lite packets. Errors include:
+                      * internal socket queue receive errors
+                      * packet too short (less than 8 bytes or stated
+                        coverage length exceeds received length)
+                      * xfrm4_policy_check() returned with error
+                      * application has specified larger min. coverage
+                        length than that of incoming packet
+                      * checksum coverage violated
+                      * bad checksum
+
+   OutDatagrams:    Total number of sent datagrams.
+
+   These statistics derive from the UDP MIB (RFC 2013).
+
+
+  VI) IPTABLES
+
+  There is packet match support for UDP-Lite as well as support for the LOG target.
+  If you copy and paste the following line into /etc/protocols,
+
+  udplite 136     UDP-Lite        # UDP-Lite [RFC 3828]
+
+  then
+              iptables -A INPUT -p udplite -j LOG
+
+  will produce logging output to syslog. Dropping and rejecting packets also works.
+
+
+  VII) MAINTAINER ADDRESS
+
+  The UDP-Lite patch was developed at
+                    University of Aberdeen
+                    Electronics Research Group
+                    Department of Engineering
+                    Fraser Noble Building
+                    Aberdeen AB24 3UE; UK
+  The current maintainer is Gerrit Renker, <gerrit@erg.abdn.ac.uk>. Initial
+  code was developed by William  Stanislaus, <william@erg.abdn.ac.uk>.
diff --git a/Documentation/networking/vortex.txt b/Documentation/networking/vortex.txt
new file mode 100644
index 000000000..ad3dead05
--- /dev/null
+++ b/Documentation/networking/vortex.txt
@@ -0,0 +1,448 @@
+Documentation/networking/vortex.txt
+Andrew Morton
+30 April 2000
+
+
+This document describes the usage and errata of the 3Com "Vortex" device
+driver for Linux, 3c59x.c.
+
+The driver was written by Donald Becker <becker@scyld.com>
+
+Don is no longer the prime maintainer of this version of the driver. 
+Please report problems to one or more of:
+
+  Andrew Morton
+  Netdev mailing list <netdev@vger.kernel.org>
+  Linux kernel mailing list <linux-kernel@vger.kernel.org>
+
+Please note the 'Reporting and Diagnosing Problems' section at the end
+of this file.
+
+
+Since kernel 2.3.99-pre6, this driver incorporates the support for the
+3c575-series Cardbus cards which used to be handled by 3c575_cb.c.
+
+This driver supports the following hardware:
+
+	3c590 Vortex 10Mbps
+	3c592 EISA 10Mbps Demon/Vortex
+	3c597 EISA Fast Demon/Vortex
+	3c595 Vortex 100baseTx
+	3c595 Vortex 100baseT4
+	3c595 Vortex 100base-MII
+	3c900 Boomerang 10baseT
+	3c900 Boomerang 10Mbps Combo
+	3c900 Cyclone 10Mbps TPO
+	3c900 Cyclone 10Mbps Combo
+	3c900 Cyclone 10Mbps TPC
+	3c900B-FL Cyclone 10base-FL
+	3c905 Boomerang 100baseTx
+	3c905 Boomerang 100baseT4
+	3c905B Cyclone 100baseTx
+	3c905B Cyclone 10/100/BNC
+	3c905B-FX Cyclone 100baseFx
+	3c905C Tornado
+	3c920B-EMB-WNM (ATI Radeon 9100 IGP)
+	3c980 Cyclone
+	3c980C Python-T
+	3cSOHO100-TX Hurricane
+	3c555 Laptop Hurricane
+	3c556 Laptop Tornado
+	3c556B Laptop Hurricane
+	3c575 [Megahertz] 10/100 LAN  CardBus
+	3c575 Boomerang CardBus
+	3CCFE575BT Cyclone CardBus
+	3CCFE575CT Tornado CardBus
+	3CCFE656 Cyclone CardBus
+	3CCFEM656B Cyclone+Winmodem CardBus
+	3CXFEM656C Tornado+Winmodem CardBus
+	3c450 HomePNA Tornado
+	3c920 Tornado
+	3c982 Hydra Dual Port A
+	3c982 Hydra Dual Port B
+	3c905B-T4
+	3c920B-EMB-WNM Tornado
+
+Module parameters
+=================
+
+There are several parameters which may be provided to the driver when
+its module is loaded.  These are usually placed in /etc/modprobe.d/*.conf
+configuration files.  Example:
+
+options 3c59x debug=3 rx_copybreak=300
+
+If you are using the PCMCIA tools (cardmgr) then the options may be
+placed in /etc/pcmcia/config.opts:
+
+module "3c59x" opts "debug=3 rx_copybreak=300"
+
+
+The supported parameters are:
+
+debug=N
+
+  Where N is a number from 0 to 7.  Anything above 3 produces a lot
+  of output in your system logs.  debug=1 is default.
+
+options=N1,N2,N3,...
+
+  Each number in the list provides an option to the corresponding
+  network card.  So if you have two 3c905's and you wish to provide
+  them with option 0x204 you would use:
+
+    options=0x204,0x204
+
+  The individual options are composed of a number of bitfields which
+  have the following meanings:
+
+  Possible media type settings
+	0	10baseT
+	1	10Mbs AUI
+	2	undefined
+	3	10base2 (BNC)
+	4	100base-TX
+	5	100base-FX
+	6	MII (Media Independent Interface)
+	7	Use default setting from EEPROM
+	8       Autonegotiate
+	9       External MII
+	10      Use default setting from EEPROM
+
+  When generating a value for the 'options' setting, the above media
+  selection values may be OR'ed (or added to) the following:
+
+  0x8000  Set driver debugging level to 7
+  0x4000  Set driver debugging level to 2
+  0x0400  Enable Wake-on-LAN
+  0x0200  Force full duplex mode.
+  0x0010  Bus-master enable bit (Old Vortex cards only)
+
+  For example:
+
+    insmod 3c59x options=0x204
+
+  will force full-duplex 100base-TX, rather than allowing the usual
+  autonegotiation.
+
+global_options=N
+
+  Sets the `options' parameter for all 3c59x NICs in the machine. 
+  Entries in the `options' array above will override any setting of
+  this.
+
+full_duplex=N1,N2,N3...
+
+  Similar to bit 9 of 'options'.  Forces the corresponding card into
+  full-duplex mode.  Please use this in preference to the `options'
+  parameter.
+
+  In fact, please don't use this at all! You're better off getting
+  autonegotiation working properly.
+
+global_full_duplex=N1
+
+  Sets full duplex mode for all 3c59x NICs in the machine.  Entries
+  in the `full_duplex' array above will override any setting of this.
+
+flow_ctrl=N1,N2,N3...
+
+  Use 802.3x MAC-layer flow control.  The 3com cards only support the
+  PAUSE command, which means that they will stop sending packets for a
+  short period if they receive a PAUSE frame from the link partner. 
+
+  The driver only allows flow control on a link which is operating in
+  full duplex mode.
+
+  This feature does not appear to work on the 3c905 - only 3c905B and
+  3c905C have been tested.
+
+  The 3com cards appear to only respond to PAUSE frames which are
+  sent to the reserved destination address of 01:80:c2:00:00:01.  They
+  do not honour PAUSE frames which are sent to the station MAC address.
+
+rx_copybreak=M
+
+  The driver preallocates 32 full-sized (1536 byte) network buffers
+  for receiving.  When a packet arrives, the driver has to decide
+  whether to leave the packet in its full-sized buffer, or to allocate
+  a smaller buffer and copy the packet across into it.
+
+  This is a speed/space tradeoff.
+
+  The value of rx_copybreak is used to decide when to make the copy. 
+  If the packet size is less than rx_copybreak, the packet is copied. 
+  The default value for rx_copybreak is 200 bytes.
+
+max_interrupt_work=N
+
+  The driver's interrupt service routine can handle many receive and
+  transmit packets in a single invocation.  It does this in a loop. 
+  The value of max_interrupt_work governs how many times the interrupt
+  service routine will loop.  The default value is 32 loops.  If this
+  is exceeded the interrupt service routine gives up and generates a
+  warning message "eth0: Too much work in interrupt".
+
+hw_checksums=N1,N2,N3,...
+
+  Recent 3com NICs are able to generate IPv4, TCP and UDP checksums
+  in hardware.  Linux has used the Rx checksumming for a long time. 
+  The "zero copy" patch which is planned for the 2.4 kernel series
+  allows you to make use of the NIC's DMA scatter/gather and transmit
+  checksumming as well.
+
+  The driver is set up so that, when the zerocopy patch is applied,
+  all Tornado and Cyclone devices will use S/G and Tx checksums.
+
+  This module parameter has been provided so you can override this
+  decision.  If you think that Tx checksums are causing a problem, you
+  may disable the feature with `hw_checksums=0'.
+
+  If you think your NIC should be performing Tx checksumming and the
+  driver isn't enabling it, you can force the use of hardware Tx
+  checksumming with `hw_checksums=1'.
+
+  The driver drops a message in the logfiles to indicate whether or
+  not it is using hardware scatter/gather and hardware Tx checksums.
+
+  Scatter/gather and hardware checksums provide considerable
+  performance improvement for the sendfile() system call, but a small
+  decrease in throughput for send().  There is no effect upon receive
+  efficiency.
+
+compaq_ioaddr=N
+compaq_irq=N
+compaq_device_id=N
+
+  "Variables to work-around the Compaq PCI BIOS32 problem"....
+
+watchdog=N
+
+  Sets the time duration (in milliseconds) after which the kernel
+  decides that the transmitter has become stuck and needs to be reset. 
+  This is mainly for debugging purposes, although it may be advantageous
+  to increase this value on LANs which have very high collision rates.
+  The default value is 5000 (5.0 seconds).
+
+enable_wol=N1,N2,N3,...
+
+  Enable Wake-on-LAN support for the relevant interface.  Donald
+  Becker's `ether-wake' application may be used to wake suspended
+  machines.
+
+  Also enables the NIC's power management support.
+
+global_enable_wol=N
+
+  Sets enable_wol mode for all 3c59x NICs in the machine.  Entries in
+  the `enable_wol' array above will override any setting of this.
+
+Media selection
+---------------
+
+A number of the older NICs such as the 3c590 and 3c900 series have
+10base2 and AUI interfaces.
+
+Prior to January, 2001 this driver would autoeselect the 10base2 or AUI
+port if it didn't detect activity on the 10baseT port.  It would then
+get stuck on the 10base2 port and a driver reload was necessary to
+switch back to 10baseT.  This behaviour could not be prevented with a
+module option override.
+
+Later (current) versions of the driver _do_ support locking of the
+media type.  So if you load the driver module with
+
+	modprobe 3c59x options=0
+
+it will permanently select the 10baseT port.  Automatic selection of
+other media types does not occur.
+
+
+Transmit error, Tx status register 82
+-------------------------------------
+
+This is a common error which is almost always caused by another host on
+the same network being in full-duplex mode, while this host is in
+half-duplex mode.  You need to find that other host and make it run in
+half-duplex mode or fix this host to run in full-duplex mode.
+
+As a last resort, you can force the 3c59x driver into full-duplex mode
+with
+
+	options 3c59x full_duplex=1
+
+but this has to be viewed as a workaround for broken network gear and
+should only really be used for equipment which cannot autonegotiate.
+
+
+Additional resources
+--------------------
+
+Details of the device driver implementation are at the top of the source file.
+
+Additional documentation is available at Don Becker's Linux Drivers site:
+
+     http://www.scyld.com/vortex.html
+
+Donald Becker's driver development site:
+
+     http://www.scyld.com/network.html
+
+Donald's vortex-diag program is useful for inspecting the NIC's state:
+
+     http://www.scyld.com/ethercard_diag.html
+
+Donald's mii-diag program may be used for inspecting and manipulating
+the NIC's Media Independent Interface subsystem:
+
+     http://www.scyld.com/ethercard_diag.html#mii-diag
+
+Donald's wake-on-LAN page:
+
+     http://www.scyld.com/wakeonlan.html
+
+3Com's DOS-based application for setting up the NICs EEPROMs:
+
+	ftp://ftp.3com.com/pub/nic/3c90x/3c90xx2.exe
+
+
+Autonegotiation notes
+---------------------
+
+  The driver uses a one-minute heartbeat for adapting to changes in
+  the external LAN environment if link is up and 5 seconds if link is down.
+  This means that when, for example, a machine is unplugged from a hubbed
+  10baseT LAN plugged into a  switched 100baseT LAN, the throughput
+  will be quite dreadful for up to sixty seconds.  Be patient.
+
+  Cisco interoperability note from Walter Wong <wcw+@CMU.EDU>:
+
+  On a side note, adding HAS_NWAY seems to share a problem with the
+  Cisco 6509 switch.  Specifically, you need to change the spanning
+  tree parameter for the port the machine is plugged into to 'portfast'
+  mode.  Otherwise, the negotiation fails.  This has been an issue
+  we've noticed for a while but haven't had the time to track down.
+
+  Cisco switches    (Jeff Busch <jbusch@deja.com>)
+
+    My "standard config" for ports to which PC's/servers connect directly:
+
+        interface FastEthernet0/N
+        description machinename
+        load-interval 30
+        spanning-tree portfast
+
+    If autonegotiation is a problem, you may need to specify "speed
+    100" and "duplex full" as well (or "speed 10" and "duplex half").
+
+    WARNING: DO NOT hook up hubs/switches/bridges to these
+    specially-configured ports! The switch will become very confused.
+
+
+Reporting and diagnosing problems
+---------------------------------
+
+Maintainers find that accurate and complete problem reports are
+invaluable in resolving driver problems.  We are frequently not able to
+reproduce problems and must rely on your patience and efforts to get to
+the bottom of the problem.
+
+If you believe you have a driver problem here are some of the
+steps you should take:
+
+- Is it really a driver problem?
+
+   Eliminate some variables: try different cards, different
+   computers, different cables, different ports on the switch/hub,
+   different versions of the kernel or of the driver, etc.
+
+- OK, it's a driver problem.
+
+   You need to generate a report.  Typically this is an email to the
+   maintainer and/or netdev@vger.kernel.org.  The maintainer's
+   email address will be in the driver source or in the MAINTAINERS file.
+
+- The contents of your report will vary a lot depending upon the
+  problem.  If it's a kernel crash then you should refer to the
+  admin-guide/reporting-bugs.rst file.
+
+  But for most problems it is useful to provide the following:
+
+   o Kernel version, driver version
+
+   o A copy of the banner message which the driver generates when
+     it is initialised.  For example:
+
+     eth0: 3Com PCI 3c905C Tornado at 0xa400,  00:50:da:6a:88:f0, IRQ 19
+     8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
+     MII transceiver found at address 24, status 782d.
+     Enabling bus-master transmits and whole-frame receives.
+
+     NOTE: You must provide the `debug=2' modprobe option to generate
+     a full detection message.  Please do this:
+
+	modprobe 3c59x debug=2
+
+   o If it is a PCI device, the relevant output from 'lspci -vx', eg:
+
+     00:09.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)
+             Subsystem: 3Com Corporation: Unknown device 9200
+             Flags: bus master, medium devsel, latency 32, IRQ 19
+             I/O ports at a400 [size=128]
+             Memory at db000000 (32-bit, non-prefetchable) [size=128]
+             Expansion ROM at <unassigned> [disabled] [size=128K]
+             Capabilities: [dc] Power Management version 2
+     00: b7 10 00 92 07 00 10 02 74 00 00 02 08 20 00 00
+     10: 01 a4 00 00 00 00 00 db 00 00 00 00 00 00 00 00
+     20: 00 00 00 00 00 00 00 00 00 00 00 00 b7 10 00 10
+     30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 0a 0a
+
+   o A description of the environment: 10baseT? 100baseT?
+     full/half duplex? switched or hubbed?
+
+   o Any additional module parameters which you may be providing to the driver.
+
+   o Any kernel logs which are produced.  The more the merrier. 
+     If this is a large file and you are sending your report to a
+     mailing list, mention that you have the logfile, but don't send
+     it.  If you're reporting direct to the maintainer then just send
+     it.
+
+     To ensure that all kernel logs are available, add the
+     following line to /etc/syslog.conf:
+
+         kern.* /var/log/messages
+
+     Then restart syslogd with:
+
+         /etc/rc.d/init.d/syslog restart
+
+     (The above may vary, depending upon which Linux distribution you use).
+
+    o If your problem is reproducible then that's great.  Try the
+      following:
+
+      1) Increase the debug level.  Usually this is done via:
+
+         a) modprobe driver debug=7
+         b) In /etc/modprobe.d/driver.conf:
+            options driver debug=7
+
+      2) Recreate the problem with the higher debug level,
+         send all logs to the maintainer.
+
+      3) Download you card's diagnostic tool from Donald
+         Becker's website <http://www.scyld.com/ethercard_diag.html>.
+         Download mii-diag.c as well.  Build these.
+
+         a) Run 'vortex-diag -aaee' and 'mii-diag -v' when the card is
+            working correctly.  Save the output.
+
+         b) Run the above commands when the card is malfunctioning.  Send
+            both sets of output.
+
+Finally, please be patient and be prepared to do some work.  You may
+end up working on this problem for a week or more as the maintainer
+asks more questions, asks for more tests, asks for patches to be
+applied, etc.  At the end of it all, the problem may even remain
+unresolved.
diff --git a/Documentation/networking/vrf.txt b/Documentation/networking/vrf.txt
new file mode 100644
index 000000000..8ff7b4c8f
--- /dev/null
+++ b/Documentation/networking/vrf.txt
@@ -0,0 +1,404 @@
+Virtual Routing and Forwarding (VRF)
+====================================
+The VRF device combined with ip rules provides the ability to create virtual
+routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the
+Linux network stack. One use case is the multi-tenancy problem where each
+tenant has their own unique routing tables and in the very least need
+different default gateways.
+
+Processes can be "VRF aware" by binding a socket to the VRF device. Packets
+through the socket then use the routing table associated with the VRF
+device. An important feature of the VRF device implementation is that it
+impacts only Layer 3 and above so L2 tools (e.g., LLDP) are not affected
+(ie., they do not need to be run in each VRF). The design also allows
+the use of higher priority ip rules (Policy Based Routing, PBR) to take
+precedence over the VRF device rules directing specific traffic as desired.
+
+In addition, VRF devices allow VRFs to be nested within namespaces. For
+example network namespaces provide separation of network interfaces at the
+device layer, VLANs on the interfaces within a namespace provide L2 separation
+and then VRF devices provide L3 separation.
+
+Design
+------
+A VRF device is created with an associated route table. Network interfaces
+are then enslaved to a VRF device:
+
+         +-----------------------------+
+         |           vrf-blue          |  ===> route table 10
+         +-----------------------------+
+            |        |            |
+         +------+ +------+     +-------------+
+         | eth1 | | eth2 | ... |    bond1    |
+         +------+ +------+     +-------------+
+                                  |       |
+                              +------+ +------+
+                              | eth8 | | eth9 |
+                              +------+ +------+
+
+Packets received on an enslaved device and are switched to the VRF device
+in the IPv4 and IPv6 processing stacks giving the impression that packets
+flow through the VRF device. Similarly on egress routing rules are used to
+send packets to the VRF device driver before getting sent out the actual
+interface. This allows tcpdump on a VRF device to capture all packets into
+and out of the VRF as a whole.[1] Similarly, netfilter[2] and tc rules can be
+applied using the VRF device to specify rules that apply to the VRF domain
+as a whole.
+
+[1] Packets in the forwarded state do not flow through the device, so those
+    packets are not seen by tcpdump. Will revisit this limitation in a
+    future release.
+
+[2] Iptables on ingress supports PREROUTING with skb->dev set to the real
+    ingress device and both INPUT and PREROUTING rules with skb->dev set to
+    the VRF device. For egress POSTROUTING and OUTPUT rules can be written
+    using either the VRF device or real egress device.
+
+Setup
+-----
+1. VRF device is created with an association to a FIB table.
+   e.g, ip link add vrf-blue type vrf table 10
+        ip link set dev vrf-blue up
+
+2. An l3mdev FIB rule directs lookups to the table associated with the device.
+   A single l3mdev rule is sufficient for all VRFs. The VRF device adds the
+   l3mdev rule for IPv4 and IPv6 when the first device is created with a
+   default preference of 1000. Users may delete the rule if desired and add
+   with a different priority or install per-VRF rules.
+
+   Prior to the v4.8 kernel iif and oif rules are needed for each VRF device:
+       ip ru add oif vrf-blue table 10
+       ip ru add iif vrf-blue table 10
+
+3. Set the default route for the table (and hence default route for the VRF).
+       ip route add table 10 unreachable default metric 4278198272
+
+   This high metric value ensures that the default unreachable route can
+   be overridden by a routing protocol suite.  FRRouting interprets
+   kernel metrics as a combined admin distance (upper byte) and priority
+   (lower 3 bytes).  Thus the above metric translates to [255/8192].
+
+4. Enslave L3 interfaces to a VRF device.
+       ip link set dev eth1 master vrf-blue
+
+   Local and connected routes for enslaved devices are automatically moved to
+   the table associated with VRF device. Any additional routes depending on
+   the enslaved device are dropped and will need to be reinserted to the VRF
+   FIB table following the enslavement.
+
+   The IPv6 sysctl option keep_addr_on_down can be enabled to keep IPv6 global
+   addresses as VRF enslavement changes.
+       sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
+
+5. Additional VRF routes are added to associated table.
+       ip route add table 10 ...
+
+
+Applications
+------------
+Applications that are to work within a VRF need to bind their socket to the
+VRF device:
+
+    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);
+
+or to specify the output device using cmsg and IP_PKTINFO.
+
+TCP & UDP services running in the default VRF context (ie., not bound
+to any VRF device) can work across all VRF domains by enabling the
+tcp_l3mdev_accept and udp_l3mdev_accept sysctl options:
+    sysctl -w net.ipv4.tcp_l3mdev_accept=1
+    sysctl -w net.ipv4.udp_l3mdev_accept=1
+
+netfilter rules on the VRF device can be used to limit access to services
+running in the default VRF context as well.
+
+The default VRF does not have limited scope with respect to port bindings.
+That is, if a process does a wildcard bind to a port in the default VRF it
+owns the port across all VRF domains within the network namespace.
+
+################################################################################
+
+Using iproute2 for VRFs
+=======================
+iproute2 supports the vrf keyword as of v4.7. For backwards compatibility this
+section lists both commands where appropriate -- with the vrf keyword and the
+older form without it.
+
+1. Create a VRF
+
+   To instantiate a VRF device and associate it with a table:
+       $ ip link add dev NAME type vrf table ID
+
+   As of v4.8 the kernel supports the l3mdev FIB rule where a single rule
+   covers all VRFs. The l3mdev rule is created for IPv4 and IPv6 on first
+   device create.
+
+2. List VRFs
+
+   To list VRFs that have been created:
+       $ ip [-d] link show type vrf
+         NOTE: The -d option is needed to show the table id
+
+   For example:
+   $ ip -d link show type vrf
+   11: mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+       link/ether 72:b3:ba:91:e2:24 brd ff:ff:ff:ff:ff:ff promiscuity 0
+       vrf table 1 addrgenmode eui64
+   12: red: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+       link/ether b6:6f:6e:f6:da:73 brd ff:ff:ff:ff:ff:ff promiscuity 0
+       vrf table 10 addrgenmode eui64
+   13: blue: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+       link/ether 36:62:e8:7d:bb:8c brd ff:ff:ff:ff:ff:ff promiscuity 0
+       vrf table 66 addrgenmode eui64
+   14: green: <NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
+       link/ether e6:28:b8:63:70:bb brd ff:ff:ff:ff:ff:ff promiscuity 0
+       vrf table 81 addrgenmode eui64
+
+
+   Or in brief output:
+
+   $ ip -br link show type vrf
+   mgmt         UP             72:b3:ba:91:e2:24 <NOARP,MASTER,UP,LOWER_UP>
+   red          UP             b6:6f:6e:f6:da:73 <NOARP,MASTER,UP,LOWER_UP>
+   blue         UP             36:62:e8:7d:bb:8c <NOARP,MASTER,UP,LOWER_UP>
+   green        UP             e6:28:b8:63:70:bb <NOARP,MASTER,UP,LOWER_UP>
+
+
+3. Assign a Network Interface to a VRF
+
+   Network interfaces are assigned to a VRF by enslaving the netdevice to a
+   VRF device:
+       $ ip link set dev NAME master NAME
+
+   On enslavement connected and local routes are automatically moved to the
+   table associated with the VRF device.
+
+   For example:
+   $ ip link set dev eth0 master mgmt
+
+
+4. Show Devices Assigned to a VRF
+
+   To show devices that have been assigned to a specific VRF add the master
+   option to the ip command:
+       $ ip link show vrf NAME
+       $ ip link show master NAME
+
+   For example:
+   $ ip link show vrf red
+   3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
+       link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
+   4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP mode DEFAULT group default qlen 1000
+       link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
+   7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN mode DEFAULT group default qlen 1000
+       link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
+
+
+   Or using the brief output:
+   $ ip -br link show vrf red
+   eth1             UP             02:00:00:00:02:02 <BROADCAST,MULTICAST,UP,LOWER_UP>
+   eth2             UP             02:00:00:00:02:03 <BROADCAST,MULTICAST,UP,LOWER_UP>
+   eth5             DOWN           02:00:00:00:02:06 <BROADCAST,MULTICAST>
+
+
+5. Show Neighbor Entries for a VRF
+
+   To list neighbor entries associated with devices enslaved to a VRF device
+   add the master option to the ip command:
+       $ ip [-6] neigh show vrf NAME
+       $ ip [-6] neigh show master NAME
+
+   For example:
+   $  ip neigh show vrf red
+   10.2.1.254 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
+   10.2.2.254 dev eth2 lladdr 5e:54:01:6a:ee:80 REACHABLE
+
+   $ ip -6 neigh show vrf red
+   2002:1::64 dev eth1 lladdr a6:d9:c7:4f:06:23 REACHABLE
+
+
+6. Show Addresses for a VRF
+
+   To show addresses for interfaces associated with a VRF add the master
+   option to the ip command:
+       $ ip addr show vrf NAME
+       $ ip addr show master NAME
+
+   For example:
+   $ ip addr show vrf red
+   3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
+       link/ether 02:00:00:00:02:02 brd ff:ff:ff:ff:ff:ff
+       inet 10.2.1.2/24 brd 10.2.1.255 scope global eth1
+          valid_lft forever preferred_lft forever
+       inet6 2002:1::2/120 scope global
+          valid_lft forever preferred_lft forever
+       inet6 fe80::ff:fe00:202/64 scope link
+          valid_lft forever preferred_lft forever
+   4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master red state UP group default qlen 1000
+       link/ether 02:00:00:00:02:03 brd ff:ff:ff:ff:ff:ff
+       inet 10.2.2.2/24 brd 10.2.2.255 scope global eth2
+          valid_lft forever preferred_lft forever
+       inet6 2002:2::2/120 scope global
+          valid_lft forever preferred_lft forever
+       inet6 fe80::ff:fe00:203/64 scope link
+          valid_lft forever preferred_lft forever
+   7: eth5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop master red state DOWN group default qlen 1000
+       link/ether 02:00:00:00:02:06 brd ff:ff:ff:ff:ff:ff
+
+   Or in brief format:
+   $ ip -br addr show vrf red
+   eth1             UP             10.2.1.2/24 2002:1::2/120 fe80::ff:fe00:202/64
+   eth2             UP             10.2.2.2/24 2002:2::2/120 fe80::ff:fe00:203/64
+   eth5             DOWN
+
+
+7. Show Routes for a VRF
+
+   To show routes for a VRF use the ip command to display the table associated
+   with the VRF device:
+       $ ip [-6] route show vrf NAME
+       $ ip [-6] route show table ID
+
+   For example:
+   $ ip route show vrf red
+   unreachable default  metric 4278198272
+   broadcast 10.2.1.0 dev eth1  proto kernel  scope link  src 10.2.1.2
+   10.2.1.0/24 dev eth1  proto kernel  scope link  src 10.2.1.2
+   local 10.2.1.2 dev eth1  proto kernel  scope host  src 10.2.1.2
+   broadcast 10.2.1.255 dev eth1  proto kernel  scope link  src 10.2.1.2
+   broadcast 10.2.2.0 dev eth2  proto kernel  scope link  src 10.2.2.2
+   10.2.2.0/24 dev eth2  proto kernel  scope link  src 10.2.2.2
+   local 10.2.2.2 dev eth2  proto kernel  scope host  src 10.2.2.2
+   broadcast 10.2.2.255 dev eth2  proto kernel  scope link  src 10.2.2.2
+
+   $ ip -6 route show vrf red
+   local 2002:1:: dev lo  proto none  metric 0  pref medium
+   local 2002:1::2 dev lo  proto none  metric 0  pref medium
+   2002:1::/120 dev eth1  proto kernel  metric 256  pref medium
+   local 2002:2:: dev lo  proto none  metric 0  pref medium
+   local 2002:2::2 dev lo  proto none  metric 0  pref medium
+   2002:2::/120 dev eth2  proto kernel  metric 256  pref medium
+   local fe80:: dev lo  proto none  metric 0  pref medium
+   local fe80:: dev lo  proto none  metric 0  pref medium
+   local fe80::ff:fe00:202 dev lo  proto none  metric 0  pref medium
+   local fe80::ff:fe00:203 dev lo  proto none  metric 0  pref medium
+   fe80::/64 dev eth1  proto kernel  metric 256  pref medium
+   fe80::/64 dev eth2  proto kernel  metric 256  pref medium
+   ff00::/8 dev red  metric 256  pref medium
+   ff00::/8 dev eth1  metric 256  pref medium
+   ff00::/8 dev eth2  metric 256  pref medium
+   unreachable default dev lo  metric 4278198272  error -101 pref medium
+
+8. Route Lookup for a VRF
+
+   A test route lookup can be done for a VRF:
+       $ ip [-6] route get vrf NAME ADDRESS
+       $ ip [-6] route get oif NAME ADDRESS
+
+   For example:
+   $ ip route get 10.2.1.40 vrf red
+   10.2.1.40 dev eth1  table red  src 10.2.1.2
+       cache
+
+   $ ip -6 route get 2002:1::32 vrf red
+   2002:1::32 from :: dev eth1  table red  proto kernel  src 2002:1::2  metric 256  pref medium
+
+
+9. Removing Network Interface from a VRF
+
+   Network interfaces are removed from a VRF by breaking the enslavement to
+   the VRF device:
+       $ ip link set dev NAME nomaster
+
+   Connected routes are moved back to the default table and local entries are
+   moved to the local table.
+
+   For example:
+   $ ip link set dev eth0 nomaster
+
+--------------------------------------------------------------------------------
+
+Commands used in this example:
+
+cat >> /etc/iproute2/rt_tables.d/vrf.conf <<EOF
+1  mgmt
+10 red
+66 blue
+81 green
+EOF
+
+function vrf_create
+{
+    VRF=$1
+    TBID=$2
+
+    # create VRF device
+    ip link add ${VRF} type vrf table ${TBID}
+
+    if [ "${VRF}" != "mgmt" ]; then
+        ip route add table ${TBID} unreachable default metric 4278198272
+    fi
+    ip link set dev ${VRF} up
+}
+
+vrf_create mgmt 1
+ip link set dev eth0 master mgmt
+
+vrf_create red 10
+ip link set dev eth1 master red
+ip link set dev eth2 master red
+ip link set dev eth5 master red
+
+vrf_create blue 66
+ip link set dev eth3 master blue
+
+vrf_create green 81
+ip link set dev eth4 master green
+
+
+Interface addresses from /etc/network/interfaces:
+auto eth0
+iface eth0 inet static
+      address 10.0.0.2
+      netmask 255.255.255.0
+      gateway 10.0.0.254
+
+iface eth0 inet6 static
+      address 2000:1::2
+      netmask 120
+
+auto eth1
+iface eth1 inet static
+      address 10.2.1.2
+      netmask 255.255.255.0
+
+iface eth1 inet6 static
+      address 2002:1::2
+      netmask 120
+
+auto eth2
+iface eth2 inet static
+      address 10.2.2.2
+      netmask 255.255.255.0
+
+iface eth2 inet6 static
+      address 2002:2::2
+      netmask 120
+
+auto eth3
+iface eth3 inet static
+      address 10.2.3.2
+      netmask 255.255.255.0
+
+iface eth3 inet6 static
+      address 2002:3::2
+      netmask 120
+
+auto eth4
+iface eth4 inet static
+      address 10.2.4.2
+      netmask 255.255.255.0
+
+iface eth4 inet6 static
+      address 2002:4::2
+      netmask 120
diff --git a/Documentation/networking/vxge.txt b/Documentation/networking/vxge.txt
new file mode 100644
index 000000000..abfec245f
--- /dev/null
+++ b/Documentation/networking/vxge.txt
@@ -0,0 +1,93 @@
+Neterion's (Formerly S2io) X3100 Series 10GbE PCIe Server Adapter Linux driver
+==============================================================================
+
+Contents
+--------
+
+1) Introduction
+2) Features supported
+3) Configurable driver parameters
+4) Troubleshooting
+
+1) Introduction:
+----------------
+This Linux driver supports all Neterion's X3100 series 10 GbE PCIe I/O
+Virtualized Server adapters.
+The X3100 series supports four modes of operation, configurable via
+firmware -
+	Single function mode
+	Multi function mode
+	SRIOV mode
+	MRIOV mode
+The functions share a 10GbE link and the pci-e bus, but hardly anything else
+inside the ASIC. Features like independent hw reset, statistics, bandwidth/
+priority allocation and guarantees, GRO, TSO, interrupt moderation etc are
+supported independently on each function.
+
+(See below for a complete list of features supported for both IPv4 and IPv6)
+
+2) Features supported:
+----------------------
+
+i)   Single function mode (up to 17 queues)
+
+ii)  Multi function mode (up to 17 functions)
+
+iii) PCI-SIG's I/O Virtualization
+       - Single Root mode: v1.0 (up to 17 functions)
+       - Multi-Root mode: v1.0 (up to 17 functions)
+
+iv)  Jumbo frames
+       X3100 Series supports MTU up to 9600 bytes, modifiable using
+       ip command.
+
+v)   Offloads supported: (Enabled by default)
+       Checksum offload (TCP/UDP/IP) on transmit and receive paths
+       TCP Segmentation Offload (TSO) on transmit path
+       Generic Receive Offload (GRO) on receive path
+
+vi)  MSI-X: (Enabled by default)
+       Resulting in noticeable performance improvement (up to 7% on certain
+       platforms).
+
+vii) NAPI: (Enabled by default)
+       For better Rx interrupt moderation.
+
+viii)RTH (Receive Traffic Hash): (Enabled by default)
+       Receive side steering for better scaling.
+
+ix)  Statistics
+       Comprehensive MAC-level and software statistics displayed using
+       "ethtool -S" option.
+
+x)   Multiple hardware queues: (Enabled by default)
+       Up to 17 hardware based transmit and receive data channels, with
+       multiple steering options (transmit multiqueue enabled by default).
+
+3) Configurable driver parameters:
+----------------------------------
+
+i)  max_config_dev
+       Specifies maximum device functions to be enabled.
+       Valid range: 1-8
+
+ii) max_config_port
+       Specifies number of ports to be enabled.
+       Valid range: 1,2
+       Default: 1
+
+iii)max_config_vpath
+       Specifies maximum VPATH(s) configured for each device function.
+       Valid range: 1-17
+
+iv) vlan_tag_strip
+       Enables/disables vlan tag stripping from all received tagged frames that
+       are not replicated at the internal L2 switch.
+       Valid range: 0,1 (disabled, enabled respectively)
+       Default: 1
+
+v)  addr_learn_en
+       Enable learning the mac address of the guest OS interface in
+       virtualization environment.
+       Valid range: 0,1 (disabled, enabled respectively)
+       Default: 0
diff --git a/Documentation/networking/vxlan.txt b/Documentation/networking/vxlan.txt
new file mode 100644
index 000000000..c28f4989c
--- /dev/null
+++ b/Documentation/networking/vxlan.txt
@@ -0,0 +1,51 @@
+Virtual eXtensible Local Area Networking documentation
+======================================================
+
+The VXLAN protocol is a tunnelling protocol designed to solve the
+problem of limited VLAN IDs (4096) in IEEE 802.1q.  With VXLAN the
+size of the identifier is expanded to 24 bits (16777216).
+
+VXLAN is described by IETF RFC 7348, and has been implemented by a
+number of vendors.  The protocol runs over UDP using a single
+destination port.  This document describes the Linux kernel tunnel
+device, there is also a separate implementation of VXLAN for
+Openvswitch.
+
+Unlike most tunnels, a VXLAN is a 1 to N network, not just point to
+point. A VXLAN device can learn the IP address of the other endpoint
+either dynamically in a manner similar to a learning bridge, or make
+use of statically-configured forwarding entries.
+
+The management of vxlan is done in a manner similar to its two closest
+neighbors GRE and VLAN. Configuring VXLAN requires the version of
+iproute2 that matches the kernel release where VXLAN was first merged
+upstream.
+
+1. Create vxlan device
+ # ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth1 dstport 4789
+
+This creates a new device named vxlan0.  The device uses the multicast
+group 239.1.1.1 over eth1 to handle traffic for which there is no
+entry in the forwarding table.  The destination port number is set to
+the IANA-assigned value of 4789.  The Linux implementation of VXLAN
+pre-dates the IANA's selection of a standard destination port number
+and uses the Linux-selected value by default to maintain backwards
+compatibility.
+
+2. Delete vxlan device
+  # ip link delete vxlan0
+
+3. Show vxlan info
+  # ip -d link show vxlan0
+
+It is possible to create, destroy and display the vxlan
+forwarding table using the new bridge command.
+
+1. Create forwarding table entry
+  # bridge fdb add to 00:17:42:8a:b4:05 dst 192.19.0.2 dev vxlan0
+
+2. Delete forwarding table entry
+  # bridge fdb delete 00:17:42:8a:b4:05 dev vxlan0
+
+3. Show forwarding table
+  # bridge fdb show dev vxlan0
diff --git a/Documentation/networking/x25-iface.txt b/Documentation/networking/x25-iface.txt
new file mode 100644
index 000000000..7f213b556
--- /dev/null
+++ b/Documentation/networking/x25-iface.txt
@@ -0,0 +1,123 @@
+			X.25 Device Driver Interface 1.1
+
+			   Jonathan Naylor 26.12.96
+
+This is a description of the messages to be passed between the X.25 Packet
+Layer and the X.25 device driver. They are designed to allow for the easy
+setting of the LAPB mode from within the Packet Layer.
+
+The X.25 device driver will be coded normally as per the Linux device driver
+standards. Most X.25 device drivers will be moderately similar to the
+already existing Ethernet device drivers. However unlike those drivers, the
+X.25 device driver has a state associated with it, and this information
+needs to be passed to and from the Packet Layer for proper operation.
+
+All messages are held in sk_buff's just like real data to be transmitted
+over the LAPB link. The first byte of the skbuff indicates the meaning of
+the rest of the skbuff, if any more information does exist.
+
+
+Packet Layer to Device Driver
+-----------------------------
+
+First Byte = 0x00 (X25_IFACE_DATA)
+
+This indicates that the rest of the skbuff contains data to be transmitted
+over the LAPB link. The LAPB link should already exist before any data is
+passed down.
+
+First Byte = 0x01 (X25_IFACE_CONNECT)
+
+Establish the LAPB link. If the link is already established then the connect
+confirmation message should be returned as soon as possible.
+
+First Byte = 0x02 (X25_IFACE_DISCONNECT)
+
+Terminate the LAPB link. If it is already disconnected then the disconnect
+confirmation message should be returned as soon as possible.
+
+First Byte = 0x03 (X25_IFACE_PARAMS)
+
+LAPB parameters. To be defined.
+
+
+Device Driver to Packet Layer
+-----------------------------
+
+First Byte = 0x00 (X25_IFACE_DATA)
+
+This indicates that the rest of the skbuff contains data that has been
+received over the LAPB link.
+
+First Byte = 0x01 (X25_IFACE_CONNECT)
+
+LAPB link has been established. The same message is used for both a LAPB
+link connect_confirmation and a connect_indication.
+
+First Byte = 0x02 (X25_IFACE_DISCONNECT)
+
+LAPB link has been terminated. This same message is used for both a LAPB
+link disconnect_confirmation and a disconnect_indication.
+
+First Byte = 0x03 (X25_IFACE_PARAMS)
+
+LAPB parameters. To be defined.
+
+
+
+Possible Problems
+=================
+
+(Henner Eisen, 2000-10-28)
+
+The X.25 packet layer protocol depends on a reliable datalink service.
+The LAPB protocol provides such reliable service. But this reliability
+is not preserved by the Linux network device driver interface:
+
+- With Linux 2.4.x (and above) SMP kernels, packet ordering is not
+  preserved. Even if a device driver calls netif_rx(skb1) and later
+  netif_rx(skb2), skb2 might be delivered to the network layer
+  earlier that skb1.
+- Data passed upstream by means of netif_rx() might be dropped by the
+  kernel if the backlog queue is congested.
+
+The X.25 packet layer protocol will detect this and reset the virtual
+call in question. But many upper layer protocols are not designed to
+handle such N-Reset events gracefully. And frequent N-Reset events
+will always degrade performance.
+
+Thus, driver authors should make netif_rx() as reliable as possible:
+
+SMP re-ordering will not occur if the driver's interrupt handler is
+always executed on the same CPU. Thus,
+
+- Driver authors should use irq affinity for the interrupt handler.
+
+The probability of packet loss due to backlog congestion can be
+reduced by the following measures or a combination thereof:
+
+(1) Drivers for kernel versions 2.4.x and above should always check the
+    return value of netif_rx(). If it returns NET_RX_DROP, the
+    driver's LAPB protocol must not confirm reception of the frame
+    to the peer. 
+    This will reliably suppress packet loss. The LAPB protocol will
+    automatically cause the peer to re-transmit the dropped packet
+    later.
+    The lapb module interface was modified to support this. Its
+    data_indication() method should now transparently pass the
+    netif_rx() return value to the (lapb module) caller.
+(2) Drivers for kernel versions 2.2.x should always check the global
+    variable netdev_dropping when a new frame is received. The driver
+    should only call netif_rx() if netdev_dropping is zero. Otherwise
+    the driver should not confirm delivery of the frame and drop it.
+    Alternatively, the driver can queue the frame internally and call
+    netif_rx() later when netif_dropping is 0 again. In that case, delivery
+    confirmation should also be deferred such that the internal queue
+    cannot grow to much.
+    This will not reliably avoid packet loss, but the probability
+    of packet loss in netif_rx() path will be significantly reduced.
+(3) Additionally, driver authors might consider to support
+    CONFIG_NET_HW_FLOWCONTROL. This allows the driver to be woken up
+    when a previously congested backlog queue becomes empty again.
+    The driver could uses this for flow-controlling the peer by means
+    of the LAPB protocol's flow-control service.
diff --git a/Documentation/networking/x25.txt b/Documentation/networking/x25.txt
new file mode 100644
index 000000000..c91c6d715
--- /dev/null
+++ b/Documentation/networking/x25.txt
@@ -0,0 +1,44 @@
+Linux X.25 Project
+
+As my third year dissertation at University I have taken it upon myself to
+write an X.25 implementation for Linux. My aim is to provide a complete X.25
+Packet Layer and a LAPB module to allow for "normal" X.25 to be run using
+Linux. There are two sorts of X.25 cards available, intelligent ones that
+implement LAPB on the card itself, and unintelligent ones that simply do
+framing, bit-stuffing and checksumming. These both need to be handled by the
+system.
+
+I therefore decided to write the implementation such that as far as the
+Packet Layer is concerned, the link layer was being performed by a lower
+layer of the Linux kernel and therefore it did not concern itself with
+implementation of LAPB. Therefore the LAPB modules would be called by
+unintelligent X.25 card drivers and not by intelligent ones, this would
+provide a uniform device driver interface, and simplify configuration.
+
+To confuse matters a little, an 802.2 LLC implementation for Linux is being
+written which will allow X.25 to be run over an Ethernet (or Token Ring) and
+conform with the JNT "Pink Book", this will have a different interface to
+the Packet Layer but there will be no confusion since the class of device
+being served by the LLC will be completely separate from LAPB. The LLC
+implementation is being done as part of another protocol project (SNA) and
+by a different author.
+
+Just when you thought that it could not become more confusing, another
+option appeared, XOT. This allows X.25 Packet Layer frames to operate over
+the Internet using TCP/IP as a reliable link layer. RFC1613 specifies the
+format and behaviour of the protocol. If time permits this option will also
+be actively considered.
+
+A linux-x25 mailing list has been created at vger.kernel.org to support the
+development and use of Linux X.25. It is early days yet, but interested
+parties are welcome to subscribe to it. Just send a message to
+majordomo@vger.kernel.org with the following in the message body:
+
+subscribe linux-x25
+end
+
+The contents of the Subject line are ignored.
+
+Jonathan
+
+g4klx@g4klx.demon.co.uk
diff --git a/Documentation/networking/xfrm_device.txt b/Documentation/networking/xfrm_device.txt
new file mode 100644
index 000000000..50c34ca65
--- /dev/null
+++ b/Documentation/networking/xfrm_device.txt
@@ -0,0 +1,135 @@
+
+===============================================
+XFRM device - offloading the IPsec computations
+===============================================
+Shannon Nelson <shannon.nelson@oracle.com>
+
+
+Overview
+========
+
+IPsec is a useful feature for securing network traffic, but the
+computational cost is high: a 10Gbps link can easily be brought down
+to under 1Gbps, depending on the traffic and link configuration.
+Luckily, there are NICs that offer a hardware based IPsec offload which
+can radically increase throughput and decrease CPU utilization.  The XFRM
+Device interface allows NIC drivers to offer to the stack access to the
+hardware offload.
+
+Userland access to the offload is typically through a system such as
+libreswan or KAME/raccoon, but the iproute2 'ip xfrm' command set can
+be handy when experimenting.  An example command might look something
+like this:
+
+  ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
+     reqid 0x07 replay-window 32 \
+     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
+     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \
+     offload dev eth4 dir in
+
+Yes, that's ugly, but that's what shell scripts and/or libreswan are for.
+
+
+
+Callbacks to implement
+======================
+
+/* from include/linux/netdevice.h */
+struct xfrmdev_ops {
+	int	(*xdo_dev_state_add) (struct xfrm_state *x);
+	void	(*xdo_dev_state_delete) (struct xfrm_state *x);
+	void	(*xdo_dev_state_free) (struct xfrm_state *x);
+	bool	(*xdo_dev_offload_ok) (struct sk_buff *skb,
+				       struct xfrm_state *x);
+	void    (*xdo_dev_state_advance_esn) (struct xfrm_state *x);
+};
+
+The NIC driver offering ipsec offload will need to implement these
+callbacks to make the offload available to the network stack's
+XFRM subsytem.  Additionally, the feature bits NETIF_F_HW_ESP and
+NETIF_F_HW_ESP_TX_CSUM will signal the availability of the offload.
+
+
+
+Flow
+====
+
+At probe time and before the call to register_netdev(), the driver should
+set up local data structures and XFRM callbacks, and set the feature bits.
+The XFRM code's listener will finish the setup on NETDEV_REGISTER.
+
+		adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
+		adapter->netdev->features |= NETIF_F_HW_ESP;
+		adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;
+
+When new SAs are set up with a request for "offload" feature, the
+driver's xdo_dev_state_add() will be given the new SA to be offloaded
+and an indication of whether it is for Rx or Tx.  The driver should
+	- verify the algorithm is supported for offloads
+	- store the SA information (key, salt, target-ip, protocol, etc)
+	- enable the HW offload of the SA
+
+The driver can also set an offload_handle in the SA, an opaque void pointer
+that can be used to convey context into the fast-path offload requests.
+
+		xs->xso.offload_handle = context;
+
+
+When the network stack is preparing an IPsec packet for an SA that has
+been setup for offload, it first calls into xdo_dev_offload_ok() with
+the skb and the intended offload state to ask the driver if the offload
+will serviceable.  This can check the packet information to be sure the
+offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
+return true of false to signify its support.
+
+When ready to send, the driver needs to inspect the Tx packet for the
+offload information, including the opaque context, and set up the packet
+send accordingly.
+
+		xs = xfrm_input_state(skb);
+		context = xs->xso.offload_handle;
+		set up HW for send
+
+The stack has already inserted the appropriate IPsec headers in the
+packet data, the offload just needs to do the encryption and fix up the
+header values.
+
+
+When a packet is received and the HW has indicated that it offloaded a
+decryption, the driver needs to add a reference to the decoded SA into
+the packet's skb.  At this point the data should be decrypted but the
+IPsec headers are still in the packet data; they are removed later up
+the stack in xfrm_input().
+
+	find and hold the SA that was used to the Rx skb
+		get spi, protocol, and destination IP from packet headers
+		xs = find xs from (spi, protocol, dest_IP)
+		xfrm_state_hold(xs);
+
+	store the state information into the skb
+		skb->sp = secpath_dup(skb->sp);
+		skb->sp->xvec[skb->sp->len++] = xs;
+		skb->sp->olen++;
+
+	indicate the success and/or error status of the offload
+		xo = xfrm_offload(skb);
+		xo->flags = CRYPTO_DONE;
+		xo->status = crypto_status;
+
+	hand the packet to napi_gro_receive() as usual
+
+In ESN mode, xdo_dev_state_advance_esn() is called from xfrm_replay_advance_esn().
+Driver will check packet seq number and update HW ESN state machine if needed.
+
+When the SA is removed by the user, the driver's xdo_dev_state_delete()
+is asked to disable the offload.  Later, xdo_dev_state_free() is called
+from a garbage collection routine after all reference counts to the state
+have been removed and any remaining resources can be cleared for the
+offload state.  How these are used by the driver will depend on specific
+hardware needs.
+
+As a netdev is set to DOWN the XFRM stack's netdev listener will call
+xdo_dev_state_delete() and xdo_dev_state_free() on any remaining offloaded
+states.
+
+
diff --git a/Documentation/networking/xfrm_proc.txt b/Documentation/networking/xfrm_proc.txt
new file mode 100644
index 000000000..2eae619ab
--- /dev/null
+++ b/Documentation/networking/xfrm_proc.txt
@@ -0,0 +1,82 @@
+XFRM proc - /proc/net/xfrm_* files
+==================================
+Masahide NAKAMURA <nakam@linux-ipv6.org>
+
+
+Transformation Statistics
+-------------------------
+
+The xfrm_proc code is a set of statistics showing numbers of packets
+dropped by the transformation code and why.  These counters are defined
+as part of the linux private MIB.  These counters can be viewed in
+/proc/net/xfrm_stat.
+
+
+Inbound errors
+~~~~~~~~~~~~~~
+XfrmInError:
+	All errors which is not matched others
+XfrmInBufferError:
+	No buffer is left
+XfrmInHdrError:
+	Header error
+XfrmInNoStates:
+	No state is found
+	i.e. Either inbound SPI, address, or IPsec protocol at SA is wrong
+XfrmInStateProtoError:
+	Transformation protocol specific error
+	e.g. SA key is wrong
+XfrmInStateModeError:
+	Transformation mode specific error
+XfrmInStateSeqError:
+	Sequence error
+	i.e. Sequence number is out of window
+XfrmInStateExpired:
+	State is expired
+XfrmInStateMismatch:
+	State has mismatch option
+	e.g. UDP encapsulation type is mismatch
+XfrmInStateInvalid:
+	State is invalid
+XfrmInTmplMismatch:
+	No matching template for states
+	e.g. Inbound SAs are correct but SP rule is wrong
+XfrmInNoPols:
+	No policy is found for states
+	e.g. Inbound SAs are correct but no SP is found
+XfrmInPolBlock:
+	Policy discards
+XfrmInPolError:
+	Policy error
+XfrmAcquireError:
+	State hasn't been fully acquired before use
+XfrmFwdHdrError:
+	Forward routing of a packet is not allowed
+
+Outbound errors
+~~~~~~~~~~~~~~~
+XfrmOutError:
+	All errors which is not matched others
+XfrmOutBundleGenError:
+	Bundle generation error
+XfrmOutBundleCheckError:
+	Bundle check error
+XfrmOutNoStates:
+	No state is found
+XfrmOutStateProtoError:
+	Transformation protocol specific error
+XfrmOutStateModeError:
+	Transformation mode specific error
+XfrmOutStateSeqError:
+	Sequence error
+	i.e. Sequence number overflow
+XfrmOutStateExpired:
+	State is expired
+XfrmOutPolBlock:
+	Policy discards
+XfrmOutPolDead:
+	Policy is dead
+XfrmOutPolError:
+	Policy error
+XfrmOutStateInvalid:
+	State is invalid, perhaps expired
diff --git a/Documentation/networking/xfrm_sync.txt b/Documentation/networking/xfrm_sync.txt
new file mode 100644
index 000000000..8d88e0f2e
--- /dev/null
+++ b/Documentation/networking/xfrm_sync.txt
@@ -0,0 +1,169 @@
+
+The sync patches work is based on initial patches from
+Krisztian <hidden@balabit.hu> and others and additional patches
+from Jamal <hadi@cyberus.ca>.
+
+The end goal for syncing is to be able to insert attributes + generate
+events so that the SA can be safely moved from one machine to another
+for HA purposes.
+The idea is to synchronize the SA so that the takeover machine can do
+the processing of the SA as accurate as possible if it has access to it.
+
+We already have the ability to generate SA add/del/upd events.
+These patches add ability to sync and have accurate lifetime byte (to
+ensure proper decay of SAs) and replay counters to avoid replay attacks
+with as minimal loss at failover time.
+This way a backup stays as closely up-to-date as an active member.
+
+Because the above items change for every packet the SA receives,
+it is possible for a lot of the events to be generated.
+For this reason, we also add a nagle-like algorithm to restrict
+the events. i.e we are going to set thresholds to say "let me
+know if the replay sequence threshold is reached or 10 secs have passed"
+These thresholds are set system-wide via sysctls or can be updated
+per SA.
+
+The identified items that need to be synchronized are:
+- the lifetime byte counter
+note that: lifetime time limit is not important if you assume the failover
+machine is known ahead of time since the decay of the time countdown
+is not driven by packet arrival.
+- the replay sequence for both inbound and outbound
+
+1) Message Structure
+----------------------
+
+nlmsghdr:aevent_id:optional-TLVs.
+
+The netlink message types are:
+
+XFRM_MSG_NEWAE and XFRM_MSG_GETAE.
+
+A XFRM_MSG_GETAE does not have TLVs.
+A XFRM_MSG_NEWAE will have at least two TLVs (as is
+discussed further below).
+
+aevent_id structure looks like:
+
+   struct xfrm_aevent_id {
+             struct xfrm_usersa_id           sa_id;
+             xfrm_address_t                  saddr;
+             __u32                           flags;
+             __u32                           reqid;
+   };
+
+The unique SA is identified by the combination of xfrm_usersa_id,
+reqid and saddr.
+
+flags are used to indicate different things. The possible
+flags are:
+        XFRM_AE_RTHR=1, /* replay threshold*/
+        XFRM_AE_RVAL=2, /* replay value */
+        XFRM_AE_LVAL=4, /* lifetime value */
+        XFRM_AE_ETHR=8, /* expiry timer threshold */
+        XFRM_AE_CR=16, /* Event cause is replay update */
+        XFRM_AE_CE=32, /* Event cause is timer expiry */
+        XFRM_AE_CU=64, /* Event cause is policy update */
+
+How these flags are used is dependent on the direction of the
+message (kernel<->user) as well the cause (config, query or event).
+This is described below in the different messages.
+
+The pid will be set appropriately in netlink to recognize direction
+(0 to the kernel and pid = processid that created the event
+when going from kernel to user space)
+
+A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
+to get notified of these events.
+
+2) TLVS reflect the different parameters:
+-----------------------------------------
+
+a) byte value (XFRMA_LTIME_VAL)
+This TLV carries the running/current counter for byte lifetime since
+last event.
+
+b)replay value (XFRMA_REPLAY_VAL)
+This TLV carries the running/current counter for replay sequence since
+last event.
+
+c)replay threshold (XFRMA_REPLAY_THRESH)
+This TLV carries the threshold being used by the kernel to trigger events
+when the replay sequence is exceeded.
+
+d) expiry timer (XFRMA_ETIMER_THRESH)
+This is a timer value in milliseconds which is used as the nagle
+value to rate limit the events.
+
+3) Default configurations for the parameters:
+----------------------------------------------
+
+By default these events should be turned off unless there is
+at least one listener registered to listen to the multicast
+group XFRMNLGRP_AEVENTS.
+
+Programs installing SAs will need to specify the two thresholds, however,
+in order to not change existing applications such as racoon
+we also provide default threshold values for these different parameters
+in case they are not specified.
+
+the two sysctls/proc entries are:
+a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
+used to provide default values for the XFRMA_ETIMER_THRESH in incremental
+units of time of 100ms. The default is 10 (1 second)
+
+b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
+used to provide default values for XFRMA_REPLAY_THRESH parameter
+in incremental packet count. The default is two packets.
+
+4) Message types
+----------------
+
+a) XFRM_MSG_GETAE issued by user-->kernel.
+XFRM_MSG_GETAE does not carry any TLVs.
+The response is a XFRM_MSG_NEWAE which is formatted based on what
+XFRM_MSG_GETAE queried for.
+The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+*if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
+*if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved
+
+b) XFRM_MSG_NEWAE is issued by either user space to configure
+or kernel to announce events or respond to a XFRM_MSG_GETAE.
+
+i) user --> kernel to configure a specific SA.
+any of the values or threshold parameters can be updated by passing the
+appropriate TLV.
+A response is issued back to the sender in user space to indicate success
+or failure.
+In the case of success, additionally an event with
+XFRM_MSG_NEWAE is also issued to any listeners as described in iii).
+
+ii) kernel->user direction as a response to XFRM_MSG_GETAE
+The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+The threshold TLVs will be included if explicitly requested in
+the XFRM_MSG_GETAE message.
+
+iii) kernel->user to report as event if someone sets any values or
+thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
+In such a case XFRM_AE_CU flag is set to inform the user that
+the change happened as a result of an update.
+The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+
+iv) kernel->user to report event when replay threshold or a timeout
+is exceeded.
+In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
+happened) is set to inform the user what happened.
+Note the two flags are mutually exclusive.
+The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+
+Exceptions to threshold settings
+--------------------------------
+
+If you have an SA that is getting hit by traffic in bursts such that
+there is a period where the timer threshold expires with no packets
+seen, then an odd behavior is seen as follows:
+The first packet arrival after a timer expiry will trigger a timeout
+event; i.e we don't wait for a timeout period or a packet threshold
+to be reached. This is done for simplicity and efficiency reasons.
+
+-JHS
diff --git a/Documentation/networking/xfrm_sysctl.txt b/Documentation/networking/xfrm_sysctl.txt
new file mode 100644
index 000000000..5bbd16792
--- /dev/null
+++ b/Documentation/networking/xfrm_sysctl.txt
@@ -0,0 +1,4 @@
+/proc/sys/net/core/xfrm_* Variables:
+
+xfrm_acq_expires - INTEGER
+	default 30 - hard timeout in seconds for acquire requests
diff --git a/Documentation/networking/z8530book.rst b/Documentation/networking/z8530book.rst
new file mode 100644
index 000000000..fea2c40e7
--- /dev/null
+++ b/Documentation/networking/z8530book.rst
@@ -0,0 +1,256 @@
+=======================
+Z8530 Programming Guide
+=======================
+
+:Author: Alan Cox
+
+Introduction
+============
+
+The Z85x30 family synchronous/asynchronous controller chips are used on
+a large number of cheap network interface cards. The kernel provides a
+core interface layer that is designed to make it easy to provide WAN
+services using this chip.
+
+The current driver only support synchronous operation. Merging the
+asynchronous driver support into this code to allow any Z85x30 device to
+be used as both a tty interface and as a synchronous controller is a
+project for Linux post the 2.4 release
+
+Driver Modes
+============
+
+The Z85230 driver layer can drive Z8530, Z85C30 and Z85230 devices in
+three different modes. Each mode can be applied to an individual channel
+on the chip (each chip has two channels).
+
+The PIO synchronous mode supports the most common Z8530 wiring. Here the
+chip is interface to the I/O and interrupt facilities of the host
+machine but not to the DMA subsystem. When running PIO the Z8530 has
+extremely tight timing requirements. Doing high speeds, even with a
+Z85230 will be tricky. Typically you should expect to achieve at best
+9600 baud with a Z8C530 and 64Kbits with a Z85230.
+
+The DMA mode supports the chip when it is configured to use dual DMA
+channels on an ISA bus. The better cards tend to support this mode of
+operation for a single channel. With DMA running the Z85230 tops out
+when it starts to hit ISA DMA constraints at about 512Kbits. It is worth
+noting here that many PC machines hang or crash when the chip is driven
+fast enough to hold the ISA bus solid.
+
+Transmit DMA mode uses a single DMA channel. The DMA channel is used for
+transmission as the transmit FIFO is smaller than the receive FIFO. it
+gives better performance than pure PIO mode but is nowhere near as ideal
+as pure DMA mode.
+
+Using the Z85230 driver
+=======================
+
+The Z85230 driver provides the back end interface to your board. To
+configure a Z8530 interface you need to detect the board and to identify
+its ports and interrupt resources. It is also your problem to verify the
+resources are available.
+
+Having identified the chip you need to fill in a struct z8530_dev,
+which describes each chip. This object must exist until you finally
+shutdown the board. Firstly zero the active field. This ensures nothing
+goes off without you intending it. The irq field should be set to the
+interrupt number of the chip. (Each chip has a single interrupt source
+rather than each channel). You are responsible for allocating the
+interrupt line. The interrupt handler should be set to
+:c:func:`z8530_interrupt()`. The device id should be set to the
+z8530_dev structure pointer. Whether the interrupt can be shared or not
+is board dependent, and up to you to initialise.
+
+The structure holds two channel structures. Initialise chanA.ctrlio and
+chanA.dataio with the address of the control and data ports. You can or
+this with Z8530_PORT_SLEEP to indicate your interface needs the 5uS
+delay for chip settling done in software. The PORT_SLEEP option is
+architecture specific. Other flags may become available on future
+platforms, eg for MMIO. Initialise the chanA.irqs to &z8530_nop to
+start the chip up as disabled and discarding interrupt events. This
+ensures that stray interrupts will be mopped up and not hang the bus.
+Set chanA.dev to point to the device structure itself. The private and
+name field you may use as you wish. The private field is unused by the
+Z85230 layer. The name is used for error reporting and it may thus make
+sense to make it match the network name.
+
+Repeat the same operation with the B channel if your chip has both
+channels wired to something useful. This isn't always the case. If it is
+not wired then the I/O values do not matter, but you must initialise
+chanB.dev.
+
+If your board has DMA facilities then initialise the txdma and rxdma
+fields for the relevant channels. You must also allocate the ISA DMA
+channels and do any necessary board level initialisation to configure
+them. The low level driver will do the Z8530 and DMA controller
+programming but not board specific magic.
+
+Having initialised the device you can then call
+:c:func:`z8530_init()`. This will probe the chip and reset it into
+a known state. An identification sequence is then run to identify the
+chip type. If the checks fail to pass the function returns a non zero
+error code. Typically this indicates that the port given is not valid.
+After this call the type field of the z8530_dev structure is
+initialised to either Z8530, Z85C30 or Z85230 according to the chip
+found.
+
+Once you have called z8530_init you can also make use of the utility
+function :c:func:`z8530_describe()`. This provides a consistent
+reporting format for the Z8530 devices, and allows all the drivers to
+provide consistent reporting.
+
+Attaching Network Interfaces
+============================
+
+If you wish to use the network interface facilities of the driver, then
+you need to attach a network device to each channel that is present and
+in use. In addition to use the generic HDLC you need to follow some
+additional plumbing rules. They may seem complex but a look at the
+example hostess_sv11 driver should reassure you.
+
+The network device used for each channel should be pointed to by the
+netdevice field of each channel. The hdlc-> priv field of the network
+device points to your private data - you will need to be able to find
+your private data from this.
+
+The way most drivers approach this particular problem is to create a
+structure holding the Z8530 device definition and put that into the
+private field of the network device. The network device fields of the
+channels then point back to the network devices.
+
+If you wish to use the generic HDLC then you need to register the HDLC
+device.
+
+Before you register your network device you will also need to provide
+suitable handlers for most of the network device callbacks. See the
+network device documentation for more details on this.
+
+Configuring And Activating The Port
+===================================
+
+The Z85230 driver provides helper functions and tables to load the port
+registers on the Z8530 chips. When programming the register settings for
+a channel be aware that the documentation recommends initialisation
+orders. Strange things happen when these are not followed.
+
+:c:func:`z8530_channel_load()` takes an array of pairs of
+initialisation values in an array of u8 type. The first value is the
+Z8530 register number. Add 16 to indicate the alternate register bank on
+the later chips. The array is terminated by a 255.
+
+The driver provides a pair of public tables. The z8530_hdlc_kilostream
+table is for the UK 'Kilostream' service and also happens to cover most
+other end host configurations. The z8530_hdlc_kilostream_85230 table
+is the same configuration using the enhancements of the 85230 chip. The
+configuration loaded is standard NRZ encoded synchronous data with HDLC
+bitstuffing. All of the timing is taken from the other end of the link.
+
+When writing your own tables be aware that the driver internally tracks
+register values. It may need to reload values. You should therefore be
+sure to set registers 1-7, 9-11, 14 and 15 in all configurations. Where
+the register settings depend on DMA selection the driver will update the
+bits itself when you open or close. Loading a new table with the
+interface open is not recommended.
+
+There are three standard configurations supported by the core code. In
+PIO mode the interface is programmed up to use interrupt driven PIO.
+This places high demands on the host processor to avoid latency. The
+driver is written to take account of latency issues but it cannot avoid
+latencies caused by other drivers, notably IDE in PIO mode. Because the
+drivers allocate buffers you must also prevent MTU changes while the
+port is open.
+
+Once the port is open it will call the rx_function of each channel
+whenever a completed packet arrived. This is invoked from interrupt
+context and passes you the channel and a network buffer (struct
+sk_buff) holding the data. The data includes the CRC bytes so most
+users will want to trim the last two bytes before processing the data.
+This function is very timing critical. When you wish to simply discard
+data the support code provides the function
+:c:func:`z8530_null_rx()` to discard the data.
+
+To active PIO mode sending and receiving the ``z8530_sync_open`` is called.
+This expects to be passed the network device and the channel. Typically
+this is called from your network device open callback. On a failure a
+non zero error status is returned.
+The :c:func:`z8530_sync_close()` function shuts down a PIO
+channel. This must be done before the channel is opened again and before
+the driver shuts down and unloads.
+
+The ideal mode of operation is dual channel DMA mode. Here the kernel
+driver will configure the board for DMA in both directions. The driver
+also handles ISA DMA issues such as controller programming and the
+memory range limit for you. This mode is activated by calling the
+:c:func:`z8530_sync_dma_open()` function. On failure a non zero
+error value is returned. Once this mode is activated it can be shut down
+by calling the :c:func:`z8530_sync_dma_close()`. You must call
+the close function matching the open mode you used.
+
+The final supported mode uses a single DMA channel to drive the transmit
+side. As the Z85C30 has a larger FIFO on the receive channel this tends
+to increase the maximum speed a little. This is activated by calling the
+``z8530_sync_txdma_open``. This returns a non zero error code on failure. The
+:c:func:`z8530_sync_txdma_close()` function closes down the Z8530
+interface from this mode.
+
+Network Layer Functions
+=======================
+
+The Z8530 layer provides functions to queue packets for transmission.
+The driver internally buffers the frame currently being transmitted and
+one further frame (in order to keep back to back transmission running).
+Any further buffering is up to the caller.
+
+The function :c:func:`z8530_queue_xmit()` takes a network buffer
+in sk_buff format and queues it for transmission. The caller must
+provide the entire packet with the exception of the bitstuffing and CRC.
+This is normally done by the caller via the generic HDLC interface
+layer. It returns 0 if the buffer has been queued and non zero values
+for queue full. If the function accepts the buffer it becomes property
+of the Z8530 layer and the caller should not free it.
+
+The function :c:func:`z8530_get_stats()` returns a pointer to an
+internally maintained per interface statistics block. This provides most
+of the interface code needed to implement the network layer get_stats
+callback.
+
+Porting The Z8530 Driver
+========================
+
+The Z8530 driver is written to be portable. In DMA mode it makes
+assumptions about the use of ISA DMA. These are probably warranted in
+most cases as the Z85230 in particular was designed to glue to PC type
+machines. The PIO mode makes no real assumptions.
+
+Should you need to retarget the Z8530 driver to another architecture the
+only code that should need changing are the port I/O functions. At the
+moment these assume PC I/O port accesses. This may not be appropriate
+for all platforms. Replacing :c:func:`z8530_read_port()` and
+``z8530_write_port`` is intended to be all that is required to port
+this driver layer.
+
+Known Bugs And Assumptions
+==========================
+
+Interrupt Locking
+    The locking in the driver is done via the global cli/sti lock. This
+    makes for relatively poor SMP performance. Switching this to use a
+    per device spin lock would probably materially improve performance.
+
+Occasional Failures
+    We have reports of occasional failures when run for very long
+    periods of time and the driver starts to receive junk frames. At the
+    moment the cause of this is not clear.
+
+Public Functions Provided
+=========================
+
+.. kernel-doc:: drivers/net/wan/z85230.c
+   :export:
+
+Internal Functions
+==================
+
+.. kernel-doc:: drivers/net/wan/z85230.c
+   :internal:
diff --git a/Documentation/networking/z8530drv.txt b/Documentation/networking/z8530drv.txt
new file mode 100644
index 000000000..2206abbc3
--- /dev/null
+++ b/Documentation/networking/z8530drv.txt
@@ -0,0 +1,657 @@
+This is a subset of the documentation. To use this driver you MUST have the
+full package from:
+
+Internet:
+=========
+
+1. ftp://ftp.ccac.rwth-aachen.de/pub/jr/z8530drv-utils_3.0-3.tar.gz
+
+2. ftp://ftp.pspt.fi/pub/ham/linux/ax25/z8530drv-utils_3.0-3.tar.gz
+
+Please note that the information in this document may be hopelessly outdated.
+A new version of the documentation, along with links to other important
+Linux Kernel AX.25 documentation and programs, is available on
+http://yaina.de/jreuter
+
+-----------------------------------------------------------------------------
+
+
+	 SCC.C - Linux driver for Z8530 based HDLC cards for AX.25      
+
+   ********************************************************************
+
+        (c) 1993,2000 by Joerg Reuter DL1BKE <jreuter@yaina.de>
+
+        portions (c) 1993 Guido ten Dolle PE1NNZ
+
+        for the complete copyright notice see >> Copying.Z8530DRV <<
+
+   ******************************************************************** 
+
+
+1. Initialization of the driver
+===============================
+
+To use the driver, 3 steps must be performed:
+
+     1. if compiled as module: loading the module
+     2. Setup of hardware, MODEM and KISS parameters with sccinit
+     3. Attach each channel to the Linux kernel AX.25 with "ifconfig"
+
+Unlike the versions below 2.4 this driver is a real network device
+driver. If you want to run xNOS instead of our fine kernel AX.25
+use a 2.x version (available from above sites) or read the
+AX.25-HOWTO on how to emulate a KISS TNC on network device drivers.
+
+
+1.1 Loading the module
+======================
+
+(If you're going to compile the driver as a part of the kernel image,
+ skip this chapter and continue with 1.2)
+
+Before you can use a module, you'll have to load it with
+
+	insmod scc.o
+
+please read 'man insmod' that comes with module-init-tools.
+
+You should include the insmod in one of the /etc/rc.d/rc.* files,
+and don't forget to insert a call of sccinit after that. It
+will read your /etc/z8530drv.conf.
+
+1.2. /etc/z8530drv.conf
+=======================
+
+To setup all parameters you must run /sbin/sccinit from one
+of your rc.*-files. This has to be done BEFORE you can
+"ifconfig" an interface. Sccinit reads the file /etc/z8530drv.conf
+and sets the hardware, MODEM and KISS parameters. A sample file is
+delivered with this package. Change it to your needs.
+
+The file itself consists of two main sections.
+
+1.2.1 configuration of hardware parameters
+==========================================
+
+The hardware setup section defines the following parameters for each
+Z8530:
+
+chip    1
+data_a  0x300                   # data port A
+ctrl_a  0x304                   # control port A
+data_b  0x301                   # data port B
+ctrl_b  0x305                   # control port B
+irq     5                       # IRQ No. 5
+pclock  4915200                 # clock
+board   BAYCOM                  # hardware type
+escc    no                      # enhanced SCC chip? (8580/85180/85280)
+vector  0                       # latch for interrupt vector
+special no                      # address of special function register
+option  0                       # option to set via sfr
+
+
+chip	- this is just a delimiter to make sccinit a bit simpler to
+	  program. A parameter has no effect.
+
+data_a  - the address of the data port A of this Z8530 (needed)
+ctrl_a  - the address of the control port A (needed)
+data_b  - the address of the data port B (needed)
+ctrl_b  - the address of the control port B (needed)
+
+irq     - the used IRQ for this chip. Different chips can use different
+          IRQs or the same. If they share an interrupt, it needs to be
+	  specified within one chip-definition only.
+
+pclock  - the clock at the PCLK pin of the Z8530 (option, 4915200 is
+          default), measured in Hertz
+
+board   - the "type" of the board:
+
+	   SCC type                 value
+	   ---------------------------------
+	   PA0HZP SCC card          PA0HZP
+	   EAGLE card               EAGLE
+	   PC100 card               PC100
+	   PRIMUS-PC (DG9BL) card   PRIMUS
+	   BayCom (U)SCC card       BAYCOM
+
+escc    - if you want support for ESCC chips (8580, 85180, 85280), set
+          this to "yes" (option, defaults to "no")
+
+vector  - address of the vector latch (aka "intack port") for PA0HZP
+          cards. There can be only one vector latch for all chips!
+	  (option, defaults to 0)
+
+special - address of the special function register on several cards.
+          (option, defaults to 0)
+
+option  - The value you write into that register (option, default is 0)
+
+You can specify up to four chips (8 channels). If this is not enough,
+just change
+
+	#define MAXSCC 4
+
+to a higher value.
+
+Example for the BAYCOM USCC:
+----------------------------
+
+chip    1
+data_a  0x300                   # data port A
+ctrl_a  0x304                   # control port A
+data_b  0x301                   # data port B
+ctrl_b  0x305                   # control port B
+irq     5                       # IRQ No. 5 (#)
+board   BAYCOM                  # hardware type (*)
+#
+# SCC chip 2
+#
+chip    2
+data_a  0x302
+ctrl_a  0x306
+data_b  0x303
+ctrl_b  0x307
+board   BAYCOM
+
+An example for a PA0HZP card:
+-----------------------------
+
+chip 1
+data_a 0x153
+data_b 0x151
+ctrl_a 0x152
+ctrl_b 0x150
+irq 9
+pclock 4915200
+board PA0HZP
+vector 0x168
+escc no
+#
+#
+#
+chip 2
+data_a 0x157
+data_b 0x155
+ctrl_a 0x156
+ctrl_b 0x154
+irq 9
+pclock 4915200
+board PA0HZP
+vector 0x168
+escc no
+
+A DRSI would should probably work with this:
+--------------------------------------------
+(actually: two DRSI cards...)
+
+chip 1
+data_a 0x303
+data_b 0x301
+ctrl_a 0x302
+ctrl_b 0x300
+irq 7
+pclock 4915200
+board DRSI
+escc no
+#
+#
+#
+chip 2
+data_a 0x313
+data_b 0x311
+ctrl_a 0x312
+ctrl_b 0x310
+irq 7
+pclock 4915200
+board DRSI
+escc no
+
+Note that you cannot use the on-board baudrate generator off DRSI
+cards. Use "mode dpll" for clock source (see below).
+
+This is based on information provided by Mike Bilow (and verified
+by Paul Helay)
+
+The utility "gencfg"
+--------------------
+
+If you only know the parameters for the PE1CHL driver for DOS,
+run gencfg. It will generate the correct port addresses (I hope).
+Its parameters are exactly the same as the ones you use with
+the "attach scc" command in net, except that the string "init" must 
+not appear. Example:
+
+gencfg 2 0x150 4 2 0 1 0x168 9 4915200 
+
+will print a skeleton z8530drv.conf for the OptoSCC to stdout.
+
+gencfg 2 0x300 2 4 5 -4 0 7 4915200 0x10
+
+does the same for the BAYCOM USCC card. In my opinion it is much easier
+to edit scc_config.h... 
+
+
+1.2.2 channel configuration
+===========================
+
+The channel definition is divided into three sub sections for each
+channel:
+
+An example for scc0:
+
+# DEVICE
+
+device scc0	# the device for the following params
+
+# MODEM / BUFFERS
+
+speed 1200		# the default baudrate
+clock dpll		# clock source: 
+			# 	dpll     = normal half duplex operation
+			# 	external = MODEM provides own Rx/Tx clock
+			#	divider  = use full duplex divider if
+			#		   installed (1)
+mode nrzi		# HDLC encoding mode
+			#	nrzi = 1k2 MODEM, G3RUH 9k6 MODEM
+			#	nrz  = DF9IC 9k6 MODEM
+			#
+bufsize	384		# size of buffers. Note that this must include
+			# the AX.25 header, not only the data field!
+			# (optional, defaults to 384)
+
+# KISS (Layer 1)
+
+txdelay 36              # (see chapter 1.4)
+persist 64
+slot    8
+tail    8
+fulldup 0
+wait    12
+min     3
+maxkey  7
+idle    3
+maxdef  120
+group   0
+txoff   off
+softdcd on                   
+slip    off
+
+The order WITHIN these sections is unimportant. The order OF these
+sections IS important. The MODEM parameters are set with the first
+recognized KISS parameter...
+
+Please note that you can initialize the board only once after boot
+(or insmod). You can change all parameters but "mode" and "clock" 
+later with the Sccparam program or through KISS. Just to avoid 
+security holes... 
+
+(1) this divider is usually mounted on the SCC-PBC (PA0HZP) or not
+    present at all (BayCom). It feeds back the output of the DPLL 
+    (digital pll) as transmit clock. Using this mode without a divider 
+    installed will normally result in keying the transceiver until 
+    maxkey expires --- of course without sending anything (useful).
+
+2. Attachment of a channel by your AX.25 software
+=================================================
+
+2.1 Kernel AX.25
+================
+
+To set up an AX.25 device you can simply type:
+
+	ifconfig scc0 44.128.1.1 hw ax25 dl0tha-7
+
+This will create a network interface with the IP number 44.128.20.107 
+and the callsign "dl0tha". If you do not have any IP number (yet) you 
+can use any of the 44.128.0.0 network. Note that you do not need 
+axattach. The purpose of axattach (like slattach) is to create a KISS 
+network device linked to a TTY. Please read the documentation of the 
+ax25-utils and the AX.25-HOWTO to learn how to set the parameters of
+the kernel AX.25.
+
+2.2 NOS, NET and TFKISS
+=======================
+
+Since the TTY driver (aka KISS TNC emulation) is gone you need
+to emulate the old behaviour. The cost of using these programs is
+that you probably need to compile the kernel AX.25, regardless of whether
+you actually use it or not. First setup your /etc/ax25/axports,
+for example:
+
+	9k6	dl0tha-9  9600  255 4 9600 baud port (scc3)
+	axlink	dl0tha-15 38400 255 4 Link to NOS
+
+Now "ifconfig" the scc device:
+
+	ifconfig scc3 44.128.1.1 hw ax25 dl0tha-9
+
+You can now axattach a pseudo-TTY:
+
+	axattach /dev/ptys0 axlink
+
+and start your NOS and attach /dev/ptys0 there. The problem is that
+NOS is reachable only via digipeating through the kernel AX.25
+(disastrous on a DAMA controlled channel). To solve this problem,
+configure "rxecho" to echo the incoming frames from "9k6" to "axlink"
+and outgoing frames from "axlink" to "9k6" and start:
+
+	rxecho
+
+Or simply use "kissbridge" coming with z8530drv-utils:
+
+	ifconfig scc3 hw ax25 dl0tha-9
+	kissbridge scc3 /dev/ptys0
+
+
+3. Adjustment and Display of parameters
+=======================================
+
+3.1 Displaying SCC Parameters:
+==============================
+
+Once a SCC channel has been attached, the parameter settings and 
+some statistic information can be shown using the param program:
+
+dl1bke-u:~$ sccstat scc0
+
+Parameters:
+
+speed       : 1200 baud
+txdelay     : 36
+persist     : 255
+slottime    : 0
+txtail      : 8
+fulldup     : 1
+waittime    : 12
+mintime     : 3 sec
+maxkeyup    : 7 sec
+idletime    : 3 sec
+maxdefer    : 120 sec
+group       : 0x00
+txoff       : off
+softdcd     : on
+SLIP        : off
+
+Status:
+
+HDLC                  Z8530           Interrupts         Buffers
+-----------------------------------------------------------------------
+Sent       :     273  RxOver :     0  RxInts :   125074  Size    :  384
+Received   :    1095  TxUnder:     0  TxInts :     4684  NoSpace :    0
+RxErrors   :    1591                  ExInts :    11776
+TxErrors   :       0                  SpInts :     1503
+Tx State   :    idle
+
+
+The status info shown is:
+
+Sent		- number of frames transmitted
+Received	- number of frames received
+RxErrors	- number of receive errors (CRC, ABORT)
+TxErrors	- number of discarded Tx frames (due to various reasons) 
+Tx State	- status of the Tx interrupt handler: idle/busy/active/tail (2)
+RxOver		- number of receiver overruns
+TxUnder		- number of transmitter underruns
+RxInts		- number of receiver interrupts
+TxInts		- number of transmitter interrupts
+EpInts		- number of receiver special condition interrupts
+SpInts		- number of external/status interrupts
+Size		- maximum size of an AX.25 frame (*with* AX.25 headers!)
+NoSpace		- number of times a buffer could not get allocated
+
+An overrun is abnormal. If lots of these occur, the product of
+baudrate and number of interfaces is too high for the processing
+power of your computer. NoSpace errors are unlikely to be caused by the
+driver or the kernel AX.25.
+
+
+3.2 Setting Parameters
+======================
+
+
+The setting of parameters of the emulated KISS TNC is done in the 
+same way in the SCC driver. You can change parameters by using
+the kissparms program from the ax25-utils package or use the program 
+"sccparam":
+
+     sccparam <device> <paramname> <decimal-|hexadecimal value>
+
+You can change the following parameters:
+
+param	    : value
+------------------------
+speed       : 1200
+txdelay     : 36
+persist     : 255
+slottime    : 0
+txtail      : 8
+fulldup     : 1
+waittime    : 12
+mintime     : 3
+maxkeyup    : 7
+idletime    : 3
+maxdefer    : 120
+group       : 0x00
+txoff       : off
+softdcd     : on
+SLIP        : off
+
+
+The parameters have the following meaning:
+
+speed:
+     The baudrate on this channel in bits/sec
+
+     Example: sccparam /dev/scc3 speed 9600
+
+txdelay:
+     The delay (in units of 10 ms) after keying of the 
+     transmitter, until the first byte is sent. This is usually 
+     called "TXDELAY" in a TNC.  When 0 is specified, the driver 
+     will just wait until the CTS signal is asserted. This 
+     assumes the presence of a timer or other circuitry in the 
+     MODEM and/or transmitter, that asserts CTS when the 
+     transmitter is ready for data.
+     A normal value of this parameter is 30-36.
+
+     Example: sccparam /dev/scc0 txd 20
+
+persist:
+     This is the probability that the transmitter will be keyed 
+     when the channel is found to be free.  It is a value from 0 
+     to 255, and the probability is (value+1)/256.  The value 
+     should be somewhere near 50-60, and should be lowered when 
+     the channel is used more heavily.
+
+     Example: sccparam /dev/scc2 persist 20
+
+slottime:
+     This is the time between samples of the channel. It is 
+     expressed in units of 10 ms.  About 200-300 ms (value 20-30) 
+     seems to be a good value.
+
+     Example: sccparam /dev/scc0 slot 20
+
+tail:
+     The time the transmitter will remain keyed after the last 
+     byte of a packet has been transferred to the SCC. This is 
+     necessary because the CRC and a flag still have to leave the 
+     SCC before the transmitter is keyed down. The value depends 
+     on the baudrate selected.  A few character times should be 
+     sufficient, e.g. 40ms at 1200 baud. (value 4)
+     The value of this parameter is in 10 ms units.
+
+     Example: sccparam /dev/scc2 4
+
+full:
+     The full-duplex mode switch. This can be one of the following 
+     values:
+
+     0:   The interface will operate in CSMA mode (the normal 
+          half-duplex packet radio operation)
+     1:   Fullduplex mode, i.e. the transmitter will be keyed at 
+          any time, without checking the received carrier.  It 
+          will be unkeyed when there are no packets to be sent.
+     2:   Like 1, but the transmitter will remain keyed, also 
+          when there are no packets to be sent.  Flags will be 
+          sent in that case, until a timeout (parameter 10) 
+          occurs.
+
+     Example: sccparam /dev/scc0 fulldup off
+
+wait:
+     The initial waittime before any transmit attempt, after the 
+     frame has been queue for transmit.  This is the length of 
+     the first slot in CSMA mode.  In full duplex modes it is
+     set to 0 for maximum performance.
+     The value of this parameter is in 10 ms units. 
+
+     Example: sccparam /dev/scc1 wait 4
+
+maxkey:
+     The maximal time the transmitter will be keyed to send 
+     packets, in seconds.  This can be useful on busy CSMA 
+     channels, to avoid "getting a bad reputation" when you are 
+     generating a lot of traffic.  After the specified time has 
+     elapsed, no new frame will be started. Instead, the trans-
+     mitter will be switched off for a specified time (parameter 
+     min), and then the selected algorithm for keyup will be 
+     started again.
+     The value 0 as well as "off" will disable this feature, 
+     and allow infinite transmission time. 
+
+     Example: sccparam /dev/scc0 maxk 20
+
+min:
+     This is the time the transmitter will be switched off when 
+     the maximum transmission time is exceeded.
+
+     Example: sccparam /dev/scc3 min 10
+
+idle
+     This parameter specifies the maximum idle time in full duplex 
+     2 mode, in seconds.  When no frames have been sent for this 
+     time, the transmitter will be keyed down.  A value of 0 is
+     has same result as the fullduplex mode 1. This parameter
+     can be disabled.
+
+     Example: sccparam /dev/scc2 idle off	# transmit forever
+
+maxdefer
+     This is the maximum time (in seconds) to wait for a free channel
+     to send. When this timer expires the transmitter will be keyed 
+     IMMEDIATELY. If you love to get trouble with other users you
+     should set this to a very low value ;-)
+
+     Example: sccparam /dev/scc0 maxdefer 240	# 2 minutes
+
+
+txoff:
+     When this parameter has the value 0, the transmission of packets
+     is enable. Otherwise it is disabled.
+
+     Example: sccparam /dev/scc2 txoff on
+
+group:
+     It is possible to build special radio equipment to use more than 
+     one frequency on the same band, e.g. using several receivers and 
+     only one transmitter that can be switched between frequencies.
+     Also, you can connect several radios that are active on the same 
+     band.  In these cases, it is not possible, or not a good idea, to 
+     transmit on more than one frequency.  The SCC driver provides a 
+     method to lock transmitters on different interfaces, using the 
+     "param <interface> group <x>" command.  This will only work when 
+     you are using CSMA mode (parameter full = 0).
+     The number <x> must be 0 if you want no group restrictions, and 
+     can be computed as follows to create restricted groups:
+     <x> is the sum of some OCTAL numbers:
+
+     200  This transmitter will only be keyed when all other 
+          transmitters in the group are off.
+     100  This transmitter will only be keyed when the carrier 
+          detect of all other interfaces in the group is off.
+     0xx  A byte that can be used to define different groups.  
+          Interfaces are in the same group, when the logical AND 
+          between their xx values is nonzero.
+
+     Examples:
+     When 2 interfaces use group 201, their transmitters will never be 
+     keyed at the same time.
+     When 2 interfaces use group 101, the transmitters will only key 
+     when both channels are clear at the same time.  When group 301, 
+     the transmitters will not be keyed at the same time.
+
+     Don't forget to convert the octal numbers into decimal before
+     you set the parameter.
+
+     Example: (to be written)
+
+softdcd:
+     use a software dcd instead of the real one... Useful for a very
+     slow squelch.
+
+     Example: sccparam /dev/scc0 soft on
+
+
+4. Problems 
+===========
+
+If you have tx-problems with your BayCom USCC card please check
+the manufacturer of the 8530. SGS chips have a slightly
+different timing. Try Zilog...  A solution is to write to register 8 
+instead to the data port, but this won't work with the ESCC chips. 
+*SIGH!*
+
+A very common problem is that the PTT locks until the maxkeyup timer
+expires, although interrupts and clock source are correct. In most
+cases compiling the driver with CONFIG_SCC_DELAY (set with
+make config) solves the problems. For more hints read the (pseudo) FAQ 
+and the documentation coming with z8530drv-utils.
+
+I got reports that the driver has problems on some 386-based systems.
+(i.e. Amstrad) Those systems have a bogus AT bus timing which will
+lead to delayed answers on interrupts. You can recognize these
+problems by looking at the output of Sccstat for the suspected
+port. If it shows under- and overruns you own such a system.
+
+Delayed processing of received data: This depends on
+
+- the kernel version
+
+- kernel profiling compiled or not
+
+- a high interrupt load
+
+- a high load of the machine --- running X, Xmorph, XV and Povray,
+  while compiling the kernel... hmm ... even with 32 MB RAM ...  ;-)
+  Or running a named for the whole .ampr.org domain on an 8 MB
+  box...
+
+- using information from rxecho or kissbridge.
+
+Kernel panics: please read /linux/README and find out if it
+really occurred within the scc driver.
+
+If you cannot solve a problem, send me
+
+- a description of the problem,
+- information on your hardware (computer system, scc board, modem)
+- your kernel version
+- the output of cat /proc/net/z8530
+
+4. Thor RLC100
+==============
+
+Mysteriously this board seems not to work with the driver. Anyone
+got it up-and-running?
+
+
+Many thanks to Linus Torvalds and Alan Cox for including the driver
+in the Linux standard distribution and their support.
+
+Joerg Reuter	ampr-net: dl1bke@db0pra.ampr.org
+		AX-25   : DL1BKE @ DB0ABH.#BAY.DEU.EU
+		Internet: jreuter@yaina.de
+		WWW     : http://yaina.de/jreuter