diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-06 01:02:30 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-06 01:02:30 +0000 |
commit | 76cb841cb886eef6b3bee341a2266c76578724ad (patch) | |
tree | f5892e5ba6cc11949952a6ce4ecbe6d516d6ce58 /Documentation/blockdev | |
parent | Initial commit. (diff) | |
download | linux-upstream.tar.xz linux-upstream.zip |
Adding upstream version 4.19.249.upstream/4.19.249upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r-- | Documentation/blockdev/00-INDEX | 18 | ||||
-rw-r--r-- | Documentation/blockdev/README.DAC960 | 756 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/DRBD-8.3-data-packets.svg | 588 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/DRBD-data-packets.svg | 459 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/README.txt | 16 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/conn-states-8.dot | 18 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/data-structure-v9.txt | 38 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/disk-states-8.dot | 16 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/drbd-connection-state-overview.dot | 85 | ||||
-rw-r--r-- | Documentation/blockdev/drbd/node-states-8.dot | 14 | ||||
-rw-r--r-- | Documentation/blockdev/floppy.txt | 245 | ||||
-rw-r--r-- | Documentation/blockdev/nbd.txt | 31 | ||||
-rw-r--r-- | Documentation/blockdev/paride.txt | 417 | ||||
-rw-r--r-- | Documentation/blockdev/ramdisk.txt | 174 | ||||
-rw-r--r-- | Documentation/blockdev/zram.txt | 271 |
15 files changed, 3146 insertions, 0 deletions
diff --git a/Documentation/blockdev/00-INDEX b/Documentation/blockdev/00-INDEX new file mode 100644 index 000000000..c08df56dd --- /dev/null +++ b/Documentation/blockdev/00-INDEX @@ -0,0 +1,18 @@ +00-INDEX + - this file +README.DAC960 + - info on Mylex DAC960/DAC1100 PCI RAID Controller Driver for Linux. +cciss.txt + - info, major/minor #'s for Compaq's SMART Array Controllers. +cpqarray.txt + - info on using Compaq's SMART2 Intelligent Disk Array Controllers. +floppy.txt + - notes and driver options for the floppy disk driver. +mflash.txt + - info on mGine m(g)flash driver for linux. +nbd.txt + - info on a TCP implementation of a network block device. +paride.txt + - information about the parallel port IDE subsystem. +ramdisk.txt + - short guide on how to set up and use the RAM disk. diff --git a/Documentation/blockdev/README.DAC960 b/Documentation/blockdev/README.DAC960 new file mode 100644 index 000000000..bd85fb9dc --- /dev/null +++ b/Documentation/blockdev/README.DAC960 @@ -0,0 +1,756 @@ + Linux Driver for Mylex DAC960/AcceleRAID/eXtremeRAID PCI RAID Controllers + + Version 2.2.11 for Linux 2.2.19 + Version 2.4.11 for Linux 2.4.12 + + PRODUCTION RELEASE + + 11 October 2001 + + Leonard N. Zubkoff + Dandelion Digital + lnz@dandelion.com + + Copyright 1998-2001 by Leonard N. Zubkoff <lnz@dandelion.com> + + + INTRODUCTION + +Mylex, Inc. designs and manufactures a variety of high performance PCI RAID +controllers. Mylex Corporation is located at 34551 Ardenwood Blvd., Fremont, +California 94555, USA and can be reached at 510.796.6100 or on the World Wide +Web at http://www.mylex.com. Mylex Technical Support can be reached by +electronic mail at mylexsup@us.ibm.com, by voice at 510.608.2400, or by FAX at +510.745.7715. Contact information for offices in Europe and Japan is available +on their Web site. + +The latest information on Linux support for DAC960 PCI RAID Controllers, as +well as the most recent release of this driver, will always be available from +my Linux Home Page at URL "http://www.dandelion.com/Linux/". The Linux DAC960 +driver supports all current Mylex PCI RAID controllers including the new +eXtremeRAID 2000/3000 and AcceleRAID 352/170/160 models which have an entirely +new firmware interface from the older eXtremeRAID 1100, AcceleRAID 150/200/250, +and DAC960PJ/PG/PU/PD/PL. See below for a complete controller list as well as +minimum firmware version requirements. For simplicity, in most places this +documentation refers to DAC960 generically rather than explicitly listing all +the supported models. + +Driver bug reports should be sent via electronic mail to "lnz@dandelion.com". +Please include with the bug report the complete configuration messages reported +by the driver at startup, along with any subsequent system messages relevant to +the controller's operation, and a detailed description of your system's +hardware configuration. Driver bugs are actually quite rare; if you encounter +problems with disks being marked offline, for example, please contact Mylex +Technical Support as the problem is related to the hardware configuration +rather than the Linux driver. + +Please consult the RAID controller documentation for detailed information +regarding installation and configuration of the controllers. This document +primarily provides information specific to the Linux support. + + + DRIVER FEATURES + +The DAC960 RAID controllers are supported solely as high performance RAID +controllers, not as interfaces to arbitrary SCSI devices. The Linux DAC960 +driver operates at the block device level, the same level as the SCSI and IDE +drivers. Unlike other RAID controllers currently supported on Linux, the +DAC960 driver is not dependent on the SCSI subsystem, and hence avoids all the +complexity and unnecessary code that would be associated with an implementation +as a SCSI driver. The DAC960 driver is designed for as high a performance as +possible with no compromises or extra code for compatibility with lower +performance devices. The DAC960 driver includes extensive error logging and +online configuration management capabilities. Except for initial configuration +of the controller and adding new disk drives, most everything can be handled +from Linux while the system is operational. + +The DAC960 driver is architected to support up to 8 controllers per system. +Each DAC960 parallel SCSI controller can support up to 15 disk drives per +channel, for a maximum of 60 drives on a four channel controller; the fibre +channel eXtremeRAID 3000 controller supports up to 125 disk drives per loop for +a total of 250 drives. The drives installed on a controller are divided into +one or more "Drive Groups", and then each Drive Group is subdivided further +into 1 to 32 "Logical Drives". Each Logical Drive has a specific RAID Level +and caching policy associated with it, and it appears to Linux as a single +block device. Logical Drives are further subdivided into up to 7 partitions +through the normal Linux and PC disk partitioning schemes. Logical Drives are +also known as "System Drives", and Drive Groups are also called "Packs". Both +terms are in use in the Mylex documentation; I have chosen to standardize on +the more generic "Logical Drive" and "Drive Group". + +DAC960 RAID disk devices are named in the style of the obsolete Device File +System (DEVFS). The device corresponding to Logical Drive D on Controller C +is referred to as /dev/rd/cCdD, and the partitions are called /dev/rd/cCdDp1 +through /dev/rd/cCdDp7. For example, partition 3 of Logical Drive 5 on +Controller 2 is referred to as /dev/rd/c2d5p3. Note that unlike with SCSI +disks the device names will not change in the event of a disk drive failure. +The DAC960 driver is assigned major numbers 48 - 55 with one major number per +controller. The 8 bits of minor number are divided into 5 bits for the Logical +Drive and 3 bits for the partition. + + + SUPPORTED DAC960/AcceleRAID/eXtremeRAID PCI RAID CONTROLLERS + +The following list comprises the supported DAC960, AcceleRAID, and eXtremeRAID +PCI RAID Controllers as of the date of this document. It is recommended that +anyone purchasing a Mylex PCI RAID Controller not in the following table +contact the author beforehand to verify that it is or will be supported. + +eXtremeRAID 3000 + 1 Wide Ultra-2/LVD SCSI channel + 2 External Fibre FC-AL channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +eXtremeRAID 2000 + 4 Wide Ultra-160 LVD SCSI channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +AcceleRAID 352 + 2 Wide Ultra-160 LVD SCSI channels + 100MHz Intel i960RN RISC Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 32MB/64MB ECC SDRAM Memory + +AcceleRAID 170 + 1 Wide Ultra-160 LVD SCSI channel + 100MHz Intel i960RM RISC Processor + 16MB/32MB/64MB ECC SDRAM Memory + +AcceleRAID 160 (AcceleRAID 170LP) + 1 Wide Ultra-160 LVD SCSI channel + 100MHz Intel i960RS RISC Processor + Built in 16M ECC SDRAM Memory + PCI Low Profile Form Factor - fit for 2U height + +eXtremeRAID 1100 (DAC1164P) + 3 Wide Ultra-2/LVD SCSI channels + 233MHz StrongARM SA 110 Processor + 64 Bit 33MHz PCI (backward compatible with 32 Bit PCI slots) + 16MB/32MB/64MB Parity SDRAM Memory with Battery Backup + +AcceleRAID 250 (DAC960PTL1) + Uses onboard Symbios SCSI chips on certain motherboards + Also includes one onboard Wide Ultra-2/LVD SCSI Channel + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +AcceleRAID 200 (DAC960PTL0) + Uses onboard Symbios SCSI chips on certain motherboards + Includes no onboard SCSI Channels + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +AcceleRAID 150 (DAC960PRL) + Uses onboard Symbios SCSI chips on certain motherboards + Also includes one onboard Wide Ultra-2/LVD SCSI Channel + 33MHz Intel i960RP RISC Processor + 4MB Parity EDO Memory + +DAC960PJ 1/2/3 Wide Ultra SCSI-3 Channels + 66MHz Intel i960RD RISC Processor + 4MB/8MB/16MB/32MB/64MB/128MB ECC EDO Memory + +DAC960PG 1/2/3 Wide Ultra SCSI-3 Channels + 33MHz Intel i960RP RISC Processor + 4MB/8MB ECC EDO Memory + +DAC960PU 1/2/3 Wide Ultra SCSI-3 Channels + Intel i960CF RISC Processor + 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960PD 1/2/3 Wide Fast SCSI-2 Channels + Intel i960CF RISC Processor + 4MB/8MB EDRAM or 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960PL 1/2/3 Wide Fast SCSI-2 Channels + Intel i960 RISC Processor + 2MB/4MB/8MB/16MB/32MB DRAM Memory + +DAC960P 1/2/3 Wide Fast SCSI-2 Channels + Intel i960 RISC Processor + 2MB/4MB/8MB/16MB/32MB DRAM Memory + +For the eXtremeRAID 2000/3000 and AcceleRAID 352/170/160, firmware version +6.00-01 or above is required. + +For the eXtremeRAID 1100, firmware version 5.06-0-52 or above is required. + +For the AcceleRAID 250, 200, and 150, firmware version 4.06-0-57 or above is +required. + +For the DAC960PJ and DAC960PG, firmware version 4.06-0-00 or above is required. + +For the DAC960PU, DAC960PD, DAC960PL, and DAC960P, either firmware version +3.51-0-04 or above is required (for dual Flash ROM controllers), or firmware +version 2.73-0-00 or above is required (for single Flash ROM controllers) + +Please note that not all SCSI disk drives are suitable for use with DAC960 +controllers, and only particular firmware versions of any given model may +actually function correctly. Similarly, not all motherboards have a BIOS that +properly initializes the AcceleRAID 250, AcceleRAID 200, AcceleRAID 150, +DAC960PJ, and DAC960PG because the Intel i960RD/RP is a multi-function device. +If in doubt, contact Mylex RAID Technical Support (mylexsup@us.ibm.com) to +verify compatibility. Mylex makes available a hard disk compatibility list at +http://www.mylex.com/support/hdcomp/hd-lists.html. + + + DRIVER INSTALLATION + +This distribution was prepared for Linux kernel version 2.2.19 or 2.4.12. + +To install the DAC960 RAID driver, you may use the following commands, +replacing "/usr/src" with wherever you keep your Linux kernel source tree: + + cd /usr/src + tar -xvzf DAC960-2.2.11.tar.gz (or DAC960-2.4.11.tar.gz) + mv README.DAC960 linux/Documentation + mv DAC960.[ch] linux/drivers/block + patch -p0 < DAC960.patch (if DAC960.patch is included) + cd linux + make config + make bzImage (or zImage) + +Then install "arch/x86/boot/bzImage" or "arch/x86/boot/zImage" as your +standard kernel, run lilo if appropriate, and reboot. + +To create the necessary devices in /dev, the "make_rd" script included in +"DAC960-Utilities.tar.gz" from http://www.dandelion.com/Linux/ may be used. +LILO 21 and FDISK v2.9 include DAC960 support; also included in this archive +are patches to LILO 20 and FDISK v2.8 that add DAC960 support, along with +statically linked executables of LILO and FDISK. This modified version of LILO +will allow booting from a DAC960 controller and/or mounting the root file +system from a DAC960. + +Red Hat Linux 6.0 and SuSE Linux 6.1 include support for Mylex PCI RAID +controllers. Installing directly onto a DAC960 may be problematic from other +Linux distributions until their installation utilities are updated. + + + INSTALLATION NOTES + +Before installing Linux or adding DAC960 logical drives to an existing Linux +system, the controller must first be configured to provide one or more logical +drives using the BIOS Configuration Utility or DACCF. Please note that since +there are only at most 6 usable partitions on each logical drive, systems +requiring more partitions should subdivide a drive group into multiple logical +drives, each of which can have up to 6 usable partitions. Also, note that with +large disk arrays it is advisable to enable the 8GB BIOS Geometry (255/63) +rather than accepting the default 2GB BIOS Geometry (128/32); failing to so do +will cause the logical drive geometry to have more than 65535 cylinders which +will make it impossible for FDISK to be used properly. The 8GB BIOS Geometry +can be enabled by configuring the DAC960 BIOS, which is accessible via Alt-M +during the BIOS initialization sequence. + +For maximum performance and the most efficient E2FSCK performance, it is +recommended that EXT2 file systems be built with a 4KB block size and 16 block +stride to match the DAC960 controller's 64KB default stripe size. The command +"mke2fs -b 4096 -R stride=16 <device>" is appropriate. Unless there will be a +large number of small files on the file systems, it is also beneficial to add +the "-i 16384" option to increase the bytes per inode parameter thereby +reducing the file system metadata. Finally, on systems that will only be run +with Linux 2.2 or later kernels it is beneficial to enable sparse superblocks +with the "-s 1" option. + + + DAC960 ANNOUNCEMENTS MAILING LIST + +The DAC960 Announcements Mailing List provides a forum for informing Linux +users of new driver releases and other announcements regarding Linux support +for DAC960 PCI RAID Controllers. To join the mailing list, send a message to +"dac960-announce-request@dandelion.com" with the line "subscribe" in the +message body. + + + CONTROLLER CONFIGURATION AND STATUS MONITORING + +The DAC960 RAID controllers running firmware 4.06 or above include a Background +Initialization facility so that system downtime is minimized both for initial +installation and subsequent configuration of additional storage. The BIOS +Configuration Utility (accessible via Alt-R during the BIOS initialization +sequence) is used to quickly configure the controller, and then the logical +drives that have been created are available for immediate use even while they +are still being initialized by the controller. The primary need for online +configuration and status monitoring is then to avoid system downtime when disk +drives fail and must be replaced. Mylex's online monitoring and configuration +utilities are being ported to Linux and will become available at some point in +the future. Note that with a SAF-TE (SCSI Accessed Fault-Tolerant Enclosure) +enclosure, the controller is able to rebuild failed drives automatically as +soon as a drive replacement is made available. + +The primary interfaces for controller configuration and status monitoring are +special files created in the /proc/rd/... hierarchy along with the normal +system console logging mechanism. Whenever the system is operating, the DAC960 +driver queries each controller for status information every 10 seconds, and +checks for additional conditions every 60 seconds. The initial status of each +controller is always available for controller N in /proc/rd/cN/initial_status, +and the current status as of the last status monitoring query is available in +/proc/rd/cN/current_status. In addition, status changes are also logged by the +driver to the system console and will appear in the log files maintained by +syslog. The progress of asynchronous rebuild or consistency check operations +is also available in /proc/rd/cN/current_status, and progress messages are +logged to the system console at most every 60 seconds. + +Starting with the 2.2.3/2.0.3 versions of the driver, the status information +available in /proc/rd/cN/initial_status and /proc/rd/cN/current_status has been +augmented to include the vendor, model, revision, and serial number (if +available) for each physical device found connected to the controller: + +***** DAC960 RAID Driver Version 2.2.3 of 19 August 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PRL PCI RAID Controller + Firmware Version: 4.07-0-07, Channels: 1, Memory Size: 16MB + PCI Bus: 1, Device: 4, Function: 1, I/O Address: Unassigned + PCI Address: 0xFE300000 mapped at 0xA0800000, IRQ Channel: 21 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + SAF-TE Enclosure Management Enabled + Physical Devices: + 0:0 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68016775HA + Disk Status: Online, 17928192 blocks + 0:1 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68004E53HA + Disk Status: Online, 17928192 blocks + 0:2 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 13013935HA + Disk Status: Online, 17928192 blocks + 0:3 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 13016897HA + Disk Status: Online, 17928192 blocks + 0:4 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68019905HA + Disk Status: Online, 17928192 blocks + 0:5 Vendor: IBM Model: DRVS09D Revision: 0270 + Serial Number: 68012753HA + Disk Status: Online, 17928192 blocks + 0:6 Vendor: ESG-SHV Model: SCA HSBP M6 Revision: 0.61 + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 89640960 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +To simplify the monitoring process for custom software, the special file +/proc/rd/status returns "OK" when all DAC960 controllers in the system are +operating normally and no failures have occurred, or "ALERT" if any logical +drives are offline or critical or any non-standby physical drives are dead. + +Configuration commands for controller N are available via the special file +/proc/rd/cN/user_command. A human readable command can be written to this +special file to initiate a configuration operation, and the results of the +operation can then be read back from the special file in addition to being +logged to the system console. The shell command sequence + + echo "<configuration-command>" > /proc/rd/c0/user_command + cat /proc/rd/c0/user_command + +is typically used to execute configuration commands. The configuration +commands are: + + flush-cache + + The "flush-cache" command flushes the controller's cache. The system + automatically flushes the cache at shutdown or if the driver module is + unloaded, so this command is only needed to be certain a write back cache + is flushed to disk before the system is powered off by a command to a UPS. + Note that the flush-cache command also stops an asynchronous rebuild or + consistency check, so it should not be used except when the system is being + halted. + + kill <channel>:<target-id> + + The "kill" command marks the physical drive <channel>:<target-id> as DEAD. + This command is provided primarily for testing, and should not be used + during normal system operation. + + make-online <channel>:<target-id> + + The "make-online" command changes the physical drive <channel>:<target-id> + from status DEAD to status ONLINE. In cases where multiple physical drives + have been killed simultaneously, this command may be used to bring all but + one of them back online, after which a rebuild to the final drive is + necessary. + + Warning: make-online should only be used on a dead physical drive that is + an active part of a drive group, never on a standby drive. The command + should never be used on a dead drive that is part of a critical logical + drive; rebuild should be used if only a single drive is dead. + + make-standby <channel>:<target-id> + + The "make-standby" command changes physical drive <channel>:<target-id> + from status DEAD to status STANDBY. It should only be used in cases where + a dead drive was replaced after an automatic rebuild was performed onto a + standby drive. It cannot be used to add a standby drive to the controller + configuration if one was not created initially; the BIOS Configuration + Utility must be used for that currently. + + rebuild <channel>:<target-id> + + The "rebuild" command initiates an asynchronous rebuild onto physical drive + <channel>:<target-id>. It should only be used when a dead drive has been + replaced. + + check-consistency <logical-drive-number> + + The "check-consistency" command initiates an asynchronous consistency check + of <logical-drive-number> with automatic restoration. It can be used + whenever it is desired to verify the consistency of the redundancy + information. + + cancel-rebuild + cancel-consistency-check + + The "cancel-rebuild" and "cancel-consistency-check" commands cancel any + rebuild or consistency check operations previously initiated. + + + EXAMPLE I - DRIVE FAILURE WITHOUT A STANDBY DRIVE + +The following annotated logs demonstrate the controller configuration and and +online status monitoring capabilities of the Linux DAC960 Driver. The test +configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a +DAC960PJ controller. The physical drives are configured into a single drive +group without a standby drive, and the drive group has been configured into two +logical drives, one RAID-5 and one RAID-6. Note that these logs are from an +earlier version of the driver and the messages have changed somewhat with newer +releases, but the functionality remains similar. First, here is the current +status of the RAID configuration: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +The above messages indicate that everything is healthy, and /proc/rd/status +returns "OK" indicating that there are no problems with any DAC960 controller +in the system. For demonstration purposes, while I/O is active Physical Drive +1:1 is now disconnected, simulating a drive failure. The failure is noted by +the driver within 10 seconds of the controller's having detected it, and the +driver logs the following console status messages indicating that Logical +Drives 0 and 1 are now CRITICAL as a result of Physical Drive 1:1 being DEAD: + +DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:1 killed because of timeout on SCSI command +DAC960#0: Physical Drive 1:1 is now DEAD +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL + +The Sense Keys logged here are just Check Condition / Unit Attention conditions +arising from a SCSI bus reset that is forced by the controller during its error +recovery procedures. Concurrently with the above, the driver status available +from /proc/rd also reflects the drive failure. The status message in +/proc/rd/status has changed from "OK" to "ALERT": + +gwynedd:/u/lnz# cat /proc/rd/status +ALERT + +and /proc/rd/c0/current_status has been updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Dead, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +Since there are no standby drives configured, the system can continue to access +the logical drives in a performance degraded mode until the failed drive is +replaced and a rebuild operation completed to restore the redundancy of the +logical drives. Once Physical Drive 1:1 is replaced with a properly +functioning drive, or if the physical drive was killed without having failed +(e.g., due to electrical problems on the SCSI bus), the user can instruct the +controller to initiate a rebuild operation onto the newly replaced drive: + +gwynedd:/u/lnz# echo "rebuild 1:1" > /proc/rd/c0/user_command +gwynedd:/u/lnz# cat /proc/rd/c0/user_command +Rebuild of Physical Drive 1:1 Initiated + +The echo command instructs the controller to initiate an asynchronous rebuild +operation onto Physical Drive 1:1, and the status message that results from the +operation is then available for reading from /proc/rd/c0/user_command, as well +as being logged to the console by the driver. + +Within 10 seconds of this command the driver logs the initiation of the +asynchronous rebuild operation: + +DAC960#0: Rebuild of Physical Drive 1:1 Initiated +DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 +DAC960#0: Physical Drive 1:1 is now WRITE-ONLY +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 1% completed + +and /proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Write-Only, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 6% completed + +As the rebuild progresses, the current status in /proc/rd/c0/current_status is +updated every 10 seconds: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Write-Only, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 3305472 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 15% completed + +and every minute a progress message is logged to the console by the driver: + +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 32% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 63% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 94% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 94% completed + +Finally, the rebuild completes successfully. The driver logs the status of the +logical and physical drives and the rebuild completion: + +DAC960#0: Rebuild Completed Successfully +DAC960#0: Physical Drive 1:1 is now ONLINE +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE + +/proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 5498880 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 3305472 blocks, Write Thru + Rebuild Completed Successfully + +and /proc/rd/status indicates that everything is healthy once again: + +gwynedd:/u/lnz# cat /proc/rd/status +OK + + + EXAMPLE II - DRIVE FAILURE WITH A STANDBY DRIVE + +The following annotated logs demonstrate the controller configuration and and +online status monitoring capabilities of the Linux DAC960 Driver. The test +configuration comprises 6 1GB Quantum Atlas I disk drives on two channels of a +DAC960PJ controller. The physical drives are configured into a single drive +group with a standby drive, and the drive group has been configured into two +logical drives, one RAID-5 and one RAID-6. Note that these logs are from an +earlier version of the driver and the messages have changed somewhat with newer +releases, but the functionality remains similar. First, here is the current +status of the RAID configuration: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Online, 2201600 blocks + 1:3 - Disk: Standby, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + No Rebuild or Consistency Check in Progress + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +The above messages indicate that everything is healthy, and /proc/rd/status +returns "OK" indicating that there are no problems with any DAC960 controller +in the system. For demonstration purposes, while I/O is active Physical Drive +1:2 is now disconnected, simulating a drive failure. The failure is noted by +the driver within 10 seconds of the controller's having detected it, and the +driver logs the following console status messages: + +DAC960#0: Physical Drive 1:1 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:3 Error Log: Sense Key = 6, ASC = 29, ASCQ = 02 +DAC960#0: Physical Drive 1:2 killed because of timeout on SCSI command +DAC960#0: Physical Drive 1:2 is now DEAD +DAC960#0: Physical Drive 1:2 killed because it was removed +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now CRITICAL +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now CRITICAL + +Since a standby drive is configured, the controller automatically begins +rebuilding onto the standby drive: + +DAC960#0: Physical Drive 1:3 is now WRITE-ONLY +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed + +Concurrently with the above, the driver status available from /proc/rd also +reflects the drive failure and automatic rebuild. The status message in +/proc/rd/status has changed from "OK" to "ALERT": + +gwynedd:/u/lnz# cat /proc/rd/status +ALERT + +and /proc/rd/c0/current_status has been updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Write-Only, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 4% completed + +As the rebuild progresses, the current status in /proc/rd/c0/current_status is +updated every 10 seconds: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Write-Only, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Critical, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Critical, 2754560 blocks, Write Thru + Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed + +and every minute a progress message is logged on the console by the driver: + +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 40% completed +DAC960#0: Rebuild in Progress: Logical Drive 0 (/dev/rd/c0d0) 76% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 66% completed +DAC960#0: Rebuild in Progress: Logical Drive 1 (/dev/rd/c0d1) 84% completed + +Finally, the rebuild completes successfully. The driver logs the status of the +logical and physical drives and the rebuild completion: + +DAC960#0: Rebuild Completed Successfully +DAC960#0: Physical Drive 1:3 is now ONLINE +DAC960#0: Logical Drive 0 (/dev/rd/c0d0) is now ONLINE +DAC960#0: Logical Drive 1 (/dev/rd/c0d1) is now ONLINE + +/proc/rd/c0/current_status is updated: + +***** DAC960 RAID Driver Version 2.0.0 of 23 March 1999 ***** +Copyright 1998-1999 by Leonard N. Zubkoff <lnz@dandelion.com> +Configuring Mylex DAC960PJ PCI RAID Controller + Firmware Version: 4.06-0-08, Channels: 3, Memory Size: 8MB + PCI Bus: 0, Device: 19, Function: 1, I/O Address: Unassigned + PCI Address: 0xFD4FC000 mapped at 0x8807000, IRQ Channel: 9 + Controller Queue Depth: 128, Maximum Blocks per Command: 128 + Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33 + Stripe Size: 64KB, Segment Size: 8KB, BIOS Geometry: 255/63 + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Dead, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + Rebuild Completed Successfully + +and /proc/rd/status indicates that everything is healthy once again: + +gwynedd:/u/lnz# cat /proc/rd/status +OK + +Note that the absence of a viable standby drive does not create an "ALERT" +status. Once dead Physical Drive 1:2 has been replaced, the controller must be +told that this has occurred and that the newly replaced drive should become the +new standby drive: + +gwynedd:/u/lnz# echo "make-standby 1:2" > /proc/rd/c0/user_command +gwynedd:/u/lnz# cat /proc/rd/c0/user_command +Make Standby of Physical Drive 1:2 Succeeded + +The echo command instructs the controller to make Physical Drive 1:2 into a +standby drive, and the status message that results from the operation is then +available for reading from /proc/rd/c0/user_command, as well as being logged to +the console by the driver. Within 60 seconds of this command the driver logs: + +DAC960#0: Physical Drive 1:2 Error Log: Sense Key = 6, ASC = 29, ASCQ = 01 +DAC960#0: Physical Drive 1:2 is now STANDBY +DAC960#0: Make Standby of Physical Drive 1:2 Succeeded + +and /proc/rd/c0/current_status is updated: + +gwynedd:/u/lnz# cat /proc/rd/c0/current_status + ... + Physical Devices: + 0:1 - Disk: Online, 2201600 blocks + 0:2 - Disk: Online, 2201600 blocks + 0:3 - Disk: Online, 2201600 blocks + 1:1 - Disk: Online, 2201600 blocks + 1:2 - Disk: Standby, 2201600 blocks + 1:3 - Disk: Online, 2201600 blocks + Logical Drives: + /dev/rd/c0d0: RAID-5, Online, 4399104 blocks, Write Thru + /dev/rd/c0d1: RAID-6, Online, 2754560 blocks, Write Thru + Rebuild Completed Successfully diff --git a/Documentation/blockdev/drbd/DRBD-8.3-data-packets.svg b/Documentation/blockdev/drbd/DRBD-8.3-data-packets.svg new file mode 100644 index 000000000..f87cfa0dc --- /dev/null +++ b/Documentation/blockdev/drbd/DRBD-8.3-data-packets.svg @@ -0,0 +1,588 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> +<svg + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + version="1.0" + width="210mm" + height="297mm" + viewBox="0 0 21000 29700" + id="svg2" + style="fill-rule:evenodd"> + <defs + id="defs4" /> + <g + id="Default" + style="visibility:visible"> + <desc + id="desc180">Master slide</desc> + </g> + <path + d="M 11999,8601 L 11899,8301 L 12099,8301 L 11999,8601 z" + id="path193" + style="fill:#008000;visibility:visible" /> + <path + d="M 11999,7801 L 11999,8361" + id="path197" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 7999,10401 L 7899,10101 L 8099,10101 L 7999,10401 z" + id="path209" + style="fill:#008000;visibility:visible" /> + <path + d="M 7999,9601 L 7999,10161" + id="path213" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 11999,7801 L 11685,7840 L 11724,7644 L 11999,7801 z" + id="path225" + style="fill:#008000;visibility:visible" /> + <path + d="M 7999,7001 L 11764,7754" + id="path229" + style="fill:none;stroke:#008000;visibility:visible" /> + <g + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,-1244.4792,1416.5139)" + id="g245" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <text + id="text247"> + <tspan + x="9139 9368 9579 9808 9986 10075 10252 10481 10659 10837 10909" + y="9284" + id="tspan249">RSDataReply</tspan> + </text> + </g> + <path + d="M 7999,9601 L 8281,9458 L 8311,9655 L 7999,9601 z" + id="path259" + style="fill:#008000;visibility:visible" /> + <path + d="M 11999,9001 L 8236,9565" + id="path263" + style="fill:none;stroke:#008000;visibility:visible" /> + <g + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,1620.9382,-1639.4947)" + id="g279" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <text + id="text281"> + <tspan + x="8743 8972 9132 9310 9573 9801 10013 10242 10419 10597 10775 10953 11114" + y="7023" + id="tspan283">CsumRSRequest</tspan> + </text> + </g> + <text + id="text297" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4034 4263 4440 4703 4881 5042 5219 5397 5503 5681 5842 6003 6180 6341 6519 6625 6803 6980 7158 7336 7497 7586 7692" + y="5707" + id="tspan299">w_make_resync_request()</tspan> + </text> + <text + id="text313" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12305 12483 12644 12821 12893 13054 13232 13410 13638 13816 13905 14083 14311 14489 14667 14845 15023 15184 15272 15378" + y="7806" + id="tspan315">receive_DataRequest()</tspan> + </text> + <text + id="text329" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12377 12483 12660 12838 13016 13194 13372 13549 13621 13799 13977 14083 14261 14438 14616 14794 14955 15133 15294 15399" + y="8606" + id="tspan331">drbd_endio_read_sec()</tspan> + </text> + <text + id="text345" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12191 12420 12597 12775 12953 13131 13309 13486 13664 13825 13986 14164 14426 14604 14710 14871 15049 15154 15332 15510 15616" + y="9007" + id="tspan347">w_e_end_csum_rs_req()</tspan> + </text> + <text + id="text361" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4444 4550 4728 4889 5066 5138 5299 5477 5655 5883 6095 6324 6501 6590 6768 6997 7175 7352 7424 7585 7691" + y="9507" + id="tspan363">receive_RSDataReply()</tspan> + </text> + <text + id="text377" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4457 4635 4741 4918 5096 5274 5452 5630 5807 5879 6057 6235 6464 6569 6641 6730 6908 7086 7247 7425 7585 7691" + y="10407" + id="tspan379">drbd_endio_write_sec()</tspan> + </text> + <text + id="text393" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4647 4825 5003 5180 5358 5536 5714 5820 5997 6158 6319 6497 6658 6836 7013 7085 7263 7424 7585 7691" + y="10907" + id="tspan395">e_end_resync_block()</tspan> + </text> + <path + d="M 11999,11601 L 11685,11640 L 11724,11444 L 11999,11601 z" + id="path405" + style="fill:#000080;visibility:visible" /> + <path + d="M 7999,10801 L 11764,11554" + id="path409" + style="fill:none;stroke:#000080;visibility:visible" /> + <g + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,2434.7562,-1674.649)" + id="g425" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <text + id="text427"> + <tspan + x="9320 9621 9726 9798 9887 10065 10277 10438" + y="10943" + id="tspan429">WriteAck</tspan> + </text> + </g> + <text + id="text443" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12377 12555 12644 12821 13033 13105 13283 13444 13604 13816 13977 14138 14244" + y="11559" + id="tspan445">got_BlockAck()</tspan> + </text> + <text + id="text459" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="7999 8304 8541 8778 8990 9201 9413 9650 10001 10120 10357 10594 10806 11043 11280 11398 11703 11940 12152 12364 12601 12812 12931 13049 13261 13498 13710 13947 14065 14302 14540 14658 14777 14870 15107 15225 15437 15649 15886" + y="4877" + id="tspan461">Checksum based Resync, case not in sync</tspan> + </text> + <text + id="text475" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="6961 7266 7571 7854 8159 8299 8536 8654 8891 9010 9247 9484 9603 9840 9958 10077 10170 10407" + y="2806" + id="tspan477">DRBD-8.3 data flow</tspan> + </text> + <text + id="text491" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5190 5419 5596 5774 5952 6113 6291 6468 6646 6824 6985 7146 7324 7586 7692" + y="7005" + id="tspan493">w_e_send_csum()</tspan> + </text> + <path + d="M 11999,17601 L 11899,17301 L 12099,17301 L 11999,17601 z" + id="path503" + style="fill:#008000;visibility:visible" /> + <path + d="M 11999,16801 L 11999,17361" + id="path507" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 11999,16801 L 11685,16840 L 11724,16644 L 11999,16801 z" + id="path519" + style="fill:#008000;visibility:visible" /> + <path + d="M 7999,16001 L 11764,16754" + id="path523" + style="fill:none;stroke:#008000;visibility:visible" /> + <g + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,-2539.5806,1529.3491)" + id="g539" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <text + id="text541"> + <tspan + x="9269 9498 9709 9798 9959 10048 10226 10437 10598 10776" + y="18265" + id="tspan543">RSIsInSync</tspan> + </text> + </g> + <path + d="M 7999,18601 L 8281,18458 L 8311,18655 L 7999,18601 z" + id="path553" + style="fill:#000080;visibility:visible" /> + <path + d="M 11999,18001 L 8236,18565" + id="path557" + style="fill:none;stroke:#000080;visibility:visible" /> + <g + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,3461.4027,-1449.3012)" + id="g573" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <text + id="text575"> + <tspan + x="8743 8972 9132 9310 9573 9801 10013 10242 10419 10597 10775 10953 11114" + y="16023" + id="tspan577">CsumRSRequest</tspan> + </text> + </g> + <text + id="text591" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12305 12483 12644 12821 12893 13054 13232 13410 13638 13816 13905 14083 14311 14489 14667 14845 15023 15184 15272 15378" + y="16806" + id="tspan593">receive_DataRequest()</tspan> + </text> + <text + id="text607" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12377 12483 12660 12838 13016 13194 13372 13549 13621 13799 13977 14083 14261 14438 14616 14794 14955 15133 15294 15399" + y="17606" + id="tspan609">drbd_endio_read_sec()</tspan> + </text> + <text + id="text623" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12191 12420 12597 12775 12953 13131 13309 13486 13664 13825 13986 14164 14426 14604 14710 14871 15049 15154 15332 15510 15616" + y="18007" + id="tspan625">w_e_end_csum_rs_req()</tspan> + </text> + <text + id="text639" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5735 5913 6091 6180 6357 6446 6607 6696 6874 7085 7246 7424 7585 7691" + y="18507" + id="tspan641">got_IsInSync()</tspan> + </text> + <text + id="text655" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="7999 8304 8541 8778 8990 9201 9413 9650 10001 10120 10357 10594 10806 11043 11280 11398 11703 11940 12152 12364 12601 12812 12931 13049 13261 13498 13710 13947 14065 14159 14396 14514 14726 14937 15175" + y="13877" + id="tspan657">Checksum based Resync, case in sync</tspan> + </text> + <path + d="M 12000,24601 L 11900,24301 L 12100,24301 L 12000,24601 z" + id="path667" + style="fill:#008000;visibility:visible" /> + <path + d="M 12000,23801 L 12000,24361" + id="path671" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 8000,26401 L 7900,26101 L 8100,26101 L 8000,26401 z" + id="path683" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,25601 L 8000,26161" + id="path687" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 12000,23801 L 11686,23840 L 11725,23644 L 12000,23801 z" + id="path699" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,23001 L 11765,23754" + id="path703" + style="fill:none;stroke:#008000;visibility:visible" /> + <g + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,-3543.8452,1630.5143)" + id="g719" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <text + id="text721"> + <tspan + x="9464 9710 9921 10150 10328 10505 10577" + y="25236" + id="tspan723">OVReply</tspan> + </text> + </g> + <path + d="M 8000,25601 L 8282,25458 L 8312,25655 L 8000,25601 z" + id="path733" + style="fill:#008000;visibility:visible" /> + <path + d="M 12000,25001 L 8237,25565" + id="path737" + style="fill:none;stroke:#008000;visibility:visible" /> + <g + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,4918.2801,-1381.2128)" + id="g753" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <text + id="text755"> + <tspan + x="9142 9388 9599 9828 10006 10183 10361 10539 10700" + y="23106" + id="tspan757">OVRequest</tspan> + </text> + </g> + <text + id="text771" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12306 12484 12645 12822 12894 13055 13233 13411 13656 13868 14097 14274 14452 14630 14808 14969 15058 15163" + y="23806" + id="tspan773">receive_OVRequest()</tspan> + </text> + <text + id="text787" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12378 12484 12661 12839 13017 13195 13373 13550 13622 13800 13978 14084 14262 14439 14617 14795 14956 15134 15295 15400" + y="24606" + id="tspan789">drbd_endio_read_sec()</tspan> + </text> + <text + id="text803" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12192 12421 12598 12776 12954 13132 13310 13487 13665 13843 14004 14182 14288 14465 14643 14749" + y="25007" + id="tspan805">w_e_end_ov_req()</tspan> + </text> + <text + id="text819" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5101 5207 5385 5546 5723 5795 5956 6134 6312 6557 6769 6998 7175 7353 7425 7586 7692" + y="25507" + id="tspan821">receive_OVReply()</tspan> + </text> + <text + id="text835" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4492 4670 4776 4953 5131 5309 5487 5665 5842 5914 6092 6270 6376 6554 6731 6909 7087 7248 7426 7587 7692" + y="26407" + id="tspan837">drbd_endio_read_sec()</tspan> + </text> + <text + id="text851" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4902 5131 5308 5486 5664 5842 6020 6197 6375 6553 6714 6892 6998 7175 7353 7425 7586 7692" + y="26907" + id="tspan853">w_e_end_ov_reply()</tspan> + </text> + <path + d="M 12000,27601 L 11686,27640 L 11725,27444 L 12000,27601 z" + id="path863" + style="fill:#000080;visibility:visible" /> + <path + d="M 8000,26801 L 11765,27554" + id="path867" + style="fill:none;stroke:#000080;visibility:visible" /> + <g + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,5704.1907,-1328.312)" + id="g883" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <text + id="text885"> + <tspan + x="9279 9525 9736 9965 10143 10303 10481 10553" + y="26935" + id="tspan887">OVResult</tspan> + </text> + </g> + <text + id="text901" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12378 12556 12645 12822 13068 13280 13508 13686 13847 14025 14097 14185 14291" + y="27559" + id="tspan903">got_OVResult()</tspan> + </text> + <text + id="text917" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="8000 8330 8567 8660 8754 8991 9228 9346 9558 9795 9935 10028 10146" + y="21877" + id="tspan919">Online verify</tspan> + </text> + <text + id="text933" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4641 4870 5047 5310 5488 5649 5826 6004 6182 6343 6521 6626 6804 6982 7160 7338 7499 7587 7693" + y="23005" + id="tspan935">w_make_ov_request()</tspan> + </text> + <path + d="M 8000,6500 L 7900,6200 L 8100,6200 L 8000,6500 z" + id="path945" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,5700 L 8000,6260" + id="path949" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 3900,5500 L 3700,5500 L 3700,11000 L 3900,11000" + id="path961" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 3900,14500 L 3700,14500 L 3700,18600 L 3900,18600" + id="path973" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 3900,22800 L 3700,22800 L 3700,26900 L 3900,26900" + id="path985" + style="fill:none;stroke:#000000;visibility:visible" /> + <text + id="text1001" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4492 4670 4776 4953 5131 5309 5487 5665 5842 5914 6092 6270 6376 6554 6731 6909 7087 7248 7426 7587 7692" + y="6506" + id="tspan1003">drbd_endio_read_sec()</tspan> + </text> + <text + id="text1017" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4034 4263 4440 4703 4881 5042 5219 5397 5503 5681 5842 6003 6180 6341 6519 6625 6803 6980 7158 7336 7497 7586 7692" + y="14708" + id="tspan1019">w_make_resync_request()</tspan> + </text> + <text + id="text1033" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5190 5419 5596 5774 5952 6113 6291 6468 6646 6824 6985 7146 7324 7586 7692" + y="16006" + id="tspan1035">w_e_send_csum()</tspan> + </text> + <path + d="M 8000,15501 L 7900,15201 L 8100,15201 L 8000,15501 z" + id="path1045" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,14701 L 8000,15261" + id="path1049" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + id="text1065" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4492 4670 4776 4953 5131 5309 5487 5665 5842 5914 6092 6270 6376 6554 6731 6909 7087 7248 7426 7587 7692" + y="15507" + id="tspan1067">drbd_endio_read_sec()</tspan> + </text> + <path + d="M 16100,9000 L 16300,9000 L 16300,7500 L 16100,7500" + id="path1077" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 16100,18000 L 16300,18000 L 16300,16500 L 16100,16500" + id="path1089" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 16100,25000 L 16300,25000 L 16300,23500 L 16100,23500" + id="path1101" + style="fill:none;stroke:#000000;visibility:visible" /> + <text + id="text1117" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="2026 2132 2293 2471 2648 2826 3004 3076 3254 3431 3503 3681 3787" + y="5402" + id="tspan1119">rs_begin_io()</tspan> + </text> + <text + id="text1133" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="2027 2133 2294 2472 2649 2827 3005 3077 3255 3432 3504 3682 3788" + y="14402" + id="tspan1135">rs_begin_io()</tspan> + </text> + <text + id="text1149" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="2026 2132 2293 2471 2648 2826 3004 3076 3254 3431 3503 3681 3787" + y="22602" + id="tspan1151">rs_begin_io()</tspan> + </text> + <text + id="text1165" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="1426 1532 1693 1871 2031 2209 2472 2649 2721 2899 2988 3166 3344 3416 3593 3699" + y="11302" + id="tspan1167">rs_complete_io()</tspan> + </text> + <text + id="text1181" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="1526 1632 1793 1971 2131 2309 2572 2749 2821 2999 3088 3266 3444 3516 3693 3799" + y="18931" + id="tspan1183">rs_complete_io()</tspan> + </text> + <text + id="text1197" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="1526 1632 1793 1971 2131 2309 2572 2749 2821 2999 3088 3266 3444 3516 3693 3799" + y="27231" + id="tspan1199">rs_complete_io()</tspan> + </text> + <text + id="text1213" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16126 16232 16393 16571 16748 16926 17104 17176 17354 17531 17603 17781 17887" + y="7402" + id="tspan1215">rs_begin_io()</tspan> + </text> + <text + id="text1229" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16127 16233 16394 16572 16749 16927 17105 17177 17355 17532 17604 17782 17888" + y="16331" + id="tspan1231">rs_begin_io()</tspan> + </text> + <text + id="text1245" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16127 16233 16394 16572 16749 16927 17105 17177 17355 17532 17604 17782 17888" + y="23302" + id="tspan1247">rs_begin_io()</tspan> + </text> + <text + id="text1261" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16115 16221 16382 16560 16720 16898 17161 17338 17410 17588 17677 17855 18033 18105 18282 18388" + y="9302" + id="tspan1263">rs_complete_io()</tspan> + </text> + <text + id="text1277" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16115 16221 16382 16560 16720 16898 17161 17338 17410 17588 17677 17855 18033 18105 18282 18388" + y="18331" + id="tspan1279">rs_complete_io()</tspan> + </text> + <text + id="text1293" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16126 16232 16393 16571 16731 16909 17172 17349 17421 17599 17688 17866 18044 18116 18293 18399" + y="25302" + id="tspan1295">rs_complete_io()</tspan> + </text> +</svg> diff --git a/Documentation/blockdev/drbd/DRBD-data-packets.svg b/Documentation/blockdev/drbd/DRBD-data-packets.svg new file mode 100644 index 000000000..48a1e2165 --- /dev/null +++ b/Documentation/blockdev/drbd/DRBD-data-packets.svg @@ -0,0 +1,459 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> +<svg + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + version="1.0" + width="210mm" + height="297mm" + viewBox="0 0 21000 29700" + id="svg2" + style="fill-rule:evenodd"> + <defs + id="defs4" /> + <g + id="Default" + style="visibility:visible"> + <desc + id="desc176">Master slide</desc> + </g> + <path + d="M 11999,19601 L 11899,19301 L 12099,19301 L 11999,19601 z" + id="path189" + style="fill:#008000;visibility:visible" /> + <path + d="M 11999,18801 L 11999,19361" + id="path193" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 7999,21401 L 7899,21101 L 8099,21101 L 7999,21401 z" + id="path205" + style="fill:#008000;visibility:visible" /> + <path + d="M 7999,20601 L 7999,21161" + id="path209" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 11999,18801 L 11685,18840 L 11724,18644 L 11999,18801 z" + id="path221" + style="fill:#008000;visibility:visible" /> + <path + d="M 7999,18001 L 11764,18754" + id="path225" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + x="-3023.845" + y="1106.8124" + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,0,0)" + id="text243" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="6115.1553 6344.1553 6555.1553 6784.1553 6962.1553 7051.1553 7228.1553 7457.1553 7635.1553 7813.1553 7885.1553" + y="21390.812" + id="tspan245">RSDataReply</tspan> + </text> + <path + d="M 7999,20601 L 8281,20458 L 8311,20655 L 7999,20601 z" + id="path255" + style="fill:#008000;visibility:visible" /> + <path + d="M 11999,20001 L 8236,20565" + id="path259" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + x="3502.5356" + y="-2184.6621" + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,0,0)" + id="text277" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12321.536 12550.536 12761.536 12990.536 13168.536 13257.536 13434.536 13663.536 13841.536 14019.536 14196.536 14374.536 14535.536" + y="15854.338" + id="tspan279">RSDataRequest</tspan> + </text> + <text + id="text293" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4034 4263 4440 4703 4881 5042 5219 5397 5503 5681 5842 6003 6180 6341 6519 6625 6803 6980 7158 7336 7497 7586 7692" + y="17807" + id="tspan295">w_make_resync_request()</tspan> + </text> + <text + id="text309" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12305 12483 12644 12821 12893 13054 13232 13410 13638 13816 13905 14083 14311 14489 14667 14845 15023 15184 15272 15378" + y="18806" + id="tspan311">receive_DataRequest()</tspan> + </text> + <text + id="text325" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12377 12483 12660 12838 13016 13194 13372 13549 13621 13799 13977 14083 14261 14438 14616 14794 14955 15133 15294 15399" + y="19606" + id="tspan327">drbd_endio_read_sec()</tspan> + </text> + <text + id="text341" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12191 12420 12597 12775 12953 13131 13309 13486 13664 13770 13931 14109 14287 14375 14553 14731 14837 15015 15192 15298" + y="20007" + id="tspan343">w_e_end_rsdata_req()</tspan> + </text> + <text + id="text357" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4444 4550 4728 4889 5066 5138 5299 5477 5655 5883 6095 6324 6501 6590 6768 6997 7175 7352 7424 7585 7691" + y="20507" + id="tspan359">receive_RSDataReply()</tspan> + </text> + <text + id="text373" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4457 4635 4741 4918 5096 5274 5452 5630 5807 5879 6057 6235 6464 6569 6641 6730 6908 7086 7247 7425 7585 7691" + y="21407" + id="tspan375">drbd_endio_write_sec()</tspan> + </text> + <text + id="text389" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4647 4825 5003 5180 5358 5536 5714 5820 5997 6158 6319 6497 6658 6836 7013 7085 7263 7424 7585 7691" + y="21907" + id="tspan391">e_end_resync_block()</tspan> + </text> + <path + d="M 11999,22601 L 11685,22640 L 11724,22444 L 11999,22601 z" + id="path401" + style="fill:#000080;visibility:visible" /> + <path + d="M 7999,21801 L 11764,22554" + id="path405" + style="fill:none;stroke:#000080;visibility:visible" /> + <text + x="4290.3008" + y="-2369.6162" + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,0,0)" + id="text423" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="13610.301 13911.301 14016.301 14088.301 14177.301 14355.301 14567.301 14728.301" + y="19573.385" + id="tspan425">WriteAck</tspan> + </text> + <text + id="text439" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12199 12377 12555 12644 12821 13033 13105 13283 13444 13604 13816 13977 14138 14244" + y="22559" + id="tspan441">got_BlockAck()</tspan> + </text> + <text + id="text455" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="7999 8304 8541 8753 8964 9201 9413 9531 9769 9862 10099 10310 10522 10734 10852 10971 11208 11348 11585 11822" + y="16877" + id="tspan457">Resync blocks, 4-32K</tspan> + </text> + <path + d="M 12000,7601 L 11900,7301 L 12100,7301 L 12000,7601 z" + id="path467" + style="fill:#008000;visibility:visible" /> + <path + d="M 12000,6801 L 12000,7361" + id="path471" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 12000,6801 L 11686,6840 L 11725,6644 L 12000,6801 z" + id="path483" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,6001 L 11765,6754" + id="path487" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + x="-1288.1796" + y="1279.7666" + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,0,0)" + id="text505" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="8174.8208 8475.8203 8580.8203 8652.8203 8741.8203 8919.8203 9131.8203 9292.8203" + y="9516.7666" + id="tspan507">WriteAck</tspan> + </text> + <path + d="M 8000,8601 L 8282,8458 L 8312,8655 L 8000,8601 z" + id="path517" + style="fill:#000080;visibility:visible" /> + <path + d="M 12000,8001 L 8237,8565" + id="path521" + style="fill:none;stroke:#000080;visibility:visible" /> + <text + x="1065.6655" + y="-2097.7664" + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,0,0)" + id="text539" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="10682.666 10911.666 11088.666 11177.666" + y="4107.2339" + id="tspan541">Data</tspan> + </text> + <text + id="text555" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4746 4924 5030 5207 5385 5563 5826 6003 6164 6342 6520 6626 6803 6981 7159 7337 7498 7587 7692" + y="5505" + id="tspan557">drbd_make_request()</tspan> + </text> + <text + id="text571" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12306 12484 12645 12822 12894 13055 13233 13411 13639 13817 13906 14084 14190" + y="6806" + id="tspan573">receive_Data()</tspan> + </text> + <text + id="text587" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12378 12484 12661 12839 13017 13195 13373 13550 13622 13800 13978 14207 14312 14384 14473 14651 14829 14990 15168 15328 15434" + y="7606" + id="tspan589">drbd_endio_write_sec()</tspan> + </text> + <text + id="text603" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12192 12370 12548 12725 12903 13081 13259 13437 13509 13686 13847 14008 14114" + y="8007" + id="tspan605">e_end_block()</tspan> + </text> + <text + id="text619" + style="font-size:318px;font-weight:400;fill:#000080;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5647 5825 6003 6092 6269 6481 6553 6731 6892 7052 7264 7425 7586 7692" + y="8606" + id="tspan621">got_BlockAck()</tspan> + </text> + <text + id="text635" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="8000 8305 8542 8779 9016 9109 9346 9486 9604 9956 10049 10189 10328 10565 10705 10942 11179 11298 11603 11742 11835 11954 12191 12310 12428 12665 12902 13139 13279 13516 13753" + y="4877" + id="tspan637">Regular mirrored write, 512-32K</tspan> + </text> + <text + id="text651" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5381 5610 5787 5948 6126 6304 6482 6659 6837 7015 7087 7265 7426 7587 7692" + y="6003" + id="tspan653">w_send_dblock()</tspan> + </text> + <path + d="M 8000,6800 L 7900,6500 L 8100,6500 L 8000,6800 z" + id="path663" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,6000 L 8000,6560" + id="path667" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + id="text683" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4602 4780 4886 5063 5241 5419 5597 5775 5952 6024 6202 6380 6609 6714 6786 6875 7053 7231 7409 7515 7587 7692" + y="6905" + id="tspan685">drbd_endio_write_pri()</tspan> + </text> + <path + d="M 12000,13602 L 11900,13302 L 12100,13302 L 12000,13602 z" + id="path695" + style="fill:#008000;visibility:visible" /> + <path + d="M 12000,12802 L 12000,13362" + id="path699" + style="fill:none;stroke:#008000;visibility:visible" /> + <path + d="M 12000,12802 L 11686,12841 L 11725,12645 L 12000,12802 z" + id="path711" + style="fill:#008000;visibility:visible" /> + <path + d="M 8000,12002 L 11765,12755" + id="path715" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + x="-2155.5266" + y="1201.5964" + transform="matrix(0.9895258,-0.1443562,0.1443562,0.9895258,0,0)" + id="text733" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="7202.4736 7431.4736 7608.4736 7697.4736 7875.4736 8104.4736 8282.4736 8459.4736 8531.4736" + y="15454.597" + id="tspan735">DataReply</tspan> + </text> + <path + d="M 8000,14602 L 8282,14459 L 8312,14656 L 8000,14602 z" + id="path745" + style="fill:#008000;visibility:visible" /> + <path + d="M 12000,14002 L 8237,14566" + id="path749" + style="fill:none;stroke:#008000;visibility:visible" /> + <text + x="2280.3804" + y="-2103.2141" + transform="matrix(0.9788674,0.2044961,-0.2044961,0.9788674,0,0)" + id="text767" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="11316.381 11545.381 11722.381 11811.381 11989.381 12218.381 12396.381 12573.381 12751.381 12929.381 13090.381" + y="9981.7861" + id="tspan769">DataRequest</tspan> + </text> + <text + id="text783" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="4746 4924 5030 5207 5385 5563 5826 6003 6164 6342 6520 6626 6803 6981 7159 7337 7498 7587 7692" + y="11506" + id="tspan785">drbd_make_request()</tspan> + </text> + <text + id="text799" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12306 12484 12645 12822 12894 13055 13233 13411 13639 13817 13906 14084 14312 14490 14668 14846 15024 15185 15273 15379" + y="12807" + id="tspan801">receive_DataRequest()</tspan> + </text> + <text + id="text815" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12200 12378 12484 12661 12839 13017 13195 13373 13550 13622 13800 13978 14084 14262 14439 14617 14795 14956 15134 15295 15400" + y="13607" + id="tspan817">drbd_endio_read_sec()</tspan> + </text> + <text + id="text831" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="12192 12421 12598 12776 12954 13132 13310 13487 13665 13843 14021 14110 14288 14465 14571 14749 14927 15033" + y="14008" + id="tspan833">w_e_end_data_req()</tspan> + </text> + <g + id="g835" + style="visibility:visible"> + <desc + id="desc837">Drawing</desc> + <text + id="text847" + style="font-size:318px;font-weight:400;fill:#008000;font-family:Helvetica embedded"> + <tspan + x="4885 4991 5169 5330 5507 5579 5740 5918 6096 6324 6502 6591 6769 6997 7175 7353 7425 7586 7692" + y="14607" + id="tspan849">receive_DataReply()</tspan> + </text> + </g> + <text + id="text863" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="8000 8305 8398 8610 8821 8914 9151 9363 9575 9693 9833 10070 10307 10544 10663 10781 11018 11255 11493 11632 11869 12106" + y="10878" + id="tspan865">Diskless read, 512-32K</tspan> + </text> + <text + id="text879" + style="font-size:318px;font-weight:400;fill:#008000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="5029 5258 5435 5596 5774 5952 6130 6307 6413 6591 6769 6947 7125 7230 7408 7586 7692" + y="12004" + id="tspan881">w_send_read_req()</tspan> + </text> + <text + id="text895" + style="font-size:423px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="6961 7266 7571 7854 8159 8278 8515 8633 8870 9107 9226 9463 9581 9700 9793 10030" + y="2806" + id="tspan897">DRBD 8 data flow</tspan> + </text> + <path + d="M 3900,5300 L 3700,5300 L 3700,7000 L 3900,7000" + id="path907" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 3900,17600 L 3700,17600 L 3700,22000 L 3900,22000" + id="path919" + style="fill:none;stroke:#000000;visibility:visible" /> + <path + d="M 16100,20000 L 16300,20000 L 16300,18500 L 16100,18500" + id="path931" + style="fill:none;stroke:#000000;visibility:visible" /> + <text + id="text947" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="2126 2304 2376 2554 2731 2909 3087 3159 3337 3515 3587 3764 3870" + y="5202" + id="tspan949">al_begin_io()</tspan> + </text> + <text + id="text963" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="1632 1810 1882 2060 2220 2398 2661 2839 2910 3088 3177 3355 3533 3605 3783 3888" + y="7331" + id="tspan965">al_complete_io()</tspan> + </text> + <text + id="text979" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="2126 2232 2393 2571 2748 2926 3104 3176 3354 3531 3603 3781 3887" + y="17431" + id="tspan981">rs_begin_io()</tspan> + </text> + <text + id="text995" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="1626 1732 1893 2071 2231 2409 2672 2849 2921 3099 3188 3366 3544 3616 3793 3899" + y="22331" + id="tspan997">rs_complete_io()</tspan> + </text> + <text + id="text1011" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16027 16133 16294 16472 16649 16827 17005 17077 17255 17432 17504 17682 17788" + y="18402" + id="tspan1013">rs_begin_io()</tspan> + </text> + <text + id="text1027" + style="font-size:318px;font-weight:400;fill:#000000;visibility:visible;font-family:Helvetica embedded"> + <tspan + x="16115 16221 16382 16560 16720 16898 17161 17338 17410 17588 17677 17855 18033 18105 18282 18388" + y="20331" + id="tspan1029">rs_complete_io()</tspan> + </text> +</svg> diff --git a/Documentation/blockdev/drbd/README.txt b/Documentation/blockdev/drbd/README.txt new file mode 100644 index 000000000..627b0a1bf --- /dev/null +++ b/Documentation/blockdev/drbd/README.txt @@ -0,0 +1,16 @@ +Description + + DRBD is a shared-nothing, synchronously replicated block device. It + is designed to serve as a building block for high availability + clusters and in this context, is a "drop-in" replacement for shared + storage. Simplistically, you could see it as a network RAID 1. + + Please visit http://www.drbd.org to find out more. + +The here included files are intended to help understand the implementation + +DRBD-8.3-data-packets.svg, DRBD-data-packets.svg + relates some functions, and write packets. + +conn-states-8.dot, disk-states-8.dot, node-states-8.dot + The sub graphs of DRBD's state transitions diff --git a/Documentation/blockdev/drbd/conn-states-8.dot b/Documentation/blockdev/drbd/conn-states-8.dot new file mode 100644 index 000000000..025e8cf5e --- /dev/null +++ b/Documentation/blockdev/drbd/conn-states-8.dot @@ -0,0 +1,18 @@ +digraph conn_states { + StandAllone -> WFConnection [ label = "ioctl_set_net()" ] + WFConnection -> Unconnected [ label = "unable to bind()" ] + WFConnection -> WFReportParams [ label = "in connect() after accept" ] + WFReportParams -> StandAllone [ label = "checks in receive_param()" ] + WFReportParams -> Connected [ label = "in receive_param()" ] + WFReportParams -> WFBitMapS [ label = "sync_handshake()" ] + WFReportParams -> WFBitMapT [ label = "sync_handshake()" ] + WFBitMapS -> SyncSource [ label = "receive_bitmap()" ] + WFBitMapT -> SyncTarget [ label = "receive_bitmap()" ] + SyncSource -> Connected + SyncTarget -> Connected + SyncSource -> PausedSyncS + SyncTarget -> PausedSyncT + PausedSyncS -> SyncSource + PausedSyncT -> SyncTarget + Connected -> WFConnection [ label = "* on network error" ] +} diff --git a/Documentation/blockdev/drbd/data-structure-v9.txt b/Documentation/blockdev/drbd/data-structure-v9.txt new file mode 100644 index 000000000..1e52a0e32 --- /dev/null +++ b/Documentation/blockdev/drbd/data-structure-v9.txt @@ -0,0 +1,38 @@ +This describes the in kernel data structure for DRBD-9. Starting with +Linux v3.14 we are reorganizing DRBD to use this data structure. + +Basic Data Structure +==================== + +A node has a number of DRBD resources. Each such resource has a number of +devices (aka volumes) and connections to other nodes ("peer nodes"). Each DRBD +device is represented by a block device locally. + +The DRBD objects are interconnected to form a matrix as depicted below; a +drbd_peer_device object sits at each intersection between a drbd_device and a +drbd_connection: + + /--------------+---------------+.....+---------------\ + | resource | device | | device | + +--------------+---------------+.....+---------------+ + | connection | peer_device | | peer_device | + +--------------+---------------+.....+---------------+ + : : : : : + : : : : : + +--------------+---------------+.....+---------------+ + | connection | peer_device | | peer_device | + \--------------+---------------+.....+---------------/ + +In this table, horizontally, devices can be accessed from resources by their +volume number. Likewise, peer_devices can be accessed from connections by +their volume number. Objects in the vertical direction are connected by double +linked lists. There are back pointers from peer_devices to their connections a +devices, and from connections and devices to their resource. + +All resources are in the drbd_resources double-linked list. In addition, all +devices can be accessed by their minor device number via the drbd_devices idr. + +The drbd_resource, drbd_connection, and drbd_device objects are reference +counted. The peer_device objects only serve to establish the links between +devices and connections; their lifetime is determined by the lifetime of the +device and connection which they reference. diff --git a/Documentation/blockdev/drbd/disk-states-8.dot b/Documentation/blockdev/drbd/disk-states-8.dot new file mode 100644 index 000000000..d06cfb46f --- /dev/null +++ b/Documentation/blockdev/drbd/disk-states-8.dot @@ -0,0 +1,16 @@ +digraph disk_states { + Diskless -> Inconsistent [ label = "ioctl_set_disk()" ] + Diskless -> Consistent [ label = "ioctl_set_disk()" ] + Diskless -> Outdated [ label = "ioctl_set_disk()" ] + Consistent -> Outdated [ label = "receive_param()" ] + Consistent -> UpToDate [ label = "receive_param()" ] + Consistent -> Inconsistent [ label = "start resync" ] + Outdated -> Inconsistent [ label = "start resync" ] + UpToDate -> Inconsistent [ label = "ioctl_replicate" ] + Inconsistent -> UpToDate [ label = "resync completed" ] + Consistent -> Failed [ label = "io completion error" ] + Outdated -> Failed [ label = "io completion error" ] + UpToDate -> Failed [ label = "io completion error" ] + Inconsistent -> Failed [ label = "io completion error" ] + Failed -> Diskless [ label = "sending notify to peer" ] +} diff --git a/Documentation/blockdev/drbd/drbd-connection-state-overview.dot b/Documentation/blockdev/drbd/drbd-connection-state-overview.dot new file mode 100644 index 000000000..6d9cf0a7b --- /dev/null +++ b/Documentation/blockdev/drbd/drbd-connection-state-overview.dot @@ -0,0 +1,85 @@ +// vim: set sw=2 sts=2 : +digraph { + rankdir=BT + bgcolor=white + + node [shape=plaintext] + node [fontcolor=black] + + StandAlone [ style=filled,fillcolor=gray,label=StandAlone ] + + node [fontcolor=lightgray] + + Unconnected [ label=Unconnected ] + + CommTrouble [ shape=record, + label="{communication loss|{Timeout|BrokenPipe|NetworkFailure}}" ] + + node [fontcolor=gray] + + subgraph cluster_try_connect { + label="try to connect, handshake" + rank=max + WFConnection [ label=WFConnection ] + WFReportParams [ label=WFReportParams ] + } + + TearDown [ label=TearDown ] + + Connected [ label=Connected,style=filled,fillcolor=green,fontcolor=black ] + + node [fontcolor=lightblue] + + StartingSyncS [ label=StartingSyncS ] + StartingSyncT [ label=StartingSyncT ] + + subgraph cluster_bitmap_exchange { + node [fontcolor=red] + fontcolor=red + label="new application (WRITE?) requests blocked\lwhile bitmap is exchanged" + + WFBitMapT [ label=WFBitMapT ] + WFSyncUUID [ label=WFSyncUUID ] + WFBitMapS [ label=WFBitMapS ] + } + + node [fontcolor=blue] + + cluster_resync [ shape=record,label="{<any>resynchronisation process running\l'concurrent' application requests allowed|{{<T>PausedSyncT\nSyncTarget}|{<S>PausedSyncS\nSyncSource}}}" ] + + node [shape=box,fontcolor=black] + + // drbdadm [label="drbdadm connect"] + // handshake [label="drbd_connect()\ndrbd_do_handshake\ndrbd_sync_handshake() etc."] + // comm_error [label="communication trouble"] + + // + // edges + // -------------------------------------- + + StandAlone -> Unconnected [ label="drbdadm connect" ] + Unconnected -> StandAlone [ label="drbdadm disconnect\lor serious communication trouble" ] + Unconnected -> WFConnection [ label="receiver thread is started" ] + WFConnection -> WFReportParams [ headlabel="accept()\land/or \lconnect()\l" ] + + WFReportParams -> StandAlone [ label="during handshake\lpeers do not agree\labout something essential" ] + WFReportParams -> Connected [ label="data identical\lno sync needed",color=green,fontcolor=green ] + + WFReportParams -> WFBitMapS + WFReportParams -> WFBitMapT + WFBitMapT -> WFSyncUUID [minlen=0.1,constraint=false] + + WFBitMapS -> cluster_resync:S + WFSyncUUID -> cluster_resync:T + + edge [color=green] + cluster_resync:any -> Connected [ label="resnyc done",fontcolor=green ] + + edge [color=red] + WFReportParams -> CommTrouble + Connected -> CommTrouble + cluster_resync:any -> CommTrouble + edge [color=black] + CommTrouble -> Unconnected [label="receiver thread is stopped" ] + +} diff --git a/Documentation/blockdev/drbd/node-states-8.dot b/Documentation/blockdev/drbd/node-states-8.dot new file mode 100644 index 000000000..4a2b00c23 --- /dev/null +++ b/Documentation/blockdev/drbd/node-states-8.dot @@ -0,0 +1,14 @@ +digraph node_states { + Secondary -> Primary [ label = "ioctl_set_state()" ] + Primary -> Secondary [ label = "ioctl_set_state()" ] +} + +digraph peer_states { + Secondary -> Primary [ label = "recv state packet" ] + Primary -> Secondary [ label = "recv state packet" ] + Primary -> Unknown [ label = "connection lost" ] + Secondary -> Unknown [ label = "connection lost" ] + Unknown -> Primary [ label = "connected" ] + Unknown -> Secondary [ label = "connected" ] +} + diff --git a/Documentation/blockdev/floppy.txt b/Documentation/blockdev/floppy.txt new file mode 100644 index 000000000..e2240f5ab --- /dev/null +++ b/Documentation/blockdev/floppy.txt @@ -0,0 +1,245 @@ +This file describes the floppy driver. + +FAQ list: +========= + + A FAQ list may be found in the fdutils package (see below), and also +at <http://fdutils.linux.lu/faq.html>. + + +LILO configuration options (Thinkpad users, read this) +====================================================== + + The floppy driver is configured using the 'floppy=' option in +lilo. This option can be typed at the boot prompt, or entered in the +lilo configuration file. + + Example: If your kernel is called linux-2.6.9, type the following line +at the lilo boot prompt (if you have a thinkpad): + + linux-2.6.9 floppy=thinkpad + +You may also enter the following line in /etc/lilo.conf, in the description +of linux-2.6.9: + + append = "floppy=thinkpad" + + Several floppy related options may be given, example: + + linux-2.6.9 floppy=daring floppy=two_fdc + append = "floppy=daring floppy=two_fdc" + + If you give options both in the lilo config file and on the boot +prompt, the option strings of both places are concatenated, the boot +prompt options coming last. That's why there are also options to +restore the default behavior. + + +Module configuration options +============================ + + If you use the floppy driver as a module, use the following syntax: +modprobe floppy floppy="<options>" + +Example: + modprobe floppy floppy="omnibook messages" + + If you need certain options enabled every time you load the floppy driver, +you can put: + + options floppy floppy="omnibook messages" + +in a configuration file in /etc/modprobe.d/. + + + The floppy driver related options are: + + floppy=asus_pci + Sets the bit mask to allow only units 0 and 1. (default) + + floppy=daring + Tells the floppy driver that you have a well behaved floppy controller. + This allows more efficient and smoother operation, but may fail on + certain controllers. This may speed up certain operations. + + floppy=0,daring + Tells the floppy driver that your floppy controller should be used + with caution. + + floppy=one_fdc + Tells the floppy driver that you have only one floppy controller. + (default) + + floppy=two_fdc + floppy=<address>,two_fdc + Tells the floppy driver that you have two floppy controllers. + The second floppy controller is assumed to be at <address>. + This option is not needed if the second controller is at address + 0x370, and if you use the 'cmos' option. + + floppy=thinkpad + Tells the floppy driver that you have a Thinkpad. Thinkpads use an + inverted convention for the disk change line. + + floppy=0,thinkpad + Tells the floppy driver that you don't have a Thinkpad. + + floppy=omnibook + floppy=nodma + Tells the floppy driver not to use Dma for data transfers. + This is needed on HP Omnibooks, which don't have a workable + DMA channel for the floppy driver. This option is also useful + if you frequently get "Unable to allocate DMA memory" messages. + Indeed, dma memory needs to be continuous in physical memory, + and is thus harder to find, whereas non-dma buffers may be + allocated in virtual memory. However, I advise against this if + you have an FDC without a FIFO (8272A or 82072). 82072A and + later are OK. You also need at least a 486 to use nodma. + If you use nodma mode, I suggest you also set the FIFO + threshold to 10 or lower, in order to limit the number of data + transfer interrupts. + + If you have a FIFO-able FDC, the floppy driver automatically + falls back on non DMA mode if no DMA-able memory can be found. + If you want to avoid this, explicitly ask for 'yesdma'. + + floppy=yesdma + Tells the floppy driver that a workable DMA channel is available. + (default) + + floppy=nofifo + Disables the FIFO entirely. This is needed if you get "Bus + master arbitration error" messages from your Ethernet card (or + from other devices) while accessing the floppy. + + floppy=usefifo + Enables the FIFO. (default) + + floppy=<threshold>,fifo_depth + Sets the FIFO threshold. This is mostly relevant in DMA + mode. If this is higher, the floppy driver tolerates more + interrupt latency, but it triggers more interrupts (i.e. it + imposes more load on the rest of the system). If this is + lower, the interrupt latency should be lower too (faster + processor). The benefit of a lower threshold is less + interrupts. + + To tune the fifo threshold, switch on over/underrun messages + using 'floppycontrol --messages'. Then access a floppy + disk. If you get a huge amount of "Over/Underrun - retrying" + messages, then the fifo threshold is too low. Try with a + higher value, until you only get an occasional Over/Underrun. + It is a good idea to compile the floppy driver as a module + when doing this tuning. Indeed, it allows to try different + fifo values without rebooting the machine for each test. Note + that you need to do 'floppycontrol --messages' every time you + re-insert the module. + + Usually, tuning the fifo threshold should not be needed, as + the default (0xa) is reasonable. + + floppy=<drive>,<type>,cmos + Sets the CMOS type of <drive> to <type>. This is mandatory if + you have more than two floppy drives (only two can be + described in the physical CMOS), or if your BIOS uses + non-standard CMOS types. The CMOS types are: + + 0 - Use the value of the physical CMOS + 1 - 5 1/4 DD + 2 - 5 1/4 HD + 3 - 3 1/2 DD + 4 - 3 1/2 HD + 5 - 3 1/2 ED + 6 - 3 1/2 ED + 16 - unknown or not installed + + (Note: there are two valid types for ED drives. This is because 5 was + initially chosen to represent floppy *tapes*, and 6 for ED drives. + AMI ignored this, and used 5 for ED drives. That's why the floppy + driver handles both.) + + floppy=unexpected_interrupts + Print a warning message when an unexpected interrupt is received. + (default) + + floppy=no_unexpected_interrupts + floppy=L40SX + Don't print a message when an unexpected interrupt is received. This + is needed on IBM L40SX laptops in certain video modes. (There seems + to be an interaction between video and floppy. The unexpected + interrupts affect only performance, and can be safely ignored.) + + floppy=broken_dcl + Don't use the disk change line, but assume that the disk was + changed whenever the device node is reopened. Needed on some + boxes where the disk change line is broken or unsupported. + This should be regarded as a stopgap measure, indeed it makes + floppy operation less efficient due to unneeded cache + flushings, and slightly more unreliable. Please verify your + cable, connection and jumper settings if you have any DCL + problems. However, some older drives, and also some laptops + are known not to have a DCL. + + floppy=debug + Print debugging messages. + + floppy=messages + Print informational messages for some operations (disk change + notifications, warnings about over and underruns, and about + autodetection). + + floppy=silent_dcl_clear + Uses a less noisy way to clear the disk change line (which + doesn't involve seeks). Implied by 'daring' option. + + floppy=<nr>,irq + Sets the floppy IRQ to <nr> instead of 6. + + floppy=<nr>,dma + Sets the floppy DMA channel to <nr> instead of 2. + + floppy=slow + Use PS/2 stepping rate: + " PS/2 floppies have much slower step rates than regular floppies. + It's been recommended that take about 1/4 of the default speed + in some more extreme cases." + + +Supporting utilities and additional documentation: +================================================== + + Additional parameters of the floppy driver can be configured at +runtime. Utilities which do this can be found in the fdutils package. +This package also contains a new version of mtools which allows to +access high capacity disks (up to 1992K on a high density 3 1/2 disk!). +It also contains additional documentation about the floppy driver. + +The latest version can be found at fdutils homepage: + http://fdutils.linux.lu + +The fdutils releases can be found at: + http://fdutils.linux.lu/download.html + http://www.tux.org/pub/knaff/fdutils/ + ftp://metalab.unc.edu/pub/Linux/utils/disk-management/ + +Reporting problems about the floppy driver +========================================== + + If you have a question or a bug report about the floppy driver, mail +me at Alain.Knaff@poboxes.com . If you post to Usenet, preferably use +comp.os.linux.hardware. As the volume in these groups is rather high, +be sure to include the word "floppy" (or "FLOPPY") in the subject +line. If the reported problem happens when mounting floppy disks, be +sure to mention also the type of the filesystem in the subject line. + + Be sure to read the FAQ before mailing/posting any bug reports! + + Alain + +Changelog +========= + +10-30-2004 : Cleanup, updating, add reference to module configuration. + James Nelson <james4765@gmail.com> + +6-3-2000 : Original Document diff --git a/Documentation/blockdev/nbd.txt b/Documentation/blockdev/nbd.txt new file mode 100644 index 000000000..db242ea2b --- /dev/null +++ b/Documentation/blockdev/nbd.txt @@ -0,0 +1,31 @@ +Network Block Device (TCP version) +================================== + +1) Overview +----------- + +What is it: With this compiled in the kernel (or as a module), Linux +can use a remote server as one of its block devices. So every time +the client computer wants to read, e.g., /dev/nb0, it sends a +request over TCP to the server, which will reply with the data read. +This can be used for stations with low disk space (or even diskless) +to borrow disk space from another computer. +Unlike NFS, it is possible to put any filesystem on it, etc. + +For more information, or to download the nbd-client and nbd-server +tools, go to http://nbd.sf.net/. + +The nbd kernel module need only be installed on the client +system, as the nbd-server is completely in userspace. In fact, +the nbd-server has been successfully ported to other operating +systems, including Windows. + +A) NBD parameters +----------------- + +max_part + Number of partitions per device (default: 0). + +nbds_max + Number of block devices that should be initialized (default: 16). + diff --git a/Documentation/blockdev/paride.txt b/Documentation/blockdev/paride.txt new file mode 100644 index 000000000..ee6717e37 --- /dev/null +++ b/Documentation/blockdev/paride.txt @@ -0,0 +1,417 @@ + + Linux and parallel port IDE devices + +PARIDE v1.03 (c) 1997-8 Grant Guenther <grant@torque.net> + +1. Introduction + +Owing to the simplicity and near universality of the parallel port interface +to personal computers, many external devices such as portable hard-disk, +CD-ROM, LS-120 and tape drives use the parallel port to connect to their +host computer. While some devices (notably scanners) use ad-hoc methods +to pass commands and data through the parallel port interface, most +external devices are actually identical to an internal model, but with +a parallel-port adapter chip added in. Some of the original parallel port +adapters were little more than mechanisms for multiplexing a SCSI bus. +(The Iomega PPA-3 adapter used in the ZIP drives is an example of this +approach). Most current designs, however, take a different approach. +The adapter chip reproduces a small ISA or IDE bus in the external device +and the communication protocol provides operations for reading and writing +device registers, as well as data block transfer functions. Sometimes, +the device being addressed via the parallel cable is a standard SCSI +controller like an NCR 5380. The "ditto" family of external tape +drives use the ISA replicator to interface a floppy disk controller, +which is then connected to a floppy-tape mechanism. The vast majority +of external parallel port devices, however, are now based on standard +IDE type devices, which require no intermediate controller. If one +were to open up a parallel port CD-ROM drive, for instance, one would +find a standard ATAPI CD-ROM drive, a power supply, and a single adapter +that interconnected a standard PC parallel port cable and a standard +IDE cable. It is usually possible to exchange the CD-ROM device with +any other device using the IDE interface. + +The document describes the support in Linux for parallel port IDE +devices. It does not cover parallel port SCSI devices, "ditto" tape +drives or scanners. Many different devices are supported by the +parallel port IDE subsystem, including: + + MicroSolutions backpack CD-ROM + MicroSolutions backpack PD/CD + MicroSolutions backpack hard-drives + MicroSolutions backpack 8000t tape drive + SyQuest EZ-135, EZ-230 & SparQ drives + Avatar Shark + Imation Superdisk LS-120 + Maxell Superdisk LS-120 + FreeCom Power CD + Hewlett-Packard 5GB and 8GB tape drives + Hewlett-Packard 7100 and 7200 CD-RW drives + +as well as most of the clone and no-name products on the market. + +To support such a wide range of devices, PARIDE, the parallel port IDE +subsystem, is actually structured in three parts. There is a base +paride module which provides a registry and some common methods for +accessing the parallel ports. The second component is a set of +high-level drivers for each of the different types of supported devices: + + pd IDE disk + pcd ATAPI CD-ROM + pf ATAPI disk + pt ATAPI tape + pg ATAPI generic + +(Currently, the pg driver is only used with CD-R drives). + +The high-level drivers function according to the relevant standards. +The third component of PARIDE is a set of low-level protocol drivers +for each of the parallel port IDE adapter chips. Thanks to the interest +and encouragement of Linux users from many parts of the world, +support is available for almost all known adapter protocols: + + aten ATEN EH-100 (HK) + bpck Microsolutions backpack (US) + comm DataStor (old-type) "commuter" adapter (TW) + dstr DataStor EP-2000 (TW) + epat Shuttle EPAT (UK) + epia Shuttle EPIA (UK) + fit2 FIT TD-2000 (US) + fit3 FIT TD-3000 (US) + friq Freecom IQ cable (DE) + frpw Freecom Power (DE) + kbic KingByte KBIC-951A and KBIC-971A (TW) + ktti KT Technology PHd adapter (SG) + on20 OnSpec 90c20 (US) + on26 OnSpec 90c26 (US) + + +2. Using the PARIDE subsystem + +While configuring the Linux kernel, you may choose either to build +the PARIDE drivers into your kernel, or to build them as modules. + +In either case, you will need to select "Parallel port IDE device support" +as well as at least one of the high-level drivers and at least one +of the parallel port communication protocols. If you do not know +what kind of parallel port adapter is used in your drive, you could +begin by checking the file names and any text files on your DOS +installation floppy. Alternatively, you can look at the markings on +the adapter chip itself. That's usually sufficient to identify the +correct device. + +You can actually select all the protocol modules, and allow the PARIDE +subsystem to try them all for you. + +For the "brand-name" products listed above, here are the protocol +and high-level drivers that you would use: + + Manufacturer Model Driver Protocol + + MicroSolutions CD-ROM pcd bpck + MicroSolutions PD drive pf bpck + MicroSolutions hard-drive pd bpck + MicroSolutions 8000t tape pt bpck + SyQuest EZ, SparQ pd epat + Imation Superdisk pf epat + Maxell Superdisk pf friq + Avatar Shark pd epat + FreeCom CD-ROM pcd frpw + Hewlett-Packard 5GB Tape pt epat + Hewlett-Packard 7200e (CD) pcd epat + Hewlett-Packard 7200e (CD-R) pg epat + +2.1 Configuring built-in drivers + +We recommend that you get to know how the drivers work and how to +configure them as loadable modules, before attempting to compile a +kernel with the drivers built-in. + +If you built all of your PARIDE support directly into your kernel, +and you have just a single parallel port IDE device, your kernel should +locate it automatically for you. If you have more than one device, +you may need to give some command line options to your bootloader +(eg: LILO), how to do that is beyond the scope of this document. + +The high-level drivers accept a number of command line parameters, all +of which are documented in the source files in linux/drivers/block/paride. +By default, each driver will automatically try all parallel ports it +can find, and all protocol types that have been installed, until it finds +a parallel port IDE adapter. Once it finds one, the probe stops. So, +if you have more than one device, you will need to tell the drivers +how to identify them. This requires specifying the port address, the +protocol identification number and, for some devices, the drive's +chain ID. While your system is booting, a number of messages are +displayed on the console. Like all such messages, they can be +reviewed with the 'dmesg' command. Among those messages will be +some lines like: + + paride: bpck registered as protocol 0 + paride: epat registered as protocol 1 + +The numbers will always be the same until you build a new kernel with +different protocol selections. You should note these numbers as you +will need them to identify the devices. + +If you happen to be using a MicroSolutions backpack device, you will +also need to know the unit ID number for each drive. This is usually +the last two digits of the drive's serial number (but read MicroSolutions' +documentation about this). + +As an example, let's assume that you have a MicroSolutions PD/CD drive +with unit ID number 36 connected to the parallel port at 0x378, a SyQuest +EZ-135 connected to the chained port on the PD/CD drive and also an +Imation Superdisk connected to port 0x278. You could give the following +options on your boot command: + + pd.drive0=0x378,1 pf.drive0=0x278,1 pf.drive1=0x378,0,36 + +In the last option, pf.drive1 configures device /dev/pf1, the 0x378 +is the parallel port base address, the 0 is the protocol registration +number and 36 is the chain ID. + +Please note: while PARIDE will work both with and without the +PARPORT parallel port sharing system that is included by the +"Parallel port support" option, PARPORT must be included and enabled +if you want to use chains of devices on the same parallel port. + +2.2 Loading and configuring PARIDE as modules + +It is much faster and simpler to get to understand the PARIDE drivers +if you use them as loadable kernel modules. + +Note 1: using these drivers with the "kerneld" automatic module loading +system is not recommended for beginners, and is not documented here. + +Note 2: if you build PARPORT support as a loadable module, PARIDE must +also be built as loadable modules, and PARPORT must be loaded before the +PARIDE modules. + +To use PARIDE, you must begin by + + insmod paride + +this loads a base module which provides a registry for the protocols, +among other tasks. + +Then, load as many of the protocol modules as you think you might need. +As you load each module, it will register the protocols that it supports, +and print a log message to your kernel log file and your console. For +example: + + # insmod epat + paride: epat registered as protocol 0 + # insmod kbic + paride: k951 registered as protocol 1 + paride: k971 registered as protocol 2 + +Finally, you can load high-level drivers for each kind of device that +you have connected. By default, each driver will autoprobe for a single +device, but you can support up to four similar devices by giving their +individual co-ordinates when you load the driver. + +For example, if you had two no-name CD-ROM drives both using the +KingByte KBIC-951A adapter, one on port 0x378 and the other on 0x3bc +you could give the following command: + + # insmod pcd drive0=0x378,1 drive1=0x3bc,1 + +For most adapters, giving a port address and protocol number is sufficient, +but check the source files in linux/drivers/block/paride for more +information. (Hopefully someone will write some man pages one day !). + +As another example, here's what happens when PARPORT is installed, and +a SyQuest EZ-135 is attached to port 0x378: + + # insmod paride + paride: version 1.0 installed + # insmod epat + paride: epat registered as protocol 0 + # insmod pd + pd: pd version 1.0, major 45, cluster 64, nice 0 + pda: Sharing parport1 at 0x378 + pda: epat 1.0, Shuttle EPAT chip c3 at 0x378, mode 5 (EPP-32), delay 1 + pda: SyQuest EZ135A, 262144 blocks [128M], (512/16/32), removable media + pda: pda1 + +Note that the last line is the output from the generic partition table +scanner - in this case it reports that it has found a disk with one partition. + +2.3 Using a PARIDE device + +Once the drivers have been loaded, you can access PARIDE devices in the +same way as their traditional counterparts. You will probably need to +create the device "special files". Here is a simple script that you can +cut to a file and execute: + +#!/bin/bash +# +# mkd -- a script to create the device special files for the PARIDE subsystem +# +function mkdev { + mknod $1 $2 $3 $4 ; chmod 0660 $1 ; chown root:disk $1 +} +# +function pd { + D=$( printf \\$( printf "x%03x" $[ $1 + 97 ] ) ) + mkdev pd$D b 45 $[ $1 * 16 ] + for P in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 + do mkdev pd$D$P b 45 $[ $1 * 16 + $P ] + done +} +# +cd /dev +# +for u in 0 1 2 3 ; do pd $u ; done +for u in 0 1 2 3 ; do mkdev pcd$u b 46 $u ; done +for u in 0 1 2 3 ; do mkdev pf$u b 47 $u ; done +for u in 0 1 2 3 ; do mkdev pt$u c 96 $u ; done +for u in 0 1 2 3 ; do mkdev npt$u c 96 $[ $u + 128 ] ; done +for u in 0 1 2 3 ; do mkdev pg$u c 97 $u ; done +# +# end of mkd + +With the device files and drivers in place, you can access PARIDE devices +like any other Linux device. For example, to mount a CD-ROM in pcd0, use: + + mount /dev/pcd0 /cdrom + +If you have a fresh Avatar Shark cartridge, and the drive is pda, you +might do something like: + + fdisk /dev/pda -- make a new partition table with + partition 1 of type 83 + + mke2fs /dev/pda1 -- to build the file system + + mkdir /shark -- make a place to mount the disk + + mount /dev/pda1 /shark + +Devices like the Imation superdisk work in the same way, except that +they do not have a partition table. For example to make a 120MB +floppy that you could share with a DOS system: + + mkdosfs /dev/pf0 + mount /dev/pf0 /mnt + + +2.4 The pf driver + +The pf driver is intended for use with parallel port ATAPI disk +devices. The most common devices in this category are PD drives +and LS-120 drives. Traditionally, media for these devices are not +partitioned. Consequently, the pf driver does not support partitioned +media. This may be changed in a future version of the driver. + +2.5 Using the pt driver + +The pt driver for parallel port ATAPI tape drives is a minimal driver. +It does not yet support many of the standard tape ioctl operations. +For best performance, a block size of 32KB should be used. You will +probably want to set the parallel port delay to 0, if you can. + +2.6 Using the pg driver + +The pg driver can be used in conjunction with the cdrecord program +to create CD-ROMs. Please get cdrecord version 1.6.1 or later +from ftp://ftp.fokus.gmd.de/pub/unix/cdrecord/ . To record CD-R media +your parallel port should ideally be set to EPP mode, and the "port delay" +should be set to 0. With those settings it is possible to record at 2x +speed without any buffer underruns. If you cannot get the driver to work +in EPP mode, try to use "bidirectional" or "PS/2" mode and 1x speeds only. + + +3. Troubleshooting + +3.1 Use EPP mode if you can + +The most common problems that people report with the PARIDE drivers +concern the parallel port CMOS settings. At this time, none of the +PARIDE protocol modules support ECP mode, or any ECP combination modes. +If you are able to do so, please set your parallel port into EPP mode +using your CMOS setup procedure. + +3.2 Check the port delay + +Some parallel ports cannot reliably transfer data at full speed. To +offset the errors, the PARIDE protocol modules introduce a "port +delay" between each access to the i/o ports. Each protocol sets +a default value for this delay. In most cases, the user can override +the default and set it to 0 - resulting in somewhat higher transfer +rates. In some rare cases (especially with older 486 systems) the +default delays are not long enough. if you experience corrupt data +transfers, or unexpected failures, you may wish to increase the +port delay. The delay can be programmed using the "driveN" parameters +to each of the high-level drivers. Please see the notes above, or +read the comments at the beginning of the driver source files in +linux/drivers/block/paride. + +3.3 Some drives need a printer reset + +There appear to be a number of "noname" external drives on the market +that do not always power up correctly. We have noticed this with some +drives based on OnSpec and older Freecom adapters. In these rare cases, +the adapter can often be reinitialised by issuing a "printer reset" on +the parallel port. As the reset operation is potentially disruptive in +multiple device environments, the PARIDE drivers will not do it +automatically. You can however, force a printer reset by doing: + + insmod lp reset=1 + rmmod lp + +If you have one of these marginal cases, you should probably build +your paride drivers as modules, and arrange to do the printer reset +before loading the PARIDE drivers. + +3.4 Use the verbose option and dmesg if you need help + +While a lot of testing has gone into these drivers to make them work +as smoothly as possible, problems will arise. If you do have problems, +please check all the obvious things first: does the drive work in +DOS with the manufacturer's drivers ? If that doesn't yield any useful +clues, then please make sure that only one drive is hooked to your system, +and that either (a) PARPORT is enabled or (b) no other device driver +is using your parallel port (check in /proc/ioports). Then, load the +appropriate drivers (you can load several protocol modules if you want) +as in: + + # insmod paride + # insmod epat + # insmod bpck + # insmod kbic + ... + # insmod pd verbose=1 + +(using the correct driver for the type of device you have, of course). +The verbose=1 parameter will cause the drivers to log a trace of their +activity as they attempt to locate your drive. + +Use 'dmesg' to capture a log of all the PARIDE messages (any messages +beginning with paride:, a protocol module's name or a driver's name) and +include that with your bug report. You can submit a bug report in one +of two ways. Either send it directly to the author of the PARIDE suite, +by e-mail to grant@torque.net, or join the linux-parport mailing list +and post your report there. + +3.5 For more information or help + +You can join the linux-parport mailing list by sending a mail message +to + linux-parport-request@torque.net + +with the single word + + subscribe + +in the body of the mail message (not in the subject line). Please be +sure that your mail program is correctly set up when you do this, as +the list manager is a robot that will subscribe you using the reply +address in your mail headers. REMOVE any anti-spam gimmicks you may +have in your mail headers, when sending mail to the list server. + +You might also find some useful information on the linux-parport +web pages (although they are not always up to date) at + + http://web.archive.org/web/*/http://www.torque.net/parport/ + + diff --git a/Documentation/blockdev/ramdisk.txt b/Documentation/blockdev/ramdisk.txt new file mode 100644 index 000000000..501e12e03 --- /dev/null +++ b/Documentation/blockdev/ramdisk.txt @@ -0,0 +1,174 @@ +Using the RAM disk block device with Linux +------------------------------------------ + +Contents: + + 1) Overview + 2) Kernel Command Line Parameters + 3) Using "rdev -r" + 4) An Example of Creating a Compressed RAM Disk + + +1) Overview +----------- + +The RAM disk driver is a way to use main system memory as a block device. It +is required for initrd, an initial filesystem used if you need to load modules +in order to access the root filesystem (see Documentation/admin-guide/initrd.rst). It can +also be used for a temporary filesystem for crypto work, since the contents +are erased on reboot. + +The RAM disk dynamically grows as more space is required. It does this by using +RAM from the buffer cache. The driver marks the buffers it is using as dirty +so that the VM subsystem does not try to reclaim them later. + +The RAM disk supports up to 16 RAM disks by default, and can be reconfigured +to support an unlimited number of RAM disks (at your own risk). Just change +the configuration symbol BLK_DEV_RAM_COUNT in the Block drivers config menu +and (re)build the kernel. + +To use RAM disk support with your system, run './MAKEDEV ram' from the /dev +directory. RAM disks are all major number 1, and start with minor number 0 +for /dev/ram0, etc. If used, modern kernels use /dev/ram0 for an initrd. + +The new RAM disk also has the ability to load compressed RAM disk images, +allowing one to squeeze more programs onto an average installation or +rescue floppy disk. + + +2) Parameters +--------------------------------- + +2a) Kernel Command Line Parameters + + ramdisk_size=N + ============== + +This parameter tells the RAM disk driver to set up RAM disks of N k size. The +default is 4096 (4 MB). + +2b) Module parameters + + rd_nr + ===== + /dev/ramX devices created. + + max_part + ======== + Maximum partition number. + + rd_size + ======= + See ramdisk_size. + +3) Using "rdev -r" +------------------ + +The usage of the word (two bytes) that "rdev -r" sets in the kernel image is +as follows. The low 11 bits (0 -> 10) specify an offset (in 1 k blocks) of up +to 2 MB (2^11) of where to find the RAM disk (this used to be the size). Bit +14 indicates that a RAM disk is to be loaded, and bit 15 indicates whether a +prompt/wait sequence is to be given before trying to read the RAM disk. Since +the RAM disk dynamically grows as data is being written into it, a size field +is not required. Bits 11 to 13 are not currently used and may as well be zero. +These numbers are no magical secrets, as seen below: + +./arch/x86/kernel/setup.c:#define RAMDISK_IMAGE_START_MASK 0x07FF +./arch/x86/kernel/setup.c:#define RAMDISK_PROMPT_FLAG 0x8000 +./arch/x86/kernel/setup.c:#define RAMDISK_LOAD_FLAG 0x4000 + +Consider a typical two floppy disk setup, where you will have the +kernel on disk one, and have already put a RAM disk image onto disk #2. + +Hence you want to set bits 0 to 13 as 0, meaning that your RAM disk +starts at an offset of 0 kB from the beginning of the floppy. +The command line equivalent is: "ramdisk_start=0" + +You want bit 14 as one, indicating that a RAM disk is to be loaded. +The command line equivalent is: "load_ramdisk=1" + +You want bit 15 as one, indicating that you want a prompt/keypress +sequence so that you have a chance to switch floppy disks. +The command line equivalent is: "prompt_ramdisk=1" + +Putting that together gives 2^15 + 2^14 + 0 = 49152 for an rdev word. +So to create disk one of the set, you would do: + + /usr/src/linux# cat arch/x86/boot/zImage > /dev/fd0 + /usr/src/linux# rdev /dev/fd0 /dev/fd0 + /usr/src/linux# rdev -r /dev/fd0 49152 + +If you make a boot disk that has LILO, then for the above, you would use: + append = "ramdisk_start=0 load_ramdisk=1 prompt_ramdisk=1" +Since the default start = 0 and the default prompt = 1, you could use: + append = "load_ramdisk=1" + + +4) An Example of Creating a Compressed RAM Disk +---------------------------------------------- + +To create a RAM disk image, you will need a spare block device to +construct it on. This can be the RAM disk device itself, or an +unused disk partition (such as an unmounted swap partition). For this +example, we will use the RAM disk device, "/dev/ram0". + +Note: This technique should not be done on a machine with less than 8 MB +of RAM. If using a spare disk partition instead of /dev/ram0, then this +restriction does not apply. + +a) Decide on the RAM disk size that you want. Say 2 MB for this example. + Create it by writing to the RAM disk device. (This step is not currently + required, but may be in the future.) It is wise to zero out the + area (esp. for disks) so that maximal compression is achieved for + the unused blocks of the image that you are about to create. + + dd if=/dev/zero of=/dev/ram0 bs=1k count=2048 + +b) Make a filesystem on it. Say ext2fs for this example. + + mke2fs -vm0 /dev/ram0 2048 + +c) Mount it, copy the files you want to it (eg: /etc/* /dev/* ...) + and unmount it again. + +d) Compress the contents of the RAM disk. The level of compression + will be approximately 50% of the space used by the files. Unused + space on the RAM disk will compress to almost nothing. + + dd if=/dev/ram0 bs=1k count=2048 | gzip -v9 > /tmp/ram_image.gz + +e) Put the kernel onto the floppy + + dd if=zImage of=/dev/fd0 bs=1k + +f) Put the RAM disk image onto the floppy, after the kernel. Use an offset + that is slightly larger than the kernel, so that you can put another + (possibly larger) kernel onto the same floppy later without overlapping + the RAM disk image. An offset of 400 kB for kernels about 350 kB in + size would be reasonable. Make sure offset+size of ram_image.gz is + not larger than the total space on your floppy (usually 1440 kB). + + dd if=/tmp/ram_image.gz of=/dev/fd0 bs=1k seek=400 + +g) Use "rdev" to set the boot device, RAM disk offset, prompt flag, etc. + For prompt_ramdisk=1, load_ramdisk=1, ramdisk_start=400, one would + have 2^15 + 2^14 + 400 = 49552. + + rdev /dev/fd0 /dev/fd0 + rdev -r /dev/fd0 49552 + +That is it. You now have your boot/root compressed RAM disk floppy. Some +users may wish to combine steps (d) and (f) by using a pipe. + +-------------------------------------------------------------------------- + Paul Gortmaker 12/95 + +Changelog: +---------- + +10-22-04 : Updated to reflect changes in command line options, remove + obsolete references, general cleanup. + James Nelson (james4765@gmail.com) + + +12-95 : Original Document diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt new file mode 100644 index 000000000..875b2b56b --- /dev/null +++ b/Documentation/blockdev/zram.txt @@ -0,0 +1,271 @@ +zram: Compressed RAM based block devices +---------------------------------------- + +* Introduction + +The zram module creates RAM based block devices named /dev/zram<id> +(<id> = 0, 1, ...). Pages written to these disks are compressed and stored +in memory itself. These disks allow very fast I/O and compression provides +good amounts of memory savings. Some of the usecases include /tmp storage, +use as swap disks, various caches under /var and maybe many more :) + +Statistics for individual zram devices are exported through sysfs nodes at +/sys/block/zram<id>/ + +* Usage + +There are several ways to configure and manage zram device(-s): +a) using zram and zram_control sysfs attributes +b) using zramctl utility, provided by util-linux (util-linux@vger.kernel.org). + +In this document we will describe only 'manual' zram configuration steps, +IOW, zram and zram_control sysfs attributes. + +In order to get a better idea about zramctl please consult util-linux +documentation, zramctl man-page or `zramctl --help'. Please be informed +that zram maintainers do not develop/maintain util-linux or zramctl, should +you have any questions please contact util-linux@vger.kernel.org + +Following shows a typical sequence of steps for using zram. + +WARNING +======= +For the sake of simplicity we skip error checking parts in most of the +examples below. However, it is your sole responsibility to handle errors. + +zram sysfs attributes always return negative values in case of errors. +The list of possible return codes: +-EBUSY -- an attempt to modify an attribute that cannot be changed once +the device has been initialised. Please reset device first; +-ENOMEM -- zram was not able to allocate enough memory to fulfil your +needs; +-EINVAL -- invalid input has been provided. + +If you use 'echo', the returned value that is changed by 'echo' utility, +and, in general case, something like: + + echo 3 > /sys/block/zram0/max_comp_streams + if [ $? -ne 0 ]; + handle_error + fi + +should suffice. + +1) Load Module: + modprobe zram num_devices=4 + This creates 4 devices: /dev/zram{0,1,2,3} + +num_devices parameter is optional and tells zram how many devices should be +pre-created. Default: 1. + +2) Set max number of compression streams +Regardless the value passed to this attribute, ZRAM will always +allocate multiple compression streams - one per online CPUs - thus +allowing several concurrent compression operations. The number of +allocated compression streams goes down when some of the CPUs +become offline. There is no single-compression-stream mode anymore, +unless you are running a UP system or has only 1 CPU online. + +To find out how many streams are currently available: + cat /sys/block/zram0/max_comp_streams + +3) Select compression algorithm +Using comp_algorithm device attribute one can see available and +currently selected (shown in square brackets) compression algorithms, +change selected compression algorithm (once the device is initialised +there is no way to change compression algorithm). + +Examples: + #show supported compression algorithms + cat /sys/block/zram0/comp_algorithm + lzo [lz4] + + #select lzo compression algorithm + echo lzo > /sys/block/zram0/comp_algorithm + +For the time being, the `comp_algorithm' content does not necessarily +show every compression algorithm supported by the kernel. We keep this +list primarily to simplify device configuration and one can configure +a new device with a compression algorithm that is not listed in +`comp_algorithm'. The thing is that, internally, ZRAM uses Crypto API +and, if some of the algorithms were built as modules, it's impossible +to list all of them using, for instance, /proc/crypto or any other +method. This, however, has an advantage of permitting the usage of +custom crypto compression modules (implementing S/W or H/W compression). + +4) Set Disksize +Set disk size by writing the value to sysfs node 'disksize'. +The value can be either in bytes or you can use mem suffixes. +Examples: + # Initialize /dev/zram0 with 50MB disksize + echo $((50*1024*1024)) > /sys/block/zram0/disksize + + # Using mem suffixes + echo 256K > /sys/block/zram0/disksize + echo 512M > /sys/block/zram0/disksize + echo 1G > /sys/block/zram0/disksize + +Note: +There is little point creating a zram of greater than twice the size of memory +since we expect a 2:1 compression ratio. Note that zram uses about 0.1% of the +size of the disk when not in use so a huge zram is wasteful. + +5) Set memory limit: Optional +Set memory limit by writing the value to sysfs node 'mem_limit'. +The value can be either in bytes or you can use mem suffixes. +In addition, you could change the value in runtime. +Examples: + # limit /dev/zram0 with 50MB memory + echo $((50*1024*1024)) > /sys/block/zram0/mem_limit + + # Using mem suffixes + echo 256K > /sys/block/zram0/mem_limit + echo 512M > /sys/block/zram0/mem_limit + echo 1G > /sys/block/zram0/mem_limit + + # To disable memory limit + echo 0 > /sys/block/zram0/mem_limit + +6) Activate: + mkswap /dev/zram0 + swapon /dev/zram0 + + mkfs.ext4 /dev/zram1 + mount /dev/zram1 /tmp + +7) Add/remove zram devices + +zram provides a control interface, which enables dynamic (on-demand) device +addition and removal. + +In order to add a new /dev/zramX device, perform read operation on hot_add +attribute. This will return either new device's device id (meaning that you +can use /dev/zram<id>) or error code. + +Example: + cat /sys/class/zram-control/hot_add + 1 + +To remove the existing /dev/zramX device (where X is a device id) +execute + echo X > /sys/class/zram-control/hot_remove + +8) Stats: +Per-device statistics are exported as various nodes under /sys/block/zram<id>/ + +A brief description of exported device attributes. For more details please +read Documentation/ABI/testing/sysfs-block-zram. + +Name access description +---- ------ ----------- +disksize RW show and set the device's disk size +initstate RO shows the initialization state of the device +reset WO trigger device reset +mem_used_max WO reset the `mem_used_max' counter (see later) +mem_limit WO specifies the maximum amount of memory ZRAM can use + to store the compressed data +max_comp_streams RW the number of possible concurrent compress operations +comp_algorithm RW show and change the compression algorithm +compact WO trigger memory compaction +debug_stat RO this file is used for zram debugging purposes +backing_dev RW set up backend storage for zram to write out + + +User space is advised to use the following files to read the device statistics. + +File /sys/block/zram<id>/stat + +Represents block layer statistics. Read Documentation/block/stat.txt for +details. + +File /sys/block/zram<id>/io_stat + +The stat file represents device's I/O statistics not accounted by block +layer and, thus, not available in zram<id>/stat file. It consists of a +single line of text and contains the following stats separated by +whitespace: + failed_reads the number of failed reads + failed_writes the number of failed writes + invalid_io the number of non-page-size-aligned I/O requests + notify_free Depending on device usage scenario it may account + a) the number of pages freed because of swap slot free + notifications or b) the number of pages freed because of + REQ_DISCARD requests sent by bio. The former ones are + sent to a swap block device when a swap slot is freed, + which implies that this disk is being used as a swap disk. + The latter ones are sent by filesystem mounted with + discard option, whenever some data blocks are getting + discarded. + +File /sys/block/zram<id>/mm_stat + +The stat file represents device's mm statistics. It consists of a single +line of text and contains the following stats separated by whitespace: + orig_data_size uncompressed size of data stored in this disk. + This excludes same-element-filled pages (same_pages) since + no memory is allocated for them. + Unit: bytes + compr_data_size compressed size of data stored in this disk + mem_used_total the amount of memory allocated for this disk. This + includes allocator fragmentation and metadata overhead, + allocated for this disk. So, allocator space efficiency + can be calculated using compr_data_size and this statistic. + Unit: bytes + mem_limit the maximum amount of memory ZRAM can use to store + the compressed data + mem_used_max the maximum amount of memory zram have consumed to + store the data + same_pages the number of same element filled pages written to this disk. + No memory is allocated for such pages. + pages_compacted the number of pages freed during compaction + huge_pages the number of incompressible pages + +9) Deactivate: + swapoff /dev/zram0 + umount /dev/zram1 + +10) Reset: + Write any positive value to 'reset' sysfs node + echo 1 > /sys/block/zram0/reset + echo 1 > /sys/block/zram1/reset + + This frees all the memory allocated for the given device and + resets the disksize to zero. You must set the disksize again + before reusing the device. + +* Optional Feature + += writeback + +With incompressible pages, there is no memory saving with zram. +Instead, with CONFIG_ZRAM_WRITEBACK, zram can write incompressible page +to backing storage rather than keeping it in memory. +User should set up backing device via /sys/block/zramX/backing_dev +before disksize setting. + += memory tracking + +With CONFIG_ZRAM_MEMORY_TRACKING, user can know information of the +zram block. It could be useful to catch cold or incompressible +pages of the process with*pagemap. +If you enable the feature, you could see block state via +/sys/kernel/debug/zram/zram0/block_state". The output is as follows, + + 300 75.033841 .wh + 301 63.806904 s.. + 302 63.806919 ..h + +First column is zram's block index. +Second column is access time since the system was booted +Third column is state of the block. +(s: same page +w: written page to backing store +h: huge page) + +First line of above example says 300th block is accessed at 75.033841sec +and the block's state is huge so it is written back to the backing +storage. It's a debugging feature so anyone shouldn't rely on it to work +properly. + +Nitin Gupta +ngupta@vflare.org |