summaryrefslogtreecommitdiffstats
path: root/Documentation/admin-guide/device-mapper/dm-integrity.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/admin-guide/device-mapper/dm-integrity.rst')
-rw-r--r--Documentation/admin-guide/device-mapper/dm-integrity.rst303
1 files changed, 303 insertions, 0 deletions
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
new file mode 100644
index 0000000000..d8a5f14d0e
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -0,0 +1,303 @@
+============
+dm-integrity
+============
+
+The dm-integrity target emulates a block device that has additional
+per-sector tags that can be used for storing integrity information.
+
+A general problem with storing integrity tags with every sector is that
+writing the sector and the integrity tag must be atomic - i.e. in case of
+crash, either both sector and integrity tag or none of them is written.
+
+To guarantee write atomicity, the dm-integrity target uses journal, it
+writes sector data and integrity tags into a journal, commits the journal
+and then copies the data and integrity tags to their respective location.
+
+The dm-integrity target can be used with the dm-crypt target - in this
+situation the dm-crypt target creates the integrity data and passes them
+to the dm-integrity target via bio_integrity_payload attached to the bio.
+In this mode, the dm-crypt and dm-integrity targets provide authenticated
+disk encryption - if the attacker modifies the encrypted device, an I/O
+error is returned instead of random data.
+
+The dm-integrity target can also be used as a standalone target, in this
+mode it calculates and verifies the integrity tag internally. In this
+mode, the dm-integrity target can be used to detect silent data
+corruption on the disk or in the I/O path.
+
+There's an alternate mode of operation where dm-integrity uses a bitmap
+instead of a journal. If a bit in the bitmap is 1, the corresponding
+region's data and integrity tags are not synchronized - if the machine
+crashes, the unsynchronized regions will be recalculated. The bitmap mode
+is faster than the journal mode, because we don't have to write the data
+twice, but it is also less reliable, because if data corruption happens
+when the machine crashes, it may not be detected.
+
+When loading the target for the first time, the kernel driver will format
+the device. But it will only format the device if the superblock contains
+zeroes. If the superblock is neither valid nor zeroed, the dm-integrity
+target can't be loaded.
+
+Accesses to the on-disk metadata area containing checksums (aka tags) are
+buffered using dm-bufio. When an access to any given metadata area
+occurs, each unique metadata area gets its own buffer(s). The buffer size
+is capped at the size of the metadata area, but may be smaller, thereby
+requiring multiple buffers to represent the full metadata area. A smaller
+buffer size will produce a smaller resulting read/write operation to the
+metadata area for small reads/writes. The metadata is still read even in
+a full write to the data covered by a single buffer.
+
+To use the target for the first time:
+
+1. overwrite the superblock with zeroes
+2. load the dm-integrity target with one-sector size, the kernel driver
+ will format the device
+3. unload the dm-integrity target
+4. read the "provided_data_sectors" value from the superblock
+5. load the dm-integrity target with the target size
+ "provided_data_sectors"
+6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
+ with the size "provided_data_sectors"
+
+
+Target arguments:
+
+1. the underlying block device
+
+2. the number of reserved sector at the beginning of the device - the
+ dm-integrity won't read of write these sectors
+
+3. the size of the integrity tag (if "-" is used, the size is taken from
+ the internal-hash algorithm)
+
+4. mode:
+
+ D - direct writes (without journal)
+ in this mode, journaling is
+ not used and data sectors and integrity tags are written
+ separately. In case of crash, it is possible that the data
+ and integrity tag doesn't match.
+ J - journaled writes
+ data and integrity tags are written to the
+ journal and atomicity is guaranteed. In case of crash,
+ either both data and tag or none of them are written. The
+ journaled mode degrades write throughput twice because the
+ data have to be written twice.
+ B - bitmap mode - data and metadata are written without any
+ synchronization, the driver maintains a bitmap of dirty
+ regions where data and metadata don't match. This mode can
+ only be used with internal hash.
+ R - recovery mode - in this mode, journal is not replayed,
+ checksums are not checked and writes to the device are not
+ allowed. This mode is useful for data recovery if the
+ device cannot be activated in any of the other standard
+ modes.
+
+5. the number of additional arguments
+
+Additional arguments:
+
+journal_sectors:number
+ The size of journal, this argument is used only if formatting the
+ device. If the device is already formatted, the value from the
+ superblock is used.
+
+interleave_sectors:number (default 32768)
+ The number of interleaved sectors. This values is rounded down to
+ a power of two. If the device is already formatted, the value from
+ the superblock is used.
+
+meta_device:device
+ Don't interleave the data and metadata on the device. Use a
+ separate device for metadata.
+
+buffer_sectors:number (default 128)
+ The number of sectors in one metadata buffer. The value is rounded
+ down to a power of two.
+
+journal_watermark:number (default 50)
+ The journal watermark in percents. When the size of the journal
+ exceeds this watermark, the thread that flushes the journal will
+ be started.
+
+commit_time:number (default 10000)
+ Commit time in milliseconds. When this time passes, the journal is
+ written. The journal is also written immediately if the FLUSH
+ request is received.
+
+internal_hash:algorithm(:key) (the key is optional)
+ Use internal hash or crc.
+ When this argument is used, the dm-integrity target won't accept
+ integrity tags from the upper target, but it will automatically
+ generate and verify the integrity tags.
+
+ You can use a crc algorithm (such as crc32), then integrity target
+ will protect the data against accidental corruption.
+ You can also use a hmac algorithm (for example
+ "hmac(sha256):0123456789abcdef"), in this mode it will provide
+ cryptographic authentication of the data without encryption.
+
+ When this argument is not used, the integrity tags are accepted
+ from an upper layer target, such as dm-crypt. The upper layer
+ target should check the validity of the integrity tags.
+
+recalculate
+ Recalculate the integrity tags automatically. It is only valid
+ when using internal hash.
+
+journal_crypt:algorithm(:key) (the key is optional)
+ Encrypt the journal using given algorithm to make sure that the
+ attacker can't read the journal. You can use a block cipher here
+ (such as "cbc(aes)") or a stream cipher (for example "chacha20"
+ or "ctr(aes)").
+
+ The journal contains history of last writes to the block device,
+ an attacker reading the journal could see the last sector numbers
+ that were written. From the sector numbers, the attacker can infer
+ the size of files that were written. To protect against this
+ situation, you can encrypt the journal.
+
+journal_mac:algorithm(:key) (the key is optional)
+ Protect sector numbers in the journal from accidental or malicious
+ modification. To protect against accidental modification, use a
+ crc algorithm, to protect against malicious modification, use a
+ hmac algorithm with a key.
+
+ This option is not needed when using internal-hash because in this
+ mode, the integrity of journal entries is checked when replaying
+ the journal. Thus, modified sector number would be detected at
+ this stage.
+
+block_size:number (default 512)
+ The size of a data block in bytes. The larger the block size the
+ less overhead there is for per-block integrity metadata.
+ Supported values are 512, 1024, 2048 and 4096 bytes.
+
+sectors_per_bit:number
+ In the bitmap mode, this parameter specifies the number of
+ 512-byte sectors that corresponds to one bitmap bit.
+
+bitmap_flush_interval:number
+ The bitmap flush interval in milliseconds. The metadata buffers
+ are synchronized when this interval expires.
+
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
+
+fix_padding
+ Use a smaller padding of the tag area that is more
+ space-efficient. If this option is not present, large padding is
+ used - that is for compatibility with older kernels.
+
+fix_hmac
+ Improve security of internal_hash and journal_mac:
+
+ - the section number is mixed to the mac, so that an attacker can't
+ copy sectors from one journal section to another journal section
+ - the superblock is protected by journal_mac
+ - a 16-byte salt stored in the superblock is mixed to the mac, so
+ that the attacker can't detect that two disks have the same hmac
+ key and also to disallow the attacker to move sectors from one
+ disk to another
+
+legacy_recalculate
+ Allow recalculating of volumes with HMAC keys. This is disabled by
+ default for security reasons - an attacker could modify the volume,
+ set recalc_sector to zero, and the kernel would not detect the
+ modification.
+
+The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
+allow_discards can be changed when reloading the target (load an inactive
+table and swap the tables with suspend and resume). The other arguments
+should not be changed when reloading the target because the layout of disk
+data depend on them and the reloaded target would be non-functional.
+
+For example, on a device using the default interleave_sectors of 32768, a
+block_size of 512, and an internal_hash of crc32c with a tag size of 4
+bytes, it will take 128 KiB of tags to track a full data area, requiring
+256 sectors of metadata per data area. With the default buffer_sectors of
+128, that means there will be 2 buffers per metadata area, or 2 buffers
+per 16 MiB of data.
+
+Status line:
+
+1. the number of integrity mismatches
+2. provided data sectors - that is the number of sectors that the user
+ could use
+3. the current recalculating position (or '-' if we didn't recalculate)
+
+
+The layout of the formatted block device:
+
+* reserved sectors
+ (they are not used by this target, they can be used for
+ storing LUKS metadata or for other purpose), the size of the reserved
+ area is specified in the target arguments
+
+* superblock (4kiB)
+ * magic string - identifies that the device was formatted
+ * version
+ * log2(interleave sectors)
+ * integrity tag size
+ * the number of journal sections
+ * provided data sectors - the number of sectors that this target
+ provides (i.e. the size of the device minus the size of all
+ metadata and padding). The user of this target should not send
+ bios that access data beyond the "provided data sectors" limit.
+ * flags
+ SB_FLAG_HAVE_JOURNAL_MAC
+ - a flag is set if journal_mac is used
+ SB_FLAG_RECALCULATING
+ - recalculating is in progress
+ SB_FLAG_DIRTY_BITMAP
+ - journal area contains the bitmap of dirty
+ blocks
+ * log2(sectors per block)
+ * a position where recalculating finished
+* journal
+ The journal is divided into sections, each section contains:
+
+ * metadata area (4kiB), it contains journal entries
+
+ - every journal entry contains:
+
+ * logical sector (specifies where the data and tag should
+ be written)
+ * last 8 bytes of data
+ * integrity tag (the size is specified in the superblock)
+
+ - every metadata sector ends with
+
+ * mac (8-bytes), all the macs in 8 metadata sectors form a
+ 64-byte value. It is used to store hmac of sector
+ numbers in the journal section, to protect against a
+ possibility that the attacker tampers with sector
+ numbers in the journal.
+ * commit id
+
+ * data area (the size is variable; it depends on how many journal
+ entries fit into the metadata area)
+
+ - every sector in the data area contains:
+
+ * data (504 bytes of data, the last 8 bytes are stored in
+ the journal entry)
+ * commit id
+
+ To test if the whole journal section was written correctly, every
+ 512-byte sector of the journal ends with 8-byte commit id. If the
+ commit id matches on all sectors in a journal section, then it is
+ assumed that the section was written correctly. If the commit id
+ doesn't match, the section was written partially and it should not
+ be replayed.
+
+* one or more runs of interleaved tags and data.
+ Each run contains:
+
+ * tag area - it contains integrity tags. There is one tag for each
+ sector in the data area. The size of this area is always 4KiB or
+ greater.
+ * data area - it contains data sectors. The number of data sectors
+ in one run must be a power of two. log2 of this value is stored
+ in the superblock.