Adding upstream version 6.9.11.upstream/6.9.11

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-08-07 13:16:07 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-08-07 13:16:07 +0000
commit: ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6 (patch)
tree: eb37b8a1f852d4b842862b002c1da338ee11d9d2 /fs/btrfs/ordered-data.c
parent: Adding upstream version 6.9.10. (diff)
download: linux-ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6.tar.xz
linux-ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6.zip
1 files changed, 31 insertions, 0 deletions
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index c2a42bcde9..7dbf4162c7 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -382,6 +382,37 @@ bool btrfs_finish_ordered_extent(struct btrfs_ordered_extent *ordered,
 	ret = can_finish_ordered_extent(ordered, page, file_offset, len, uptodate);
 	spin_unlock_irqrestore(&inode->ordered_tree_lock, flags);
 
+	/*
+	 * If this is a COW write it means we created new extent maps for the
+	 * range and they point to unwritten locations if we got an error either
+	 * before submitting a bio or during IO.
+	 *
+	 * We have marked the ordered extent with BTRFS_ORDERED_IOERR, and we
+	 * are queuing its completion below. During completion, at
+	 * btrfs_finish_one_ordered(), we will drop the extent maps for the
+	 * unwritten extents.
+	 *
+	 * However because completion runs in a work queue we can end up having
+	 * a fast fsync running before that. In the case of direct IO, once we
+	 * unlock the inode the fsync might start, and we queue the completion
+	 * before unlocking the inode. In the case of buffered IO when writeback
+	 * finishes (end_bbio_data_write()) we queue the completion, so if the
+	 * writeback was triggered by a fast fsync, the fsync might start
+	 * logging before ordered extent completion runs in the work queue.
+	 *
+	 * The fast fsync will log file extent items based on the extent maps it
+	 * finds, so if by the time it collects extent maps the ordered extent
+	 * completion didn't happen yet, it will log file extent items that
+	 * point to unwritten extents, resulting in a corruption if a crash
+	 * happens and the log tree is replayed. Note that a fast fsync does not
+	 * wait for completion of ordered extents in order to reduce latency.
+	 *
+	 * Set a flag in the inode so that the next fast fsync will wait for
+	 * ordered extents to complete before starting to log.
+	 */
+	if (!uptodate && !test_bit(BTRFS_ORDERED_NOCOW, &ordered->flags))
+		set_bit(BTRFS_INODE_COW_WRITE_ERROR, &inode->runtime_flags);
+
 	if (ret)
 		btrfs_queue_ordered_fn(ordered);
 	return ret;
author	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-08-07 13:16:07 +0000
committer	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-08-07 13:16:07 +0000
commit	ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6 (patch)
tree	eb37b8a1f852d4b842862b002c1da338ee11d9d2 /fs/btrfs/ordered-data.c
parent	Adding upstream version 6.9.10. (diff)
download	linux-ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6.tar.xz linux-ff146e0175fa9b27a6838c64c76f1e3d4b4cc6a6.zip