drbd: always write bitmap on detach

If we detach due to a local read error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lose the "out-of-sync" (or "bad block") information in the bitmap.

Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block: the next resync will fetch it from the peer and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks.

If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our superblock, if it was a READ error; writes are covered by the activity log already.

If that superblock does not make it to disk either, we are sorry.

Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all. In that case we don't need to re-fetch from the peer, and no "auto-healing" is necessary.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
author: Lars Ellenberg 2012-09-27 15:18:21 +0200
committer: Philipp Reisner 2012-11-09 14:11:41 +0100
commit: edc9f5eb7afa3d832f540fcfe10e3e1087e6f527 (patch)
tree: eba63d771575a42a6aa81bd55a59f7d6253d18ea /drivers/block/drbd/drbd_main.c
parent: drbd: wait for meta data IO completion even with failed disk, unless force-de... (diff)
download: kernel-qcow2-linux-edc9f5eb7afa3d832f540fcfe10e3e1087e6f527.tar.gz
kernel-qcow2-linux-edc9f5eb7afa3d832f540fcfe10e3e1087e6f527.tar.xz
kernel-qcow2-linux-edc9f5eb7afa3d832f540fcfe10e3e1087e6f527.zip
1 files changed, 20 insertions, 0 deletions
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 5e5a6abb2819..0f73e157dee0 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -3226,6 +3226,26 @@ static int w_go_diskless(struct drbd_work *w, int unused)
 	 * inc/dec it frequently. Once we are D_DISKLESS, no one will touch
 	 * the protected members anymore, though, so once put_ldev reaches zero
 	 * again, it will be safe to free them. */
+
+	/* Try to write changed bitmap pages, read errors may have just
+	 * set some bits outside the area covered by the activity log.
+	 *
+	 * If we have an IO error during the bitmap writeout,
+	 * we will want a full sync next time, just in case.
+	 * (Do we want a specific meta data flag for this?)
+	 *
+	 * If that does not make it to stable storage either,
+	 * we cannot do anything about that anymore.  */
+	if (mdev->bitmap) {
+		if (drbd_bitmap_io_from_worker(mdev, drbd_bm_write,
+					"detach", BM_LOCKED_MASK)) {
+			if (test_bit(WAS_READ_ERROR, &mdev->flags)) {
+				drbd_md_set_flag(mdev, MDF_FULL_SYNC);
+				drbd_md_sync(mdev);
+			}
+		}
+	}
+
 	drbd_force_state(mdev, NS(disk, D_DISKLESS));
 	return 0;
 }
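
The fallback chain described in the commit message (bitmap writeout, then the full-sync flag, then giving up) can be modelled in user space. The following is only a sketch for reasoning about the outcomes, not the kernel code: `struct model_dev`, `enum detach_outcome`, and all `model_*` functions are hypothetical names invented for this illustration, and the two `*_fails` fields are assumed knobs that simulate IO errors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the device state; NOT the
 * kernel's struct. Each field models one fact the patch depends on. */
struct model_dev {
	bool has_bitmap;          /* models mdev->bitmap != NULL */
	bool was_read_error;      /* models the WAS_READ_ERROR flag bit */
	bool bitmap_write_fails;  /* assumption: simulate failed bitmap IO */
	bool md_sync_fails;       /* assumption: simulate failed superblock IO */
	bool bitmap_on_disk;      /* changed bitmap pages reached the disk */
	bool full_sync_flag_mem;  /* models MDF_FULL_SYNC set in memory */
	bool full_sync_flag_disk; /* the flag reached stable storage */
	bool diskless;            /* models disk state D_DISKLESS */
};

enum detach_outcome {
	DETACH_NO_BITMAP,          /* nothing to write out */
	DETACH_BITMAP_SAVED,       /* bad-block bits survive a re-attach */
	DETACH_COVERED_BY_AL,      /* write error: activity log covers it */
	DETACH_FULL_SYNC_RECORDED, /* next attach should do a full sync */
	DETACH_INFO_LOST           /* neither bitmap nor flag reached disk */
};

/* Returns 0 on success, nonzero on a simulated IO error. */
static int model_bitmap_write(struct model_dev *dev)
{
	if (dev->bitmap_write_fails)
		return -1;
	dev->bitmap_on_disk = true;
	return 0;
}

/* Persist the in-memory flag, unless the superblock write fails too. */
static void model_md_sync(struct model_dev *dev)
{
	if (!dev->md_sync_fails)
		dev->full_sync_flag_disk = dev->full_sync_flag_mem;
}

/* Mirrors the order of checks in the patch, then goes diskless. */
enum detach_outcome model_go_diskless(struct model_dev *dev)
{
	enum detach_outcome out = DETACH_NO_BITMAP;

	assert(dev != NULL);
	if (dev->has_bitmap) {
		if (model_bitmap_write(dev) == 0) {
			out = DETACH_BITMAP_SAVED;
		} else if (!dev->was_read_error) {
			out = DETACH_COVERED_BY_AL;
		} else {
			dev->full_sync_flag_mem = true;
			model_md_sync(dev);
			out = dev->full_sync_flag_disk
				? DETACH_FULL_SYNC_RECORDED
				: DETACH_INFO_LOST;
		}
	}
	dev->diskless = true; /* the device goes diskless in every case */
	return out;
}
```

In this model, the device ends up diskless on every path. Only the combination of a read error, a failed bitmap writeout, and a failed superblock write yields `DETACH_INFO_LOST`, which corresponds to the "we are sorry" case in the commit message.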