summaryrefslogtreecommitdiffstats
path: root/block
Commit message (Collapse)AuthorAgeFilesLines
* block/parallels: Fix buffer-based write callHanna Reitz2022-07-261-2/+2
| | | | | | | | | | | | | | | | | | | Commit a4072543ccdddbd241d5962d9237b8b41fd006bf has changed the I/O here from working on a local one-element I/O vector to just using the buffer directly (using the bdrv_co_pread()/bdrv_co_pwrite() helper functions introduced shortly before). However, it only changed the bdrv_co_preadv() call to bdrv_co_pread() - the subsequent bdrv_co_pwritev() call stayed this way, and so still expects a QEMUIOVector pointer instead of a plain buffer. We must change that to be a bdrv_co_pwrite() call. Fixes: a4072543ccdddbd241d5962d ("block/parallels: use buffer-based io") Signed-off-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220714132801.72464-2-hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block: Remove remaining unused symbols in coroutines.hAlberto Faria2022-07-122-22/+3Star
| | | | | | | | | | Some can be made static, others are unused generated_co_wrappers. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-19-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Reorganize some declarations in block-backend-io.hAlberto Faria2022-07-121-0/+22
| | | | | | | | | | | | | | Keep generated_co_wrapper and coroutine_fn pairs together. This should make it clear that each I/O function has these two versions. Also move blk_co_{pread,pwrite}()'s implementations out of the header file for consistency. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220705161527.1054072-18-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add blk_co_truncate()Alberto Faria2022-07-121-3/+4
| | | | | | | | | | Also convert blk_truncate() into a generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-17-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add blk_co_ioctl()Alberto Faria2022-07-122-9/+4Star
| | | | | | | | | | Also convert blk_ioctl() into a generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-16-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Implement blk_flush() using generated_co_wrapperAlberto Faria2022-07-122-13/+0Star
| | | | | | | | Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-15-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Implement blk_pdiscard() using generated_co_wrapperAlberto Faria2022-07-122-15/+0Star
| | | | | | | | Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-14-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Implement blk_pwrite_zeroes() using generated_co_wrapperAlberto Faria2022-07-121-8/+0Star
| | | | | | | | Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-13-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add blk_co_pwrite_compressed()Alberto Faria2022-07-121-4/+4
| | | | | | | | | | Also convert blk_pwrite_compressed() into a generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-12-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Change blk_pwrite_compressed() param orderAlberto Faria2022-07-121-2/+2
| | | | | | | | | | Swap 'buf' and 'bytes' around for consistency with other I/O functions. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-11-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Export blk_pwritev_part() in block-backend-io.hAlberto Faria2022-07-122-19/+0Star
| | | | | | | | | | Also convert it into a generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-10-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add blk_[co_]preadv_part()Alberto Faria2022-07-122-12/+23
| | | | | | | | | | Implement blk_preadv_part() using generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-9-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Implement blk_{pread,pwrite}() using generated_co_wrapperAlberto Faria2022-07-123-27/+1Star
| | | | | | | | | | We need to add include/sysemu/block-backend-io.h to the inputs of the block-gen.c target defined in block/meson.build. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-7-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Make 'bytes' param of blk_{pread,pwrite}() an int64_tAlberto Faria2022-07-121-3/+3
| | | | | | | | | | | For consistency with other I/O functions, and in preparation to implement them using generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-5-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Change blk_{pread,pwrite}() param orderAlberto Faria2022-07-1212-41/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | Swap 'buf' and 'bytes' around for consistency with blk_co_{pread,pwrite}(), and in preparation to implement these functions using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression blk, offset, buf, bytes, flags; @@ - blk_pread(blk, offset, buf, bytes, flags) + blk_pread(blk, offset, bytes, buf, flags) @@ expression blk, offset, buf, bytes, flags; @@ - blk_pwrite(blk, offset, buf, bytes, flags) + blk_pwrite(blk, offset, bytes, buf, flags) It had no effect on hw/block/nand.c, presumably due to the #if, so that file was updated manually. Overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-4-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add a 'flags' param to blk_pread()Alberto Faria2022-07-123-4/+5
| | | | | | | | | | | | | | | | | | | | | | | For consistency with other I/O functions, and in preparation to implement it using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression blk, offset, buf, bytes; @@ - blk_pread(blk, offset, buf, bytes) + blk_pread(blk, offset, buf, bytes, 0) It had no effect on hw/block/nand.c, presumably due to the #if, so that file was updated manually. Overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220705161527.1054072-3-afaria@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Make blk_{pread,pwrite}() return 0 on successAlberto Faria2022-07-122-8/+5Star
| | | | | | | | | | | | | They currently return the value of their 'bytes' parameter on success. Make them return 0 instead, for consistency with other I/O functions and in preparation to implement them using generated_co_wrapper. This also makes it clear that short reads/writes are not possible. Signed-off-by: Alberto Faria <afaria@redhat.com> Message-Id: <20220705161527.1054072-2-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block/qcow2: Use bdrv_pwrite_sync() in qcow2_mark_dirty()Alberto Faria2022-07-121-6/+3Star
| | | | | | | | | | | | Use bdrv_pwrite_sync() instead of calling bdrv_pwrite() and bdrv_flush() separately. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-11-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Use bdrv_co_pwrite_sync() when caller is coroutine_fnAlberto Faria2022-07-123-6/+6
| | | | | | | | | | | | Convert uses of bdrv_pwrite_sync() into bdrv_co_pwrite_sync() when the callers are already coroutine_fn. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-10-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add bdrv_co_pwrite_sync()Alberto Faria2022-07-121-4/+5
| | | | | | | | | | | | Also convert bdrv_pwrite_sync() to being implemented using generated_co_wrapper. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-9-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Implement bdrv_{pread,pwrite,pwrite_zeroes}() using generated_co_wrapperAlberto Faria2022-07-121-41/+0Star
| | | | | | | | | | | | | | | | bdrv_{pread,pwrite}() now return -EIO instead of -EINVAL when 'bytes' is negative, making them consistent with bdrv_{preadv,pwritev}() and bdrv_co_{pread,pwrite,preadv,pwritev}(). bdrv_pwrite_zeroes() now also calls trace_bdrv_co_pwrite_zeroes() and clears the BDRV_REQ_MAY_UNMAP flag when appropriate, which it didn't previously. Signed-off-by: Alberto Faria <afaria@redhat.com> Message-Id: <20220609152744.3891847-8-afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Make 'bytes' param of bdrv_co_{pread,pwrite,preadv,pwritev}() an int64_tAlberto Faria2022-07-121-2/+2
| | | | | | | | | | | | | | | | | For consistency with other I/O functions, and in preparation to implement bdrv_{pread,pwrite}() using generated_co_wrapper. unsigned int fits in int64_t, so all callers remain correct. bdrv_check_request32() is called further down the stack and causes -EIO to be returned if 'bytes' is negative or greater than BDRV_REQUEST_MAX_BYTES, which in turns never exceeds SIZE_MAX. Signed-off-by: Alberto Faria <afaria@redhat.com> Message-Id: <20220609152744.3891847-7-afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* crypto: Make block callbacks return 0 on successAlberto Faria2022-07-122-37/+37
| | | | | | | | | | | | | They currently return the value of their headerlen/buflen parameter on success. Returning 0 instead makes it clear that short reads/writes are not possible. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-5-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Make bdrv_{pread,pwrite}() return 0 on successAlberto Faria2022-07-129-29/+17Star
| | | | | | | | | | | | | | | | | | They currently return the value of their 'bytes' parameter on success. Make them return 0 instead, for consistency with other I/O functions and in preparation to implement them using generated_co_wrapper. This also makes it clear that short reads/writes are not possible. The few callers that rely on the previous behavior are adjusted accordingly by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220609152744.3891847-4-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Change bdrv_{pread,pwrite,pwrite_sync}() param orderAlberto Faria2022-07-1222-230/+233
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Swap 'buf' and 'bytes' around for consistency with bdrv_co_{pread,pwrite}(), and in preparation to implement these functions using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pread(child, offset, buf, bytes, flags) + bdrv_pread(child, offset, bytes, buf, flags) @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pwrite(child, offset, buf, bytes, flags) + bdrv_pwrite(child, offset, bytes, buf, flags) @@ expression child, offset, buf, bytes, flags; @@ - bdrv_pwrite_sync(child, offset, buf, bytes, flags) + bdrv_pwrite_sync(child, offset, bytes, buf, flags) Resulting overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220609152744.3891847-3-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block: Add a 'flags' param to bdrv_{pread,pwrite,pwrite_sync}()Alberto Faria2022-07-1222-212/+211Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | For consistency with other I/O functions, and in preparation to implement them using generated_co_wrapper. Callers were updated using this Coccinelle script: @@ expression child, offset, buf, bytes; @@ - bdrv_pread(child, offset, buf, bytes) + bdrv_pread(child, offset, buf, bytes, 0) @@ expression child, offset, buf, bytes; @@ - bdrv_pwrite(child, offset, buf, bytes) + bdrv_pwrite(child, offset, buf, bytes, 0) @@ expression child, offset, buf, bytes; @@ - bdrv_pwrite_sync(child, offset, buf, bytes) + bdrv_pwrite_sync(child, offset, buf, bytes, 0) Resulting overly-long lines were then fixed by hand. Signed-off-by: Alberto Faria <afaria@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20220609152744.3891847-2-afaria@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
* block/io_uring: clarify that short reads can happenStefan Hajnoczi2022-07-071-6/+2Star
| | | | | | | | | | | | | | | | | | | | Jens Axboe has confirmed that short reads are rare but can happen: https://lore.kernel.org/io-uring/YsU%2FCGkl9ZXUI+Tj@stefanha-x1.localdomain/T/#m729963dc577d709b709c191922e98ec79d7eef54 The luring_resubmit_short_read() comment claimed they were only due to a specific io_uring bug that was fixed in Linux commit 9d93a3f5a0c ("io_uring: punt short reads to async context"), which is wrong. Dominique Martinet found that a btrfs bug also causes short reads. There may be more kernel code paths that result in short reads. Let's consider short reads fair game. Cc: Dominique Martinet <dominique.martinet@atmark-techno.com> Based-on: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20220706080341.1206476-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* io_uring: fix short read slow pathDominique Martinet2022-07-071-2/+2
| | | | | | | | | | | | | | | | | | | sqeq.off here is the offset to read within the disk image, so obviously not 'nread' (the amount we just read), but as the author meant to write its current value incremented by the amount we just read. Normally recent versions of linux will not issue short reads, but it can happen so we should fix this. This lead to weird image corruptions when short read happened Fixes: 6663a0a33764 ("block/io_uring: implements interfaces for io_uring") Link: https://lkml.kernel.org/r/YrrFGO4A1jS0GI0G@atmark-techno.com Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com> Message-Id: <20220630010137.2518851-1-dominique.martinet@atmark-techno.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* block: use 'unsigned' for in_flight field on driver stateDenis V. Lunev2022-06-292-2/+2
| | | | | | | | | | | | | | | This patch makes in_flight field 'unsigned' for BDRVNBDState and MirrorBlockJob. This matches the definition of this field on BDS and is generically correct - we should never get negative value here. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: John Snow <jsnow@redhat.com> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Kevin Wolf <kwolf@redhat.com> CC: Hanna Reitz <hreitz@redhat.com> CC: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* nbd: trace long NBD operationsDenis V. Lunev2022-06-292-1/+7
| | | | | | | | | | | | | | | | | | | At the moment there are 2 sources of lengthy operations if configured: * open connection, which could retry inside and * reconnect of already opened connection These operations could be quite lengthy and cumbersome to catch thus it would be quite natural to add trace points for them. This patch is based on the original downstream work made by Vladimir. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Eric Blake <eblake@redhat.com> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> CC: Kevin Wolf <kwolf@redhat.com> CC: Hanna Reitz <hreitz@redhat.com> CC: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block/copy-before-write: implement cbw-timeout optionVladimir Sementsov-Ogievskiy2022-06-291-1/+21
| | | | | | | | | | | | | | | | | | | | | | In some scenarios, when copy-before-write operations lasts too long time, it's better to cancel it. Most useful would be to use the new option together with on-cbw-error=break-snapshot: this way if cbw operation takes too long time we'll just cancel backup process but do not disturb the guest too much. Note the tricky point of realization: we keep additional point in bs->in_flight during block_copy operation even if it's timed-out. Background "cancelled" block_copy operations will finish at some point and will want to access state. We should care to not free the state in .bdrv_close() earlier. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> [vsementsov: use bdrv_inc_in_flight()/bdrv_dec_in_flight() instead of direct manipulation on bs->in_flight] Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block/block-copy: block_copy(): add timeout_ns parameterVladimir Sementsov-Ogievskiy2022-06-292-8/+27
| | | | | | | | | | | | | | Add possibility to limit block_copy() call in time. To be used in the next commit. As timed-out block_copy() call will continue in background anyway (we can't immediately cancel IO operation), it's important also give user a possibility to pass a callback, to do some additional actions on block-copy call finish. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block/copy-before-write: add on-cbw-error open parameterVladimir Sementsov-Ogievskiy2022-06-281-2/+30
| | | | | | | | | | | | | | | | | | | Currently, behavior on copy-before-write operation failure is simple: report error to the guest. Let's implement alternative behavior: break the whole copy-before-write process (and corresponding backup job or NBD client) but keep guest working. It's needed if we consider guest stability as more important. The realisation is simple: on copy-before-write failure we set s->snapshot_ret and continue guest operations. s->snapshot_ret being set will lead to all further snapshot API requests. Note that all in-flight snapshot-API requests may still success: we do wait for them on BREAK_SNAPSHOT-failure path in cbw_do_copy_before_write(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* block/copy-before-write: refactor option parsingVladimir Sementsov-Ogievskiy2022-06-281-27/+29
| | | | | | | | | We are going to add one more option of enum type. Let's refactor option parsing so that we can simply work with BlockdevOptionsCbw object. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
* vduse-blk: Add name optionXie Yongji2022-06-241-2/+2
| | | | | | | | | | | | | | | | | Currently we use 'id' option as the name of VDUSE device. It's a bit confusing since we use one value for two different purposes: the ID to identfy the export within QEMU (must be distinct from any other exports in the same QEMU process, but can overlap with names used by other processes), and the VDUSE name to uniquely identify it on the host (must be distinct from other VDUSE devices on the same host, but can overlap with other export types like NBD in the same process). To make it clear, this patch adds a separate 'name' option to specify the VDUSE name for the vduse-blk export instead. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220614051532.92-7-xieyongji@bytedance.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* vduse-blk: Add serial optionXie Yongji2022-06-243-8/+18
| | | | | | | | | | Add a 'serial' option to allow user to specify this value explicitly. And the default value is changed to an empty string as what we did in "hw/block/virtio-blk.c". Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220614051532.92-6-xieyongji@bytedance.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: Drop dead code spotted by CoverityEric Blake2022-06-241-6/+2Star
| | | | | | | | | | | | | CID 1488362 points out that the second 'rc >= 0' check is now dead code. Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: 172f5f1a40(nbd: remove peppering of nbd_client_connected) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20220516210519.76135-1-eblake@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/gluster: correctly set max_pdiscardFabian Ebner2022-06-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | On 64-bit platforms, assigning SIZE_MAX to the int64_t max_pdiscard results in a negative value, and the following assertion would trigger down the line (it's not the same max_pdiscard, but computed from the other one): qemu-system-x86_64: ../block/io.c:3166: bdrv_co_pdiscard: Assertion `max_pdiscard >= bs->bl.request_alignment' failed. On 32-bit platforms, it's fine to keep using SIZE_MAX. The assertion in qemu_gluster_co_pdiscard() is checking that the value of 'bytes' can safely be passed to glfs_discard_async(), which takes a size_t for the argument in question, so it is kept as is. And since max_pdiscard is still <= SIZE_MAX, relying on max_pdiscard is still fine. Fixes: 0c8022876f ("block: use int64_t instead of int in driver discard handlers") Cc: qemu-stable@nongnu.org Signed-off-by: Fabian Ebner <f.ebner@proxmox.com> Message-Id: <20220520075922.43972-1-f.ebner@proxmox.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/rbd: report a better error when namespace does not existStefano Garzarella2022-06-241-0/+24
| | | | | | | | | | | | | | | | | | | | | | | If the namespace does not exist, rbd_create() fails with -ENOENT and QEMU reports a generic "error rbd create: No such file or directory": $ qemu-img create rbd:rbd/namespace/image 1M Formatting 'rbd:rbd/namespace/image', fmt=raw size=1048576 qemu-img: rbd:rbd/namespace/image: error rbd create: No such file or directory Unfortunately rados_ioctx_set_namespace() does not fail if the namespace does not exist, so let's use rbd_namespace_exists() in qemu_rbd_connect() to check if the namespace exists, reporting a more understandable error: $ qemu-img create rbd:rbd/namespace/image 1M Formatting 'rbd:rbd/namespace/image', fmt=raw size=1048576 qemu-img: rbd:rbd/namespace/image: namespace 'namespace' does not exist Reported-by: Tingting Mao <timao@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-Id: <20220517071012.6120-1-sgarzare@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* libvduse: Add support for reconnectingXie Yongji2022-06-241-1/+18
| | | | | | | | | | | | To support reconnecting after restart or crash, VDUSE backend might need to resubmit inflight I/Os. This stores the metadata such as the index of inflight I/O's descriptors to a shm file so that VDUSE backend can restore them during reconnecting. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220523084611.91-9-xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* vduse-blk: Add vduse-blk resize supportXie Yongji2022-06-241-0/+20
| | | | | | | | | | | To support block resize, this uses vduse_dev_update_config() to update the capacity field in configuration space and inject config interrupt on the block resize callback. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220523084611.91-8-xieyongji@bytedance.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* vduse-blk: Implement vduse-blk exportXie Yongji2022-06-244-0/+360
| | | | | | | | | | | | | | | | | | | | | | | | | This implements a VDUSE block backends based on the libvduse library. We can use it to export the BDSs for both VM and container (host) usage. The new command-line syntax is: $ qemu-storage-daemon \ --blockdev file,node-name=drive0,filename=test.img \ --export vduse-blk,node-name=drive0,id=vduse-export0,writable=on After the qemu-storage-daemon started, we need to use the "vdpa" command to attach the device to vDPA bus: $ vdpa dev add name vduse-export0 mgmtdev vduse Also the device must be removed via the "vdpa" command before we stop the qemu-storage-daemon. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220523084611.91-7-xieyongji@bytedance.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Abstract out the logic of virtio-blk I/O processXie Yongji2022-06-244-239/+299
| | | | | | | | | | Abstract the common logic of virtio-blk I/O process to a function named virtio_blk_process_req(). It's needed for the following commit. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220523084611.91-4-xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Fix incorrect length passed to vu_queue_push()Xie Yongji2022-06-241-3/+2Star
| | | | | | | | | | | Now the req->size is set to the correct value only when handling VIRTIO_BLK_T_GET_ID request. This patch fixes it. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Message-Id: <20220523084611.91-3-xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: Support passing NULL ops to blk_set_dev_ops()Xie Yongji2022-06-241-1/+1
| | | | | | | | | | This supports passing NULL ops to blk_set_dev_ops() so that we can remove stale ops in some cases. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220523084611.91-2-xieyongji@bytedance.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: simplify handling of try to merge different sized bitmapsVladimir Sementsov-Ogievskiy2022-06-242-19/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | We have too much logic to simply check that bitmaps are of the same size. Let's just define that hbitmap_merge() and bdrv_dirty_bitmap_merge_internal() require their argument bitmaps be of same size, this simplifies things. Let's look through the callers: For backup_init_bcs_bitmap() we already assert that merge can't fail. In bdrv_reclaim_dirty_bitmap_locked() we gracefully handle the error that can't happen: successor always has same size as its parent, drop this logic. In bdrv_merge_dirty_bitmap() we already has assertion and separate check. Make the check explicit and improve error message. Signed-off-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220517111206.23585-4-v.sementsov-og@mail.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: improve block_dirty_bitmap_merge(): don't allocate extra bitmapVladimir Sementsov-Ogievskiy2022-06-241-20/+21
| | | | | | | | | | | | | | | We don't need extra bitmap. All we need is to backup the original bitmap when we do first merge. So, drop extra temporary bitmap and work directly with target and backup. Still to keep old semantics, that on failure target is unchanged and user don't need to restore, we need a local_backup variable and do restore ourselves on failure path. Signed-off-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Message-Id: <20220517111206.23585-3-v.sementsov-og@mail.ru> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: block_dirty_bitmap_merge(): fix error pathVladimir Sementsov-Ogievskiy2022-06-241-1/+4
| | | | | | | | | | | At the end we ignore failure of bdrv_merge_dirty_bitmap() and report success. And still set errp. That's wrong. Signed-off-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Reviewed-by: Nikita Lapshin <nikita.lapshin@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220517111206.23585-2-v.sementsov-og@mail.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: get rid of blk->guest_block_sizeStefan Hajnoczi2022-06-242-11/+0Star
| | | | | | | | | | | | | | | | | | | | | | Commit 1b7fd729559c ("block: rename buffer_alignment to guest_block_size") noted: At this point, the field is set by the device emulation, but completely ignored by the block layer. The last time the value of buffer_alignment/guest_block_size was actually used was before commit 339064d50639 ("block: Don't use guest sector size for qemu_blockalign()"). This value has not been used since 2013. Get rid of it. Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220518130945.2657905-1-stefanha@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block: drop unused bdrv_co_drain() APIStefan Hajnoczi2022-06-241-15/+0Star
| | | | | | | | | | | | | | | | | bdrv_co_drain() has not been used since commit 9a0cec664eef ("mirror: use bdrv_drained_begin/bdrv_drained_end") in 2016. Remove it so there are fewer drain scenarios to worry about. Use bdrv_drained_begin()/bdrv_drained_end() instead. They are "mixed" functions that can be called from coroutine context. Unlike bdrv_co_drain(), these functions provide control of the length of the drained section, which is usually the right thing. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220521122714.3837731-1-stefanha@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>