bwlp/qemu.git/block/nbd-client.c, branch spice_video_codecs

block/nbd: merge nbd-client.* to nbd.c

2019-06-13T14:55:09+00:00

No reason for keeping driver handlers realization separate from driver structure. We can get rid of extra header file. While being here, fix comments style, restore forgotten comments for NBD_FOREACH_REPLY_CHUNK and nbd_reply_chunk_iter_receive, remove extra includes. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20190611102720.86114-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake

block/nbd-client: drop stale logout

2019-06-13T14:35:53+00:00

Drop one on failure path (we have errp) and turn two others into trace points. Signed-off-by: Vladimir Sementsov-Ogievskiy Message-Id: <20190611102720.86114-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake Signed-off-by: Eric Blake

nbd/client: Trace server noncompliance on structured reads

2019-04-01T13:58:04+00:00

Just as we recently added a trace for a server sending block status that doesn't match the server's advertised minimum block alignment, let's do the same for read chunks. But since qemu 3.1 is such a server (because it advertised 512-byte alignment, but when serving a file that ends in data but is not sector-aligned, NBD_CMD_READ would detect a mid-sector change between data and hole at EOF and the resulting read chunks are unaligned), we don't want to change our behavior of otherwise tolerating unaligned reads. Note that even though we fixed the server for 4.0 to advertise an actual block alignment (which gets rid of the unaligned reads at EOF for posix files), we can still trigger it via other means: $ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file Arguably, that is a bug in the blkdebug block status function, for leaking a block status that is not aligned. It may also be possible to observe issues with a backing layer with smaller alignment than the active layer, although so far I have been unable to write a reliable iotest for that scenario. Signed-off-by: Eric Blake Message-Id: <20190330165349.32256-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd/client: Support qemu-img convert from unaligned size

2019-04-01T13:32:44+00:00

If an NBD server advertises a size that is not a multiple of a sector, the block layer rounds up that size, even though we set info.size to the exact byte value sent by the server. The block layer then proceeds to let us read or query block status on the hole that it added past EOF, which the NBD server is unlikely to be happy with. Fortunately, qemu as a server never advertizes an unaligned size, so we generally don't run into this problem; but the nbdkit server makes it easy to test: $ printf %1000d 1 > f1 $ ~/nbdkit/nbdkit -fv file f1 & pid=$! $ qemu-img convert -f raw nbd://localhost:10809 f2 $ kill $pid $ qemu-img compare f1 f2 Pre-patch, the server attempts a 1024-byte read, which nbdkit rightfully rejects as going beyond its advertised 1000 byte size; the conversion fails and the output files differ (not even the first sector is copied, because qemu-img does not follow ddrescue's habit of trying smaller reads to get as much information as possible in spite of errors). Post-patch, the client's attempts to read (and query block status, for new enough nbdkit) are properly truncated to the server's length, with sane handling of the hole the block layer forced on us. Although f2 ends up as a larger file (1024 bytes instead of 1000), qemu-img compare shows the two images to have identical contents for display to the guest. I didn't add iotests coverage since I didn't want to add a dependency on nbdkit in iotests. I also did NOT patch write, trim, or write zeroes - these commands continue to fail (usually with ENOSPC, but whatever the server chose), because we really can't write to the end of the file, and because 'qemu-img convert' is the most common case where we care about being tolerant (which is read-only). Perhaps we could truncate the request if the client is writing zeros to the tail, but that seems like more work, especially if the block layer is fixed in 4.1 to track byte-accurate sizing (in which case this patch would be reverted as unnecessary). Signed-off-by: Eric Blake Message-Id: <20190329042750.14704-5-eblake@redhat.com> Tested-by: Richard W.M. Jones

nbd/client: Report offsets in bdrv_block_status

2019-03-31T01:52:29+00:00

It is desirable for 'qemu-img map' to have the same output for a file whether it is served over file or nbd protocols. However, ever since we implemented block status for NBD (2.12), the NBD protocol forgot to inform the block layer that as the final layer in the chain, the offset is valid; without an offset, the human-readable form of qemu-img map gives up with the unhelpful: $ nbdkit -U - data data="1" size=512 --run 'qemu-img map $nbd' Offset Length Mapped to File qemu-img: File contains external, encrypted or compressed clusters. The --output=json form always works, because it is reporting the lower-level bdrv_block_status results directly rather than trying to filter out sparse ranges for human consumption - but now it also shows the offset member. With this patch, the human output changes to: Offset Length Mapped to File 0 0x200 0 nbd+unix://?socket=/tmp/nbdkitOxeoLa/socket This change is observable to several iotests. Fixes: 78a33ab5 Reported-by: Richard W.M. Jones Signed-off-by: Eric Blake Message-Id: <20190329042750.14704-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd-client: Work around server BLOCK_STATUS misalignment at EOF

2019-03-30T15:06:08+00:00

The NBD spec is clear that a server that advertises a minimum block size should reply to NBD_CMD_BLOCK_STATUS with extents aligned accordingly. However, we know that the qemu NBD server implementation has had a corner-case bug where it is not compliant with the spec, present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12 (and unlikely to be patched in time for 4.0). Namely, when qemu is serving a file that is not a multiple of 512 bytes, it rounds the size advertised over NBD up to the next sector boundary (someday, I'd like to fix that to be byte-accurate, but it's a much bigger audit not appropriate for this release); yet if the final sector contains data prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole mid-sector which qemu then reported over NBD. We are well within our rights to hang up on a server that can't follow the spec, but it is more useful to try and keep the connection alive in spite of the problem. Do so by tracing a message about the problem, and then either truncating the request back to an aligned boundary (if it covered more than the final sector) or widening it out to the full boundary with a forced status of data (since truncating would result in 0 bytes, but we have to make progress, and valid since data is a default-safe answer). And in practice, since the problem only happens on a sector that starts with data and ends with a hole, we are going to want to read that full sector anyway (where qemu as the server fills in the tail beyond EOF with appropriate NUL bytes). Easy reproduction: $ printf %1000d 1 > file $ qemu-nbd -f raw -t file & pid=$! $ qemu-img map --output=json -f raw nbd://localhost:10809 qemu-img: Could not read file metadata: Invalid argument $ kill $pid where the patched version instead succeeds with: [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}] Signed-off-by: Eric Blake Message-Id: <20190326171317.4036-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd: Permit simple error to NBD_CMD_BLOCK_STATUS

2019-03-30T15:06:08+00:00

The NBD spec is clear that when structured replies are active, a simple error reply is acceptable to any command except for NBD_CMD_READ. However, we were mistakenly requiring structured errors for NBD_CMD_BLOCK_STATUS, and hanging up on a server that gave a simple error (since qemu does not behave as such a server, we didn't notice the problem until now). Broken since its introduction in commit 78a33ab5 (v2.12). Noticed while debugging a separate failure reported by nbdkit while working out its initial implementation of BLOCK_STATUS, although it turns out that nbdkit also chose to send structured error replies for BLOCK_STATUS, so I had to manually provoke the situation by hacking qemu's server to send a simple error reply: | diff --git i/nbd/server.c w/nbd/server.c | index fd013a2817a..833288d7c45 100644 | 00--- i/nbd/server.c | +++ w/nbd/server.c | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, | "discard failed", errp); | | case NBD_CMD_BLOCK_STATUS: | + return nbd_co_send_simple_reply(client, request->handle, ENOMEM, | + NULL, 0, errp); | if (!request->len) { | return nbd_send_generic_reply(client, request->handle, -EINVAL, | "need non-zero length", errp); | Signed-off-by: Eric Blake Acked-by: Richard W.M. Jones Message-Id: <20190325190104.30213-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUS

2019-03-30T15:06:08+00:00

When the server replies with a (structured [*]) error to NBD_CMD_BLOCK_STATUS, without any extent information sent first, the client code was blindly throwing away the server's error code and instead telling the caller that EIO occurred. This has been broken since its introduction in 78a33ab5 (v2.12, where we should have called: error_setg(&local_err, "Server did not reply with any status extents"); nbd_iter_error(&iter, false, -EIO, &local_err); to declare the situation as a non-fatal error if no earlier error had already been flagged, rather than just blindly slamming iter.err and iter.ret), although it is more noticeable since commit 7f86068d, which actually tries hard to preserve the server's code thanks to a separate iter.request_ret. [*] The spec is clear that the server is also permitted to reply with a simple error, but that's a separate fix. I was able to provoke this scenario with a hack to the server, then seeing whether ENOMEM makes it back to the caller: | diff --git a/nbd/server.c b/nbd/server.c | index fd013a2817a..29c7995de02 100644 | --- a/nbd/server.c | +++ b/nbd/server.c | @@ -2269,6 +2269,8 @@ static coroutine_fn int nbd_handle_request(NBDClient *client, | "discard failed", errp); | | case NBD_CMD_BLOCK_STATUS: | + return nbd_send_generic_reply(client, request->handle, -ENOMEM, | + "no status for you today", errp); | if (!request->len) { | return nbd_send_generic_reply(client, request->handle, -EINVAL, | "need non-zero length", errp); | -- Signed-off-by: Eric Blake Message-Id: <20190325190104.30213-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS

2019-03-30T15:06:08+00:00

The NBD spec states that NBD_CMD_FLAG_REQ_ONE (which we currently always use) should not reply with an extent larger than our request, and that the server's response should be exactly one extent. Right now, that means that if a server sends more than one extent, we treat the server as broken, fail the block status request, and disconnect, which prevents all further use of the block device. But while good software should be strict in what it sends, it should be tolerant in what it receives. While trying to implement NBD_CMD_BLOCK_STATUS in nbdkit, we temporarily had a non-compliant server sending too many extents in spite of REQ_ONE. Oddly enough, 'qemu-img convert' with qemu 3.1 failed with a somewhat useful message: qemu-img: Protocol error: invalid payload for NBD_REPLY_TYPE_BLOCK_STATUS which then disappeared with commit d8b4bad8, on the grounds that an error message flagged only at the time of coroutine teardown is pointless, and instead we should rely on the actual failed API to report an error - in other words, the 3.1 behavior was masking the fact that qemu-img was not reporting an error. That has since been fixed in the previous patch, where qemu-img convert now fails with: qemu-img: error while reading block status of sector 0: Invalid argument But even that is harsh. Since we already partially relaxed things in commit acfd8f7a to tolerate a server that exceeds the cap (although that change was made prior to the NBD spec actually putting a cap on the extent length during REQ_ONE - in fact, the NBD spec change was BECAUSE of the qemu behavior prior to that commit), it's not that much harder to argue that we should also tolerate a server that sends too many extents. But at the same time, it's nice to trace when we are being tolerant of server non-compliance, in order to help server writers fix their implementations to be more portable (if they refer to our traces, rather than just stderr). Reported-by: Richard W.M. Jones Signed-off-by: Eric Blake Message-Id: <20190323212639.579-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy

nbd: Increase bs->in_flight during AioContext switch

2019-02-25T14:03:19+00:00

bdrv_drain() must not leave connection_co scheduled, so bs->in_flight needs to be increased while the coroutine is waiting to be scheduled in the new AioContext after nbd_client_attach_aio_context(). Signed-off-by: Kevin Wolf