summaryrefslogtreecommitdiffstats
path: root/qemu-nbd.c
Commit message (Collapse)AuthorAgeFilesLines
* nbd/server: Add --selinux-label optionRichard W.M. Jones2021-11-161-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under SELinux, Unix domain sockets have two labels. One is on the disk and can be set with commands such as chcon(1). There is a different label stored in memory (called the process label). This can only be set by the process creating the socket. When using SELinux + SVirt and wanting qemu to be able to connect to a qemu-nbd instance, you must set both labels correctly first. For qemu-nbd the options to set the second label are awkward. You can create the socket in a wrapper program and then exec into qemu-nbd. Or you could try something with LD_PRELOAD. This commit adds the ability to set the label straightforwardly on the command line, via the new --selinux-label flag. (The name of the flag is the same as the equivalent nbdkit option.) A worked example showing how to use the new option can be found in this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1984938 Signed-off-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [eblake: rebase to configure changes, reject --selinux-label if it is not compiled in or not used on a Unix socket] Note that we may relax some of these restrictions at a later date, such as making it possible to label a TCP socket, although it may be smarter to do so as a generic QMP action rather than more one-off command lines in qemu-nbd. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20211115202944.615966-1-eblake@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> [eblake: adjust meson output as suggested by thuth] Signed-off-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Change default cache mode to writebackNir Soffer2021-09-291-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both qemu and qemu-img use writeback cache mode by default, which is already documented in qemu(1). qemu-nbd uses writethrough cache mode by default, and the default cache mode is not documented. According to the qemu-nbd(8): --cache=CACHE The cache mode to be used with the file. See the documentation of the emulator's -drive cache=... option for allowed values. qemu(1) says: The default mode is cache=writeback. So users have no reason to assume that qemu-nbd is using writethough cache mode. The only hint is the painfully slow writing when using the defaults. Looking in git history, it seems that qemu used writethrough in the past to support broken guests that did not flush data properly, or could not flush due to limitations in qemu. But qemu-nbd clients can use NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone. Change the default cache mode to writback, and document the default and available values properly in the online help and manual. With this change converting image via qemu-nbd is 3.5 times faster. $ qemu-img create dst.img 50g $ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img Before this change: $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock" Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock Time (mean ± σ): 83.639 s ± 5.970 s [User: 2.733 s, System: 6.112 s] Range (min … max): 76.749 s … 87.245 s 3 runs After this change: $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock" Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock Time (mean ± σ): 23.522 s ± 0.433 s [User: 2.083 s, System: 5.475 s] Range (min … max): 23.234 s … 24.019 s 3 runs Users can avoid the issue by using --cache=writeback[1] but the defaults should give good performance for the common use case. [1] https://bugzilla.redhat.com/1990656 Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-Id: <20210813205519.50518-1-nsoffer@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com>
* error: Use error_fatal to simplify obvious fatal errors (again)Markus Armbruster2021-08-261-4/+1Star
| | | | | | | | | | | | | | | | | | | | | | We did this with scripts/coccinelle/use-error_fatal.cocci before, in commit 50beeb68094 and 007b06578ab. This commit cleans up rarer variations that don't seem worth matching with Coccinelle. Cc: Thomas Huth <thuth@redhat.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
* qemu-nbd: Use qcrypto_tls_creds_check_endpoint()Philippe Mathieu-Daudé2021-06-291-12/+7Star
| | | | | | | | | | Avoid accessing QCryptoTLSCreds internals by using the qcrypto_tls_creds_check_endpoint() helper. Tested-by: Akihiko Odaki <akihiko.odaki@gmail.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
* qemu-nbd: Use user_creatable_process_cmdline() for --objectKevin Wolf2021-03-191-31/+3Star
| | | | | | | | | | | | | This switches qemu-nbd from a QemuOpts-based parser for --object to user_creatable_process_cmdline() which uses a keyval parser and enforces the QAPI schema. Apart from being a cleanup, this makes non-scalar properties accessible. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Permit --shared=0 for unlimited clientsEric Blake2021-02-121-3/+3
| | | | | | | | | This gives us better feature parity with QMP nbd-server-start, where max-connections defaults to 0 for unlimited. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20210209152759.209074-3-eblake@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
* qemu-nbd: Use SOMAXCONN for socket listen() backlogEric Blake2021-02-121-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our default of a backlog of 1 connection is rather puny; it gets in the way when we are explicitly allowing multiple clients (such as qemu-nbd -e N [--shared], or nbd-server-start with its default "max-connections":0 for unlimited), but is even a problem when we stick to qemu-nbd's default of only 1 active client but use -t [--persistent] where a second client can start using the server once the first finishes. While the effects are less noticeable on TCP sockets (since the client can poll() to learn when the server is ready again), it is definitely observable on Unix sockets, where on Linux, a client will fail with EAGAIN and no recourse but to sleep an arbitrary amount of time before retrying if the server backlog is already full. Since QMP nbd-server-start is always persistent, it now always requests a backlog of SOMAXCONN; meanwhile, qemu-nbd will request SOMAXCONN if persistent, otherwise its backlog should be based on the expected number of clients. See https://bugzilla.redhat.com/1925045 for a demonstration of where our low backlog prevents libnbd from connecting as many parallel clients as it wants. Reported-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> CC: qemu-stable@nongnu.org Message-Id: <20210209152759.209074-2-eblake@redhat.com> Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* block: move blk_exp_close_all() to qemu_cleanup()Sergio Lopez2021-02-021-0/+1
| | | | | | | | | | | | | | Move blk_exp_close_all() from bdrv_close() to qemu_cleanup(), before bdrv_drain_all_begin(). Export drivers may have coroutines yielding at some point in the block layer, so we need to shut them down before draining the block layer, as otherwise they may get stuck blk_wait_while_drained(). RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1900505 Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210201125032.44713-3-slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qemu-nbd: Fix a memleak in nbd_client_thread()Alex Chen2021-01-201-23/+17Star
| | | | | | | | | | | | | | When the qio_channel_socket_connect_sync() fails we should goto 'out_socket' label to free the 'sioc' instead of goto 'out' label. In addition, there's a lot of redundant code in the successful branch and the error branch, optimize it. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201208134944.27962-1-alex.chen@huawei.com>
* qemu-nbd: Fix a memleak in qemu_nbd_client_list()Alex Chen2021-01-201-1/+1
| | | | | | | | | | | When the qio_channel_socket_connect_sync() fails we should goto 'out' label to free the 'sioc' instead of return. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Message-Id: <20201130123651.17543-1-alex.chen@huawei.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* trace: remove argument from trace_init_filePaolo Bonzini2020-11-111-4/+2Star
| | | | | | | | | | It is not needed, all the callers are just saving what was retrieved from -trace and trace_init_file can retrieve it on its own. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20201102115841.4017692-1-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* nbd: Add 'qemu-nbd -A' to expose allocation depthEric Blake2020-10-301-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Allow the server to expose an additional metacontext to be requested by savvy clients. qemu-nbd adds a new option -A to expose the qemu:allocation-depth metacontext through NBD_CMD_BLOCK_STATUS; this can also be set via QMP when using block-export-add. qemu as client is hacked into viewing the key aspects of this new context by abusing the already-experimental x-dirty-bitmap option to collapse all depths greater than 2, which results in a tri-state value visible in the output of 'qemu-img map --output=json' (yes, that means x-dirty-bitmap is now a bit of a misnomer, but I didn't feel like renaming it as it would introduce a needless break of back-compat, even though we make no compat guarantees with x- members): unallocated (depth 0) => "zero":false, "data":true local (depth 1) => "zero":false, "data":false backing (depth 2+) => "zero":true, "data":true libnbd as client is probably a nicer way to get at the information without having to decipher such hacks in qemu as client. ;) Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-11-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
* nbd: Update qapi to support exporting multiple bitmapsEric Blake2020-10-301-9/+9
| | | | | | | | | | | | | | | | | | | Since 'block-export-add' is new to 5.2, we can still tweak the interface; there, allowing 'bitmaps':['str'] is nicer than 'bitmap':'str'. This wires up the qapi and qemu-nbd changes to permit passing multiple bitmaps as distinct metadata contexts that the NBD client may request, but the actual support for more than one will require a further patch to the server. Note that there are no changes made to the existing deprecated 'nbd-server-add' command; this required splitting the QAPI type BlockExportOptionsNbd, which fortunately does not affect QMP introspection. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20201027050556.269064-5-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com>
* block: move block exports to libblockdevStefan Hajnoczi2020-10-231-13/+8Star
| | | | | | | | | | | | | | | | | | | | | | | | Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd. They are not used by other programs and are not otherwise needed in libblock. Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss. Since bdrv_close_all() (libblock) calls blk_exp_close_all() (libblockdev) a stub function is required.. Make qemu-nbd.c use signal handling utility functions instead of duplicating the code. This helps because os-posix.c is in libblockdev and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks. Once we use the signal handling utility functions we also end up providing the necessary symbol. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200929125516.186715-4-stefanha@redhat.com [Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* qemu-nbd: Honor SIGINT and SIGHUPEric Blake2020-10-091-7/+8
| | | | | | | | | | | | | | | | Honoring just SIGTERM on Linux is too weak; we also want to handle other common signals, and do so even on BSD. Why? Because at least 'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on bitmaps when the server is shut down via a signal. See also: http://bugzilla.redhat.com/1883608 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200930121105.667049-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: apply comment tweak suggested by Vladimir; fix ifdef around termsig_handler] Signed-off-by: Eric Blake <eblake@redhat.com>
* block/export: Move writable to BlockExportOptionsKevin Wolf2020-10-021-2/+2
| | | | | | | | | | | | The 'writable' option is a basic option that will probably be applicable to most if not all export types that we will implement. Move it from NBD to the generic BlockExport layer. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-26-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Add 'id' option to block-export-addKevin Wolf2020-10-021-0/+1
| | | | | | | | | | | | | | | | | | We'll need an id to identify block exports in monitor commands. This adds one. Note that this is different from the 'name' option in the NBD server, which is the externally visible export name. While block export ids need to be unique in the whole process, export names must be unique only for the same server. Different export types or (potentially in the future) multiple NBD servers can have the same export name externally, but still need different block export ids internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-19-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Add blk_exp_close_all(_type)Kevin Wolf2020-10-021-1/+1
| | | | | | | | | | | | | | | | | This adds a function to shut down all block exports, and another one to shut down the block exports of a single type. The latter is used for now when stopping the NBD server. As soon as we implement support for multiple NBD servers, we'll need a per-server list of exports and it will be replaced by a function using that. As a side effect, the BlockExport layer has a list tracking all existing exports now. closed_exports loses its only user and can go away. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-18-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Add node-name to BlockExportOptionsKevin Wolf2020-10-021-1/+1
| | | | | | | | | | | | | | | | | | Every block export needs a block node to export, so add a 'node-name' option to BlockExportOptions and remove the replaced option 'device' from BlockExportOptionsNbd. To maintain compatibility in nbd-server-add, BlockExportOptionsNbd needs to be wrapped by a new type NbdServerAddOptions that adds 'device' back because nbd-server-add doesn't use the BlockExportOptions base type at all (so even without changing it to a 'node-name' option in block-export-add, this compatibility code would be necessary). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-16-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qemu-nbd: Use blk_exp_add() to create the exportKevin Wolf2020-10-021-6/+22
| | | | | | | | | | | | | | | | With this change, NBD exports are now only created through the BlockExport interface. This allows us finally to move things from the NBD layer to the BlockExport layer if they make sense for other export types, too. blk_exp_add() returns only a weak reference, so the explicit nbd_export_put() goes away. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-12-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* nbd: Remove NBDExport.close callbackKevin Wolf2020-10-021-10/+4Star
| | | | | | | | | | | | | | | | | The export close callback is unused by the built-in NBD server. qemu-nbd uses it only during shutdown to wait for the unrefed export to actually go away. It can just use nbd_export_close_all() instead and do without the callback. This removes the close callback from nbd_export_new() and makes both callers of it more similar. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-11-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* block/export: Remove magic from block-export-addKevin Wolf2020-10-021-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | nbd-server-add tries to be convenient and adds two questionable features that we don't want to share in block-export-add, even for NBD exports: 1. When requesting a writable export of a read-only device, the export is silently downgraded to read-only. This should be an error in the context of block-export-add. 2. When using a BlockBackend name, unplugging the device from the guest will automatically stop the NBD server, too. This may sometimes be what you want, but it could also be very surprising. Let's keep things explicit with block-export-add. If the user wants to stop the export, they should tell us so. Move these things into the nbd-server-add QMP command handler so that they apply only there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200924152717.287415-8-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qemu-nbd: Use raw block driver for --offsetKevin Wolf2020-10-021-15/+12Star
| | | | | | | | | | | | | | | | | | | Instead of implementing qemu-nbd --offset in the NBD code, just put a raw block node with the requested offset on top of the user image and rely on that doing the job. This does not only simplify the nbd_export_new() interface and bring it closer to the set of options that the nbd-server-add QMP command offers, but in fact it also eliminates a potential source for bugs in the NBD code which previously had to add the offset manually in all relevant places. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200924152717.287415-7-kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* qemu/atomic.h: rename atomic_ to qatomic_Stefan Hajnoczi2020-09-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | clang's C11 atomic_fetch_*() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int *' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic\(64\)\?_[a-z0-9_]\+' include/qemu/atomic.h | \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
* nbd: disable signals and forking on Windows buildsDaniel P. Berrangé2020-09-021-0/+5
| | | | | | | | | | Disabling these parts are sufficient to get the qemu-nbd program compiling in a Windows build. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200825103850.119911-4-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* nbd: skip SIGTERM handler if NBD device support is not builtDaniel P. Berrangé2020-09-021-1/+4
| | | | | | | | | | | | The termsig_handler function is used by the client thread handling the host NBD device connection to do a graceful shutdown. IOW, if we have disabled NBD device support at compile time, we don't need the SIGTERM handler. This fixes a build issue for Windows. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200825103850.119911-3-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* block: add missing socket_init() calls to toolsDaniel P. Berrangé2020-09-021-0/+1
| | | | | | | | | | Any tool that uses sockets needs to call socket_init() in order to work on the Windows platform. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20200825103850.119911-2-berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* error: Use error_reportf_err() where appropriateMarkus Armbruster2020-05-271-4/+3Star
| | | | | | | | | | | | | | | | | Replace error_report("...: %s", ..., error_get_pretty(err)); by error_reportf_err(err, "...: ", ...); One of the replaced messages lacked a colon. Add it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200505101908.6207-6-armbru@redhat.com>
* qemu-nbd: Close inherited stderrRaphael Pour2020-05-181-1/+5
| | | | | | | | | | | | | Close inherited stderr of the parent if fork_process is false. Otherwise no one will close it. (introduced by e6df58a5) This only affected 'qemu-nbd -c /dev/nbd0'. Signed-off-by: Raphael Pour <raphael.pour@hetzner.com> Message-Id: <d8ddc993-9816-836e-a3de-c6edab9d9c49@hetzner.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: Enhance commit message] Signed-off-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Removed deprecated --partition optionEric Blake2020-02-061-131/+2Star
| | | | | | | | | The option was deprecated in 4.0.0 (commit 0ae2d546); it's now been long enough with no complaints to follow through with that process. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200123164650.1741798-3-eblake@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
* qemu-nbd: adds option for aio enginesAarushi Mehta2020-01-301-8/+4Star
| | | | | | | | | | Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com> Acked-by: Eric Blake <eblake@redhat.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20200120141858.587874-14-stefanha@redhat.com Message-Id: <20200120141858.587874-14-stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* nbd: Don't send oversize stringsEric Blake2019-11-181-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Qemu as server currently won't accept export names larger than 256 bytes, nor create dirty bitmap names longer than 1023 bytes, so most uses of qemu as client or server have no reason to get anywhere near the NBD spec maximum of a 4k limit per string. However, we weren't actually enforcing things, ignoring when the remote side violates the protocol on input, and also having several code paths where we send oversize strings on output (for example, qemu-nbd --description could easily send more than 4k). Tighten things up as follows: client: - Perform bounds check on export name and dirty bitmap request prior to handing it to server - Validate that copied server replies are not too long (ignoring NBD_INFO_* replies that are not copied is not too bad) server: - Perform bounds check on export name and description prior to advertising it to client - Reject client name or metadata query that is too long - Adjust things to allow full 4k name limit rather than previous 256 byte limit Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20191114024635.11363-4-eblake@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
* qemu-nbd: Support help options for --objectKevin Wolf2019-10-141-1/+8
| | | | | | | | | Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-nbd, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
* nbd: Prepare for NBD_CMD_FLAG_FAST_ZEROEric Blake2019-09-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fe0480d6 and friends added BDRV_REQ_NO_FALLBACK as a way to avoid wasting time on a preliminary write-zero request that will later be rewritten by actual data, if it is known that the write-zero request will use a slow fallback; but in doing so, could not optimize for NBD. The NBD specification is now considering an extension that will allow passing on those semantics; this patch updates the new protocol bits and 'qemu-nbd --list' output to recognize the bit, as well as the new errno value possible when using the new flag; while upcoming patches will improve the client to use the feature when present, and the server to advertise support for it. The NBD spec recommends (but not requires) that ENOTSUP be avoided for all but failures of a fast zero (the only time it is mandatory to avoid an ENOTSUP failure is when fast zero is supported but not requested during write zeroes; the questionable use is for ENOTSUP to other actions like a normal write request). However, clients that get an unexpected ENOTSUP will either already be treating it the same as EINVAL, or may appreciate the extra bit of information. We were equally loose for returning EOVERFLOW in more situations than recommended by the spec, so if it turns out to be a problem in practice, a later patch can tighten handling for both error codes. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190823143726.27062-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: tweak commit message, also handle EOPNOTSUPP]
* nbd: Improve per-export flag handling in serverEric Blake2019-09-051-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating a read-only image, we are still advertising support for TRIM and WRITE_ZEROES to the client, even though the client should not be issuing those commands. But seeing this requires looking across multiple functions: All callers to nbd_export_new() passed a single flag based solely on whether the export allows writes. Later, we then pass a constant set of flags to nbd_negotiate_options() (namely, the set of flags which we always support, at least for writable images), which is then further dynamically modified with NBD_FLAG_SEND_DF based on client requests for structured options. Finally, when processing NBD_OPT_EXPORT_NAME or NBD_OPT_EXPORT_GO we bitwise-or the original caller's flag with the runtime set of flags we've built up over several functions. Let's refactor things to instead compute a baseline of flags as soon as possible which gets shared between multiple clients, in nbd_export_new(), and changing the signature for the callers to pass in a simpler bool rather than having to figure out flags. We can then get rid of the 'myflags' parameter to various functions, and instead refer to client for everything we need (we still have to perform a bitwise-OR for NBD_FLAG_SEND_DF during NBD_OPT_EXPORT_NAME and NBD_OPT_EXPORT_GO, but it's easier to see what is being computed). This lets us quit advertising senseless flags for read-only images, as well as making the next patch for exposing FAST_ZERO support easier to write. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190823143726.27062-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> [eblake: improve commit message, update iotest 223]
* nbd: Advertise multi-conn for shared read-only connectionsEric Blake2019-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be advertised when the server promises cache consistency between simultaneous clients (basically, rules that determine what FUA and flush from one client are able to guarantee for reads from another client). When we don't permit simultaneous clients (such as qemu-nbd without -e), the bit makes no sense; and for writable images, we probably have a lot more work before we can declare that actions from one client are cache-consistent with actions from another. But for read-only images, where flush isn't changing any data, we might as well advertise multi-conn support. What's more, advertisement of the bit makes it easier for clients to determine if 'qemu-nbd -e' was in use, where a second connection will succeed rather than hang until the first client goes away. This patch affects qemu as server in advertising the bit. We may want to consider patches to qemu as client to attempt parallel connections for higher throughput by spreading the load over those connections when a server advertises multi-conn, but for now sticking to one connection per nbd:// BDS is okay. See also: https://bugzilla.redhat.com/1708300 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190815185024.7010-1-eblake@redhat.com> [eblake: tweak blockdev-nbd.c to not request shared when writable, fix iotest 233] Reviewed-by: John Snow <jsnow@redhat.com>
* socket: Add num connections to qio_net_listener_open_sync()Juan Quintela2019-09-031-1/+1
| | | | | Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>
* block/nbd: use non-blocking io channel for nbd negotiationVladimir Sementsov-Ogievskiy2019-08-151-1/+1
| | | | | | | | | | | No reason to use blocking channel for negotiation and we'll benefit in further reconnect feature, as qio_channel reads and writes will do qemu_coroutine_yield while waiting for io completion. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190618114328.55249-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Do not close stderrMax Reitz2019-06-131-1/+2
| | | | | | | | | | | We kept old_stderr specifically so we could keep emitting error message on stderr. However, qemu_daemon() closes stderr. Therefore, we need to dup() stderr to old_stderr before invoking qemu_daemon(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20190508211820.17851-4-mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Add --pid-file optionMax Reitz2019-06-131-0/+11
| | | | | | | | | --fork is a bit boring if there is no way to get the child's PID. This option helps. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20190508211820.17851-2-mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* Include qemu-common.h exactly where neededMarkus Armbruster2019-06-121-0/+1
| | | | | | | | | | | | | | | | No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
* Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster2019-06-121-1/+1
| | | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
* qemu-common: Move tcg_enabled() etc. to sysemu/tcg.hMarkus Armbruster2019-06-111-0/+1
| | | | | | | | | | | | | | | | | | | Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in qemu-common.h. This necessitates inclusion of qemu-common.h into headers, which is against the rules spelled out in qemu-common.h's file comment. Move tcg_enabled() & friends into their own header sysemu/tcg.h, and adjust #include directives. Cc: Richard Henderson <rth@twiddle.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-2-armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [Rebased with conflicts resolved automatically, except for accel/tcg/tcg-all.c]
* qemu-nbd: Look up flag names in arrayMax Reitz2019-05-071-29/+17Star
| | | | | | | | | | The existing code to convert flag bits into strings looks a bit strange now, and if we ever add more flags, it will look even stranger. Prevent that from happening by making it look up the flag names in an array. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20190405191635.25740-1-mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
* log: Make glib logging go through QEMUChristophe Fergeau2019-04-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a error_init() helper which calls g_log_set_default_handler() so that glib logs (g_log, g_warning, ...) are handled similarly to other QEMU logs. This means they will get a timestamp if timestamps are enabled, and they will go through the HMP monitor if one is configured. This commit also adds a call to error_init() to the binaries installed by QEMU. Since error_init() also calls error_set_progname(), this means that *-linux-user, *-bsd-user and qemu-pr-helper messages output with error_report, info_report, ... will slightly change: they will be prefixed by the binary name. glib debug messages are enabled through G_MESSAGES_DEBUG similarly to the glib default log handler. At the moment, this change will mostly impact SPICE logging if your spice version is >= 0.14.1. With older spice versions, this is not going to work as expected, but will not have any ill effect, so this call is not conditional on the SPICE version. Signed-off-by: Christophe Fergeau <cfergeau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190131164614.19209-3-cfergeau@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
* qemu-nbd: add support for authorization of TLS clientsDaniel P. Berrange2019-03-061-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently any client which can complete the TLS handshake is able to use the NBD server. The server admin can turn on the 'verify-peer' option for the x509 creds to require the client to provide a x509 certificate. This means the client will have to acquire a certificate from the CA before they are permitted to use the NBD server. This is still a fairly low bar to cross. This adds a '--tls-authz OBJECT-ID' option to the qemu-nbd command which takes the ID of a previously added 'QAuthZ' object instance. This will be used to validate the client's x509 distinguished name. Clients failing the authorization check will not be permitted to use the NBD server. For example to setup authorization that only allows connection from a client whose x509 certificate distinguished name is CN=laptop.example.com,O=Example Org,L=London,ST=London,C=GB escape the commas in the name and use: qemu-nbd --object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\ endpoint=server,verify-peer=yes \ --object 'authz-simple,id=auth0,identity=CN=laptop.example.com,,\ O=Example Org,,L=London,,ST=London,,C=GB' \ --tls-creds tls0 \ --tls-authz authz0 \ ....other qemu-nbd args... NB: a real shell command line would not have leading whitespace after the line continuation, it is just included here for clarity. Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-Id: <20190227162035.18543-2-berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: split long line in --help text, tweak 233 to show that whitespace after ,, in identity= portion is actually okay] Signed-off-by: Eric Blake <eblake@redhat.com>
* qemu-nbd: Deprecate qemu-nbd --partitionEric Blake2019-02-041-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing qemu-nbd --partition code claims to handle logical partitions up to 8, since its introduction in 2008 (commit 7a5ca86). However, the implementation is bogus (actual MBR logical partitions form a sort of linked list, with one partition per extended table entry, rather than four logical partitions in a single extended table), making the code unlikely to work for anything beyond -P5 on actual guest images. What's more, the code does not support GPT partitions, which are becoming more popular, and maintaining device subsetting in both NBD and the raw device is unnecessary duplication of effort (even if it is not too difficult). Note that obtaining the offsets of a partition (MBR or GPT) can be learned by using 'qemu-nbd -c /dev/nbd0 file.qcow2 && sfdisk --dump /dev/nbd0', but by the time you've done that, you might as well just mount /dev/nbd0p1 that the kernel creates for you instead of bothering with qemu exporting a subset. Or, keeping to just user-space code, use nbdkit's partition filter, which has already known both GPT and primary MBR partitions for a while, and was just recently enhanced to support arbitrary logical MBR parititions. Start the clock on the deprecation cycle, with examples of how to accomplish device subsetting without using -P. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20190125234837.2272-1-eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
* qemu-nbd: Add --list optionEric Blake2019-01-211-13/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to be able to detect whether a given qemu NBD server is exposing the right export(s) and dirty bitmaps, at least for regression testing. We could use 'nbd-client -l' from the upstream NBD project to list exports, but it's annoying to rely on out-of-tree binaries; furthermore, nbd-client doesn't necessarily know about all of the qemu NBD extensions. Thus, it is time to add a new mode to qemu-nbd that merely sniffs all possible information from the server during handshake phase, then disconnects and dumps the information. This patch actually implements --list/-L, while reusing other options such as --tls-creds for now designating how to connect as the client (rather than their non-list usage of how to operate as the server). I debated about adding this functionality to something akin to 'qemu-img info' - but that tool does not readily lend itself to connecting to an arbitrary NBD server without also tying to a specific export (I may, however, still add ImageInfoSpecificNBD for reporting the bitmaps available when connecting to a single export). And, while it may feel a bit odd that normally qemu-nbd is a server but 'qemu-nbd -L' is a client, we are not really making the qemu-nbd binary that much larger, because 'qemu-nbd -c' has to operate as both server and client simultaneously across two threads when feeding the kernel module for /dev/nbdN access. Sample output: $ qemu-nbd -L exports available: 1 export: '' size: 65536 flags: 0x4ed ( flush fua trim zeroes df cache ) min block: 512 opt block: 4096 max block: 33554432 available meta contexts: 1 base:allocation Note that the output only lists sizes if the server sent NBD_FLAG_HAS_FLAGS, because a newstyle server does not give the size otherwise. It has the side effect that for really old servers that did not send any flags, the size is not output even though it was available. However, I'm not too concerned about that - oldstyle servers are (rightfully) getting less common to encounter (qemu 3.0 was the last version where we even serve it), and most existing servers that still even offer oldstyle negotiation (such as nbdkit) still send flags (since that was added to the NBD protocol in 2007 to permit read-only connections). Not done here, but maybe worth future experiments: capture the meat of NBDExportInfo into a QAPI struct, and use the generated QAPI pretty-printers instead of hand-rolling our output loop. It would also permit us to add a JSON output mode for machine parsing. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20190117193658.16413-20-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
* nbd/client: Move export name into NBDExportInfoEric Blake2019-01-211-2/+4
| | | | | | | | | | | | | | | | | | | Refactor the 'name' parameter of nbd_receive_negotiate() from being a separate parameter into being part of the in-out 'info'. This also spills over to a simplification of nbd_opt_go(). The main driver for this refactoring is that an upcoming patch would like to add support to qemu-nbd to list information about all exports available on a server, where the name(s) will be provided by the server instead of the client. But another benefit is that we can now allow the client to explicitly specify the empty export name "" even when connecting to an oldstyle server (even if qemu is no longer such a server after commit 7f7dfe2a). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20190117193658.16413-10-eblake@redhat.com>
* qemu-nbd: Avoid strtol open-codingEric Blake2019-01-211-19/+9Star
| | | | | | | | | | | | | | | | | | Our copy-and-pasted open-coding of strtol handling forgot to handle overflow conditions. Use qemu_strto*() instead. In the case of --partition, since we insist on a user-supplied partition to be non-zero, we can use 0 rather than -1 for our initial value to distinguish when a partition is not being served, for slightly more optimal code. The error messages for out-of-bounds values are less specific, but should not be a terrible loss in quality. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com> Message-Id: <20190117193658.16413-8-eblake@redhat.com>