summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | vfio-ccw: add force unlimited prefetch propertyHalil Pasic2018-06-182-2/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is at least one guest (OS) such that although it does not rely on the guarantees provided by ORB 1 word 9 bit (aka unlimited prefetch, aka P bit) not being set, it fails to tell this to the machine. Usually this ain't a big deal, as the original purpose of the P bit is to allow for performance optimizations. vfio-ccw however can not provide the guarantees required if the bit is not set. It is not possible to implement support for the P bit not set without transitioning to lower level protocols for vfio-ccw. So let's give the user the opportunity to force setting the P bit, if the user knows this is safe. For self modifying channel programs forcing the P bit is not safe. If the P bit is forced for a self modifying channel program things are expected to break in strange ways. Let's also avoid warning multiple about P bit not set in the ORB in case P bit is not told to be forced, and designate the affected vfio-ccw device. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Suggested-by: Dong Jia Shi <bjsdjshi@linux.ibm.com> Acked-by: Jason J. Herne <jjherne@linux.ibm.com> Tested-by: Jason J. Herne <jjherne@linux.ibm.com> Message-Id: <20180524175828.3143-2-pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
| * | virtio-ccw: clean up notifyHalil Pasic2018-06-181-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Coverity recently started complaining about virtio_ccw_notify(). Turns out, there is a couple of things that can be cleaned up. Let's clean! Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: CID 1390619 Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Message-Id: <20180516132757.68558-1-pasic@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
* | | Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-20180618' ↵Peter Maydell2018-06-194-0/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging qemu-openbios queue # gpg: Signature made Mon 18 Jun 2018 19:28:08 BST # gpg: using RSA key 5BC2C56FAE0F321F # gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" # Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F * remotes/mcayland/tags/qemu-openbios-20180618: Update OpenBIOS images to 8fe6f5f96f built from submodule. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | Update OpenBIOS images to 8fe6f5f96f built from submodule.Mark Cave-Ayland2018-06-184-0/+0
| |/ / | | | | | | | | | Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
* | | Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-20180618' ↵Peter Maydell2018-06-193-24/+172
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging qemu-sparc queue # gpg: Signature made Mon 18 Jun 2018 18:43:24 BST # gpg: using RSA key 5BC2C56FAE0F321F # gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" # Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F * remotes/mcayland/tags/qemu-sparc-20180618: SPARC64: add icount support hw/sparc/sun4m: Fix problems with device introspection hw/sparc64/sun4u: Fix introspection by converting prom instance_init to realize Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | SPARC64: add icount supportMark Cave-Ayland2018-06-171-1/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds gen_io_start()/gen_io_end() to various instructions as required in order to boot my OpenBIOS test images on qemu-system-sparc64 with icount enabled. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
| * | | hw/sparc/sun4m: Fix problems with device introspectionThomas Huth2018-06-171-17/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several devices of the sun4m machines are using &error_fatal in their instance_init function and thus can cause QEMU to abort unexpectedly: $ echo "{'execute':'qmp_capabilities'}"\ "{'execute':'device-list-properties',"\ " 'arguments':{'typename':'openprom'}}" \ | sparc-softmmu/qemu-system-sparc -M SS-10 -S -qmp stdio {"QMP": {"version": {"qemu": {"micro": 91, "minor": 11, "major": 2}, "package": "build-all"}, "capabilities": []}} {"return": {}} RAMBlock "sun4m.prom" already registered, abort! Aborted (core dumped) $ echo "{'execute':'qmp_capabilities'}"\ "{'execute':'device-list-properties',"\ " 'arguments':{'typename':'macio_idreg'}}" \ | sparc-softmmu/qemu-system-sparc -M SS-10 -S -qmp stdio {"QMP": {"version": {"qemu": {"micro": 91, "minor": 11, "major": 2}, "package": "build-all"}, "capabilities": []}} {"return": {}} RAMBlock "sun4m.idreg" already registered, abort! Aborted (core dumped) $ echo "{'execute':'qmp_capabilities'}"\ "{'execute':'device-list-properties',"\ " 'arguments':{'typename':'tcx_afx'}}" \ | sparc-softmmu/qemu-system-sparc -M SS-5 -S -qmp stdio {"QMP": {"version": {"qemu": {"micro": 91, "minor": 11, "major": 2}, "package": "build-all"}, "capabilities": []}} {"return": {}} RAMBlock "sun4m.afx" already registered, abort! Aborted (core dumped) Fix the issues by converting the instance_init functions into realize() functions instead, which are allowed to fail (and not called during device introspection). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
| * | | hw/sparc64/sun4u: Fix introspection by converting prom instance_init to realizeThomas Huth2018-06-171-6/+12
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The instance_init function of devices should always succeed to be able to introspect the device. However, the instance_init function of the "openprom" device can currently fail, for example like this: $ echo "{'execute':'qmp_capabilities'}"\ "{'execute':'device-list-properties',"\ " 'arguments':{'typename':'openprom'}}" \ | sparc64-softmmu/qemu-system-sparc64 -M sun4v,accel=qtest -qmp stdio {"QMP": {"version": {"qemu": {"micro": 91, "minor": 11, "major": 2}, "package": "build-all"}, "capabilities": []}} {"return": {}} RAMBlock "sun4u.prom" already registered, abort! Aborted (core dumped) This should not happen. Fix this problem by moving the affected code from instance_init into a realize function instead. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
* | | Merge remote-tracking branch 'remotes/rth/tags/pull-axp-20180618' into stagingPeter Maydell2018-06-191-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Place parallel device properly, fixing vga # gpg: Signature made Mon 18 Jun 2018 17:45:50 BST # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-axp-20180618: hw/isa/smc37c669: Change the parallel I/O base to 378H Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | hw/isa/smc37c669: Change the parallel I/O base to 378HPhilippe Mathieu-Daudé2018-06-171-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On the Alpha DP264 machine, the Cirrus VGA is I/O mapped in the 3C0H-3CFH range, thus I/O base used by the parallel device clashes, and since a4cb773928e the VGA is not working: (qemu) info mtree address-space: memory 0000000000000000-ffffffffffffffff (prio 0, i/o): system 00000801fc000000-00000801fdffffff (prio 0, i/o): pci0-io ... 00000801fc0003b4-00000801fc0003b5 (prio 0, i/o): vga 00000801fc0003ba-00000801fc0003ba (prio 0, i/o): vga 00000801fc0003bc-00000801fc0003c3 (prio 0, i/o): parallel ^^^ ^^^^^^^^ 00000801fc0003c0-00000801fc0003cf (prio 0, i/o): vga ^^^ 00000801fc0003d4-00000801fc0003d5 (prio 0, i/o): vga 00000801fc0003da-00000801fc0003da (prio 0, i/o): vga ... As there is no particular reason to use this base address (introduced in 7bea0dd434e), change to 378H which is the default on PC machines. Reported-by: Emilio G. Cota <cota@braap.org> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180614233935.26585-1-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* | | Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into stagingPeter Maydell2018-06-1924-319/+1836
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block layer patches: - Active mirror (blockdev-mirror copy-mode=write-blocking) - bdrv_drain_*() fixes and test cases - Fix crash with scsi-hd and drive_del # gpg: Signature made Mon 18 Jun 2018 17:44:10 BST # gpg: using RSA key 7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (35 commits) iotests: Add test for active mirroring block/mirror: Add copy mode QAPI interface block/mirror: Add active mirroring job: Add job_progress_increase_remaining() block/mirror: Add MirrorBDSOpaque block/dirty-bitmap: Add bdrv_dirty_iter_next_area test-hbitmap: Add non-advancing iter_next tests hbitmap: Add @advance param to hbitmap_iter_next() block: Generalize should_update_child() rule block/mirror: Use source as a BdrvChild block/mirror: Wait for in-flight op conflicts block/mirror: Use CoQueue to wait on in-flight ops block/mirror: Convert to coroutines block/mirror: Pull out mirror_perform() block: fix QEMU crash with scsi-hd and drive_del test-bdrv-drain: Test graph changes in drain_all section block: Allow graph changes in bdrv_drain_all_begin/end sections block: ignore_bds_parents parameter for drain functions block: Move bdrv_drain_all_begin() out of coroutine context block: Allow AIO_WAIT_WHILE with NULL ctx ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * \ \ Merge remote-tracking branch 'mreitz/tags/pull-block-2018-06-18' into ↵Kevin Wolf2018-06-1816-152/+799
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | queue-block Block patches: - Active mirror (blockdev-mirror copy-mode=write-blocking) # gpg: Signature made Mon Jun 18 17:08:19 2018 CEST # gpg: using RSA key F407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * mreitz/tags/pull-block-2018-06-18: iotests: Add test for active mirroring block/mirror: Add copy mode QAPI interface block/mirror: Add active mirroring job: Add job_progress_increase_remaining() block/mirror: Add MirrorBDSOpaque block/dirty-bitmap: Add bdrv_dirty_iter_next_area test-hbitmap: Add non-advancing iter_next tests hbitmap: Add @advance param to hbitmap_iter_next() block: Generalize should_update_child() rule block/mirror: Use source as a BdrvChild block/mirror: Wait for in-flight op conflicts block/mirror: Use CoQueue to wait on in-flight ops block/mirror: Convert to coroutines block/mirror: Pull out mirror_perform() Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| | * | | iotests: Add test for active mirroringMax Reitz2018-06-183-0/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-15-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Add copy mode QAPI interfaceMax Reitz2018-06-184-9/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows the user to specify whether to use active or only background mode for mirror block jobs. Currently, this setting will remain constant for the duration of the entire block job. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-14-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Add active mirroringMax Reitz2018-06-182-5/+265
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements active synchronous mirroring. In active mode, the passive mechanism will still be in place and is used to copy all initially dirty clusters off the source disk; but every write request will write data both to the source and the target disk, so the source cannot be dirtied faster than data is mirrored to the target. Also, once the block job has converged (BLOCK_JOB_READY sent), source and target are guaranteed to stay in sync (unless an error occurs). Active mode is completely optional and currently disabled at runtime. A later patch will add a way for users to enable it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180613181823.13618-13-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | job: Add job_progress_increase_remaining()Max Reitz2018-06-182-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20180613181823.13618-12-mreitz@redhat.com Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Add MirrorBDSOpaqueMax Reitz2018-06-181-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will allow us to access the block job data when the mirror block driver becomes more complex. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180613181823.13618-11-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/dirty-bitmap: Add bdrv_dirty_iter_next_areaMax Reitz2018-06-182-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new function allows to look for a consecutively dirty area in a dirty bitmap. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180613181823.13618-10-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | test-hbitmap: Add non-advancing iter_next testsMax Reitz2018-06-181-12/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a function that wraps hbitmap_iter_next() and always calls it in non-advancing mode first, and in advancing mode next. The result should always be the same. By using this function everywhere we called hbitmap_iter_next() before, we should get good test coverage for non-advancing hbitmap_iter_next(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180613181823.13618-9-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | hbitmap: Add @advance param to hbitmap_iter_next()Max Reitz2018-06-185-19/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new parameter allows the caller to just query the next dirty position without moving the iterator. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180613181823.13618-8-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block: Generalize should_update_child() ruleMax Reitz2018-06-181-10/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, bdrv_replace_node() refuses to create loops from one BDS to itself if the BDS to be replaced is the backing node of the BDS to replace it: Say there is a node A and a node B. Replacing B by A means making all references to B point to A. If B is a child of A (i.e. A has a reference to B), that would mean we would have to make this reference point to A itself -- so we'd create a loop. bdrv_replace_node() (through should_update_child()) refuses to do so if B is the backing node of A. There is no reason why we should create loops if B is not the backing node of A, though. The BDS graph should never contain loops, so we should always refuse to create them. If B is a child of A and B is to be replaced by A, we should simply leave B in place there because it is the most sensible choice. A more specific argument would be: Putting filter drivers into the BDS graph is basically the same as appending an overlay to a backing chain. But the main child BDS of a filter driver is not "backing" but "file", so restricting the no-loop rule to backing nodes would fail here. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-7-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Use source as a BdrvChildMax Reitz2018-06-181-8/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, the mirror_top_bs is no longer just a technically required node in the BDS graph but actually represents the block job operation. Also, drop MirrorBlockJob.source, as we can reach it through mirror_top_bs->backing. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-6-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Wait for in-flight op conflictsMax Reitz2018-06-181-18/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes the mirror code differentiate between simply waiting for any operation to complete (mirror_wait_for_free_in_flight_slot()) and specifically waiting for all operations touching a certain range of the virtual disk to complete (mirror_wait_on_conflicts()). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180613181823.13618-5-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Use CoQueue to wait on in-flight opsMax Reitz2018-06-181-11/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attach a CoQueue to each in-flight operation so if we need to wait for any we can use it to wait instead of just blindly yielding and hoping for some operation to wake us. A later patch will use this infrastructure to allow requests accessing the same area of the virtual disk to specifically wait for each other. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180613181823.13618-4-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Convert to coroutinesMax Reitz2018-06-181-62/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to talk to the source BDS (and maybe in the future to the target BDS as well) directly, we need to convert our existing AIO requests into coroutine I/O requests. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20180613181823.13618-3-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| | * | | block/mirror: Pull out mirror_perform()Max Reitz2018-06-181-22/+29
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When converting mirror's I/O to coroutines, we are going to need a point where these coroutines are created. mirror_perform() is going to be that point. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20180613181823.13618-2-mreitz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>
| * | | block: fix QEMU crash with scsi-hd and drive_delGreg Kurz2018-06-181-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removing a drive with drive_del while it is being used to run an I/O intensive workload can cause QEMU to crash. An AIO flush can yield at some point: blk_aio_flush_entry() blk_co_flush(blk) bdrv_co_flush(blk->root->bs) ... qemu_coroutine_yield() and let the HMP command to run, free blk->root and give control back to the AIO flush: hmp_drive_del() blk_remove_bs() bdrv_root_unref_child(blk->root) child_bs = blk->root->bs bdrv_detach_child(blk->root) bdrv_replace_child(blk->root, NULL) blk->root->bs = NULL g_free(blk->root) <============== blk->root becomes stale bdrv_unref(child_bs) bdrv_delete(child_bs) bdrv_close() bdrv_drained_begin() bdrv_do_drained_begin() bdrv_drain_recurse() aio_poll() ... qemu_coroutine_switch() and the AIO flush completion ends up dereferencing blk->root: blk_aio_complete() scsi_aio_complete() blk_get_aio_context(blk) bs = blk_bs(blk) ie, bs = blk->root ? blk->root->bs : NULL ^^^^^ stale The problem is that we should avoid making block driver graph changes while we have in-flight requests. Let's drain all I/O for this BB before calling bdrv_root_unref_child(). Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Test graph changes in drain_all sectionKevin Wolf2018-06-181-2/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tests both adding and remove a node between bdrv_drain_all_begin() and bdrv_drain_all_end(), and enabled the existing detach test for drain_all. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Allow graph changes in bdrv_drain_all_begin/end sectionsKevin Wolf2018-06-184-17/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_drain_all_*() used bdrv_next() to iterate over all root nodes and did a subtree drain for each of them. This works fine as long as the graph is static, but sadly, reality looks different. If the graph changes so that root nodes are added or removed, we would have to compensate for this. bdrv_next() returns each root node only once even if it's the root node for multiple BlockBackends or for a monitor-owned block driver tree, which would only complicate things. The much easier and more obviously correct way is to fundamentally change the way the functions work: Iterate over all BlockDriverStates, no matter who owns them, and drain them individually. Compensation is only necessary when a new BDS is created inside a drain_all section. Removal of a BDS doesn't require any action because it's gone afterwards anyway. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: ignore_bds_parents parameter for drain functionsKevin Wolf2018-06-185-44/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the future, bdrv_drained_all_begin/end() will drain all invidiual nodes separately rather than whole subtrees. This means that we don't want to propagate the drain to all parents any more: If the parent is a BDS, it will already be drained separately. Recursing to all parents is unnecessary work and would make it an O(n²) operation. Prepare the drain function for the changed drain_all by adding an ignore_bds_parents parameter to the internal implementation that prevents the propagation of the drain to BDS parents. We still (have to) propagate it to non-BDS parents like BlockBackends or Jobs because those are not drained separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Move bdrv_drain_all_begin() out of coroutine contextKevin Wolf2018-06-181-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before we can introduce a single polling loop for all nodes in bdrv_drain_all_begin(), we must make sure to run it outside of coroutine context like we already do for bdrv_do_drained_begin(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Allow AIO_WAIT_WHILE with NULL ctxKevin Wolf2018-06-181-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_drain_all() wants to have a single polling loop for draining the in-flight requests of all nodes. This means that the AIO_WAIT_WHILE() condition relies on activity in multiple AioContexts, which is polled from the mainloop context. We must therefore call AIO_WAIT_WHILE() from the mainloop thread and use the AioWait notification mechanism. Just randomly picking the AioContext of any non-mainloop thread would work, but instead of bothering to find such a context in the caller, we can just as well accept NULL for ctx. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Test that bdrv_drain_invoke() doesn't pollKevin Wolf2018-06-181-14/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a test case that goes wrong if bdrv_drain_invoke() calls aio_poll(). Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Defer .bdrv_drain_begin callback to polling phaseKevin Wolf2018-06-181-5/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We cannot allow aio_poll() in bdrv_drain_invoke(begin=true) until we're done with propagating the drain through the graph and are doing the single final BDRV_POLL_WHILE(). Just schedule the coroutine with the callback and increase bs->in_flight to make sure that the polling phase will wait for it. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Graph change through parent callbackKevin Wolf2018-06-181-0/+130
| | | | | | | | | | | | | | | | Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Don't poll in parent drain callbacksKevin Wolf2018-06-183-9/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_do_drained_begin() is only safe if we have a single BDRV_POLL_WHILE() after quiescing all affected nodes. We cannot allow that parent callbacks introduce a nested polling loop that could cause graph changes while we're traversing the graph. Split off bdrv_do_drained_begin_quiesce(), which only quiesces a single node without waiting for its requests to complete. These requests will be waited for in the BDRV_POLL_WHILE() call down the call chain. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Test node deletion in subtree recursionKevin Wolf2018-06-181-9/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If bdrv_do_drained_begin() polls during its subtree recursion, the graph can change and mess up the bs->children iteration. Test that this doesn't happen. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Drain recursively with a single BDRV_POLL_WHILE()Kevin Wolf2018-06-183-22/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Anything can happen inside BDRV_POLL_WHILE(), including graph changes that may interfere with its callers (e.g. child list iteration in recursive callers of bdrv_do_drained_begin). Switch to a single BDRV_POLL_WHILE() call for the whole subtree at the end of bdrv_do_drained_begin() to avoid such effects. The recursion happens now inside the loop condition. As the graph can only change between bdrv_drain_poll() calls, but not inside of it, doing the recursion here is safe. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | test-bdrv-drain: Add test for node deletionMax Reitz2018-06-181-0/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds two bdrv-drain tests for what happens if some BDS goes away during the drainage. The basic idea is that you have a parent BDS with some child nodes. Then, you drain one of the children. Because of that, the party who actually owns the parent decides to (A) delete it, or (B) detach all its children from it -- both while the child is still being drained. A real-world case where this can happen is the mirror block job, which may exit if you drain one of its children. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Remove bdrv_drain_recurse()Kevin Wolf2018-06-181-33/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For bdrv_drain(), recursively waiting for child node requests is pointless because we didn't quiesce their parents, so new requests could come in anyway. Letting the function work only on a single node makes it more consistent. For subtree drains and drain_all, we already have the recursion in bdrv_do_drained_begin(), so the extra recursion doesn't add anything either. Remove the useless code. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | block: Really pause block jobs on drainKevin Wolf2018-06-188-14/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already requested that block jobs be paused in .bdrv_drained_begin, but no guarantee was made that the job was actually inactive at the point where bdrv_drained_begin() returned. This introduces a new callback BdrvChildRole.bdrv_drained_poll() and uses it to make bdrv_drain_poll() consider block jobs using the node to be drained. For the test case to work as expected, we have to switch from block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even considered active and must be waited for when draining the node. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
| * | | block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()Kevin Wolf2018-06-182-15/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 91af091f923 added an additional aio_poll() to BDRV_POLL_WHILE() in order to make sure that all pending BHs are executed on drain. This was the wrong place to make the fix, as it is useless overhead for all other users of the macro and unnecessarily complicates the mechanism. This patch effectively reverts said commit (the context has changed a bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the loop condition for drain. The effect is probably hard to measure in any real-world use case because actual I/O will dominate, but if I run only the initialisation part of 'qemu-img convert' where it calls bdrv_block_status() for the whole image to find out how much data there is copy, this phase actually needs only roughly half the time after this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | tests/test-bdrv-drain: bdrv_drain_all() works in coroutines nowKevin Wolf2018-06-181-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we use bdrv_do_drained_begin/end() for bdrv_drain_all_begin/end(), coroutine context is automatically left with a BH, preventing the deadlocks that made bdrv_drain_all*() unsafe in coroutine context. Now that we even removed the old polling code as dead code, it's obvious that it's compatible now. Enable the coroutine test cases for bdrv_drain_all(). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | block: Don't manually poll in bdrv_drain_all()Kevin Wolf2018-06-181-29/+12Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All involved nodes are already idle, we called bdrv_do_drain_begin() on them. The comment in the code suggested that this was not correct because the completion of a request on one node could spawn a new request on a different node (which might have been drained before, so we wouldn't drain the new request). In reality, new requests to different nodes aren't spawned out of nothing, but only in the context of a parent request, and they aren't submitted to random nodes, but only to child nodes. As long as we still poll for the completion of the parent request (which we do), draining each root node separately is good enough. Remove the additional polling code from bdrv_drain_all_begin() and replace it with an assertion that all nodes are already idle after we drained them separately. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | block: Remove 'recursive' parameter from bdrv_drain_invoke()Kevin Wolf2018-06-181-10/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | All callers pass false for the 'recursive' parameter now. Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | block: Use bdrv_do_drain_begin/end in bdrv_drain_all()Kevin Wolf2018-06-182-18/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | bdrv_do_drain_begin/end() implement already everything that bdrv_drain_all_begin/end() need and currently still do manually: Disable external events, call parent drain callbacks, call block driver callbacks. It also does two more things: The first is incrementing bs->quiesce_counter. bdrv_drain_all() already stood out in the test case by behaving different from the other drain variants. Adding this is not only safe, but in fact a bug fix. The second is calling bdrv_drain_recurse(). We already do that later in the same function in a loop, so basically doing an early first iteration doesn't hurt. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
| * | | test-bdrv-drain: bdrv_drain() works with cross-AioContext eventsKevin Wolf2018-06-182-5/+186
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As long as nobody keeps the other I/O thread from working, there is no reason why bdrv_drain() wouldn't work with cross-AioContext events. The key is that the root request we're waiting for is in the AioContext we're polling (which it always is for bdrv_drain()) so that aio_poll() is woken up in the end. Add a test case that shows that it works. Remove the comment in bdrv_drain() that claims otherwise. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* | | Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2018-06-18' ↵Peter Maydell2018-06-194-59/+121
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging Monitor patches for 2018-06-18 # gpg: Signature made Mon 18 Jun 2018 14:50:29 BST # gpg: using RSA key 3870B400EB918653 # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-monitor-2018-06-18: monitor: add lock to protect mon_fdsets monitor: move init global earlier monitor: remove event_clock_type monitor: fix comment for monitor_lock monitor: more comments on lock-free elements monitor: protect mon->fds with mon_lock monitor: rename out_lock to mon_lock Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | monitor: add lock to protect mon_fdsetsPeter Xu2018-06-183-12/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new global big lock for mon_fdsets. Take it where needed. The monitor_fdset_get_fd() handling is a bit tricky: now we need to call qemu_mutex_unlock() which might pollute errno, so we need to make sure the correct errno be passed up to the callers. To make things simpler, we let monitor_fdset_get_fd() return the -errno directly when error happens, then in qemu_open() we move it back into errno. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180608035511.7439-8-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>
| * | | monitor: move init global earlierPeter Xu2018-06-181-6/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, monitor fd helpers might be called even earlier than monitor_init_globals(). This can be problematic. After previous work, now monitor_init_globals() does not depend on accelerator initialization any more. Call it earlier (before CLI parsing; that's where the monitor APIs might be called) to make sure it is called before any of the monitor APIs. Suggested-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20180608035511.7439-7-peterx@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>