summaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAgeFilesLines
* Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170717' ↵Peter Maydell2017-07-174-28/+73
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into staging ppc patch queue 2017-07-17 This pull requests supersedes the one from 2017-07-14. That one had a couple of subtle regressions: there was a build error for mingw32, and an instance_size which was theoretically wrong everywhere, but only actually bit on the Travis OSX build. There are two major batches in this set, rather than the usual collection of assorted fixes. * More DRC cleanup. This gets the state management into a state which should fix many of the hotplug+migration problems we've had. Plus it gets the migration stream format into something well defined and pretty minimal which we can reasonably support into the future. * Hashed Page Table resizing. It's been a while since this was posted, but it's been through several previous rounds of review. The kernel parts (both guest and host) are merged in 4.11, so this is the only remaining piece left to allow resizing of the HPT in a running guest. There are also a handful of unrelated fixes. # gpg: Signature made Mon 17 Jul 2017 07:36:52 BST # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.10-20170717: (21 commits) target/ppc: fix CPU hotplug when radix is enabled (TCG) spapr: fix memory leak in spapr_core_pre_plug() pseries: Allow HPT resizing with KVM pseries: Use smaller default hash page tables when guest can resize pseries: Enable HPT resizing for 2.10 pseries: Implement HPT resizing pseries: Stubs for HPT resizing ppc/pnv: Remove unused XICSState reference spapr: fix potential memory leak in spapr_core_plug() spapr: Implement DR-indicator for physical DRCs only spapr: Remove sPAPRConfigureConnectorState sub-structure spapr: Consolidate DRC state variables spapr: Cleanups relating to DRC awaiting_release field spapr: Refactor spapr_drc_detach() spapr: Abort on delete failure in spapr_drc_release() spapr: Simplify unplug path spapr: Remove 'awaiting_allocation' DRC flag spapr: Treat devices added before inbound migration as coldplugged spapr: Minor cleanups to events handling spapr: migrate pending_events of spapr state ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * pseries: Use smaller default hash page tables when guest can resizeDavid Gibson2017-07-172-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've now implemented a PAPR extension allowing PAPR guest to resize their hash page table (HPT) during runtime. This patch makes use of that facility to allocate smaller HPTs by default. Specifically when a guest is aware of the HPT resize facility, qemu sizes the HPT to the initial memory size, rather than the maximum memory size on the assumption that the guest will resize its HPT if necessary for hot plugged memory. When the initial memory size is much smaller than the maximum memory size (a common configuration with e.g. oVirt / RHEV) then this can save significant memory on the HPT. If the guest does *not* advertise HPT resize awareness when it makes the ibm,client-architecture-support call, qemu resizes the HPT for maxmimum memory size (unless it's been configured not to allow such guests at all). For now we make that reallocation assuming the guest has not yet used the HPT at all. That's true in practice, but not, strictly, an architectural or PAPR requirement. If we need to in future we can fix this by having the client-architecture-support call reboot the guest with the revised HPT size (the client-architecture-support call is explicitly permitted to trigger a reboot in this way). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
| * pseries: Implement HPT resizingDavid Gibson2017-07-171-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements hypercalls allowing a PAPR guest to resize its own hash page table. This will eventually allow for more flexible memory hotplug. The implementation is partially asynchronous, handled in a special thread running the hpt_prepare_thread() function. The state of a pending resize is stored in SPAPR_MACHINE->pending_hpt. The H_RESIZE_HPT_PREPARE hypercall will kick off creation of a new HPT, or, if one is already in progress, monitor it for completion. If there is an existing HPT resize in progress that doesn't match the size specified in the call, it will cancel it, replacing it with a new one matching the given size. The H_RESIZE_HPT_COMMIT completes transition to a resized HPT, and can only be called successfully once H_RESIZE_HPT_PREPARE has successfully completed initialization of a new HPT. The guest must ensure that there are no concurrent accesses to the existing HPT while this is called (this effectively means stop_machine() for Linux guests). For now H_RESIZE_HPT_COMMIT goes through the whole old HPT, rehashing each HPTE into the new HPT. This can have quite high latency, but it seems to be of the order of typical migration downtime latencies for HPTs of size up to ~2GiB (which would be used in a 256GiB guest). In future we probably want to move more of the rehashing to the "prepare" phase, by having H_ENTER and other hcalls update both current and pending HPTs. That's a project for another day, but should be possible without any changes to the guest interface. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * pseries: Stubs for HPT resizingDavid Gibson2017-07-171-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces stub implementations of the H_RESIZE_HPT_PREPARE and H_RESIZE_HPT_COMMIT hypercalls which we hope to add in a PAPR extension to allow run time resizing of a guest's hash page table. It also adds a new machine property for controlling whether this new facility is available. For now we only allow resizing with TCG, allowing it with KVM will require kernel changes as well. Finally, it adds a new string to the hypertas property in the device tree, advertising to the guest the availability of the HPT resizing hypercalls. This is a tentative suggested value, and would need to be standardized by PAPR before being merged. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
| * ppc/pnv: Remove unused XICSState referenceAlexey Kardashevskiy2017-07-171-2/+0Star
| | | | | | | | | | | | | | | | | | e6f7e110ee70 "ppc/xics: remove the XICSState classes" got rid of XICSState, this is just an leftover. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * spapr: Implement DR-indicator for physical DRCs onlyDavid Gibson2017-07-171-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | According to PAPR, the DR-indicator should only be valid for physical DRCs, not logical DRCs. At the moment we implement it for all DRCs, so restrict it to physical ones only. We move the state to the physical DRC subclass, which means adding some QOM boilerplate to handle the newly distinct type. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Remove sPAPRConfigureConnectorState sub-structureDavid Gibson2017-07-171-10/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the time, the state of a DRC object is contained in the single 'state' variable. However, during the transition from UNISOLATE to CONFIGURED state requires multiple calls to the ibm,configure-connector RTAS call to retrieve the device tree for the attached device. We need some extra state to keep track of where we're up to in delivering the device tree information to the guest. Currently that extra state is in a sPAPRConfigureConnectorState substructure which is only allocated when we're in the middle of the configure connector process. That sounds like a good idea, but the extra state is only two integers - on many platforms that will take up the same room as the (maybe NULL) ccs pointer even before malloc() overhead. Plus it's another object whose lifetime we need to manage. In short, it's not worth it. So, fold the sPAPRConfigureConnectorState substructure directly into the DRC object. Previously the structure was allocated lazily when the configure-connector call discovers it's not there. Now, we need to initialize the subfields pre-emptively, as soon as we enter UNISOLATE state. Although it's not strictly necessary (the field values should only ever be consulted when in UNISOLATE state), we try to keep them at -1 when in other states, as a debugging aid. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Consolidate DRC state variablesDavid Gibson2017-07-171-4/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each DRC has three fields describing its state: isolation_state, allocation_state and configured. At first this seems like a reasonable representation, since its based directly on the PAPR defined isolation-state and allocation-state indicators. However: * Only a few combinations of the two fields' values are permitted * allocation_state isn't used at all for physical DRCs * The indicators are write only so they don't really have a well defined current value independent of each other This replaces these variables with a single state variable, whose names and numbers are based on the diagram in LoPAPR section 13.4. Along with this we add code to check the current state on various operations and make sure the requested transition is permitted. Strictly speaking, this makes guest visible changes to behaviour (since we probably allowed some transitions we shouldn't have before). However, a hypothetical guest broken by that wasn't PAPR compliant, and probably wouldn't have worked under PowerVM. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Cleanups relating to DRC awaiting_release fieldDavid Gibson2017-07-171-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'awaiting_release' indicates that the host has requested an unplug of the device attached to the DRC, but the guest has not (yet) put the device into a state where it is safe to complete removal. 1. Rename it to 'unplug_requested' which to me at least is clearer 2. Remove the ->release_pending() method used to check this from outside spapr_drc.c. The method only plausibly has one implementation, so use a plain function (spapr_drc_unplug_requested()) instead. 3. Remove it from the migration stream. Attempting to migrate mid-unplug is broken not just for spapr - in general management has no good way to determine if the device should be present on the destination or not. So, until that's fixed, there's no point adding extra things to the stream. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Refactor spapr_drc_detach()David Gibson2017-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This function has two unused parameters - remove them. It also sets awaiting_release on all paths, except one. On that path setting it is harmless, since it will be immediately cleared by spapr_drc_release(). So factor it out of the if statements. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Remove 'awaiting_allocation' DRC flagDavid Gibson2017-07-171-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The awaiting_allocation flag in the DRC was introduced by aab9913 "spapr_drc: Prevent detach racing against attach for CPU DR", allegedly to prevent a guest crash on racing attach and detach. Except.. information from the BZ actually suggests a qemu crash, not a guest crash. And there shouldn't be a problem here anyway: if the guest has already moved the DRC away from UNUSABLE state, the detach would already be deferred, and if it hadn't it should be safe to detach it (the guest should fail gracefully when it attempts to change the allocation state). I think this was probably just a bandaid for some other problem in the state management. So, remove awaiting_allocation and associated code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Treat devices added before inbound migration as coldpluggedLaurent Vivier2017-07-171-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When migrating a guest which has already had devices hotplugged, libvirt typically starts the destination qemu with -incoming defer, adds those hotplugged devices with qmp, then initiates the incoming migration. This causes problems for the management of spapr DRC state. Because the device is treated as hotplugged, it goes into a DRC state for a device immediately after it's plugged, but before the guest has acknowledged its presence. However, chances are the guest on the source machine *has* acknowledged the device's presence and configured it. If the source has fully configured the device, then DRC state won't be sent in the migration stream: for maximum migration compatibility with earlier versions we don't migrate DRCs in coldplug-equivalent state. That means that the DRC effectively changes state over the migrate, causing problems later on. In addition, logging hotplug events for these devices isn't what we want because a) those events should already have been issued on the source host and b) the event queue should get wiped out by the incoming state anyway. In short, what we really want is to treat devices added before an incoming migration as if they were coldplugged. To do this, we first add a spapr_drc_hotplugged() helper which determines if the device is hotplugged in the sense relevant for DRC state management. We only send hotplug events when this is true. Second, when we add a device which isn't hotplugged in this sense, we force a reset of the DRC state - this ensures the DRC is in a coldplug-equivalent state (there isn't usually a system reset between these device adds and the incoming migration). This is based on an earlier patch by Laurent Vivier, cleaned up and extended. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
| * spapr: Minor cleanups to events handlingDavid Gibson2017-07-171-5/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rtas_error_log structure is marked packed, which strongly suggests its precise layout is important to match an external interface. Along with that one could expect it to have a fixed endianness to match the same interface. That used to be the case - matching the layout of PAPR RTAS event format and requiring BE fields. Now, however, it's only used embedded within sPAPREventLogEntry with the fields in native order, since they're processed internally. Clear that up by removing the nested structure in sPAPREventLogEntry. struct rtas_error_log is moved back to spapr_events.c where it is used as a temporary to help convert the fields in sPAPREventLogEntry to the correct in memory format when delivering an event to the guest. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
| * spapr: migrate pending_events of spapr stateDaniel Henrique Barboza2017-07-171-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In racing situations between hotplug events and migration operation, a rtas hotplug event could have not yet be delivered to the source guest when migration is started. In this case the pending_events of spapr state need be transmitted to the target so that the hotplug event can be finished on the target. To achieve the minimal VMSD possible to migrate the pending_events list, this patch makes the changes in spapr_events.c: - 'log_type' of sPAPREventLogEntry struct deleted. This information can be derived by inspecting the rtas_error_log summary field. A new function called 'spapr_event_log_entry_type' was added to retrieve the type of a given sPAPREventLogEntry. - sPAPREventLogEntry, epow_log_full and hp_log_full were redesigned. The only data we're going to migrate in the VMSD is the event log data itself, which can be divided in two parts: a rtas_error_log header and an extended event log field. The rtas_error_log header contains information about the size of the extended log field, which can be used inside VMSD as the size parameter of the VBUFFER_ALOC field that will store it. To allow this use, the header.extended_length field must be exposed inline to the VMSD instead of embedded into a 'data' field that holds everything. With this in mind, the following changes were done: * a new 'header' field was added to sPAPREventLogEntry. This field holds a a struct rtas_error_log inline. * the declaration of the 'rtas_error_log' struct was moved to spapr.h to be visible to the VMSD macros. * 'data' field of sPAPREventLogEntry was renamed to 'extended_log' and now holds only the contents of the extended event log. * 'struct rtas_error_log hdr' were taken away from both epow_log_full and hp_log_full. This information is now available at the header field of sPAPREventLogEntry. * epow_log_full and hp_log_full were renamed to epow_extended_log and hp_extended_log respectively. This rename makes it clearer to understand the new purpose of both structures: hold the information of an extended event log field. * spapr_powerdown_req and spapr_hotplug_req_event now creates a sPAPREventLogEntry structure that contains the full rtas log entry. * rtas_event_log_queue and rtas_event_log_dequeue now receives a sPAPREventLogEntry pointer as a parameter instead of a void pointer. - the endianess of the sPAPREventLogEntry header is now native instead of be32. We can use the fields in native endianess internally and write them in be32 in the guest physical memory inside 'check_exception'. This allows the VMSD inside spapr.c to read the correct size of the entended_log field. - inside spapr.c, pending_events is put in a subsection in the spapr state VMSD to make sure migration across different versions is not broken. A small change in rtas_event_log_queue and rtas_event_log_dequeue were also made: instead of calling qdev_get_machine(), both functions now receive a pointer to the sPAPRMachineState. This pointer is already available in the callers of these functions and we don't need to waste resources calling qdev() again. Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* | Merge remote-tracking branch ↵Peter Maydell2017-07-172-1/+19
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/famz/tags/block-and-testing-pull-request' into staging # gpg: Signature made Mon 17 Jul 2017 04:47:05 BST # gpg: using RSA key 0xCA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/block-and-testing-pull-request: travis: add no-TCG build docker.py: Improve subprocess exit code handling docker.py: Drop infile parameter docker: Don't enable networking as a side-effect of DEBUG=1 ssh: support I/O from any AioContext sheepdog: add queue_lock qed: protect table cache with CoMutex qed: introduce bdrv_qed_init_state block: invoke .bdrv_drain callback in coroutine context and from AioContext qed: move tail of qed_aio_write_main to qed_aio_write_{cow, alloc} vvfat: make it thread-safe vpc: make it thread-safe vdi: make it thread-safe coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade qcow2: call CoQueue APIs under CoMutex Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | block: invoke .bdrv_drain callback in coroutine context and from AioContextPaolo Bonzini2017-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will let the callback take a CoMutex in the next patch. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-8-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
| * | coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgradePaolo Bonzini2017-07-171-0/+18
| |/ | | | | | | | | | | | | | | | | | | | | | | | | These functions are more efficient in the presence of contention. qemu_co_rwlock_downgrade also guarantees not to block, which may be useful in some algorithms too. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-3-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>
* | memory.h: Add memory_region_init_{ram, rom, rom_device}() handling migrationPeter Maydell2017-07-142-1/+90
| | | | | | | | | | | | | | | | | | | | | | | | Add new utility functions which both initialize a RAM MemoryRegion and arrange for its contents to be migrated; we give thes the memory_region_init_ram(), memory_region_init_rom() and memory_region_init_rom_device() names that we just freed up by renaming the old implementations to _nomigrate(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-6-git-send-email-peter.maydell@linaro.org
* | memory: Rename memory_region_init_rom() and _rom_device() to _nomigrate()Peter Maydell2017-07-141-16/+24
| | | | | | | | | | | | | | | | | | | | Rename memory_region_init_rom() to memory_region_init_rom_nomigrate() and memory_region_init_rom_device() to memory_region_init_rom_device_nomigrate(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-5-git-send-email-peter.maydell@linaro.org
* | memory: Rename memory_region_init_ram() to memory_region_init_ram_nomigrate()Peter Maydell2017-07-141-7/+8
| | | | | | | | | | | | | | | | | | | | Rename memory_region_init_ram() to memory_region_init_ram_nomigrate(). This leaves the way clear for us to provide a memory_region_init_ram() which does handle migration. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-4-git-send-email-peter.maydell@linaro.org
* | memory: Document that the RAM MR initializers do not handle migrationPeter Maydell2017-07-141-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | The various functions for initializing RAM MemoryRegions do not do anything to cause the data in the MemoryRegion to be migrated. Note in their documentation comments that this is the responsibility of the caller. (We will shortly add a new function that *does* do this for you.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-3-git-send-email-peter.maydell@linaro.org
* | include/hw/boards.h: Document memory_region_allocate_system_memory()Peter Maydell2017-07-141-0/+28
|/ | | | | | | | | | | Add a documentation comment for memory_region_allocate_system_memory(). In particular, the reason for this function's existence and the requirement on board code to call it exactly once are non-obvious. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1499438577-7674-2-git-send-email-peter.maydell@linaro.org
* Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20170714' into ↵Peter Maydell2017-07-149-18/+137
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging s390x/kvm/migration/cpumodel: fixes, enhancements and cleanups - add a network boot rom for s390 (Thomas Huth) - migration of storage attributes like the CMMA used/unused state - PCI related enhancements - full support for aen, ais and zpci - migration support for css with vmstates (Halil Pasic) - cpu model enhancements for cpu features - guarded storage support # gpg: Signature made Fri 14 Jul 2017 11:33:04 BST # gpg: using RSA key 0x117BBC80B5A61C7C # gpg: Good signature from "Christian Borntraeger (IBM) <borntraeger@de.ibm.com>" # Primary key fingerprint: F922 9381 A334 08F9 DBAB FBCA 117B BC80 B5A6 1C7C * remotes/borntraeger/tags/s390x-20170714: (40 commits) s390x/gdb: add gs registers s390x/arch_dump: also dump guarded storage control block s390x/kvm: enable guarded storage s390x/kvm: Enable KSS facility for nested virtualization s390x/cpumodel: add esop/esop2 to z12 model s390x/cpumodel: we are always in zarchitecture mode s390x/cpumodel: wire up new hardware features s390x/flic: migrate ais states s390x/cpumodel: add zpci, aen and ais facilities s390x: initialize cpu firstly pc-bios/s390: rebuild s390-ccw.img pc-bios/s390: add s390-netboot.img pc-bios/s390-ccw: Link libnet into the netboot image and do the TFTP load pc-bios/s390-ccw: Add virtio-net driver code pc-bios/s390-ccw: Add core files for the network bootloading program roms/SLOF: Update submodule to latest status pc-bios/s390-ccw: Add code for virtio feature negotiation pc-bios/s390-ccw: Remove unused structs from virtio.h pc-bios/s390-ccw: Move byteswap functions to a separate header pc-bios/s390-ccw: Add a write() function for stdio ... Conflicts: target/s390x/kvm.c Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * s390x/arch_dump: also dump guarded storage control blockChristian Borntraeger2017-07-141-0/+1
| | | | | | | | | | | | Write the new note section of type 30b (guarded storage control block). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/kvm: enable guarded storageFan Zhang2017-07-141-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Introduce guarded storage support for KVM guests on s390. We need to enable the capability, extend machine check validity, sigp store-additional-status-at-address, and migration. The feature is fenced for older machine type versions. Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/cpumodel: wire up new hardware featuresJason J. Herne2017-07-141-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | Some new guest features have been introduced recently. Let's wire them up in the CPU model. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [split patch]
| * s390x/flic: migrate ais statesYi Min Zhao2017-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | During migration we should transfer ais states to the target guest. This patch introduces a subsection to kvm_s390_flic_vmstate and new vmsd for qemu_flic. The ais states need to be migrated only when ais is supported. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
| * s390x/css: use SubchDev.orbHalil Pasic2017-07-141-3/+3
| | | | | | | | | | | | | | | | | | | | | | Instead of passing around a pointer to ORB let us simplify some function signatures by using the previously introduced ORB saved at the subchannel (SubchDev). Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Message-Id: <20170711145441.33925-7-pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/css: activate ChannelSubSys migrationHalil Pasic2017-07-141-0/+4
| | | | | | | | | | | | | | | | | | | | Turn on migration for the channel subsystem for the next machine. For legacy machines we still have to do things the old way. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Message-Id: <20170711145441.33925-6-pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/css: add ORB to SubchDevHalil Pasic2017-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since we are going to need a migration compatibility breaking change to activate ChannelSubSys migration let us use the opportunity to introduce ORB to the SubchDev before that (otherwise we would need separate handling e.g. a compat property). The ORB will be useful for implementing IDA, or async handling of subchannel work. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Reviewed-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Message-Id: <20170711145441.33925-5-pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x: add css_migration_enabled to machine classHalil Pasic2017-07-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the migration of the channel subsystem (css) is only partial and is done by the virtio ccw proxies -- the only migratable css devices existing at the moment. With the current work on emulated and passthrough devices we need to decouple the migration of the channel subsystem state from virtio ccw, and have a separate section for it. A new section however necessarily breaks the migration compatibility. So let us introduce a switch at the machine class, and put it in 'off' state for now. We will turn the switch 'on' for future machines once all preparations are met. For compatibility machines the switch will stay 'off'. Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Message-Id: <20170711145441.33925-3-pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/css: update css_adapter_interruptYi Min Zhao2017-07-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | Let's use the new inject_airq callback of flic to inject adapter interrupts. For kvm case, if the kernel flic doesn't support the new interface, the irq routine remains unchanged. For non-kvm case, qemu-flic handles the suppression process. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/sic: realize SIC handlingFei Li2017-07-141-0/+2
| | | | | | | | | | | | | | | | | | | | Currently, we do nothing for the SIC instruction, but we need to implement it properly. Let's add proper handling in the backend code. Co-authored-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/flic: introduce inject_airq callbackYi Min Zhao2017-07-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's introduce a specialized way to inject adapter interrupts that, unlike the common interrupt injection method, allows to take the characteristics of the adapter into account. For adapters subject to AIS facility: - for non-kvm case, we handle the suppression for a given ISC in QEMU. - for kvm case, we pass adapter id to kvm to do airq injection. Add add tracepoint for suppressed airq and suppressing airq. Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/flic: introduce modify_ais_mode callbackFei Li2017-07-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to emulate the adapter interruption suppression (AIS) facility properly, the guest needs to be able to modify the AIS mask. Interrupt suppression will be handled via the flic (for kvm, via a recently introduced kernel backend; for !kvm, in the flic code), so let's introduce a method to change the mode via the flic interface. We introduce the 'simm' and 'nimm' fields to QEMUS390FLICState to store interruption modes for each ISC. Each bit in 'simm' and 'nimm' targets one ISC, and collaboratively indicate three modes: ALL-Interruptions, SINGLE-Interruption and NO-Interruptions. This interface can initiate most transitions between the states; transition from SINGLE-Interruption to NO-Interruptions via adapter interrupt injection will be introduced in a following patch. The meaningful combinations are as follows: interruption mode | simm bit | nimm bit ------------------|----------|---------- ALL | 0 | 0 SINGLE | 1 | 0 NO | 1 | 1 Co-authored-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com> Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x: add flags field for registering I/O adapterFei Li2017-07-142-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | Introduce a new 'flags' field to IoAdapter to contain further characteristics of the adapter, like whether the adapter is subject to adapter-interruption suppression. For the kvm case, pass this value in the 'flags' field when registering an adapter. Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/migration: Monitor commands for storage attributesClaudio Imbrenda2017-07-141-0/+4
| | | | | | | | | | | | | | | | | | | | Add an "info" monitor command to non-destructively inspect the state of the storage attributes of the guest, and a normal command to toggle migration mode (useful for debugging). Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * s390x/migration: Storage attributes deviceClaudio Imbrenda2017-07-141-0/+77
| | | | | | | | | | | | | | Storage attributes device, like we have for storage keys. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| * linux-headers: update to 4.13-rc0Christian Borntraeger2017-07-143-9/+14
| | | | | | | | | | | | | | | | | | | | commit af3c8d98508d37541d4bf57f13a984a7f73a328c Merge tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux There is a change pending for v4.13-rc1 in linux-headers/linux/kvm.h I will submit a fixup patch for 2.10 as soon as it hits the kernel. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
* | Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell2017-07-1416-89/+214
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * gdbstub fixes (Alex) * IOMMU MemoryRegion subclass (Alexey) * Chardev hotswap (Anton) * NBD_OPT_GO support (Eric) * Misc bugfixes * DEFINE_PROP_LINK (minus the ARM patches - Fam) * MAINTAINERS updates (Philippe) # gpg: Signature made Fri 14 Jul 2017 11:06:27 BST # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (55 commits) spapr_rng: Convert to DEFINE_PROP_LINK cpu: Convert to DEFINE_PROP_LINK mips_cmgcr: Convert to DEFINE_PROP_LINK ivshmem: Convert to DEFINE_PROP_LINK dimm: Convert to DEFINE_PROP_LINK virtio-crypto: Convert to DEFINE_PROP_LINK virtio-rng: Convert to DEFINE_PROP_LINK virtio-scsi: Convert to DEFINE_PROP_LINK virtio-blk: Convert to DEFINE_PROP_LINK qdev: Add const qualifier to PropertyInfo definitions qmp: Use ObjectProperty.type if present qdev: Introduce DEFINE_PROP_LINK qdev: Introduce PropertyInfo.create qom: enforce readonly nature of link's check callback translate-all: remove redundant !tcg_enabled check in dump_exec_info vl: fix breakage of -tb-size nbd: Implement NBD_INFO_BLOCK_SIZE on client nbd: Implement NBD_INFO_BLOCK_SIZE on server nbd: Implement NBD_OPT_GO on client nbd: Implement NBD_OPT_GO on server ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | cpu: Convert to DEFINE_PROP_LINKFam Zheng2017-07-141-0/+1
| | | | | | | | | | | | | | | | | | Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170714021509.23681-20-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | qdev: Add const qualifier to PropertyInfo definitionsFam Zheng2017-07-143-30/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | The remaining non-const ones are in e1000e which modifies description at runtime. They can be addressed separatedly. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170714021509.23681-6-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | qdev: Introduce DEFINE_PROP_LINKFam Zheng2017-07-142-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | This property can be used to replace the object_property_add_link in device code, to add a link to other objects, which is a common pattern. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170714021509.23681-4-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | qdev: Introduce PropertyInfo.createFam Zheng2017-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows property implementation to provide a specialized property creation method. Update conditions guarding property types accordingly. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170714021509.23681-3-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | qom: enforce readonly nature of link's check callbackIgor Mammedov2017-07-142-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | link's check callback is supposed to verify/permit setting it, however currently nothing restricts it from misusing it and modifying target object from within. Make sure that readonly semantics are checked by compiler to prevent callback's misuse. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170714021509.23681-2-famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | nbd: Implement NBD_INFO_BLOCK_SIZE on clientEric Blake2017-07-141-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The upstream NBD Protocol has defined a new extension to allow the server to advertise block sizes to the client, as well as a way for the client to inform the server whether it intends to obey block sizes. When using the block layer as the client, we will obey block sizes; but when used as 'qemu-nbd -c' to hand off to the kernel nbd module as the client, we are still waiting for the kernel to implement a way for us to learn if it will honor block sizes (perhaps by an addition to sysfs, rather than an ioctl), as well as any way to tell the kernel what additional block sizes to obey (NBD_SET_BLKSIZE appears to be accurate for the minimum size, but preferred and maximum sizes would probably be new ioctl()s), so until then, we need to make our request for block sizes conditional. When using ioctl(NBD_SET_BLKSIZE) to hand off to the kernel, use the minimum block size as the sector size if it is larger than 512, which also has the nice effect of cooperating with (non-qemu) servers that don't do read-modify-write when exposing a block device with 4k sectors; it might also allow us to visit a file larger than 2T on a 32-bit kernel. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-10-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | nbd: Expose and debug more NBD constantsEric Blake2017-07-141-9/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NBD protocol has several constants defined in various extensions that we are about to implement. Expose them to the code, along with an easy way to map various constants to strings during diagnostic messages. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-4-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | nbd: Create struct for tracking export infoEric Blake2017-07-141-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NBD Protocol is introducing some additional information about exports, such as minimum request size and alignment, as well as an advertised maximum request size. It will be easier to feed this information back to the block layer if we gather all the information into a struct, rather than adding yet more pointer parameters during negotiation. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20170707203049.534-2-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | memory/iommu: introduce IOMMUMemoryRegionClassAlexey Kardashevskiy2017-07-143-11/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This finishes QOM'fication of IOMMUMemoryRegion by introducing a IOMMUMemoryRegionClass. This also provides a fastpath analog for IOMMU_MEMORY_REGION_GET_CLASS(). This makes IOMMUMemoryRegion an abstract class. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170711035620.4232-3-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | memory/iommu: QOM'fy IOMMU MemoryRegionAlexey Kardashevskiy2017-07-146-24/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This defines new QOM object - IOMMUMemoryRegion - with MemoryRegion as a parent. This moves IOMMU-related fields from MR to IOMMU MR. However to avoid dymanic QOM casting in fast path (address_space_translate, etc), this adds an @is_iommu boolean flag to MR and provides new helper to do simple cast to IOMMU MR - memory_region_get_iommu. The flag is set in the instance init callback. This defines memory_region_is_iommu as memory_region_get_iommu()!=NULL. This switches MemoryRegion to IOMMUMemoryRegion in most places except the ones where MemoryRegion may be an alias. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20170711035620.4232-2-aik@ozlabs.ru> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>