path: root/hw/i386
Commit message | Author | Age | Files | Lines
* x86: do not re-randomize RNG seed on snapshot load | Jason A. Donenfeld | 2022-10-27 | 1 | -1/+1
  Snapshot loading is supposed to be deterministic, so we shouldn't re-randomize the various seeds used.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-id: 20221025004327.568476-4-Jason@zx2c4.com
  Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* reset: allow registering handlers that aren't called by snapshot loading | Jason A. Donenfeld | 2022-10-27 | 2 | -5/+5
  Snapshot loading only expects to call deterministic handlers, not non-deterministic ones. So introduce a way of registering handlers that won't be called when resetting for snapshots.

  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-id: 20221025004327.568476-2-Jason@zx2c4.com
  [PMM: updated json doc comment with Markus' text; fixed checkpatch style nit]
  Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
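The idea of handlers that are skipped during snapshot loading can be sketched as follows. This is a minimal illustration only: the names `ResetEntry`, `register_reset`, `run_resets` and the fixed-size table are hypothetical and are not QEMU's actual reset API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not QEMU's reset API. */
typedef void ResetFn(void *opaque);

typedef struct ResetEntry {
    ResetFn *fn;
    void *opaque;
    bool skip_on_snapshot_load; /* true for non-deterministic handlers */
} ResetEntry;

#define MAX_RESET_ENTRIES 16
ResetEntry reset_entries[MAX_RESET_ENTRIES];
size_t reset_entry_count;

/* Register a handler; returns false if the table is full. */
bool register_reset(ResetFn *fn, void *opaque, bool skip_on_snapshot_load)
{
    assert(fn != NULL);
    if (reset_entry_count == MAX_RESET_ENTRIES) {
        return false;
    }
    reset_entries[reset_entry_count++] =
        (ResetEntry){ fn, opaque, skip_on_snapshot_load };
    return true;
}

/* Run handlers in registration order. When the reset is part of a
 * snapshot load, handlers flagged as non-deterministic are skipped. */
void run_resets(bool snapshot_load)
{
    for (size_t i = 0; i < reset_entry_count; i++) {
        if (snapshot_load && reset_entries[i].skip_on_snapshot_load) {
            continue;
        }
        reset_entries[i].fn(reset_entries[i].opaque);
    }
}

/* Example handler: counts how often it ran. */
void count_reset(void *opaque)
{
    ++*(int *)opaque;
}
```

A handler that re-randomizes a seed would be registered with the flag set, so a snapshot load leaves the restored seed untouched while an ordinary reboot still refreshes it.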
* hyperv: fix SynIC SINT assertion failure on guest reset | Maciej S. Szmigiero | 2022-10-18 | 2 | -6/+3
  Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU assertion failure:

    hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed.

  This happens both on normal guest reboot and when using the "system_reset" HMP command.

  The failing assertion was introduced by commit 64ddecc88bcf ("hyperv: SControl is optional to enable SynIc") to catch dangling SINT routes on SynIC reset.

  The root cause of this problem is that the SynIC itself is reset before devices using SINT routes have a chance to clean up these routes.

  Since there seems to be no existing mechanism to force reset callbacks (or methods) to be executed in a specific order, let's use a method similar to the one already used to reset another interrupt controller (APIC) after devices have been reset: invoke the SynIC reset from the machine reset handler via a new x86_cpu_after_reset() function co-located with the existing x86_cpu_reset() in target/i386/cpu.c. Opportunistically move the APIC reset handler there, too.

  Fixes: 64ddecc88bcf ("hyperv: SControl is optional to enable SynIc") # exposed the bug
  Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
  Message-Id: <cb57cee2e29b20d06f81dce054cbcea8b5d497e8.1664552976.git.maciej.szmigiero@oracle.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: pci: acpi: consolidate PCI slots creation | Igor Mammedov | 2022-10-09 | 1 | -57/+54
  No functional changes nor AML bytecode changes. Consolidate code that generates empty and populated slot descriptors. Besides eliminating duplication, it helps consolidate conditions for generating parts of the Device{} descriptor in one place, which makes the code more compact and easier to read.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-18-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* x86: pci: acpi: reorder Device's _DSM method | Igor Mammedov | 2022-10-09 | 1 | -3/+1
  Align the _DSM method in the empty slot descriptor with its position in a populated slot.

  Expected change:

    +    Device (SE8)
    +    {
    +        Name (_ADR, 0x001D0000)  // _ADR: Address
    +        Name (ASUN, 0x1D)
             Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
             {
                 Local0 = Package (0x02)
                     {
                         BSEL,
                         ASUN
                     }
                 Return (PDSM (Arg0, Arg1, Arg2, Arg3, Local0))
             }
    -    }
    -    Device (SE8)
    -    {
    -        Name (_ADR, 0x001D0000)  // _ADR: Address
    -        Name (ASUN, 0x1D)
             Name (_SUN, 0x1D)  // _SUN: Slot User Number
             Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device
             {
                 PCEJ (BSEL, _SUN)
             }
    +    }

  i.e. put _DSM right after ASUN, with _SUN/_EJ0 following it. That will eliminate contextual changes (causing test failures) when follow-up patches merge the code generating populated and empty slot descriptors.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-16-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* x86: pci: acpi: reorder Device's _ADR and _SUN fields | Igor Mammedov | 2022-10-09 | 1 | -1/+1
  No functional change. Align the order of fields in the empty slot descriptor with the ordering in a populated slot.

  Expected diff:

    -    Name (_SUN, 0x0X)  // _SUN: Slot User Number
         Name (_ADR, 0xY)  // _ADR: Address
         ...
    +    Name (_SUN, 0xX)  // _SUN: Slot User Number

  That will eliminate contextual changes (causing test failures) when follow-up patches merge the code generating populated and empty slot descriptors.

  Put the mandatory _ADR as the 1st field, then ASUN as it can be present for both populated and empty slots, and only then _SUN, which is present only when the slot is hotpluggable.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-13-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* x86: acpi: cleanup PCI device _DSM duplication | Igor Mammedov | 2022-10-09 | 1 | -29/+27
  Add an ASUN variable to hotpluggable slots and use it instead of _SUN, which has the same value, to reuse the _DSM code on both branches (hot- and non-hotpluggable). No functional change.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-10-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* x86: acpi: _DSM: use Package to pass parameters | Igor Mammedov | 2022-10-09 | 1 | -13/+27
  The number of arguments that can be passed to a method is limited in ACPI. The following patches will need to pass more parameters to the PDSM method and will hit that limit. Prepare for this by passing a structure (Package) to the method, which lets us work around the argument limit. Pass all standard _DSM arguments to PDSM as is, and pack custom parameters into a Package that is passed as the last argument to PDSM.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-7-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* acpi: x86: refactor PDSM method to reduce nesting | Igor Mammedov | 2022-10-09 | 1 | -62/+77
  This will help with code readability and make it easier to extend the method in follow-up patches.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-6-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* acpi: x86: deduplicate HPET AML building | Igor Mammedov | 2022-10-09 | 1 | -6/+4
  HPET AML depends on neither piix4 nor q35, so move the code building it to the common scope to avoid duplication.

  Signed-off-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220701133515.137890-3-imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Revert "intel_iommu: Fix irqchip / X2APIC configuration checks" | Peter Xu | 2022-10-09 | 1 | -0/+5
  It's true that when vcpus<=255 we don't require 32-bit APIC IDs. However, since we already have EIM=ON here, the hypervisor will declare the VM as x2apic-capable (e.g. the VT-d ECAP register will have EIM bit 4 set), so the guest should assume the APIC IDs are 32 bits wide even if vcpus<=255.

  In short, commit 77250171bdc breaks any simple cmdline that wants to boot a VM with >=9 but <=255 vcpus with:

    -device intel-iommu,intremap=on

  For anyone who does not want to enable x2apic, we can use eim=off in the intel-iommu parameters to skip enabling KVM x2apic.

  This partly reverts commit 77250171bdc02aee106083fd2a068147befa1a38, keeping the valid part that checks for split irqchip but reverting the other change.

  One thing to mention is that this patch may break migration compatibility of such a VM. However, that's probably the best thing we can do, because the old behavior was simply wrong and not working for >8 vcpus. For <=8 vcpus, there could be a light guest ABI change (by enabling KVM x2apic after this patch), but logically it shouldn't prevent migration from working.

  Also, this is not the 1st commit to change x2apic behavior. Igor provided a full history of how this evolved over the past few years:

    https://lore.kernel.org/qemu-devel/20220922154617.57d1a1fb@redhat.com/

  Relevant commits for reference:

    fb506e701e ("intel_iommu: reject broken EIM", 2016-10-17)
    c1bb5418e3 ("target/i386: Support up to 32768 CPUs without IRQ remapping", 2020-12-10)
    77250171bd ("intel_iommu: Fix irqchip / X2APIC configuration checks", 2022-05-16)
    dc89f32d92 ("target/i386: Fix sanity check on max APIC ID / X2APIC enablement", 2022-05-16)

  We may want to have this for stable too (mostly for 7.1.0 only). Adding a fixes tag.

  Cc: David Woodhouse <dwmw2@infradead.org>
  Cc: Claudio Fontana <cfontana@suse.de>
  Cc: Igor Mammedov <imammedo@redhat.com>
  Fixes: 77250171bd ("intel_iommu: Fix irqchip / X2APIC configuration checks")
  Signed-off-by: Peter Xu <peterx@redhat.com>
  Message-Id: <20220926153206.10881-1-peterx@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* x86: re-initialize RNG seed when selecting kernel | Jason A. Donenfeld | 2022-10-01 | 1 | -1/+4
  We don't want it to be possible to re-read the RNG seed after ingesting it, because this ruins forward secrecy. Currently, however, the setup data section can just be re-read. Since the kernel is always read after the setup data, use the selection of the kernel as a trigger to re-initialize the RNG seed, just like we do on reboot, to preserve forward secrecy.

  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220922152847.3670513-1-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: re-enable rng seeding via SetupData | Jason A. Donenfeld | 2022-09-27 | 3 | -3/+5
  This reverts 3824e25db1 ("x86: disable rng seeding via setup_data"), but for 7.2 rather than 7.1, now that modifying setup_data is safe to do.

  Cc: Laurent Vivier <laurent@vivier.eu>
  Cc: Michael S. Tsirkin <mst@redhat.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: Peter Maydell <peter.maydell@linaro.org>
  Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Cc: Richard Henderson <richard.henderson@linaro.org>
  Cc: Ard Biesheuvel <ardb@kernel.org>
  Acked-by: Gerd Hoffmann <kraxel@redhat.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220921093134.2936487-4-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: reinitialize RNG seed on system reboot | Jason A. Donenfeld | 2022-09-27 | 1 | -0/+7
  Since this is read from fw_cfg on each boot, the kernel zeroing it out alone is insufficient to prevent it from being used twice. And indeed on reboot we always want a new seed, not the old one. So re-fill it in this circumstance.

  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220921093134.2936487-3-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: use typedef for SetupData struct | Jason A. Donenfeld | 2022-09-27 | 1 | -7/+7
  The preferred style is SetupData as a typedef, not setup_data as a plain struct.

  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: Ard Biesheuvel <ardb@kernel.org>
  Suggested-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220921093134.2936487-2-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* x86: return modified setup_data only if read as memory, not as file | Jason A. Donenfeld | 2022-09-27 | 1 | -10/+36
  If setup_data is being read into a specific memory location, then generally the setup_data address parameter is read first, so that the caller knows where to read it into. In that case, we should return setup_data containing the absolute addresses that are hard coded and determined a priori. This is the case when kernels are loaded by BIOS, for example. In contrast, when setup_data is read as a file, then we shouldn't modify setup_data, since the absolute address will be wrong by definition. This is the case when OVMF loads the image.

  This allows setup_data to be used like normal, without crashing when EFI tries to use it.

  (As a small development note, strangely, fw_cfg_add_file_callback() was exported but fw_cfg_add_bytes_callback() wasn't, so this makes that consistent.)

  Cc: Gerd Hoffmann <kraxel@redhat.com>
  Cc: Laurent Vivier <laurent@vivier.eu>
  Cc: Michael S. Tsirkin <mst@redhat.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: Peter Maydell <peter.maydell@linaro.org>
  Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Cc: Richard Henderson <richard.henderson@linaro.org>
  Suggested-by: Ard Biesheuvel <ardb@kernel.org>
  Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220921093134.2936487-1-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
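The distinction between the two read paths can be sketched as below. This is an illustration under assumptions, not the QEMU implementation: it assumes the value being fixed up is the 64-bit setup_data pointer of the Linux/x86 boot header (offset 0x250 in the boot protocol), and `patch_setup_data_ptr` is a hypothetical hook that only the "read as memory" path would call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the 64-bit setup_data pointer in the Linux/x86 boot header. */
#define SETUP_DATA_PTR_OFFSET 0x250

/* Store a 64-bit value in little-endian order, whatever the host is. */
void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Load a little-endian 64-bit value. */
uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Hypothetical hook for the "read as memory" path: invoked once the
 * guest has read the load address, it patches the absolute guest
 * address into the header. The "read as file" path never calls it, so
 * a firmware that loads the image as a file sees it unmodified. */
void patch_setup_data_ptr(uint8_t *setup, size_t setup_len,
                          uint64_t abs_addr)
{
    assert(setup_len >= SETUP_DATA_PTR_OFFSET + 8);
    store_le64(setup + SETUP_DATA_PTR_OFFSET, abs_addr);
}
```

Keeping the patch step behind a read-triggered hook means the same backing buffer can serve both consumers without either one seeing an address that is wrong for it.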
* hw/i386/multiboot: Avoid dynamic stack allocation | Philippe Mathieu-Daudé | 2022-09-22 | 1 | -3/+2
  Use autofree heap allocation instead of a variable-length array on the stack. Replace the snprintf() call with g_strdup_printf().

  Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  Message-id: 20220819153931.3147384-9-peter.maydell@linaro.org
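The pattern of replacing a stack VLA plus snprintf() with a heap-allocated formatted string can be sketched in plain C. The helper `strdup_printf` below is a hypothetical stand-in for GLib's g_strdup_printf(), written with the standard library so the example is self-contained, and `build_cmdline` is an invented caller, not the multiboot code itself.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for g_strdup_printf(): format into a freshly
 * allocated string of exactly the needed size. The caller frees it. */
char *strdup_printf(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);
    int n = vsnprintf(NULL, 0, fmt, ap); /* measure the output length */
    va_end(ap);
    if (n < 0) {
        va_end(ap2);
        return NULL;
    }
    char *s = malloc((size_t)n + 1);
    if (s != NULL) {
        vsnprintf(s, (size_t)n + 1, fmt, ap2);
    }
    va_end(ap2);
    return s;
}

/* Before (sketch): char buf[strlen(kernel) + strlen(args) + 2];
 *                  snprintf(buf, sizeof(buf), "%s %s", kernel, args);
 * After: no stack usage that depends on input length. */
char *build_cmdline(const char *kernel, const char *args)
{
    return strdup_printf("%s %s", kernel, args);
}
```

With GLib, declaring the result as `g_autofree char *` would additionally release it automatically when it goes out of scope; the sketch leaves the `free()` to the caller.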
* util: accept iova_tree_remove_parameter by value | Eugenio Pérez | 2022-09-02 | 1 | -3/+3
  It's convenient to call iova_tree_remove with a map returned from iova_tree_find or iova_tree_find_iova. With the current code this is not possible, since we will free it, and then we will try to search for it again.

  Fix it by accepting the map by value, forcing a copy of the argument. Not applying a fixes tag, since there is no use like that at the moment.

  Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
  Signed-off-by: Jason Wang <jasowang@redhat.com>
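Why taking the key by value matters can be shown with a small stand-in container. Everything here (`Map`, `MapSet`, `mapset_find`, `mapset_remove`) is hypothetical and far simpler than the real IOVA tree; it only illustrates that a by-value key stays valid after the element it was copied from has been freed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAPSET_CAP 8

typedef struct Map {
    uint64_t iova;
    uint64_t size; /* must be non-zero */
} Map;

typedef struct MapSet {
    Map *items[MAPSET_CAP]; /* heap-allocated entries owned by the set */
    size_t count;
} MapSet;

bool mapset_insert(MapSet *s, uint64_t iova, uint64_t size)
{
    if (s->count == MAPSET_CAP) {
        return false;
    }
    Map *m = malloc(sizeof(*m));
    if (m == NULL) {
        return false;
    }
    m->iova = iova;
    m->size = size;
    s->items[s->count++] = m;
    return true;
}

/* Return the entry containing iova, or NULL. The pointer aliases
 * storage owned by the set. */
const Map *mapset_find(const MapSet *s, uint64_t iova)
{
    for (size_t i = 0; i < s->count; i++) {
        const Map *m = s->items[i];
        if (iova >= m->iova && iova - m->iova < m->size) {
            return m;
        }
    }
    return NULL;
}

/* Remove every entry overlapping key; returns how many were removed.
 * key is a copy, so passing *mapset_find(...) is safe: a pointer
 * parameter would dangle after the first free() below while the loop
 * still compares against it. */
size_t mapset_remove(MapSet *s, Map key)
{
    size_t removed = 0;
    for (size_t i = 0; i < s->count; ) {
        Map *m = s->items[i];
        if (m->iova < key.iova + key.size && key.iova < m->iova + m->size) {
            free(m);
            s->items[i] = s->items[--s->count];
            removed++;
        } else {
            i++;
        }
    }
    return removed;
}
```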
* hw: Add compat machines for 7.2 | Cornelia Huck | 2022-08-25 | 3 | -2/+28
  Add 7.2 machine types for arm/i440fx/m68k/q35/s390x/spapr.

  Signed-off-by: Cornelia Huck <cohuck@redhat.com>
  Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Message-Id: <20220727121755.395894-1-cohuck@redhat.com>
  [thuth: fixed conflict with pcmc->legacy_no_rng_seed]
  Signed-off-by: Thomas Huth <thuth@redhat.com>
* x86: disable rng seeding via setup_data | Gerd Hoffmann | 2022-08-17 | 3 | -3/+3
  Causes regressions when doing direct kernel boots with OVMF. At this point in the release cycle the only sensible action is to just disable this for 7.1 and sort it properly in the 7.2 devel cycle.

  Cc: Jason A. Donenfeld <Jason@zx2c4.com>
  Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: Richard Henderson <richard.henderson@linaro.org>
  Cc: Eduardo Habkost <eduardo@habkost.net>
  Cc: Peter Maydell <peter.maydell@linaro.org>
  Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Cc: Laurent Vivier <laurent@vivier.eu>
  Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
  Message-Id: <20220817083940.3174933-1-kraxel@redhat.com>
* i386/pc: restrict AMD only enforcing of 1Tb hole to new machine type | Joao Martins | 2022-07-26 | 3 | -2/+6
  The added enforcing is only relevant in the case of AMD, where the range right before 1TB is restricted and cannot be DMA mapped by the kernel, consequently leading to IOMMU INVALID_DEVICE_REQUEST or possibly other kinds of IOMMU events in the AMD IOMMU.

  However, there's a case where it may make sense to disable the IOVA relocation/validation: when migrating from a non-amd-1tb-aware qemu to one that supports it.

  Relocating RAM regions to after the 1Tb hole has consequences for guest ABI because we are changing the memory mapping, so make sure that only new machine types enforce it, not older ones.

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-12-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: relocate 4g start to 1T where applicable | Joao Martins | 2022-07-26 | 1 | -0/+54
  It is assumed that the whole GPA space is available to be DMA addressable, within a given address space limit, except for a tiny region before 4G. Since Linux v5.4, VFIO validates whether the selected GPA is indeed valid, i.e. not reserved by the IOMMU on behalf of some specific devices or platform-defined restrictions, and thus fails the ioctl(VFIO_DMA_MAP) with -EINVAL.

  AMD systems with an IOMMU are examples of such platforms and particularly may only have these ranges as allowed:

    0000000000000000 - 00000000fedfffff (0 .. 3.982G)
    00000000fef00000 - 000000fcffffffff (3.983G .. 1011.9G)
    0000010000000000 - ffffffffffffffff (1Tb .. 16Pb[*])

  We already account for the 4G hole, but if the guest is big enough we will fail to allocate a guest with >1010G due to the ~12G hole at the 1Tb boundary, reserved for HyperTransport (HT).

  [*] there is another reserved region unrelated to HT that exists at the 256T boundary in Fam 17h according to Errata #1286, documented also in "Open-Source Register Reference for AMD Family 17h Processors (PUB)"

  When creating the region above 4G, take into account that on AMD platforms the HyperTransport range is reserved and hence cannot be used as GPAs either. In those cases, rather than establishing the start of ram-above-4g to be 4G, relocate it to 1Tb instead. See AMD IOMMU spec, section 2.1.2 "IOMMU Logical Topology", for more information on the underlying restriction of IOVAs.

  After accounting for the 1Tb hole on AMD hosts, mtree should look like:

    0000000000000000-000000007fffffff (prio 0, i/o): alias ram-below-4g @pc.ram 0000000000000000-000000007fffffff
    0000010000000000-000001ff7fffffff (prio 0, i/o): alias ram-above-4g @pc.ram 0000000080000000-000000ffffffffff

  If the relocation is done or the address space covers it, we also add the HT range to e820 as reserved.

  Default phys-bits on Qemu is TCG_PHYS_ADDR_BITS (40) which is enough to address 1Tb (0xff ffff ffff). On AMD platforms, if a ram-above-4g relocation is attempted and the CPU wasn't configured with a big enough phys-bits, an error message will be printed due to the maxphysaddr vs maxusedaddr check previously added.

  Suggested-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-11-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
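The relocation decision can be sketched with the ranges quoted above. This is a simplified illustration: the constant and function names are assumptions, the boundaries are taken from the allowed-range table in the message, and the sketch only considers RAM above 4G, whereas the message indicates the real calculation works on the maximum used GPA.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Reserved HyperTransport range, read off the allowed ranges above:
 * usable space ends at 0xfcffffffff and resumes at 1Tb. */
#define HT_START         0xfd00000000ULL
#define HT_END           0xffffffffffULL
#define ABOVE_1TB_START  (HT_END + 1)

/* Pick the base of ram-above-4g. On AMD hosts, if RAM placed at 4G
 * would reach into the HT hole, start it at 1Tb instead. */
uint64_t above_4g_start(bool amd_host, uint64_t above_4g_size)
{
    const uint64_t start = 4 * GiB;

    if (!amd_host || above_4g_size == 0) {
        return start;
    }
    uint64_t max_used = start + above_4g_size - 1;
    return max_used >= HT_START ? ABOVE_1TB_START : start;
}
```

Note that `HT_START` equals 1012 GiB, so with a 4 GiB base the largest above-4G size that still avoids relocation in this sketch is 1008 GiB.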
* i386/pc: bounds check phys-bits against max used GPA | Joao Martins | 2022-07-26 | 1 | -0/+27
  Calculate max *used* GPA against the CPU maximum possible address and error out if the former surpasses the latter. This ensures max used GPA is reachable by configured phys-bits. Default phys-bits on Qemu is TCG_PHYS_ADDR_BITS (40) which is enough for the CPU to address 1Tb (0xff ffff ffff) or 1010G (0xfc ffff ffff) in AMD hosts with IOMMU.

  This is preparation for AMD guests with >1010G, where it will want to relocate ram-above-4g to be after 1Tb instead of 4G.

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-10-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
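The bounds check itself is simple arithmetic and can be sketched as follows. The function name `gpa_fits_phys_bits` is hypothetical; the sketch only shows the comparison of the highest used guest physical address against the largest address expressible with the configured phys-bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if max_used_gpa is addressable with phys_bits physical
 * address bits, i.e. it does not exceed 2^phys_bits - 1. */
bool gpa_fits_phys_bits(unsigned phys_bits, uint64_t max_used_gpa)
{
    assert(phys_bits >= 1 && phys_bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined, so special-case it. */
    uint64_t max_phys = phys_bits == 64 ? UINT64_MAX
                                        : (1ULL << phys_bits) - 1;
    return max_used_gpa <= max_phys;
}
```

With the default of 40 bits, the highest addressable byte is 0xff ffff ffff, so a layout that starts ram-above-4g at 1Tb would fail this check unless phys-bits is raised.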
* i386/pc: factor out device_memory base/size to helper | Joao Martins | 2022-07-26 | 1 | -15/+31
  Move obtaining hole64_start from the device_memory memory region base/size into a helper alongside the corresponding getters in pc_memory_init() when the hotplug range is uninitialized. While doing that, remove the memory region based logic from this newly added helper.

  This is the final step that allows pc_pci_hole64_start() to be callable at the beginning of pc_memory_init() before any memory regions are initialized.

  Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-9-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: handle unitialized mr in pc_get_cxl_range_end() | Joao Martins | 2022-07-26 | 1 | -10/+8
  Remove the pc_get_cxl_range_end() dependency on the CXL memory region, and replace it with one that does not require the CXL host_mr to determine the CXL start.

  This is in preparation for allowing pc_pci_hole64_start() to be called early in pc_memory_init(): handle the CXL memory region end when its underlying memory region isn't yet initialized.

  Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Message-Id: <20220719170014.27028-8-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
* i386/pc: factor out cxl range start to helper | Joao Martins | 2022-07-26 | 1 | -7/+17
  Factor out the calculation of the base address of the memory region. It will be used later on for the cxl range end counterpart calculation, as well as in the pc_memory_init() CXL memory region initialization, thus avoiding duplication.

  Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-7-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: factor out cxl range end to helper | Joao Martins | 2022-07-26 | 1 | -10/+21
  Move the calculation of the CXL memory region end to a separate helper.

  This is in preparation for a future change that removes the CXL range dependency on the CXL memory region, with the goal of allowing pc_pci_hole64_start() to be called before any memory regions are initialized.

  Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-6-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: factor out above-4g end to an helper | Joao Martins | 2022-07-26 | 1 | -15/+14
  There are a couple of places that seem to duplicate this calculation of RAM size above the 4G boundary. Move all those to a helper function.

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-5-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: pass pci_hole64_size to pc_memory_init() | Joao Martins | 2022-07-26 | 3 | -3/+17
  Use the pre-initialized pci-host qdev and pass the pci-hole64-size into pc_memory_init() through a newly added argument. Use the PCI_HOST_PROP_PCI_HOLE64_SIZE pci-host property for fetching pci-hole64-size.

  This is in preparation to determine that host-phys-bits are enough and for pci-hole64-size to be considered to relocate ram-above-4g to be at 1T (on AMD platforms).

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-4-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* i386/pc: create pci-host qdev prior to pc_memory_init() | Joao Martins | 2022-07-26 | 2 | -5/+8
  At the start of pc_memory_init() we usually pass a range of 0..UINT64_MAX as pci_memory, when really it's 2G (i440fx) or 32G (q35). To get the real user value, we need to read the property passed to pci-host for the default pci_hole64_size. To get that, create the qdev prior to memory init so that max used/phys addr can be estimated better.

  This is in preparation to determine that host-phys-bits are enough and also for pci-hole64-size to be considered to relocate ram-above-4g to be at 1T (on AMD platforms).

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-3-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/i386: add 4g boundary start to X86MachineState | Joao Martins | 2022-07-26 | 4 | -7/+9
  Rather than hardcoding the 4G boundary everywhere, introduce an X86MachineState field @above_4g_mem_start and use it accordingly.

  This is in preparation for relocating ram-above-4g to dynamically start at 1T on AMD platforms.

  Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
  Reviewed-by: Igor Mammedov <imammedo@redhat.com>
  Message-Id: <20220719170014.27028-2-joao.m.martins@oracle.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/i386/pc: Always place CXL Memory Regions after device_memory | Jonathan Cameron | 2022-07-26 | 1 | -4/+2
  Previously broken_reserved_end was taken into account, but Igor Mammedov identified that this could lead to a clash between potential RAM being mapped in the region and CXL usage. Hence always add the size of the device_memory memory region. This only affects the case where the broken_reserved_end flag was set.

  Fixes: 6e4e3ae936e6 ("hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)")
  Reported-by: Igor Mammedov <imammedo@redhat.com>
  Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
  Message-Id: <20220701132300.2264-3-Jonathan.Cameron@huawei.com>
  Acked-by: Igor Mammedov <imammedo@redhat.com>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/i386: pass RNG seed via setup_data entry | Jason A. Donenfeld | 2022-07-22 | 5 | -7/+29
  Tiny machines optimized for fast boot time generally don't use EFI, which means a random seed has to be supplied some other way. For this purpose, Linux (≥5.20) supports passing a seed in the setup_data table with SETUP_RNG_SEED, specially intended for hypervisors, kexec, and specialized bootloaders. The linked commit shows the upstream kernel implementation.

  At Paolo's request, we don't pass these to versioned machine types ≤7.0.

  Link: https://git.kernel.org/tip/tip/c/68b8e9713c8
  Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Cc: Richard Henderson <richard.henderson@linaro.org>
  Cc: Eduardo Habkost <eduardo@habkost.net>
  Cc: Peter Maydell <peter.maydell@linaro.org>
  Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Cc: Laurent Vivier <laurent@vivier.eu>
  Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  Message-Id: <20220721125636.446842-1-Jason@zx2c4.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* microvm: turn off io reservations for pcie root ports | Gerd Hoffmann | 2022-07-19 | 1 | -0/+11
  The pcie host bridge has no io window on microvm, so io reservations will not work.

  Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
  Message-Id: <20220701091516.43489-1-kraxel@redhat.com>
* hw/i386/xen/xen-hvm: Inline xen_piix_pci_write_config_client() and remove it | Bernhard Beschow | 2022-06-29 | 1 | -18/+0
  xen_piix_pci_write_config_client() is implemented in the xen sub tree and uses PIIX constants internally, thus creating a direct dependency on PIIX. Now that xen_set_pci_link_route() is stubbable, the logic of xen_piix_pci_write_config_client() can be moved to PIIX, which resolves the dependency.

  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Reviewed-by: Paul Durrant <paul@xen.org>
  Message-Id: <20220626094656.15673-3-shentey@gmail.com>
  Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* hw/i386/xen/xen-hvm: Allow for stubbing xen_set_pci_link_route() | Bernhard Beschow | 2022-06-29 | 1 | -1/+6
  The only user of xen_set_pci_link_route() is xen_piix_pci_write_config_client(), which implements PIIX-specific logic in the xen namespace. This makes xen-hvm depend on PIIX, which could be avoided if xen_piix_pci_write_config_client() was implemented in PIIX. In order to do this, xen_set_pci_link_route() needs to be stubbable, which this patch addresses.

  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Reviewed-by: Paul Durrant <paul@xen.org>
  Message-Id: <20220626094656.15673-2-shentey@gmail.com>
  Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* hw/pci-host/i440fx: Remove unused parameter from i440fx_init() | Bernhard Beschow | 2022-06-28 | 1 | -3/+0
  pi440fx_state is an out-parameter which is never read by the caller.

  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Message-Id: <20220612192800.40813-1-shentey@gmail.com>
  Signed-off-by: Laurent Vivier <laurent@vivier.eu>
* hw/i386/pc: Unexport functions used only internally | Bernhard Beschow | 2022-06-11 | 1 | -2/+2
  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Message-Id: <20220520180109.8224-5-shentey@gmail.com>
  Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/i386/pc: Unexport PC_CPU_MODEL_IDS macro | Bernhard Beschow | 2022-06-11 | 1 | -0/+9
  The macro seems to be used only internally, so remove it.

  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Acked-by: Michael S. Tsirkin <mst@redhat.com>
  Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Message-Id: <20220520180109.8224-4-shentey@gmail.com>
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw: Reuse TYPE_I8042 define | Bernhard Beschow | 2022-06-11 | 1 | -2/+2
  TYPE_I8042 is exported, so reuse it for consistency.

  Signed-off-by: Bernhard Beschow <shentey@gmail.com>
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
  Message-Id: <20220520180109.8224-2-shentey@gmail.com>
  Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/rtc/mc146818rtc: QOM'ify io_base offsetBernhard Beschow2022-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Exposing the io_base offset as a QOM property not only allows it to be configurable but also to be displayed in HMP: Before: (qemu) info qtree ... dev: mc146818rtc, id "" gpio-out "" 1 base_year = 0 (0x0) irq = 8 (0x8) lost_tick_policy = "discard" After: dev: mc146818rtc, id "" gpio-out "" 1 base_year = 0 (0x0) iobase = 112 (0x70) irq = 8 (0x8) lost_tick_policy = "discard" Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220529184006.10712-4-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/i386/microvm-dt: Determine mc146818rtc's IRQ number from QOM propertyBernhard Beschow2022-06-111-1/+1
| | | | | | | | | | Since commit 3b004a16540aa41f2aa6a1ceb0bf306716766914 'hw/rtc/mc146818rtc: QOM'ify IRQ number' mc146818rtc's IRQ number is configurable. Fix microvm-dt to respect its value. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220529184006.10712-3-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/i386/microvm-dt: Force explicit failure if retrieving QOM property failsBernhard Beschow2022-06-111-2/+3
| | | | | | | | | | New code will be added where this is best practice. So update existing code as well. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220529184006.10712-2-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/isa/piix3: Inline and remove piix3_create()Bernhard Beschow2022-06-111-1/+5
| | | | | | | | | | | During the previous changesets piix3_create() became a trivial wrapper around more generic functions. Modernize the code. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20220603185045.143789-12-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/isa/piix3: Factor out ISABus retrieval from piix3_create()Bernhard Beschow2022-06-111-1/+2
| | | | | | | | | | Modernizes the code. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20220603185045.143789-11-shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/i386/pc_piix: create PIIX4_PM device directly instead of using piix4_pm_initfn()Mark Cave-Ayland2022-06-111-3/+8
| | | | | | | | | | | | Now that all external logic has been removed from piix4_pm_initfn() the PIIX4_PM device can be instantiated directly. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220528091934.15520-11-mark.cave-ayland@ilande.co.uk> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/acpi/piix4: use qdev gpio to wire up smi_irqMark Cave-Ayland2022-06-111-1/+2
| | | | | | | | | | | | | | Initialize the SMI IRQ in piix4_pm_init(). The smi_irq can now be wired up directly using a qdev gpio instead of having to set the IRQ externally in piix4_pm_initfn(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220528091934.15520-10-mark.cave-ayland@ilande.co.uk> [PMD: Partially squash 20220528091934.15520-8-mark.cave-ayland@ilande.co.uk] Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/acpi/piix4: use qdev gpio to wire up sci_irqMark Cave-Ayland2022-06-111-2/+2
| | | | | | | | | | | | | | | Introduce piix4_pm_init() instance init function and use it to initialise the separate qdev gpio for the SCI IRQ. The sci_irq can now be wired up directly using a qdev gpio instead of having to set the IRQ externally in piix4_pm_initfn(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220528091934.15520-9-mark.cave-ayland@ilande.co.uk> [PMD: Partially squash 20220528091934.15520-8-mark.cave-ayland@ilande.co.uk] Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/acpi/piix4: rename piix4_pm_init() to piix4_pm_initfn()Mark Cave-Ayland2022-06-111-3/+3
| | | | | | | | | | | | | | When QOMifying a device it is typical to use _init() as the suffix for an instance_init function, however this name is already in use by the legacy piix4_pm_init() wrapper function. Eventually the wrapper function will be removed, but for now rename it to piix4_pm_initfn() to avoid a naming collision. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220528091934.15520-7-mark.cave-ayland@ilande.co.uk> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* hw/acpi/piix4: alter piix4_pm_init() to return PIIX4PMStateMark Cave-Ayland2022-06-111-5/+5
| | | | | | | | | | | This exposes the PIIX4_PM device to the caller to allow any qdev gpios to be mapped outside of piix4_pm_init(). Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20220528091934.15520-6-mark.cave-ayland@ilande.co.uk> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>