summaryrefslogtreecommitdiffstats
path: root/hw/intc/spapr_xive.c
Commit message (Collapse)AuthorAgeFilesLines
* spapr/xive: Add a 'hv-prio' property to represent the KVM escalation priorityCédric Le Goater2020-09-081-19/+14Star
| | | | | | | | | | | | | | | | | On POWER9, the KVM XIVE device uses priority 7 for the escalation interrupts. On POWER10, the host can use a reduced set of priorities and KVM will configure the escalation priority to a lower number. In any case, the guest is allowed to use priorities in a single range : [ 0 .. (maxprio - 1) ]. Introduce a 'hv-prio' property to represent the escalation priority number and use it to compute the "ibm,plat-res-int-priorities" property defining the priority ranges reserved by the hypervisor. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200819130843.2230799-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: Use xive_source_esb_len()Greg Kurz2020-08-141-1/+1
| | | | | | | | | | | static inline size_t xive_source_esb_len(XiveSource *xsrc) { return (1ull << xsrc->esb_shift) * xsrc->nr_irqs; } Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <159733969034.320580.6571451425779179477.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappersGreg Kurz2020-08-131-14/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This ensures that QEMU won't try to use the device if KVM is disabled or if an in-kernel irqchip isn't required. When using ic-mode=dual with the pseries machine, we have two possible interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper will return true as soon as any of the KVM device is created. It might lure QEMU to think that the other one is also around, while it is not. This is exactly what happens with ic-mode=dual at machine init when claiming IRQ numbers, which must be done on all possible IRQ backends, eg. RTAS event sources or the PHB0 LSI table : only the KVM XICS device is active but we end up calling kvmppc_xive_source_reset_one() anyway, which fails. This doesn't cause any trouble because of another bug : kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't see the failure. Most of the other kvmppc_xive_* functions have similar xive->fd checks to filter out the case when KVM XIVE isn't active. It might look safer to have idempotent functions but it doesn't really help to understand what's going on when debugging. Since we already have all the kvm_irqchip_in_kernel() in place, also have the callers to check xive->fd as well before calling KVM XIVE specific code. This is straight-forward for the spapr specific XIVE code. Some more care is needed for the platform agnostic XIVE code since it cannot access xive->fd directly. Introduce new in_kernel() methods in some base XIVE classes for this purpose and implement them only in spapr. In all cases, we still need to call kvm_irqchip_in_kernel() so that compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM isn't defined, thus avoiding the need for stubs. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <159679993438.876294.7285654331498605426.stgit@bahia.lan> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* error: Eliminate error_propagate() with Coccinelle, part 1Markus Armbruster2020-07-101-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. Convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... return ... } to if (!foo(..., errp)) { ... ... return ... } where nothing else needs @err. Coccinelle script: @rule1 forall@ identifier fun, err, errp, lbl; expression list args, args2; binary operator op; constant c1, c2; symbol false; @@ if ( ( - fun(args, &err, args2) + fun(args, errp, args2) | - !fun(args, &err, args2) + !fun(args, errp, args2) | - fun(args, &err, args2) op c1 + fun(args, errp, args2) op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; ) } @rule2 forall@ identifier fun, err, errp, lbl; expression list args, args2; expression var; binary operator op; constant c1, c2; symbol false; @@ - var = fun(args, &err, args2); + var = fun(args, errp, args2); ... when != err if ( ( var | !var | var op c1 ) ) { ... when != err when != lbl: when strict - error_propagate(errp, err); ... when != err ( return; | return c2; | return false; | return var; ) } @depends on rule1 || rule2@ identifier err; @@ - Error *err = NULL; ... when != err Not exactly elegant, I'm afraid. The "when != lbl:" is necessary to avoid transforming if (fun(args, &err)) { goto out } ... out: error_propagate(errp, err); even though other paths to label out still need the error_propagate(). For an actual example, see sclp_realize(). Without the "when strict", Coccinelle transforms vfio_msix_setup(), incorrectly. I don't know what exactly "when strict" does, only that it helps here. The match of return is narrower than what I want, but I can't figure out how to express "return where the operand doesn't use @err". For an example where it's too narrow, see vfio_intx_enable(). Silently fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. Line breaks tidied up manually. One nested declaration of @local_err deleted manually. Preexisting unwanted blank line dropped in hw/riscv/sifive_e.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-35-armbru@redhat.com>
* qom: Put name parameter before value / visitor parameterMarkus Armbruster2020-07-101-5/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The object_property_set_FOO() setters take property name and value in an unusual order: void object_property_set_FOO(Object *obj, FOO_TYPE value, const char *name, Error **errp) Having to pass value before name feels grating. Swap them. Same for object_property_set(), object_property_get(), and object_property_parse(). Convert callers with this Coccinelle script: @@ identifier fun = { object_property_get, object_property_parse, object_property_set_str, object_property_set_link, object_property_set_bool, object_property_set_int, object_property_set_uint, object_property_set, object_property_set_qobject }; expression obj, v, name, errp; @@ - fun(obj, v, name, errp) + fun(obj, name, v, errp) Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error message "no position information". Convert that one manually. Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Convert manually. Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused by RXCPU being used both as typedef and function-like macro there. Convert manually. The other files using RXCPU that way don't need conversion. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-27-armbru@redhat.com> [Straightforwad conflict with commit 2336172d9b "audio: set default value for pcspk.iobase property" resolved]
* qdev: Use returned bool to check for qdev_realize() etc. failureMarkus Armbruster2020-07-101-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their wrappers isa_realize_and_unref(), pci_realize_and_unref(), sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref(). Coccinelle script: @@ identifier fun = { isa_realize_and_unref, pci_realize_and_unref, qbus_realize, qdev_realize, qdev_realize_and_unref, sysbus_realize, sysbus_realize_and_unref, usb_realize_and_unref }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error message "no position information". Nothing to convert there; skipped. Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by ARMSSE being used both as typedef and function-like macro there. Converted manually. A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20200707160613.848843-5-armbru@redhat.com>
* qdev: Convert bus-less devices to qdev_realize() with CoccinelleMarkus Armbruster2020-06-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All remaining conversions to qdev_realize() are for bus-less devices. Coccinelle script: // only correct for bus-less @dev! @@ expression errp; expression dev; @@ - qdev_init_nofail(dev); + qdev_realize(dev, NULL, &error_fatal); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(OBJECT(dev), true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression errp; expression dev; symbol true; @@ - object_property_set_bool(dev, true, "realized", errp); + qdev_realize(DEVICE(dev), NULL, errp); Note that Coccinelle chokes on ARMSSE typedef vs. macro in hw/arm/armsse.c. Worked around by temporarily renaming the macro for the spatch run. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
* qom: Less verbose object_initialize_child()Markus Armbruster2020-06-151-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All users of object_initialize_child() pass the obvious child size argument. Almost all pass &error_abort and no properties. Tiresome. Rename object_initialize_child() to object_initialize_child_with_props() to free the name. New convenience wrapper object_initialize_child() automates the size argument, and passes &error_abort and no properties. Rename object_initialize_childv() to object_initialize_child_with_propsv() for consistency. Convert callers with this Coccinelle script: @@ expression parent, propname, type; expression child, size; symbol error_abort; @@ - object_initialize_child(parent, propname, OBJECT(child), size, type, &error_abort, NULL) + object_initialize_child(parent, propname, child, size, type, &error_abort, NULL) @@ expression parent, propname, type; expression child; symbol error_abort; @@ - object_initialize_child(parent, propname, child, sizeof(*child), type, &error_abort, NULL) + object_initialize_child(parent, propname, child, type) @@ expression parent, propname, type; expression child; symbol error_abort; @@ - object_initialize_child(parent, propname, &child, sizeof(child), type, &error_abort, NULL) + object_initialize_child(parent, propname, &child, type) @@ expression parent, propname, type; expression child, size, err; expression list props; @@ - object_initialize_child(parent, propname, child, size, type, err, props) + object_initialize_child_with_props(parent, propname, child, size, type, err, props) Note that Coccinelle chokes on ARMSSE typedef vs. macro in hw/arm/armsse.c. Worked around by temporarily renaming the macro for the spatch run. Signed-off-by: Markus Armbruster <armbru@redhat.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> [Rebased: machine opentitan is new (commit fe0fe4735e7)] Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-37-armbru@redhat.com>
* spapr/xive: use SPAPR_IRQ_IPI to define IPI ranges exposed to the guestCédric Le Goater2020-03-171-2/+2
| | | | | | | | | | | | The "ibm,xive-lisn-ranges" defines ranges of interrupt numbers that the guest can use to configure IPIs. It starts at 0 today but it could change to some other offset. Make clear which IRQ range we are exposing by using SPAPR_IRQ_IPI in the property definition. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200306123307.1348-1-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* add device_legacy_reset function to prepare for reset api changeDamien Hedde2020-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Provide a temporary device_legacy_reset function doing what device_reset does to prepare for the transition with Resettable API. All occurrence of device_reset in the code tree are also replaced by device_legacy_reset. The new resettable API has different prototype and semantics (resetting child buses as well as the specified device). Subsequent commits will make the changeover for each call site individually; once that is complete device_legacy_reset() will be removed. Signed-off-by: Damien Hedde <damien.hedde@greensocs.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Cornelia Huck <cohuck@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20200123132823.1117486-2-damien.hedde@greensocs.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* qdev: set properties with device_class_set_props()Marc-André Lureau2020-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* xive: Add a "presenter" link property to the TCTX objectCédric Le Goater2020-01-081-1/+1
| | | | | | | | | | | | | This will be used in subsequent patches to access the XIVE associated to a TCTX without reaching out to the machine through qdev_get_machine(). Signed-off-by: Cédric Le Goater <clg@kaod.org> [ groug: - split patch - write subject and changelog ] Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200106145645.4539-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: Use device_class_set_parent_realize()Greg Kurz2020-01-081-1/+11
| | | | | | | | | | | | The XIVE router base class currently inherits an empty realize hook from the sysbus device base class, but it will soon implement one of its own to perform some sanity checks. Do the preliminary plumbing to have it called. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191219181155.32530-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/pnv: Extend XiveRouter with a get_block_id() handlerCédric Le Goater2019-12-171-0/+6
| | | | | | | | | When doing CAM line compares, fetch the block id from the interrupt controller which can have set the PC_TCTXT_CHIPID field. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-20-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/xive: Remove the get_tctx() XiveRouter handlerCédric Le Goater2019-12-171-8/+0Star
| | | | | | | | | It is now unused. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-16-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/xive: Move the TIMA operations to the controller modelCédric Le Goater2019-12-171-2/+31
| | | | | | | | | | | | | | | | | | | | On the P9 Processor, the thread interrupt context registers of a CPU can be accessed "directly" when by load/store from the CPU or "indirectly" by the IC through an indirect TIMA page. This requires to configure first the PC_TCTXT_INDIRx registers. Today, we rely on the get_tctx() handler to deduce from the CPU PIR the chip from which the TIMA access is being done. By handling the TIMA memory ops under the interrupt controller model of each machine, we can uniformize the TIMA direct and indirect ops under PowerNV. We can also check that the CPUs have been enabled in the XIVE controller. This prepares ground for the future versions of XIVE. Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-15-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr: Pass the maximum number of vCPUs to the KVM interrupt controllerGreg Kurz2019-12-171-2/+4
| | | | | | | | | | | | | | | | The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce their consumption of some scarce HW resources, namely Virtual Presenter identifiers, if they know the maximum number of vCPUs that may run in the VM. Prepare ground for this by passing the value down to xics_kvm_connect() and kvmppc_xive_connect(). This is purely mechanical, no functional change. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157478678301.67101.2717368060417156338.stgit@bahia.tlslab.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/xive: Implement the XivePresenter interfaceCédric Le Goater2019-12-171-0/+49
| | | | | | | | | | | | | | | | | | | Each XIVE Router model, sPAPR and PowerNV, now implements the 'match_nvt' handler of the XivePresenter QOM interface. This is simply moving code and taking into account the new API. To be noted that the xive_router_get_tctx() helper is not used anymore when doing CAM matching and will be removed later on after other changes. The XIVE presenter model is still too simple for the PowerNV machine and the CAM matching algo is not correct on multichip system. Subsequent patches will introduce more changes to scan all chips of the system. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191125065820.927-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* xive: Link "xive" property to XiveEndSource::xrtr pointerGreg Kurz2019-12-171-2/+2
| | | | | | | | | | | | | The END source object has both a pointer and a "xive" property pointing to the router object. Confusing bugs could arise if these ever go out of sync. Change the property definition so that it explicitely sets the pointer. The property isn't optional : not being able to set the link is a bug and QEMU should rather abort than exit in this case. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157383333784.165747.5298512574054268786.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* xive: Link "xive" property to XiveSource::xive pointerGreg Kurz2019-12-171-2/+2
| | | | | | | | | | | | | The source object has both a pointer and a "xive" property pointing to the notifier object. Confusing bugs could arise if these ever go out of sync. Change the property definition so that it explicitely sets the pointer. The property isn't optional : not being able to set the link is a bug and QEMU should rather abort than exit in this case. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157383333227.165747.12901571295951957951.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc: Add intc_destroy() handlers to SpaprInterruptController/PnvChipGreg Kurz2019-11-181-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | SpaprInterruptControllerClass and PnvChipClass have an intc_create() method that calls the appropriate routine, ie. icp_create() or xive_tctx_create(), to establish the link between the VCPU and the presenter component of the interrupt controller during realize. There aren't any symmetrical call to be called when the VCPU gets unrealized though. It is assumed that object_unparent() is the only thing to do. This is questionable because the parenting logic around the CPU and presenter objects is really an implementation detail of the interrupt controller. It shouldn't be open-coded in the machine code. Fix this by adding an intc_destroy() method that undoes what was done in intc_create(). Also NULLify the presenter pointers to avoid having stale pointers around. This will allow to reliably check if a vCPU has a valid presenter. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
* spapr/xive: Set the OS CAM line at resetCédric Le Goater2019-10-241-31/+17Star
| | | | | | | | | | | | | | | | | When a Virtual Processor is scheduled to run on a HW thread, the hypervisor pushes its identifier in the OS CAM line. When running with kernel_irqchip=off, QEMU needs to emulate the same behavior. Set the OS CAM line when the interrupt presenter of the sPAPR core is reset. This will also cover the case of hot-plugged CPUs. This change also has the benefit to remove the use of CPU_FOREACH() which can be unsafe. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20191022163812.330-8-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc: Reset the interrupt presenter from the CPU reset handlerCédric Le Goater2019-10-241-0/+9
| | | | | | | | | | | | | | | | | | | | | On the sPAPR machine and PowerNV machine, the interrupt presenters are created by a machine handler at the core level and are reset independently. This is not consistent and it raises issues when it comes to handle hot-plugged CPUs. In that case, the presenters are not reset. This is less of an issue in XICS, although a zero MFFR could be a concern, but in XIVE, the OS CAM line is not set and this breaks the presenting algorithm. The current code has workarounds which need a global cleanup. Extend the sPAPR IRQ backend and the PowerNV Chip class with a new cpu_intc_reset() handler called by the CPU reset handler and remove the XiveTCTX reset handler which is now redundant. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191022163812.330-6-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr, xics, xive: Move SpaprIrq::post_load hook to backendsDavid Gibson2019-10-241-2/+3
| | | | | | | | | | | The remaining logic in the post_load hook really belongs to the interrupt controller backends, and just needs to be called on the active controller (after the active controller is set to the right thing based on the incoming migration in the generic spapr_irq_post_load() logic). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move SpaprIrq::reset hook logic into activate/deactivateDavid Gibson2019-10-241-0/+38
| | | | | | | | | | | | | | | | It turns out that all the logic in the SpaprIrq::reset hooks (and some in the SpaprIrq::post_load hooks) isn't really related to resetting the irq backend (that's handled by the backends' own reset routines). Rather its about getting the backend ready to be the active interrupt controller or stopping being the active interrupt controller - reset (and post_load) is just the only time that changes at present. To make this flow clearer, move the logic into the explicit backend activate and deactivate hooks. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move dt_populate from SpaprIrq to SpaprInterruptControllerDavid Gibson2019-10-241-62/+63
| | | | | | | | | | | This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move print_info from SpaprIrq to SpaprInterruptControllerDavid Gibson2019-10-241-0/+15
| | | | | | | | | | | This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move set_irq from SpaprIrq to SpaprInterruptControllerDavid Gibson2019-10-241-0/+12
| | | | | | | | | | | This method depends only on the active irq controller. Now that we've formalized the notion of active controller we can dispatch directly through that, rather than dispatching via SpaprIrq with the dual version having to do a second conditional dispatch. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move irq claim and free from SpaprIrq to ↵David Gibson2019-10-241-33/+38
| | | | | | | | | | | | | | SpaprInterruptController These methods, like cpu_intc_create, really belong to the interrupt controller, but need to be called on all possible intcs. Like cpu_intc_create, therefore, make them methods on the intc and always call it for all existing intcs. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Move cpu_intc_create from SpaprIrq to ↵David Gibson2019-10-241-0/+25
| | | | | | | | | | | | | | | | | | SpaprInterruptController This method essentially represents code which belongs to the interrupt controller, but needs to be called on all possible intcs, rather than just the currently active one. The "dual" version therefore calls into the xics and xive versions confusingly. Handle this more directly, by making it instead a method on the intc backend, and always calling it on every backend that exists. While we're there, streamline the error reporting a bit. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* spapr, xics, xive: Introduce SpaprInterruptController QOM interfaceDavid Gibson2019-10-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | The SpaprIrq structure is used to represent ths spapr machine's irq backend. Except that it kind of conflates two concepts: one is the backend proper - a specific interrupt controller that we might or might not be using, the other is the irq configuration which covers the layout of irq space and which interrupt controllers are allowed. This leads to some pretty confusing code paths for the "dual" configuration where its hooks redirect to other SpaprIrq structures depending on the currently active irq controller. To clean this up, we start by introducing a new SpaprInterruptController QOM interface to represent strictly an interrupt controller backend, not counting anything configuration related. We implement this interface in the XICs and XIVE interrupt controllers, and in future we'll move relevant methods from SpaprIrq into it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org>
* xive: Improve irq claim/free pathDavid Gibson2019-10-041-11/+9Star
| | | | | | | | | | | | | | | | | | | | | spapr_xive_irq_claim() returns a bool to indicate if it succeeded. But most of the callers and one callee use int return values and/or an Error * with more information instead. In any case, ints are a more common idiom for success/failure states than bools (one never knows what sense they'll be in). So instead change to an int return value to indicate presence of error + an Error * to describe the details through that call chain. It also didn't actually check if the irq was already claimed, which is one of the primary purposes of the claim path, so do that. spapr_xive_irq_free() also returned a bool... which no callers checked and was always true, so just drop it. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
* spapr, xics, xive: Better use of assert()s on irq claim/free pathsDavid Gibson2019-10-041-6/+2Star
| | | | | | | | | | | | | | | | | The irq claim and free paths for both XICS and XIVE check for some validity conditions. Some of these represent genuine runtime failures, however others - particularly checking that the basic irq number is in a sane range - could only fail in the case of bugs in the callin code. Therefore use assert()s instead of runtime failures for those. In addition the non backend-specific part of the claim/free paths should only be used for PAPR external irqs, that is in the range SPAPR_XIRQ_BASE to the maximum irq number. Put assert()s for that into the top level dispatchers as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>
* spapr/xive: Mask the EAS when allocating an IRQCédric Le Goater2019-08-211-1/+4
| | | | | | | | | | | | | | | | If an IRQ is allocated and not configured, such as a MSI requested by a PCI driver, it can be saved in its default state and possibly later on restored using the same state. If not initially MASKED, KVM will try to find a matching priority/target tuple for the interrupt and fail to restore the VM because 0/0 is not a valid target. When allocating a IRQ number, the EAS should be set to a sane default : VALID and MASKED. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190813164420.9829-1-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* ppc/xive: Improve 'info pic' supportCédric Le Goater2019-08-211-1/+0Star
| | | | | | | | | | Provide a better output of the XIVE END structures including the escalation information and extend the PowerNV machine 'info pic' command with a dump of the END EAS table used for escalations. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190718115420.19919-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Include hw/qdev-properties.h lessMarkus Armbruster2019-08-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
* Include migration/vmstate.h lessMarkus Armbruster2019-08-161-0/+1
| | | | | | | | | | | | | | | | | | In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
* Include sysemu/reset.h a lot lessMarkus Armbruster2019-08-161-0/+1
| | | | | | | | | | | | | | | | | | In my "build everything" tree, changing sysemu/reset.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The main culprit is hw/hw.h, which supposedly includes it for convenience. Include sysemu/reset.h only where it's needed. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-9-armbru@redhat.com>
* spapr/xive: rework the mapping the KVM memory regionsCédric Le Goater2019-07-021-28/+10Star
| | | | | | | | | | | | | | | | | | | | | Today, the interrupt device is fully initialized at reset when the CAS negotiation process has completed. Depending on the KVM capabilities, the SpaprXive memory regions (ESB, TIMA) are initialized with a host MMIO backend or a QEMU emulated backend. This results in a complex initialization sequence partially done at realize and later at reset, and some memory region leaks. To simplify this sequence and to remove of the late initialization of the emulated device which is required to be done only once, we introduce new memory regions specific for KVM. These regions are mapped as overlaps on top of the emulated device to make use of the host MMIOs. Also provide proper cleanups of these regions when the XIVE KVM device is destroyed to fix the leaks. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190614165920.12670-2-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster2019-06-121-0/+1
| | | | | | | | | Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
* spapr/xive: fix multiple resets when using the 'dual' interrupt modeCédric Le Goater2019-05-291-6/+5Star
| | | | | | | | | | | | | | | | | | | | | | Today, when a reset occurs on a pseries machine using the 'dual' interrupt mode, the KVM devices are released and recreated depending on the interrupt mode selected by CAS. If XIVE is selected, the SysBus memory regions of the SpaprXive model are initialized by the KVM backend initialization routine each time a reset occurs. This leads to a crash after a couple of resets because the machine reaches the QDEV_MAX_MMIO limit of SysBusDevice : qemu-system-ppc64: hw/core/sysbus.c:193: sysbus_init_mmio: Assertion `dev->num_mmio < QDEV_MAX_MMIO' failed. To fix, initialize the SysBus memory regions in spapr_xive_realize() called only once and remove the same inits from the QEMU and KVM backend initialization routines which are called at each reset. Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190522074016.10521-2-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/irq: initialize the IRQ device only onceCédric Le Goater2019-05-291-0/+9
| | | | | | | | | | Add a check to make sure that the routine initializing the emulated IRQ device is called once. We don't have much to test on the XICS side, so we introduce a 'init' boolean under ICSState. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-13-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/irq: introduce a spapr_irq_init_device() helperCédric Le Goater2019-05-291-21/+5Star
| | | | | | | | | | | | | | | | | The way the XICS and the XIVE devices are initialized follows the same pattern. First, try to connect to the KVM device and if not possible fallback on the emulated device, unless a kernel_irqchip is required. The spapr_irq_init_device() routine implements this sequence in generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm(). The XIVE init sequence is moved under the associated sPAPR IRQ ->init() handler. This will change again when KVM support is added for the dual interrupt mode. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-12-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: add migration support for KVMCédric Le Goater2019-05-291-0/+24
| | | | | | | | | | | | | | | | | | | | | | When the VM is stopped, the VM state handler stabilizes the XIVE IC and marks the EQ pages dirty. These are then transferred to destination before the transfer of the device vmstates starts. The SpaprXive interrupt controller model captures the XIVE internal tables, EAT and ENDT and the XiveTCTX model does the same for the thread interrupt context registers. At restart, the SpaprXive 'post_load' method restores all the XIVE states. It is called by the sPAPR machine 'post_load' method, when all XIVE states have been transferred and loaded. Finally, the source states are restored in the VM change state handler when the machine reaches the running state. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: add state synchronization with KVMCédric Le Goater2019-05-291-7/+10
| | | | | | | | | | | | | | | This extends the KVM XIVE device backend with 'synchronize_state' methods used to retrieve the state from KVM. The HW state of the sources, the KVM device and the thread interrupt contexts are collected for the monitor usage and also migration. These get operations rely on their KVM counterpart in the host kernel which acts as a proxy for OPAL, the host firmware. The set operations will be added for migration support later. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190513084245.25755-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: add hcall support when under KVMCédric Le Goater2019-05-291-8/+82
| | | | | | | | | | | | | | | | | | | | | | XIVE hcalls are all redirected to QEMU as none are on a fast path. When necessary, QEMU invokes KVM through specific ioctls to perform host operations. QEMU should have done the necessary checks before calling KVM and, in case of failure, H_HARDWARE is simply returned. H_INT_ESB is a special case that could have been handled under KVM but the impact on performance was low when under QEMU. Here are some figures : kernel irqchip OFF ON H_INT_ESB KVM QEMU rtl8139 (LSI ) 1.19 1.24 1.23 Gbits/sec virtio 31.80 42.30 -- Gbits/sec Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: add KVM supportCédric Le Goater2019-05-291-6/+42
| | | | | | | | | | | | | | | | | This introduces a set of helpers when KVM is in use, which create the KVM XIVE device, initialize the interrupt sources at a KVM level and connect the interrupt presenters to the vCPU. They also handle the initialization of the TIMA and the source ESB memory regions of the controller. These have a different type under KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed to the guest and the associated VMAs on the host are populated dynamically with the appropriate pages using a fault handler. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190513084245.25755-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* Fix typo on "info pic" monitor cmd output for xiveSatheesh Rajendran2019-05-291-1/+1
| | | | | | | | | | | | Instead of LISN i.e "Logical Interrupt Source Number" as per Xive PAPR document "info pic" prints as LSIN, let's fix it. Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Message-Id: <20190509080750.21999-1-sathnaga@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: print out the EQ page address in the monitorCédric Le Goater2019-05-291-2/+3
| | | | | | | | | | This proved to be a useful information when debugging issues with OS event queues allocated above 64GB. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190508171946.657-4-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* spapr/xive: fix EQ page addresses above 64GBCédric Le Goater2019-05-291-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | The high order bits of the address of the OS event queue is stored in bits [4-31] of word2 of the XIVE END internal structures and the low order bits in word3. This structure is using Big Endian ordering and computing the value requires some simple arithmetic which happens to be wrong. The mask removing bits [0-3] of word2 is applied to the wrong value and the resulting address is bogus when above 64GB. Guests with more than 64GB of RAM will allocate pages for the OS event queues which will reside above the 64GB limit. In this case, the XIVE device model will wake up the CPUs in case of a notification, such as IPIs, but the update of the event queue will be written at the wrong place in memory. The result is uncertain as the guest memory is trashed and IPI are not delivered. Introduce a helper xive_end_qaddr() to compute this value correctly in all places where it is used. Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20190508171946.657-3-clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>