summaryrefslogtreecommitdiffstats
path: root/hw/pci/pcie.c
Commit message (Collapse)AuthorAgeFilesLines
* pcie: Don't try triggering a LSI when not definedFrederic Barrat2022-04-201-2/+3
| | | | | | | | | | | | | | | | | This patch skips [de]asserting a LSI interrupt if the device doesn't have any LSI defined. Doing so would trigger an assert in pci_irq_handler(). The PCIE root port implementation in qemu requests a LSI (INTA), but a subclass may want to change that behavior since it's a valid configuration. For example on the POWER8/POWER9/POWER10 systems, the root bridge doesn't request any LSI. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20220408131303.147840-2-fbarrat@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
* acpi: pcihp: pcie: set power on cap on parent slotIgor Mammedov2022-03-061-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on creation a PCIDevice has power turned on at the end of pci_qdev_realize() however later on if PCIe slot isn't populated with any children it's power is turned off. It's fine if native hotplug is used as plug callback will power slot on among other things. However when ACPI hotplug is enabled it replaces native PCIe plug callbacks with ACPI specific ones (acpi_pcihp_device_*plug_cb) and as result slot stays powered off. It works fine as ACPI hotplug on guest side takes care of enumerating/initializing hotplugged device. But when later guest is migrated, call chain introduced by] commit d5daff7d312 (pcie: implement slot power control for pcie root ports) pcie_cap_slot_post_load() -> pcie_cap_update_power() -> pcie_set_power_device() -> pci_set_power() -> pci_update_mappings() will disable earlier initialized BARs for the hotplugged device in powered off slot due to commit 23786d13441 (pci: implement power state) which disables BARs if power is off. Fix it by setting PCI_EXP_SLTCTL_PCC to PCI_EXP_SLTCTL_PWR_ON on slot (root port/downstream port) at the time a device hotplugged into it. As result PCI_EXP_SLTCTL_PWR_ON is migrated to target and above call chain keeps device plugged into it powered on. Fixes: d5daff7d312 ("pcie: implement slot power control for pcie root ports") Fixes: 23786d13441 ("pci: implement power state") Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2053584 Suggested-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <20220301151200.3507298-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Add support for Single Root I/O Virtualization (SR/IOV)Knut Omang2022-03-061-0/+5
| | | | | | | | | | | This patch provides the building blocks for creating an SR/IOV PCIe Extended Capability header and register/unregister SR/IOV Virtual Functions. Signed-off-by: Knut Omang <knuto@ifi.uio.no> Message-Id: <20220217174504.1051716-2-lukasz.maniak@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Fix bad overflow check in hw/pci/pcie.cDaniella Lee2021-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Orginal qemu commit hash:14d02cfbe4adaeebe7cb833a8cc71191352cf03b In function pcie_add_capability, an assert contains the "offset < offset + size" expression. Both variable offset and variable size are uint16_t, the comparison is always true due to type promotion. The next expression may be the same. It might be like this: Thread 1 "qemu-system-x86" hit Breakpoint 1, pcie_add_capability ( dev=0x555557ce5f10, cap_id=1, cap_ver=2 '\002', offset=256, size=72) at ../hw/pci/pcie.c:930 930 { (gdb) n 931 assert(offset >= PCI_CONFIG_SPACE_SIZE); (gdb) n 932 assert(offset < offset + size); (gdb) p offset $1 = 256 (gdb) p offset < offset + size $2 = 1 (gdb) set offset=65533 (gdb) p offset < offset + size $3 = 1 (gdb) p offset < (uint16_t)(offset + size) $4 = 0 Signed-off-by: Daniella Lee <daniellalee111@gmail.com> Message-Id: <20211126061324.47331-1-daniellalee111@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: expire pending deleteGerd Hoffmann2021-11-151-0/+2
| | | | | | | | | | | | | | Add an expire time for pending delete, once the time is over allow pressing the attention button again. This makes pcie hotplug behave more like acpi hotplug, where one can try sending an 'device_del' monitor command again in case the guest didn't respond to the first attempt. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-7-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: fast unplug when slot power is offGerd Hoffmann2021-11-151-0/+10
| | | | | | | | | | | | | In case the slot is powered off (and the power indicator turned off too) we can unplug right away, without round-trip to the guest. Also clear pending attention button press, there is nothing to care about any more. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-6-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: factor out pcie_cap_slot_unplug()Gerd Hoffmann2021-11-151-12/+20
| | | | | | | | | No functional change. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-5-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: add power indicator blink checkGerd Hoffmann2021-11-151-0/+7
| | | | | | | | | | Refuse to push the attention button in case the guest is busy with some hotplug operation (as indicated by the power indicator blinking). Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-4-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: implement slot power control for pcie root portsGerd Hoffmann2021-11-151-0/+28
| | | | | | | | | | | | | | With this patch hot-plugged pci devices will only be visible to the guest if the guests hotplug driver has enabled slot power. This should fix the hot-plug race which one can hit when hot-plugging a pci device at boot, while the guest is in the middle of the pci bus scan. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Message-Id: <20211111130859.1171890-3-kraxel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: Export pci_for_each_device_under_bus*()Peter Xu2021-11-021-3/+1Star
| | | | | | | | | | | | | | They're actually more commonly used than the helper without _under_bus, because most callers do have the pci bus on hand. After exporting we can switch a lot of the call sites to use these two helpers. Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20211028043129.38871-3-peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au>
* hw/pci/pcie: Do not set HPC flag if acpihp is usedJulia Suvorova2021-07-161-1/+7
| | | | | | | | | | | | | | | | | | Instead of changing the hot-plug type in _OSC register, do not set the 'Hot-Plug Capable' flag. This way guest will choose ACPI hot-plug if it is preferred and leave the option to use SHPC with pcie-pci-bridge. The ability to control hot-plug for each downstream port is retained, while 'hotplug=off' on the port means all hot-plug types are disabled. Signed-off-by: Julia Suvorova <jusual@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210713004205.775386-4-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-pci: compat page aligned ATSJason Wang2021-04-061-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | Commit 4c70875372b8 ("pci: advertise a page aligned ATS") advertises the page aligned via ATS capability (RO) to unbrek recent Linux IOMMU drivers since 5.2. But it forgot the compat the capability which breaks the migration from old machine type: (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: 0 device: 20 cmask: ff wmask: 0 w1cmask:0 This patch introduces a new parameter "x-ats-page-aligned" for virtio-pci device and turns it on for machine type which is newer than 5.1. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: qemu-stable@nongnu.org Fixes: 4c70875372b8 ("pci: advertise a page aligned ATS") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20210406040330.11306-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: don't set link state active if the slot is emptyLaurent Vivier2021-02-231-10/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the pcie slot is initialized, by default PCI_EXP_LNKSTA_DLLLA (Data Link Layer Link Active) is set in PCI_EXP_LNKSTA (Link Status) without checking if the slot is empty or not. This is confusing for the kernel because as it sees the link is up it tries to read the vendor ID and fails: (From https://bugzilla.kernel.org/show_bug.cgi?id=211691) [ 1.661105] pcieport 0000:00:02.2: pciehp: Slot Capabilities : 0x0002007b [ 1.661115] pcieport 0000:00:02.2: pciehp: Slot Status : 0x0010 [ 1.661123] pcieport 0000:00:02.2: pciehp: Slot Control : 0x07c0 [ 1.661138] pcieport 0000:00:02.2: pciehp: Slot #0 AttnBtn+ PwrCtrl+ MRL- AttnInd+ PwrInd+ HotPlug+ Surprise+ Interlock+ NoCompl- IbPresDis- LLActRep+ [ 1.662581] pcieport 0000:00:02.2: pciehp: pciehp_get_power_status: SLOTCTRL 6c value read 7c0 [ 1.662597] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662703] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 1.662706] pcieport 0000:00:02.2: pciehp: pcie_enable_notification: SLOTCTRL 6c write cmd 1031 [ 1.662730] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662748] pcieport 0000:00:02.2: pciehp: pciehp_check_link_active: lnk_status = 2204 [ 1.662750] pcieport 0000:00:02.2: pciehp: Slot(0-2): Link Up [ 2.896132] pcieport 0000:00:02.2: pciehp: pciehp_check_link_status: lnk_status = 2204 [ 2.896135] pcieport 0000:00:02.2: pciehp: Slot(0-2): No device found [ 2.896900] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0010 from Slot Status [ 2.896903] pcieport 0000:00:02.2: pciehp: pciehp_power_off_slot: SLOTCTRL 6c write cmd 400 [ 3.656901] pcieport 0000:00:02.2: pciehp: pending interrupts 0x0009 from Slot Status This is really a problem with virtio-net failover that hotplugs a VFIO card during the boot process. The kernel can shutdown the slot while QEMU is hotplugging it, and this likely ends by an automatic unplug of the card. At the end of the boot sequence the card has disappeared. To fix that, don't set the "Link Active" state in the init function, but rely on the plug function to do it, as the mechanism has already been introduced by 2f2b18f60bf1. Fixes: 2f2b18f60bf1 ("pcie: set link state inactive/active after hot unplug/plug") Cc: zhengxiang9@huawei.com Fixes: 3d67447fe7c2 ("pcie: Fill PCIESlot link fields to support higher speeds and widths") Cc: alex.williamson@redhat.com Fixes: b2101eae63ea ("pcie: Set the "link active" in the link status register") Cc: benh@kernel.crashing.org Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20210212135250.2738750-5-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: advertise a page aligned ATSJason Wang2020-10-301-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | After Linux kernel commit 61363c1474b1 ("iommu/vt-d: Enable ATS only if the device uses page aligned address."), ATS will be only enabled if device advertises a page aligned request. Unfortunately, vhost-net is the only user and we don't advertise the aligned request capability in the past since both vhost IOTLB and address_space_get_iotlb_entry() can support non page aligned request. Though it's not clear that if the above kernel commit makes sense. Let's advertise a page aligned ATS here to make vhost device IOTLB work with Intel IOMMU again. Note that in the future we may extend pcie_ats_init() to accept parameters like queue depth and page alignment. Cc: qemu-stable@nongnu.org Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20200909081731.24688-1-jasowang@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qdev: Drop qbus_set_hotplug_handler() parameter @errpMarkus Armbruster2020-07-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qbus_set_hotplug_handler() is a simple wrapper around object_property_set_link(). object_property_set_link() fails when the property doesn't exist, is not settable, or its .check() method fails. These are all programming errors here, so passing &error_abort to qbus_set_hotplug_handler() is appropriate. Most of its callers do. Exceptions: * pcie_cap_slot_init(), shpc_init(), spapr_phb_realize() pass NULL, i.e. they ignore errors. * spapr_machine_init() passes &error_fatal. * s390_pcihost_realize(), virtio_serial_device_realize(), s390_pcihost_plug() pass the error to their callers. The latter two keep going after the error, which looks wrong. Drop the @errp parameter, and instead pass &error_abort to object_property_set_link(). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200630090351.1247703-15-armbru@redhat.com>
* qdev: Convert to qdev_unrealize() with CoccinelleMarkus Armbruster2020-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | For readability, and consistency with qbus_realize(). Coccinelle script: @ depends on !(file in "hw/core/qdev.c")@ typedef DeviceState; DeviceState *dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(dev); @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@ expression dev; symbol false, error_abort; @@ - object_property_set_bool(OBJECT(dev), false, "realized", &error_abort); + qdev_unrealize(DEVICE(dev)); Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200610053247.1583243-8-armbru@redhat.com>
* hw/pci/pcie: Move hot plug capability check to pre_plug callbackJulia Suvorova2020-06-091-8/+11
| | | | | | | | | | | | | | | | | | | | Check for hot plug capability earlier to avoid removing devices attached during the initialization process. Run qemu with an unattached drive: -drive file=$FILE,if=none,id=drive0 \ -device pcie-root-port,id=rp0,slot=3,bus=pcie.0,hotplug=off Hotplug a block device: device_add virtio-blk-pci,id=blk0,drive=drive0,bus=rp0 If hotplug fails on plug_cb, drive0 will be deleted. Fixes: 0501e1aa1d32a6 ("hw/pci/pcie: Forbid hot-plug if it's disabled on the slot") Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200604125947.881210-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qdev: Unrealize must not failMarkus Armbruster2020-05-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
* hw/pci/pcie: Replace PCI_DEVICE() casts with existing variableJulia Suvorova2020-05-041-3/+3
| | | | | | | | | | A little cleanup is possible because of hotplug_pdev introduction. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-3-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
* hw/pci/pcie: Forbid hot-plug if it's disabled on the slotJulia Suvorova2020-05-041-0/+19
| | | | | | | | | | | | | Raise an error when trying to hot-plug/unplug a device through QMP to a device with disabled hot-plug capability. This makes the device behaviour more consistent and provides an explanation of the failure in the case of asynchronous unplug. Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200427182440.92433-2-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
* pcie_root_port: Add hotplug disabling optionJulia Suvorova2020-03-081-4/+7
| | | | | | | | | | | | | | | | | | | | | | Make hot-plug/hot-unplug on PCIe Root Ports optional to allow libvirt manage it and restrict unplug for the whole machine. This is going to prevent user-initiated unplug in guests (Windows mostly). Hotplug is enabled by default. Usage: -device pcie-root-port,hotplug=off,... If you want to disable hot-unplug on some downstream ports of one switch, disable hot-unplug on PCIe Root Port connected to the upstream port as well as on the selected downstream ports. Discussion related: https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg00530.html Signed-off-by: Julia Suvorova <jusual@redhat.com> Message-Id: <20200226174607.205941-1-jusual@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
* pci: mark device having guest unplug request pendingJens Freimann2019-10-291-0/+3
| | | | | | | | | | Set pending_deleted_event in DeviceState for failover primary devices that were successfully unplugged by the Guest OS. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-5-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: mark devices partially unpluggedJens Freimann2019-10-291-0/+3
| | | | | | | | | | | Only the guest unplug request was triggered. This is needed for the failover feature. In case of a failed migration we need to plug the device back to the guest. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Message-Id: <20191029114905.6856-4-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: minor cleanups for slot control/statusMichael S. Tsirkin2019-07-011-6/+11
| | | | | | | | | | Rename function arguments to make intent clearer. Better documentation for slot control logic. Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
* pcie: work around for racy guest initMichael S. Tsirkin2019-07-011-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During boot, linux guests tend to clear all bits in pcie slot status register which is used for hotplug. If they clear bits that weren't set this is racy and will lose events: not a big problem for manual hotplug on bare-metal, but a problem for us. For example, the following is broken ATM: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -device virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 \ -monitor stdio disk.qcow2 (qemu)device_del balloon (qemu)cont Balloon isn't deleted as it should. As a work-around, detect this attempt to clear slot status and revert status to what it was before the write. Note: in theory this can be detected as a duplicate button press which cancels the previous press. Does not seem to happen in practice as guests seem to only have this bug during init. Note2: the right thing to do is probably to fix Linux to read status before clearing it, and act on the bits that are set. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>
* pcie: check that slt ctrl changed before deletingMichael S. Tsirkin2019-07-011-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | During boot, linux would sometimes overwrites control of a powered off slot before powering it on. Unfortunately QEMU interprets that as a power off request and ejects the device. For example: /x86_64-softmmu/qemu-system-x86_64 -enable-kvm -S -machine q35 \ -device pcie-root-port,id=pcie_root_port_0,slot=2,chassis=2,addr=0x2,bus=pcie.0 \ -monitor stdio disk.qcow2 (qemu)device_add virtio-balloon-pci,id=balloon,bus=pcie_root_port_0 (qemu)cont Balloon is deleted during guest boot. To fix, save control beforehand and check that power or led state actually change before ejecting. Note: this is more a hack than a solution, ideally we'd find a better way to detect ejects, or move away from ejects completely and instead monitor whether it's safe to delete device due to e.g. its power state. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com>
* pcie: don't skip multi-mask eventsMichael S. Tsirkin2019-07-011-1/+1
| | | | | | | | | | | | If we are trying to set multiple bits at once, testing that just one of them is already set gives a false positive. As a result we won't interrupt guest if e.g. presence detection change and attention button press are both set. This happens with multi-function device removal. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
* Include qemu-common.h exactly where neededMarkus Armbruster2019-06-121-1/+0Star
| | | | | | | | | | | | | | | | No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
* pcie: Add a simple PCIe ACS (Access Control Services) helper functionKnut Omang2019-03-131-0/+38
| | | | | | | | | | | | | | Implementing an ACS capability on downstream ports and multifunction endpoints indicates isolation and IOMMU visibility to a finer granularity. This creates smaller IOMMU groups in the guest and thus more flexibility in assigning endpoints to guest userspace or an L2 guest. Signed-off-by: Knut Omang <knut.omang@oracle.com> Message-Id: <07489975121696f5573b0a92baaf3486ef51e35d.1550768238.git-series.knut.omang@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
* qdev: Let the hotplug_handler_unplug() caller delete the deviceDavid Hildenbrand2019-03-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When unplugging a device, at one point the device will be destroyed via object_unparent(). This will, one the one hand, unrealize the removed device hierarchy, and on the other hand, destroy/free the device hierarchy. When chaining hotplug handlers, we want to overwrite a bus hotplug handler by the machine hotplug handler, to be able to perform some part of the plug/unplug and to forward the calls to the bus hotplug handler. For now, the bus hotplug handler would trigger an object_unparent(), not allowing us to perform some unplug action on a device after we forwarded the call to the bus hotplug handler. The device would be gone at that point. machine_unplug_handler(dev) /* eventually do unplug stuff */ bus_unplug_handler(dev) /* dev is gone, we can't do more unplug stuff */ So move the object_unparent() to the original caller of the unplug. For now, keep the unrealize() at the original places of the object_unparent(). For implicitly chained hotplug handlers (e.g. pc code calling acpi hotplug handlers), the object_unparent() has to be done by the outermost caller. So when calling hotplug_handler_unplug() from inside an unplug handler, nothing is to be done. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> calls unrealize(dev) /* we can do more unplug stuff but device already unrealized */ } object_unparent(dev) In the long run, every unplug action should be factored out of the unrealize() function into the unplug handler (especially for PCI). Then we can get rid of the additonal unrealize() calls and object_unparent() will properly unrealize the device hierarchy after the device has been unplugged. hotplug_handler_unplug(dev) -> calls machine_unplug_handler() machine_unplug_handler(dev) { /* eventually do unplug stuff */ bus_unplug_handler(dev) -> only unplugs, does not unrealize /* we can do more unplug stuff */ } object_unparent(dev) -> will unrealize The original approach was suggested by Igor Mammedov for the PCI part, but I extended it to all hotplug handlers. I consider this one step into the right direction. To summarize: - object_unparent() on synchronous unplugs is done by common code -- "Caller of hotplug_handler_unplug" - object_unparent() on asynchronous unplugs ("unplug requests") has to be done manually -- "Caller of hotplug_handler_unplug" Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190228122849.4296-2-david@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* pci: Sanity test minimum downstream LNKSTAAlex Williamson2019-02-221-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The entire link status register for SR-IOV VFs is defined as RsvdZ, reads simply return zero. Usually this is nothing more than lspci reporting inconsequentially broken values: LnkSta: Speed unknown, Width x0, ... However, now that we're using the downstream endpoint link status to fill in the value at the parent downstream port, invalid values become a problem. In particular, the PCIe hotplug driver in Linux looks for a valid negotiated link width and will fail to enumerate hot-added downstream endpoints without non-zero value here, ex: pciehp 0000:00:02.0:pcie004: Slot(0): Attention button pressed pciehp 0000:00:02.0:pcie004: Slot(0) Powering on due to button press pciehp 0000:00:02.0:pcie004: Slot(0): Card present pciehp 0000:00:02.0:pcie004: Slot(0): Link Up pciehp 0000:00:02.0:pcie004: link training error: status 0x2000 pciehp 0000:00:02.0:pcie004: Failed to check link status Resolve by using minimum width and speed values for the downstream port link status when the endpoint fails to provide valid values. Long term, we may want to implement emulation in the vfio-pci host driver to suppliment this field with the PF value as the SR-IOV spec seems to allow, but the solution here is compatible should that be implemented later. Fixes: 727b48661f75 ("pci: Sync PCIe downstream port LNKSTA on read") Reported-by: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Message-Id: <155060310248.19547.14979269067689441201.stgit@gimli.home> Tested-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* qdev: pass an Object * to qbus_set_hotplug_handler()Michael Roth2019-02-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Certain devices types, like memory/CPU, are now being handled using a hotplug interface provided by a top-level MachineClass. Hotpluggable host bridges are another such device where it makes sense to use a machine-level hotplug handler. However, unlike those devices, host-bridges have a parent bus (the main system bus), and devices with a parent bus use a different mechanism for registering their hotplug handlers: qbus_set_hotplug_handler(). This interface currently expects a handler to be a subclass of DeviceClass, but this is not the case for MachineClass, which derives directly from ObjectClass. Internally, the interface only requires an ObjectClass, so expose that in qbus_set_hotplug_handler(). Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <154999589921.690774.3640149277362188566.stgit@bahia.lan> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
* pci/pcie: stop plug/unplug if the slot is lockedDavid Hildenbrand2019-01-151-8/+17
| | | | | | | | | | | | | | | We better stop right away. For now, errors would be partially ignored (so the guest might get informed or the device might get unplugged), although actual plug/unplug will be reported as failed to the user. While at it, properly move the check to the pre_plug handler for the plug case, as we can test the slot state before the device will be realized. Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci/pcie: perform unplug via the hotplug handlerDavid Hildenbrand2018-12-201-1/+9
| | | | | | | | | | | | | | | | | Introduce and use the "unplug" callback. This is a preparation for multi-stage hotplug handlers, whereby the bus hotplug handler is overwritten by the machine hotplug handler. This handler will then pass control to the bus hotplug handler. So to get this running cleanly, we also have to make sure to go via the hotplug handler chain when actually unplugging a device after an unplug request. Lookup the hotplug handler and call "unplug". Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci/pcie: rename hotplug handler callbacksDavid Hildenbrand2018-12-201-9/+8Star
| | | | | | | | | | | | | | The callbacks are also called for cold plugged devices. Drop the "hot" to better match the actual callback names. While at it, also rename pcie_cap_slot_hotplug_common() to pcie_cap_slot_plug_common(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Fill PCIESlot link fields to support higher speeds and widthsAlex Williamson2018-12-191-0/+74
| | | | | | | | | | | | | | Make use of the PCIESlot speed and width fields to update link information beyond those configured in pcie_cap_v1_fill(). This is only called for devices supporting a version 2 capability and automatically skips any non-PCIESlot devices. Only devices with increased link values generate any visible config space differences. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: Sync PCIe downstream port LNKSTA on readAlex Williamson2018-12-191-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PCIe link speed and width between a downstream device and its upstream port is negotiated on real hardware and susceptible to dynamic changes due to signal issues and power management. In the emulated device case there is no real hardware link, but we still might wish to have some consistency between endpoint and downstream port via a virtual negotiation. There is of course a real link for assigned devices and this same virtual negotiation allows the downstream port to match the endpoint, synchronizing on every read to support underlying physical hardware dynamically adjusting the link. This negotiation is intentionally unidirectional for compatibility. If the endpoint exceeds the capabilities of the downstream port or there is no endpoint device, the downstream port reports negotiation to its maximum speed and width, matching the previous case where negotiation was absent. De-tuning the endpoint to match a virtual link doesn't seem to benefit anyone and is a condition we've thus far reported without functional issues. Note that PCI_EXP_LNKSTA is already ignored for migration compatibility via pcie_cap_v1_fill(). Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Create enums for link speed and widthAlex Williamson2018-12-191-3/+4
| | | | | | | | | | | | | In preparation for reporting higher virtual link speeds and widths, create enums and macros to help us manage them. Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Tested-by: Geoffrey McRae <geoff@hostfission.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: set link state inactive/active after hot unplug/plugZheng Xiang2018-12-191-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When VM boots from the latest version of linux kernel, after hot-unpluging virtio-blk disks which are hotplugged into pcie-root-port, the VM's dmesg log shows: [ 151.046242] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0001 from Slot Status [ 151.046365] pciehp 0000:00:05.0:pcie004: Slot(0-3): Attention button pressed [ 151.046369] pciehp 0000:00:05.0:pcie004: Slot(0-3): Powering off due to button press [ 151.046420] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 151.046425] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200 [ 151.046464] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 151.046468] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0 [ 156.163421] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 2f1 [ 156.163427] pciehp 0000:00:05.0:pcie004: pciehp_unconfigure_device: domain:bus:dev = 0000:06:00 [ 156.198736] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 156.198772] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400 [ 157.224124] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0018 from Slot Status [ 157.224194] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300 [ 157.224220] pciehp 0000:00:05.0:pcie004: pciehp_check_link_active: lnk_status = 2011 [ 157.224223] pciehp 0000:00:05.0:pcie004: Slot(0-3): Link Up [ 157.224233] pciehp 0000:00:05.0:pcie004: pciehp_get_power_status: SLOTCTRL a8 value read 7f1 [ 157.224281] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 157.224285] pciehp 0000:00:05.0:pcie004: pciehp_power_on_slot: SLOTCTRL a8 write cmd 0 [ 157.224300] pciehp 0000:00:05.0:pcie004: __pciehp_link_set: lnk_ctrl = 0 [ 157.224336] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 157.224339] pciehp 0000:00:05.0:pcie004: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200 [ 159.739294] pci 0000:06:00.0 id reading try 50 times with interval 20 ms to get ffffffff [ 159.739315] pciehp 0000:00:05.0:pcie004: pciehp_check_link_status: lnk_status = 2011 [ 159.739318] pciehp 0000:00:05.0:pcie004: Failed to check link status [ 159.739371] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 159.739394] pciehp 0000:00:05.0:pcie004: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400 [ 160.771426] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 160.771452] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300 [ 160.771495] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 160.771499] pciehp 0000:00:05.0:pcie004: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40 [ 160.771535] pciehp 0000:00:05.0:pcie004: pending interrupts 0x0010 from Slot Status [ 160.771539] pciehp 0000:00:05.0:pcie004: pciehp_green_led_off: SLOTCTRL a8 write cmd 300 After analyzing the log information, it seems that qemu doesn't change the Link Status from active to inactive after hot-unplug. This results in the abnormal log after the linux kernel commit d331710ea78fea merged. Furthermore, If I hotplug the same virtio-blk disk after hot-unplug, the virtio-blk would turn on and then back off. So this patch set the Link Status inactive after hot-unplug and active after hot-plug. Signed-off-by: Zheng Xiang <zhengxiang9@huawei.com> Signed-off-by: Zheng Xiang <xiang.zheng@linaro.org> Cc: Wang Haibin <wanghaibin.wang@huawei.com> Cc: qemu-stable@nongnu.org Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: Eliminate redundant PCIDevice::bus pointerDavid Gibson2017-12-051-2/+3
| | | | | | | | | | | | | | | | | The bus pointer in PCIDevice is basically redundant with QOM information. It's always initialized to the qdev_get_parent_bus(), the only difference is the type. Therefore this patch eliminates the field, instead creating a pci_get_bus() helper to do the type mangling to derive it conveniently from the QOM Device object underneath. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
* pci: Convert to realizeMao Zhongyi2017-07-031-7/+17
| | | | | | | | | | | | | Convert i82801b11, io3130_upstream, io3130_downstream and pcie_root_port devices to realize. Cc: mst@redhat.com Cc: marcel@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci: Make errp the last parameter of pci_add_capability()Mao Zhongyi2017-07-031-2/+8
| | | | | | | | | | | | | | | | | | | Add Error argument for pci_add_capability() to leverage the errp to pass info on errors. This way is helpful for its callers to make a better error handling when moving to 'realize'. Cc: pbonzini@redhat.com Cc: rth@twiddle.net Cc: ehabkost@redhat.com Cc: mst@redhat.com Cc: jasowang@redhat.com Cc: marcel@redhat.com Cc: alex.williamson@redhat.com Cc: armbru@redhat.com Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/virtio: fix Link Control Register for PCI Express virtio devicesMarcel Apfelbaum2017-03-161-0/+14
| | | | | | | | | Make several Link Control Register flags writable to conform with the PCI Express spec. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/pcie: fix Extended Configuration Space for devices with no Extended ↵Marcel Apfelbaum2017-03-161-0/+6
| | | | | | | | | | | | | | | | Capabilities Absence of any Extended Capabilities is required to be indicated by an Extended Capability header with a Capability ID of 0000h, a Capability Version of 0h, and a Next Capability Offset of 000h. Instead of inserting a 'NULL' capability is simpler to mark the start of the Extended Configuration Space as read-only to achieve the same behaviour. Signed-off-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: simplify pcie_add_capability()Peter Xu2017-02-171-11/+3Star
| | | | | | | | | | | | | | When we add PCIe extended capabilities, we should be following the rule that we add the head extended cap (at offset 0x100) first, then the rest of them. Meanwhile, we are always adding new capability bits at the end of the list. Here the "next" looks meaningless in all cases since it should always be zero (along with the "header"). Simplify the function a bit, and it looks more readable now. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pci/pcie: don't assume cap id 0 is reservedMichael S. Tsirkin2017-02-171-4/+7
| | | | | | | | | | | | | | | VFIO actually wants to create a capability with ID == 0. This is done to make guest drivers skip the given capability. pcie_add_capability then trips up on this capability when looking for end of capability list. To support this use-case, it's easy enough to switch to e.g. 0xffffffff for these comparisons - we can be sure it will never match a 16-bit capability ID. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
* pcie: fix typo in commentsCao jin2017-01-241-1/+1
| | | | | Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
* virtio-pci: address space translation service (ATS) supportJason Wang2017-01-101-0/+15
| | | | | | | | | | | | This patches enable the Address Translation Service support for virtio pci devices. This is needed for a guest visible Device IOTLB implementation and will be required by vhost device IOTLB API implementation for intel IOMMU. Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: fix link active status bit migrationMichael S. Tsirkin2016-07-281-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | We changed link status register in pci express endpoint capability over time. Specifically, commit b2101eae63ea57b571cee4a9075a4287d24ba4a4 ("pcie: Set the "link active" in the link status register") set data link layer link active bit in this register without adding compatibility to old machine types. When migrating from qemu 2.3 and older this affects xhci devices which under machine type 2.0 and older have a pci express endpoint capability even if they are on a pci bus. Add compatibility flags to make this bit value match what it was under 2.3. Additionally, to avoid breaking migration from qemu 2.3 and up, suppress checking link status during migration: this seems sane since hardware can change link status at any time. https://bugzilla.redhat.com/show_bug.cgi?id=1352860 Reported-by: Gerd Hoffmann <kraxel@redhat.com> Fixes: b2101eae63ea57b571cee4a9075a4287d24ba4a4 ("pcie: Set the "link active" in the link status register") Cc: qemu-stable@nongnu.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pcie: Introduce function for DSN capability creationDmitry Fleytman2016-06-021-0/+10
| | | | | | | Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com> Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>